16.8 Collectors

A collector encapsulates the functions required for performing reduction: the supplier, the accumulator, the combiner, and the finisher. It can provide these functions since it implements the Collector interface (in the java.util.stream package) that defines the methods to create these functions. It is passed as an argument to the collect(Collector) method in order to perform a reduction operation. In contrast, the collect(Supplier, BiConsumer, BiConsumer) method requires the functions supplier, accumulator, and combiner, respectively, to be passed as arguments in the method call.
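The difference between the two collect() methods can be sketched as follows. This is a minimal illustration rather than code from the chapter: the class name CollectVariants and its method names are hypothetical, and a list of strings is assumed as input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are illustrative only.
class CollectVariants {

    // Three-argument collect(): the functions are passed explicitly.
    static List<String> withFunctions(List<String> words) {
        return words.stream().collect(
            ArrayList::new,       // supplier: creates the mutable container
            ArrayList::add,       // accumulator: adds an element to a container
            ArrayList::addAll);   // combiner: merges two partial containers
    }

    // collect(Collector): the collector supplies the functions.
    static List<String> withCollector(List<String> words) {
        return words.stream().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = List.of("alpha", "beta", "gamma");
        System.out.println(withFunctions(words));   // [alpha, beta, gamma]
        System.out.println(withCollector(words));   // [alpha, beta, gamma]
    }
}
```

Both methods produce a list with the same elements in encounter order; the collector version simply hides the supplier, accumulator, and combiner behind one argument.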

Details of implementing a collector are not necessary for our purposes, as we will exclusively use the extensive set of predefined collectors provided by the static factory methods of the Collectors class in the java.util.stream package (Table 16.7, p. 1005). In most cases, it should be possible to find a predefined collector for the task at hand. The collectors perform various kinds of reduction, such as accumulating to a map or finding the minimum or maximum element. For example, the Collectors.toList() factory method creates a collector that performs mutable reduction using a list as a mutable container. It can be passed to the collect(Collector) terminal operation of a stream.

It is a common practice to import the static factory methods of the Collectors class in the code so that the methods can be called by their simple names.

import static java.util.stream.Collectors.*;

However, the practice adopted in this chapter is to assume that only the Collectors class is imported, so that each factory method is qualified with the class name and the connection between the static methods and the class is explicit in the code. Of course, static import of factory methods can be used once familiarity with the collectors is established.

import java.util.stream.Collectors;

The three-argument collect() method is primarily used to implement mutable reduction, whereas the Collectors class provides collectors for both functional and mutable reduction that can be either used in a stand-alone capacity or composed with other collectors.

One group of collectors is designed to collect to a predetermined container, which is evident from the name of the static factory method that creates it: toCollection, toList, toSet, and toMap (p. 979). The overloaded toCollection() and toMap() methods allow a specific implementation of a collection and a map to be used, respectively—for example, a TreeSet for a collection and a TreeMap for a map. In addition, there is the joining() method that creates a collector for concatenating the input elements to a String—however, internally it uses a mutable StringBuilder (p. 984).
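As a rough sketch of these container-oriented collectors, the following code collects strings to a TreeSet, to a TreeMap, and to a joined String. The class ContainerCollectors, its method names, and the choice of mapping each word to its length are assumptions made for illustration.

```java
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are illustrative only.
class ContainerCollectors {

    // toCollection() lets us choose the collection implementation.
    static TreeSet<String> toSortedSet(List<String> words) {
        return words.stream().collect(Collectors.toCollection(TreeSet::new));
    }

    // Four-argument toMap() lets us choose the map implementation.
    // Key: the word. Value: its length. On duplicate keys, keep the first value.
    static TreeMap<String, Integer> toSortedLengthMap(List<String> words) {
        return words.stream().collect(Collectors.toMap(
            word -> word,
            String::length,
            (first, second) -> first,
            TreeMap::new));
    }

    // joining() concatenates the elements with a delimiter, prefix, and suffix.
    static String joinWords(List<String> words) {
        return words.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        List<String> words = List.of("pear", "apple", "fig", "apple");
        System.out.println(toSortedSet(words));        // [apple, fig, pear]
        System.out.println(toSortedLengthMap(words));  // {apple=5, fig=3, pear=4}
        System.out.println(joinWords(words));          // [pear, apple, fig, apple]
    }
}
```

The merge function passed to toMap() is needed here because the input contains a duplicate; the two-argument toMap() would throw an IllegalStateException on the second "apple".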

Collectors can be composed with other collectors; that is, the partial results from one collector can be additionally processed by another collector (called the downstream collector) to produce the final result. Many collectors that can be used as a downstream collector perform functional reduction such as counting values, finding the minimum and maximum values, summing values, averaging values, and summarizing common statistics for values (p. 998).
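To make this concrete, the sketch below uses two functional-reduction collectors on their own, and then passes Collectors.counting() as a downstream collector to Collectors.groupingBy(). The class DownstreamCollectors, its method names, and the grouping of words by length are hypothetical choices for illustration.

```java
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are illustrative only.
class DownstreamCollectors {

    // Stand-alone use: count the stream elements.
    static long countWords(List<String> words) {
        return words.stream().collect(Collectors.counting());
    }

    // Stand-alone use: count, sum, min, max, and average of the word lengths.
    static IntSummaryStatistics lengthStats(List<String> words) {
        return words.stream().collect(Collectors.summarizingInt(String::length));
    }

    // Composed use: group the words by length, then let the downstream
    // collector counting() reduce each group to its size.
    static Map<Integer, Long> countByLength(List<String> words) {
        return words.stream().collect(
            Collectors.groupingBy(String::length, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> words = List.of("pear", "apple", "fig", "plum");
        System.out.println(countWords(words));            // 4
        System.out.println(lengthStats(words).getMax());  // 5
        System.out.println(countByLength(words).get(4));  // 2
    }
}
```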

Composition of collectors is utilized to perform multilevel grouping and partitioning on stream elements (p. 985). The groupingBy() and partitioningBy() methods return composed collectors to create classification maps. In such a map, the keys are determined by a classifier function, and the values are the result of a downstream collector, called the classification mapping. For example, the CDs in a stream could be classified into a map where the key represents the number of tracks on a CD and the associated value of a key can be a list of CDs with the same number of tracks. The list of CDs with the same number of tracks is the result of an appropriate downstream collector.
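The CD example might look roughly like the sketch below. The CD record is a simplified, hypothetical stand-in for the CD class used in this chapter, with only a title and a track count, and the class ClassifyCDs with its method names is likewise assumed for illustration. Records require Java 16 or later.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are illustrative only.
class ClassifyCDs {

    // Simplified stand-in for the chapter's CD class (an assumption).
    record CD(String title, int noOfTracks) {}

    // Classifier: number of tracks. Downstream collector: toList().
    static Map<Integer, List<CD>> byTrackCount(List<CD> cds) {
        return cds.stream().collect(
            Collectors.groupingBy(CD::noOfTracks, Collectors.toList()));
    }

    // Partitioning: the key is true for CDs with at least minTracks tracks.
    // Downstream collector: counting().
    static Map<Boolean, Long> countByMinTracks(List<CD> cds, int minTracks) {
        return cds.stream().collect(
            Collectors.partitioningBy(cd -> cd.noOfTracks() >= minTracks,
                                      Collectors.counting()));
    }

    public static void main(String[] args) {
        List<CD> cds = List.of(
            new CD("Java Jive", 8), new CD("Java Jam", 6), new CD("Lambda Dancing", 8));
        System.out.println(byTrackCount(cds).get(8).size());   // 2
        System.out.println(countByMinTracks(cds, 8));          // {false=1, true=2}
    }
}
```

The map returned by partitioningBy() always contains both the false and the true key, even when one partition is empty.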
