Java SE 8 For the Really Impatient, Note 4
Chapter 2 The Stream API
Grouping and Partitioning
groupingBy: forms groups of values with the same characteristic
In a call such as groupingBy(Locale::getCountry), the function Locale::getCountry is the classifier function of the grouping.
When the classifier function is a predicate function (that is, a function returning a boolean value), the stream elements are partitioned into 2 lists: those where the function returns true and the complement. In this case it’s more efficient to use partitioningBy instead.
If you call the groupingByConcurrent method, you get a concurrent map that, when used with a parallel stream, is concurrently populated. This is analogous to toConcurrentMap.
If you want sets instead of lists, use the downstream collector Collectors.toSet.
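The three collectors above can be sketched as follows. This is only an illustration: the class name `GroupingDemo`, the fixed `SAMPLE` list, and the method names are hypothetical, chosen so the result does not depend on the locales installed in the JVM.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class GroupingDemo {
    // Hypothetical fixed sample (an assumption, not from the notes).
    static final List<Locale> SAMPLE = Arrays.asList(
            Locale.forLanguageTag("de-CH"), Locale.forLanguageTag("fr-CH"),
            Locale.forLanguageTag("it-CH"), Locale.forLanguageTag("en-US"),
            Locale.forLanguageTag("es-US"));

    // groupingBy: Locale::getCountry is the classifier; each group is a List by default.
    static Map<String, List<Locale>> byCountry() {
        return SAMPLE.stream().collect(Collectors.groupingBy(Locale::getCountry));
    }

    // partitioningBy: the classifier is a predicate, so the keys are true and false.
    static Map<Boolean, List<Locale>> englishAndOther() {
        return SAMPLE.stream().collect(
                Collectors.partitioningBy(l -> l.getLanguage().equals("en")));
    }

    // Downstream collector toSet: each group is a Set instead of a List.
    static Map<String, Set<Locale>> countrySets() {
        return SAMPLE.stream().collect(
                Collectors.groupingBy(Locale::getCountry, Collectors.toSet()));
    }

    public static void main(String[] args) {
        System.out.println(byCountry().get("CH"));
        System.out.println(englishAndOther().get(true));
        System.out.println(countrySets().keySet());
    }
}
```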
Other downstream collectors:
counting: produces a count of the collected elements
summing(Int|Long|Double): takes a function argument, applies the function to the downstream elements, and produces their sum
maxBy and minBy: take a comparator and produce max and min of the downstream elements
mapping: applies a function to downstream results, and it requires yet another collector for processing its results
summarizing(Int|Long|Double): if the grouping or mapping function has return type int, long, or double, collects the elements into a summary statistics object
reducing: applies a general reduction to downstream elements. 3 forms:
reducing(binaryOperator): there is no identity, so the result is an Optional
reducing(identity, binaryOperator)
reducing(identity, mapper, binaryOperator): the mapper function is applied and its values are reduced
Only use downstream collectors in connection with groupingBy or partitioningBy to avoid convoluted expressions. Otherwise, simply use methods like map, reduce, count, max or min directly on streams.
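As a rough illustration of the downstream collectors listed above, the sketch below groups a small word list by initial letter. The class `DownstreamDemo`, the `WORDS` list, and all method names are assumptions made up for this example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

class DownstreamDemo {
    // Hypothetical sample data.
    static final List<String> WORDS =
            Arrays.asList("apple", "avocado", "banana", "blueberry", "cherry");
    static final Comparator<String> BY_LENGTH = Comparator.comparingInt(String::length);

    // counting: number of words per initial letter.
    static Map<Character, Long> countByInitial() {
        return WORDS.stream().collect(
                Collectors.groupingBy(w -> w.charAt(0), Collectors.counting()));
    }

    // summingInt: total word length per initial letter.
    static Map<Character, Integer> totalLengthByInitial() {
        return WORDS.stream().collect(
                Collectors.groupingBy(w -> w.charAt(0), Collectors.summingInt(String::length)));
    }

    // maxBy: longest word per initial letter, wrapped in an Optional.
    static Map<Character, Optional<String>> longestByInitial() {
        return WORDS.stream().collect(
                Collectors.groupingBy(w -> w.charAt(0), Collectors.maxBy(BY_LENGTH)));
    }

    // mapping: maps each word to its length, then collects the lengths into a set.
    static Map<Character, Set<Integer>> lengthsByInitial() {
        return WORDS.stream().collect(Collectors.groupingBy(w -> w.charAt(0),
                Collectors.mapping(String::length, Collectors.toSet())));
    }

    // summarizingInt: count, sum, min, max and average of the lengths in one object.
    static Map<Character, IntSummaryStatistics> lengthStatsByInitial() {
        return WORDS.stream().collect(
                Collectors.groupingBy(w -> w.charAt(0), Collectors.summarizingInt(String::length)));
    }

    // reducing(identity, mapper, binaryOperator): same totals as summingInt, spelled out.
    static Map<Character, Integer> reducedLengthByInitial() {
        return WORDS.stream().collect(Collectors.groupingBy(w -> w.charAt(0),
                Collectors.reducing(0, String::length, Integer::sum)));
    }

    public static void main(String[] args) {
        System.out.println(countByInitial());
        System.out.println(longestByInitial());
        System.out.println(lengthStatsByInitial().get('a'));
    }
}
```

The `reducing` variant is included only to show the three-argument form; for a plain sum, `summingInt` reads more directly.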
Primitive Type Streams
Wrapping each integer in a wrapper object, as in Stream<Integer>, is inefficient. The same holds for the other primitive types.
IntStream, LongStream, DoubleStream can store primitive values directly.
For the other primitives:
IntStream: stores short, char, byte and boolean
DoubleStream: stores float
Create an IntStream: use IntStream.of or Arrays.stream
IntStream and LongStream have static methods range and rangeClosed that generate integer ranges with step size one
The CharSequence interface has methods codePoints and chars that yield an IntStream of the Unicode codes of the characters or of the code units in the UTF-16 encoding
Use mapToInt, mapToLong, mapToDouble methods to transform a stream of objects to primitive types
boxed: converts a primitive type stream to an object stream
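The creation and conversion methods above might be used as in the following sketch. The class `PrimitiveStreamDemo` and its method names are hypothetical; the stream calls are standard `java.util.stream` and `java.util.Arrays` methods.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class PrimitiveStreamDemo {
    // IntStream.of: stream of the given int values.
    static int sumOfValues() {
        return IntStream.of(1, 1, 2, 3, 5).sum();
    }

    // Arrays.stream(array, from, to): streams a slice, upper bound excluded.
    static int sumOfSlice() {
        int[] values = {10, 20, 30, 40};
        return Arrays.stream(values, 1, 3).sum(); // 20 + 30
    }

    // range excludes the upper bound, rangeClosed includes it.
    static int[] range() {
        return IntStream.range(0, 5).toArray();
    }

    static int[] rangeClosed() {
        return IntStream.rangeClosed(0, 5).toArray();
    }

    // codePoints yields Unicode code points, chars yields UTF-16 code units.
    static long codePointCount(String s) {
        return s.codePoints().count();
    }

    static long codeUnitCount(String s) {
        return s.chars().count();
    }

    // mapToInt: object stream to IntStream.
    static int totalLength() {
        return Stream.of("to", "be", "or").mapToInt(String::length).sum();
    }

    // boxed: IntStream back to Stream<Integer>.
    static List<Integer> boxedRange() {
        return IntStream.range(0, 3).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String s = "\uD835\uDD46K"; // one supplementary character plus 'K'
        System.out.println(codePointCount(s) + " code points, " + codeUnitCount(s) + " code units");
        System.out.println(Arrays.toString(rangeClosed()));
    }
}
```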
Differences between primitive type streams and object streams:
- toArray returns primitive type arrays.
- Methods that yield an optional result return an OptionalInt, OptionalLong or OptionalDouble. They have methods getAsInt, getAsLong and getAsDouble instead of get.
- sum, average, max, min are defined.
- The summaryStatistics method yields an object of type IntSummaryStatistics, LongSummaryStatistics, or DoubleSummaryStatistics.
The Random class has methods ints, longs and doubles that return primitive type streams of random numbers
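A small sketch of these differences, under the assumption of a hypothetical class `PrimitiveResultsDemo` with made-up method names, could look like this:

```java
import java.util.IntSummaryStatistics;
import java.util.OptionalDouble;
import java.util.OptionalInt;
import java.util.Random;
import java.util.stream.IntStream;

class PrimitiveResultsDemo {
    // max returns an OptionalInt; the value is read with getAsInt, not get.
    static int maxOf(int... values) {
        OptionalInt max = IntStream.of(values).max();
        if (!max.isPresent()) {
            throw new IllegalArgumentException("no values given");
        }
        return max.getAsInt();
    }

    // average returns an OptionalDouble; orElse supplies a fallback for an empty stream.
    static double averageOf(int... values) {
        OptionalDouble average = IntStream.of(values).average();
        return average.orElse(0.0);
    }

    // summaryStatistics gathers count, sum, min, max and average in one pass.
    static IntSummaryStatistics stats() {
        return IntStream.rangeClosed(1, 4).summaryStatistics();
    }

    // Random.ints(size, origin, bound): a finite IntStream of values in [origin, bound).
    static int[] randomDigits(int howMany) {
        return new Random().ints(howMany, 0, 10).toArray();
    }

    public static void main(String[] args) {
        System.out.println(maxOf(3, 9, 4));
        System.out.println(stats());
        System.out.println(randomDigits(5).length);
    }
}
```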
Parallel Streams
You need a parallel stream to parallelize bulk operations.
By default, stream operations create sequential streams, except for Collection.parallelStream().
parallel: converts any sequential stream into a parallel one
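The two ways of obtaining a parallel stream can be checked with `isParallel`, as in this sketch. `ParallelDemo`, `WORDS`, and the method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class ParallelDemo {
    // Hypothetical sample data.
    static final List<String> WORDS = Arrays.asList("to", "be", "or", "not", "question");

    // Collection.stream() gives a sequential stream.
    static boolean plainStreamIsParallel() {
        return WORDS.stream().isParallel();
    }

    // Collection.parallelStream() asks for a parallel stream from the start.
    static boolean parallelStreamIsParallel() {
        return WORDS.parallelStream().isParallel();
    }

    // parallel() switches an existing sequential stream to parallel mode.
    static boolean convertedIsParallel() {
        return Stream.of("a", "b", "c").parallel().isParallel();
    }

    // A stateless filter whose result does not depend on execution order.
    static long countShorterThan(int length) {
        return WORDS.parallelStream().filter(w -> w.length() < length).count();
    }

    public static void main(String[] args) {
        System.out.println(plainStreamIsParallel());
        System.out.println(convertedIsParallel());
        System.out.println(countShorterThan(4));
    }
}
```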
The operations must be stateless and safe to execute in arbitrary order.
A bad example, something you must not do: update a shared array from forEach on a parallel stream.
The function passed to forEach runs concurrently in multiple threads, so the unsynchronized updates to the shared array are a race condition!
Ensure that any functions you pass to parallel stream operations are threadsafe. You can use an array of AtomicInteger objects. Or you can simply use the facilities of the streams library and group strings by length.
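Both threadsafe alternatives might look roughly like the sketch below, which counts words by length. The class `ShortWordCounts`, the `WORDS` list, and the `limit` parameter are assumptions introduced for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

class ShortWordCounts {
    // Hypothetical sample data.
    static final List<String> WORDS = Arrays.asList("a", "to", "be", "or", "not", "question");

    // Option 1: an array of AtomicInteger counters, safe to update from several threads.
    static int[] countWithAtomics(List<String> words, int limit) {
        AtomicInteger[] counts = new AtomicInteger[limit];
        for (int i = 0; i < limit; i++) {
            counts[i] = new AtomicInteger();
        }
        words.parallelStream().forEach(s -> {
            if (s.length() < limit) {
                counts[s.length()].getAndIncrement();
            }
        });
        return Arrays.stream(counts).mapToInt(AtomicInteger::get).toArray();
    }

    // Option 2: no shared mutable state at all; the collector does the counting.
    static Map<Integer, Long> countWithGrouping(List<String> words, int limit) {
        return words.parallelStream()
                .filter(s -> s.length() < limit)
                .collect(Collectors.groupingBy(String::length, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(countWithAtomics(WORDS, 5)));
        System.out.println(countWithGrouping(WORDS, 5));
    }
}
```

The second option is usually the simpler one, since there is no shared state to protect.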
By default, streams that arise from ordered collections (arrays and lists), from ranges, generators, and iterators, or from calling Stream.sorted, are ordered.
Some operations can be parallelized more effectively when the ordering requirement is dropped. Calling Stream.unordered indicates that you do not care about ordering. Stream.distinct can benefit from it: on an ordered stream, distinct retains the first of all equal elements, and that impedes parallelization. limit can also be sped up if you just want any n elements from a stream and don’t care which ones you get.
Merging maps is expensive, so the Collectors.groupingByConcurrent method uses a shared concurrent map instead. That collector is already unordered, so there is no need to make the stream unordered first.
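A minimal sketch of dropping the ordering requirement and of `groupingByConcurrent`, assuming a hypothetical class `UnorderedDemo` with made-up sample data:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class UnorderedDemo {
    // Hypothetical sample data.
    static final List<String> WORDS = Arrays.asList("to", "be", "or", "not", "to", "be");

    // distinct on an unordered parallel stream may keep any of the equal elements.
    static long distinctCount() {
        return WORDS.parallelStream().unordered().distinct().count();
    }

    // limit on an unordered stream returns any three elements, not necessarily the first three.
    static List<Integer> anyThree() {
        return IntStream.range(0, 1000).boxed().parallel().unordered()
                .limit(3).collect(Collectors.toList());
    }

    // groupingByConcurrent fills one shared ConcurrentMap instead of merging per-thread maps.
    static ConcurrentMap<Integer, Long> countsByLength() {
        return WORDS.parallelStream().collect(
                Collectors.groupingByConcurrent(String::length, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(distinctCount());
        System.out.println(anyThree().size());
        System.out.println(countsByLength());
    }
}
```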
Noninterference
Do not modify the collection that is backing a stream while carrying out a stream operation, even if it’s threadsafe. Remember that streams don’t collect their own data - the data is always in a separate collection.
Since intermediate stream operations are lazy, it’s possible to mutate the collection up to the point when the terminal operation executes.
A bad example: updating the backing collection while the terminal operation is running.
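The difference between the permitted and the forbidden mutation can be sketched as follows. `NoninterferenceDemo` and its methods are hypothetical, and the interfering call is left as a comment because its behavior is not something to rely on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NoninterferenceDemo {
    static List<String> sample() {
        return new ArrayList<>(Arrays.asList("to", "be", "or", "not", "to", "be"));
    }

    // Allowed: the list is changed before the terminal operation starts,
    // so the added element is seen by the stream.
    static long addBeforeTerminalOperation() {
        List<String> wordList = sample();
        Stream<String> words = wordList.stream(); // lazy: nothing is traversed yet
        wordList.add("END");
        return words.distinct().count();
    }

    // Not allowed (interference), shown as a comment only:
    //   wordList.stream().forEach(s -> { if (s.length() < 3) wordList.remove(s); });

    // A non-interfering alternative: leave the source alone and collect into a new list.
    static List<String> withoutShortWords() {
        List<String> wordList = sample();
        return wordList.stream().filter(s -> s.length() >= 3).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(addBeforeTerminalOperation());
        System.out.println(withoutShortWords());
    }
}
```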
Functional Interfaces
Predicate: an interface with one nondefault method returning a boolean value
The boolean return type is important.
Functional Interfaces Used in the Stream API
| Functional Interfaces | Parameter Types | Return Type | Description |
|---|---|---|---|
| Supplier<T> | None | T | Supplies a value of type T |
| Consumer<T> | T | void | Consumes a value of type T |
| BiConsumer<T, U> | T, U | void | Consumes values of types T and U |
| Predicate<T> | T | boolean | A Boolean-valued function |
| ToIntFunction<T>, ToLongFunction<T>, ToDoubleFunction<T> | T | int, long, double | An int-, long-, or double-valued function |
| IntFunction<R>, LongFunction<R>, DoubleFunction<R> | int, long, double | R | A function with argument of type int, long, or double |
| Function<T, R> | T | R | A function with argument of type T |
| BiFunction<T, U, R> | T, U | R | A function with arguments of types T and U |
| UnaryOperator<T> | T | T | A unary operator on the type T |
| BinaryOperator<T> | T, T | T | A binary operator on the type T |
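To make the table concrete, here is one lambda or method reference per interface family. The class `FunctionalInterfacesDemo` and the constant names are hypothetical; the interfaces themselves come from `java.util.function`.

```java
import java.util.function.BiConsumer;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.IntFunction;
import java.util.function.Predicate;
import java.util.function.Supplier;
import java.util.function.ToIntFunction;
import java.util.function.UnaryOperator;

class FunctionalInterfacesDemo {
    static final Supplier<String> WORD = () -> "stream";                          // () -> T
    static final Consumer<StringBuilder> CLEAR = sb -> sb.setLength(0);           // T -> void
    static final BiConsumer<StringBuilder, String> APPEND = (sb, s) -> sb.append(s); // (T, U) -> void
    static final Predicate<String> IS_LONG = s -> s.length() > 3;                 // T -> boolean
    static final ToIntFunction<String> LENGTH_AS_INT = String::length;            // T -> int
    static final IntFunction<String> LABEL = n -> "#" + n;                        // int -> R
    static final Function<String, Integer> LENGTH = String::length;               // T -> R
    static final BiFunction<String, Integer, Character> CHAR_AT =
            (s, i) -> s.charAt(i);                                                // (T, U) -> R
    static final UnaryOperator<String> UPPER = String::toUpperCase;               // T -> T
    static final BinaryOperator<Integer> ADD = Integer::sum;                      // (T, T) -> T

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        APPEND.accept(sb, WORD.get());
        System.out.println(sb + " is long: " + IS_LONG.test(sb.toString()));
        System.out.println(LABEL.apply(LENGTH_AS_INT.applyAsInt("stream")));
        System.out.println(UPPER.apply("api") + " " + ADD.apply(2, 3));
        CLEAR.accept(sb);
        System.out.println(sb.length());
    }
}
```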

