Java SE 8 For the Really Impatient, Note 4
Chapter 2 The Stream API
Grouping and Partitioning
groupingBy: forms groups of values with the same characteristic.
When locales are grouped by country, the function Locale::getCountry is the classifier function of the grouping.
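A minimal sketch of such a grouping, assuming the goal is a map from country code to the locales of that country. The class and method names (GroupByCountry, byCountry) are made up for this illustration.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GroupByCountry {
    // Locale::getCountry is the classifier: locales with the same country code share a key.
    static Map<String, List<Locale>> byCountry(Stream<Locale> locales) {
        return locales.collect(Collectors.groupingBy(Locale::getCountry));
    }

    public static void main(String[] args) {
        Map<String, List<Locale>> countryToLocales =
                byCountry(Stream.of(Locale.getAvailableLocales()));
        // The exact list depends on the locales shipped with the JDK in use.
        System.out.println(countryToLocales.get("CH"));
    }
}
```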
When the classifier function is a predicate function (that is, a function returning a boolean value), the stream elements are partitioned into two lists: those where the function returns true and the complement. In this case it’s more efficient to use partitioningBy instead.
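As a rough sketch, partitioning locales into English and non-English ones could look like this. The predicate and the names PartitionByLanguage and englishAndOther are illustrative choices, not taken from the notes.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PartitionByLanguage {
    // The predicate splits the stream into a "true" list and a "false" list.
    static Map<Boolean, List<Locale>> englishAndOther(Stream<Locale> locales) {
        return locales.collect(
                Collectors.partitioningBy(l -> l.getLanguage().equals("en")));
    }

    public static void main(String[] args) {
        Map<Boolean, List<Locale>> parts =
                englishAndOther(Stream.of(Locale.US, Locale.UK, Locale.FRANCE));
        System.out.println(parts.get(true));   // the English locales
        System.out.println(parts.get(false));  // everything else
    }
}
```

The resulting map always has both keys, true and false, even if one of the lists is empty.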
If you call the groupingByConcurrent method, you get a concurrent map that, when used with a parallel stream, is concurrently populated. This is analogous to toConcurrentMap.
If you want sets instead of lists, use the downstream collector Collectors.toSet.
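A short sketch of the two-argument form of groupingBy, where the second argument is the downstream collector. The names CountryToLocaleSet and byCountry are hypothetical.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CountryToLocaleSet {
    // Collectors.toSet() replaces the default list: duplicates within a group collapse.
    static Map<String, Set<Locale>> byCountry(Stream<Locale> locales) {
        return locales.collect(
                Collectors.groupingBy(Locale::getCountry, Collectors.toSet()));
    }
}
```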
Other downstream collectors:
- counting: produces a count of the collected elements
- summing(Int|Long|Double): takes a function argument, applies the function to the downstream elements, and produces their sum
- maxBy and minBy: take a comparator and produce the maximum and minimum of the downstream elements
- mapping: applies a function to downstream results, and requires yet another collector for processing its results
- summarizing(Int|Long|Double): produces a summary statistics object, if the grouping or mapping function has return type int, long, or double
- reducing: applies a general reduction to downstream elements. There are 3 forms:
  - reducing(binaryOperator): no identity is supplied, so the collector produces an Optional
  - reducing(identity, binaryOperator)
  - reducing(identity, mapper, binaryOperator): the mapper function is applied and its values are reduced
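The collectors listed above can be sketched side by side, each one used as the downstream of a groupingBy. The City class, its accessors, and the method names are hypothetical and exist only for this illustration.

```java
import java.util.Comparator;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

class DownstreamCollectors {
    // Hypothetical value class, introduced only for this illustration.
    static final class City {
        private final String name;
        private final String state;
        private final int population;

        City(String name, String state, int population) {
            this.name = name;
            this.state = state;
            this.population = population;
        }

        String getName() { return name; }
        String getState() { return state; }
        int getPopulation() { return population; }
    }

    // counting: number of cities per state
    static Map<String, Long> countByState(List<City> cities) {
        return cities.stream()
                .collect(Collectors.groupingBy(City::getState, Collectors.counting()));
    }

    // summingInt: total population per state
    static Map<String, Integer> populationByState(List<City> cities) {
        return cities.stream()
                .collect(Collectors.groupingBy(City::getState,
                        Collectors.summingInt(City::getPopulation)));
    }

    // maxBy: most populous city per state (an Optional, since a comparator has no identity)
    static Map<String, Optional<City>> largestByState(List<City> cities) {
        Comparator<City> byPopulation = Comparator.comparingInt(City::getPopulation);
        return cities.stream()
                .collect(Collectors.groupingBy(City::getState,
                        Collectors.maxBy(byPopulation)));
    }

    // mapping: extract the names, then hand them to a further collector (toSet)
    static Map<String, Set<String>> namesByState(List<City> cities) {
        return cities.stream()
                .collect(Collectors.groupingBy(City::getState,
                        Collectors.mapping(City::getName, Collectors.toSet())));
    }

    // summarizingInt: count, sum, min, max, and average of the populations per state
    static Map<String, IntSummaryStatistics> statsByState(List<City> cities) {
        return cities.stream()
                .collect(Collectors.groupingBy(City::getState,
                        Collectors.summarizingInt(City::getPopulation)));
    }

    // reducing(identity, mapper, binaryOperator): comma-separated city names per state
    static Map<String, String> joinedNamesByState(List<City> cities) {
        return cities.stream()
                .collect(Collectors.groupingBy(City::getState,
                        Collectors.reducing("", City::getName,
                                (s, t) -> s.isEmpty() ? t : s + ", " + t)));
    }
}
```

The last method is only there to show the three-argument reducing form; for joining strings, mapping combined with Collectors.joining would usually read better.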
Only use downstream collectors in connection with groupingBy or partitioningBy to avoid convoluted expressions. Otherwise, simply use methods like map, reduce, count, max, or min directly on streams.
Primitive Type Streams
Wrapping each integer into a wrapper object, as in Stream<Integer>, is inefficient. The same is true for the other primitive types.
IntStream, LongStream, and DoubleStream can store primitive values directly.
For the other primitive types:
- IntStream: stores short, char, byte, and boolean
- DoubleStream: stores float
Create an IntStream: use IntStream.of or Arrays.stream.
IntStream and LongStream have static methods range and rangeClosed that generate integer ranges with step size one.
The CharSequence interface has methods codePoints and chars that yield an IntStream of the Unicode codes of the characters or of the code units in the UTF-16 encoding.
Use the mapToInt, mapToLong, and mapToDouble methods to transform a stream of objects into a primitive type stream.
boxed: converts a primitive type stream to an object stream.
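The creation and conversion methods above can be sketched as follows. The class name PrimitiveStreamCreation and its helper methods are made up for this illustration; only the stream calls inside them are library API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PrimitiveStreamCreation {
    // IntStream.of: a stream from explicit values
    static int[] fromValues() {
        return IntStream.of(1, 1, 2, 3, 5).toArray();
    }

    // Arrays.stream: a stream over part of an existing int[] (from inclusive, to exclusive)
    static int[] slice(int[] values, int from, int to) {
        return Arrays.stream(values, from, to).toArray();
    }

    // range excludes the upper bound, rangeClosed includes it
    static int[] range(int from, int toExclusive) {
        return IntStream.range(from, toExclusive).toArray();
    }

    static int[] rangeClosed(int from, int toInclusive) {
        return IntStream.rangeClosed(from, toInclusive).toArray();
    }

    // codePoints yields Unicode code points, chars yields UTF-16 code units
    static long codePointCount(String s) {
        return s.codePoints().count();
    }

    static long codeUnitCount(String s) {
        return s.chars().count();
    }

    // mapToInt: object stream to primitive stream
    static int[] lengths(List<String> words) {
        return words.stream().mapToInt(String::length).toArray();
    }

    // boxed: primitive stream back to an object stream
    static List<Integer> boxedRange(int from, int toExclusive) {
        return IntStream.range(from, toExclusive).boxed().collect(Collectors.toList());
    }
}
```

For a character outside the Basic Multilingual Plane, codePointCount and codeUnitCount differ, because such a character occupies two UTF-16 code units.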
Differences between primitive type streams and object streams:
- toArray returns primitive type arrays.
- Methods that yield an optional result return an OptionalInt, OptionalLong, or OptionalDouble. They have methods getAsInt, getAsLong, and getAsDouble instead of get.
- sum, average, max, and min are defined.
- The summaryStatistics method yields an object of type IntSummaryStatistics, LongSummaryStatistics, or DoubleSummaryStatistics.
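A small sketch of these differences, using word lengths as the data. The class and method names are hypothetical.

```java
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.OptionalDouble;
import java.util.OptionalInt;
import java.util.stream.IntStream;

class PrimitiveStreamResults {
    // max returns an OptionalInt; read it with getAsInt rather than get
    static OptionalInt longestLength(List<String> words) {
        return words.stream().mapToInt(String::length).max();
    }

    // average returns an OptionalDouble, empty for an empty stream
    static OptionalDouble averageLength(List<String> words) {
        return words.stream().mapToInt(String::length).average();
    }

    // sum returns a plain int; an empty stream sums to 0
    static int totalLength(List<String> words) {
        return words.stream().mapToInt(String::length).sum();
    }

    // summaryStatistics gathers count, sum, min, max, and average in one pass
    static IntSummaryStatistics stats(int... values) {
        return IntStream.of(values).summaryStatistics();
    }
}
```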
The Random class has methods ints, longs, and doubles that return primitive type streams of random numbers.
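For instance, a bounded stream of random integers might be drawn like this. The seed parameter and the names RandomStreams and randomInts are illustrative assumptions.

```java
import java.util.Random;

class RandomStreams {
    // Draws 'count' random ints in the range [0, bound) from a seeded generator.
    static int[] randomInts(long seed, int count, int bound) {
        return new Random(seed).ints(count, 0, bound).toArray();
    }

    public static void main(String[] args) {
        for (int value : randomInts(42L, 5, 10)) {
            System.out.println(value);
        }
    }
}
```

Without the size argument, ints returns an effectively unlimited stream, so it would need a limit before a terminal operation such as toArray.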
Parallel Streams
You must have a parallel stream to parallelize bulk operations.
By default, stream operations create sequential streams, except for Collection.parallelStream().
parallel: converts any sequential stream into a parallel one.
The operations must be stateless and able to be executed in arbitrary order.
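A minimal sketch of both ways to obtain a parallel stream, with a stateless filter as the only operation. The word-counting task and the name ParallelCount are made up for illustration.

```java
import java.util.List;
import java.util.stream.Stream;

class ParallelCount {
    // Collection.parallelStream() creates a parallel stream directly.
    static long countLongWords(List<String> words, int minLength) {
        return words.parallelStream()
                .filter(w -> w.length() >= minLength)
                .count();
    }

    // parallel() converts a sequential stream, here one built from an array.
    static long countLongWords(String[] words, int minLength) {
        return Stream.of(words)
                .parallel()
                .filter(w -> w.length() >= minLength)
                .count();
    }
}
```

The filter lambda reads only its own argument and the effectively final minLength, so it can safely run on several threads in any order.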
A bad example, something you cannot do: incrementing the counters of a shared array from inside a parallel forEach.
The function passed to forEach runs concurrently in multiple threads, updating a shared array. That is a race condition!
Ensure that any functions you pass to parallel stream operations are threadsafe. You can use an array of AtomicInteger objects. Or you can simply use the facilities of the streams library and group strings by length.
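The two threadsafe alternatives could be sketched like this. The unsafe variant appears only as a comment and is a guess at the kind of code the note describes; the maxLength parameter and all names are assumptions of this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

class ShortWordCounts {
    // Unsafe pattern (do not use): a plain int[] updated from a parallel forEach, e.g.
    //   int[] counts = new int[maxLength];
    //   words.parallelStream().forEach(s -> { if (s.length() < maxLength) counts[s.length()]++; });
    // The ++ is a non-atomic read-modify-write, so concurrent updates can be lost.

    // Alternative 1: one AtomicInteger per word length.
    static AtomicInteger[] countWithAtomics(List<String> words, int maxLength) {
        AtomicInteger[] counts = new AtomicInteger[maxLength];
        Arrays.setAll(counts, i -> new AtomicInteger());
        words.parallelStream().forEach(s -> {
            if (s.length() < maxLength) {
                counts[s.length()].getAndIncrement();
            }
        });
        return counts;
    }

    // Alternative 2: let the streams library group the strings by length.
    static Map<Integer, Long> countWithGrouping(List<String> words, int maxLength) {
        return words.parallelStream()
                .filter(s -> s.length() < maxLength)
                .collect(Collectors.groupingBy(String::length, Collectors.counting()));
    }
}
```

The second version has no shared mutable state in user code at all, which is usually the easier property to reason about.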
By default, streams that arise from ordered collections (arrays and lists), from ranges, generators, and iterators, or from calling Stream.sorted, are ordered.
Some operations can be more effectively parallelized when the ordering requirement is dropped. Calling Stream.unordered indicates that you are not interested in ordering. Stream.distinct can benefit from it because on an ordered stream, distinct retains the first of all equal elements, which impedes parallelization. limit can be sped up if you just want any n elements from a stream and don’t care which ones you get.
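A brief sketch of dropping the ordering requirement before limit and distinct. The names UnorderedOps, anyN, and distinctValues are hypothetical, and whether this is actually faster depends on the data and the machine.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class UnorderedOps {
    // Any n elements will do, so limit does not have to pick the first n in encounter order.
    static List<Integer> anyN(List<Integer> values, int n) {
        return values.parallelStream()
                .unordered()
                .limit(n)
                .collect(Collectors.toList());
    }

    // distinct no longer has to retain the first of all equal elements.
    static Set<Integer> distinctValues(List<Integer> values) {
        return values.parallelStream()
                .unordered()
                .distinct()
                .collect(Collectors.toSet());
    }
}
```

Which n elements anyN returns can change from run to run, so callers should rely only on the size and on membership in the source.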
Merging maps is expensive. For that reason, the Collectors.groupingByConcurrent method uses a shared concurrent map. Its collector is already unordered, so the stream does not have to be made unordered first.
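A minimal sketch of groupingByConcurrent on a parallel stream, grouping words by length. The names ConcurrentGrouping and byLength are made up for illustration.

```java
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

class ConcurrentGrouping {
    // All threads insert into one shared ConcurrentMap instead of merging per-thread maps.
    static ConcurrentMap<Integer, List<String>> byLength(List<String> words) {
        return words.parallelStream()
                .collect(Collectors.groupingByConcurrent(String::length));
    }
}
```

The order of the values inside each list is not guaranteed to match the order in the source list.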
Noninterference
Do not modify the collection that is backing a stream while carrying out a stream operation, even if it’s threadsafe. Remember that streams don’t collect their own data - the data is always in a separate collection.
Since intermediate stream operations are lazy, it’s possible to mutate the collection up to the point when the terminal operation executes.
A bad example is updating the collection while the stream operation is running.
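The rule can be illustrated with a sketch that assumes the backing collection is an ArrayList. The interfering variant is shown only as a comment and is a guess at the kind of code meant; all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class Noninterference {
    // Allowed: the list is mutated before the terminal operation starts,
    // so the added element is still seen. The list must be mutable (e.g. an ArrayList).
    static long distinctCountAfterLateAdd(List<String> wordList) {
        Stream<String> words = wordList.stream();
        wordList.add("END");
        return words.distinct().count();
    }

    // Interference (do not use): removing from the list that backs the running stream, e.g.
    //   wordList.stream().forEach(s -> { if (s.length() < minLength) wordList.remove(s); });

    // Safer alternative: leave the source alone and collect the survivors into a new list.
    static List<String> withoutShortWords(List<String> wordList, int minLength) {
        return wordList.stream()
                .filter(s -> s.length() >= minLength)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(Arrays.asList("a", "b", "a"));
        System.out.println(distinctCountAfterLateAdd(words));
        System.out.println(withoutShortWords(words, 3));
    }
}
```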
Functional Interfaces
Predicate: an interface with one nondefault method returning a boolean value.
The boolean return type is important.
Functional Interfaces Used in the Stream API
Functional Interfaces | Parameter Types | Return Type | Description |
---|---|---|---|
Supplier<T> | None | T | Supplies a value of type T |
Consumer<T> | T | void | Consumes a value of type T |
BiConsumer<T, U> | T, U | void | Consumes values of types T and U |
Predicate<T> | T | boolean | A Boolean-valued function |
ToIntFunction<T>, ToLongFunction<T>, ToDoubleFunction<T> | T | int, long, double | An int-, long-, or double-valued function |
IntFunction<R>, LongFunction<R>, DoubleFunction<R> | int, long, double | R | A function with argument of type int, long, or double |
Function<T, R> | T | R | A function with argument of type T |
BiFunction<T, U, R> | T, U | R | A function with arguments of types T and U |
UnaryOperator<T> | T | T | A unary operator on the type T |
BinaryOperator<T> | T, T | T | A binary operator on the type T |
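To make the table concrete, here is a sketch that assigns a lambda or method reference to one interface from most rows. The constant names and the sample functions are arbitrary choices for this illustration.

```java
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.IntFunction;
import java.util.function.Predicate;
import java.util.function.Supplier;
import java.util.function.ToIntFunction;
import java.util.function.UnaryOperator;

class FunctionalInterfaceSamples {
    static final Supplier<String> GREETING = () -> "hello";
    static final Consumer<StringBuilder> APPEND_BANG = sb -> sb.append('!');
    static final BiConsumer<Map<String, Integer>, String> COUNT_WORD =
            (counts, word) -> counts.merge(word, 1, Integer::sum);
    static final Predicate<String> IS_BLANK = s -> s.trim().isEmpty();
    static final ToIntFunction<String> LENGTH = String::length;
    static final IntFunction<String> TO_HEX = Integer::toHexString;
    static final Function<String, Integer> PARSE = s -> Integer.parseInt(s);
    static final BiFunction<String, Integer, Character> CHAR_AT = (s, i) -> s.charAt(i);
    static final UnaryOperator<String> UPPER = s -> s.toUpperCase();
    static final BinaryOperator<Integer> SUM = Integer::sum;
}
```

Each interface has a differently named single abstract method: get for Supplier, accept for the consumers, test for Predicate, applyAsInt for ToIntFunction, and apply for the function and operator types.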