Streams in Java are a way to lazily process a sequence of elements, possibly an infinite one, either sequentially or, in some cases, in parallel. A stream does not store its elements; it pulls them from a source on demand.
Streams are consumed in stream pipelines, which are composed of:
- a source;
- 0 or more intermediate operations;
- a terminal operation.
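The three parts of a pipeline can be sketched as follows. This is only an illustration: the class and method names (PipelineSketch, firstEvenSquares) are made up for this example, and the infinite source built with Stream.iterate is just one possible choice.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PipelineSketch {
    // Returns the squares of the first `count` even numbers, starting from 2.
    static List<Integer> firstEvenSquares(int count) {
        return Stream.iterate(1, n -> n + 1)   // source: an infinite stream 1, 2, 3, ...
                .filter(n -> n % 2 == 0)       // intermediate: keep even numbers
                .map(n -> n * n)               // intermediate: square each one
                .limit(count)                  // intermediate: stop after `count` elements
                .collect(Collectors.toList()); // terminal: triggers the traversal
    }

    public static void main(String[] args) {
        System.out.println(firstEvenSquares(3)); // prints [4, 16, 36]
    }
}
```

Because the pipeline is lazy, the infinite source is not a problem here: limit() stops the traversal once enough elements have been produced.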
The source could be, to give a few examples, a Collection, an array, or even an I/O channel. These are typically consumed sequentially.
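As a rough sketch, the three kinds of source mentioned above might look like this. The class and method names are hypothetical, and a StringReader stands in for a real file or socket so that the example stays self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SourcesSketch {
    // Source: a Collection.
    static int sumFromCollection(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).sum();
    }

    // Source: an array.
    static int sumFromArray(int[] values) {
        return Arrays.stream(values).sum();
    }

    // Source: an I/O reader (here backed by a String instead of a real file).
    static List<String> linesFromReader(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return reader.lines().collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumFromCollection(List.of(1, 2, 3))); // prints 6
        System.out.println(sumFromArray(new int[] {4, 5, 6}));   // prints 15
        System.out.println(linesFromReader("first\nsecond"));    // prints [first, second]
    }
}
```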
The intermediate operations, like filter() or map(), always return a new stream and are lazy: they are executed only when the terminal operation is invoked, so the source is traversed once at the end, not once per operation.
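One way to observe this laziness is to record when each lambda actually runs. The sketch below uses a hypothetical LazinessSketch class and writes to a log from inside the lambdas; such side effects are there only to make the execution order visible and are not something to imitate in real pipelines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazinessSketch {
    // Builds a sequential pipeline and returns a log of what ran, in order.
    static List<String> run() {
        List<String> log = new ArrayList<>();

        // Building the pipeline executes neither lambda.
        Stream<Integer> pipeline = Stream.of(1, 2, 3)
                .filter(n -> { log.add("filter " + n); return n % 2 == 1; })
                .map(n -> { log.add("map " + n); return n * 10; });
        log.add("pipeline built");

        // The terminal operation triggers a single traversal of the source.
        List<Integer> result = pipeline.collect(Collectors.toList());
        log.add("result " + result);
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In this sequential run, "pipeline built" is logged before any "filter" or "map" entry, and each element passes through filter and then map before the next element is read, which is what a single traversal looks like.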
Terminal operations, such as forEach() or findFirst(), trigger the consumption of the stream. The pipeline is optimized to do as little work as possible: for example, findFirst() is short-circuiting, so it stops pulling elements from the source as soon as it has found the first one.
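The short-circuiting behaviour of findFirst() can be illustrated by counting how many elements are tested before a match is found. The class and method names below are invented for this sketch, and the counter is a deliberate side effect used only for measurement.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

class ShortCircuitSketch {
    // Returns {first even value (or -1 if none), number of elements tested}.
    static int[] firstEvenAndChecks(List<Integer> values) {
        AtomicInteger checks = new AtomicInteger();
        Optional<Integer> firstEven = values.stream()
                .filter(n -> { checks.incrementAndGet(); return n % 2 == 0; })
                .findFirst(); // stops the traversal at the first match
        return new int[] { firstEven.orElse(-1), checks.get() };
    }

    public static void main(String[] args) {
        int[] outcome = firstEvenAndChecks(List.of(1, 3, 4, 5, 6));
        System.out.println(outcome[0]); // prints 4
        System.out.println(outcome[1]); // prints 3
    }
}
```

Only 1, 3 and 4 are tested; the remaining elements 5 and 6 are never pulled from the list.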
Once the stream is consumed, it can't be used anymore, and a new one must be created starting from the source.
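A small sketch of what happens on reuse, with hypothetical names. The API documentation says an implementation may throw IllegalStateException when it detects reuse; the standard JDK streams do so, which is what this example assumes.

```java
import java.util.List;
import java.util.stream.Stream;

class ReuseSketch {
    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondUseFails(List<String> source) {
        Stream<String> stream = source.stream();
        stream.count(); // first terminal operation: consumes the stream
        try {
            stream.count(); // second terminal operation on the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // The supported approach: create a fresh stream from the source each time.
    static long countTwice(List<String> source) {
        return source.stream().count() + source.stream().count();
    }

    public static void main(String[] args) {
        System.out.println(secondUseFails(List.of("a", "b"))); // prints true
        System.out.println(countTwice(List.of("a", "b")));     // prints 4
    }
}
```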
A few key notes:
- Prefer stateless operations: a lambda that depends on state which may change while the pipeline runs can produce nondeterministic or wrong output, particularly in parallel;
- Prefer a reduce operation over a for loop to get easy access to parallelization, as long as the accumulator function is stateless and associative.
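To illustrate the last note, here is a minimal comparison of a for loop, a sequential reduce and a parallel reduce. The class and method names are made up for the example; Integer::sum is used as the accumulator because it is both stateless and associative.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ReduceSketch {
    // Imperative version: the running total is mutable state updated in order.
    static int sumWithLoop(List<Integer> values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    // Same computation as a reduction with identity 0 and an associative accumulator.
    static int sumWithReduce(List<Integer> values) {
        return values.stream().reduce(0, Integer::sum);
    }

    // Going parallel only requires switching the kind of stream.
    static int sumInParallel(List<Integer> values) {
        return values.parallelStream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> values = IntStream.rangeClosed(1, 100).boxed().collect(Collectors.toList());
        System.out.println(sumWithLoop(values));   // prints 5050
        System.out.println(sumWithReduce(values)); // prints 5050
        System.out.println(sumInParallel(values)); // prints 5050
    }
}
```

With a non-associative accumulator such as subtraction, the parallel version could combine partial results in a different grouping and return a different value than the loop, which is why the associativity condition matters.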
Source: https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/stream/package-summary.html