21. Iterators
An iterator provides sequential access to each element of some source of data.
Typical iterators allow the elements in a collection, such as a list, array, or TreeMap to be accessed one by one, but they can also provide access to data by carrying out some monadic effect, such as reading files.
Iterators provide a common interface to all of these operations.
Code that is written to the iterator API can be agnostic as to the source of the data.
Each iterator maintains an internal state that enables it to determine the next value. Because Lean is a pure functional language, consuming an iterator does not invalidate it, but instead copies it with an updated state. As usual, reference counting is used to optimize programs that use values only once into programs that destructively modify values.
To use iterators, import Std.Data.Iterators.
Mixing Collections
Combining a list and an array using List.zip or Array.zip would ordinarily require converting one of them into the other collection.
Using iterators, they can be processed without conversion:
def colors : Array String := #["purple", "gray", "blue"]
def codes : List String := ["aa27d1", "a0a0a0", "0000c5"]
#eval colors.iter.zip codes.iter |>.toArray
Avoiding Intermediate Structures
In this example, an array of colors and a list of color codes are combined. The program separates three intermediate stages:
-
The names and codes are combined into pairs.
-
The pairs are transformed into readable strings.
-
The strings are combined with newlines.
def colors : Array String := #["purple", "gray", "blue"]
def codes : List String := ["aa27d1", "a0a0a0", "0000c5"]
def go : IO Unit := do
let colorCodes := colors.iter.zip codes.iter
let colorCodes := colorCodes.map fun (name, code) =>
s!"{name} ↦ #{code}"
let colorCodes := colorCodes.fold (init := "") fun x y =>
if x.isEmpty then y else x ++ "\n" ++ y
IO.println colorCodes
#eval go
The intermediate stages of the computation do not allocate new data structures.
Instead, all the steps of the transformation are fused into a single loop, with Iter.fold carrying out one step at a time.
In each step, a single color and color code are combined into a pair, rewritten to a string, and added to the result string.
The Lean standard library provides three kinds of iterator operations.
Producers create a new iterator from some source of data.
They determine which data is to be returned by an iterator, and how this data is to be computed, but they are not in control of when the computations occur.
Consumers use the data in an iterator for some purpose.
Consumers request the iterator's data, and the iterator computes only enough data to satisfy a consumer's requests.
Combinators are both consumers and producers: they create new iterators from existing iterators.
Examples include Iter.map and Iter.filter.
The resulting iterators produce data by consuming their underlying iterators, and do not actually iterate over the underlying collection until they themselves are consumed.
Each built-in collection for which it makes sense to do so can be iterated over.
In other words, the collection libraries include iterator producers.
By convention, a collection type Coll provides a function Coll.iter that returns an iterator over the elements of a collection.
Examples include List.iter, Array.iter, and TreeMap.iter.
Additionally, other built-in types such as ranges support iteration using the same convention.