Scala - Iterators with Examples
Key Insights
• Iterators provide memory-efficient traversal of collections by computing elements on-demand rather than storing entire sequences in memory
• Once consumed, Scala iterators cannot be reused—calling methods like toList or foreach exhausts the iterator permanently
• Iterator transformations are lazy and chainable, enabling powerful pipeline operations without intermediate collection allocations
Understanding Iterators in Scala
Iterators represent a fundamental abstraction for sequential access to collection elements. Unlike collections that hold all elements in memory, an iterator produces values on demand through next() and reports through hasNext whether any elements remain. This lazy evaluation model makes iterators well suited to large datasets, infinite sequences, and expensive computations.
val numbers = Iterator(1, 2, 3, 4, 5)
while (numbers.hasNext) {
  println(numbers.next())
}
// Attempting to reuse exhausted iterator
println(numbers.hasNext) // false - iterator is consumed
The critical characteristic: iterators are stateful and mutable. Each call to next() advances the internal cursor, making the previous element inaccessible.
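When code needs to look at the upcoming element without losing it, one option is to wrap the iterator with buffered, which adds a head method. The following is a minimal sketch of that idea; the value names are illustrative only.

```scala
// Wrap the iterator so the next element can be inspected without consuming it.
val it = Iterator(10, 20, 30).buffered

val peeked = it.head    // looks at 10; the cursor does not move
val first = it.next()   // consumes 10
val second = it.next()  // consumes 20; 10 is no longer reachable

// To revisit elements later, materialize what is left into a strict collection.
val rest = it.toList    // List(30)
```

Buffering only exposes a single element of lookahead. Anything that needs to go further back has to be stored in a collection.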
Creating Iterators
Scala provides multiple approaches to iterator construction beyond simple collection conversion.
// From collections
val listIter = List(1, 2, 3).iterator
val arrayIter = Array("a", "b", "c").iterator
// From ranges
val rangeIter = (1 to 100).iterator
// Infinite iterators
val infiniteOnes = Iterator.continually(1)
val naturalNumbers = Iterator.from(0)
val fibonacciStream = Iterator.iterate((0, 1)) {
  case (a, b) => (b, a + b)
}.map(_._1)
// Custom iterators
val customIter = new Iterator[Int] {
  private var current = 0
  def hasNext: Boolean = current < 5
  def next(): Int = {
    if (!hasNext) throw new NoSuchElementException
    current += 1
    current
  }
}
// Take first 10 Fibonacci numbers
println(fibonacciStream.take(10).toList)
// Output: List(0, 1, 1, 2, 3, 5, 8, 13, 21, 34)
The Iterator.continually and Iterator.from methods create unbounded sequences, while iterate generates values by repeatedly applying a function—perfect for recursive sequences.
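Because these sources never end, they must be bounded with take or takeWhile before a terminal operation runs. The sketch below illustrates each factory on a small scale; the value names are made up for the example.

```scala
// iterate: each element is the function applied to the previous one, starting from the seed.
val powersOfTwo = Iterator.iterate(1)(_ * 2)

// The source is unbounded, so bound it before forcing evaluation.
val below100 = powersOfTwo.takeWhile(_ < 100).toList
// List(1, 2, 4, 8, 16, 32, 64)

// continually: re-evaluates its by-name argument on every call to next().
var counter = 0
val ticks = Iterator.continually { counter += 1; counter }
val firstThree = ticks.take(3).toList
// List(1, 2, 3)

// from with a step: 0, 5, 10, ...
val multiplesOfFive = Iterator.from(0, 5).take(4).toList
// List(0, 5, 10, 15)
```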
Transformation Operations
Iterator transformations return new iterators without evaluating elements immediately, so a chain of operations does its work in a single pass when the result is finally consumed.
val data = Iterator(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
val result = data
  .filter(_ % 2 == 0) // Keep even numbers
  .map(_ * 10)        // Multiply by 10
  .drop(1)            // Skip first element
  .take(2)            // Take next 2 elements
println(result.toList) // List(40, 60)
Each transformation creates a wrapper iterator. No computation occurs until terminal operations like toList, foreach, or reduce force evaluation.
// Demonstrate laziness
val lazyIter = Iterator(1, 2, 3, 4, 5)
  .map { x =>
    println(s"Mapping $x")
    x * 2
  }
println("Iterator created, no output yet")
lazyIter.take(2).foreach(println)
// Output:
// Iterator created, no output yet
// Mapping 1
// 2
// Mapping 2
// 4
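Laziness also means that short-circuiting operations stop pulling from the source as soon as they have an answer, which is what makes them safe on unbounded iterators. A small sketch, using a hypothetical counter to record how many elements were actually computed:

```scala
var evaluated = 0

// An unbounded source; squares are computed only as they are requested.
val squares = Iterator.from(1).map { n =>
  evaluated += 1
  n * n
}

// find stops at the first match, so the rest of the source is never touched.
val firstOver50 = squares.find(_ > 50)
// Some(64), reached after computing 1, 4, 9, 16, 25, 36, 49, 64
```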
Grouping and Partitioning
Iterators can be split with duplicate, chunked into fixed-size groups with grouped, and scanned in overlapping windows with sliding.
val numbers = Iterator(1, 2, 3, 4, 5, 6, 7, 8, 9)
// Partition into two iterators
val (evens, odds) = numbers.duplicate match {
  case (it1, it2) => (it1.filter(_ % 2 == 0), it2.filter(_ % 2 != 0))
}
println(s"Evens: ${evens.toList}")
println(s"Odds: ${odds.toList}")
// Group consecutive elements
val grouped = Iterator(1, 2, 3, 4, 5, 6).grouped(2)
grouped.foreach(group => println(group.toList))
// Output:
// List(1, 2)
// List(3, 4)
// List(5, 6)
// Sliding windows
val sliding = Iterator(1, 2, 3, 4, 5).sliding(3)
sliding.foreach(window => println(window.toList))
// Output:
// List(1, 2, 3)
// List(2, 3, 4)
// List(3, 4, 5)
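The duplicate method creates two iterators over the same source that can be advanced independently, which is useful when you need two passes over the same data. Elements consumed by one iterator but not yet by the other are buffered, so memory use grows if one runs far ahead, and the original iterator should not be used after the call.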
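For the even/odd split specifically, Iterator also offers a partition method that takes the predicate directly. The sketch below shows it alongside grouped on an input whose length is not a multiple of the group size; the value names are illustrative.

```scala
// partition splits by predicate in one call; like duplicate, it may buffer
// elements that one side has passed over but the other has not yet read.
val (evens, odds) = Iterator(1, 2, 3, 4, 5, 6, 7, 8, 9).partition(_ % 2 == 0)

val evenList = evens.toList // List(2, 4, 6, 8)
val oddList = odds.toList   // List(1, 3, 5, 7, 9)

// grouped keeps a shorter final group when the input does not divide evenly.
val chunks = Iterator(1, 2, 3, 4, 5).grouped(2).map(_.toList).toList
// List(List(1, 2), List(3, 4), List(5))
```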
Aggregation and Reduction
Terminal operations consume iterators while producing final results.
val numbers = Iterator(1, 2, 3, 4, 5)
// Basic aggregations
println(numbers.sum) // 15
val numbers2 = Iterator(1, 2, 3, 4, 5)
println(numbers2.product) // 120
val numbers3 = Iterator(1, 2, 3, 4, 5)
println(numbers3.max) // 5
// Fold operations
val numbers4 = Iterator(1, 2, 3, 4, 5)
val sum = numbers4.fold(0)(_ + _)
println(sum) // 15
// Reduce with operation tracking
val numbers5 = Iterator(1, 2, 3, 4, 5)
val factorial = numbers5.reduce { (acc, n) =>
  println(s"$acc * $n")
  acc * n
}
// Output shows intermediate steps
// 1 * 2
// 2 * 3
// 6 * 4
// 24 * 5
println(factorial) // 120
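Remember: these operations exhaust the iterator. Reusing it afterward does not replay the elements: toList typically returns an empty list and sum returns 0, while reduce and max throw UnsupportedOperationException on an empty iterator.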
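When several aggregates are needed from the same iterator, one way to avoid creating a fresh iterator for each is to compute them together in a single foldLeft. This is a sketch only; the Stats case class is a hypothetical holder introduced for the example.

```scala
// Hypothetical holder for the three aggregates computed together.
final case class Stats(sum: Int, product: Int, max: Int)

val numbers = Iterator(1, 2, 3, 4, 5)

// One foldLeft pass updates every aggregate, so the iterator is consumed once.
val stats = numbers.foldLeft(Stats(0, 1, Int.MinValue)) { (acc, n) =>
  Stats(acc.sum + n, acc.product * n, math.max(acc.max, n))
}
// Stats(15, 120, 5)

// The iterator has nothing left to give.
val exhausted = !numbers.hasNext // true
```

If the data is small, calling toList once and aggregating over the resulting list is simpler and allows any number of passes.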
Combining Iterators
Merge multiple iterators using concatenation, zipping, or flattening.
// Concatenation
val iter1 = Iterator(1, 2, 3)
val iter2 = Iterator(4, 5, 6)
val combined = iter1 ++ iter2
println(combined.toList) // List(1, 2, 3, 4, 5, 6)
// Zipping - pairs elements
val letters = Iterator("a", "b", "c")
val numbers = Iterator(1, 2, 3, 4)
val zipped = letters.zip(numbers)
println(zipped.toList) // List((a,1), (b,2), (c,3))
// Zip with index
val indexed = Iterator("x", "y", "z").zipWithIndex
println(indexed.toList) // List((x,0), (y,1), (z,2))
// Flatten nested iterators
val nested = Iterator(Iterator(1, 2), Iterator(3, 4), Iterator(5))
val flattened = nested.flatten
println(flattened.toList) // List(1, 2, 3, 4, 5)
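Note that zip stops at the shorter input, which is why the 4 in the zipping example above never appears in the output. If unmatched elements should be kept, zipAll pads the shorter side with a default. A minimal sketch, with "?" and 0 chosen arbitrarily as padding values:

```scala
val letters = Iterator("a", "b", "c")
val numbers = Iterator(1, 2, 3, 4)

// zipAll pads the shorter side: "?" stands in for missing letters,
// and 0 would stand in for missing numbers.
val padded = letters.zipAll(numbers, "?", 0).toList
// List((a,1), (b,2), (c,3), (?,4))
```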
Practical Use Cases
Processing Large Files
import scala.io.Source
def processLargeFile(filename: String): Unit = {
  val source = Source.fromFile(filename)
  try {
    val lines = source.getLines() // Returns iterator
    val result = lines
      .filter(_.nonEmpty)
      .map(_.trim)
      .filter(_.startsWith("ERROR"))
      .take(100)
      .toList
    println(s"Found ${result.length} error lines")
  } finally {
    source.close()
  }
}
This approach processes files line-by-line without loading entire contents into memory.
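On Scala 2.13 or later, the try/finally pattern can also be expressed with scala.util.Using, which closes the source automatically. The sketch below assumes that version; errorLines and its limit parameter are hypothetical names. The key point is that the lazy pipeline must be materialized inside the block, because the iterator cannot read from a closed source.

```scala
import scala.io.Source
import scala.util.Using

// Hypothetical helper: returns up to `limit` trimmed lines that start with "ERROR".
def errorLines(filename: String, limit: Int): List[String] =
  Using.resource(Source.fromFile(filename)) { source =>
    source.getLines()
      .map(_.trim)
      .filter(_.startsWith("ERROR"))
      .take(limit)
      .toList // materialize before the source is closed
  }
```

Returning the iterator itself from the block would hand the caller a pipeline over an already closed file.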
Generating Test Data
case class User(id: Int, name: String, score: Int)
val testUsers = Iterator.from(1)
  .map { id =>
    User(id, s"user_$id", scala.util.Random.nextInt(100))
  }
  .take(1000)
// Process in batches
testUsers.grouped(100).foreach { batch =>
  println(s"Processing batch of ${batch.length} users")
  // Simulate database insert
}
Stream Processing Pipeline
def processDataStream(source: Iterator[String]): Map[String, Int] = {
  source
    .flatMap(_.split("\\s+"))
    .map(_.toLowerCase.replaceAll("[^a-z]", ""))
    .filter(_.nonEmpty)
    .foldLeft(Map.empty[String, Int]) { (acc, word) =>
      acc + (word -> (acc.getOrElse(word, 0) + 1))
    }
}
val text = Iterator(
  "Scala iterators are powerful",
  "Iterators enable lazy evaluation",
  "Powerful tools for data processing"
)
val wordCount = processDataStream(text)
println(wordCount)
// Map(scala -> 1, iterators -> 2, are -> 1, powerful -> 2, ...) (entry order may vary)
Performance Considerations
Iterators excel when memory efficiency matters more than random access. They avoid intermediate collection allocations during transformations.
// Memory-efficient: processes one element at a time
val efficientSum = (1 to 1000000).iterator
  .filter(_ % 2 == 0)
  .map(_ * 2)
  .take(1000)
  .sum
// Memory-intensive: creates intermediate collections
val inefficientSum = (1 to 1000000)
  .filter(_ % 2 == 0) // Creates new collection
  .map(_ * 2)         // Creates another collection
  .take(1000)         // Creates yet another
  .sum
However, iterators impose overhead for each hasNext/next call. For small datasets or when multiple traversals are needed, use collections directly. Choose iterators when processing streams, large files, or infinite sequences where lazy evaluation provides clear benefits.
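One way to see the difference between the two pipelines above is to count how often the filter predicate runs. The sketch below adds hypothetical counters for that purpose; it measures work done, not wall-clock time, and is not a substitute for a real benchmark.

```scala
var lazyChecks = 0
var strictChecks = 0

// Iterator pipeline: the predicate runs only until take(1000) is satisfied.
val lazySum = (1 to 1000000).iterator
  .filter { n => lazyChecks += 1; n % 2 == 0 }
  .map(_ * 2)
  .take(1000)
  .sum

// Strict pipeline: filter visits every element of the range before take runs.
val strictSum = (1 to 1000000)
  .filter { n => strictChecks += 1; n % 2 == 0 }
  .map(_ * 2)
  .take(1000)
  .sum

println(s"lazy: $lazyChecks predicate calls, strict: $strictChecks predicate calls")
```

Both pipelines produce the same sum, but the iterator version stops testing elements once the thousandth even number has been found, while the strict version tests the whole range.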