Scala - Iterators with Examples | Application Architect

Key Insights

• Iterators provide memory-efficient traversal of collections by computing elements on demand rather than storing entire sequences in memory
• Once consumed, Scala iterators cannot be reused: calling methods like toList or foreach exhausts the iterator permanently
• Iterator transformations are lazy and chainable, enabling pipeline operations without intermediate collection allocations

Understanding Iterators in Scala

Iterators represent a fundamental abstraction for sequential access to collection elements. Unlike collections that hold all elements in memory, iterators produce values on demand: hasNext reports whether another element is available, and next() returns that element and advances the iterator. This lazy evaluation model makes iterators well suited to large datasets, infinite sequences, and expensive computations.

val numbers = Iterator(1, 2, 3, 4, 5)

while (numbers.hasNext) {
  println(numbers.next())
}

// Attempting to reuse exhausted iterator
println(numbers.hasNext) // false - iterator is consumed

The critical characteristic: iterators are stateful and mutable. Each call to next() advances the internal cursor, making the previous element inaccessible.
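Because the cursor only moves forward, looking at the next element without consuming it needs a wrapper. The following is a minimal sketch using the standard library's buffered method, which returns a BufferedIterator whose head inspects the upcoming element; the value names are illustrative only.

```scala
import scala.util.Try

val peekable = Iterator(10, 20, 30).buffered

val peeked = peekable.head    // inspects 10 without advancing
val first  = peekable.next()  // consumes 10
val second = peekable.next()  // consumes 20

// Draining the remainder leaves the iterator exhausted
val rest = peekable.toList
val exhausted = !peekable.hasNext

// Calling next() on an exhausted iterator throws NoSuchElementException
val nextAfterEnd = Try(peekable.next())
```

Here peeked and first are both 10, which shows that head did not move the cursor, while each next() call did.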

Creating Iterators

Scala provides multiple approaches to iterator construction beyond simple collection conversion.

// From collections
val listIter = List(1, 2, 3).iterator
val arrayIter = Array("a", "b", "c").iterator

// From ranges
val rangeIter = (1 to 100).iterator

// Infinite iterators
val infiniteOnes = Iterator.continually(1)
val naturalNumbers = Iterator.from(0)
val fibonacciStream = Iterator.iterate((0, 1)) { 
  case (a, b) => (b, a + b) 
}.map(_._1)

// Custom iterators
val customIter = new Iterator[Int] {
  private var current = 0
  def hasNext: Boolean = current < 5
  def next(): Int = {
    if (!hasNext) throw new NoSuchElementException
    current += 1
    current
  }
}

// Take first 10 Fibonacci numbers
println(fibonacciStream.take(10).toList)
// Output: List(0, 1, 1, 2, 3, 5, 8, 13, 21, 34)

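Iterator.continually, Iterator.from, and Iterator.iterate all create unbounded sequences, so bound them with take or takeWhile before calling a terminal operation. iterate produces each value by applying a function to the previous one, which suits recurrences such as the Fibonacci sequence above.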
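As a rough illustration of bounding these unbounded constructors, the sketch below uses takeWhile and take, and adds Iterator.unfold, which is not mentioned above and is assumed to be available (Scala 2.13 or later). The value names are made up for the example.

```scala
// Powers of two, bounded with takeWhile so the infinite iterator terminates
val powersOfTwo = Iterator.iterate(1)(_ * 2).takeWhile(_ <= 100).toList
// List(1, 2, 4, 8, 16, 32, 64)

// Iterator.continually re-evaluates its argument on every call to next()
var counter = 0
val ticks = Iterator.continually { counter += 1; counter }.take(3).toList
// List(1, 2, 3)

// Iterator.unfold (assumed Scala 2.13+) threads explicit state and ends on None
val countdown = Iterator.unfold(3) { n =>
  if (n > 0) Some((n, n - 1)) else None
}.toList
// List(3, 2, 1)
```

unfold can be a convenient alternative to a hand-written Iterator subclass when the sequence is finite and driven by a single piece of state.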

Transformation Operations

Iterator transformations return new iterators without evaluating elements immediately. This lazy behavior chains operations efficiently.

val data = Iterator(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

val result = data
  .filter(_ % 2 == 0)           // Keep even numbers
  .map(_ * 10)                   // Multiply by 10
  .drop(1)                       // Skip first element
  .take(2)                       // Take next 2 elements

println(result.toList) // List(40, 60)

Each transformation creates a wrapper iterator. No computation occurs until terminal operations like toList, foreach, or reduce force evaluation.

// Demonstrate laziness
val lazyIter = Iterator(1, 2, 3, 4, 5)
  .map { x =>
    println(s"Mapping $x")
    x * 2
  }

println("Iterator created, no output yet")

lazyIter.take(2).foreach(println)
// Output:
// Iterator created, no output yet
// Mapping 1
// 2
// Mapping 2
// 4
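The same laziness applies across several chained stages: each element travels through the whole pipeline before the next one is pulled. The sketch below records the call order in a buffer instead of printing it; the log buffer and value names are illustrative.

```scala
import scala.collection.mutable.ListBuffer

val log = ListBuffer.empty[String]

val pipeline = Iterator(1, 2, 3)
  .filter { x => log += s"filter $x"; x % 2 == 1 }
  .map { x => log += s"map $x"; x * 10 }

val before = log.toList      // empty: nothing runs before a terminal operation
val output = pipeline.toList // List(10, 30)
val order = log.toList
// List(filter 1, map 1, filter 2, filter 3, map 3)
```

Element 1 is filtered and mapped before element 2 is even inspected, which is why no intermediate collection is needed between stages.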

Grouping and Partitioning

Iterators support sophisticated grouping operations for data segmentation.

val numbers = Iterator(1, 2, 3, 4, 5, 6, 7, 8, 9)

// Partition into two iterators
val (evens, odds) = numbers.duplicate match {
  case (it1, it2) => (it1.filter(_ % 2 == 0), it2.filter(_ % 2 != 0))
}

println(s"Evens: ${evens.toList}") // Evens: List(2, 4, 6, 8)
println(s"Odds: ${odds.toList}")   // Odds: List(1, 3, 5, 7, 9)

// Group consecutive elements
val grouped = Iterator(1, 2, 3, 4, 5, 6).grouped(2)
grouped.foreach(group => println(group.toList))
// Output:
// List(1, 2)
// List(3, 4)
// List(5, 6)

// Sliding windows
val sliding = Iterator(1, 2, 3, 4, 5).sliding(3)
sliding.foreach(window => println(window.toList))
// Output:
// List(1, 2, 3)
// List(2, 3, 4)
// List(3, 4, 5)

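The duplicate method creates two independent iterators from one source, which is useful when you need two passes over the same data. Elements consumed by one copy but not yet by the other are buffered internally, so draining one copy completely before touching the other holds the entire sequence in memory. The original iterator should not be used after calling duplicate.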
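A small sketch of both ideas: a two-pass computation with duplicate, and the standard library's partition method as a shorter way to split by a predicate. The value names are illustrative.

```scala
// Two passes over one source: sum from one copy, count from the other
val (forSum, forCount) = Iterator(2, 4, 6, 8).duplicate
val total = forSum.sum            // 20; elements are buffered for forCount
val count = forCount.size         // 4
val mean = total.toDouble / count // 5.0

// partition splits by predicate in a single call
val (small, large) = Iterator(1, 2, 3, 4, 5, 6).partition(_ <= 3)
val smallList = small.toList // List(1, 2, 3)
val largeList = large.toList // List(4, 5, 6)
```

For inputs that do not fit in memory, a single foldLeft that tracks both the sum and the count avoids the buffering that duplicate requires.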

Aggregation and Reduction

Terminal operations consume iterators while producing final results.

val numbers = Iterator(1, 2, 3, 4, 5)

// Basic aggregations
println(numbers.sum)        // 15

val numbers2 = Iterator(1, 2, 3, 4, 5)
println(numbers2.product)   // 120

val numbers3 = Iterator(1, 2, 3, 4, 5)
println(numbers3.max)       // 5

// Fold operations
val numbers4 = Iterator(1, 2, 3, 4, 5)
val sum = numbers4.fold(0)(_ + _)
println(sum) // 15

// Reduce with operation tracking
val numbers5 = Iterator(1, 2, 3, 4, 5)
val factorial = numbers5.reduce { (acc, n) =>
  println(s"$acc * $n")
  acc * n
}
// Output shows intermediate steps
// 1 * 2
// 2 * 3
// 6 * 4
// 24 * 5
println(factorial) // 120

Remember: these operations exhaust the iterator. Afterward hasNext returns false, so toList yields an empty list and sum yields 0, while next(), reduce, and max throw an exception.
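One way to avoid creating a fresh iterator for every aggregate is to materialize the data once. The sketch below shows that, along with the Option-returning variants for possibly empty input; maxOption is assumed to be available (Scala 2.13 or later), and the value names are illustrative.

```scala
val once = Iterator(1, 2, 3, 4, 5)
val firstSum = once.sum             // 15, and the iterator is now exhausted
val stillHasElements = once.hasNext // false

// Materialize once when several aggregates are needed
val values = Iterator(1, 2, 3, 4, 5).toVector
val stats = (values.sum, values.product, values.max) // (15, 120, 5)

// reduce and max throw on empty input; the Option variants return None
val emptyReduce = Iterator.empty[Int].reduceOption(_ + _)
val emptyMax = Iterator.empty[Int].maxOption // assumed Scala 2.13+
```

Materializing trades memory for convenience, so it only fits data that is known to be small enough to hold at once.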

Combining Iterators

Merge multiple iterators using concatenation, zipping, or flattening.

// Concatenation
val iter1 = Iterator(1, 2, 3)
val iter2 = Iterator(4, 5, 6)
val combined = iter1 ++ iter2
println(combined.toList) // List(1, 2, 3, 4, 5, 6)

// Zipping - pairs elements and stops at the shorter iterator
val letters = Iterator("a", "b", "c")
val numbers = Iterator(1, 2, 3, 4)
val zipped = letters.zip(numbers)
println(zipped.toList) // List((a,1), (b,2), (c,3))

// Zip with index
val indexed = Iterator("x", "y", "z").zipWithIndex
println(indexed.toList) // List((x,0), (y,1), (z,2))

// Flatten nested iterators
val nested = Iterator(Iterator(1, 2), Iterator(3, 4), Iterator(5))
val flattened = nested.flatten
println(flattened.toList) // List(1, 2, 3, 4, 5)
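Two further combinations can be sketched with the same building blocks: zipAll, which pads the shorter side instead of truncating, and a simple round-robin interleave built from zip and flatMap. The interleave shown here assumes both inputs have the same length, since zip drops any surplus elements.

```scala
// zipAll pads the shorter side with the supplied defaults
val padded = Iterator("a", "b").zipAll(Iterator(1, 2, 3), "?", 0).toList
// List((a,1), (b,2), (?,3))

// Round-robin interleaving of two equal-length iterators
val interleaved = Iterator(1, 3, 5)
  .zip(Iterator(2, 4, 6))
  .flatMap { case (x, y) => Iterator(x, y) }
  .toList
// List(1, 2, 3, 4, 5, 6)
```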

Practical Use Cases

Processing Large Files

import scala.io.Source

def processLargeFile(filename: String): Unit = {
  val source = Source.fromFile(filename)
  try {
    val lines = source.getLines() // Returns iterator
    
    val result = lines
      .filter(_.nonEmpty)
      .map(_.trim)
      .filter(_.startsWith("ERROR"))
      .take(100)
      .toList
    
    println(s"Found ${result.length} error lines")
  } finally {
    source.close()
  }
}

This approach reads the file line by line without loading its entire contents into memory, and take(100) stops reading once 100 matching lines have been found.
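Because getLines() is lazy, the result has to be materialized before the source is closed. A possible variant, assuming Scala 2.13 or later for scala.util.Using, is sketched below; errorLines and its limit parameter are hypothetical names introduced for this example.

```scala
import java.nio.file.{Files, Path}
import scala.io.Source
import scala.util.Using

// Hypothetical helper: up to `limit` trimmed lines that start with "ERROR"
def errorLines(path: Path, limit: Int): List[String] =
  Using.resource(Source.fromFile(path.toFile)) { source =>
    source.getLines()
      .map(_.trim)
      .filter(_.startsWith("ERROR"))
      .take(limit)
      .toList // materialize while the source is still open
  }

// Usage against a temporary file
val tmp = Files.createTempFile("iterator-demo", ".log")
Files.write(tmp, "INFO start\nERROR disk full\n\n  ERROR timeout\n".getBytes("UTF-8"))
val found = errorLines(tmp, 10)
Files.delete(tmp)
```

Returning the lazy iterator itself from inside the Using block would hand the caller an iterator over a closed source, so the toList call is the important part of the sketch.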

Generating Test Data

case class User(id: Int, name: String, score: Int)

val testUsers = Iterator.from(1)
  .map { id =>
    User(id, s"user_$id", scala.util.Random.nextInt(100))
  }
  .take(1000)

// Process in batches
testUsers.grouped(100).foreach { batch =>
  println(s"Processing batch of ${batch.length} users")
  // Simulate database insert
}
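When the element count is not a multiple of the batch size, grouped emits a shorter final batch. The sketch below shows that behavior and the withPartial(false) setting on the grouped iterator, which drops the incomplete trailing batch; the value names are illustrative.

```scala
// 250 elements in batches of 100: the last batch holds the remainder
val batchSizes = Iterator.from(1).take(250).grouped(100).map(_.size).toList
// List(100, 100, 50)

// withPartial(false) discards the incomplete trailing batch
val fullBatchesOnly = Iterator.from(1).take(250)
  .grouped(100)
  .withPartial(false)
  .map(_.size)
  .toList
// List(100, 100)
```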

Stream Processing Pipeline

def processDataStream(source: Iterator[String]): Map[String, Int] = {
  source
    .flatMap(_.split("\\s+"))
    .map(_.toLowerCase.replaceAll("[^a-z]", ""))
    .filter(_.nonEmpty)
    .foldLeft(Map.empty[String, Int]) { (acc, word) =>
      acc + (word -> (acc.getOrElse(word, 0) + 1))
    }
}

val text = Iterator(
  "Scala iterators are powerful",
  "Iterators enable lazy evaluation",
  "Powerful tools for data processing"
)

val wordCount = processDataStream(text)
println(wordCount)
// Map(scala -> 1, iterators -> 2, are -> 1, powerful -> 2, ...) (entry order may vary)

Performance Considerations

Iterators excel when memory efficiency matters more than random access. They avoid intermediate collection allocations during transformations.

// Memory-efficient: processes one element at a time
val efficientSum = (1 to 1000000).iterator
  .filter(_ % 2 == 0)
  .map(_ * 2)
  .take(1000)
  .sum

// Memory-intensive: creates intermediate collections
val inefficientSum = (1 to 1000000)
  .filter(_ % 2 == 0)  // Creates new collection
  .map(_ * 2)          // Creates another collection
  .take(1000)          // Creates yet another
  .sum

However, iterators impose overhead for each hasNext/next call. For small datasets or when multiple traversals are needed, use collections directly. Choose iterators when processing streams, large files, or infinite sequences where lazy evaluation provides clear benefits.
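When lazy evaluation and multiple traversals are both needed, a view is one option worth considering. It is not covered above, so treat the following as a sketch that assumes Scala 2.13 or later: a view defers the same pipeline but can be traversed more than once, recomputing the elements each time.

```scala
val lazyView = (1 to 1000000).view
  .filter(_ % 2 == 0)
  .map(_ * 2)
  .take(1000)

val viewSum = lazyView.sum // 2002000
val viewMax = lazyView.max // 4000; the second traversal reruns the pipeline
```

The recomputation on each traversal means a view suits cheap transformations, while an expensive pipeline that is read repeatedly is usually better materialized once.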