Scala - partition, span, splitAt

Scala provides three distinct methods for dividing collections: `partition`, `span`, and `splitAt`. Each serves different use cases and has different performance characteristics. Choosing the wrong...

Key Insights

  • partition splits collections based on a predicate into two groups (true/false), span splits at the first predicate failure while maintaining order, and splitAt divides at a specific index position
  • span differs critically from partition by stopping evaluation at the first false predicate, making it O(n) best case versus partition’s always O(n) traversal
  • Understanding these methods prevents unnecessary collection iterations and enables more expressive functional transformations in production Scala code

Understanding Collection Splitting Methods

Scala provides three distinct methods for dividing collections: partition, span, and splitAt. Each serves different use cases and has different performance characteristics. Choosing the wrong method leads to unnecessary iterations or unclear intent.

All three methods return a tuple of two collections (A, B), but the logic determining what goes into each collection differs fundamentally.

partition: Split by Predicate

partition divides a collection into two groups based on a predicate function. Elements satisfying the predicate go into the first collection; those failing go into the second.

val numbers = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
val (evens, odds) = numbers.partition(_ % 2 == 0)

println(evens)  // List(2, 4, 6, 8, 10)
println(odds)   // List(1, 3, 5, 7, 9)

The predicate is evaluated for every element. Order is preserved within each resulting collection.

case class Transaction(id: String, amount: Double, status: String)

val transactions = List(
  Transaction("T1", 100.0, "completed"),
  Transaction("T2", 250.0, "pending"),
  Transaction("T3", 75.0, "completed"),
  Transaction("T4", 500.0, "failed"),
  Transaction("T5", 150.0, "completed")
)

val (completed, notCompleted) = transactions.partition(_.status == "completed")

println(completed.map(_.id))     // List(T1, T3, T5)
println(notCompleted.map(_.id))  // List(T2, T4)

Performance Characteristics

partition always traverses the entire collection. Time complexity is O(n). This is necessary because elements satisfying the predicate can appear anywhere in the collection.

val largeList = (1 to 1000000).toList
val (positive, negative) = largeList.partition(_ > 0)
// Evaluates predicate 1,000,000 times

span: Split at First Failure

span takes a predicate and returns two collections: the longest prefix where all elements satisfy the predicate, and the remainder. Crucially, it stops evaluating at the first predicate failure.

val numbers = List(2, 4, 6, 7, 8, 10, 12)
val (allEven, rest) = numbers.span(_ % 2 == 0)

println(allEven)  // List(2, 4, 6)
println(rest)     // List(7, 8, 10, 12)

Notice that 8, 10, and 12 are even, but they appear in the second collection because span stopped at 7.

val logs = List(
  "INFO: System started",
  "INFO: Loading configuration",
  "INFO: Database connected",
  "ERROR: Connection timeout",
  "INFO: Retrying connection",
  "WARN: High memory usage"
)

val (infoLogs, otherLogs) = logs.span(_.startsWith("INFO"))

println(infoLogs.size)   // 3
println(otherLogs.size)  // 3

When to Use span

Use span when you need elements until a condition breaks. Common scenarios include:

// Processing sorted data until threshold
val sortedPrices = List(10.0, 15.0, 20.0, 25.0, 30.0, 35.0)
val (affordable, expensive) = sortedPrices.span(_ <= 25.0)
// affordable: List(10.0, 15.0, 20.0, 25.0)
// expensive: List(30.0, 35.0)

// Reading file until separator
val lines = List("header1", "header2", "---", "data1", "data2")
val (headers, rest) = lines.span(_ != "---")
// headers: List("header1", "header2")
// rest: List("---", "data1", "data2")

Performance Advantage

span can short-circuit. If the first element fails the predicate, it returns immediately.

val numbers = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
val (evens, rest) = numbers.span(_ % 2 == 0)
// Evaluates predicate once, returns (List(), List(1, 2, 3, ...))

Best case: O(1). Worst case: O(n) when all elements satisfy the predicate.

splitAt: Split by Index

splitAt divides a collection at a specific index position. No predicate evaluation occurs.

val numbers = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
val (first, second) = numbers.splitAt(5)

println(first)   // List(1, 2, 3, 4, 5)
println(second)  // List(6, 7, 8, 9, 10)

The first collection contains elements from index 0 to n-1. The second contains elements from index n onward.

// Splitting at index 0
val (empty, all) = numbers.splitAt(0)
// empty: List()
// all: List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

// Splitting beyond collection size
val (all2, empty2) = numbers.splitAt(100)
// all2: List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
// empty2: List()

Practical Applications

// Pagination
def paginate[A](items: List[A], pageSize: Int, page: Int): List[A] = {
  val (_, rest) = items.splitAt(pageSize * (page - 1))
  val (pageItems, _) = rest.splitAt(pageSize)
  pageItems
}

val items = (1 to 100).toList
println(paginate(items, 10, 3))  // List(21, 22, ..., 30)

// Batch processing
def processBatches[A](items: List[A], batchSize: Int): Unit = {
  def process(remaining: List[A]): Unit = {
    if (remaining.isEmpty) return
    val (batch, rest) = remaining.splitAt(batchSize)
    println(s"Processing batch: $batch")
    process(rest)
  }
  process(items)
}

processBatches((1 to 7).toList, 3)
// Processing batch: List(1, 2, 3)
// Processing batch: List(4, 5, 6)
// Processing batch: List(7)

Choosing the Right Method

Decision matrix for collection splitting:

// Use partition: Need all elements satisfying a condition
val data = List(1, -2, 3, -4, 5, -6, 7)
val (positive, negative) = data.partition(_ > 0)

// Use span: Need prefix until condition fails
val sortedData = List(1, 2, 3, 4, 5, 6, 7)
val (lessThanFive, rest) = sortedData.span(_ < 5)

// Use splitAt: Need fixed position split
val (firstHalf, secondHalf) = data.splitAt(data.length / 2)

Combining Methods

These methods compose well for complex transformations:

case class Event(timestamp: Long, eventType: String, data: String)

val events = List(
  Event(1000, "login", "user1"),
  Event(1100, "login", "user2"),
  Event(1200, "purchase", "item1"),
  Event(1300, "logout", "user1"),
  Event(1400, "login", "user3"),
  Event(1500, "purchase", "item2")
)

// Get first 3 logins, then separate purchases from other events
val (first3, rest) = events.span(_.eventType == "login")
val (purchases, others) = rest.partition(_.eventType == "purchase")

println(first3.size)    // 2 (stops at first non-login)
println(purchases.size) // 2
println(others.size)    // 2

Working with Streams and LazyLists

These methods work with lazy collections, but behavior differs:

val stream = LazyList.from(1)

// span on infinite stream - safe if predicate fails early
val (small, large) = stream.span(_ < 10)
println(small.toList)  // List(1, 2, 3, 4, 5, 6, 7, 8, 9)

// splitAt on infinite stream - safe, creates lazy splits
val (first100, rest) = stream.splitAt(100)
println(first100.take(5).toList)  // List(1, 2, 3, 4, 5)

// partition on infinite stream - DANGEROUS, never terminates
// val (evens, odds) = stream.partition(_ % 2 == 0)  // Don't do this

Performance Summary

For a collection of size n:

  • partition: Always O(n) time, O(n) space
  • span: O(1) to O(n) time depending on predicate, O(n) space worst case
  • splitAt: O(n) time for List (must traverse to index), O(1) for Vector/Array
import scala.collection.immutable.Vector

val list = (1 to 1000000).toList
val vector = (1 to 1000000).toVector

// List splitAt: O(n) - must traverse
val (l1, l2) = list.splitAt(500000)

// Vector splitAt: O(log n) - tree structure
val (v1, v2) = vector.splitAt(500000)

Choose Vector when frequent splits at arbitrary indices are required. Use List when splits happen at the beginning or when building collections incrementally.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.