Scala - partition, span, splitAt | Application Architect

Key Insights

partition splits collections based on a predicate into two groups (true/false), span splits at the first predicate failure while maintaining order, and splitAt divides at a specific index position
span differs critically from partition by stopping evaluation at the first false predicate, making it O(n) best case versus partition’s always O(n) traversal
Understanding these methods prevents unnecessary collection iterations and enables more expressive functional transformations in production Scala code

Understanding Collection Splitting Methods

Scala provides three distinct methods for dividing collections: partition, span, and splitAt. Each serves different use cases and has different performance characteristics. Choosing the wrong method leads to unnecessary iterations or unclear intent.

All three methods return a tuple of two collections (A, B), but the logic determining what goes into each collection differs fundamentally.

partition: Split by Predicate

partition divides a collection into two groups based on a predicate function. Elements satisfying the predicate go into the first collection; those failing go into the second.

val numbers = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
val (evens, odds) = numbers.partition(_ % 2 == 0)

println(evens)  // List(2, 4, 6, 8, 10)
println(odds)   // List(1, 3, 5, 7, 9)

The predicate is evaluated for every element. Order is preserved within each resulting collection.

case class Transaction(id: String, amount: Double, status: String)

val transactions = List(
  Transaction("T1", 100.0, "completed"),
  Transaction("T2", 250.0, "pending"),
  Transaction("T3", 75.0, "completed"),
  Transaction("T4", 500.0, "failed"),
  Transaction("T5", 150.0, "completed")
)

val (completed, notCompleted) = transactions.partition(_.status == "completed")

println(completed.map(_.id))     // List(T1, T3, T5)
println(notCompleted.map(_.id))  // List(T2, T4)

Performance Characteristics

partition always traverses the entire collection. Time complexity is O(n). This is necessary because elements satisfying the predicate can appear anywhere in the collection.

val largeList = (1 to 1000000).toList
val (positive, negative) = largeList.partition(_ > 0)
// Evaluates predicate 1,000,000 times

span: Split at First Failure

span takes a predicate and returns two collections: the longest prefix where all elements satisfy the predicate, and the remainder. Crucially, it stops evaluating at the first predicate failure.

val numbers = List(2, 4, 6, 7, 8, 10, 12)
val (allEven, rest) = numbers.span(_ % 2 == 0)

println(allEven)  // List(2, 4, 6)
println(rest)     // List(7, 8, 10, 12)

Notice that 8, 10, and 12 are even, but they appear in the second collection because span stopped at 7.

val logs = List(
  "INFO: System started",
  "INFO: Loading configuration",
  "INFO: Database connected",
  "ERROR: Connection timeout",
  "INFO: Retrying connection",
  "WARN: High memory usage"
)

val (infoLogs, otherLogs) = logs.span(_.startsWith("INFO"))

println(infoLogs.size)   // 3
println(otherLogs.size)  // 3

When to Use span

Use span when you need elements until a condition breaks. Common scenarios include:

// Processing sorted data until threshold
val sortedPrices = List(10.0, 15.0, 20.0, 25.0, 30.0, 35.0)
val (affordable, expensive) = sortedPrices.span(_ <= 25.0)
// affordable: List(10.0, 15.0, 20.0, 25.0)
// expensive: List(30.0, 35.0)

// Reading file until separator
val lines = List("header1", "header2", "---", "data1", "data2")
val (headers, rest) = lines.span(_ != "---")
// headers: List("header1", "header2")
// rest: List("---", "data1", "data2")

Performance Advantage

span can short-circuit. If the first element fails the predicate, it returns immediately.

val numbers = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
val (evens, rest) = numbers.span(_ % 2 == 0)
// Evaluates predicate once, returns (List(), List(1, 2, 3, ...))

Best case: O(1). Worst case: O(n) when all elements satisfy the predicate.

splitAt: Split by Index

splitAt divides a collection at a specific index position. No predicate evaluation occurs.

val numbers = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
val (first, second) = numbers.splitAt(5)

println(first)   // List(1, 2, 3, 4, 5)
println(second)  // List(6, 7, 8, 9, 10)

The first collection contains elements from index 0 to n-1. The second contains elements from index n onward.

// Splitting at index 0
val (empty, all) = numbers.splitAt(0)
// empty: List()
// all: List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

// Splitting beyond collection size
val (all2, empty2) = numbers.splitAt(100)
// all2: List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
// empty2: List()

Practical Applications

// Pagination
def paginate[A](items: List[A], pageSize: Int, page: Int): List[A] = {
  val (_, rest) = items.splitAt(pageSize * (page - 1))
  val (pageItems, _) = rest.splitAt(pageSize)
  pageItems
}

val items = (1 to 100).toList
println(paginate(items, 10, 3))  // List(21, 22, ..., 30)

// Batch processing
def processBatches[A](items: List[A], batchSize: Int): Unit = {
  def process(remaining: List[A]): Unit = {
    if (remaining.isEmpty) return
    val (batch, rest) = remaining.splitAt(batchSize)
    println(s"Processing batch: $batch")
    process(rest)
  }
  process(items)
}

processBatches((1 to 7).toList, 3)
// Processing batch: List(1, 2, 3)
// Processing batch: List(4, 5, 6)
// Processing batch: List(7)

Choosing the Right Method

Decision matrix for collection splitting:

// Use partition: Need all elements satisfying a condition
val data = List(1, -2, 3, -4, 5, -6, 7)
val (positive, negative) = data.partition(_ > 0)

// Use span: Need prefix until condition fails
val sortedData = List(1, 2, 3, 4, 5, 6, 7)
val (lessThanFive, rest) = sortedData.span(_ < 5)

// Use splitAt: Need fixed position split
val (firstHalf, secondHalf) = data.splitAt(data.length / 2)

Combining Methods

These methods compose well for complex transformations:

case class Event(timestamp: Long, eventType: String, data: String)

val events = List(
  Event(1000, "login", "user1"),
  Event(1100, "login", "user2"),
  Event(1200, "purchase", "item1"),
  Event(1300, "logout", "user1"),
  Event(1400, "login", "user3"),
  Event(1500, "purchase", "item2")
)

// Get first 3 logins, then separate purchases from other events
val (first3, rest) = events.span(_.eventType == "login")
val (purchases, others) = rest.partition(_.eventType == "purchase")

println(first3.size)    // 2 (stops at first non-login)
println(purchases.size) // 2
println(others.size)    // 2

Working with Streams and LazyLists

These methods work with lazy collections, but behavior differs:

val stream = LazyList.from(1)

// span on infinite stream - safe if predicate fails early
val (small, large) = stream.span(_ < 10)
println(small.toList)  // List(1, 2, 3, 4, 5, 6, 7, 8, 9)

// splitAt on infinite stream - safe, creates lazy splits
val (first100, rest) = stream.splitAt(100)
println(first100.take(5).toList)  // List(1, 2, 3, 4, 5)

// partition on infinite stream - DANGEROUS, never terminates
// val (evens, odds) = stream.partition(_ % 2 == 0)  // Don't do this

Performance Summary

For a collection of size n:

partition: Always O(n) time, O(n) space
span: O(1) to O(n) time depending on predicate, O(n) space worst case
splitAt: O(n) time for List (must traverse to index), O(1) for Vector/Array

import scala.collection.immutable.Vector

val list = (1 to 1000000).toList
val vector = (1 to 1000000).toVector

// List splitAt: O(n) - must traverse
val (l1, l2) = list.splitAt(500000)

// Vector splitAt: O(log n) - tree structure
val (v1, v2) = vector.splitAt(500000)

Choose Vector when frequent splits at arbitrary indices are required. Use List when splits happen at the beginning or when building collections incrementally.