Scala - partition, span, splitAt
Scala provides three distinct methods for dividing collections: `partition`, `span`, and `splitAt`. Each serves different use cases and has different performance characteristics. Choosing the wrong...
Key Insights
partitionsplits collections based on a predicate into two groups (true/false),spansplits at the first predicate failure while maintaining order, andsplitAtdivides at a specific index positionspandiffers critically frompartitionby stopping evaluation at the first false predicate, making it O(n) best case versuspartition’s always O(n) traversal- Understanding these methods prevents unnecessary collection iterations and enables more expressive functional transformations in production Scala code
Understanding Collection Splitting Methods
Scala provides three distinct methods for dividing collections: partition, span, and splitAt. Each serves different use cases and has different performance characteristics. Choosing the wrong method leads to unnecessary iterations or unclear intent.
All three methods return a tuple of two collections (A, B), but the logic determining what goes into each collection differs fundamentally.
partition: Split by Predicate
partition divides a collection into two groups based on a predicate function. Elements satisfying the predicate go into the first collection; those failing go into the second.
val numbers = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
val (evens, odds) = numbers.partition(_ % 2 == 0)
println(evens) // List(2, 4, 6, 8, 10)
println(odds) // List(1, 3, 5, 7, 9)
The predicate is evaluated for every element. Order is preserved within each resulting collection.
case class Transaction(id: String, amount: Double, status: String)
val transactions = List(
Transaction("T1", 100.0, "completed"),
Transaction("T2", 250.0, "pending"),
Transaction("T3", 75.0, "completed"),
Transaction("T4", 500.0, "failed"),
Transaction("T5", 150.0, "completed")
)
val (completed, notCompleted) = transactions.partition(_.status == "completed")
println(completed.map(_.id)) // List(T1, T3, T5)
println(notCompleted.map(_.id)) // List(T2, T4)
Performance Characteristics
partition always traverses the entire collection. Time complexity is O(n). This is necessary because elements satisfying the predicate can appear anywhere in the collection.
val largeList = (1 to 1000000).toList
val (positive, negative) = largeList.partition(_ > 0)
// Evaluates predicate 1,000,000 times
span: Split at First Failure
span takes a predicate and returns two collections: the longest prefix where all elements satisfy the predicate, and the remainder. Crucially, it stops evaluating at the first predicate failure.
val numbers = List(2, 4, 6, 7, 8, 10, 12)
val (allEven, rest) = numbers.span(_ % 2 == 0)
println(allEven) // List(2, 4, 6)
println(rest) // List(7, 8, 10, 12)
Notice that 8, 10, and 12 are even, but they appear in the second collection because span stopped at 7.
val logs = List(
"INFO: System started",
"INFO: Loading configuration",
"INFO: Database connected",
"ERROR: Connection timeout",
"INFO: Retrying connection",
"WARN: High memory usage"
)
val (infoLogs, otherLogs) = logs.span(_.startsWith("INFO"))
println(infoLogs.size) // 3
println(otherLogs.size) // 3
When to Use span
Use span when you need elements until a condition breaks. Common scenarios include:
// Processing sorted data until threshold
val sortedPrices = List(10.0, 15.0, 20.0, 25.0, 30.0, 35.0)
val (affordable, expensive) = sortedPrices.span(_ <= 25.0)
// affordable: List(10.0, 15.0, 20.0, 25.0)
// expensive: List(30.0, 35.0)
// Reading file until separator
val lines = List("header1", "header2", "---", "data1", "data2")
val (headers, rest) = lines.span(_ != "---")
// headers: List("header1", "header2")
// rest: List("---", "data1", "data2")
Performance Advantage
span can short-circuit. If the first element fails the predicate, it returns immediately.
val numbers = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
val (evens, rest) = numbers.span(_ % 2 == 0)
// Evaluates predicate once, returns (List(), List(1, 2, 3, ...))
Best case: O(1). Worst case: O(n) when all elements satisfy the predicate.
splitAt: Split by Index
splitAt divides a collection at a specific index position. No predicate evaluation occurs.
val numbers = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
val (first, second) = numbers.splitAt(5)
println(first) // List(1, 2, 3, 4, 5)
println(second) // List(6, 7, 8, 9, 10)
The first collection contains elements from index 0 to n-1. The second contains elements from index n onward.
// Splitting at index 0
val (empty, all) = numbers.splitAt(0)
// empty: List()
// all: List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
// Splitting beyond collection size
val (all2, empty2) = numbers.splitAt(100)
// all2: List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
// empty2: List()
Practical Applications
// Pagination
def paginate[A](items: List[A], pageSize: Int, page: Int): List[A] = {
val (_, rest) = items.splitAt(pageSize * (page - 1))
val (pageItems, _) = rest.splitAt(pageSize)
pageItems
}
val items = (1 to 100).toList
println(paginate(items, 10, 3)) // List(21, 22, ..., 30)
// Batch processing
def processBatches[A](items: List[A], batchSize: Int): Unit = {
def process(remaining: List[A]): Unit = {
if (remaining.isEmpty) return
val (batch, rest) = remaining.splitAt(batchSize)
println(s"Processing batch: $batch")
process(rest)
}
process(items)
}
processBatches((1 to 7).toList, 3)
// Processing batch: List(1, 2, 3)
// Processing batch: List(4, 5, 6)
// Processing batch: List(7)
Choosing the Right Method
Decision matrix for collection splitting:
// Use partition: Need all elements satisfying a condition
val data = List(1, -2, 3, -4, 5, -6, 7)
val (positive, negative) = data.partition(_ > 0)
// Use span: Need prefix until condition fails
val sortedData = List(1, 2, 3, 4, 5, 6, 7)
val (lessThanFive, rest) = sortedData.span(_ < 5)
// Use splitAt: Need fixed position split
val (firstHalf, secondHalf) = data.splitAt(data.length / 2)
Combining Methods
These methods compose well for complex transformations:
case class Event(timestamp: Long, eventType: String, data: String)
val events = List(
Event(1000, "login", "user1"),
Event(1100, "login", "user2"),
Event(1200, "purchase", "item1"),
Event(1300, "logout", "user1"),
Event(1400, "login", "user3"),
Event(1500, "purchase", "item2")
)
// Get first 3 logins, then separate purchases from other events
val (first3, rest) = events.span(_.eventType == "login")
val (purchases, others) = rest.partition(_.eventType == "purchase")
println(first3.size) // 2 (stops at first non-login)
println(purchases.size) // 2
println(others.size) // 2
Working with Streams and LazyLists
These methods work with lazy collections, but behavior differs:
val stream = LazyList.from(1)
// span on infinite stream - safe if predicate fails early
val (small, large) = stream.span(_ < 10)
println(small.toList) // List(1, 2, 3, 4, 5, 6, 7, 8, 9)
// splitAt on infinite stream - safe, creates lazy splits
val (first100, rest) = stream.splitAt(100)
println(first100.take(5).toList) // List(1, 2, 3, 4, 5)
// partition on infinite stream - DANGEROUS, never terminates
// val (evens, odds) = stream.partition(_ % 2 == 0) // Don't do this
Performance Summary
For a collection of size n:
partition: Always O(n) time, O(n) spacespan: O(1) to O(n) time depending on predicate, O(n) space worst casesplitAt: O(n) time for List (must traverse to index), O(1) for Vector/Array
import scala.collection.immutable.Vector
val list = (1 to 1000000).toList
val vector = (1 to 1000000).toVector
// List splitAt: O(n) - must traverse
val (l1, l2) = list.splitAt(500000)
// Vector splitAt: O(log n) - tree structure
val (v1, v2) = vector.splitAt(500000)
Choose Vector when frequent splits at arbitrary indices are required. Use List when splits happen at the beginning or when building collections incrementally.