Scala - ArrayBuffer (Mutable Array)

Key Insights

  • ArrayBuffer provides O(1) amortized append operations and random access, making it the go-to mutable collection when you need dynamic array behavior in Scala
  • Unlike immutable Vector, ArrayBuffer modifies data in-place, reducing memory overhead for scenarios involving frequent updates or large-scale data transformations
  • Converting between ArrayBuffer and immutable collections is straightforward, allowing you to leverage mutability during construction and immutability for safe sharing
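A minimal sketch tying the three points together (build mutably, update in place, then freeze for sharing):

```scala
import scala.collection.mutable.ArrayBuffer

// Build mutably with amortized O(1) appends...
val buf = ArrayBuffer[Int]()
for (i <- 1 to 5) buf += i

// ...update in place with O(1) indexed writes...
buf(0) = 10

// ...then freeze into an immutable snapshot for safe sharing.
val frozen: Vector[Int] = buf.toVector
println(frozen)  // Vector(10, 2, 3, 4, 5)
```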

Understanding ArrayBuffer Fundamentals

ArrayBuffer is Scala’s resizable array implementation, part of the scala.collection.mutable package. It maintains an internal array that grows automatically when capacity is exceeded, typically doubling in size to achieve amortized constant-time appends.

import scala.collection.mutable.ArrayBuffer

// Creating ArrayBuffers
val empty = ArrayBuffer[Int]()
val withElements = ArrayBuffer(1, 2, 3, 4, 5)
val withInitialCapacity = new ArrayBuffer[String](100)

// Basic operations
val buffer = ArrayBuffer(10, 20, 30)
buffer += 40                    // Append single element
buffer ++= Seq(50, 60)         // Append multiple elements
buffer.insert(0, 5)            // Insert at index 0
buffer.remove(2)               // Remove element at index 2

println(buffer)  // ArrayBuffer(5, 10, 30, 40, 50, 60)

Performance Characteristics

ArrayBuffer excels in scenarios requiring frequent additions and random access. Understanding its performance profile helps you choose the right collection.

import scala.collection.mutable.ArrayBuffer
import scala.collection.immutable.Vector

def measureTime(block: => Unit): Long = {
  val start = System.nanoTime()
  block
  System.nanoTime() - start
}

// Append performance comparison
val iterations = 100000

val arrayBufferTime = measureTime {
  val ab = ArrayBuffer[Int]()
  for (i <- 0 until iterations) ab += i
}

val vectorTime = measureTime {
  var v = Vector[Int]()
  for (i <- 0 until iterations) v = v :+ i
}

println(s"ArrayBuffer: ${arrayBufferTime / 1000000}ms")
println(s"Vector: ${vectorTime / 1000000}ms")
// ArrayBuffer is typically 10-20x faster for sequential appends
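Appends are only half the story. ArrayBuffer's other strength is O(1) indexed access, whereas `List(i)` must walk `i` nodes, turning index-based loops into O(n²) work. A small sketch (same result from both, only the access cost differs; the `sumByIndex` helper is illustrative, not part of any API):

```scala
import scala.collection.mutable.ArrayBuffer

// Indexed access: ArrayBuffer(i) is direct array indexing,
// List(i) re-walks the list from the head on every call.
val ab = ArrayBuffer.range(0, 10000)
val list = List.range(0, 10000)

def sumByIndex(get: Int => Int, size: Int): Long = {
  var total = 0L
  var i = 0
  while (i < size) { total += get(i); i += 1 }
  total
}

val abSum = sumByIndex(i => ab(i), ab.length)         // fast: O(1) per lookup
val listSum = sumByIndex(i => list(i), list.length)   // slow: O(i) per lookup
// Both compute the same sum; only the cost differs.
```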

Common Operations and Patterns

ArrayBuffer supports a rich API for manipulation. Here are the most frequently used operations in production code.

import scala.collection.mutable.ArrayBuffer

val data = ArrayBuffer(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

// Filtering and transformation
data.filter(_ % 2 == 0)                    // Returns new ArrayBuffer
data.filterInPlace(_ % 2 == 0)            // Modifies in-place
data.map(_ * 2)                            // Returns new ArrayBuffer
data.mapInPlace(_ * 2)                    // Modifies in-place

// Bulk operations
data.prepend(0)                            // Add to front
data.prependAll(Seq(-2, -1))              // Add multiple to front
data.appendAll(Seq(11, 12))               // Append multiple
data.insertAll(5, Seq(100, 200))          // Insert at position

// Removal operations
data.remove(0)                             // Remove at index
data.remove(0, 3)                         // Remove 3 elements starting at 0
data.subtractOne(100)                     // Remove first occurrence
data.clear()                              // Remove all elements

// Slicing and copying
val slice = data.slice(2, 5)              // Returns new ArrayBuffer
val copied = data.clone()                 // Shallow copy: new buffer, same element references
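The copying vs in-place pairs above are easy to mix up, so here is the distinction shown with concrete before/after states:

```scala
import scala.collection.mutable.ArrayBuffer

val xs = ArrayBuffer(1, 2, 3, 4)

// filter allocates a new buffer; the original is untouched
val evens = xs.filter(_ % 2 == 0)
println(evens)  // ArrayBuffer(2, 4)
println(xs)     // ArrayBuffer(1, 2, 3, 4)

// filterInPlace mutates the receiver itself
xs.filterInPlace(_ % 2 == 0)
println(xs)     // ArrayBuffer(2, 4)
```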

Building ArrayBuffers from Data Sources

Real applications often construct ArrayBuffers from various data sources. Here are idiomatic patterns.

import scala.collection.mutable.ArrayBuffer
import scala.io.Source

// From ranges and collections
val fromRange = ArrayBuffer.range(0, 10)
val fromSeq = ArrayBuffer.from(List(1, 2, 3))
val filled = ArrayBuffer.fill(5)("default")
val tabulated = ArrayBuffer.tabulate(5)(n => n * n)

// From file processing
def readLinesFromFile(filename: String): ArrayBuffer[String] = {
  val buffer = ArrayBuffer[String]()
  val source = Source.fromFile(filename)
  try {
    for (line <- source.getLines()) {
      buffer += line.trim
    }
  } finally {
    source.close()
  }
  buffer
}

// Conditional building
def buildFiltered(data: Seq[Int], threshold: Int): ArrayBuffer[Int] = {
  val result = ArrayBuffer[Int]()
  for (value <- data) {
    if (value > threshold) {
      result += value
    }
  }
  result
}

// Using builder pattern
val builder = ArrayBuffer.newBuilder[Double]
builder += 1.0
builder += 2.5
builder ++= Seq(3.7, 4.2)
val built = builder.result()

Converting Between Mutable and Immutable

A common pattern is building with ArrayBuffer and converting to immutable collections for safe sharing.

import scala.collection.mutable.ArrayBuffer

// Conversion methods
val buffer = ArrayBuffer(1, 2, 3, 4, 5)

val vector: Vector[Int] = buffer.toVector
val list: List[Int] = buffer.toList
val array: Array[Int] = buffer.toArray
val seq: Seq[Int] = buffer.toSeq        // Returns immutable Seq

// Builder pattern for immutable result
def processData(input: Seq[String]): Vector[Int] = {
  val buffer = ArrayBuffer[Int]()
  for (str <- input) {
    str.toIntOption.foreach(buffer += _)
  }
  buffer.toVector
}

// In-place processing then freeze
def computeResults(data: Seq[Double]): Seq[Double] = {
  val working = ArrayBuffer.from(data)
  working.mapInPlace(x => x * x)
  working.filterInPlace(_ > 10.0)
  working.sorted.toSeq  // Return immutable sorted result
}

Real-World Use Cases

Here are practical scenarios where ArrayBuffer is the optimal choice.

import scala.collection.mutable.ArrayBuffer

// CSV parsing with dynamic row accumulation
case class Record(id: Int, name: String, value: Double)

def parseCSV(lines: Iterator[String]): ArrayBuffer[Record] = {
  val records = ArrayBuffer[Record]()
  for (line <- lines.drop(1)) {  // Skip header
    val parts = line.split(",")
    if (parts.length == 3) {
      try {
        records += Record(
          parts(0).toInt,
          parts(1).trim,
          parts(2).toDouble
        )
      } catch {
        case _: NumberFormatException => // Skip invalid rows
      }
    }
  }
  records
}

// Batch processing with accumulation
def processBatches[T](items: Seq[T], batchSize: Int)(
  processor: Seq[T] => Unit
): Unit = {
  val batch = ArrayBuffer[T]()
  for (item <- items) {
    batch += item
    if (batch.size >= batchSize) {
      processor(batch.toSeq)
      batch.clear()
    }
  }
  if (batch.nonEmpty) {
    processor(batch.toSeq)
  }
}

// Dynamic graph construction
case class Graph(adjacency: Map[Int, ArrayBuffer[Int]])

def buildGraph(edges: Seq[(Int, Int)]): Graph = {
  val adj = scala.collection.mutable.Map[Int, ArrayBuffer[Int]]()
  for ((from, to) <- edges) {
    adj.getOrElseUpdate(from, ArrayBuffer[Int]()) += to
  }
  Graph(adj.toMap)
}

// Stream processing with windowing
def slidingWindowAggregate(
  stream: Iterator[Double],
  windowSize: Int
): Iterator[Double] = {
  val window = ArrayBuffer[Double]()
  stream.map { value =>
    window += value
    if (window.size > windowSize) {
      window.remove(0)
    }
    window.sum / window.size
  }
}
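To see the window logic in action, here is the function exercised on a short stream (the definition is repeated so the snippet runs standalone). With a window of 2 over 1.0–4.0, the running averages are 1.0, 1.5, 2.5, 3.5:

```scala
import scala.collection.mutable.ArrayBuffer

// Definition repeated from above so this check runs on its own.
def slidingWindowAggregate(
  stream: Iterator[Double],
  windowSize: Int
): Iterator[Double] = {
  val window = ArrayBuffer[Double]()
  stream.map { value =>
    window += value
    if (window.size > windowSize) window.remove(0)
    window.sum / window.size
  }
}

val averages = slidingWindowAggregate(Iterator(1.0, 2.0, 3.0, 4.0), 2).toList
println(averages)  // List(1.0, 1.5, 2.5, 3.5)
```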

Memory Management and Optimization

ArrayBuffer’s internal array can grow larger than needed. Use these techniques for memory efficiency.

import scala.collection.mutable.ArrayBuffer

// Trimming excess capacity
val buffer = new ArrayBuffer[Int](1000)
buffer ++= (1 to 100)
buffer.trimToSize()  // Reduces internal array to actual size

// Pre-sizing for known capacity
def efficientBuild(expectedSize: Int): ArrayBuffer[Int] = {
  val buffer = new ArrayBuffer[Int](expectedSize)
  for (i <- 0 until expectedSize) {
    buffer += i * 2
  }
  buffer
}

// Reusing buffers
class DataProcessor {
  private val workBuffer = ArrayBuffer[String]()
  
  def process(input: Seq[String]): Seq[String] = {
    workBuffer.clear()
    workBuffer ++= input.filter(_.nonEmpty)
    workBuffer.mapInPlace(_.toLowerCase)
    workBuffer.toSeq
  }
}

// Avoiding unnecessary allocations
def efficientFilter(data: ArrayBuffer[Int], predicate: Int => Boolean): Unit = {
  var writeIdx = 0
  var readIdx = 0
  while (readIdx < data.length) {
    if (predicate(data(readIdx))) {
      if (writeIdx != readIdx) {
        data(writeIdx) = data(readIdx)
      }
      writeIdx += 1
    }
    readIdx += 1
  }
  data.takeInPlace(writeIdx)  // Truncate to the kept prefix (Scala 2.13; reduceToSize on 2.12)
}
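A quick check that the compaction keeps exactly the matching elements, in order (the definition is repeated, with the Scala 2.13 `takeInPlace` truncation, so the snippet runs standalone):

```scala
import scala.collection.mutable.ArrayBuffer

// Definition repeated from above so this check runs on its own.
def efficientFilter(data: ArrayBuffer[Int], predicate: Int => Boolean): Unit = {
  var writeIdx = 0
  var readIdx = 0
  while (readIdx < data.length) {
    if (predicate(data(readIdx))) {
      if (writeIdx != readIdx) data(writeIdx) = data(readIdx)
      writeIdx += 1
    }
    readIdx += 1
  }
  data.takeInPlace(writeIdx)  // drop the stale tail past writeIdx
}

val nums = ArrayBuffer(1, 2, 3, 4, 5, 6)
efficientFilter(nums, _ % 2 == 0)
println(nums)  // ArrayBuffer(2, 4, 6)
```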

Thread Safety Considerations

ArrayBuffer is not thread-safe. Use synchronization or concurrent collections for multi-threaded scenarios.

import scala.collection.mutable.ArrayBuffer
// On Scala 2.13+, .par lives in the separate scala-parallel-collections module
import scala.collection.parallel.CollectionConverters._

// Synchronized wrapper
class SynchronizedBuffer[T] {
  private val buffer = ArrayBuffer[T]()
  
  def add(elem: T): Unit = buffer.synchronized {
    buffer += elem
  }
  
  def toSeq: Seq[T] = buffer.synchronized {
    buffer.toSeq
  }
}
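A quick concurrent exercise of the same idea with plain `java.lang.Thread`s (a hypothetical four-writer setup): without the lock, concurrent `+=` calls can lose elements or corrupt the internal array; with it, every append survives.

```scala
import scala.collection.mutable.ArrayBuffer

// Four writers appending under a shared lock; removing the
// synchronized block makes the final size nondeterministic.
val shared = ArrayBuffer[Int]()
val lock = new Object

val writers = (1 to 4).map { _ =>
  new Thread(() => {
    for (i <- 0 until 1000) {
      lock.synchronized { shared += i }
    }
  })
}
writers.foreach(_.start())
writers.foreach(_.join())
println(shared.size)  // 4000: no lost appends
```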

// Thread-local accumulation
def parallelProcess(data: Seq[Int]): Seq[Int] = {
  val results = data.par.map { value =>
    val local = ArrayBuffer[Int]()
    // Process and accumulate in thread-local buffer
    for (i <- 0 until value) {
      local += i * value
    }
    local.toSeq
  }
  results.seq.flatten.toSeq
}

ArrayBuffer strikes the right balance between performance and usability for mutable sequential collections. Use it when building collections incrementally, processing streams, or when performance profiling shows immutable collections are a bottleneck.
