Scala - Set with Examples

Key Insights

  • Scala provides an immutable Set (the default) and a mutable Set, plus specialized variants such as SortedSet for sorted elements and LinkedHashSet for insertion order
  • Hash-based Sets guarantee uniqueness through hashCode/equals, which gives effectively constant-time lookups but requires consistent hashCode/equals implementations for custom objects; sorted Sets rely on an Ordering instead and offer O(log n) operations
  • Immutable Sets use structural sharing for memory efficiency, while mutable Sets offer in-place modification; choose immutable when you need thread-safety or a functional style, and mutable when in-place updates are a better fit
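
To make the third point concrete, here is a minimal sketch (all names are illustrative, not from any library) showing that adding to an immutable Set returns a new Set and leaves the original untouched, while a mutable Set changes in place.

```scala
import scala.collection.mutable

// Immutable: + returns a new Set; the original value is unchanged
val baseIds: Set[Int] = Set(1, 2, 3)
val extendedIds: Set[Int] = baseIds + 4

// Mutable: += modifies the same object
val bufferIds: mutable.Set[Int] = mutable.Set(1, 2, 3)
bufferIds += 4

println(baseIds.size)     // 3
println(extendedIds.size) // 4
println(bufferIds.size)   // 4
```

Because baseIds can never change, it can be handed to other threads or stored in a cache without defensive copying.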

Understanding Scala Sets

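Sets are unordered collections that contain no duplicate elements. Scala provides both immutable and mutable Set implementations, with immutable being the default. The immutable Set is part of scala.collection.immutable, while the mutable Set resides in scala.collection.mutable. Because the default hash-based Sets do not guarantee iteration order, the output comments in the examples show which elements a Set contains; the printed order and collection name (for example, HashSet rather than Set in Scala 2.13) may differ.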

import scala.collection.mutable

// Immutable Set (default)
val immutableSet = Set(1, 2, 3, 4, 5)

// Mutable Set (explicit import required)
val mutableSet = mutable.Set(1, 2, 3, 4, 5)

// Attempting to add duplicates - silently ignored
val setWithDuplicates = Set(1, 2, 2, 3, 3, 3)
println(setWithDuplicates) // Set(1, 2, 3)

Creating and Initializing Sets

Multiple approaches exist for Set creation depending on your data source and requirements.

// Empty sets
val empty1 = Set.empty[Int]
val empty2 = Set[String]()

// From collections
val fromList = List(1, 2, 3, 2, 1).toSet
val fromArray = Array("a", "b", "c", "a").toSet

// Using apply method
val explicit = Set(10, 20, 30)

// Range conversion
val rangeSet = (1 to 10).toSet

// From varargs
def createSet(elements: Int*): Set[Int] = elements.toSet
val varargSet = createSet(5, 10, 15, 10, 5)
println(varargSet) // Set(5, 10, 15)

Basic Set Operations

Sets support standard collection operations with performance characteristics optimized for membership testing.

val numbers = Set(1, 2, 3, 4, 5)

// Membership testing - O(1) average case
println(numbers.contains(3))  // true
println(numbers(3))           // true (apply method)
println(numbers.contains(10)) // false

// Size and emptiness
println(numbers.size)      // 5
println(numbers.isEmpty)   // false
println(numbers.nonEmpty)  // true

// Adding elements (returns new Set for immutable)
val withSix = numbers + 6
val withMultiple = numbers ++ Set(6, 7, 8)

// Removing elements
val withoutThree = numbers - 3
val withoutMultiple = numbers -- Set(1, 2)

println(withoutMultiple) // Set(3, 4, 5)

Mutable Set Operations

Mutable Sets modify the collection in place. They use assignment-style operators (+=, ++=, -=, --=) to signal the side effect, and methods such as add and remove report whether the Set actually changed.

import scala.collection.mutable

val mutableNumbers = mutable.Set(1, 2, 3)

// In-place addition
mutableNumbers += 4
mutableNumbers ++= Set(5, 6, 7)

// In-place removal
mutableNumbers -= 2
mutableNumbers --= Set(1, 3)

// Add with return value
val wasAdded = mutableNumbers.add(10)    // true
val alreadyPresent = mutableNumbers.add(10) // false

// Remove with return value
val wasRemoved = mutableNumbers.remove(10) // true
val notPresent = mutableNumbers.remove(99) // false

// Clear all elements
mutableNumbers.clear()
println(mutableNumbers.isEmpty) // true
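
A common source of confusion is that += also compiles on a var holding an immutable Set. The sketch below (with illustrative names) contrasts the two cases: on the var, += rebinds the variable to a new Set, whereas on a mutable Set it changes the shared object.

```scala
import scala.collection.mutable

// A var holding an immutable Set: += rebinds the variable to a new Set
var tags = Set("scala")
val snapshot = tags          // keeps pointing at the old Set
tags += "jvm"                // desugars to: tags = tags + "jvm"

// A val holding a mutable Set: += mutates the one shared object
val liveTags = mutable.Set("scala")
val alias = liveTags         // same object as liveTags
liveTags += "jvm"

println(snapshot.size) // 1 - unaffected by the later +=
println(tags.size)     // 2
println(alias.size)    // 2 - the alias sees the mutation
```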

Set Algebra Operations

Sets excel at mathematical set operations like union, intersection, and difference.

val setA = Set(1, 2, 3, 4, 5)
val setB = Set(4, 5, 6, 7, 8)

// Union - all elements from both sets
val union = setA union setB
val union2 = setA ++ setB
println(union) // Set(1, 2, 3, 4, 5, 6, 7, 8)

// Intersection - common elements
val intersection = setA intersect setB
val intersection2 = setA & setB
println(intersection) // Set(4, 5)

// Difference - elements in A but not in B
val difference = setA diff setB
val difference2 = setA -- setB
println(difference) // Set(1, 2, 3)

// Symmetric difference - elements in either set but not both
val symDiff = (setA diff setB) union (setB diff setA)
println(symDiff) // Set(1, 2, 3, 6, 7, 8)

// Subset checks
println(Set(1, 2).subsetOf(setA))    // true
println(setA.subsetOf(Set(1, 2)))    // false
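
The symmetric difference can be wrapped in a small generic helper. The following is a sketch only: symmetricDifference is a hypothetical name, not a standard library method. It also checks the result against the equivalent "union minus intersection" form using the operator aliases |, & and &~.

```scala
// Hypothetical helper: symmetric difference for any element type
def symmetricDifference[A](left: Set[A], right: Set[A]): Set[A] =
  (left diff right) union (right diff left)

val first = Set(1, 2, 3, 4, 5)
val second = Set(4, 5, 6, 7, 8)

val onlyInOne = symmetricDifference(first, second)

// Same result computed as "union minus intersection"
// (| is union, & is intersect, &~ is diff)
val viaUnion = (first | second) &~ (first & second)

println(onlyInOne == Set(1, 2, 3, 6, 7, 8)) // true
println(onlyInOne == viaUnion)              // true
```

Comparing with == rather than printing avoids depending on iteration order, which hash-based Sets do not guarantee.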

Transforming Sets

Functional transformations create new Sets while preserving uniqueness constraints.

val numbers = Set(1, 2, 3, 4, 5)

// Map - transform elements
val doubled = numbers.map(_ * 2)
println(doubled) // Set(2, 4, 6, 8, 10)

// Map with potential duplicates collapsed
val modulo = Set(1, 2, 3, 4, 5, 6).map(_ % 3)
println(modulo) // Set(0, 1, 2)

// Filter - select elements
val evens = numbers.filter(_ % 2 == 0)
println(evens) // Set(2, 4)

// FlatMap - flatten and transform
val words = Set("hello", "world")
val chars = words.flatMap(_.toSet)
println(chars) // Set(h, e, l, o, w, r, d)

// Collect - partial function application
val result = numbers.collect {
  case x if x % 2 == 0 => x * 10
}
println(result) // Set(20, 40)
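
Because the result of map is itself a Set, it can be smaller than the input, which matters when you expect one result per element. A short sketch with made-up data shows the effect and two ways to keep the information: converting to a List first, or grouping with groupBy.

```scala
val wordsSeen = Set("apple", "avocado", "banana", "blueberry", "cherry")

// Mapping directly on the Set collapses equal results
val initials = wordsSeen.map(_.head)
println(initials.size) // 3 - only 'a', 'b' and 'c' remain

// Converting to a List first keeps one result per input element
val allInitials = wordsSeen.toList.map(_.head)
println(allInitials.size) // 5

// groupBy keeps the original elements, grouped by the derived key
val byInitial: Map[Char, Set[String]] = wordsSeen.groupBy(_.head)
println(byInitial('a') == Set("apple", "avocado")) // true
```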

Ordered Sets

When element ordering matters, use SortedSet to keep elements sorted according to an Ordering, or LinkedHashSet to preserve insertion order.

import scala.collection.{SortedSet, mutable}

// SortedSet - maintains natural ordering
val sorted = SortedSet(5, 2, 8, 1, 9)
println(sorted) // TreeSet(1, 2, 5, 8, 9)

// Custom ordering - passed explicitly so it does not change the ordering
// picked up by other SortedSets created in the same scope
val reverseSorted = SortedSet(5, 2, 8, 1, 9)(Ordering.Int.reverse)
println(reverseSorted) // TreeSet(9, 8, 5, 2, 1)

// LinkedHashSet - maintains insertion order
val linked = mutable.LinkedHashSet(5, 2, 8, 1, 9)
println(linked) // LinkedHashSet(5, 2, 8, 1, 9)

linked += 3
println(linked) // LinkedHashSet(5, 2, 8, 1, 9, 3)

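// Range operations on SortedSet (lower bound inclusive, upper bound exclusive)
// rangeFrom/rangeUntil are the Scala 2.13+ names; from/until are deprecated there
val numbers = SortedSet(1, 3, 5, 7, 9, 11, 13)
println(numbers.range(5, 11))  // TreeSet(5, 7, 9)
println(numbers.rangeFrom(7))  // TreeSet(7, 9, 11, 13)
println(numbers.rangeUntil(7)) // TreeSet(1, 3, 5)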
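
For element types without a natural ordering, a SortedSet needs an explicit Ordering. The sketch below assumes a hypothetical Task type and builds the Ordering with Ordering.by. Note that a SortedSet treats elements that compare as equal under its Ordering as duplicates, so the Ordering should distinguish every element you want to keep.

```scala
import scala.collection.immutable.SortedSet

// Hypothetical record type for illustration
case class Task(priority: Int, title: String)

// Order by priority first, then title, so tasks with the same
// priority are still treated as distinct elements
val byPriority: Ordering[Task] =
  Ordering.by((t: Task) => (t.priority, t.title))

val tasks = SortedSet(
  Task(3, "write docs"),
  Task(1, "fix build"),
  Task(2, "review PR")
)(byPriority)

println(tasks.head.title)             // fix build
println(tasks.toList.map(_.priority)) // List(1, 2, 3)

// Ordering by priority alone makes equal-priority tasks collide
val priorityOnly: Ordering[Task] = Ordering.by((t: Task) => t.priority)
val collapsed =
  SortedSet(Task(1, "fix build"), Task(1, "update deps"))(priorityOnly)
println(collapsed.size) // 1
```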

Custom Objects in Sets

Custom classes require proper hashCode and equals implementations for Set membership.

case class Person(name: String, age: Int)

// Case classes automatically implement hashCode/equals
val people = Set(
  Person("Alice", 30),
  Person("Bob", 25),
  Person("Alice", 30) // Duplicate - will be excluded
)
println(people.size) // 2

// Custom class without case class
class Employee(val id: Int, val name: String) {
  override def equals(obj: Any): Boolean = obj match {
    case that: Employee => this.id == that.id
    case _ => false
  }
  
  override def hashCode(): Int = id.hashCode()
  
  override def toString: String = s"Employee($id, $name)"
}

val employees = Set(
  new Employee(1, "Alice"),
  new Employee(2, "Bob"),
  new Employee(1, "Alice Smith") // Same id - treated as duplicate
)
println(employees.size) // 2
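
For contrast, here is a small sketch of what happens when a plain class does not override equals and hashCode. Tag and CaseTag are hypothetical types used only for this comparison.

```scala
// Plain class: inherits reference-based equals/hashCode from AnyRef
class Tag(val label: String)

val plainTags = Set(new Tag("scala"), new Tag("scala"))
println(plainTags.size) // 2 - two distinct objects, so both are kept

// Case class: structural equals/hashCode are generated by the compiler
case class CaseTag(label: String)

val caseTags = Set(CaseTag("scala"), CaseTag("scala"))
println(caseTags.size) // 1

// Lookups follow the same rule
println(plainTags.contains(new Tag("scala"))) // false
println(caseTags.contains(CaseTag("scala")))  // true
```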

Performance Considerations

The choice of Set implementation determines the cost of lookups, insertions, and removals, so it is worth matching the implementation to the access pattern.

import scala.collection.{immutable, mutable}

// HashSet - O(1) average lookup, insertion, deletion
val hashSet = immutable.HashSet(1, 2, 3, 4, 5)

// TreeSet (SortedSet) - O(log n) operations, maintains order
val treeSet = immutable.TreeSet(1, 2, 3, 4, 5)

// BitSet - memory-efficient for sets of small non-negative integers
val bitSet = immutable.BitSet(1, 2, 3, 100, 1000)
println(bitSet.contains(100)) // true - still efficient

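// Rough timing helper - a single measurement on the JVM is noisy (JIT warm-up, GC),
// so use a harness such as JMH for real benchmarks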
def measureTime[T](operation: => T): (T, Long) = {
  val start = System.nanoTime()
  val result = operation
  val elapsed = System.nanoTime() - start
  (result, elapsed)
}

val largeSet = (1 to 1000000).toSet
val (found, time) = measureTime {
  largeSet.contains(999999)
}
println(s"Found: $found in ${time / 1000000.0} ms")
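
A single timing like the one above says little on its own. As a slightly more robust sketch, the hypothetical helper bestOfNanos below repeats the operation and keeps the fastest run, then compares membership tests on a List and a Set holding the same data. It is still only a rough indication, not a substitute for a proper benchmark harness, and no specific timings are implied.

```scala
// Hypothetical helper: run the operation several times, keep the best time
def bestOfNanos[T](runs: Int)(operation: => T): (T, Long) = {
  require(runs > 0, "runs must be positive")
  var best = Long.MaxValue
  var last: Option[T] = None
  for (_ <- 1 to runs) {
    val start = System.nanoTime()
    val value = operation
    val elapsed = System.nanoTime() - start
    if (elapsed < best) best = elapsed
    last = Some(value)
  }
  (last.get, best)
}

val asList = (1 to 100000).toList
val asSet = asList.toSet

// List.contains scans linearly; Set.contains uses hashing
val (inList, listNanos) = bestOfNanos(20)(asList.contains(99999))
val (inSet, setNanos) = bestOfNanos(20)(asSet.contains(99999))

println(s"List.contains: $inList, best of 20 runs: $listNanos ns")
println(s"Set.contains:  $inSet, best of 20 runs: $setNanos ns")
```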

Practical Use Cases

Sets solve real-world problems where uniqueness and fast lookups matter.

// Remove duplicates from data
def uniqueEmails(emails: List[String]): Set[String] = 
  emails.map(_.toLowerCase.trim).toSet

// Find common interests
case class User(name: String, interests: Set[String])

def findCommonInterests(user1: User, user2: User): Set[String] =
  user1.interests intersect user2.interests

val alice = User("Alice", Set("scala", "hiking", "photography"))
val bob = User("Bob", Set("scala", "photography", "cooking"))
println(findCommonInterests(alice, bob)) // Set(scala, photography)

// Validate unique constraints
def hasUniqueIds(items: List[(Int, String)]): Boolean =
  items.map(_._1).toSet.size == items.size

val validData = List((1, "A"), (2, "B"), (3, "C"))
val invalidData = List((1, "A"), (2, "B"), (1, "C"))
println(hasUniqueIds(validData))   // true
println(hasUniqueIds(invalidData)) // false

// Track visited nodes in a breadth-first traversal
// (returns every node reachable from start)
def bfs(start: Int, graph: Map[Int, List[Int]]): Set[Int] = {
  def loop(queue: List[Int], visited: Set[Int]): Set[Int] = queue match {
    case Nil => visited
    case head :: tail if visited.contains(head) => 
      loop(tail, visited)
    case head :: tail =>
      val neighbors = graph.getOrElse(head, List.empty)
      loop(tail ++ neighbors, visited + head)
  }
  loop(List(start), Set.empty)
}
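
To see the traversal in action, here is a self-contained variant with a sample graph. It is a sketch under a few assumptions: reachableFrom and sampleGraph are illustrative names, an immutable Queue replaces the List so that enqueueing neighbours does not repeatedly append to a List, and enqueueAll assumes Scala 2.13 or later.

```scala
import scala.annotation.tailrec
import scala.collection.immutable.Queue

// Breadth-first traversal; the visited Set prevents revisiting nodes
def reachableFrom(start: Int, graph: Map[Int, List[Int]]): Set[Int] = {
  @tailrec
  def loop(queue: Queue[Int], visited: Set[Int]): Set[Int] =
    queue.dequeueOption match {
      case None => visited
      case Some((node, rest)) if visited.contains(node) =>
        loop(rest, visited)
      case Some((node, rest)) =>
        val neighbors = graph.getOrElse(node, Nil)
        loop(rest.enqueueAll(neighbors), visited + node)
    }
  loop(Queue(start), Set.empty)
}

// Hypothetical sample graph containing a cycle (1 -> 2 -> 4 -> 1)
val sampleGraph = Map(
  1 -> List(2, 3),
  2 -> List(4),
  3 -> List(4),
  4 -> List(1),
  5 -> List(1) // node 5 is not reachable from node 1
)

println(reachableFrom(1, sampleGraph) == Set(1, 2, 3, 4)) // true
```

Without the visited Set, the cycle in the sample graph would make the traversal loop forever.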
