Scala - Parallel Collections



Scala's parallel collections let you run operations on a collection in parallel across multiple processor cores, which can improve performance. They build on the standard Scala collections and offer a largely similar API.

Set Up

Before Scala 2.13, parallel collections were part of the standard library. From Scala 2.13 onwards, they live in a separate module. To use parallel collections with Scala 2.13 or later, add the following dependency to your build.sbt file -

libraryDependencies += "org.scala-lang.modules" %% "scala-parallel-collections" % "1.0.4"

Then add the following import statement to bring the par conversion method into scope for standard collections -

import scala.collection.parallel.CollectionConverters._

Declaring Parallel Collections

The following is the syntax for declaring a parallel collection variable.

Syntax

import scala.collection.parallel.immutable.ParVector

var z: ParVector[String] = ParVector("Zara", "Nuha", "Ayan")

Here, z is declared as a ParVector of String with three members. Because ParVector is immutable, adding a value produces a new collection and leaves the original unchanged:

Command

var myVector: ParVector[String] = z :+ "Naira"
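
To make the effect of :+ concrete, here is a minimal sketch showing that appending returns a new ParVector while the original keeps its elements. The object name AppendDemo and the helper withName are hypothetical names chosen for illustration.

```scala
import scala.collection.parallel.immutable.ParVector

object AppendDemo {
  // Hypothetical helper: returns a new ParVector with `name` appended.
  // The input collection is not modified.
  def withName(names: ParVector[String], name: String): ParVector[String] =
    names :+ name

  def main(args: Array[String]): Unit = {
    val z = ParVector("Zara", "Nuha", "Ayan")
    val extended = withName(z, "Naira")
    println(z.length)        // z still has its original three elements
    println(extended.length) // the new collection has four elements
  }
}
```

Keeping the original value intact is what makes it safe to share an immutable parallel collection between threads.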

Processing Parallel Collections

Below is an example program showing how to create, initialize, and process a parallel collection.

Example

import scala.collection.parallel.immutable.ParVector

object Demo {
   def main(args: Array[String]): Unit = {
      val myVector: ParVector[String] = ParVector("Zara", "Nuha", "Ayan")
      // Add an element (returns a new collection)
      val myVector1: ParVector[String] = myVector :+ "Naira"
      // Remove an element (returns a new collection)
      val myVector2: ParVector[String] = myVector.filterNot(_ == "Nuha")
      // Create empty collection
      val myVector3: ParVector[String] = ParVector.empty[String]
      println(myVector)
      println(myVector1)
      println(myVector2)
      println(myVector3)
   }
}

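Save the above program in Demo.scala. Use the following commands to compile and execute this program. On Scala 2.13 or later, the scala-parallel-collections library must be on the classpath, for example by running the program through sbt with the dependency shown above.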

Command

> scalac Demo.scala
> scala Demo

Output

ParVector(Zara, Nuha, Ayan)
ParVector(Zara, Nuha, Ayan, Naira)
ParVector(Zara, Ayan)
ParVector()

Parallel Operations

You can perform operations such as map, filter, and foreach on parallel collections. The work is split across multiple processor cores and executed concurrently.

Example: Parallel Map, Filter, and Foreach

Below is an example of using the map, filter, and foreach operations in parallel collections to perform tasks concurrently.

import scala.collection.parallel.immutable.ParVector

object Demo {
   def main(args: Array[String]) = {
      val numbers: ParVector[Int] = ParVector(1, 2, 3, 4, 5)

      // Perform parallel map operation
      val doubled = numbers.map(_ * 2)
      println("Doubled: " + doubled)

      // Perform parallel filter operation
      val evens = numbers.filter(_ % 2 == 0)
      println("Evens: " + evens)

      // Perform parallel foreach operation
      numbers.foreach(println)
   }
}

Save the above program in Demo.scala. Use the following commands to compile and execute this program.

Command

> scalac Demo.scala
> scala Demo

Output

Doubled: ParVector(2, 4, 6, 8, 10)
Evens: ParVector(2, 4)
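1
2
3
4
5

The five lines printed by foreach may appear in a different order from run to run, because a parallel foreach does not guarantee the order in which elements are processed. The results of map and filter keep the original element order.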
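
When side effects such as printing must happen in a fixed order, one option is to do the computation in parallel and then convert the result back to a sequential collection with seq before calling foreach. The following is a minimal sketch of that idea; the object name OrderedPrintDemo and the helper doubledInOrder are hypothetical.

```scala
import scala.collection.parallel.immutable.ParVector

object OrderedPrintDemo {
  // Hypothetical helper: doubles each element in parallel,
  // then returns the result as a sequential Vector.
  def doubledInOrder(numbers: ParVector[Int]): Vector[Int] =
    numbers.map(_ * 2).seq

  def main(args: Array[String]): Unit = {
    val numbers = ParVector(1, 2, 3, 4, 5)
    // foreach on the sequential Vector visits elements in index order
    doubledInOrder(numbers).foreach(println)
  }
}
```

Here only the map step runs in parallel; the printing is sequential, so the lines appear as 2, 4, 6, 8, 10.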

Converting to Parallel Collections

You can convert a standard Scala collection to a parallel collection using the par method, and then perform parallel operations on the result. On Scala 2.13 or later, par is available only after importing scala.collection.parallel.CollectionConverters._.

Example: Converting to Parallel Collection

Below is an example of converting a standard sequential collection to a parallel collection using the par method.

// Needed on Scala 2.13 or later to bring the par method into scope
import scala.collection.parallel.CollectionConverters._

object Demo {
   def main(args: Array[String]) = {
      val seqVector: Vector[Int] = Vector(1, 2, 3, 4, 5)
      // Convert to parallel collection
      val parVector = seqVector.par
      // Perform parallel operation
      val sum = parVector.sum
      println("Sum: " + sum)
   }
}

Save the above program in Demo.scala. Use the following commands to compile and execute this program.

Command

> scalac Demo.scala
> scala Demo

Output

Sum: 15
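
The par method is not limited to Vector. As a rough sketch, assuming Scala 2.13 with the scala-parallel-collections module on the classpath, the example below converts a List and a Range, and converts a result back to an ordinary List with toList. The names ConvertDemo, squares, and rangeSum are hypothetical.

```scala
import scala.collection.parallel.CollectionConverters._

object ConvertDemo {
  // Hypothetical helper: squares the elements in parallel
  // and returns an ordinary sequential List.
  def squares(xs: List[Int]): List[Int] =
    xs.par.map(x => x * x).toList

  // Hypothetical helper: sums the range 1..n using a parallel range.
  def rangeSum(n: Int): Int = (1 to n).par.sum

  def main(args: Array[String]): Unit = {
    println(squares(List(1, 2, 3))) // List(1, 4, 9)
    println(rangeSum(100))          // 5050
  }
}
```

Converting back to a sequential collection is useful when the rest of the program expects a standard collection type.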

Advanced Parallel Operations

In addition to basic operations like map and filter, parallel collections also support more advanced operations like reduce and fold. These can be performed in parallel for better performance.

Example: Parallel Reduce

The reduce operation combines the elements of a collection using a binary operation. On a parallel collection, the elements are combined in groups that run in parallel, so the operation must be associative (like addition) for the result to be the same on every run.

import scala.collection.parallel.immutable.ParVector

object Demo {
   def main(args: Array[String]) = {
      val numbers: ParVector[Int] = ParVector(1, 2, 3, 4, 5)
      // Perform parallel reduce operation
      val sum = numbers.reduce(_ + _)
      println("Sum: " + sum)
   }
}

Save the above program in Demo.scala. Use the following commands to compile and execute this program.

Command

> scalac Demo.scala
> scala Demo

Output

Sum: 15
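
The requirement that the operation be associative can be illustrated with subtraction, which is not associative. The sketch below is only an illustration; the object name AssociativityDemo and the helper parallelSum are hypothetical, and no particular result is claimed for the parallel subtraction because it depends on how the collection is split at run time.

```scala
import scala.collection.parallel.immutable.ParVector

object AssociativityDemo {
  // Addition is associative, so the grouping chosen at run time does not matter.
  def parallelSum(numbers: ParVector[Int]): Int = numbers.reduce(_ + _)

  def main(args: Array[String]): Unit = {
    val numbers = ParVector(1, 2, 3, 4, 5)
    println(parallelSum(numbers)) // 15

    // Subtraction is not associative: different groupings give different results.
    println((1 - 2) - 3) // -4
    println(1 - (2 - 3)) // 2

    // A sequential reduce always groups from left to right.
    println(numbers.seq.reduce(_ - _)) // -13

    // A parallel reduce may group the elements differently,
    // so this value is not guaranteed to be -13.
    println(numbers.reduce(_ - _))
  }
}
```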

Example: Parallel Fold

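The fold operation is similar to reduce, but it starts with an initial value and combines elements using a binary operation. On a parallel collection, the initial value may be used more than once, so it should be a neutral element for the operation (such as 0 for addition), and the operation should be associative.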

import scala.collection.parallel.immutable.ParVector

object Demo {
   def main(args: Array[String]) = {
      val numbers: ParVector[Int] = ParVector(1, 2, 3, 4, 5)
      // Perform parallel fold operation
      val sum = numbers.fold(0)(_ + _)
      println("Sum using fold: " + sum)
   }
}

Save the above program in Demo.scala. Use the following commands to compile and execute this program.

Command

> scalac Demo.scala
> scala Demo

Output

Sum using fold: 15
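
To see why the initial value of a parallel fold should be neutral, consider starting a sum from 1 instead of 0. The sketch below is an illustration only; the names FoldZeroDemo and foldFrom are hypothetical. Each parallel chunk starts from the initial value, so a non-neutral value can be counted several times.

```scala
import scala.collection.parallel.immutable.ParVector

object FoldZeroDemo {
  // Hypothetical helper: sums the elements in parallel, starting from `start`.
  def foldFrom(numbers: ParVector[Int], start: Int): Int =
    numbers.fold(start)(_ + _)

  def main(args: Array[String]): Unit = {
    val numbers = ParVector(1, 2, 3, 4, 5)
    // 0 is neutral for addition, so the result is always 15
    println(foldFrom(numbers, 0))
    // 1 is not neutral: it is added once per chunk, so the result
    // is at least 16 and may be larger depending on how the work is split
    println(foldFrom(numbers, 1))
  }
}
```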

Parallel Collections Summary

  • Scala parallel collections provide parallel execution of operations on collections.
  • You can declare and process parallel collections similar to standard collections.
  • Operations such as map, filter, reduce, and foreach can be performed concurrently.
  • Standard collections can be converted to parallel collections using the par method.
  • Parallel collections have limitations: side-effecting operations can cause race conditions, and non-associative operations can produce inconsistent results.
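
The point about side effects and race conditions can be sketched as follows, assuming Scala 2.13 with the scala-parallel-collections module. The names SideEffectDemo, unsafeTotal, and safeTotal are hypothetical. The first helper updates a shared variable from several threads without synchronization; the second lets the collection combine the partial results itself.

```scala
import scala.collection.parallel.CollectionConverters._

object SideEffectDemo {
  // Unsafe: several threads update `total` without synchronization,
  // so some updates can be lost.
  def unsafeTotal(n: Int): Int = {
    var total = 0
    (1 to n).par.foreach(i => total += i)
    total
  }

  // Safe: no shared mutable state; the library combines partial sums.
  def safeTotal(n: Int): Int = (1 to n).par.sum

  def main(args: Array[String]): Unit = {
    println(safeTotal(10000))   // 50005000
    println(unsafeTotal(10000)) // may differ from run to run
  }
}
```

Preferring built-in operations such as sum, reduce, or fold over a foreach that mutates shared state avoids this class of bug.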