val numbers = List(1, 2, 3)
.length numbers
numbers: List[Int] = List(1, 2, 3) res0_1: Int = 3
We have already seen one standard data type in tuples (see Section 4.4.3).
After tuples, the most commonly-used standard data type is probably List
. A List
is a sequential, functional data structure.
val numbers = List(1, 2, 3)
.length numbers
numbers: List[Int] = List(1, 2, 3) res0_1: Int = 3
Like tuples, the List
type constructor is parametrized by another type. In the above, we have that numbers
is bound to a value of type List[Int]
, that is, a list of integers. A list of strings would have type List[String]
.
While it is very uncommon to do so, it is possible to index into lists:
numbers(0)
numbers(1)
numbers(2)
res1_0: Int = 1 res1_1: Int = 2 res1_2: Int = 3
numbers(3)
java.lang.IndexOutOfBoundsException: 3 scala.collection.LinearSeqOps.apply(LinearSeq.scala:117) scala.collection.LinearSeqOps.apply$(LinearSeq.scala:114) scala.collection.immutable.List.apply(List.scala:79) ammonite.$sess.cmd2$Helper.<init>(cmd2.sc:1) ammonite.$sess.cmd2$.<clinit>(cmd2.sc:7)
Another way for getting the first element is getting the head
of the list:
.head numbers
res3: Int = 1
Scalaism: As all operators in Scala are methods, the expression numbers(0)
is syntactic sugar for a method call to apply
:
.apply(0) numbers
res4: Int = 1
An empty list can also be written as Nil
val empty: List[Int] = List()
val nil: List[Int] = Nil
empty: List[Int] = List() nil: List[Int] = List()
We can then prepend to a list with ::
as follows:
val numbers = List(1, 2, 3)
val consZero = 0 :: numbers
val consTen = 10 :: numbers
val numbersHead = numbers.head
val consZeroHead = consZero.head
val consTenHead = consTen.head
numbers: List[Int] = List(1, 2, 3) consZero: List[Int] = List(0, 1, 2, 3) consTen: List[Int] = List(10, 1, 2, 3) numbersHead: Int = 1 consZeroHead: Int = 0 consTenHead: Int = 10
Note that there is no imperative update here. A List
is an immutable, functional data structure. An immutable, functional data Prepending to numbers
does not change it. Rather consZero
and consTen
are bound to new lists.
Recall from Section 4.3, the note about immutablility enabling efficient representations. Because List
s are immutable, prepending is still a constant-time operation (i.e., \(O(1)\)). The consZero
and consTen
can share the same tail (i.e., numbers
), that is, only 5 nodes are needed in total to represent the lists numbers
, consZero
, and consTen
:
val consZeroTail = consZero.tail
val consTenTail = consTen.tail
consZeroTail eq consTenTail
numbers eq consZeroTailList(1, 2, 3)
numbers eq == List(1, 2, 3) numbers
consZeroTail: List[Int] = List(1, 2, 3) consTenTail: List[Int] = List(1, 2, 3) res7_2: Boolean = true res7_3: Boolean = true res7_4: Boolean = false res7_5: Boolean = true
where in Scala, eq
is the reference equality operator, while ==
is structural equality.
Note that the List(1, 2, 3)
constructor is equivalent to the following:
val numbers = 1 :: 2 :: 3 :: Nil
numbers: List[Int] = List(1, 2, 3)
The ::
operator is often read as “cons”, referencing historically the name cons
in Lisp for the primitive that constructs a cell with two values. Note that ::
is a right-associative binary operator, so it is parsed as like
val numbers = 1 :: (2 :: (3 :: Nil))
.::(3).::(2).::(1) Nil
numbers: List[Int] = List(1, 2, 3) res9_1: List[Int] = List(1, 2, 3)
and as with other binary operators, it is just syntactic sugar for method call.
It is quite common to work with lists directly using pattern matching on Nil
or ::
:
def isEmpty(l: List[Int]): Boolean = l match {
case Nil => true
case head :: tail => false
}
isEmpty(numbers)
defined function isEmpty res10_1: Boolean = false
Note that the List API also defines a method :::
for appending two lists together:
val numbers = List(1, 2, 3)
::: List(4, 5, 6) numbers
numbers: List[Int] = List(1, 2, 3) res11_1: List[Int] = List(1, 2, 3, 4, 5, 6)
The append method :::
is a linear-time operation in the length of its left argument (i.e., \(O(\)numbers.length
\()\) in this example). Why must this be the case?
As noted above, a List
is an immutable, functional data structure.
numbers(1) = 20
cmd12.sc:1: value update is not a member of List[Int]
did you mean updated?
val res12 = numbers(1) = 20
^Compilation Failed
:
Compilation Failed
While it is not common to use this method, the closest analogue is return a new list that is the original list element at index 1
updated
:
val numbers_ = numbers.updated(1, 20)
numbers_: List[Int] = List(1, 20, 3)
While it is also not common to use, a for
loop in Scala enables iteration through a List
:
for (n <- numbers) println(n)
1
2
3
A for
loop in Scala is actually syntactic sugar for a method call to a higher-order method foreach
:
{ n => println(n) } numbers foreach
1
2
3
We call foreach
higher-order, as it takes a function as a parameter of type Int => Unit
(e.g., { n => println(n) }
in the above example). The function parameter, sometimes called a callback, describes what to do for each element of the list.
Note that println
is a function that conforms to Int => Unit
, so we could pass it directly:
.foreach(println) numbers
1
2
3
Higher-order methods are the most common way to use List
s. However, foreach
is less used than others, as the callback of type Int => Unit
must inherently be imperative to do anything interesting. Why? Consider the type of the function we get that awaits receiving the callback to foreach
:
val awaitingCallback: (Int => Unit) => Unit = numbers.foreach(_)
awaitingCallback(println)
1
2
3
awaitingCallback: Int => Unit => Unit = ammonite.$sess.cmd16$Helper$$Lambda$2150/0x0000000800b46840@2223c63d
In the above, read numbers.foreach(_)
more like numbers.foreach
. The extra (_)
is a low-level Scalaism to convert a method into a function that is only needed for this explanation and rarely needed in practice.
Or as another example, consider trying to write a function sum
that sums up a list of integers List[Int]
with a for
loop or foreach
:
def sum(l: List[Int]): Int = {
var acc = 0
for (n <- l) acc += n
acc}
sum(numbers)
defined function sum res17_1: Int = 6
The only way to remember the accumulated sum so far is with a mutable variable var acc
.
Instead, the idiomatic way to compute such a sum is to use another higher-order method that permits accumulation in a functional manner, such as reduce
:
def sum(l: List[Int]): Int = l reduce { (acc, n) => acc + n }
sum(numbers)
defined function sum res18_1: Int = 6
Not only does the sum
definition become a one-liner, but it decouples the scheduling of work on each element of the list and lets the library implement that (e.g., sequentially left-to-right, concurrently, or even distributed!).
While not necessarily recommended, one might sometimes see an alternative Scala syntax for function literals using placeholders _
:
def sum(l: List[Int]): Int = l.reduce(_ + _)
sum(numbers)
defined function sum res19_1: Int = 6
Each placeholder _
corresponds to a formal parameter, so _ + _
is syntactic sugar for (x,y) => x + y
. Like with any diet, take sugar in moderation.
We will consider in subsequent chapters how to effectively use such higher-order methods. For the moment, simply recognize that such higher-order methods exist and is the idiomatic way to work with List
s. Furthermore, this API design is particularly powerful and becoming commonplace in almost all languages (even in Java!). For example, in big-data applications, this design enables streaming where the data can be consumed in an online manner as a stream. Consider the following for a taste:
val l = List(1, 2, 3, 4, 5, 6)
val sumEvens = l filter { i => i % 2 == 0 } reduce { (acc, n) => acc + n }
l: List[Int] = List(1, 2, 3, 4, 5, 6) sumEvens: Int = 12
Or,
val sumEvens = l.filter(_ % 2 == 0).reduce(_ + _)
sumEvens: Int = 12
The foreach
method is an abstraction of the object-oriented Iterator Pattern:
val it = numbers.iterator
while (it.hasNext) {
val n = it.next()
println(n)
}
1
2
3
it: Iterator[Int] = empty iterator
In essence, the foreach
method uses the callback parameter to allow the client to specify the body of the while
loop. A benefit is that common programming errors in using such an object-oriented API—like calling it.next()
after the it
has no more elements—cannot happen when using foreach
.
.next() it
java.util.NoSuchElementException: head of empty list scala.collection.immutable.Nil$.head(List.scala:629) scala.collection.immutable.Nil$.head(List.scala:628) scala.collection.StrictOptimizedLinearSeqOps$$anon$1.next(LinearSeq.scala:253) ammonite.$sess.cmd23$Helper.<init>(cmd23.sc:1) ammonite.$sess.cmd23$.<clinit>(cmd23.sc:7)
As alluded to above, Scala has a rich API for List
s. Such libraries are designed to be extremely generic for many use cases, so they necessarily have a fair amount of complexity. Nonetheless, it is worthwhile getting used to reading such API documentation.
A List
is a immutable, functional sequential collection of elements of the same type (i.e., a singly-linked list), while an Array
is a fixed-size, mutable indexable collection of elements of the same type. In this course, we have little need for Array
, but it exists in Scala for particular use cases.
Another commonly used built-in data type is Option
. It is either a None
for or a Some
of some value:
val none: Option[Int] = None
val some: Option[Int] = Some(42)
none: Option[Int] = None some: Option[Int] = Some(value = 42)
Or, using some API methods on Option
:
val none: Option[Int] = Option.empty
val some: Option[Int] = Option(42)
none: Option[Int] = None some: Option[Int] = Some(value = 42)
The Option
type is useful for methods that may optionally return a value (i.e., would error in some cases).
For example, we may want to define a division method that returns None
if the client attempts to divide by zero:
def div(n: Int, m: Int): Option[Int] = m match {
case 0 => None
case _ => Some(n / m)
}
defined function div
Or, as another example, the head
method for List
s errors if the input list is empty:
val emptyList: List[Int] = Nil
emptyList: List[Int] = List()
val h: Int = emptyList.head
java.util.NoSuchElementException: head of empty list scala.collection.immutable.Nil$.head(List.scala:629) scala.collection.immutable.Nil$.head(List.scala:628) ammonite.$sess.cmd28$Helper.<init>(cmd28.sc:1) ammonite.$sess.cmd28$.<clinit>(cmd28.sc:7)
Instead, the headOption
method returns an Option
using None
for an empty list and Some
for a non-empty list:
val h: Option[Int] = emptyList.headOption
h: Option[Int] = None
We can then work with options also using pattern matching:
def head(l: List[Int]): Option[Int] = l match {
case Nil => None
case h :: _ => Some(h)
}
head(List(1, 2, 3))
defined function head res30_1: Option[Int] = Some(value = 1)
We can think of an Option
value as a 0-or-1 element list and thus all of the higher-order iteration methods are available:
.foreach(println) some
42
Maps are particularly useful data structures for storing associations between keys and values. For example, we describe value environments for a programming language as maps from variables to values in Section 4.1.1.
type Env = Map[String,Int]
val env: Env = Map("nOranges" -> 4, "nApples" -> 7, "nPears" -> 10)
defined type Env env: Env = Map("nOranges" -> 4, "nApples" -> 7, "nPears" -> 10)
We can lookup in maps based on a key:
env("nApples")
.apply("nApples") env
res33_0: Int = 7 res33_1: Int = 7
env("nDogs")
java.util.NoSuchElementException: key not found: nDogs scala.collection.immutable.Map$Map3.apply(Map.scala:399) ammonite.$sess.cmd34$Helper.<init>(cmd34.sc:1) ammonite.$sess.cmd34$.<clinit>(cmd34.sc:7)
As we see above, since a given key may not exist in a map, there is a get
method that instead returns an Option
:
"nApples"
env contains "nApples"
env get "nDogs"
env contains "nDogs" env get
res35_0: Boolean = true res35_1: Option[Int] = Some(value = 7) res35_2: Boolean = false res35_3: Option[Int] = None
Another commonly-used alterative to apply
and get
for lookup in a map is getOrElse
that takes an extra parameter for what to return in the case that the key does not exist:
getOrElse ("nDogs", 0) env
res36: Int = 0
Finally, we often want to extend maps:
val env_ = env + ("nBananas" -> 17)
env_: Map[String, Int] = Map( "nOranges" -> 4, "nApples" -> 7, "nPears" -> 10, "nBananas" -> 17 )
Note that just like with List
, the above is a “functional update” that returns a new Map
where env
still exists:
env env_
res38_0: Env = Map("nOranges" -> 4, "nApples" -> 7, "nPears" -> 10) res38_1: Map[String, Int] = Map( "nOranges" -> 4, "nApples" -> 7, "nPears" -> 10, "nBananas" -> 17 )
A functional update is sometimes where one might want to intentionally shadow (i.e., write val env = env + ("nBananas" -> 17)
in the above) to prevent referencing the unextended env
in a particular scope.
We use the ->
operator with maps for visual clarity, but it is actually not special. It is just an alias for constructing pairs:
"nBananas" -> 17
res39: (String, Int) = ("nBananas", 17)
val env_ = env + (("nBananas", 17))
env_: Map[String, Int] = Map( "nOranges" -> 4, "nApples" -> 7, "nPears" -> 10, "nBananas" -> 17 )
As a collection in the standard library, Map
also has the usual higher-order iteration methods, such as
.foreach(println) env
(nOranges,4)
(nApples,7)
(nPears,10)
The Scala standard library has many other core functional data structures. Another commonly used one is the Set
data structure that keeps a single copy of each element while supporting fast membership testing, union, intersection, and iteration operations.
Scala is also an object-oriented language where code and data can be encapsulated together. A class declaration introduces a new type name, specifies data that it packages together, and defines methods that operate on that data:
class Dog(name: String, breed: String, age: Int) {
override def toString =
s"Woof! My name is $name, I am a $breed, and I am $age years old."
}
val samuel = new Dog("Samuel", "Alsatian", 11)
val bo = new Dog("Bo", "Portuguese Water Dog", 10)
samuel.toString
defined class Dog samuel: Dog = Woof! My name is Samuel, I am a Alsatian, and I am 11 years old. bo: Dog = Woof! My name is Bo, I am a Portuguese Water Dog, and I am 10 years old. res42_3: String = "Woof! My name is Samuel, I am a Alsatian, and I am 11 years old."
In the above, we define a method toString
that overrides the default toString
method that is defined for all objects—that we can see is used as the printer above.
We can see that defining classes in Scala is quite convenient, eliminating a lot of the repetitive boilerplate code with constructors, field declarations, etc. seen in, for example, Java and C++.
Scala has a shorthand for defining a class with single instance (sometimes called a singleton) with the object
keyword:
object Dog {
def birth(name: String, breed: String): Dog =
new Dog(name, breed, 0)
}
val sadie = Dog.birth("Sadie", "Pointer")
.toString sadie
defined object Dog sadie: Dog = Woof! My name is Sadie, I am a Pointer, and I am 0 years old. res43_2: String = "Woof! My name is Sadie, I am a Pointer, and I am 0 years old."
A object
is like a module with function definitions, type definitions, etc. If an object
has the same name as a class
in the same file, then it is the companion object for the class and has special accessibility to that class’s instances.
In relation to Java or C++, we can see the parameters to the class on line 1 as the parameter list to the constructor to create private fields with the same name that are accessible by methods. However, those fields are not accessible outside of the class’s methods:
.name samuel
cmd44.sc:1: value name is not a member of cmd44.this.cmd42.Dog
val res44 = samuel.name
^Compilation Failed
:
Compilation Failed
Often, we just want classes that store associated data together. If the fields are immutable, it is perfectly acceptable for them to be accessible. We can do so as follows to make public val
fields:
class Dog(val name: String, val breed: String, val age: Int)
val samuel = new Dog("Samuel", "Alsatian", 11)
.name samuel
defined class Dog samuel: Dog = ammonite.$sess.cmd44$Helper$Dog@15f44dcf res44_2: String = "Samuel"
However, in this case, we would like to treat the Dog
just like a specialized tuple and use things like pattern matching, but we can’t
val Dog(name, _, _) = samuel
cmd45.sc:1: object Dog is not a case class, nor does it have a valid unapply/unapplySeq member
val Dog(name, _, _) = samuel
^Compilation Failed
:
Compilation Failed
We can do so by making Dog
a case class
:
case class Dog(name: String, breed: String, age: Int)
val samuel = Dog("Samuel", "Alsatian", 11)
.name
samuelval Dog(_, breed, _) = samuel
defined class Dog samuel: Dog = Dog(name = "Samuel", breed = "Alsatian", age = 11) res45_2: String = "Samuel" breed: String = "Alsatian"
We can think of a case class
as a “functional class” used for storing immutable data. We can define methods on such classes as well, but that is somewhat secondary.
In addition to a tuples grouping associated data together, we want to be able to define alternatives:
trait Pet
case class Dog(name: String, breed: String) extends Pet
case class Cat(name: String, breed: String) extends Pet
def greet(pet: Pet): String = pet match {
case Dog(name, _) => s"Woof, $name!"
case Cat(name, _) => s"Meow, $name!"
}
greet(Dog("Samuel", "Altsatian"))
greet(Cat("Jenkins", "Siamese"))
defined trait Pet defined class Dog defined class Cat defined function greet res46_4: String = "Woof, Samuel!" res46_5: String = "Meow, Jenkins!"
A trait
introduces a new type name and is roughly a class
interface. In the above, Dog
and Cat
can be both be a Pet
. The greet
function takes as input a Pet
and uses pattern matching to distinguish whether it is a Dog
or a Cat
.
Pet
in the above example is a very simple algebraic data type. As an aside, an algebraic data type is named such because it combines products of data (i.e., tuples) and sums of data (i.e., cases).
The built-in collection types Option
and List
described above are both algebraic data types. Consider the following Option
-like definition:
sealed trait MyOption
case object MyNone extends MyOption
case class MySome(i: Int) extends MyOption
val none: MyOption = MyNone
val some: MyOption = MySome(42)
def getOrElse(o: MyOption, default: Int): Int = o match {
case MyNone => default
case MySome(i) => i
}
getOrElse(none, 0)
getOrElse(some, 0)
defined trait MyOption defined object MyNone defined class MySome none: MyOption = MyNone some: MyOption = MySome(i = 42) defined function getOrElse res47_6: Int = 0 res47_7: Int = 42
One new thing to note is the sealed
qualifier, which says that all the classes that derive from MyOption
(i.e., the alternatives) are defined here, so the programmer and compiler do not need to worry about other possible cases for MyOption
.
Algebraic data types can also be recursive. Recursive data types is exemplified by List
s, which we revisit subsequentlyin ?sec-recursion.
One difference between the built-in Option
and MyOption
in the above is that Option
is parametrized by a type of the value that may or may not exist. We can extend our definition of MyOption
with a type parameter A
as follows:
sealed trait MyOption[A]
case class MyNone[A]() extends MyOption[A]
case class MySome[A](v: A) extends MyOption[A]
val none: MyOption[Int] = MyNone()
val some: MyOption[Int] = MySome(42)
def getOrElse[A](o: MyOption[A], default: A): A = o match {
case MyNone() => default
case MySome(v) => v
}
getOrElse(none, 0)
getOrElse(some, 0)
defined trait MyOption defined class MyNone defined class MySome none: MyOption[Int] = MyNone() some: MyOption[Int] = MySome(v = 42) defined function getOrElse res48_6: Int = 0 res48_7: Int = 42
Observe that getOrElse
(as well as the MyNone
and MySome
constructors) have a type parameter list (written brackets []
) and a value parameter list (written in parentheses ()
).
The getOrElse
function is generic in the parametrized type A
. Being generic, the getOrElse
function is also called parametric polymorphic.
Note that MyOption
is not quite the same definition as Option
, but it is close.