42
42L
1.618f
1.618
true
'a'
"Hello!"
res0_0: Int = 42 res0_1: Long = 42L res0_2: Float = 1.618F res0_3: Double = 1.618 res0_4: Boolean = true res0_5: Char = 'a' res0_6: String = "Hello!"
Broadly speaking, the “schism” between imperative programming and functional programming comes down to the basic notion of what defines a computation step. In the imperative computational model, we focus on executing statements for its effects on a memory. A program consists of a sequence of statements (or sometimes called commands or instructions) that is largely viewed as fixed and separate from the memory (or sometimes called the store) that it is modifying. Assembly languages and C are often held as examples of imperative programming. In the functional computational model, we focus on evaluating expressions, that is, rewriting expressions until we obtain a value. A program and the computation “state” is an expression (also sometimes called a term). To a first approximation, there is no external memory. Expression rewriting is actually not so unfamiliar. Primary school arithmetic is expression evaluation:
\[ \begin{array}{rcl} (1 + 1) + (1 + 1) & \longrightarrow& 2 + (1 + 1) \\ & \longrightarrow& 2 + 2 \\ & \longrightarrow& 4 \end{array} \]
where the \(\longrightarrow\) arrow signifies an evaluation rewriting step.
In actuality, the “schism” is somewhat false. Few languages are exclusively imperative or exclusively functional in the sense defined above. “Imperative programming languages” have effect-free expression subsets (e.g., for arithmetic), while “functional programming languages” have effectful expressions (e.g., for printing to the screen). Being effect-free or pure has advantages by being independent of how a machine evaluates expressions (i.e., called referential transparency). For example, the final result does not depend on the order of evaluation (e.g., whether the left \((1 + 1)\) or the right \((1 + 1)\) is evaluated first, or whether they are done in parallel), which makes it easier to reason about programs in isolation (e.g., the meaning of \((1 + 1) + (1 + 1)\)) and for compilers to optimize your code. At the same time, interacting with the underlying execution engine can be powerful, and thus we at times want effects. The potentially surprising idea at this point is how often we can program effectively without effects.
We consider and want to support both effect-free and effectful computation. The take-home message here is it is too simplistic to say a programming language is imperative or functional. Rather, we see that it is a bias in perspective in how we see computation and programs. For imperative languages, programs, and constructs, we speak of statement execution that modifies a memory or data store. For functional languages, programs, and constructs, we think of expression evaluation that reduces to a value or terminal result. We will see how this bias affects, for example, how we program repetition (i.e., looping versus recursion or comprehensions).
Note that the term “functional programming language” is quite overloaded in practice. For example, it may refer to the language having the expression rewriting bias described above, being pure and free of effectful expressions, or having first-class functions(discussed in ).
Many languages, including JavaScript and Scala, have aspects of both, including the features that are often considered the most characteristic: mutation and first-class functions.
We begin our language study by focusing on a small subset of Scala. Our intent is not to give an exhaustive tutorial or manual on Scala, but rather to use Scala as an example language to highlight concepts that underly many other programming languages. For a tutorial on Scala, see, for example, Programming in Scala [1].
Basic expressions, values, and types are seemingly boring, but they also form the basis of any programming language. A value has a type, which defines the operations that can be applied to it. Scala has all the familiar basic types, such as Int
, Long
, Float
, Double
, Boolean
, Char
, and String
.
We can directly write down values of these types using literals:
42
42L
1.618f
1.618
true
'a'
"Hello!"
res0_0: Int = 42 res0_1: Long = 42L res0_2: Float = 1.618F res0_3: Double = 1.618 res0_4: Boolean = true res0_5: Char = 'a' res0_6: String = "Hello!"
An expression can be a literal or consist of operations that await to be evaluated. For example, here are some expected expressions:
44 - 2
!true
true && false
1 < 2
if (1 < 2) 3 else 4
"Hello" + "!"
res1_0: Int = 42 res1_1: Boolean = false res1_2: Boolean = false res1_3: Boolean = true res1_4: Int = 3 res1_5: String = "Hello!"
In the above, we see Scala infers the type of each expression and evaluates each expression to its value. A value is an expression cannot be evaluated any further — the result of evaluating an expression.
We can check that an expression has the expected type as follows:
42: Int
42L: Long
1.618f: Float
1.618: Double
true: Boolean
'a': Char
"Hello!": String
44 - 2: Int
!true: Boolean
true && false: Boolean
1 < 2: Boolean
if (1 < 2) 3 else 4: Int
"Hello" + "!": String
res2_0: Int = 42 res2_1: Long = 42L res2_2: Float = 1.618F res2_3: Double = 1.618 res2_4: Boolean = true res2_5: Char = 'a' res2_6: String = "Hello!" res2_7: Int = 42 res2_8: Boolean = false res2_9: Boolean = false res2_10: Boolean = true res2_11: Int = 3 res2_12: String = "Hello!"
Often, we want to refer to arbitrary values, types, or expressions in a programming language. To do so, we use meta-variables that stand for entities in our language of interest, such as \(v\) for a value, \(\tau\) for a type, and \(e\) for an expression.
We have annotated types on all of the expressions above, that is, we assert that the value that results from evaluating that expression (if one results) should have that type. In this case, all of these examples are well-typed expressions, that is, the typing assertion holds for them.
42: Boolean
cmd3.sc:1: type mismatch;
found : Int(42)
required: Boolean
val res3 = 42: Boolean
^Compilation Failed
:
Compilation Failed
The typing assertion is also an expression, so we can annotate sub-expressions to check that they have the expected type. Doing so becomes useful for debugging when an expression \(e\) becomes complicated.
(44: Int) - 2
res3: Int = 42
Scala is statically typed. In essence, this statement means that the Scala compiler will perform some validation at compile-time (called type checking) and only translate well-typed expressions.
true - 2
cmd4.sc:1: value - is not a member of Boolean
val res4 = true - 2
^Compilation Failed
:
Compilation Failed
We discuss type checking in later chaptersfurther in; for now, it suffices to view type checking as making sure all operations in all sub-expressions have the “expected type.”
(1 + 2) + (3 + 4): Int
- if 1 + 2: Int
- if 1: Int
- if 2: Int
- if 3 + 4: Int
- if 3: Int
- if 4: Int
Here, we check explicitly that each sub-expression has the expected type:
((1: Int) + (2: Int): Int) + ((3: Int) + (4: Int): Int)
res4: Int = 10
We state that an expression \(e\) is well-typed with type \(\tau\) using essentially the same notation as Scala, that is, we write
\[ e: \tau \quad \text{for expression $e$ has type $\tau$.} \]
An expression may not always yield a value. For example, a divide-by-zero expression
42 / 0
java.lang.ArithmeticException: / by zero ammonite.$sess.cmd5$Helper.<init>(cmd5.sc:1) ammonite.$sess.cmd5$.<clinit>(cmd5.sc:7)
generates a run-time error, that is, an error that is raised during evaluation. Some languages are described as being dynamically typed, which means no type checking is performed before evaluation. Rather, a run-time type error is raised when evaluation encounters an operations that cannot be applied to the argument values. In general, the term static means before evaluating the program, while the term dynamic means during the evaluation of the program.
We can also test that an expression has the expected value at run-time with an assert
expression:
assert(44 - 2 == 42)
assert(1 < 2 == true)
assert((if (1 < 2) 3 else 4) == 3)
assert("Hello" + "!" == "Hello!")
Nothing is printed in the above because all of the assert
ions pass (i.e., all of the given expressions to assert
evaluate to true
).
assert(44 - 2 == 0)
java.lang.AssertionError: assertion failed scala.Predef$.assert(Predef.scala:265) ammonite.$sess.cmd7$Helper.<init>(cmd7.sc:1) ammonite.$sess.cmd7$.<clinit>(cmd7.sc:7)
The above is now a run-time error.
Unlike some other common languages (e.g., JavaScript, C, Java), Scala does not distinguish between expressions and statements. Instead, constructs we might consider as “effectful statements” are expressions that have type Unit
.
assert(44 - 2 == 42): Unit
println("Hello!"): Unit
Hello!
The Unit
type has one single value “()
” (usually called the “unit” value).
(): Unit
Since the unit value ()
itself is uninteresting and usually associated with side-effecting expression, the printer in the above simply chooses to not print unit values.
Scala has the all of the usual operators on numeric, Boolean
, and String
data types. For the numeric types, Scala will perform conversions implicitly using methods like toInt
, toLong
, and toString
.
3L + 4
3 + 4L
(3L + 4L).toInt
3.toString
res10_0: Long = 7L res10_1: Long = 7L res10_2: Int = 7 res10_3: String = "3"
An interesting aspect of Scala is that all operators are actually methods.
3 + 4
3.+(4)
"Hello".endsWith("lo")
"Hello" endsWith "lo"
res11_0: Int = 7 res11_1: Int = 7 res11_2: Boolean = true res11_3: Boolean = true
Binary operators like 3 + 4
is just shorthand for a method call 3.+(4)
. That is, these two syntactically different expressions have the same semantics. The term “syntactic sugar” is sometimes used in this case (e.g., 3 + 4
is syntactic sugar for 3.+(4)
). Note in the above that this works for any binary method, not just ones using symbols (e.g., endWith
as above).
We need a way to write down evaluation to describe how values are computed. Recall that in our setting, the computation state is an expression, so we write \[ e\longrightarrow e' \qquad \text{for expression $e$ \emph{steps to} expression $e'$ in one step.} \]
For example, \[ (1 + 2) + (3 + 4) \longrightarrow 3 + (3 + 4) \]
is the only case assuming left-to-right evaluation order.
What exactly is “one step” is a matter of definition, which we do not worry about much at this point. Rather, we may write
\[ e\longrightarrow^{\ast}e' \qquad \text{for expression $e$ \emph{steps to} $e'$ in 0 or more steps,} \]
that is, in some number of steps.
For example, all of the following hold:
\[ \begin{array}{rcl} (1 + 2) + (3 + 4) & \longrightarrow^{\ast}& (1 + 2) + (3 + 4) \\ (1 + 2) + (3 + 4) & \longrightarrow^{\ast}& 3 + (3 + 4) \\ (1 + 2) + (3 + 4) & \longrightarrow^{\ast}& 3 + 7 \\ (1 + 2) + (3 + 4) & \longrightarrow^{\ast}& 10 \end{array} \]
For any expression, the possible next steps dictate how evaluation proceeds and is related to concepts like evaluation order and eager versus lazy evaluation, which we revisit laterin ?sec-operational-semantics and ?sec-procedural-abstraction. Eager evaluation means that sub-expressions are evaluated to values before applying the operation. At this point, it may be hard to imagine anything but eager evaluation. In our current subset of Scala, eager evaluation applies (though Scala supports both).
Sometimes, we only care about the final value of an expression (i.e., its value), so we write \[ e\Downarrow v \qquad \text{for expression $e$ \emph{evaluates to} value $v$.} \]
When we write this particular expression \(e\)
3 + 4
res12: Int = 7
we are asking the interpreter \(\Downarrow\) for its value \(v\), which in this case is 7
. That is, we say that for Scala, the following evaluation relation holds:
\[ 3 + 4 \Downarrow 7 \;. \]
We can see that Scala evaluates +
left-to-right by adding side-effecting expressions, such as printing to console:
{ println("eval((1 + 2) + (3 + 4))");
{ println("- if eval(1 + 2)");
{ println(" - if eval(1)"); 1 } +
{ println(" - if eval(2)"); 2 }
} +
{ println("- eval(3 + 4)");
{ println(" - if eval(3)"); 3 } +
{ println(" - if eval(4)"); 4 } } }
eval((1 + 2) + (3 + 4))
- if eval(1 + 2)
- if eval(1)
- if eval(2)
- eval(3 + 4)
- if eval(3)
- if eval(4)
res13_1: Int = 10
Here, we add a println
printing expression to each sub-expression (using the indention format when we considered static type checking above in Section 3.2.1).