Recall that the most characteristic feature of imperative computation is mutation—that is, executing assignment for its side effect that updates a memory (cf. Section 3.1).
32.1 JavaScripty: Mutable Variables
32.1.1 Syntax
To introduce mutable state, we introduce mutable variables declared as follows:
A declaration allocates a new mutable variable , assigns it an initial value from evaluating , and whose scope is .
We also introduce an assignment expression:
that writes the value of to a memory location named by expression . Note that we use the C and JavaScript-style assignment operator , which unfortunately looks like mathematical equality but is very different.
Let us consider JavaScripty with number literals and mutable variables:
Figure 32.1: Abstract syntax of JavaScripty with number literals and mutable variables.
To focus on mutation, let us drop constant-variable declarations for the moment and have only mutable-variable declarations. The mutable-variable declaration allocates a new memory cell with fresh address , evaluates to a value , stores value in the new memory cell, and evaluates with in scope pointing to the new memory cell.
The assignment expression evaluates to a location value , to a value , updates the memory cell named by with value , and returns .
We introduce the unary, pointer-dereference expression in the abstract syntax to use as an intermediate expression during reduction. It is not present in the concrete syntax of JavaScript, though it corresponds to the pointer-dereference expression in the C and C++ languages.
An address is a memory address for a memory cell that stores some content like values. A location value is a reduced expression that names a memory location.
Note that we do not consider an address to be a value here. In low-level languages like C and C++, an address is called a pointer and is a first-class value (i.e., an address is a is a value). A location value is also called an l-value for the value of an expression on the left-hand–side of an assignment, while a value is called a r-value for the value of an expression on the right-hand–side of an assignment.
trait Expr // ecaseclassN(n:Double)extends Expr // e ::= ncaseclassVar(x:String)extends Expr // e ::= xcaseclassVarDecl(x:String, e1: Expr, e2: Expr)extends Expr // e ::= var x = e1; e2caseclassAssign(e1: Expr, e2: Expr)extends Expr // e ::= e1 = e2caseclassDeref(e1: Expr)extends Expr // e ::= *e1caseclassA(a:Int)extends Expr // e ::= adefisValue(e: Expr):Boolean= e match{caseN(_)=>truecase _ =>false}
defined traitExpr
defined classN
defined classVar
defined classVarDecl
defined classAssign
defined classDeref
defined classA
defined functionisValue
We represent an address in Scala as an A(a) for a positive integer a. We can think of the address A(1) corresponding to the hexadecimal 0x00000004 memory address on a 32-bit machine.
Let consider the example expression with assignment:
JavaScript
var i =1;i =2
AST Representation in Scala
val e_assign =VarDecl("i",N(1),Assign(Var("i"),N(2)))
A mutable variable is a variable that can be updated. We can think of a mutable variable as a box that can be filled with a value, and then the value can be updated by filling the box with a new value. A memory cell is such a box that has an address to reference that box. A memory is then a finite set of such memory cells:
Figure 32.2: Memories for JavaScripty with mutable variables.
We also view a memory as a finite map from addresses to values and write for looking up the value corresponding to address in memory .
32.1.2.2 Location Values
We have noted above that a location value is a reduced expression that names a memory location. Like defining values, we can view this as unary judgment form on expressions:
Figure 32.3: Location values for JavaScripty with mutable variables.
In particular, the location value for a variable is given by the expression for some address .
32.1.2.3 Judgment Form for Imperative Computation
To define a small-step operational semantics with mutable variables, we need to update our reduction-step judgment form to include memories:
that says, “Closed expression with memory reduces to closed expression with updated memory .”
We can see the state of the machine as a pair of the expression and the memory :
and thus the small-step judgment form is . This machine state with a program that we execute for its effects on an off-to-the-side memory is the characteristic feature of imperative computation.
In pure functional computation, the machine state is just the program that we evaluate to a value (i.e., iterate ).
32.1.2.4 Inference Rules for Mutation
The rule says that dereferencing an address reduces to the value stored in the memory cell named by :
Note that it would be equivalent to write as follows:
The rule says that assigning value to memory location in memory updates memory with the cell :
The side-condition that the address is in the domain of memory (i.e., ) says that there is already an allocated memory cell in so that we are writing to mean updating that cell from to in memory .
We shall see that memory addresses are introduced by allocation, so the alternative system that does not check this condition in the rule
is essentially the same (technically, called being bisimilar).
The rules describes allocating a new memory cell for a mutable variable:
We choose a fresh address , which we state with the side condition that and thus the is a new cell in the updated memory . The reduced expression is interesting. The scope of variable is the continuation expression , so we must eliminate free-variable occurrences of in . We do this by substituting the location value corresponding to in .
In the following, we repeat the above rules along with the needed for mutable variables:
Figure 32.4: Small-step operational semantics with mutable variables.
The rule says that if is not a location value, then we need to reduce it to be able to do the assignment:
Note that rule is needed if addresses were first-class values (cf. pointers in C and C++). However, we see that it is not actually needed for this variant of JavaScripty where addresses are not first-class. In this case, we can also restrict the syntax of assignment to
Studying the rule, we see that assignment expressions where is in scope would become on substitution where either or would apply.
Indeed, the actual concrete syntax of JavaScript is restricted such that only certain the expression forms like variables can be written on the left-hand–side of assignment.
32.1.3 Implementation
32.1.3.1 Memories
Let us define Mem as an abstract data type to represent a memory (defined in Figure 32.2) in terms of a Scala Map[A, Expr]:
sealedclass Mem private(m:Map[A, Expr], nextAddr:Int){defapply(a: A): Expr =m(a)overridedef toString:String= m.toStringdef+(av:(A, Expr)): Mem ={val(a, _)= avrequire(m.contains(a))newMem(m + av, nextAddr)}defalloc(v: Expr):(A, Mem)={val fresha =A(nextAddr)(fresha,newMem(m +(fresha -> v), nextAddr +1))}}object Mem {val empty: Mem =newMem(Map.empty,1)}
defined classMem
defined objectMem
The apply and toString methods simply delegate to the corresponding methods on the underlying m:Map[A, Expr].
In addition to underlying m:Map[A, Expr], the additional nextAddr:Int field represents the next available address. The + method for updating the memory checks that the given address to update a is already in the map. The alloc method implements allocating a fresh address fresha: A by using the next available address, extending the map with the new cell (fresha -> v) using the given initialization value v, and advances the next available address to nextAddr +1.
The abstract data type Mem thus maintains a consistency invariant between the map m: Map[A, Expr] and the next available address nextAddr:Int. As a client, any address A that we obtained from alloc must have a corresponding mapping in m:
val(a, m)= Mem.empty.alloc(N(42))
a: A = A(a = 1)
m: Mem = Map(A(1) -> N(42.0))
One further step we could take is to make A an abstract data type where the only way for a client to get an address A is via alloc.
32.1.3.2 Location Values
We define isLValue defining the expression forms that are location values following the judgment form (see Figure 34.5):
defisLValue(e: Expr):Boolean= e match{caseDeref(A(_))=>truecase _ =>false}
defined functionisLValue
32.1.3.3 Substitution
Comparing and , they are the same with respect to binding a variable whose scope is . Thus, substitution works the same for both.
We define substitute(with_e, x, in_e) to implement
with_exin_e
assume that with_e and in_e have non-intersecting free variables (following Figure 21.1 in Section 21.9):
While the above implementation of step faithfully implements the small-step operational semantics judgment form , we see it requires threading explicitly different versions of the memory state m: Mem (e.g., m_), which could be error prone.
Recall the idea of representing mutation effects by encapsulating a function of type S =>(S, A) for a state type S and a main value type A into a data type DoWith[S, A] (see Section 30.5):
DoWith._
sealedclass DoWith[S, A]private(doer: S =>(S, A)){def map[B](f: A => B): DoWith[S, B]=new DoWith[S, B]({(s: S)=>{val(s_, a)=doer(s);(s_,f(a))}})def flatMap[B](f: A => DoWith[S, B]): DoWith[S, B]=new DoWith[S, B]({(s: S)=>{val(s_, a)=doer(s);f(a)(s_)}})defapply(s: S):(S, A)=doer(s)}object DoWith {def doget[S]: DoWith[S, S]=new DoWith[S, S]({ s =>(s, s)})def doput[S](s: S): DoWith[S,Unit]=new DoWith[S,Unit]({ _ =>(s,())})def doreturn[S, A](a: A): DoWith[S, A]=new DoWith[S, A]({ s =>(s, a)})def domodify[S](f: S => S): DoWith[S,Unit]=new DoWith[S,Unit]({ s =>(f(s),())})}import DoWith._
defined classDoWith
defined objectDoWithimport DoWith._
Rearranging the type of step slightly, we see that we can implement the judgment form using a step function of type Expr => Mem =>(Mem, Expr) or Expr => DoWith[Mem, Expr].
For convenience and to warm up, let us start by defining a helper function to the alloc method of Mem using a DoWith[Mem, A]:
defmemalloc(v: Expr): DoWith[Mem, A]= doget flatMap { m =>val(a, m_)= m.alloc(v)doput(m_) map { _ => a }}
defined functionmemalloc
We now translate the explicit state passing version of step from above (Section 32.1.3.4.1) to use an encapsulated DoWith[Mem, Expr] state as follows:
We see that the memory state fades into the background, except where it is explicitly needed. It is in the implementation of the rules where this fading into the background is particularly salient—whatever effect on memory happens in the recursive call to step is just passed along.
To use step, we can still run each step explicitly:
Observe that we have explicitly threaded the memory state in these top-level calls to step to show the intermediate memory state and expressions. That is, we called step(e_assign) to get a DoWith[Mem, Expr] that we then called with Mem.empty to get (m_, e_assign_) and then called the DoWith[Mem, Expr] resulting from step(e_assign_) with the current memory m_.
But we do not have to get the intermediate memory state m_. We can get the DoWith[Mem, Expr] for the two steps and then run it:
The two step calls here look somewhat like having a side-effect on a mutable memory state, but in actuality, immutable memory states are threaded in the background:
One might realize that before considering mutation in the above, we have considered another side-effecting JavaScripty expression in logging to the console:
console.log("Hello, World!")
undefined
The console.log() expression evaluates to a value, logs that value to the console as a side-effect, and evaluates to undefined. Its effect is external to its final value undefined.
We gave this rule for :
that states the logging effect informally with the “ printed” condition.
If we want to describe explicitly that there is log of values (e.g., separated by linefeed characters ), then we need to similarly reify a log state
and extend our small-step judgment with a log state :