19 Functions and Dynamic Scoping

$\newcommand{\TirName}[1]{\text{#1}} \newcommand{\inferrule}[3][]{ \let\and\qquad \begin{array}{@{}l@{}} \TirName{#1} \\ \displaystyle \frac{#2}{#3} \end{array} } \newcommand{\infer}[3][]{\inferrule[#1]{#2}{#3}} $

19.1 Functions Are Values

A code abstraction mechanism like functions is essential to what we would consider a programming language. Let us consider our object language JavaScripty with variables and base types (Section 18.6)

trait Expr                                                       // e
case class N(n: Double) extends Expr                             // e ::= n
case class Var(x: String) extends Expr                           // e ::= x
case class ConstDecl(x: String, e1: Expr, e2: Expr) extends Expr // e ::= const x = e1; e2

defined trait Expr
defined class N
defined class Var
defined class ConstDecl

and extend it with function values:

\[ \begin{array}{rrrll} \text{values} & v& \mathrel{::=}& \texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1 \\ \text{expressions} & e& \mathrel{::=}& e_1\texttt{(}e_2\texttt{)} \\ \text{variables} & x \end{array} \tag{19.1}\]

case class Fun(x: String, e1: Expr) extends Expr // e ::= (x) => e1
case class Call(e1: Expr, e2: Expr) extends Expr // e ::= e1(e2)

defined class Fun
defined class Call

A function literal $\texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1$ has a formal parameter $x$ and function body $e_1$. Note that a function literal is a value because it is an expression that cannot reduce any further until it is called. A function call expression $e_1\texttt{(}e_2\texttt{)}$ expects $e_1$ to evaluate to a function literal and $e_2$ to a value (called the actual argument) to use for the formal parameter in evaluating the function body.

def isValue(e: Expr): Boolean = e match {
  case N(_) | Fun(_, _) => true
  case _ => false
}

type Env = Map[String, Expr]
val empty: Env = Map.empty
def lookup(env: Env, x: String): Expr = env(x)
def extend(env: Env, x: String, v: Expr): Env = {
  require(isValue(v))
  env + (x -> v)
}

defined function isValue
defined type Env
empty: Env = Map()
defined function lookup
defined function extend

case class DynamicTypeError(e: Expr) extends Exception {
  override def toString: String = s"TypeError: in expression $e"
}

defined class DynamicTypeError

Functions as we have here are called first-class functions because they are values that can be passed and returned like any other type of values (e.g., numbers, booleans, strings).

For simplicity and to focus in on their essence, all functions have exactly one parameter and are anonymous and cannot be recursive. Since functions are first-class values, we can define multi-parameter functions via currying (i.e., functions that return functions).

19.2 Dynamic Scoping

We first try to implement function call in the most straightforward way. What we will discover is that we have made a historical mistake and have ended up with a form of dynamic scoping.

The evaluation judgment form $E \vdash e \Downarrow v$ says, “In value environment $E$, expression $e$ evaluates to value $v$.” We extend the definition of this judgment form for function call $e_1\texttt{(}e_2\texttt{)}$ with the $\TirName{EvalCall}$ rule as follows:

$\fbox{$E \vdash e \Downarrow v$}$

$\inferrule[EvalCall]{ E \vdash e_1 \Downarrow \texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e' \and E \vdash e_2 \Downarrow v_2 \and E[x \mapsto v_2] \vdash e' \Downarrow v' }{ E \vdash e_1\texttt{(}e_2\texttt{)} \Downarrow v' }$

Figure 19.1: Defining evaluation of a function call expression that “accidentally” implements dynamic scoping.

This rule says that we evaluate $e_1$ to a function value $\texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e'$ and evaluate $e_2$ to a value $v_2$. Then, we extend the environment to bind the formal parameter $x$ to the actual argument $v_2$ to evaluate the function body expression $e'$ to a value $v'$.

First, observe that we can only evaluate a call expression $e_1\texttt{(}e_2\texttt{)}$ if $e_1$ evaluates to a function. It is a type error if $e_1$ does not evaluate to a function. This is indeed one of the few run-time errors in JavaScript.

Let us implement this judgment form:

def eval(env: Env, e: Expr): Expr = e match {
  // EvalVal
  case v if isValue(e) => v
  // EvalVar
  case Var(x) => lookup(env, x)
  // EvalConstDecl
  case ConstDecl(x, e1, e2) => {
    val v1 = eval(env, e1)
    eval(extend(env, x, v1), e2)
  }
  // EvalCall
  case Call(e1, e2) => eval(env, e1) match {
    case Fun(x, e) => {
      val v2 = eval(env, e2)
      eval(extend(env, x, v2), e)
    }
    case _ => throw DynamicTypeError(e)
  }
}

defined function eval

eval(empty, Call(N(1), N(2)))

TypeError: in expression Call(N(1.0),N(2.0))
  ammonite.$sess.cmd4$Helper.eval(cmd4.sc:17)
  ammonite.$sess.cmd5$Helper.<init>(cmd5.sc:1)
  ammonite.$sess.cmd5$.<clinit>(cmd5.sc:7)

Now, recall that the scope of a variable in most languages is a static property—for any variable use, the variable binding site it references does not depend on program execution. Dynamic scoping is thus when the binding site of the variable being used does depend on program execution.

If we study $\TirName{EvalCall}$ closely (Figure 19.1), we can get a hint of the “accidental” appearance of dynamic scoping. The function body expression $e'$ may have free variable uses that under static scoping should reference variables where the function is defined, but it is being evaluated in a value environment $E$ that is potentially very different from the value environment when it was defined. Can we come up with example that exhibits dynamic scoping with this $\TirName{EvalCall}$ rule?

Let us reimplement this judgment for with some instrumentation to show derivations:

def eval(level: Int, env: Env, e: Expr): Expr = {
  val indent = " " * level
  val v = e match {
    // EvalVal
    case v if isValue(e) => {
      println(s"\n${indent}------------------------ EvalVal")
      v
    }
    // EvalVar
    case Var(x) => {
      val v = lookup(env, x)
      println(s"\n${indent}------------------------ EvalVar")
      v
    }
    // EvalConstDecl
    case ConstDecl(x, e1, e2) => {
      val v1 = eval(level, env, e1)
      val v2 = eval(level + 6, extend(env, x, v1), e2)
      println(s"${indent}------------------------ EvalConstDecl")
      v2
    }
    // EvalCall
    case Call(e1, e2) => {
      eval(level, env, e1) match {
        case Fun(x, e) => {
          val v2 = eval(level + 4, env, e2)
          val v = eval(level + 8, extend(env, x, v2), e)
          println(s"${indent}------------------------ EvalCall")
          v
        }
        case _ => throw DynamicTypeError(e)
      }
    }
  }
  println(s"${indent}$env ⊢ $e ⇓ $v")
  v
}

def eval(e: Expr): Expr = eval(0, empty, e)

defined function eval
defined function eval

We use indention to indicate the different premises of a multi-premise rule:

eval( Call(Fun("x", Var("x")), N(2)) )


------------------------ EvalVal
Map() ⊢ Fun(x,Var(x)) ⇓ Fun(x,Var(x))

    ------------------------ EvalVal
    Map() ⊢ N(2.0) ⇓ N(2.0)

        ------------------------ EvalVar
        Map(x -> N(2.0)) ⊢ Var(x) ⇓ N(2.0)
------------------------ EvalCall
Map() ⊢ Call(Fun(x,Var(x)),N(2.0)) ⇓ N(2.0)

res7: Expr = N(n = 2.0)

To construct an example that exhibits dynamic scoping, we define a function that under static scoping references an outer variable binding that gets shadowed by a variable when its body is later evaluated:

const x = 1;
const g = (y) => x;
((x) => g(2))(3)

Under static scoping, the variable use x in the function defined on line 2 references the variable binding of x on line 1 and should always return 1. However, using $\TirName{EvalCall}$ in Figure 19.1, it ends up referencing the variable binding at line 3 and returning 3:

val e_dynamicScoping =
  ConstDecl("x", N(1),
  ConstDecl("g", Fun("y", Var("x")),
  Call(Fun("x", Call(Var("g"), N(2))), N(3))))

val v_dynamicScoping = eval(e_dynamicScoping)


------------------------ EvalVal
Map() ⊢ N(1.0) ⇓ N(1.0)

      ------------------------ EvalVal
      Map(x -> N(1.0)) ⊢ Fun(y,Var(x)) ⇓ Fun(y,Var(x))

            ------------------------ EvalVal
            Map(x -> N(1.0), g -> Fun(y,Var(x))) ⊢ Fun(x,Call(Var(g),N(2.0))) ⇓ Fun(x,Call(Var(g),N(2.0)))

                ------------------------ EvalVal
                Map(x -> N(1.0), g -> Fun(y,Var(x))) ⊢ N(3.0) ⇓ N(3.0)

                    ------------------------ EvalVar
                    Map(x -> N(3.0), g -> Fun(y,Var(x))) ⊢ Var(g) ⇓ Fun(y,Var(x))

                        ------------------------ EvalVal
                        Map(x -> N(3.0), g -> Fun(y,Var(x))) ⊢ N(2.0) ⇓ N(2.0)

                            ------------------------ EvalVar
                            Map(x -> N(3.0), g -> Fun(y,Var(x)), y -> N(2.0)) ⊢ Var(x) ⇓ N(3.0)
                    ------------------------ EvalCall
                    Map(x -> N(3.0), g -> Fun(y,Var(x))) ⊢ Call(Var(g),N(2.0)) ⇓ N(3.0)
            ------------------------ EvalCall
            Map(x -> N(1.0), g -> Fun(y,Var(x))) ⊢ Call(Fun(x,Call(Var(g),N(2.0))),N(3.0)) ⇓ N(3.0)
      ------------------------ EvalConstDecl
      Map(x -> N(1.0)) ⊢ ConstDecl(g,Fun(y,Var(x)),Call(Fun(x,Call(Var(g),N(2.0))),N(3.0))) ⇓ N(3.0)
------------------------ EvalConstDecl
Map() ⊢ ConstDecl(x,N(1.0),ConstDecl(g,Fun(y,Var(x)),Call(Fun(x,Call(Var(g),N(2.0))),N(3.0)))) ⇓ N(3.0)

e_dynamicScoping: ConstDecl = ConstDecl(
  x = "x",
  e1 = N(n = 1.0),
  e2 = ConstDecl(
    x = "g",
    e1 = Fun(x = "y", e1 = Var(x = "x")),
    e2 = Call(
      e1 = Fun(x = "x", e1 = Call(e1 = Var(x = "g"), e2 = N(n = 2.0))),
      e2 = N(n = 3.0)
    )
  )
)
v_dynamicScoping: Expr = N(n = 3.0)

19.3 Closures

The example that exhibits dynamic scoping suggests some possible fixes to implement static scoping. We observe that a free variable use in a function body references a variable binding at the time the function is defined, not when the function body is evaluated. This suggests that a function body should be evaluated in the value environment when it is defined.

A closure is exactly this $\texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1[E]$ —a pair consisting of a function literal $\texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1$ and its value environment at the time of its definition $E$. Function values are now closures:

\[ \begin{array}{rrrll} \text{expressions} & e& \mathrel{::=}& \texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1 \mid e_1\texttt{(}e_2\texttt{)} \\ \text{values} & v& \mathrel{::=}& \texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1[E] \\ \text{variables} & x \end{array} \]

case class Closure(fun: Fun, env: Env) extends Expr // e ::= (x) => e1[E]

def isValue(e: Expr): Boolean = e match {
  case N(_) | Closure(_, _) => true
  case _ => false
}

type Env = Map[String, Expr]
val empty: Env = Map.empty
def lookup(env: Env, x: String): Expr = env(x)
def extend(env: Env, x: String, v: Expr): Env = {
  require(isValue(v))
  env + (x -> v)
}

defined class Closure
defined function isValue
defined type Env
empty: Env = Map()
defined function lookup
defined function extend

We add a rule $\TirName{EvalFun}$ that says that evaluating a function literal creates a closure. Then, evaluating the function body $e'$ in $\TirName{EvalCall}$ uses the value environment from the closure $E'$:

$\fbox{$E \vdash e \Downarrow v$}$

$\inferrule[EvalFun]{ }{ E \vdash \texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e \Downarrow \texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e [ E] { } }$

$\inferrule[EvalCall]{ E \vdash e_1 \Downarrow \texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e'[E'] \and E \vdash e_2 \Downarrow v_2 \and E'[x \mapsto v_2] \vdash e' \Downarrow v' }{ E \vdash e_1\texttt{(}e_2\texttt{)} \Downarrow v' }$

def eval(env: Env, e: Expr): Expr = e match {
  // EvalVal
  case v if isValue(e) => v
  // EvalVar
  case Var(x) => lookup(env, x)
  // EvalConstDecl
  case ConstDecl(x, e1, e2) => {
    val v1 = eval(env, e1)
    eval(extend(env, x, v1), e2)
  }
  // EvalFun
  case f @ Fun(x, e) => Closure(f, env)
  // EvalCall
  case Call(e1, e2) => eval(env, e1) match {
    case Closure(Fun(x, e_), env_) => {
      val v2 = eval(env, e2)
      eval(extend(env_, x, v2), e_)
    }
    case _ => throw DynamicTypeError(e)
  }
}

val v_dynamicScopingFixed = eval(empty, e_dynamicScoping)

defined function eval
v_dynamicScopingFixed: Expr = N(n = 1.0)

19.4 Substitution

This observation about “accidental” dynamic scoping also suggests another strategy for implementing static scoping. We avoid the chance of dynamic scoping if we avoid free variables, that is, we maintain the invariant that we evaluate only closed expressions. It is possible to maintain this invariant by using substitution.

We write ${}[e_1/x_1]e$ for a scope-respecting substitution of expression $e_1$ for free variable uses of $x_1$ in expression $e$. This function can be defined by induction on the structure of $e$, though it does require some care to respect binding and scope. In particular, substitution applies to free variable uses of $x_1$ and must be capture-avoiding (i.e., avoiding the capture of any free variable uses in $e_1$).

Given a scope-respecting substitution, we define an evaluation judgment for only closed expressions again $e \Downarrow v$.

We return to the case where function literals are values (though they will be closed).

\[ \begin{array}{rrrll} \text{values} & v& \mathrel{::=}& \texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e' \\ \text{expressions} & e& \mathrel{::=}& e_1\texttt{(}e_2\texttt{)} \\ \text{variables} & x \end{array} \]

In $\TirName{EvalConstDecl}$ and $\TirName{EvalCall}$, we can see that we use substitution to effectively “apply” the value environment eagerly one-binding-at-a-time to the expression so that we never need to reify it:

$\fbox{$e \Downarrow v$}$

no $\TirName{EvalVar}$ rule

$\inferrule[EvalConstDecl]{ e_1 \Downarrow v_1 \and {}[v_1/x]e_2 \Downarrow v_2 }{ \mathbf{const}\;x\;\texttt{=}\;e_1\texttt{;}\;e_2 \Downarrow v_2 }$

no $\TirName{EvalFun}$ rule

$\inferrule[EvalCall]{ e_1 \Downarrow \texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e' \and e_2 \Downarrow v_2 \and {}[v_2/x]e' \Downarrow v' }{ e_1\texttt{(}e_2\texttt{)} \Downarrow v' }$

There is no $\TirName{EvalVar}$ rule because variable uses are replaced by the values their bound when the binding site is evaluated. And there is no $\TirName{EvalFun}$ because function literals are again function values.

19.5 Recursive Functions

Thus far we have considered anonymous function literals $\texttt{(}y\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e'$ that cannot be recursive. To allow for recursive function definitions, we enrich the function expression Fun with a parameter for an optional variable name to refer to itself:

case class Fun(xopt: Option[String], y: String, e1: Expr) extends Expr // e ::= xopt(y) => e1

defined class Fun

Correspondingly, let us extend our abstract syntax for JavaScripty as follows:

\[ \begin{array}{rlrll} \text{expressions} & e& \mathrel{::=}& x^{?}\texttt{(}y\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1 \mid e_1\texttt{(}e_2\texttt{)} \\ \text{optional variables} & x^{?}& \mathrel{::=}& x\mid\varepsilon \\ \text{variables} & x \end{array} \]

Observe that we define $x^{?}$ as the non-terminal for an optional variable.

When a function expression has a name $x\texttt{(}y\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e'$, then it is can be recursive. In particular, variable $x$ is an additional formal parameter, and the function body $e'$ may have free variable uses of $x$. The variable $x$ gets bound to itself (i.e., the function value for $x\texttt{(}y\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e'$) on a function call.

In terms of the Expr representation, the xopt can be Some( $x$ ) corresponding to $x\texttt{(}y\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1$ or None corresponding to $\texttt{(}y\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1$.

Note that we consider $x^{?}\texttt{(}y\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1$ abstract syntax. In particular, $x\texttt{(}y\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1$ is not valid concrete syntax in JavaScript, as we discuss next.

Exercise 19.1 (Big-Step Semantics for Potentially-Recursive Functions) Give a rule $\TirName{EvalCallRec}$ that defines function call to a named-function literal $x\texttt{(}y\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1$. Either extend the evaluation judgment form for potentially-open expressions with value environments $E \vdash e \Downarrow v$ using closures or the evaluation judgment form for closed expressions $e \Downarrow v$ or both.

19.6 JavaScripty: Concrete Syntax: Functions

Recall from Section 14.5 that in the concrete syntax, $\mathbf{const}$-bindings are declarations (and not expressions). \[ \begin{array}{rrrl} \text{declarations} & d& \mathrel{::=}& \mathbf{const}\;x\;\texttt{=}\;e\texttt{;} \mid s \mid d_1\;d_2 \mid\varepsilon \\ \text{statements} & s& \mathrel{::=}& e\texttt{;} \mid\texttt{\{}\;d\;\texttt{\}} \mid\texttt{;} \\ \text{expressions} & e& \mathrel{::=}& \texttt{(}e\texttt{)} \mid\cdots \mid e_1\texttt{(}e_2\texttt{)} \mid\texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e \\ & & \mid& \texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} \texttt{\{}\;body\;\texttt{\}} \mid\mathop{\mathbf{function}} \mathrel{x^{?}\texttt{(}y\texttt{)}} \texttt{\{}\;body\;\texttt{\}} \\ \text{variables} & x, y \end{array} \]

To accommodate declarations in function bodies, JavaScript has additional concrete syntax for function literals (in addition to $\texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e$): \[ \texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} \texttt{\{}\;body\;\texttt{\}} \quad \text{and} \quad \mathop{\mathbf{function}} \mathrel{x^{?}\texttt{(}y\texttt{)}} \texttt{\{}\;body\;\texttt{\}} \]

In both of these variants, a function body $body$ is surrounded by curly braces (i.e., $\texttt{\{}\;\;\texttt{\}}$) and consists of a declaration $d$ (e.g., for $\mathbf{const}$-bindings) followed by a $\mathbf{return}$ keyword, a return-expression $e$, and a trailing $\texttt{;}$:

\[ \begin{array}{rrrl} \text{function bodies} & body& \mathrel{::=}& d\;\mathbf{return}\;e\texttt{;} \end{array} \]

Note that JavaScript permits function bodies that leave out the $\mathbf{return}$ keyword or the return-expression $e$. When the $\mathbf{return}$ keyword is left out, the meaning of a function body is to implicitly return $\mathbf{undefined}$.

The $\mathbf{function}$ keyword syntax may have a function name but whose definition must be a function body $\texttt{\{}\;body\;\texttt{\}}$.