A code abstraction mechanism like functions is essential to what we would consider a programming language. Let us consider our object language JavaScripty with variables and base types (Section 18.6)
trait Expr // ecaseclassN(n:Double)extends Expr // e ::= ncaseclassVar(x:String)extends Expr // e ::= xcaseclassConstDecl(x:String, e1: Expr, e2: Expr)extends Expr // e ::= const x = e1; e2
defined traitExpr
defined classN
defined classVar
defined classConstDecl
caseclassFun(x:String, e1: Expr)extends Expr // e ::= (x) => e1caseclassCall(e1: Expr, e2: Expr)extends Expr // e ::= e1(e2)
defined classFun
defined classCall
A function literal \(\texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1\) has a formal parameter\(x\) and function body \(e_1\). Note that a function literal is a value because it is an expression that cannot reduce any further until it is called. A function call expression \(e_1\texttt{(}e_2\texttt{)}\) expects \(e_1\) to evaluate to a function literal and \(e_2\) to a value (called the actual argument) to use for the formal parameter in evaluating the function body.
defined functionisValue
defined typeEnvempty: Env = Map()
defined functionlookup
defined functionextend
caseclassDynamicTypeError(e: Expr)extendsException{overridedef toString:String=s"TypeError: in expression $e"}
defined classDynamicTypeError
Functions as we have here are called first-class functions because they are values that can be passed and returned like any other type of values (e.g., numbers, booleans, strings).
For simplicity and to focus in on their essence, all functions have exactly one parameter and are anonymous and cannot be recursive. Since functions are first-class values, we can define multi-parameter functions via currying (i.e., functions that return functions).
19.2 Dynamic Scoping
We first try to implement function call in the most straightforward way. What we will discover is that we have made a historical mistake and have ended up with a form of dynamic scoping.
The evaluation judgment form \(E \vdash e \Downarrow v\) says, “In value environment \(E\), expression \(e\) evaluates to value \(v\).” We extend the definition of this judgment form for function call \(e_1\texttt{(}e_2\texttt{)}\) with the \(\TirName{EvalCall}\) rule as follows:
This rule says that we evaluate \(e_1\) to a function value \(\texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e'\) and evaluate \(e_2\) to a value \(v_2\). Then, we extend the environment to bind the formal parameter \(x\) to the actual argument \(v_2\) to evaluate the function body expression \(e'\) to a value \(v'\).
First, observe that we can only evaluate a call expression \(e_1\texttt{(}e_2\texttt{)}\) if \(e_1\) evaluates to a function. It is a type error if \(e_1\) does not evaluate to a function. This is indeed one of the few run-time errors in JavaScript.
TypeError: in expression Call(N(1.0),N(2.0))
ammonite.$sess.cmd4$Helper.eval(cmd4.sc:17)
ammonite.$sess.cmd5$Helper.<init>(cmd5.sc:1)
ammonite.$sess.cmd5$.<clinit>(cmd5.sc:7)
Now, recall that the scope of a variable in most languages is a static property—for any variable use, the variable binding site it references does not depend on program execution. Dynamic scoping is thus when the binding site of the variable being used does depend on program execution.
If we study \(\TirName{EvalCall}\) closely (Figure 19.1), we can get a hint of the “accidental” appearance of dynamic scoping. The function body expression \(e'\) may have free variable uses that under static scoping should reference variables where the function is defined, but it is being evaluated in a value environment \(E\) that is potentially very different from the value environment when it was defined. Can we come up with example that exhibits dynamic scoping with this \(\TirName{EvalCall}\) rule?
Let us reimplement this judgment for with some instrumentation to show derivations:
To construct an example that exhibits dynamic scoping, we define a function that under static scoping references an outer variable binding that gets shadowed by a variable when its body is later evaluated:
const x =1;const g = (y) => x;((x) =>g(2))(3)
Under static scoping, the variable use x in the function defined on line 2 references the variable binding of x on line 1 and should always return 1. However, using \(\TirName{EvalCall}\) in Figure 19.1, it ends up referencing the variable binding at line 3 and returning 3:
val e_dynamicScoping =ConstDecl("x",N(1),ConstDecl("g",Fun("y",Var("x")),Call(Fun("x",Call(Var("g"),N(2))),N(3))))val v_dynamicScoping =eval(e_dynamicScoping)
The example that exhibits dynamic scoping suggests some possible fixes to implement static scoping. We observe that a free variable use in a function body references a variable binding at the time the function is defined, not when the function body is evaluated. This suggests that a function body should be evaluated in the value environment when it is defined.
A closure is exactly this \(\texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1[E]\) —a pair consisting of a function literal \(\texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1\) and its value environment at the time of its definition \(E\). Function values are now closures:
defined classClosure
defined functionisValue
defined typeEnvempty: Env = Map()
defined functionlookup
defined functionextend
We add a rule \(\TirName{EvalFun}\) that says that evaluating a function literal creates a closure. Then, evaluating the function body \(e'\) in \(\TirName{EvalCall}\) uses the value environment from the closure \(E'\):
\(\fbox{$E \vdash e \Downarrow v$}\)
\(\inferrule[EvalFun]{
}{
E \vdash
\texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e
\Downarrow
\texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e [ E]
{
}
}\)
\(\inferrule[EvalCall]{
E \vdash e_1 \Downarrow \texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e'[E']
\and
E \vdash e_2 \Downarrow v_2
\and
E'[x \mapsto v_2] \vdash e' \Downarrow v'
}{
E \vdash
e_1\texttt{(}e_2\texttt{)}
\Downarrow v'
}\)
defined functionevalv_dynamicScopingFixed: Expr = N(n = 1.0)
19.4 Substitution
This observation about “accidental” dynamic scoping also suggests another strategy for implementing static scoping. We avoid the chance of dynamic scoping if we avoid free variables, that is, we maintain the invariant that we evaluate only closed expressions. It is possible to maintain this invariant by using substitution.
We write \({}[e_1/x_1]e\) for a scope-respecting substitution of expression \(e_1\) for free variable uses of \(x_1\) in expression \(e\). This function can be defined by induction on the structure of \(e\), though it does require some care to respect binding and scope. In particular, substitution applies to free variable uses of \(x_1\) and must be capture-avoiding (i.e., avoiding the capture of any free variable uses in \(e_1\)).
Given a scope-respecting substitution, we define an evaluation judgment for only closed expressions again \(e \Downarrow v\).
We return to the case where function literals are values (though they will be closed).
In \(\TirName{EvalConstDecl}\) and \(\TirName{EvalCall}\), we can see that we use substitution to effectively “apply” the value environment eagerly one-binding-at-a-time to the expression so that we never need to reify it:
There is no \(\TirName{EvalVar}\) rule because variable uses are replaced by the values their bound when the binding site is evaluated. And there is no \(\TirName{EvalFun}\) because function literals are again function values.
19.5 Recursive Functions
Thus far we have considered anonymous function literals \(\texttt{(}y\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e'\) that cannot be recursive. To allow for recursive function definitions, we enrich the function expression Fun with a parameter for an optional variable name to refer to itself:
Observe that we define \(x^{?}\) as the non-terminal for an optional variable.
When a function expression has a name \(x\texttt{(}y\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e'\), then it is can be recursive. In particular, variable \(x\) is an additional formal parameter, and the function body \(e'\) may have free variable uses of \(x\). The variable \(x\) gets bound to itself (i.e., the function value for \(x\texttt{(}y\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e'\)) on a function call.
In terms of the Expr representation, the xopt can be Some(\(x\)) corresponding to \(x\texttt{(}y\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1\) or None corresponding to \(\texttt{(}y\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1\).
Note that we consider \(x^{?}\texttt{(}y\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1\) abstract syntax. In particular, \(x\texttt{(}y\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1\) is not valid concrete syntax in JavaScript, as we discuss next.
Exercise 19.1 (Big-Step Semantics for Potentially-Recursive Functions) Give a rule \(\TirName{EvalCallRec}\) that defines function call to a named-function literal \(x\texttt{(}y\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e_1\). Either extend the evaluation judgment form for potentially-open expressions with value environments \(E \vdash e \Downarrow v\) using closures or the evaluation judgment form for closed expressions \(e \Downarrow v\) or both.
19.6 JavaScripty: Concrete Syntax: Functions
Recall from Section 14.5 that in the concrete syntax, \(\mathbf{const}\)-bindings are declarations (and not expressions). \[
\begin{array}{rrrl}
\text{declarations} & d& \mathrel{::=}&
\mathbf{const}\;x\;\texttt{=}\;e\texttt{;}
\mid s
\mid d_1\;d_2
\mid\varepsilon
\\
\text{statements} & s& \mathrel{::=}&
e\texttt{;}
\mid\texttt{\{}\;d\;\texttt{\}}
\mid\texttt{;}
\\
\text{expressions} & e& \mathrel{::=}&
\texttt{(}e\texttt{)}
\mid\cdots
\mid e_1\texttt{(}e_2\texttt{)}
\mid\texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e
\\
& & \mid& \texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} \texttt{\{}\;body\;\texttt{\}}
\mid\mathop{\mathbf{function}} \mathrel{x^{?}\texttt{(}y\texttt{)}} \texttt{\{}\;body\;\texttt{\}}
\\
\text{variables} & x, y
\end{array}
\]
To accommodate declarations in function bodies, JavaScript has additional concrete syntax for function literals (in addition to \(\texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} e\)): \[
\texttt{(}x\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} \texttt{\{}\;body\;\texttt{\}} \quad \text{and} \quad \mathop{\mathbf{function}} \mathrel{x^{?}\texttt{(}y\texttt{)}} \texttt{\{}\;body\;\texttt{\}}
\]
In both of these variants, a function body \(body\) is surrounded by curly braces (i.e., \(\texttt{\{}\;\;\texttt{\}}\)) and consists of a declaration \(d\) (e.g., for \(\mathbf{const}\)-bindings) followed by a \(\mathbf{return}\) keyword, a return-expression \(e\), and a trailing \(\texttt{;}\):
Note that JavaScript permits function bodies that leave out the \(\mathbf{return}\) keyword or the return-expression \(e\). When the \(\mathbf{return}\) keyword is left out, the meaning of a function body is to implicitly return \(\mathbf{undefined}\).
The \(\mathbf{function}\) keyword syntax may have a function name but whose definition must be a function body \(\texttt{\{}\;body\;\texttt{\}}\).