When we considered just a single type (e.g., numbers) in Section 21.2, we defined a one-step reduction relation such that for any closed expression \(e\): either \(e\;\mathsf{value}\) (i.e., \(e\) is a value) or \(e\longrightarrow e'\) for some \(e'\) (i.e., \(e\) can take a step to some \(e'\)). This property is very nice, but as soon as we added another type of value, things got messy. We considered different possible designs:
We defined conversions between different types of values (e.g., coercing values \(v\) to numbers \(n\) with the judgment form \(v \rightsquigarrow n\)).
We defined dynamic type checking with the judgment form \(e\longrightarrow r\) where step-result \(r\mathrel{::=}\mathop{\mathsf{typeerror}}e \mid e'\) is either a type-error result or a one-step reduced expression.
While conversions preserve the nice property that every expression is either a value or has a next step, a drawback with conversions is that some types of values simply do not have sensible conversions. For example, how should the number \(\texttt{3}\) convert to a function value?
Dynamic type checking changes the judgment form \(e\longrightarrow r\) so that the next step could be to a \(\mathsf{typeerror}\), but that adds complexity for identifying and propagating errors.
Either choice comes at a cost in complexity.
26.1 JavaScripty: Numbers and Functions
26.1.1 Syntax
Let us consider JavaScripty just with number literals, anonymous function literals, and function call expressions:
trait Expr // ecaseclassN(n:Double)extends Expr // e ::= ncaseclassFun(x:String, e1: Expr)extends Expr // e ::= (x) => e1caseclassVar(x:String)extends Expr // e ::= xcaseclassCall(e1: Expr, e2: Expr)extends Expr // e ::= e1(e2)defisValue(e: Expr):Boolean= e match{caseN(_)|Fun(_, _)=>truecase _ =>false}
defined traitExpr
defined classN
defined classFun
defined classVar
defined classCall
defined functionisValue
26.1.2 Small-Step Operational Semantics
The small-step operational semantics just consists of reducing function call expressions:
defsubst(v: Expr, x:String, e: Expr)={defsubst(e: Expr): Expr = e match{caseN(_)=> ecaseFun(y, e1)=>if(x == y) e elseFun(y,subst(e1))caseVar(y)=>if(x == y) v else ecaseCall(e1, e2)=>Call(subst(e1),subst(e2))}subst(e)}defstep(e: Expr): Expr = e match{// DoCallcaseCall(Fun(x, e1), v2)ifisValue(v2)=>subst(v2, x, e1)// SearchCall2caseCall(v1, e2)ifisValue(v1)=>Call(v1,step(e2))// SearchCall1caseCall(e1, e2)=>Call(step(e1), e2)}
defined functionsubst
defined functionstep
26.2 Getting Stuck
In Figure 26.2, we restate the small-step operational semantics rules for reducing function calls. Observe in the \(\TirName{DoCall}\) rule that a reduction step only makes sense if we are calling a function value. Otherwise, the set of rules simply say that call expressions are evaluated left-to-right and that both the function and the argument expressions must be values before continuing to evaluating with the body of the function. This latter choice is known as call-by-value semantics; we will return to this notion in ?sec-call-by-name.
Note that these rules do not say anything about how to evaluate an ill-typed expression, such as \[
e_{\text{illtyped}} \colon\quad
\texttt{3}\texttt{(}\texttt{4}\texttt{)}
\] Intuitively, evaluating this expression should result in an error. We do not state this error explicitly. Rather, we see that this is an expression that is (1) not value \(v\) and (2) can make no further progress (i.e., there’s no reduction rule that specifies a next-step expression \(e'\)). We call such an expression a stuck expression, which captures the idea that it is erroneous in some way.
In implementation, we get undefined behavior (e.g., crashing with a MatchError):
scala.MatchError: N(4.0) (of class ammonite.$sess.cmd0$Helper$N)
ammonite.$sess.cmd1$Helper.step(cmd1.sc:11)
ammonite.$sess.cmd1$Helper.step(cmd1.sc:15)
ammonite.$sess.cmd3$Helper.<init>(cmd3.sc:1)
ammonite.$sess.cmd3$.<clinit>(cmd3.sc:7)
26.3 Dynamic Typing
As we saw in our introduction to dynamic typing (Section 21.3), another formalization and implementation choice would be to make such ill-typed expressions step to an error token with an updated the judgment form \(e\longrightarrow r\):
An ill-typed function call now step to \(\mathsf{typeerror}\) with rule \(\TirName{TypeErrorCall}\). We also need to extend rules for evaluating other all other expression forms that propagate the \(\mathsf{typeerror}\) token if one is encountered in searching for a redex. We show \(\TirName{PropagateCall1}\) and \(\TirName{PropagateCall2}\), which are two such rules, that correspond to \(\TirName{SearchCall1}\) and \(\TirName{SearchCall2}\), respectively.
With this instrumentation, we distinguish a dynamic type error for any other reason for getting stuck. For example, an expression with free variables, such as
should get stuck. Since our one-step evaluation relation is intended for closed expressions, we should view this as an internal error of the interpreter implementation rather than an error in the input program.
scala.MatchError: Var(f) (of class ammonite.$sess.cmd0$Helper$Var)
ammonite.$sess.cmd5$Helper.step(cmd5.sc:1)
ammonite.$sess.cmd5$Helper.step(cmd5.sc:11)
ammonite.$sess.cmd8$Helper.<init>(cmd8.sc:1)
ammonite.$sess.cmd8$.<clinit>(cmd8.sc:7)
26.4 Static Typing
We saw how “bad” expressions, such as, \[
e_{\text{illtyped}}\colon\quad \texttt{3(4)}
\] are erroneous in that they “get stuck” according to our simpler small-step operational semantics or result in a \(\mathsf{typeerror}\) according to our semantics with dynamic typing. This expression is “bad” because a call expression \(e_1\texttt{(}e_2\texttt{)}\) is only applicable to function values. We say that such an expression \(\texttt{3(4)}\) is ill-typed or not well-typed.
A type is a classification of values that characterizes the valid operations for these values. A type system consists of a language of types and a typing judgment that defines when an expression has a particular type. When we say that an expression \(e\) has a type \(\tau\) (written with judgment form \(e : \tau\)), we mean that if \(e\) evaluates to a value \(v\), then that value \(v\) should be of type \(\tau\). In this way, a type system predicts some property about how an expression evaluates at run-time.
What is interesting is that we can design a type system to rule out ill-typed expressions before evaluation. If we only permit well-typed expressions to run, then we can use our simpler small-step operational semantics and know that we never get stuck getting to a value. This observation leads to the well-known quote:
Well-typed programs can’t go wrong. — Robin Milner [1]
Correspondingly, we can use the simpler interpreter implementation and know that a crash corresponds to an internal error of the implementation rather than an error in the input program.
26.5 TypeScripty: Numbers and Functions
Let us consider a statically-typed language that we affectionately call TypeScripty. In fact, like with JavaScripty and JavaScript, we make TypeScripty a subset of TypeScript whenever possible or indicate when we make a choice to make them differ.
26.5.1 Syntax
Let us consider the abstract syntax of TypeScripty with numbers and functions:
In Figure 26.4, we show a language of types \(\tau\) that includes base types for numbers \(\texttt{number}\) and a constructed type for function values \[
\texttt{(}x\texttt{:}\,\tau\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} \tau'
\] A function type \(\texttt{(}x\texttt{:}\,\tau\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} \tau'\) classifies function values that has a parameter \(x\) of type \(\tau\) to produce a return value of type \(\tau'\). Compared with JavaScripty syntax (Figure 26.1), our expression language \(e\) has been modified just slightly to add type annotations to function parameters. Take note of the parallel between values \(v\) and types \(\tau\)—specifically, each type of value has a form in the type language \(\tau\).
We can represent the abstract syntax of TypeScripty with functions in Scala as follows:
trait Typ // τcaseobject TNumber extends Typ // τ ::= numbercaseclassTFun(xt:(String, Typ), tret: Typ)extends Typ // τ ::= (x: τ) => τ'trait Expr // ecaseclassN(n:Double)extends Expr // e ::= ncaseclassFun(xt:(String, Typ), e1: Expr)extends Expr // e ::= (x: τ) => e1caseclassVar(x:String)extends Expr // e ::= xcaseclassCall(e1: Expr, e2: Expr)extends Expr // e ::= e1(e2)defisValue(e: Expr):Boolean= e match{caseN(_)|Fun(_, _)=>truecase _ =>false}
defined traitTyp
defined objectTNumber
defined classTFun
defined traitExpr
defined classN
defined classFun
defined classVar
defined classCall
defined functionisValue
For the Expr type, the only change compared with the abstract syntax representation of JavaScripty (Section 26.1.1) is this additional Typ parameter in the Fun constructor.
26.5.2 Small-Step Operational Semantics
Let us consider the small-step operational semantics of TypeScripty with functions:
The small-step operational semantics of TypeScripty here with functions are the same as JavaScripty with functions (Figure 26.2).
defsubst(v: Expr, x:String, e: Expr)={defsubst(e: Expr): Expr = e match{caseN(_)=> ecaseFun(yt @ (y, _), e1)=>if(x == y) e elseFun(yt,subst(e1))caseVar(y)=>if(x == y) v else ecaseCall(e1, e2)=>Call(subst(e1),subst(e2))}subst(e)}defstep(e: Expr): Expr ={require(!isValue(e)) e match{// DoCallcaseCall(Fun((y, _), e1), v2)=>subst(v2, y, e1)// SearchCall2caseCall(v1, e2)ifisValue(v1)=>Call(v1,step(e2))// SearchCall1caseCall(e1, e2)=>Call(step(e1), e2)}}
defined functionsubst
defined functionstep
26.6 Typing Judgment
We want to judge when an expression \(e\) will evaluate to a value \(v\) of a particular type \(\tau\). We have already seen that this judgment form
\[
e : \tau
\]
that says, “Expression \(e\) has type \(\tau\).” What we mean by “has type” is that the expression \(e\) evaluates to a value of the corresponding type \(\tau\).
Like with evaluation, a typing judgment form is defined by a set of typing rules that is the first step towards defining a type checking algorithm. Hence, typing is often called the static semantics of a language, while evaluation is the dynamic semantics.
A type error is an expression that violates the prescribed typing rules (i.e., may produce a value outside the set of values that it is supposed to have). We define a typing judgment form inductively on the syntactic structure of expressions.
Recall from our earlier discussion on binding that to give a semantics and hence a type to an expression with free variable, we need an environment. For example, consider the expression \[
\texttt{f}\texttt{(}\texttt{4}\texttt{)}
\] Is this expression well-typed? It depends. If in the environment, the variable \(\texttt{f}\) is stated to have type \(\texttt{(}x\texttt{:}\,\texttt{number}\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} \tau'\), then it is well-typed; otherwise, it is not. We see that the type of an expression \(e\) depends on a type environment\(\Gamma\) that gives the types of the free variables of \(e\):
that says informally, “In typing environment \(\Gamma\), expression \(e\) has type \(\tau\).” Observe how similar this judgment form is to our big-step evaluation judgment form \(E \vdash e \Downarrow v\) from Section 18.5. This parallel is more than a mere coincidence. A standard type checker works by inferring the type of an expression by recursively inferring the type of each sub-expression. A big-step interpreter computes the value of an expression by recursively computing the value of each sub-expression. In essence, we can view a type checker as an abstract evaluator over a type abstraction of concrete values.
In Figure 26.6, we define typing of TypeScripty with number literals, function literals, and function call expressions. The first two rules \(\TirName{TypeNumber}\) and \(\TirName{TypeFunction}\) describe the types of values. The type of the number literal \(n\) is \(\texttt{number}\) as expected. The \(\TirName{TypeFunction}\) rule is more interesting:
A function value has a function type \(\texttt{(}y\texttt{:}\,\tau\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} \tau'\) (also sometimes called simply an “arrow” type) whose parameter type is \(\tau\) and return type is \(\tau'\). The return type \(\tau'\) is obtained by inferring the type of the body expression \(e\) under the extended environment \(\Gamma, x : \tau\). In TypeScript, the parameter name \(y\) in the function type \(\texttt{(}y\texttt{:}\,\dots\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} \dots\) is essentially inconsequential and does not need to match the parameter name \(x\) in the function literal \(\texttt{(}x\texttt{:}\,\dots\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} \dots\).
For a closed expression \(e\), we write \(e : \tau\) for \(\cdot \vdash e : \tau\), that is, well-typed with an empty type environment: \[
\inferrule{
\cdot \vdash e : \tau
}{
e : \tau
}
\]
We can translate the rules defining the typing judgment forms \(e : \tau\) and \(\Gamma \vdash e : \tau\) into a Scala implementation as follows:
defhastype(e: Expr): Typ ={defhastype(tenv: TEnv, e: Expr): Typ = e match{// TypeNumbercaseN(_)=> TNumber// TypeFunctioncaseFun((x,t), e1)=>TFun((x,t),hastype(tenv +(x -> t), e1))// TypeVarcaseVar(x)=>tenv(x)// TypeCallcaseCall(e1, e2)=>hastype(tenv, e1)match{caseTFun((_,t), tret)if t ==hastype(tenv, e2)=> tret}}hastype(Map.empty, e)}
defined functionhastype
To test, let us consider the ill-typed expression from above, as well as a well-typed one: \[
e_{\text{welltyped}} \colon\quad \texttt{(}\texttt{(}\texttt{i}\texttt{:}\,\texttt{number}\texttt{)} \mathrel{\texttt{=}\!\texttt{>}} \texttt{i}\texttt{)}\texttt{(}\texttt{4}\texttt{)}
\qquad\qquad\qquad
e_{\text{illtyped}} \colon\quad \texttt{3}\texttt{(}\texttt{4}\texttt{)}
\]
val e_welltyped =Call(Fun(("i", TNumber),Var("i")),N(4))val e_illtyped =Call(N(3),N(4))
In the case of the ill-typed term, this implementation of hastype simply crashes with a MatchError:
hastype(e_illtyped)
scala.MatchError: TNumber (of class ammonite.$sess.cmd9$Helper$TNumber$)
ammonite.$sess.cmd12$Helper.hastype$1(cmd12.sc:10)
ammonite.$sess.cmd12$Helper.hastype(cmd12.sc:14)
ammonite.$sess.cmd15$Helper.<init>(cmd15.sc:1)
ammonite.$sess.cmd15$.<clinit>(cmd15.sc:7)
To give a better error message to the programmer, we may want to identify the unexpected, bad type tbad for a particular sub-expression esub of input expression e:
caseclassStaticTypeError(tbad: Typ, esub: Expr, e: Expr)extendsException{overridedef toString:String=s"invalid type ${tbad} for sub-expression ${esub} in ${e}"}defhastype(e: Expr): Typ ={defhastype(tenv: TEnv, e: Expr): Typ = e match{// TypeNumbercaseN(_)=> TNumber// TypeFunctioncaseFun((x,t), e1)=>TFun((x,t),hastype(tenv +(x -> t), e1))// TypeVarcaseVar(x)=>tenv(x)// TypeCallcaseCall(e1, e2)=>hastype(tenv, e1)match{caseTFun((_,t), tret)if t ==hastype(tenv, e2)=> tretcase tbad =>throwStaticTypeError(tbad, e1, e)}}hastype(Map.empty, e)}
defined classStaticTypeError
defined functionhastype
hastype(e_illtyped)
invalid type TNumber for sub-expression N(3.0) in Call(N(3.0),N(4.0))
ammonite.$sess.cmd16$Helper.hastype$1(cmd16.sc:16)
ammonite.$sess.cmd16$Helper.hastype(cmd16.sc:19)
ammonite.$sess.cmd17$Helper.<init>(cmd17.sc:1)
ammonite.$sess.cmd17$.<clinit>(cmd17.sc:7)
26.7 Type Soundness
Our goal has been to design a type system such that whenever we say an expression \(e\) is well-typed (i.e., has a type \(\tau\) for some type \(\tau\)), then it can never get stuck in a step (i.e., \(e \longrightarrow e'\) for some reduced expression \(e'\)) in reducing to a value. This proposition is a meta-property relating our typing judgment form \(e : \tau\) and our reduction-step judgment form \(e \longrightarrow e'\) that we break up into two parts, progress and preservation:
Proposition 26.1 (Progress) If \(e : \tau\), then \(e \longrightarrow e'\) for some expression \(e'\).
Proposition 26.2 (Preservation) If \(e \longrightarrow e'\) and \(e : \tau\), then \(e' : \tau\).
If we can prove these propositions, then we say that our type system is sound, that is, it correctly classifies expressions that on evaluation, will never get stuck or result in an error. If our type system correctly claims that an expression \(e\) is well-typed, then every step in iteratively reducing \(e\) to a value will remain well-typed according to Progress and Preservation.
While the typically production scenario is to apply type checking statically before evaluating at full speed without checking, we can operationize this property to test our interpreter implementation. Let us instrument the reduction to a value judgment form \(e \hookrightarrow_{\tau} v\) to say, “Expression \(e\) reduces to a value \(v\) using some number of steps while checking the preservation of type \(\tau\) at each step.”:
\(\fbox{$e \hookrightarrow_{\tau} v$}\)
\(\inferrule[ReducesValue]{
e\;\mathsf{value}
}{
e \hookrightarrow_{\tau} e
}\)
\(\inferrule[ReducesProgressAndPreservation]{
e \longrightarrow e'
\and
\cdot \vdash e' : \tau
\and
e' \hookrightarrow_{\tau} e''
}{
e \hookrightarrow_{\tau} e''
}\)
We define a generic iterate function and a iterateStepPAP function that implements the \(e \hookrightarrow_{\tau} v\) judgment form:
def iterate[A](acc: A)(step: A =>Option[A]): A ={defloop(acc: A): A =step(acc)match{caseNone=> acccaseSome(acc)=>loop(acc)}loop(acc)}defiterateStepPAP(e: Expr): Expr ={// Check e is well-typed in that it doesn't throw StaticTypeError.val ty =hastype(e)// Iterate step while checking type preservation.iterate(e){// ReducesValuecase v ifisValue(v)=>None// ReducesProgressAndPreservationcase e =>{val e_ =step(e)val ty_ =hastype(e_)require(ty == ty_)Some(e_)}}}
defined functioniterate
defined functioniterateStepPAP
in contrast to the “production” iterateStep that type checks statically before evaluation at full speed:
defiterateStep(e: Expr): Expr ={// Check e is well-typed in that it doesn't throw StaticTypeError.val _ =hastype(e)// Iterate step at full speed.iterate(e){// ReducesToValuecase v ifisValue(v)=>None// ReducesToStepcase e =>Some(step(e))}}
defined functioniterateStep
They should evaluate a given expression to the same value:
val v_welltyped_step =iterateStep(e_welltyped)val v_welltyped_steppap =iterateStepPAP(e_welltyped)assert(v_welltyped_step == v_welltyped_steppap)
Milner, R. 1978. A theory of type polymorphism in programming. J. Comput. Syst. Sci. 17, 3 (1978), 348–375. DOI:https://doi.org/10.1016/0022-0000(78)90014-4.