defined trait Expr defined class N defined class Plus
Meeting 08 - Abstract Syntax and Parsing
What questions does your neighbor have?
In-Class Slides
In-Class Jupyter
Book Chapter
Announcements
- HW 2 due this
FridayMonday 9/23 6pm
Today
- Revisit last bit of Concrete Syntax
- Abstract Syntax
- Triage Your Questions
- Using VSCode or the terminal to test your code (on coding.csel.io)?
- Auto-testing with GitHub Actions?
- Lab 1?
- Revisit and Go Deeper On:
- Concrete Syntax (Meeting 07), if time permits
Questions?
- Review:
- How do you show that a grammar is ambiguous?
Questions?
One-Slide Review
Concrete syntax is …
Abstract syntax is …
Parsers convert …, which have deal with …
… and … are ways to to describe and deal with a common form of ambiguity.
Concrete syntax is the surface structure of a language (=strings).
Abstract syntax is the deep structure of a language (=trees/terms).
Parsers convert concrete syntax into abstract syntax, which have deal with ambiguity.
Precedence and associativity are ways to describe and deal with a common form of ambiguity.
Motivation
- We’ve seen pitfalls when writing grammars.
- Parser makes life easier down the road in the rest of the tool. We’ve seen it is hard enough to recognize tree structure by itself.
An Ambiguous Grammar
\[\begin{array}{rrrl} \text{expressions} & e & \mathrel{::=}& n \mid e\;\texttt{/}\;e \mid e\;\texttt{-}\;e \end{array}\]
Abstract Syntax Trees
sealed trait Expr // e ::=
case class N(n: Int) extends Expr // n
case class Divide(e1: Expr, e2: Expr) extends Expr // | e1 / e2
case class Minus(e1: Expr, e2: Expr) extends Expr // | e1 - e2
It matches the ambiguous grammar when read as concrete syntax. But it is also standard to write such a grammar and read it (or expect the reader to read it) as abstract syntax.
Abstract Syntax Trees
Minus(N(10), Divide(N(10), N(10)))
Abstract Syntax Trees versus Parse Trees
Parsing
Take a Theory of Computation course for more language theory. We focus here on the practical aspect of reading and writing BNF grammars.
There are also many advanced parser libraries and generator tools. You might use one in a Compiler Construction course.
An Ambiguous Grammar
\[ \begin{array}{rrrl} \text{expressions} & e & \mathrel{::=}& n \mid e + e \\ \text{numbers} & n \end{array} \]
Recursive-Descent Parsing
Automate the top-down, leftmost parsing derivation.
Ask: What’s that?
And trying productions left-to-right until finding a prefix match.
Recursive-Descent Parsing
Two rules:
- Best to have an unambiguous grammar.
Why? Predictability.
- No left-recursion in the grammar.
Why? Avoid unbounded “lookahead”.
Combinator Parsing
import $ivy.$
What’s a combinator?
A higher-order function/method.
So a combinator parsing library is simply one where you use it by calling methods with your own callback functions
We’ve seen a higher-order function/method before, right? Yes,
1
2
3
Restricting the Concrete Syntax
\[\begin{array}{rrrl} \text{terms} & t, \mathit{term} & \mathrel{::=}& \mathit{num} \mid\texttt{(}\; \mathit{expr} \;\texttt{)} \\ \text{expressions} & e, \mathit{expr} & \mathrel{::=}& \mathit{term} \mathbin{\texttt{+}} \mathit{term} \\ \text{numbers} & n, \mathit{num} \end{array}\]
Not ambiguous. No left recursion.
These are s-expressions!
Let’s Implement a Parser!
defined object ExprParser
defined object AParser
defined object ExprParser
First, \[ \mathit{num} \]
Then, def parse(s: String): Either[String, String]
.
Then, \[ \mathit{term} \mathrel{::=}\mathit{num} \]
res18_0: Either[String, String] = Right(value = "1") res18_1: Either[String, String] = Left( value = "Failure: '(' expected but '<' found" )
Then, def parse(s: String): Either[String, Expr]
.
res19_0: Either[String, Expr] = Right(value = N(n = 1.0)) res19_1: Either[String, Expr] = Left( value = "Failure: '(' expected but '<' found" )
Then, \[ \mathit{term} \mathrel{::=}\texttt{(}\; \mathit{expr} \;\texttt{)} \]
\[ \mathit{expr} \mathrel{::=}\mathit{term} \mathbin{\texttt{+}} \mathit{term} \]
res20_0: Either[String, Expr] = Right( value = Plus(e1 = Plus(e1 = N(n = 1.0), e2 = N(n = 2.0)), e2 = N(n = 3.0)) ) res20_1: Either[String, Expr] = Right( value = Plus(e1 = N(n = 1.0), e2 = Plus(e1 = N(n = 2.0), e2 = N(n = 3.0))) )
res21: Either[String, Expr] = Left(value = "Failure: end of input expected")