COMS W4115
Programming Languages and Translators
Lecture 11: Syntax-Directed Translation
February 25, 2015
Lecture Outline
- Syntax-directed definitions and translation schemes
- Synthesized and inherited attributes
- S-attributed SDDs
- L-attributed SDDs
1. Syntax-Directed Definitions and Translation Schemes
- The syntax analyzer translates its input token stream into an intermediate
language representation of the source program, usually an abstract
syntax tree (AST).
- A syntax-directed definition can be used to specify this translation.
- A syntax-directed definition (SDD) is a context-free grammar with attributes
attached to grammar symbols and semantic rules
attached to the productions.
- The semantic rules define values for attributes associated with the
symbols of the productions. These values can be computed by
creating a parse tree for the input and then making a sequence of
passes over the parse tree, evaluating some or all of the rules on each
pass. SDDs are a very useful conceptual tool in the specification of translations.
- A syntax-directed translation scheme (SDTS) is a context-free grammar with program
fragments, called semantic actions, embedded within production bodies.
SDTSs are useful for implementing syntax-directed definitions.
Think of a Yacc specification as an SDTS.
2. Synthesized and Inherited Attributes
- Attributes are values computed at the nodes of a parse tree.
- Synthesized attributes
are values that are computed at a node N in a parse tree from attribute
values of the children of N and perhaps N itself.
Synthesized attributes can be
easily computed by a shift-reduce parser that keeps the
values of the attributes on the parsing stack.
See Sect. 5.4.2 of ALSU.
- An SDD is S-attributed if every attribute is synthesized.
S-attributed SDDs are useful for bottom-up parsing.
- Inherited attributes
are values that are computed at a node N in a parse tree from attribute
values of the parent of N, the siblings of N, and N itself.
- An SDD is L-attributed is every attribute is either synthesized or
inherited from the parent or from the left.
L-attributed SDDs are useful for top-down parsing.
See Sect. 5.2.4 of ALSU for details.
3. Examples of S-Attributed SDDs
- Example 1. Here is an S-attributed SDD translating signed bit strings
into decimal numbers. The attributes,
BNum.val
,
Sign.val
, List.val
, and
Bit.val
, are all synthesized attributes that
represent integers.
BNum → Sign List { BNum.val = Sign.val × List.val }
Sign → + { Sign.val = +1 }
Sign → - { Sign.val = -1 }
List → List1 Bit { List.val = 2 × List1.val + Bit.val }
List → Bit { List.val = Bit.val }
Bit → 0 { Bit.val = 0 }
Bit → 1 { Bit.val = 1 }
Example 2. Here are Yacc translation rules implementing the SDD above
for translating signed bit strings
into decimal numbers. The identifiers $$, $1, $2 and so on in Yacc actions
are synthesized attributes.
BNum : Sign List { $$ = $1 * $2; }
;
Sign : '+' { $$ = +1; }
| '-' { $$ = -1; }
;
List : List Bit { $$ = 2*$1 + $2; }
| Bit
;
Bit : '0' { $$ = 0; }
| '1' { $$ = 1; }
;
Example 3. Here is an S-attributed SDD based on an SLR(1) grammar that
translates arithmetic expressions
into ASTs.
E
has the synthesized attributed E.node
and
T
the synthesized attribute T.node
.
E.node
and T.node
point to a node in the AST.
The function Node(op, left, right)
returns a pointer to a node with three fields:
the first labeled op
, the second a pointer
to a left subtree, and the third a pointer to a right subtree.
The function Leaf(id, value)
returns a pointer to a node
with two fields: the first labeled id
, the second the
value of the token. See Example 5.11 in ALSU.
E → E1 + T { E.node = Node('+', E1.node, T.node); }
E → T { E.node = T.node; }
T → ( E ) { T.node = E.node; }
T → id { T.node = Leaf(id, id.entry); }
4. Example of an L-Attributed SDD
- Example 4. Here is an L-attributed SDD based on an LL(1) grammar for translating arithmetic
expressions into ASTs. See Example 5.12 in ALSU.
E → T A { E.node = A.s;
A.i = T.node; }
A → + T A1 { A1.i = Node('+', A.i, T.node);
A.s = A1.s; }
A → ε { A.s = A.i; }
T → ( E ) { T.node = E.node; }
T → id { T.node = Leaf(id, id.entry); }
5. Practice Problems
- Using Yacc, implement a syntax-directed translator that
translates sequences of postfix Polish expressions into infix notation.
For example, your translator should map
345+*
into 3*(4+5)
.
- Optimize your translator so it doesn't generate any redundant parentheses.
For example, your translator should still map
345+*
into 3*(4+5)
but it should map 345*+
into 3+4*5
.
6. Reading
aho@cs.columbia.edu