COMS W4115
Programming Languages and Translators
Lecture 16: Translating Statements into Three-Address Code
March 25, 2015

Lecture Outline

Types
Three-address code
Translation of assignments
Arrays
Boolean expressions
If-statements
While-statements

1. Types

Type inference rules
Type conversions
For details see notes for Lecture 15, 3/23/2015.

2. Three-Address Code

Three-address code is a common intermediate representation generated by the front end of a compiler. It consists of instructions with a variety of simple forms:

Assignment instructions of the form x = y op z, x = op y, or x = y where x, y, and z are names or compiler-generated temporaries. y and z can also be constants.
Jump instructions of the form goto L, if x goto L, ifFalse x goto L, or if x relop y goto L where L is the label of some three-address instruction and relop is a relational operator such as <, <=, ==, and so on.
Parameter passing, procedure calling, and procedure returning instructions of the form param x; call p,n; and return y. Here p is the name of the procedure and n is the number of parameters it takes.
Indexed copy instructions of the form x = y[i] and x[i] = y.
Address and pointer assignments of the form x = &y, x = *y, and *x = y.
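As an illustration of the instruction forms listed above, here is a minimal Python sketch that stores each instruction as an operator plus up to three addresses. The class name Instr, its field names, and the op spellings "copy" and "if<" are assumptions made for this sketch; they are not notation from the lecture or from ALSU.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Instr:
    """One three-address instruction: an operator plus up to three addresses."""
    op: str                        # e.g. "+", "uminus", "copy", "goto", "if<"
    arg1: Optional[str] = None
    arg2: Optional[str] = None
    result: Optional[str] = None   # destination name, or target label for jumps

    def __str__(self) -> str:
        if self.op == "copy":                      # x = y
            return f"{self.result} = {self.arg1}"
        if self.op == "goto":                      # goto L
            return f"goto {self.result}"
        if self.op in ("if", "ifFalse"):           # if x goto L / ifFalse x goto L
            return f"{self.op} {self.arg1} goto {self.result}"
        if self.op.startswith("if"):               # if x relop y goto L
            return f"if {self.arg1} {self.op[2:]} {self.arg2} goto {self.result}"
        if self.arg2 is None:                      # x = op y
            return f"{self.result} = {self.op} {self.arg1}"
        return f"{self.result} = {self.arg1} {self.op} {self.arg2}"  # x = y op z


code = [
    Instr("uminus", "c", None, "t1"),
    Instr("+", "b", "t1", "t2"),
    Instr("copy", "t2", None, "a"),
    Instr("if<", "a", "100", "L1"),
]
lines = [str(i) for i in code]
```

Keeping the operator and addresses in separate fields, rather than as one string, makes later passes (for example, rewriting jump targets) easier to write.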

Static single-assignment (SSA) form is a variant of three-address code in which assignments are to variables with distinct names. It also uses a special function, called a φ-function, to combine two definitions of the same variable arising from two different control-flow paths. See ALSU, Section 6.2.4, pp. 369-370.
In this lecture, we will show how to translate common programming language statements into three-address code using syntax-directed translation. In practice, these kinds of translations are often produced by making traversals over the AST.

3. Translation of Assignments

The assignment statement a = b + -c; might be translated into the following sequence of three-address instructions:


  t1 = uminus c
  t2 = b + t1
  a = t2

Here is an SDTS that generates this kind of three-address code on the fly for assignments:


S → id = E ; { gen(top.get(id.lexeme) '=' E.addr); }
E → E₁ + E₂  { E.addr = new Temp();
               gen(E.addr '=' E₁.addr '+' E₂.addr); }
  | - E₁     { E.addr = new Temp();
               gen(E.addr '=' 'uminus' E₁.addr); }
  | ( E₁ )   { E.addr = E₁.addr; }
  | id       { E.addr = top.get(id.lexeme); }

The semantic actions use the attribute E.addr for the address of the location where the value of E is stored and the function top.get(id.lexeme) to retrieve the location for id.lexeme from the symbol table in its current scope. The function gen generates and outputs a three-address instruction.
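The semantic actions above can be sketched in Python as a recursive walk over an expression tree. This is a simplified sketch under stated assumptions: the tuple encoding of the tree and the names translate_assignment, new_temp, and addr are hypothetical, and plain variable names stand in for the symbol-table lookup top.get(id.lexeme). Parentheses need no case of their own because they disappear when the tree is built.

```python
import itertools


def translate_assignment(target, expr):
    """Return three-address code for `target = expr`.

    expr is a nested tuple: ("id", name), ("neg", e), or ("add", e1, e2).
    """
    code = []
    temps = itertools.count(1)

    def new_temp():
        return f"t{next(temps)}"

    def gen(line):
        code.append(line)

    def addr(e):
        """Emit code for e and return the address holding its value (E.addr)."""
        kind = e[0]
        if kind == "id":                 # E -> id
            return e[1]
        if kind == "neg":                # E -> - E1
            a1 = addr(e[1])
            t = new_temp()
            gen(f"{t} = uminus {a1}")
            return t
        if kind == "add":                # E -> E1 + E2
            a1 = addr(e[1])
            a2 = addr(e[2])
            t = new_temp()
            gen(f"{t} = {a1} + {a2}")
            return t
        raise ValueError(f"unknown node kind {kind!r}")

    gen(f"{target} = {addr(expr)}")      # S -> id = E ;
    return code


# a = b + -c;
result = translate_assignment("a", ("add", ("id", "b"), ("neg", ("id", "c"))))
```

For the statement a = b + -c; the sketch produces the same three instructions shown at the start of this section.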

4. Translation of Arrays

Referencing a one-dimensional array

In C and Java, array elements are numbered 0, 1,..., n-1 for an array A with n elements.
Element A[i] begins in location (base + i × w) where base is the relative address of the storage allocated for A and w is the width of each element.

Common layouts for multidimensional arrays

Row-major order
Column-major order
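The two layouts differ only in which index is multiplied by the length of a full row or column. The following Python sketch computes the byte offset of A[i][j] relative to base under each layout; the function names and the 2 × 3 array of 4-byte integers are assumptions chosen for illustration.

```python
def row_major_offset(i, j, n_cols, width):
    """Byte offset of A[i][j] when rows are stored one after another."""
    return (i * n_cols + j) * width


def column_major_offset(i, j, n_rows, width):
    """Byte offset of A[i][j] when columns are stored one after another."""
    return (j * n_rows + i) * width


# Offsets for every element of a 2 x 3 array of 4-byte integers.
rm = [[row_major_offset(i, j, 3, 4) for j in range(3)] for i in range(2)]
cm = [[column_major_offset(i, j, 2, 4) for j in range(3)] for i in range(2)]
```

In row-major order the elements of one row are adjacent in memory; in column-major order the elements of one column are adjacent.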

See Fig. 6.22 (p. 383) for an SDD generating three-address code for assignments with array references.
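Example: three-address code for the expression c + a[i][j], assuming a is stored in row-major order, each row of a holds three integers, and the width of an integer is 4 (so one row occupies 12 bytes):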


  t1 = i * 12
  t2 = j * 4
  t3 = t1 + t2
  t4 = a[t3]
  t5 = c + t4
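The instruction sequence above can be reproduced with a small generator. This is a rough sketch, not the SDD of Fig. 6.22: the function array_ref_code and its parameters are hypothetical, and the declared dimensions [2, 3] are an assumption consistent with a row size of 12 bytes.

```python
import itertools


def array_ref_code(array, indices, dims, width, temps):
    """Emit code that loads array[indices...] (row-major) into a new temporary.

    dims are the declared sizes of each dimension; width is the element size.
    Returns (list of instructions, name of the temporary holding the element).
    """
    code = []
    offset = None
    for k, index in enumerate(indices):
        # Bytes spanned by one step of index k: the product of the sizes of
        # all later dimensions times the element width.
        stride = width
        for d in dims[k + 1:]:
            stride *= d
        t = f"t{next(temps)}"
        code.append(f"{t} = {index} * {stride}")
        if offset is None:
            offset = t
        else:
            s = f"t{next(temps)}"
            code.append(f"{s} = {offset} + {t}")
            offset = s
    elem = f"t{next(temps)}"
    code.append(f"{elem} = {array}[{offset}]")
    return code, elem


temps = itertools.count(1)
code, elem = array_ref_code("a", ["i", "j"], dims=[2, 3], width=4, temps=temps)
code.append(f"t{next(temps)} = c + {elem}")   # c + a[i][j]
```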

5. Translation of Boolean Expressions

Boolean expressions are composed of boolean operators (&&, ||, !) applied to boolean variables, relational expressions, and other boolean expressions.
Short-circuit evaluation: Some languages, such as C and Java, do not require an entire boolean expression to be evaluated.

Given x && y, if x is false, then we can conclude the entire expression is false without evaluating y.
Given x || y, if x is true, then we can conclude the entire expression is true without evaluating y.
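Short-circuit evaluation can be observed directly in any language that provides it. The sketch below uses Python, whose `and` and `or` operators short-circuit in the same way as && and || in C and Java; the helper probe is a hypothetical function that records which operands were actually evaluated.

```python
calls = []


def probe(name, value):
    """Record that an operand was evaluated, then return its value."""
    calls.append(name)
    return value


r1 = probe("x", False) and probe("y", True)   # x is false, so y is skipped
after_and = list(calls)

calls.clear()
r2 = probe("x", True) or probe("y", False)    # x is true, so y is skipped
after_or = list(calls)
```

In both cases only x is recorded, which is why a translator can implement these operators with jumps rather than by computing both operands.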

Numerical encoding

In C, the numerical value 0 represents false; a nonzero value represents true.

Positional encoding

The value of a boolean expression can be represented by a position in three-address code, and the boolean operators can be translated into jumps.
The statement


       if (x < 100 || x > 200 && x != y)
         x = 0;

can be translated into the following three-address instructions:


           if x < 100 goto L2
           ifFalse x > 200 goto L1
           ifFalse x != y goto L1
       L2: x = 0
       L1:
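To check that the jumping code above behaves like the source statement, the following Python sketch simulates it with an explicit program counter and compares the result with direct evaluation. The functions run and direct and the numbering of instructions from 0 are assumptions of this sketch. Python's `and` binds more tightly than `or`, matching the precedence of && over || in C.

```python
def run(x, y):
    """Simulate the jumping code; return the final value of x."""
    pc = 0
    while True:
        if pc == 0:                      # if x < 100 goto L2
            pc = 3 if x < 100 else 1
        elif pc == 1:                    # ifFalse x > 200 goto L1
            pc = 4 if not (x > 200) else 2
        elif pc == 2:                    # ifFalse x != y goto L1
            pc = 4 if not (x != y) else 3
        elif pc == 3:                    # L2: x = 0
            x = 0
            pc = 4
        else:                            # L1: end of the statement
            return x


def direct(x, y):
    """Evaluate the source statement directly."""
    if x < 100 or x > 200 and x != y:
        x = 0
    return x


samples = [(x, y) for x in (50, 100, 150, 200, 250) for y in (50, 250)]
agree = all(run(x, y) == direct(x, y) for x, y in samples)
```

Note that no instruction ever stores a boolean value: the outcome of the condition is represented only by whether control reaches L2 or L1.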

6. Translation of If-statements

Boolean expressions often appear in the context of flow-of-control statements such as:

If statements
If-else statements

See Figs. 6.36 (p. 402) and 6.37 for SDDs translating these statements with booleans into three-address code.
For the statement


       if (x < 100 || x > 200 && x != y)
         x = 0;

these SDDs produce the following three-address instructions:


           if x < 100 goto L2
           goto L3
       L3: if x > 200 goto L4
           goto L1
       L4: if x != y goto L2
           goto L1
       L2: x = 0
       L1:

This code can be transformed into the code in Section 5 by eliminating the redundant goto L3 and inverting the tests in the second and third if-instructions, so that control falls through to the next instruction when a test succeeds.
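The unoptimized translation above can be reproduced with a short Python sketch of the jumping-code rules, in which the labels B.true and B.false are passed down as inherited attributes. The tuple encoding of boolean expressions and the names bool_code and if_code are assumptions of this sketch; labels are emitted on lines of their own rather than attached to the following instruction.

```python
import itertools

labels = itertools.count(1)


def newlabel():
    return f"L{next(labels)}"


def bool_code(b, true, false):
    """Jumping code for b: control goes to `true` or `false`."""
    kind = b[0]
    if kind == "rel":                    # B -> E1 relop E2
        _, left, op, right = b
        return [f"if {left} {op} {right} goto {true}", f"goto {false}"]
    if kind == "or":                     # B -> B1 || B2
        b1_false = newlabel()
        return (bool_code(b[1], true, b1_false)
                + [f"{b1_false}:"]
                + bool_code(b[2], true, false))
    if kind == "and":                    # B -> B1 && B2
        b1_true = newlabel()
        return (bool_code(b[1], b1_true, false)
                + [f"{b1_true}:"]
                + bool_code(b[2], true, false))
    if kind == "not":                    # B -> ! B1 (swap the targets)
        return bool_code(b[1], false, true)
    raise ValueError(f"unknown node kind {kind!r}")


def if_code(b, body):
    """Code for a whole program consisting of: if ( b ) body."""
    s_next = newlabel()                  # label following the statement
    b_true = newlabel()                  # label of the first body instruction
    return (bool_code(b, b_true, s_next)
            + [f"{b_true}:"] + body + [f"{s_next}:"])


cond = ("or", ("rel", "x", "<", "100"),
              ("and", ("rel", "x", ">", "200"), ("rel", "x", "!=", "y")))
code = if_code(cond, ["x = 0"])
```

Each relational test emits both a conditional and an unconditional jump, which is where the redundant goto comes from.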

7. Translation of While-statements

Consider the production S → while ( B ) S1 for while-statements. The shape of the code for implementing this production can take the form:


   begin: // beginning of code for S
      code to evaluate B
      if B is true goto B.true
      if B is false goto B.false
   B.true:
      code to evaluate S1
      goto begin
   B.false:  // this is where control flow will go after executing S

Here is an SDD for this translation (from Fig. 6.36, p. 402):


   S → while ( B ) S1 {
            begin = newlabel()
            B.true = newlabel()
            B.false = S.next
            S1.next = begin
            S.code = label(begin) || B.code ||
                     label(B.true) || S1.code ||
                     gen('goto' begin)
   }
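The while-statement rule can be sketched in Python in the same style. This is a minimal sketch under stated assumptions: statements and conditions are encoded as tuples, only a single relational test is supported as the condition B, and the names stmt_code, rel_code, and program_code are hypothetical.

```python
import itertools

labels = itertools.count(1)


def newlabel():
    return f"L{next(labels)}"


def rel_code(b, true, false):
    """Jumping code for a single relational test (left, op, right)."""
    left, op, right = b
    return [f"if {left} {op} {right} goto {true}", f"goto {false}"]


def stmt_code(s, next_label):
    """Code for statement s; next_label plays the role of S.next."""
    kind = s[0]
    if kind == "assign":                 # ("assign", "x = y")
        return [s[1]]
    if kind == "while":                  # ("while", B, S1)
        _, b, body = s
        begin = newlabel()
        b_true = newlabel()
        return ([f"{begin}:"]
                + rel_code(b, b_true, next_label)   # B.false = S.next
                + [f"{b_true}:"]
                + stmt_code(body, begin)            # S1.next = begin
                + [f"goto {begin}"])
    raise ValueError(f"unknown statement kind {kind!r}")


def program_code(s):
    """Code for a whole program: the statement followed by its exit label."""
    s_next = newlabel()
    return stmt_code(s, s_next) + [f"{s_next}:"]


code = program_code(("while", ("i", "<", "n"), ("assign", "i = i + 1")))
```

The label for S.next is placed by the enclosing construct rather than by the while-statement itself, which is why program_code emits it after the loop.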

8. Practice Problems

Use the SDD of Fig. 6.22 (ALSU, p. 383) to translate the assignment x = a[i][j] + b[i][j].
Add rules to the SDD in Fig. 6.36 (ALSU, p. 402) to translate do-while statements of the form:

S → do S while B

Show the code your SDD would generate for the program


     do
       do
         assign1
       while a < b
     while c < d

9. Reading

ALSU, Sections 6.4 - 6.8

aho@cs.columbia.edu
