To study the different phases of a compiler

Problem Statement: To study how an assignment statement passes through the different phases of a

compiler.

Theory: A compiler can be viewed as a single box that maps a source program into a

semantically equivalent target program. If we open up this box a little, we see that there are two

parts to this mapping: analysis and synthesis.

 The analysis part breaks up the source program into constituent pieces and imposes a

grammatical structure on them. It then uses this structure to create an intermediate

representation of the source program. If the analysis part detects that the source program

is either syntactically ill-formed or semantically unsound, then it must provide

informative messages, so the user can take corrective action. The analysis part also

collects information about the source program and stores it in a data structure called a

symbol table, which is passed along with the intermediate representation to the synthesis

part.

 The synthesis part constructs the desired target program from the intermediate

representation and the information in the symbol table. The analysis part is often called

the front end of the compiler; the synthesis part is the back end.

If we examine the compilation process in more detail, we see that it operates as a sequence of

phases, each of which transforms one representation of the source program to another. A typical

decomposition of a compiler into phases is shown in Fig abc. In practice, several phases may be

grouped together, and the intermediate representations between the grouped phases need not be

constructed explicitly. The symbol table, which stores information about the entire source

program, is used by all phases of the compiler.

1. Lexical Analysis

The first phase of a compiler is called lexical analysis or scanning. The lexical analyzer reads

the stream of characters making up the source program and groups the characters into

meaningful sequences called lexemes. For each lexeme, the lexical analyzer produces as

output a token of the form

(token-name, attribute-value)

that it passes on to the subsequent phase, syntax analysis. In the token, the first component

token-name is an abstract symbol that is used during syntax analysis, and the second

component attribute-value points to an entry in the symbol table for this token. Information

from the symbol-table entry is needed for semantic analysis and code generation.
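
As an illustration, the grouping of characters into lexemes and tokens can be sketched in a few lines of Python. This is a minimal sketch, not a production scanner: the token names, the `tokenize` function, the list-based symbol table, and the sample statement `position = initial + rate * 60` are all assumptions chosen for the example.

```python
import re

# Hypothetical token specification for a tiny expression language.
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("id",     r"[A-Za-z_]\w*"),
    ("op",     r"[=+\-*/()]"),
    ("skip",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source, symbol_table):
    """Return a list of (token-name, attribute-value) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "skip":
            continue                      # white space produces no token
        if kind == "id":
            # The attribute value is the (1-based) position of the
            # identifier's entry in the symbol table.
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            tokens.append(("id", symbol_table.index(lexeme) + 1))
        elif kind == "number":
            tokens.append(("number", int(lexeme)))
        else:
            tokens.append((lexeme, None))  # operators need no attribute
    return tokens

symbol_table = []
tokens = tokenize("position = initial + rate * 60", symbol_table)
for token in tokens:
    print(token)
```

Here each identifier token carries the index of its symbol-table entry, while operator tokens carry no attribute value at all.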

2. Syntax Analysis

The second phase of the compiler is syntax analysis or parsing. The parser uses the first

components of the tokens produced by the lexical analyzer to create a tree-like intermediate

representation that depicts the grammatical structure of the token stream. A typical

representation is a syntax tree in which each interior node represents an operation and the

children of the node represent the arguments of the operation.
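
To make the idea of a syntax tree concrete, here is a small recursive-descent parser in Python. It is only a sketch under stated assumptions: the grammar (assignment, `+ - * /`, parentheses), the function name `parse_assignment`, and the nested-tuple tree shape `(operator, left, right)` are hypothetical, and the tokens are plain strings rather than (token-name, attribute-value) pairs.

```python
def parse_assignment(tokens):
    """Parse `id = expr` and return a nested-tuple syntax tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok.isidentifier() or tok.isdigit():
            return advance()              # leaf: a name or a number
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())   # interior node: operation + arguments
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    if len(tokens) < 3 or not tokens[0].isidentifier() or tokens[1] != "=":
        raise SyntaxError("expected 'id = expression'")
    target = advance()
    advance()                             # consume '='
    tree = ("=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

tree = parse_assignment("position = initial + rate * 60".split())
print(tree)
```

Because `term` is called from `expr`, multiplication binds more tightly than addition, so the `*` node ends up as a child of the `+` node.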

3. Semantic Analysis

The semantic analyzer uses the syntax tree and the information in the symbol table to check

the source program for semantic consistency with the language definition. It also gathers type

information and saves it in either the syntax tree or the symbol table, for subsequent use

during intermediate-code generation.

An important part of semantic analysis is type checking, where the compiler checks that each

operator has matching operands. For example, many programming language definitions

require an array index to be an integer; the compiler must report an error if a floating-point

number is used to index an array.
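
Type checking can be sketched as a walk over the syntax tree. The Python below is an illustrative assumption, not a complete checker: the `DECLARED` table, the `check` function, the `"index"` operator, and the convention of wrapping an integer operand in an `inttofloat` node when it meets a floating-point operand are all hypothetical choices made for this example.

```python
# Hypothetical declarations; a real compiler would read these from its symbol table.
DECLARED = {"position": "float", "initial": "float", "rate": "float",
            "samples": "array of float"}

def check(node):
    """Return (annotated_node, type_name); raise TypeError on a semantic error."""
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, float):
        return node, "float"
    if isinstance(node, str):
        if node not in DECLARED:
            raise TypeError(f"undeclared name {node!r}")
        return node, DECLARED[node]
    op, left, right = node
    left, left_type = check(left)
    right, right_type = check(right)
    if op == "index":
        if left_type != "array of float":
            raise TypeError("only arrays can be indexed")
        if right_type != "int":
            raise TypeError("array index must be an integer")
        return (op, left, right), "float"
    if "array of float" in (left_type, right_type):
        raise TypeError(f"operator {op!r} does not accept a whole array")
    if left_type == "float" and right_type == "int":
        right = ("inttofloat", right)     # widen the integer operand
    elif left_type == "int" and right_type == "float":
        if op == "=":
            raise TypeError("cannot assign a float to an int variable")
        left = ("inttofloat", left)
    result_type = "float" if "float" in (left_type, right_type) else "int"
    return (op, left, right), result_type

typed, result_type = check(("=", "position", ("+", "initial", ("*", "rate", 60))))
print(typed)

try:
    check(("index", "samples", 1.5))
    error = None
except TypeError as exc:
    error = str(exc)
print(error)
```

The first call succeeds and annotates the tree with a conversion; the second is rejected because a floating-point number is used as an array index.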

4. Intermediate Code Generation

In the process of translating a source program into target code, a compiler may construct one

or more intermediate representations, which can have a variety of forms. Syntax trees are a

form of intermediate representation; they are commonly used during syntax and semantic

analysis. After syntax and semantic analysis of the source program, many compilers generate

an explicit low-level or machine-like intermediate representation, which we can think of as a

program for an abstract machine. This intermediate representation should have two important

properties: it should be easy to produce and it should be easy to translate into the target

machine.
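
One common machine-like intermediate form is three-address code, in which every instruction has at most one operator on its right-hand side. The sketch below flattens a nested-tuple syntax tree into such instructions. The tree shape, the `inttofloat` conversion node, the names `id1` to `id3`, and the temporaries `t1`, `t2`, ... are assumptions made for this example.

```python
from itertools import count

def gen_three_address(tree):
    """Flatten a nested-tuple syntax tree into three-address instructions."""
    code = []
    temps = count(1)

    def emit(node):
        # Leaves (names and constants) are already valid operands.
        if not isinstance(node, tuple):
            return str(node)
        if node[0] == "inttofloat":
            arg = emit(node[1])
            temp = f"t{next(temps)}"
            code.append(f"{temp} = inttofloat({arg})")
            return temp
        op, left, right = node
        if op == "=":
            value = emit(right)
            code.append(f"{left} = {value}")
            return left
        left_operand = emit(left)
        right_operand = emit(right)
        temp = f"t{next(temps)}"          # fresh temporary for this result
        code.append(f"{temp} = {left_operand} {op} {right_operand}")
        return temp

    emit(tree)
    return code

tree = ("=", "id1", ("+", "id2", ("*", "id3", ("inttofloat", 60))))
code = gen_three_address(tree)
print("\n".join(code))
```

The result is easy to produce (one post-order walk of the tree) and easy to translate, since each line corresponds closely to a single machine operation.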

5. Code Optimization

The machine-independent code-optimization phase attempts to improve the intermediate

code so that better target code will result. Usually better means faster, but other objectives

may be desired, such as shorter code, or target code that consumes less power. A simple

intermediate code generation algorithm followed by code optimization is a reasonable way to

generate good target code. For example, if the intermediate code for an assignment such as

position = initial + rate * 60 converts the integer 60 to floating point with an inttofloat

operation, the optimizer can deduce that this conversion can be done once and for all at compile

time, so the inttofloat operation can be eliminated by replacing the integer 60 by the

floating-point number 60.0.
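
The elimination of inttofloat described above can be sketched as a single pass over the intermediate code. This is a simplified illustration, assuming instructions are `(dest, op, arg1, arg2)` tuples, that temporaries are named `t1`, `t2`, ..., and that each temporary is used exactly once; the `optimize` function and the `"copy"` operator name are hypothetical.

```python
def optimize(code):
    """code: list of (dest, op, arg1, arg2) tuples; returns an improved list."""
    constants = {}   # temporaries known to hold a compile-time constant
    out = []
    for dest, op, a, b in code:
        # Substitute operands whose value is already known.
        a = constants.get(a, a)
        b = constants.get(b, b)
        if op == "inttofloat" and isinstance(a, int):
            constants[dest] = float(a)    # do the conversion now; emit nothing
            continue
        copies_fresh_temp = (
            op == "copy" and isinstance(a, str) and a.startswith("t")
            and out and out[-1][0] == a
        )
        if copies_fresh_temp:
            # The temporary was computed by the previous instruction and
            # (by assumption) is not used again: store straight into dest.
            _, prev_op, prev_a, prev_b = out.pop()
            out.append((dest, prev_op, prev_a, prev_b))
            continue
        out.append((dest, op, a, b))
    return out

before = [
    ("t1", "inttofloat", 60, None),
    ("t2", "*", "id3", "t1"),
    ("t3", "+", "id2", "t2"),
    ("id1", "copy", "t3", None),
]
after = optimize(before)
for instruction in after:
    print(instruction)
```

Four instructions shrink to two: the conversion is folded into the constant 60.0, and the final copy is merged into the addition. The sketch does not renumber the surviving temporary.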

6. Code Generation

The code generator takes as input an intermediate representation of the source program and

maps it into the target language. If the target language is machine code, registers or memory

locations are selected for each of the variables used by the program. Then, the intermediate

instructions are translated into sequences of machine instructions that perform the same task.

A crucial aspect of code generation is the judicious assignment of registers to hold variables.
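
The following Python sketch shows the flavour of this translation for floating-point arithmetic. It is deliberately naive and rests on several assumptions: the instruction names (`LDF`, `STF`, `ADDF`, `MULF`, ...) belong to a hypothetical register machine, the input is a list of `(dest, op, arg1, arg2)` tuples, temporaries are named `t1`, `t2`, ..., and there are always enough registers, so no spilling or register reuse is attempted.

```python
from itertools import count

# Hypothetical floating-point opcodes of an imaginary target machine.
OPCODES = {"+": "ADDF", "-": "SUBF", "*": "MULF", "/": "DIVF"}

def is_temp(name):
    """Assumed naming convention: temporaries look like t1, t2, ..."""
    return name[0] == "t" and name[1:].isdigit()

def generate(code):
    """code: list of (dest, op, arg1, arg2); returns assembly-like lines."""
    asm = []
    regs = count(1)
    location = {}    # temporary name -> register currently holding it

    def operand(x):
        if isinstance(x, float):
            return f"#{x}"                # immediate constant
        if x in location:
            return location[x]            # temporary already in a register
        reg = f"R{next(regs)}"
        asm.append(f"LDF {reg}, {x}")     # load a variable from memory
        return reg

    for dest, op, a, b in code:
        ra = operand(a)
        rb = operand(b)
        # Reuse the first operand's register for the result when possible.
        target = ra if ra.startswith("R") else f"R{next(regs)}"
        asm.append(f"{OPCODES[op]} {target}, {ra}, {rb}")
        if is_temp(dest):
            location[dest] = target       # keep temporaries in registers
        else:
            asm.append(f"STF {dest}, {target}")
    return asm

asm = generate([("t1", "*", "id3", 60.0), ("id1", "+", "id2", "t1")])
print("\n".join(asm))
```

Even in this tiny case the register choice matters: the temporary `t1` never touches memory because it stays in the register that computed it.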

7. Symbol-Table Management

An essential function of a compiler is to record the variable names used in the source

program and collect information about various attributes of each name. These attributes may

provide information about the storage allocated for a name, its type, its scope (where in the

program its value may be used), and in the case of procedure names, such things as the

number and types of its arguments, the method of passing each argument (for example, by

value or by reference), and the type returned.

The symbol table is a data structure containing a record for each variable name, with fields

for the attributes of the name. The data structure should be designed to allow the compiler to

find the record for each name quickly and to store or retrieve data from that record quickly.
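
A hash table per scope is one common way to meet these requirements. The sketch below is an assumption-laden illustration rather than a prescribed design: the `Entry` fields, the `SymbolTable` class, and its `declare` and `lookup` methods are hypothetical names. Python dictionaries give average constant-time access to a record, and a link to the enclosing scope models where a name is visible.

```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    """One record of the table: a name and its attributes."""
    name: str
    type_name: str
    offset: int = 0                                  # storage allocated for the name
    params: list = field(default_factory=list)       # for procedures: (type, passing mode)

class SymbolTable:
    def __init__(self, parent=None):
        self.entries = {}        # name -> Entry, hashed for fast lookup
        self.parent = parent     # enclosing scope, or None for the outermost

    def declare(self, name, type_name, **attributes):
        if name in self.entries:
            raise KeyError(f"{name!r} is already declared in this scope")
        self.entries[name] = Entry(name, type_name, **attributes)
        return self.entries[name]

    def lookup(self, name):
        """Search this scope, then each enclosing scope in turn."""
        scope = self
        while scope is not None:
            if name in scope.entries:
                return scope.entries[name]
            scope = scope.parent
        return None

global_scope = SymbolTable()
global_scope.declare("rate", "float", offset=8)
global_scope.declare("area", "float",
                     params=[("float", "by value"), ("float", "by reference")])

local_scope = SymbolTable(parent=global_scope)
local_scope.declare("i", "int")

print(local_scope.lookup("rate"))
print(global_scope.lookup("i"))
```

A name declared in the inner scope is invisible from the outer one, while the inner scope can still see every name declared in the scopes that enclose it.
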
Conclusion : _________________________________________________________

Frequently asked questions:

1. What are compilers?

2. What is a language processor?

3. What is an interpreter? How does it differ from a compiler?

4. What are the different phases of a compiler?
