Final Compiler

The document discusses compilers and their key components and functions. A compiler is a program that translates a program written in one language (the source language) into an equivalent program in another language (the target language). The target language is typically machine code that a computer's processor understands. A compiler performs analysis of the source code, generates intermediate representations, performs optimizations, and generates the target code. It also checks for errors in the source program. Other tools like assemblers, linkers, loaders, and interpreters are involved in the translation process from source to executable code.

Uploaded by

usf94598
Copyright
© All Rights Reserved

program that can read a program in one language, “the source
language”, and translate it into an equivalent program in another
language, “the target language”.

compiler

• Usually, the source language is a high-level language like Java,
C++, or Fortran, whereas the target language is machine code,
that is, code that a computer's processor understands.
• An important role of the compiler is to report any errors in the
source program that it detects during the translation process.
• If the target program is an executable machine language
program, it can then be called by the user to process inputs
and produce outputs.
• An interpreter is another common kind of language processor.

appears to directly execute the operations specified in the source
program on inputs supplied by the user.

interpreter

• The machine-language target program produced by a compiler is
usually much faster than an interpreter at mapping inputs to
outputs.
• An interpreter can usually give better error diagnostics than a
compiler, because it executes the source program statement by
statement.
The task of collecting the source program is sometimes entrusted
to a separate

preprocessor

• A source program may be divided into modules stored in
separate files.
• The compiler may produce an assembly-language program
as its output, because assembly language is easier to
produce as output and is easier to debug.

produces relocatable machine code as its output.

Assembler

• The assembly language is then processed by a program
called an assembler.
• Large programs are often compiled in pieces, so the
relocatable machine code may have to be linked together
with other relocatable object files and library files into the
code that actually runs on the machine.

resolves external memory addresses, where the code in one file
may refer to a location in another file.

Linker

puts together all executable object files into memory for execution

loader
• So far, we've thought of a compiler as a single box that maps a
source program into a semantically equivalent target program.
• If we open this box, we can see that this mapping is divided
into two parts: analysis and synthesis.

breaks up the source program into constituent pieces and imposes
a grammatical structure on them. It then uses this structure to
create an intermediate representation of the source program.

Analysis part

collects information about the source program and stores it in a
data structure called a symbol table, which is passed along with
the intermediate representation to the synthesis part.

Analysis part

constructs the desired target program from the intermediate
representation and the information in the symbol table.

synthesis part

• The analysis part is often called the front end of the compiler; the synthesis part is the back end.

stores information about the entire source program and is used by
all phases of the compiler. It maps variables to attributes, i.e. type,
name, dimension, address, etc.

symbol table
reads the stream of characters making up the source program and
groups the characters into meaningful sequences called lexemes

lexical analyzer

• For each lexeme, it produces as output a token of the form:
<token-name, attribute-value>

is the abstract symbol used in the syntax analysis

token-name

points to an entry in the symbol table containing information for
the semantic analysis and code generation.

attribute-value
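
The token form above can be illustrated with a minimal lexical-analyzer sketch. The token classes in `TOKEN_SPEC`, the function name `tokenize`, and the choice of a plain dictionary as the symbol table are assumptions made for this example, not part of the notes.

```python
import re

# Hypothetical token classes for a tiny language (an assumption for this sketch).
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("id",     r"[A-Za-z_]\w*"),
    ("op",     r"[=+\-*/()]"),
    ("skip",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source, symbol_table):
    """Group characters into lexemes and return (token-name, attribute-value) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "skip":
            continue  # whitespace separates lexemes but produces no token
        if kind == "id":
            # The attribute value is the index of the identifier's symbol-table entry.
            if lexeme not in symbol_table:
                symbol_table[lexeme] = len(symbol_table) + 1
            tokens.append(("id", symbol_table[lexeme]))
        elif kind == "number":
            tokens.append(("number", int(lexeme)))
        else:
            # Operators need no attribute value; the lexeme itself names the token.
            tokens.append((lexeme, None))
    return tokens


symbols = {}
print(tokenize("position = initial + rate * 60", symbols))
```

Each identifier is entered into the symbol table once, so repeated uses of the same name yield tokens with the same attribute value.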

creates a tree-like intermediate representation that depicts the
grammatical structure of the token stream.

syntax analysis or parsing

in which each interior node represents an operation and the
children of the node represent the arguments of the operation

syntax tree
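
As a rough illustration of how a parser builds a syntax tree, the sketch below uses recursive descent over a small expression grammar. The grammar, the function name `parse`, and the nested-tuple tree shape `(operator, left, right)` are assumptions chosen for this example.

```python
import re


def parse(source):
    """Parse expressions over +, * and parentheses into a nested-tuple syntax tree."""
    # Simplification for this sketch: characters outside the pattern are ignored.
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|[+*()]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():      # expr -> term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():      # term -> factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def factor():    # factor -> NUMBER | ID | '(' expr ')'
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        advance()
        if tok == "(":
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok in ("+", "*", ")"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok   # leaf: an identifier or a number lexeme

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse("initial + rate * 60"))
```

The interior nodes are the operators and the leaves are their arguments; because `term` is called from `expr`, multiplication binds tighter than addition.
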
uses the syntax tree and the symbol table to check the source
program for semantic consistency with the language definition.

semantic analyzer

gathers type information and saves it in either the syntax tree or
the symbol table

semantic analyzer

• An important part of semantic analysis is type checking,
where the compiler checks that each operator has matching
operands.
• The language specification may permit some type
conversions called coercions.
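
Type checking with a coercion can be sketched as follows. The example assumes only two types, `int` and `float`, a nested-tuple syntax tree of the form `(operator, left, right)`, and a hypothetical conversion node named `inttofloat`; all of these are illustrative choices.

```python
def check(node, symbol_types):
    """Return (typed_tree, type) for a syntax tree, inserting inttofloat where needed."""
    if isinstance(node, tuple):
        op, left, right = node
        left, left_type = check(left, symbol_types)
        right, right_type = check(right, symbol_types)
        if left_type == right_type:
            return (op, left, right), left_type
        # Coercion: the types differ, so widen the int operand to float.
        if left_type == "int":
            left = ("inttofloat", left)
        else:
            right = ("inttofloat", right)
        return (op, left, right), "float"
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, float):
        return node, "float"
    if node not in symbol_types:
        raise TypeError(f"undeclared identifier {node!r}")
    return node, symbol_types[node]   # identifier: type comes from the symbol table


tree, result_type = check(("*", "rate", 60), {"rate": "float"})
print(tree, result_type)
```

Here the integer constant 60 is wrapped in a conversion node because the other operand of `*` is a float.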

compilers generate an explicit low-level or machine-like
intermediate representation, which we can think of as a program
for an abstract machine.

Intermediate Code Generation

a form that can be readily executed by a machine

generated intermediate code


• An intermediate representation should have two important
properties:
o It should be easy to produce.
o It should be easy to translate into the target machine.
• One such intermediate form is called three-address code.

consists of a sequence of assembly-like instructions with three
operands per instruction. Each operand can act like a register.

Intermediate Code Generation
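
A minimal sketch of generating three-address code from a syntax tree is shown below. It assumes the tree is a nested tuple `(operator, left, right)` and that temporaries are named `t1`, `t2`, and so on; both conventions are assumptions of this example.

```python
from itertools import count


def to_three_address(tree):
    """Flatten a nested-tuple syntax tree into three-address instructions."""
    code = []
    temps = count(1)

    def emit(node):
        if not isinstance(node, tuple):
            return str(node)            # leaf: a name or constant is its own address
        op, left, right = node
        left_addr = emit(left)          # operands are evaluated before the operator
        right_addr = emit(right)
        temp = f"t{next(temps)}"        # compiler-generated temporary name
        code.append(f"{temp} = {left_addr} {op} {right_addr}")
        return temp

    result = emit(tree)
    return code, result


code, result = to_three_address(("+", "initial", ("*", "rate", 60)))
print("\n".join(code))
```

Each instruction has at most one operator on the right-hand side, and the order of the instructions fixes the order in which the operations are performed.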

attempts to improve the intermediate code so that better target
code will result.

machine-independent code-optimization

removes unnecessary code lines and arranges the sequence of
statements to speed up the execution of the program without
wasting resources.

code optimizer
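
Two simple machine-independent optimizations, constant folding and removal of unneeded instructions, can be sketched on a straight-line block of three-address code. The instruction format `(dest, op, arg1, arg2)`, the `"="` copy operator, and the `live_out` parameter are assumptions of this sketch; real optimizers rely on much more general data-flow analysis.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def optimize(code, live_out):
    """code: list of (dest, op, arg1, arg2); live_out: names needed after the block."""
    # Forward pass: fold operations on constants and propagate the results.
    constants = {}
    folded = []
    for dest, op, a, b in code:
        a = constants.get(a, a)
        b = constants.get(b, b)
        if op in OPS and isinstance(a, int) and isinstance(b, int):
            op, a, b = "=", OPS[op](a, b), None   # computed at compile time
        if op == "=" and isinstance(a, int):
            constants[dest] = a
        else:
            constants.pop(dest, None)             # dest no longer holds a known constant
        folded.append((dest, op, a, b))

    # Backward pass: keep an instruction only if its result is needed later.
    needed = set(live_out)
    kept = []
    for dest, op, a, b in reversed(folded):
        if dest in needed:
            needed.discard(dest)
            needed.update(x for x in (a, b) if isinstance(x, str))
            kept.append((dest, op, a, b))
    kept.reverse()
    return kept


block = [
    ("t1", "*", 60, 2),
    ("t2", "*", "rate", "t1"),
    ("t3", "+", "initial", "t2"),
    ("unused", "+", "rate", 1),
    ("position", "=", "t3", None),
]
for instruction in optimize(block, live_out={"position"}):
    print(instruction)
```

In this block the product `60 * 2` is computed at compile time, and both the folded temporary and the instruction whose result is never used are dropped.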

takes as input an intermediate representation of the source
program and maps it into the target language.

code generator
• A crucial aspect of code generation is the judicious assignment of
registers to hold variables.

data structure containing a record for each variable name, with
fields for the attributes of the name.

symbol table

• These attributes may provide information about the storage
allocated for a name, its type, and its scope.
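
One possible way to organize such a symbol table is to keep one table per scope and chain each table to its enclosing scope. The class names `Entry` and `SymbolTable` and the particular fields are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str
    type: str
    address: int   # offset of the storage allocated for the name


class SymbolTable:
    """Chained symbol tables: one per scope, each pointing at the enclosing scope."""

    def __init__(self, parent=None):
        self.entries = {}
        self.parent = parent

    def declare(self, name, type_, address):
        if name in self.entries:
            raise NameError(f"{name!r} already declared in this scope")
        self.entries[name] = Entry(name, type_, address)

    def lookup(self, name):
        # Search the innermost scope first, then the enclosing scopes.
        table = self
        while table is not None:
            if name in table.entries:
                return table.entries[name]
            table = table.parent
        raise NameError(f"{name!r} is not declared")


global_scope = SymbolTable()
global_scope.declare("rate", "float", 0)
block_scope = SymbolTable(parent=global_scope)
block_scope.declare("rate", "int", 8)       # shadows the outer declaration
print(block_scope.lookup("rate"), global_scope.lookup("rate"))
```

Looking up a name from the inner scope finds the innermost declaration, which is how the scope attribute of a name is respected.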

Other, more specialized tools have been created to help implement
various phases of a compiler.

automatically produce syntax analyzers from a grammatical
description of a programming language.

Parser generators

produce lexical analyzers from a regular-expression description of
the tokens of a language.

Scanner generators

translation engines that produce collections of routines for
walking a parse tree and generating intermediate code.

Syntax-directed
produce a code generator from a collection of rules for translating
each operation of the intermediate language into the machine
language for a target machine.

Code-generator generators

engines that facilitate the gathering of information about how
values are transmitted from one part of a program to each other
part. Data-flow analysis is a key part of code optimization.

Data-flow analysis

toolkits that provide an integrated set of routines for constructing
various phases of a compiler.

Compiler-construction
• Macro instructions were added to assembly languages so that a
programmer could define parameterized shorthands for
frequently used sequences of machine instructions.

languages are the machine languages.

First-generation

assembly languages.

Second-generation

higher-level languages like Fortran, Cobol, Lisp, C, C++, C#, and Java.

Third-generation

languages designed for specific applications like NOMAD for
report generation, SQL for database queries, and PostScript for
text formatting.

Fourth-generation

languages include logic- and constraint-based languages such as
Prolog and OPS5.

Fifth-generation

languages in which a program specifies how a computation is to
be done

imperative

languages in which a program specifies what computation is to be
done.

Declarative
Languages such as C, C++, C#, and Java are

imperative languages.

Functional languages such as ML and Haskell and constraint logic
languages such as Prolog are often considered to be

declarative languages

language that supports object-oriented programming

object-oriented language

interpreted languages with high-level operators designed for
"gluing together" computations.

Scripting languages

Awk, JavaScript, Perl, PHP, Python, Ruby, and Tcl are popular
examples of

scripting languages

• Programs written in scripting languages are often much shorter
than equivalent programs written in languages like C.
• Compilers can help promote the use of high-level languages by
minimizing the execution overhead of the programs written in
these languages.
• A compiler must translate correctly the potentially infinite set
of programs that could be written in the source language.
• Compiler writing is challenging.
o A compiler by itself is a large program.
o Moreover, many modern language-processing systems
handle several source languages and target machines
within the same framework; that is, they serve as
collections of compilers, possibly consisting of millions
of lines of code.
o Consequently, good software-engineering techniques
are essential for creating and evolving modern
language processors.

models are useful for describing the lexical units of programs
(keywords, identifiers, and such) and for describing the algorithms
used by the compiler to recognize those units.

finite-state machines and regular expressions
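
To make the finite-state model concrete, here is a small sketch of a deterministic finite automaton that recognizes identifiers, corresponding to the regular expression `letter (letter | digit)*`. The state names, the transition table, and the decision to treat the underscore as a letter are assumptions of this example.

```python
import string

LETTERS = set(string.ascii_letters + "_")
DIGITS = set(string.digits)

# Transition table of the automaton; a missing entry means "reject".
TRANSITIONS = {
    ("start", "letter"): "in_id",
    ("in_id", "letter"): "in_id",
    ("in_id", "digit"): "in_id",
}
ACCEPTING = {"in_id"}


def char_class(ch):
    if ch in LETTERS:
        return "letter"
    if ch in DIGITS:
        return "digit"
    return "other"


def is_identifier(text):
    """Run the automaton over text and report whether it ends in an accepting state."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:
            return False
    return state in ACCEPTING


print(is_identifier("rate1"), is_identifier("1rate"))
```

Scanner generators build tables of this kind automatically from the regular expressions that describe the tokens.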

used to describe the syntactic structure of programming
languages such as the nesting of parentheses or control
constructs.

context-free grammars
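
Nesting of parentheses is a standard example of structure that a context-free grammar can describe. The sketch below assumes the grammar `S -> '(' S ')' S | empty` and recognizes its language by recursive descent; the function names are illustrative.

```python
def balanced(text):
    """Recognize the language of S -> '(' S ')' S | empty."""

    def parse_s(pos):
        # Returns the position just after the longest S that starts at pos.
        while pos < len(text) and text[pos] == "(":
            pos = parse_s(pos + 1)               # the nested S inside the parentheses
            if pos >= len(text) or text[pos] != ")":
                raise SyntaxError(f"expected ')' at position {pos}")
            pos += 1                             # consume ')', then loop for the trailing S
        return pos

    try:
        return parse_s(0) == len(text)
    except SyntaxError:
        return False


print(balanced("(()())"), balanced("(()"))
```

The recursion depth tracks the nesting depth of the input, which is the kind of structure a grammar can express through recursive rules.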

important model for representing the structure of programs and
their translation into object code.

trees

the attempts that a compiler makes to produce code that is more
efficient than the obvious code.

optimization
• The optimization of code that a compiler performs has become
both more important and more complex.
• It is more complex because processor architectures have
become more complex, yielding more opportunities to improve
the way code executes.
• It is more important because massively parallel computers
require substantial optimization, or their performance suffers
by orders of magnitude.
• Compiler optimizations must meet the following design
objectives:
o must be correct
o improve the performance
o compilation time must be kept reasonable
o engineering effort required must be manageable.
• We must keep the system simple to assure that the engineering
and maintenance costs of the compiler are manageable.
• Applications of compiler technology: Implementation,
Optimizations for Computer Architectures, Design of New
Computer Architectures, Program Translations, Code
optimization, Debugging.

Compiler technology is used to implement a wide range of
high-level programming languages, including popular languages
such as Java, Python, C++, and JavaScript.

Implementation
provide efficient and accurate translations of high-level code into
machine code.

Implementation

Examples of these optimizations include parallelization, memory
hierarchy optimization, register allocation, and instruction
scheduling.

Optimizations for Computer Architectures

translating code from one programming language to another, or
from one dialect of a programming language to another.

Program Translations

when a program needs to be ported from one platform to
another, or when different parts of a system are written in
different languages and need to be integrated.

Program Translations

providing detailed error messages, diagnostic information, and
debugging tools that can help developers identify and diagnose
programming errors.

Debugging

optimize code by analyzing and transforming it to improve
performance, reduce code size, or improve energy efficiency.

Code optimization
