Compiler Construction – Midterm Notes
Compiler Introduction & Scope
Compiler → A program that translates source code (e.g., C, C++, Java) into a lower-level target language, usually machine code (0s and 1s).
Purpose of course → Learn techniques to translate, optimize, and run code.
Real-life example → You speak Urdu, but the computer only understands binary. The compiler is the translator.
Subjects of Compiler Course
Lexical Analysis → Breaks code into tokens. Example: x = x + y → x, =, x, +, y.
Syntax Analysis → Checks grammar of code.
Syntax Directed Translation → Uses grammar rules to plan translation.
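Intermediate Code Generation (IR) → Produces a machine-independent intermediate form of the program, like a rough draft between source code and machine code.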
Run-time Environment → Manages memory, variables, resources when program runs.
Code Generation → Converts IR into final machine code.
Optimization → Makes code smaller and faster without changing its behavior.
Why Study Compiler?
Even though compilers already exist, knowing how they work is useful for:
• Building interpreters
• Understanding how languages work
• Writing optimized code
Example: You don’t build a car engine, but knowing basics helps fix small issues.
Important Terminology
Compiler → Translates full program into machine code.
Interpreter → Reads & executes code statement by statement, without first producing a separate machine-code program.
Example:
• Compiler = Translate whole Urdu book → English, then read.
• Interpreter = Translate each line while reading aloud.
Knowledge Needed to Build Compiler
Algorithms → Problem-solving methods.
Programming Languages & Machines → Understanding of the source language and the target machine.
Operating Systems → Resource management.
Computer Architecture → Hardware details.
Abstract View of Compiler
Detects legal/illegal code.
Generates correct machine code.
Manages variables & memory.
Produces output in the format the OS and linker expect.
Language Processing System
Steps in program execution:
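1. Preprocessor → Expands macros and #include directives (pulls in header files).
2. Compiler → Converts the preprocessed code into assembly.
3. Assembler → Converts assembly into machine code (0s and 1s) in an object file.
4. Linker/Loader → Linker combines object files and libraries into an executable; loader places it in memory to run.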
Example (Cake):
• Preprocessor = Collect ingredients
• Compiler = Mix ingredients
• Assembler = Bake cake
• Linker/Loader = Decorate & serve
Front-End vs Back-End
Front-End → Scanning, parsing, error detection, IR generation.
Back-End → Optimization and translation of IR into machine code.
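A rough sketch of this split, in Python: the front end lowers an expression tree into three-address IR, and the back end turns that IR into text for a made-up register machine. The nested-tuple tree, the temporary names t1, t2, and the LOAD/ADD/MUL/STORE mnemonics are assumptions for illustration only, not a real instruction set.

```python
from itertools import count

def front_end(tree):
    """Lower a nested-tuple expression tree into three-address IR.

    tree is either a variable name (str) or a tuple (op, left, right).
    Returns (ir, result_name); ir is a list of (dest, op, a, b).
    """
    ir = []
    temps = count(1)

    def lower(node):
        if isinstance(node, str):        # leaf: a variable name
            return node
        op, left, right = node
        a = lower(left)
        b = lower(right)
        dest = f"t{next(temps)}"         # fresh temporary (hypothetical naming)
        ir.append((dest, op, a, b))
        return dest

    result = lower(tree)
    return ir, result

MNEMONIC = {"+": "ADD", "*": "MUL"}      # made-up target instructions

def back_end(ir):
    """Translate each IR instruction into hypothetical assembly text."""
    asm = []
    for dest, op, a, b in ir:
        asm.append(f"LOAD R1, {a}")
        asm.append(f"LOAD R2, {b}")
        asm.append(f"{MNEMONIC[op]} R1, R2")
        asm.append(f"STORE {dest}, R1")
    return asm

# a + b * c
ir, result = front_end(("+", "a", ("*", "b", "c")))
asm = back_end(ir)
for line in asm:
    print(line)
```

Because the back end only sees the IR, the same front end could in principle feed a different back end for another machine.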
Grammar & Parsing
Grammar → Rules for code (like English grammar).
Example:
<expr> ::= <expr> <op> <term> | <term>
<term> ::= number | id
<op> ::= + | -
The parser uses the grammar to build a parse tree.
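A minimal sketch of such a parser, assuming a small expression grammar of the form expr → expr op term | term, term → number | id, op → + | -. The function names, the nested-tuple tree shape, and the simplified tokenizer (which silently skips characters it does not recognize) are assumptions for illustration. The left-recursive rule is handled with a loop, which yields the same left-leaning tree.

```python
import re

def tokenize(text):
    """Split input into numbers, identifiers, and + / - operators."""
    return re.findall(r"\d+|[A-Za-z_]\w*|[+-]", text)

def parse_term(tokens, pos):
    """term ::= number | id"""
    if pos >= len(tokens) or tokens[pos] in ("+", "-"):
        raise SyntaxError(f"expected number or id at token {pos}")
    tok = tokens[pos]
    kind = "number" if tok.isdigit() else "id"
    return ("term", (kind, tok)), pos + 1

def parse_expr(tokens):
    """expr ::= expr op term | term  (left recursion handled with a loop)"""
    term, pos = parse_term(tokens, 0)
    tree = ("expr", term)
    while pos < len(tokens):
        if tokens[pos] not in ("+", "-"):
            raise SyntaxError(f"expected + or - at token {pos}")
        op = ("op", tokens[pos])
        term, pos = parse_term(tokens, pos + 1)
        tree = ("expr", tree, op, term)   # tree grows to the left, like the grammar
    return tree

tree = parse_expr(tokenize("x + 2 - y"))
print(tree)
```

Input that does not fit the grammar, such as "x +" or "x y", raises SyntaxError instead of producing a tree.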
Abstract Syntax Tree (AST)
Simplified parse tree.
Keeps only necessary information for translation.
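To see what an AST keeps, Python's standard ast module can be used as a stand-in (it is Python's own parser, not the compiler built in this course). The sketch below parses x = x + y and inspects the resulting nodes.

```python
import ast

module = ast.parse("x = x + y")
assign = module.body[0]          # the single assignment statement
binop = assign.value             # right-hand side: x + y

summary = (
    type(assign).__name__,       # node kind for the statement
    assign.targets[0].id,        # variable being assigned
    type(binop).__name__,        # node kind for the expression
    binop.left.id,
    type(binop.op).__name__,     # the operator is stored as a node type
    binop.right.id,
)
print(summary)
```

The = and + symbols do not appear as separate leaves; they are implied by the Assign and Add node types, which is the kind of simplification an AST makes over a parse tree.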
Optimizations (Middle-End)
Constant Propagation → Replace uses of a variable with its value when that value is a known constant.
Dead Code Elimination → Remove code whose result is never used or that can never run.
Example:
x = 5;
x = 6; // The first assignment is dead: x is overwritten before it is read
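Both optimizations can be sketched in Python for straight-line code (no branches or loops, no side effects). The representation is an assumption for illustration: each statement is a pair (dest, rhs), where rhs is an integer constant, a variable name, or a tuple (op, a, b). The caller must also say which variables are still needed at the end (live_out).

```python
def operands(rhs):
    """Return the operands of a right-hand side as a list."""
    return list(rhs[1:]) if isinstance(rhs, tuple) else [rhs]

def propagate_constants(code):
    """Replace uses of variables whose value is a known constant."""
    known = {}                               # variable -> constant value
    out = []

    def substitute(v):
        return known.get(v, v) if isinstance(v, str) else v

    for dest, rhs in code:
        if isinstance(rhs, tuple):
            rhs = (rhs[0],) + tuple(substitute(v) for v in rhs[1:])
        else:
            rhs = substitute(rhs)
        if isinstance(rhs, int):
            known[dest] = rhs                # dest now holds a constant
        else:
            known.pop(dest, None)            # dest is no longer a known constant
        out.append((dest, rhs))
    return out

def eliminate_dead_stores(code, live_out):
    """Drop assignments whose value is never read afterwards."""
    live = set(live_out)
    kept = []
    for dest, rhs in reversed(code):         # walk backwards from the end
        if dest not in live:
            continue                         # overwritten or never used: dead
        live.discard(dest)
        live.update(v for v in operands(rhs) if isinstance(v, str))
        kept.append((dest, rhs))
    kept.reverse()
    return kept

# The two-line example from the notes: only x = 6 survives.
notes_example = eliminate_dead_stores([("x", 5), ("x", 6)], live_out={"x"})

# With a use of x added: y = x + 1 becomes y = 6 + 1, after which
# neither assignment to x is needed if only y is live at the end.
code = [("x", 5), ("x", 6), ("y", ("+", "x", 1))]
optimized = eliminate_dead_stores(propagate_constants(code), live_out={"y"})
print(notes_example, optimized)
```

The second case shows why the passes are often run together: propagation removes the last read of x, which lets dead code elimination delete x = 6 as well.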
Lecture 2 – Lexical Analysis
Role of Lexical Analyzer: Splits source code into tokens, removes whitespace and comments, and enters identifiers into the symbol table.
Why Separate Lexical Analysis & Parsing: Keeps the design simpler, makes scanning faster, and improves portability.
Tokens, Patterns, Lexemes: Token = category (e.g., id), Pattern = rule describing which strings belong to a token, Lexeme = actual text matched (e.g., x).
Lexical Errors & Recovery: Panic mode ignores characters until a valid token can be formed; other repairs delete, insert, replace, or swap (transpose) characters.
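Input Buffering & Sentinels: The scanner reads input into buffers so it can look ahead past the end of a lexeme; a sentinel (EOF marker) at the end of the buffer avoids a separate end-of-buffer check for every character.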
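The sentinel idea can be sketched in Python as follows. The choice of "\0" as the sentinel and the helper name scan_alnum_run are assumptions for illustration; real scanners usually apply the same trick to fixed-size buffers.

```python
SENTINEL = "\0"   # assumed end marker; must not occur in normal source text

def scan_alnum_run(buffer, start):
    """Return (lexeme, next_pos) for the run of letters/digits at start.

    buffer must end with SENTINEL, so the loop needs no length check:
    the sentinel is not alphanumeric and always stops the scan.
    """
    pos = start
    while buffer[pos].isalnum():
        pos += 1
    return buffer[start:pos], pos

buffer = "count1 = 42" + SENTINEL
first = scan_alnum_run(buffer, 0)
last = scan_alnum_run(buffer, 9)
print(first, last)
```

The second call runs right up to the end of the input and still stops safely, because the sentinel ends the loop.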
Tokens using Regular Expressions (RegEx): Example id → letter (letter|digit)*.
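A minimal sketch of a regex-driven scanner in Python, using the standard re module. The token names (NUMBER, ID, OP) and the operator set are assumptions for illustration; the ID pattern follows the letter (letter|digit)* rule above.

```python
import re

TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z][A-Za-z0-9]*"),   # letter (letter|digit)*
    ("OP",     r"[=+\-*/]"),
    ("SKIP",   r"\s+"),                    # whitespace is discarded
    ("ERROR",  r"."),                      # any other character
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (token, lexeme) pairs."""
    tokens = []
    for m in MASTER.finditer(source):
        kind, lexeme = m.lastgroup, m.group()
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise ValueError(f"unexpected character {lexeme!r} at {m.start()}")
        tokens.append((kind, lexeme))
    return tokens

print(tokenize("x = x + y"))
```

Each pair shows the token/lexeme distinction: ("ID", "x") has token ID and lexeme x, and the regular expression next to ID in TOKEN_SPEC is its pattern.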
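Finite Automata: A DFA (deterministic) has exactly one move per state and input symbol; an NFA (non-deterministic) may have several moves or ε-moves. Any NFA can be converted to an equivalent DFA (subset construction).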
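As a small illustration, here is a table-driven DFA in Python that recognizes identifiers of the form letter (letter|digit)*. The state names start and in_id and the restriction to ASCII letters and digits are assumptions for illustration.

```python
def char_class(ch):
    """Map a character to the input symbol class used by the DFA."""
    if ch.isascii() and ch.isalpha():
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return "other"

# (state, input class) -> next state; a missing entry means "reject"
TRANSITIONS = {
    ("start", "letter"): "in_id",
    ("in_id", "letter"): "in_id",
    ("in_id", "digit"):  "in_id",
}
ACCEPTING = {"in_id"}

def accepts(text):
    """Run the DFA over text and report whether it ends in an accepting state."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:
            return False
    return state in ACCEPTING

for word in ("x", "x1", "1x", ""):
    print(repr(word), accepts(word))
```

Each state has at most one next state per input class, which is what makes the automaton deterministic.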
Lexical Analyzer Generators: Tools like Lex/Flex build scanners automatically.
Lexical Analysis Summary: Compiler phases, role of lexical analyzer, DFA/NFA, tools.