Various Data Structures Used in Compiler
Last Updated: 11 Jul, 2023
A compiler is a program that translates a high-level language (HLL) into a low-level language (LLL) such as machine code. The phases of the compiler rely on several data structures to do their work. This article describes the most common ones.
The main data structures used in compilers are:
- Tokens
- Syntax Tree
- Symbol Table
- Literal Table
- Parse Tree
1. Tokens
When a scanner groups the input characters into tokens, it typically represents each token symbolically, as a value of an enumerated data type whose members are the token kinds of the source language. The scanner must also keep the character string of each token, together with any information derived from it.
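The idea can be sketched in Python as follows. This is a minimal illustration, not a production scanner: the token set (`IDENT`, `NUMBER`, `PLUS`, `STAR`), the `Token` record, and the `scan` function are all hypothetical names chosen for a tiny expression language.

```python
import re
from dataclasses import dataclass
from enum import Enum, auto


class TokenKind(Enum):
    """Assumed token set for a tiny expression language."""
    IDENT = auto()
    NUMBER = auto()
    PLUS = auto()
    STAR = auto()


@dataclass(frozen=True)
class Token:
    kind: TokenKind  # symbolic category (the enumerated value)
    lexeme: str      # the exact characters matched in the source


# One named group per token kind; group names match TokenKind members.
# SKIP matches whitespace, which produces no token.
TOKEN_RE = re.compile(
    r"(?P<IDENT>[A-Za-z_]\w*)|(?P<NUMBER>\d+)|(?P<PLUS>\+)|(?P<STAR>\*)|(?P<SKIP>\s+)"
)


def scan(source: str) -> list[Token]:
    """Group the characters of `source` into a list of tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":
            tokens.append(Token(TokenKind[match.lastgroup], match.group()))
        pos = match.end()
    return tokens
```

Each `Token` keeps both the enumerated kind and the original lexeme, so later phases can still distinguish the identifier `a` from the identifier `b` even though both have kind `IDENT`.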
2. Syntax Tree
A syntax tree is a tree data structure in which each leaf node represents an operand and each interior node represents an operator. It is usually a dynamically allocated, pointer-based structure that the parser builds as parsing proceeds.
For example, in the syntax tree for a+b*c the root is +, its left child is the leaf a, and its right child is a * node whose children are the leaves b and c.
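A syntax tree for a+b*c might be represented as in the sketch below. The class names `Leaf` and `BinOp` and the helper `to_prefix` are hypothetical; the tree is built by hand here rather than by a parser, to keep the example short.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    name: str  # an operand, e.g. an identifier


@dataclass
class BinOp:
    op: str        # the operator stored at this interior node
    left: "Node"
    right: "Node"


Node = Union[Leaf, BinOp]

# a + b*c: '*' binds tighter than '+', so it sits lower in the tree.
tree = BinOp("+", Leaf("a"), BinOp("*", Leaf("b"), Leaf("c")))


def to_prefix(node: Node) -> str:
    """Render the tree in prefix form to make its shape visible."""
    if isinstance(node, Leaf):
        return node.name
    return f"({node.op} {to_prefix(node.left)} {to_prefix(node.right)})"


print(to_prefix(tree))  # prints (+ a (* b c))
```

The object references between nodes play the role of the pointers mentioned above, and each node is allocated at the moment it is constructed.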

3. Symbol Table
The symbol table is a data structure that stores information about identifiers, functions, variables, constants, and data types. The compiler creates and maintains it to keep track of the occurrences of these entities. The symbol table is used in almost every phase of the compiler, as the diagram of the phases of a compiler shows. The scanner, parser, and semantic analysis phase may enter identifiers into the symbol table, while the optimization and code generation phases read from it to make appropriate decisions. Because the table is accessed so often, its insertion, deletion, and lookup operations must be efficient. For this reason it is most often implemented as a hash table.
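A hash-table-based symbol table could look like the following sketch. It assumes a language with nested scopes, which the description above does not require; the class `SymbolTable` and its method names are hypothetical. Each scope is a Python `dict`, which is itself a hash table.

```python
class SymbolTable:
    """Scoped symbol table; each scope is a dict (a hash table)."""

    def __init__(self):
        self.scopes = [{}]  # scopes[0] is the global scope

    def enter_scope(self):
        """Open a new innermost scope, e.g. on entering a block."""
        self.scopes.append({})

    def exit_scope(self):
        """Delete every entry of the innermost scope at once."""
        self.scopes.pop()

    def insert(self, name, **attrs):
        """Record `name` with its attributes in the innermost scope."""
        scope = self.scopes[-1]
        if name in scope:
            raise KeyError(f"{name!r} already declared in this scope")
        scope[name] = attrs

    def lookup(self, name):
        """Return the attributes of the nearest declaration, or None."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None


table = SymbolTable()
table.insert("x", type="int")
table.enter_scope()
table.insert("x", type="float")  # shadows the outer x
inner = table.lookup("x")        # {'type': 'float'}
table.exit_scope()
outer = table.lookup("x")        # {'type': 'int'}
```

Insertion and lookup each take expected constant time per scope, which is the property that makes hash tables the usual choice here.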

4. Literal Table
A literal table is a data structure that keeps track of the literals in a program, that is, the constants and strings it uses. Each literal appears in the table only once, and the contents apply to the whole program, which is why deletions are not necessary. Because every constant or string is stored once and then reused, the literal table plays an important role in reducing program size.
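The store-once, reuse-everywhere behaviour can be sketched as below. The class `LiteralTable` and its `intern` method are hypothetical names; keying on the value's type as well as the value is a design choice made here so that `1` and `1.0` get separate entries.

```python
class LiteralTable:
    """Stores each distinct literal once and hands out its index."""

    def __init__(self):
        self._index = {}  # (type name, value) -> position in self.values
        self.values = []  # the literals, in order of first appearance

    def intern(self, value):
        """Return the index of `value`, adding it only if it is new."""
        key = (type(value).__name__, value)
        if key not in self._index:
            self._index[key] = len(self.values)
            self.values.append(value)
        return self._index[key]


literals = LiteralTable()
first = literals.intern("hello")
pi = literals.intern(3.14)
again = literals.intern("hello")  # reuses the existing entry
```

After these calls `first` and `again` are the same index, and `literals.values` holds only two entries. There is no delete operation, matching the description above.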
5. Parse Tree
A parse tree is a hierarchical representation of how a string is derived from the start symbol of a grammar. Its nodes are grammar symbols, which are either terminals or non-terminals. The root of the parse tree is the start symbol, every interior node is a non-terminal, and every leaf node is a terminal. Reading the leaves from left to right yields the original input string.
For example, the leaves of a parse tree for a+b*c, read from left to right, spell out a + b * c.
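The sketch below builds one possible parse tree for a+b*c. The grammar is an assumption, since the article does not give one: E -> E + T | T, T -> T * F | F, F -> a | b | c. The names `PTNode` and `frontier` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PTNode:
    symbol: str                                   # terminal or non-terminal
    children: list = field(default_factory=list)  # empty for terminals


# Parse tree for a+b*c under the assumed grammar:
#   E -> E + T | T,   T -> T * F | F,   F -> a | b | c
tree = PTNode("E", [
    PTNode("E", [PTNode("T", [PTNode("F", [PTNode("a")])])]),
    PTNode("+"),
    PTNode("T", [
        PTNode("T", [PTNode("F", [PTNode("b")])]),
        PTNode("*"),
        PTNode("F", [PTNode("c")]),
    ]),
])


def frontier(node):
    """Collect the leaves (terminals) from left to right."""
    if not node.children:
        return [node.symbol]
    leaves = []
    for child in node.children:
        leaves.extend(frontier(child))
    return leaves


print("".join(frontier(tree)))  # prints a+b*c
```

Unlike the syntax tree, this tree keeps every non-terminal used in the derivation, which is why it has many more nodes for the same expression.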

The intermediate code also needs a data structure to hold it.
6. Intermediate Code
Once the intermediate code is generated, it can be stored as a linked list of structures, a text file, or an array of strings, depending on the type of intermediate code that is generated. The data structure should be chosen so that it supports the optimizations performed on the code.
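As one illustration, the sketch below stores three-address code for a+b*c as a list of structures and then renders the same code as an array of strings. Three-address code in quadruple form is an assumed choice of intermediate representation; the names `Quad` and `gen`, and the tuple encoding of the expression, are hypothetical.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    """One three-address instruction: result = arg1 op arg2."""
    op: str
    arg1: str
    arg2: str
    result: str


def gen(expr, code):
    """Append quads for `expr` to `code`; return the name holding its value.

    `expr` is either a variable name (str) or a tuple (op, left, right).
    """
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    arg1 = gen(left, code)
    arg2 = gen(right, code)
    temp = f"t{len(code) + 1}"  # fresh temporary name
    code.append(Quad(op, arg1, arg2, temp))
    return temp


code = []
result = gen(("+", "a", ("*", "b", "c")), code)

# The same code as an array of strings.
lines = [f"{q.result} = {q.arg1} {q.op} {q.arg2}" for q in code]
print(lines)  # prints ['t1 = b * c', 't2 = a + t1']
```

The structured form is easier for an optimizer to inspect and rewrite, while the string form is convenient for writing the code to a text file.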