What Happens When A SQL Is Executed
What Happens When A SQL Is Executed
86 1 Share
Have you ever wondered how a simple SQL command unlocks database power? SQL, or
Structured Query Language, is the backbone of modern data management. It allows
efficient retrieval, manipulation, and management of information in relational
databases.
Behind every query we run, there’s a complex sequence of processes. These transform
our commands into actions performed by the database management system (DBMS).
Mastering these processes lets us harness their full potential.
As developers, it’s crucial to understand the journey of a SQL statement. This journey
takes us through the SQL parser, query optimizer, execution engine, and underlying
storage engine. With this insight, we can:
https://blog.bytebytego.com/p/what-happens-when-a-sql-is-executed 1/13
4/25/24, 12:14 PM (1) What Happens When a SQL is Executed?
SQL Standards
SQL standards are developed and maintained by international standards organizations,
such as the International Organization for Standardization (ISO) and the American
National Standards Institute (ANSI). These standards are shaped with contributions
from industry experts and database vendors. Their goal is to ensure interoperability,
portability, and consistency across various SQL implementations. This enables
developers to write SQL code that can run on multiple database platforms.
https://blog.bytebytego.com/p/what-happens-when-a-sql-is-executed 2/13
4/25/24, 12:14 PM (1) What Happens When a SQL is Executed?
There are 4 stages in SQL standard development. It began with the early versions –
SQL-86, SQL-89, and SQL-92- – which formed the foundation and introduced major
keywords. SQL:1999 is often recognized as a pivotal point in modern SQL development.
However, later standards have continued to introduce valuable features and
improvements. These address ongoing challenges in data management and adapt to
technological advancements.
Each SQL standard builds on the work of its predecessors, contributing to the ongoing
evolution and refinement of the language.
SQL Statement
A SQL statement is a command interpreted and executed by the SQL engine. It is
primarily considered a declarative language, focusing on specifying what data should be
retrieved or manipulated rather than how to retrieve it.
The portability of SQL across different database platforms is one of its key advantages.
SQL queries written for one system often require minimal or no modifications to run on
another, as long as they adhere to the SQL standards.
Consider the following example of SQL syntax. The SELECT statement retrieves
specific data from one or more tables. It’s paired with the FROM clause, which specifies
the tables involved in the query.
To incorporate data from multiple tables, the JOIN clause establishes relationships
between them based on shared keys or conditions. The WHERE clause allows for the
filtering of rows based on specified conditions, further refining the result set.
For aggregating data based on common attributes, the GROUP BY clause is employed.
This facilitates summary statistics or grouping operations. Finally, the ORDER BY
clause sorts the query results in a specified order based on columns.
https://blog.bytebytego.com/p/what-happens-when-a-sql-is-executed 3/13
4/25/24, 12:14 PM (1) What Happens When a SQL is Executed?
Broadly speaking, MySQL is structured into two main tiers: the Server tier and the
Storage Engine tier.
Server Tier: This tier includes connectors, query caches, analyzers, optimizers, and
executors. It covers most of the core service functions of MySQL, including all built-in
functions such as date, time, mathematical, and cryptographic functions. All cross-
storage-engine functionalities, like stored procedures, triggers, and views, are
implemented at this layer.
Storage Engine Tier: This layer is responsible for data storage and retrieval. Its
pluggable architecture allows for the use of various storage engines, such as InnoDB,
MyISAM, and Memory. InnoDB is the most popular and has been the default storage
engine since MySQL version 5.5.5.
When executing a “create table” command, MySQL defaults to using InnoDB. However,
users can specify another engine by including options like engine=memory for an in-
memory engine or other supported engines according to their needs. Each storage
engine offers different methods of accessing table data and supports different features.
https://blog.bytebytego.com/p/what-happens-when-a-sql-is-executed 4/13
4/25/24, 12:14 PM (1) What Happens When a SQL is Executed?
The diagram below illustrates the high-level architecture of MySQL. Let's walk through
it step by step.
Connection Pool
Connecting to a database begins with the connection pool, accessed through various
client connectors. The connection pool's primary responsibilities include establishing a
connection with the client, granting permissions, and managing the connection.
Following a standard TCP handshake, the connection pool proceeds to authenticate the
user using the provided username and password.
https://blog.bytebytego.com/p/what-happens-when-a-sql-is-executed 5/13
4/25/24, 12:14 PM (1) What Happens When a SQL is Executed?
Once authenticated, the connection pool checks the permissions table to grant the
appropriate permissions to the connection. It's important to note that once a connection
is established, any changes made to the user’s permissions will not affect the
permissions of that existing connection.
If no further actions occur after the connection is established, it enters an idle state.
Should the client remain idle for too long, the connection pool will automatically sever
the connection. The client must then reconnect to initiate any new requests to the
database.
Query Cache
Once the connection is established, we can execute SQL statements. When the MySQL
server receives a query, it initially checks the query cache to determine if the statement
has been executed previously. Executed statements and their results may be cached
directly in memory as key-value pairs, where the query statement serves as the key and
the query result as the value.
If the query can be matched directly in this cache, the result is immediately returned to
the client.
If the statement is not found in the query cache, the query execution proceeds as
normal. Upon completion, the result is then stored in the query cache for potential
future use.
While using the query cache can seem quite efficient, it is generally not recommended
for tables that change tables frequently. Such changes invalidate the query cache too
often, reducing its effectiveness. Note that MySQL has removed the query cache starting
from version 8.0 due to these limitations.
Command Parser
If the query is not in the query cache, the request moves to the command parser. Here,
the parser performs a lexical analysis to classify each string in the SQL statement. For
example, it identifies keywords such as “SELECT” or “UPDATE”. Then, it performs a
https://blog.bytebytego.com/p/what-happens-when-a-sql-is-executed 6/13
4/25/24, 12:14 PM (1) What Happens When a SQL is Executed?
syntax analysis to determine if the statement complies with MySQL’s syntax rules. This
ensures that the arrangement of these elements forms a valid SQL command.
Optimizer
After parsing the SQL statement, the server knows what it needs to do next. But before
executing the statement, it consults the optimizer.
The optimizer determines the most efficient way to access data. When a table has
multiple indexes, the optimizer chooses which one to use. It also decides the sequence
for joining tables when a statement includes multiple joins.
Consider this scenario where we execute a join statement between users and orders
tables:
SELECT *
FROM users a
JOIN orders b
ON a.ID = b.ID
WHERE a.AGE = 30 AND b.PRODUCT_ID = 201;
Here, we have two options: filter by AGE = 30 on the users table or by PRODUCT_ID =
201 on the orders table. Although both approaches yield the same results, their efficiency
may be different. It is the optimizer’s job to select the best execution plan based on the
indexes and statistics available for the tables.
After selecting the best plan, the optimizer sends it to the executor.
Query Executor
The executor begins by verifying whether we have the necessary permissions to execute
the statement on the tables. Once permission is confirmed, it uses the interfaces
provided by the storage engine to access the data.
Let's look at the execution process using our previous example, assuming there are no
indexes on the tables and a table scan is required:
https://blog.bytebytego.com/p/what-happens-when-a-sql-is-executed 7/13
4/25/24, 12:14 PM (1) What Happens When a SQL is Executed?
1. Fetch the first row from the users table using InnoDB interfaces. If the AGE column
is not 30, discard the row and proceed to the next.
2. Continue this process to collect all rows where AGE = 30 until reaching the end of
the table.
The executor then sends the result set back to the client.
If the tables have predefined indexes, the executor first checks these before scanning the
entire table. Essentially, each index is associated with a B+ tree structure in the InnoDB
storage engine. For example, consider that we’ve created two indexes on the users table:
one for the ID column, and the other for the AGE column. These indexes help the
executor quickly locate the relevant rows by navigating through the B+ tree structures
instead of examining every single row in the table.
Indexes are categorized based on the data stored in the leaf node: a clustered index and a
non-clustered index. The index on the ID column is a clustered index, where the entire
row of data is stored in the leaf node. This is also called the primary key. The index on
the AGE column is a non-clustered index, storing only the primary key value in the leaf
node. This is also called the secondary index.
https://blog.bytebytego.com/p/what-happens-when-a-sql-is-executed 8/13
4/25/24, 12:14 PM (1) What Happens When a SQL is Executed?
An update statement does more than a select statement, which simply retrieves data
based on certain conditions. It actively modifies persisted data, adding layers of
complexity to the process.
Just like the select statement, an update statement passes through several stages: the
connection pool, analyzer, optimizer, and executor. However, it also involves two
important logging modules to guarantee data consistency: the redo log and the binlog.
Let's explore each of these modules.
Redo Log
First, let's explore the Write-Ahead Log (WAL) architecture, an essential concept in
database management.
In a WAL system, changes to the database, such as inserts, updates, and deletes, are first
recorded sequentially in a log file before they modify the actual database. This process
ensures that if a crash occurs before the changes are applied to the database, the system
can recover by replaying these log entries. This mechanism brings the database back to
a consistent state by restoring the information recorded in the log.
The redo Log in MySQL employs the WAL principle. It ensures that all transactions are
durable and the database remains consistent, even after a crash. The diagram below
shows the three layers in the redo log architecture:
1. Logical redo layer: This layer contains multiple records linked together, each
identified by a sequential offset number (sn).
2. Physical redo layer: Since the disk operates as a block device, this layer utilizes the
concept of blocks to read and write data. A single block may contain multiple redo
logs, or a redo log might be spread across several blocks.
3. File layer: Redo logs are stored in specific log files (like ib_logfile0, ib_logfile1, etc.).
To minimize the overhead associated with file creation and space allocation,
InnoDB recycles its REDO files.
https://blog.bytebytego.com/p/what-happens-when-a-sql-is-executed 10/13
4/25/24, 12:14 PM (1) What Happens When a SQL is Executed?
Source: Alibabacloud
This architecture ensures that InnoDB is crash-safe. It can recover all committed
transactions even after a database crash.
Next, we'll examine another crucial logging mechanism at the server layer: the binlog.
Binlog
The binlog differs from the redo log in three key ways:
1. Scope of use: The redo log is specific to the InnoDB engine, whereas the binlog
operates at the server layer and is accessible by all storage engines.
2. Type of logging: While the redo log captures physical changes made to the data
pages,the binlog logs the SQL statements used to execute the transactions,
recording them in a logical format.
3. Size management: The redo log has a fixed size and must be recycled regularly, but
the binlog can keep appending data to it.
https://blog.bytebytego.com/p/what-happens-when-a-sql-is-executed 11/13
4/25/24, 12:14 PM (1) What Happens When a SQL is Executed?
During an update operation, the executor first retrieves the relevant rows from memory
or disk and modifies them in memory. This process is managed using a protocol known
as Two-Phase Commit (2PC), crucial in distributed systems to ensure transaction
consistency across multiple participants.
In the first phase, known as the prepare stage, changes are recorded in the redo log.
Subsequently, these modifications are logged into the binlog during the second phase,
the commit stage, where the transaction is finalized.
This approach ensures synchronization between the redo log and the binlog. It is critical
for the consistency and durability of transactions.
Additionally, the InnoDB storage engine includes other crucial components like the lock
manager and transaction manager. These components are vital for managing
concurrency. For instance, in our update scenario, the rows being updated are locked
exclusively to prevent other transactions from modifying them. We will explore database
locks and their mechanisms in more detail in future issues.
Summary
In conclusion, executing SQL statements like SELECT and UPDATE in a DBMS is a
complex, multi-step process. It involves parsing, optimization, execution, and
transaction management.
While both SELECT and UPDATE statements follow similar execution paths, UPDATE
statements require additional mechanisms such as the redo log and binlog. The redo log
ensures the durability of transactional changes by recording them before they are
applied to the database. In contrast, the binlog captures changes made by UPDATE
statements, supporting both replication and recovery.
https://blog.bytebytego.com/p/what-happens-when-a-sql-is-executed 12/13
4/25/24, 12:14 PM (1) What Happens When a SQL is Executed?
86 Likes · 1 Restack
Comments
Write a comment...
https://blog.bytebytego.com/p/what-happens-when-a-sql-is-executed 13/13