Databases In-Depth – Complete Course

Uploaded by

Maruf Jony

Learn all about databases in this course, designed to help you understand the complexities of database architecture and optimization. From understanding foundational components like transaction management and storage engines, to mastering advanced indexing techniques and exploring the inner workings of SQLite, this course equips you with the knowledge to efficiently manage and optimize data systems. K Persani developed the course.

"A lot of people get confused about the difference between the buffer manager and the cache manager." "You cannot process things directly on disk." "Taking locks is just a part of handling concurrency." "These make sure that our DB becomes distributed." "This together forms the back end of the database software."

Engineers deal with databases on a daily basis, but there are only a few software engineers who actually get to work on these databases, truly understand what is happening inside, and look at the internal architecture. Most of the databases that we commonly use these days are actually open source, so why can't we go through this open-source code and understand what is happening inside? You tell me, wouldn't it be so cool and so awesome if we could not only understand the internal architecture of databases but also debug the code? I don't know about you, but as a software engineer I have always wanted to do this, and as a teacher this has always been on my bucket list. So yes, all of this is finally happening: introducing the free course Databases In Depth by Edu Courses, where we are not only going to understand databases at a higher level, with their requirements and all, but also look at the internal architecture, understand why a particular database works in a particular way, and debug the common queries in all the databases.

Now, why am I calling this a free course even though it is going to be available on YouTube? Because I'm structuring it like a proper course. So let me tell you how it is structured. This is the first video of the first lecture of the course, and we will get started by understanding the general components that are there in any database. It will be a great place to start: you will understand how data flows, how data is stored, and so on. As we pick up the databases in particular and understand them in detail, you will see that some databases obviously focus on some components more, some just call a component something else, and some combine a few components and again call the result something else. But this will be a great place to start, and it will be a guideline for reference as we pick up the databases one by one.

Because this is the first video, please know that it is a relatively shorter video; all the videos after this are going to be at least one hour long. We will get started by understanding the most important concepts in any database, which are B+ trees and LSM trees. This will give you an understanding of how data is stored in the most common databases. Once you understand these important concepts, we will get started with the simplest database out there, which is SQLite. For every DB that we pick up, we are going to dedicate at least one video completely to understanding the internal architecture, and one video where we will debug the common queries. This is how we will pick up each of the databases one by one. As I said, we will get started with a simple DB, which is SQLite, and then pick up more complex databases like MongoDB, Postgres, and so on. So this is how the entire course is structured.

Before we get started with the first video, I would just like to say that if you think this course, this video, or the channel in general deserves to reach more people, please consider sharing it with your friends, maybe on LinkedIn or Instagram, anywhere. Please just tag me and share it; you have no idea how much it will mean to me. It will take one minute, or at most two minutes, for you, but it will mean so much to me, and it will motivate me to do better and keep delivering as much as possible. A lot of effort has gone into this, and a share would mean the world to me. Let's get started with the first video now.

In all databases, we first have a client. This client is going to be responsible for performing the CRUD operations, which are create, read, update, and delete, on the DB. Obviously, there is going to be a DB server that runs separately, and this is the entire system that we are going to understand in detail, along with all the components inside it. The first thing we want is for this client to be able to interact with our DB. Usually the DB runs on a completely different server somewhere on the network, so the first thing we have to do is enable communication between the client and this DB server somehow. Our DB server is therefore supposed to have a network layer, which will be responsible for the communication between the client and the rest of the components in the DB, so I'm going to quickly write the network layer over here.

Now, note that I said there is usually going to be a separate server for databases, because there is one exception. Most common databases have a dedicated server process, but SQLite, which is the most common and simplest database for small to medium devices, is one of the simplest databases precisely because you don't even need a separate server for it. It is embeddable: you don't need a separate process, because it is embedded in the same process as the rest of the system. So it obviously doesn't need a network layer, but for the rest of the commonly used databases, a networking layer is obviously required. As and when we discuss these databases in detail, these nuances will become clearer, and you will see which components are especially important in which particular databases. But generally, there is going to be a network layer to enable communication between the client and the rest of the components.
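To make the embedded point concrete, here is a minimal sketch using Python's built-in sqlite3 module: the database engine runs inside the same process as the application, with no server or network layer involved. The table and column names are just invented for this demo.

```python
import sqlite3

# SQLite runs in-process: opening a database is just opening a file
# (":memory:" keeps it entirely in RAM for this demo).
conn = sqlite3.connect(":memory:")

conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
conn.commit()

# The query is parsed, planned, and executed inside this same process;
# no request ever crosses a network boundary.
rows = conn.execute("SELECT name FROM users").fetchall()
print(rows)  # [('alice',)]
conn.close()
```

Contrast this with, say, Postgres or MongoDB, where the equivalent call would serialize the query and send it over a socket to a separate server process.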

Coming to the first component: in order to understand what should happen first inside the database, we should understand how the client is sending the instruction, the command, or the query to the DB. It is a simple query, something like SELECT * FROM, or CREATE TABLE, or create document, or INSERT INTO; it is a command or a query. The first thing the DB is supposed to do is make sense of whatever instruction or command has come in, and check whether the query is even valid. So the first thing the DB does, in its first component, is understand what words have come in, one by one. For "select all from", "create table", or "create document": is CREATE a valid token or a valid word? Similarly, is SELECT a valid word, is INSERT a valid word? So the first component is going to have a tokenizer, and it will be responsible for tokenizing the query, basically dividing the query into simple words and tokens so that we can understand what each one might mean.

Once we have divided the entire query into small tokens, we need to understand whether these tokens together form a valid statement, a valid instruction. For example, if I write "INTO INSERT" or "TABLE CREATE", it doesn't make sense, and our DB will not be able to make sense of it. So: have the tokens come in the right format, do they together make sense as an instruction, and what do they even mean? For that, after the tokenizer we have a parser. Usually the output of a parser in databases is a parse tree, which tells you what instructions need to be performed. So the tokenizer divides your entire query into tokens, and then the parser figures out that these tokens together make sense, that this is a valid instruction, and that this is what our client is asking us to do. Basically: do these tokens together follow a particular rule, are they together a valid instruction, and what can be done corresponding to this instruction? That is what the parser is going to do.
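The tokenize-then-parse idea can be sketched in a few lines. This is a toy, not how any real database implements it: it only recognizes a hypothetical `SELECT <column> FROM <table>` shape, but it shows the two stages, splitting into tokens and then checking that the tokens follow a grammar rule.

```python
# Toy front end: a tokenizer plus a parser for one statement shape.
def tokenize(query: str) -> list[str]:
    # Split the raw query text into simple word tokens.
    return query.replace(";", " ").split()

def parse(tokens: list[str]) -> dict:
    # Grammar rule: SELECT <column> FROM <table>
    if (len(tokens) == 4
            and tokens[0].upper() == "SELECT"
            and tokens[2].upper() == "FROM"):
        return {"op": "select", "column": tokens[1], "table": tokens[3]}
    raise ValueError(f"syntax error near: {' '.join(tokens)}")

tree = parse(tokenize("SELECT name FROM users;"))
print(tree)  # {'op': 'select', 'column': 'name', 'table': 'users'}
```

A reversed input like "FROM users SELECT name" fails the grammar check and raises, which is exactly the "TABLE CREATE doesn't make sense" situation described above.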

After that, once we know the query has been parsed, we know that we need to start executing this particular instruction. But before we do that, our database needs to figure out the best way of executing this query, and for that most databases are going to have an optimizer. See, there are many ways of executing a particular query; our database needs to be fast, and it needs to find the best way to do that. A very simple example: we should be able to understand whether we need to scan all the rows, or whether we can scan just an index, in order to run this instruction. Or, if we need some joins to run this particular instruction, how do we execute them? The role of the optimizer is to understand the best possible way to run this instruction or this query, including whether there is something that can be done in parallel.

After the optimizer, we will have proper instructions or operations that the rest of the components need to perform. The entire output of this first component can be, say, bytecode, or it can be operations or instructions; the main thing is that the output should be something the rest of the components can make sense of. Whether it is bytecode or a list of operations, the rest of the components should be able to understand what it is and what we are supposed to do.

These three together, the tokenizer, the parser, and the optimizer, form the first component of our database, which you can call the front end of the database, because it is responsible for making sense of whatever query is coming in and converting it into something that the rest of the database, which you can consider the back-end components, can understand and execute. So this is like the front end, the first division of the entire database, and it usually consists of three things: a tokenizer, a parser, and an optimizer.
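You can watch a real optimizer make exactly this row-scan-versus-index-scan decision with SQLite's EXPLAIN QUERY PLAN, again via Python's built-in sqlite3 module. The table and index names here are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

# Without an index on email, the planner has no choice but a full scan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?", ("a@b.c",)
).fetchall()
print(plan)  # the detail column describes a full scan of users

# After adding an index, the same query can use an index search instead.
conn.execute("CREATE INDEX idx_email ON users(email)")
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?", ("a@b.c",)
).fetchall()
print(plan)  # the detail column now mentions idx_email
conn.close()
```

Each returned row ends with a human-readable "detail" string; the point is that the SQL text did not change at all, only the optimizer's chosen plan did.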

After the first component, now that we have the optimized plan and we know what the instruction actually means, what the query means, it is time to get started with the query execution. This is done by one of the most important components of the entire system, called the execution engine. You can consider this the most important part of the entire system. It is responsible, firstly, for query execution, so there is a query executor. The query executor is very simple: whatever operations it got, it performs them one by one. Other than query execution, the execution engine also does cache management, so there is a cache manager inside. Why is cache management needed? Obviously, it doesn't make sense to keep querying again and again if we can just cache the result; you know about caching, right, to make things faster. We'll cover it in a lot more detail in the coming videos, of course, but you should understand the basic concept: we need to cache some data instead of querying again and again, so a cache manager is needed. Other than this, there are also some utility services inside the execution engine, for example for authentication, security, backups, or metrics.

Now you must be thinking: didn't we just do the query execution? Shouldn't this bring us to the end of the entire system? No, not really. See, in databases, a transaction itself is a very complicated term; it's not as simple a transaction as it sounds. Let's take a very simple example: we have to transfer money from person A to person B. That means we have to first reduce the amount from person A and also give the money to person B. So one transaction itself involves two steps. Similarly, transactions in our real databases and in real scenarios are going to be very complex and consist of multiple steps. Now, not all databases support the ACID properties fully, but a lot of databases do, and even the other databases are supposed to have the ACID properties to some level: atomicity, consistency, isolation, durability. Especially atomicity: doesn't it make sense that all databases should give some level of guarantee that either the transaction completes entirely or it does not happen at all? Similarly for consistency, isolation, and durability. So it makes sense that there is one component dedicated to making sure that transaction management happens properly. That is why we have a transaction manager, which is going to make sure that these things happen properly and that the ACID properties are followed.
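The two-step money transfer is the classic atomicity example, and you can watch a transaction manager enforce it with Python's built-in sqlite3. The account names and balances here are invented for the demo; the point is that either both steps commit, or a failure rolls both back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    try:
        # Both steps run inside one transaction.
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                     (amount, dst))
        # Simulate a crash between the two steps and the commit:
        if amount > 100:
            raise RuntimeError("crash mid-transaction")
        conn.commit()
    except Exception:
        conn.rollback()  # atomicity: the debit is undone too

transfer(conn, "A", "B", 30)   # succeeds: A=70, B=80
transfer(conn, "A", "B", 999)  # "crashes": balances stay at 70/80
print(conn.execute("SELECT name, balance FROM accounts ORDER BY name").fetchall())
# [('A', 70), ('B', 80)]
```

Without the rollback, the failed transfer would have left A debited but B never credited, which is exactly the inconsistency atomicity exists to prevent.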

Now, as we do transaction management, a very important thing we have to do, in order to make sure that when there are multiple transactions there are no data inconsistencies, conflicts, or anomalies, is to take some locks. That is how things work in real systems. Different databases are going to manage locks in different ways: some take shared locks, some take exclusive locks; some databases take locks over, say, a particular row or a particular document, while another database might take a lock over the entire database itself. That is also possible. The way of handling it differs, but it definitely makes sense that every single database has to take care of multiple transactions, so lock management is again very important. So we take care of transaction management and lock management. Now, as we are talking about transaction management and lock management, one very important thing to talk about is this: suppose in the middle of a transaction a crash happens and our DB crashes. What is going to happen? How are we going to recover from it? Did the transaction finish, did it not finish, and how do we go on from there? How do we recover from the crash? So a very important component is going to be the recovery manager, and together the transaction manager, the lock manager, and the recovery manager form our next component.
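Here is a minimal sketch of why a lock manager matters, using an exclusive lock from Python's threading module. Two concurrent "transactions" doing a read-modify-write on the same balance can lose updates if they interleave; holding the lock around the critical section prevents that. This illustrates exclusive locking in general, not any specific database's implementation.

```python
import threading

balance = 0
lock = threading.Lock()

def deposit(times: int) -> None:
    global balance
    for _ in range(times):
        # Exclusive lock: only one "transaction" may run the
        # read-modify-write sequence below at a time.
        with lock:
            current = balance      # read
            balance = current + 1  # modify + write back

threads = [threading.Thread(target=deposit, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(balance)  # 40000: no updates were lost
```

A shared (read) lock is the same idea relaxed: many readers may hold it at once, but a writer must wait for exclusivity.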

We discussed that in order to handle multiple transactions at the same time, we can have a lock manager that takes care of locks. In other words, we want to take care of multiple transactions concurrently; we want to take care of concurrency, which just means handling multiple transactions at the same time. Now, the thing is that taking locks is just one part of handling concurrency. If we talk about concurrency at a broader level, most databases take care of it using other techniques as well. For example, a very common technique used by a lot of databases is called MVCC, multiversion concurrency control. There will be a dedicated video in this course where we talk about MVCC, because it really is used by a lot of databases. In short, there are other techniques that databases employ to handle concurrent transactions at a broader level, and lock management is just a part of that. See, when we design a huge system like a DB, we make sure we have separation of concerns, so that our system is flexible, maintainable, extensible, and all of that, the usual guidelines for system design. That is why there is usually a separate concurrency manager altogether. I could have added it to the last component as well, but the concurrency manager and the lock manager usually work together, with a clear separation of concerns: the concurrency manager ensures, at a broader level, that transactions can happen concurrently to whatever level is expected from that particular database, while taking care of the locks is just one part of that, which the lock manager will handle.
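MVCC, which gets its own video later, can be sketched at a toy level: instead of blocking readers with locks, the store keeps multiple versions of each key stamped with the transaction number that wrote them, and a reader only sees versions committed at or before its own snapshot. This is a simplified illustration of the idea, not any real database's design.

```python
# Toy multiversion store: each key maps to a list of (version, value).
class MVCCStore:
    def __init__(self):
        self.versions = {}   # key -> [(commit_version, value), ...]
        self.current = 0     # last committed transaction number

    def write(self, key, value):
        # Each committed write creates a new version; old ones survive.
        self.current += 1
        self.versions.setdefault(key, []).append((self.current, value))
        return self.current

    def read(self, key, snapshot):
        # Return the newest value committed at or before the snapshot.
        for version, value in reversed(self.versions.get(key, [])):
            if version <= snapshot:
                return value
        return None

store = MVCCStore()
store.write("x", "old")            # committed as version 1
snapshot = store.current           # a reader starts here (snapshot = 1)
store.write("x", "new")            # a later writer commits version 2
print(store.read("x", snapshot))       # old  (reader's consistent view)
print(store.read("x", store.current))  # new  (latest committed value)
```

Notice that the writer never blocked the reader: the reader simply keeps seeing the version that was current when it started, which is the core MVCC trade, extra storage for old versions in exchange for fewer lock waits.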

The next question that must be coming to your mind: we talked about the execution engine, transaction management, and all of that, but how does data actually get stored on disk? Because in the end, the data is going to be stored on disk. How does that happen? How do we retrieve the data, how does the processing happen, who is taking care of that? Yes, that is where our next component comes into the picture, and as you must have guessed, it is an extremely important component: the storage engine. As the name suggests, it takes care of everything related to storage. Let me give you some context about it; we are going to discuss the storage engine in a lot more detail in the coming videos, but here is some quick context for a basic understanding. Firstly, we need to store data on disk, so obviously there is going to be a disk storage manager, and this is where the knowledge of B trees, B+ trees, and LSM trees comes into the picture. Why do we need them, and why are they so commonly used in databases? We will discuss that in a lot more detail, but just understand this for now. See, a disk is basically a circular platter. It is actually cheaper than, say, our RAM, so that is why we store data on it, and we also get persistence: if we store something in RAM and it crashes, the data is gone. So in order to have persistence, and to store lots of data cheaply, disk is used. Now, what does this look like, and how is data accessed? The disk is divided into circular tracks and pie-shaped sectors, and every small block of it is managed as a page. Whenever you want to access something on disk, the entire page needs to come from the disk into memory; then you can do whatever you want with it, and if you want to update it, you need to process the entire page and put it back. This is how it works at a higher level; trust me, we are going to discuss it in a lot more detail. The main thing you should understand is that you cannot process things directly on disk: you need to bring a page into the buffer, process it, do whatever you want, and then put it back. So this is where we need another component, which is called the buffer manager.
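The bring-a-page-in, modify it, write-it-back cycle can be sketched as a tiny buffer manager over a file of fixed-size pages. The 4 KB page size matches the typical figure discussed here; everything else (file name, pool structure) is invented for illustration, and real buffer managers add eviction, pinning, and dirty-page tracking on top.

```python
import os, tempfile

PAGE_SIZE = 4096  # typical fixed page size

class BufferManager:
    def __init__(self, path):
        self.path = path
        self.pool = {}  # page_number -> bytearray held in memory

    def fetch(self, page_no):
        # A page must be brought from disk into memory before use.
        if page_no not in self.pool:
            with open(self.path, "rb") as f:
                f.seek(page_no * PAGE_SIZE)
                data = f.read(PAGE_SIZE).ljust(PAGE_SIZE, b"\x00")
            self.pool[page_no] = bytearray(data)
        return self.pool[page_no]

    def flush(self, page_no):
        # Write the whole (possibly modified) page back to disk.
        with open(self.path, "r+b") as f:
            f.seek(page_no * PAGE_SIZE)
            f.write(self.pool[page_no])

path = os.path.join(tempfile.mkdtemp(), "data.db")
with open(path, "wb") as f:
    f.write(b"\x00" * PAGE_SIZE * 2)  # a file with two empty pages

bm = BufferManager(path)
page = bm.fetch(1)      # page 1 comes into the buffer pool
page[0:5] = b"hello"    # modify it in memory, not on disk
bm.flush(1)             # write the full 4 KB page back

with open(path, "rb") as f:
    f.seek(PAGE_SIZE)
    print(f.read(5))  # b'hello'
```

Note that even a 5-byte update moved a full 4 KB page in each direction; that page granularity is exactly why minimizing page reads matters so much later, when indexing comes up.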

Now, a lot of people get confused about the difference between the buffer manager and the cache manager, so let me make things clearer. See, caching is something we do to make queries faster, and it is something we can do at all levels: you must have heard of caching in front-end systems and in the back end, and in a DB it happens in the execution engine, as we have already seen. There can be different mechanisms for caching, like LRU, LFU, and so on and so forth; it is a broad concept whose job is to make executions and queries faster. The buffer manager, on the other hand, exists so that we can deal with the data that is on disk. How do we do that? We first take out that data in terms of fixed-size pages. Everything you store on disk is in fixed-size pages, say 4 KB, so you bring the entire 4 KB into the buffer, deal with it, and then put it back. That is why you need a buffer manager. So there is a disk storage manager, and there is a buffer manager.
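As a minimal sketch of one of the caching policies just mentioned, here is LRU ("least recently used") built on Python's OrderedDict. This shows only the eviction idea; a real database cache manager is far more involved, and the cached "query results" here are made up.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("q1", "result-1")
cache.put("q2", "result-2")
cache.get("q1")              # q1 becomes most recently used
cache.put("q3", "result-3")  # over capacity: q2 is evicted
print(cache.get("q2"))  # None
print(cache.get("q1"))  # result-1
```

LFU is the same shape with a different eviction rule: instead of the least recently touched entry, it drops the least frequently touched one.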

Now, the thing is that dealing with the disk takes a lot of time; it is a very expensive operation, especially when we have millions or billions of records. A very important thing that needs to be done is to make this fast, and this is where a lot of the discussion around databases happens: which storage engine a particular database is using. You must have heard, for example, that for MongoDB the storage engine is called WiredTiger. Storage engines have evolved over time because this is a very core concept of a database: how fast or slow your database is really depends on the storage engine, on how fast it can retrieve information and how fast it can update information on the disk. That is where the idea of making retrieval as fast as possible comes into the picture, and to do that you make sure that the minimum number of reads is required from the disk and that you can get the particular data you are looking for very quickly. This is where indexing comes into the picture. We will discuss it in a lot more detail in the next video, so make sure you subscribe and don't miss that video, because it's going to be an amazing one. Your entire picture of why we need B trees, why so many databases use B trees, why indexing, and how much it actually improves efficiency, we will cover all of that, and I will actually show you by how many times we improve efficiency. But over here, the main point is that we need an index manager as well. In short, the storage engine is the core of our database, and it takes care of disk storage management, buffer management, and index management. It takes care of a lot of other things too, but these three are its main functionalities.

Coming to our lowest component in databases: when dealing with the disk, we have to deal with disk files, and the OS plays a very important role here. We might be using Mac, Windows, Linux, and so on, and every OS is going to have its own system calls. For example, if you are doing file.open, the system call is going to be different on Mac, on Windows, and on Linux, and our DB is supposed to support all of them, so it is supposed to have an interface. So the lowest layer, which we can call the OS interaction layer, or which you can see as a file system interface, is nothing but the part that makes the system call according to whatever OS we are using.

Now, you can see I have left some space over here. Does this mean that I missed one component in between? Not really. See, the components we have covered so far are together enough to understand the general components in one DB server, but the truth is that in reality it is rarely just one DB server. Nowadays we want our DBs to be highly available and scalable; we want to be able to store millions and billions of records, and we obviously cannot store everything on one DB server. There have to be multiple DB servers, so in order to handle DB server distribution and take care of the distributed components, there is another set of components we can talk about. See, as our DB grows and the amount of data grows, say initially there were lakhs of records and then it grew to millions, it will not be possible to store it all within just one DB server, just one disk. So it makes sense to have multiple DB servers, and we will divide our entire data into parts, or we can say into shards. So we need shard management: a shard manager which will decide, based on some factors, which shard a given piece of data is going to go into. Now, this will make sure that our DB is scalable, but we also need to manage these multiple nodes. There are going to be a lot of servers together, and all these servers together can form one big cluster, so we will need a separate cluster manager for that. We talked about scalability, but availability is again extremely important; we always have a trade-off, consistency versus availability, the CAP theorem, which is very important. When we talk about availability, the first thing that comes to mind is replication, so obviously in our DB we also have replication, which is extremely important and which we talk about a lot, and we need a replication manager for that. All three of the components I have just written make sure that our DB becomes distributed and that we can handle high availability and scalability.
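The shard manager's core decision, which shard does this key go to, can be sketched with simple hash-based sharding. The number of shards and the key format are made up for the demo; real systems often use consistent hashing or range-based sharding instead, so that adding a shard does not reshuffle every key.

```python
import hashlib

NUM_SHARDS = 4  # assumed cluster size for this demo

def shard_for(key: str) -> int:
    # Hash the key and map it onto one of the shards. Using a stable
    # hash (not Python's built-in hash(), which is salted per process)
    # keeps placement consistent across processes and restarts.
    digest = hashlib.sha256(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# Every key deterministically lands on exactly one shard.
for user_id in ["user:1", "user:2", "user:3"]:
    print(user_id, "->", "shard", shard_for(user_id))
```

The cluster manager's job then starts where this function ends: knowing which physical server currently holds each shard, and the replication manager keeps copies of each shard on other nodes for availability.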

Congratulations, you just understood the general components and the general internal architecture of a DB. As I said earlier, it is going to vary slightly across all the databases, of course; each database is designed in its own way, but this will be a great starting point for you. You will be able to recognize which component we are talking about, how the data is flowing, and what needs to happen in any DB. So let's do a quick revision to put it all together just once.

Our client first needs to talk to our DB system, so there needs to be communication between the client and our DB server; our network layer takes care of the communication between the client and the rest of the components. Then there is the front end, which is responsible for converting whatever query is coming to us into something that the rest of the components, which together form the back end of the database, can process and understand. The query that comes in is in some command or instruction format; in terms of SQL databases, for example, it is going to be a SQL query, and it varies per database. So the first thing we do is tokenize the query to understand what it means; then we check whether the tokens together follow a particular grammar rule and form a valid instruction, so we parse it; and then we optimize it and find the most optimal way of dealing with that particular query. Then our front end passes the resulting instructions to our execution engine. The execution engine makes sure our query executes properly, and it also takes care of cache management and utility services like auth, backups, metrics, and so on.

Then we need to take care of our transaction management. All of these components together take care of transaction management, within which we make sure that, if needed, we follow the ACID properties: atomicity, consistency, isolation, and durability. There is lock management, which helps us take locks at different levels; there is recovery management, which helps us deal with crashes within the DB; and there are transaction management and concurrency management. As we discussed, lock management and concurrency management work together to help us take care of multiple transactions at the same time; the concurrency manager is broader and sort of uses the lock manager, and one very common technique we will be discussing in the course is MVCC, multiversion concurrency control.

Other than this, we talked about the components that make sure our DB is highly available and scalable by taking care of the distribution of DB servers. We shard our data based on some particular parameter, so one component takes care of sharding, one takes care of cluster management, and one makes sure we have a backup of the data by replicating it, so replication management. Then comes another very core component: the execution engine is very important, and the storage engine is again extremely important. It takes care of how we really store the data on disk and how we process it: how we take the data from the disk into the buffer, how we process it, how we make this process as fast as possible, and how we make retrieval as fast as possible by doing indexing. Then we know we have to deal with disk files, so there is one OS layer which helps us deal with the different OSes like Mac, Windows, Linux, and so on. This is how our general DBs are going to look, and I hope you're excited to start diving into DBs individually. Again, we're going to be starting with SQLite.

We have optimized from 100 seconds, to 250 milliseconds, to 3 milliseconds. We have seen the number of comparisons required for 1 million records, so you can see this is the most efficient solution; there is a huge improvement as the number of entries grows, and that is what the B tree does.
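The efficiency claim can be checked with quick arithmetic: a linear scan of 1 million records needs up to 1,000,000 comparisons, while a balanced search structure needs on the order of log2(1,000,000), roughly 20. (The exact counts for B trees depend on their branching factor; this is just the order-of-magnitude picture.)

```python
import math

n = 1_000_000  # records

linear_scan = n                          # worst case: compare every record
balanced_tree = math.ceil(math.log2(n))  # height of a balanced binary search

print(linear_scan)                   # 1000000
print(balanced_tree)                 # 20
print(linear_scan // balanced_tree)  # 50000
```

Twenty comparisons versus a million is why the improvement grows with the number of entries: doubling the data adds just one more level to the tree.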

Why do so many databases use B trees and B+ trees? Are these really more efficient than, say, BSTs or arrays? And if they are more efficient, how much more efficient? Is it okay to use BSTs or arrays in the case of databases, or are B trees really necessary? How do B trees even work internally, and can we do a dry run to understand them in a simple way? What is the difference between B trees and B+ trees; are they the same? Yes, we are going to cover all of this in this video. Welcome to another video in the free course Databases In Depth by Edu Courses, where we not only discuss databases in theory but also debug their code. In the next video we are going to be starting with one of the most common databases, which is also one of the simplest databases in the world: SQLite. But before getting started, it is very important that you get your basics right, and that is why this is an extremely important video. Let's get started.

Whenever we think about storage, we think of two solutions: one is RAM and the other is the hard disk. Now, you might think this is very basic, but it is important to get our basics right and build up the use case, so just stay with me. What about RAM? RAM, as we know, is temporary, which means that if it crashes, it's gone; we lose our data. It is volatile, which is another term meaning temporary. It is also expensive, and because of that we have limited space in RAM. What does that mean? If our database is growing, if we have terabytes of data, we cannot store it in RAM; RAM is very, very limited, and it is expensive. That is where the hard disk comes into the picture. Why? Because the hard disk gives us persistence: we are able to store the data permanently, and there is no limitation as such, since we can keep adding hard disks and store terabytes of data. It is also very cheap. Then the hard disk is the way to go, it is awesome, right? Not really, because it takes more time for every IO operation. For a simple comparison, let me just tell you that one simple IO operation in RAM might take some nanoseconds; obviously the exact time varies, but say 1 to 10, or at most 100, nanoseconds. In a hard disk, the same one IO operation is going to be in milliseconds; yes, literally one IO operation can take, say, 1, 10, or 100 milliseconds. So we know that each IO operation on a hard disk is expensive and takes some time, but it is very important for us to understand why.

why is it so how does a hard disk look

like how does it operate so hard disk is

nothing but a circular disk platter, divided up in two ways. First, there are concentric circles, and each concentric circle is called a track. Then the disk is also divided into pie-shaped pieces, and each of those is called a sector. Now, what does the intersection of a track and a sector give you? Take one track and one sector: their intersection is one particular block; another track with another sector gives another block. So the intersection of a track and a sector gives us a file block. Typically a file block on disk is 4 KB in size — it varies with the disk, and can also be, say, 16 KB or 512 bytes, but typically it is 4 KB.

So whenever we do an I/O operation — a read or a write — we always read and write in terms of file blocks. What do I mean? Suppose the data we want sits somewhere inside one file block, with other data to its left and right within the same block. We cannot pull out just our piece of data: in order to process it, we have to bring the entire disk block from the hard disk into RAM. The CPU can only access data that is in RAM; it cannot directly process data stored on the hard disk. So to process anything, we take the whole file block from disk to RAM, process it there, and, if an update is required, write the block back to the hard disk. That is how it works.
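This block-at-a-time behavior is visible even from user space. A minimal sketch (the 4096-byte block size and the temp file are illustrative assumptions; `os.pread` is POSIX-only):

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assume 4 KB file blocks, as in the lecture

def read_block(fd, block_no):
    # os.pread fetches one whole block; you cannot ask the disk for
    # "just these 400 bytes" any more cheaply than for the full block.
    return os.pread(fd, BLOCK_SIZE, block_no * BLOCK_SIZE)

# Demo: write two blocks to a temp file, then fetch block 1 in one read.
with tempfile.TemporaryFile() as f:
    f.write(b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE)
    f.flush()
    block = read_block(f.fileno(), 1)

print(len(block))  # -> 4096
```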

But let's understand the impact this has on us with an example. Suppose we are dealing with 1 million records. With SQL data, each record is a row; with a NoSQL DB — for example, MongoDB — each record is a document. So a record can be a row or a document; in our case let's just say each record is an entry, and suppose each entry takes 400 bytes. That is plausible for a large document: some strings taking 50 bytes each, four or five integers at 4 or 8 bytes each, and so on, adding up to roughly 400 bytes. So: 1 million records, 400 bytes per entry.

The first thing to understand is that one block can store up to 4 KB of data. How many records is that? 4,000 bytes divided by 400 bytes: one block stores 10 records. So the layout might look like this: records 1 to 10 in one block, 11 to 20 in the next, 21 to 30 after that, then 31 to 40, and so on — every block storing 10 records. I am assuming every file block stores up to 4 KB and every entry takes 400 bytes; this is just an example to make things easy to understand.
in order to read 1 million records in

order to read 1 million records let's

see how much time would it take so if we

have 1 million records that is 10 power

records right how many blocks would we

have to read number of blocks to be read

would be 10^ 6 / 10 because every block

is storing 10 records so how many blocks

are required 10/ 6 by 10 which is 10 5

so we have to read 10 power 5 blocks to

go through 1 million records in order to


go through 1 million records we are

reading 10 power five blocks and we said

that in a hard disk one IU operation

takes say 1 millisecond it can also take

10 milliseconds but let's say it is

taking 1 millisecond okay if it is

taking 1 millisecond that means that

this is going to take 10^ 5

milliseconds that means it is going to

take 100 seconds

to read 1 million records which is huge
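The full-scan estimate above can be reproduced in a few lines (all the sizes are the ballpark assumptions from the lecture, not properties of any real disk):

```python
# Ballpark full-scan cost, using the lecture's assumed numbers:
# ~4,000 usable bytes per block, 400 bytes per record, ~1 ms per disk I/O.
RECORDS = 10**6
RECORD_SIZE = 400
BLOCK_SIZE = 4000
IO_MS = 1

records_per_block = BLOCK_SIZE // RECORD_SIZE   # 10 records fit per block
blocks_to_read = RECORDS // records_per_block   # 10^5 blocks hold all data
scan_seconds = blocks_to_read * IO_MS / 1000    # one I/O per block

print(records_per_block, blocks_to_read, scan_seconds)  # -> 10 100000 100.0
```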

So, just to read 1 million records from the hard disk, we would take 100 seconds — and we should obviously try to do better; that is the problem we are trying to solve. I only took an example, but 1 million records is a very common thing in databases, 4 KB is the standard block size, and your documents really can be this big. Even at 40 bytes per record instead of 400, a 10-second scan is still a lot of time. So we want to minimize the time taken to go through the records, and how we can do that is the problem we are going to solve. Before we move on, I would just

like to take a minute here to tell you: if you're preparing for interviews, or looking to become a better software engineer, I would love to be part of your learning journey and make it easier, faster, better, and more fun. I teach live, plus you get access to the recordings, so you can learn at your own pace with a consistent teaching style, because I teach all the HLD, LLD, and DSA project-based courses, where we cover industry-level projects like YouTube, WhatsApp, and Zerodha, along with DevOps, and we also have C++. If you sign up for any of our courses, you get lifetime access to all the batches — past, present, and future. You also get access to our mock-interview platform: you can come and practice any time, and we will guide you; we obviously have doubt support as well. We are putting our heart and soul into teaching, so if you're interested, at least check out the site — the detailed curriculums, the testimonials, and the answers to all the general FAQs are all there; educourses.com is the site. And if you still have any questions, feel free to reach out to us at support@educourses.com; we will also guide you on what is better for you. At least check out the testimonials:

"The best thing I like about Edu Courses is that the courses are live." "The way she teaches and the way the courses are structured are actually the best part — the live classes and the very helpful community." "I feel very confident compared to what it was before these courses." "As a software engineer, my experience at Edu Courses has been really great; I feel more confident as a software engineer after enrolling." But for now, let's get back to the

video. Let's consider a very simple solution. I have drawn a disk here, with the records laid out just as we saw: 1 to 10 in one block, 11 to 20 in the next, then 21 to 30, and so on. Now we are going to add another table. Each entry in it says: starting from record 1, up to where the next entry begins (say 11), the records are in this block; from 11 to 20 they are in that block — that is, each entry holds a pointer to one particular data block. So if I just go through this table, I know exactly where an entry is, instead of going through all 10^5 data blocks. But how many blocks do we actually have to go through now? This table itself will live on disk, so it takes some space — some extra file blocks of its own — and we have to check whether it is really more efficient. Let's do the calculation and see — approaching
calculation and let's see so approaching

the problem in the similar way so first

thing that we have to see is that in one

particular block how many entries can we

store so one block can store 4,000 uh

bytes right obviously this is all rough

estimation by the way so when I say 10

records it is roughly because it is 4 KB

so I'm considering it 4,000 bytes

obviously this is all rough estimation

okay just uh I hope you understand that

much at least so let's see that how many

entries can we store let's say that each

entry in this particular table is taking


say 10 bytes good enough estimation

maybe it would be 12 or something but

four and8 so maybe we can take as 10

bytes okay so if each block can store up

to 4,000 bytes and each entry in this

table take 10 bytes so that means that

each block can store how many entries of

the table that means 400 entries so

every block can store up to 400 entries

of this table versus if we consider the

data entry how many entry was that so

per block we were storing only 10

entries because that was a huge data

entry right it was of 400 bytes over

here it is of only 10 bytes so each

block in this case is going to store 400

entries of what of this table now again

how many such entries are going to be

there so that depends on how many blocks

are there

correct because how many do we need how

many blocks were there in the data and

how many blocks were there 10 power 5

blocks correct so we need the table for

So this table has 10^5 entries — one per data block. The next question: how much time does it take to read this entire table of 10^5 entries? (Again, why 10^5: the entries point to the file blocks that store data, and 10^5 blocks store data.) Since 400 entries fit in one block, the 10^5 entries split into chunks of 400 — 400, then 800, then 1,200, and so on — each chunk stored in one block. So the number of blocks for the whole table is 10^5 / 400 = 250: we go through 250 disk blocks to scan this entire table. And once we have gone through it and found which data block holds what we are looking for, we read that one data block too — one extra millisecond, don't forget it. So: 250 blocks at roughly 1 millisecond each is 250 ms, plus 1 ms for the data block — call it roughly 250 milliseconds to find something among 1 million records. We have optimized from 100 seconds to 250 milliseconds, which is huge. How? By adding this one table. This table is called an index table — and you just understood what indexing is, how it works, and why it is more efficient.
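The whole single-level index estimate can be checked in a few lines (all sizes are the lecture's rough assumptions — 10-byte entries, 4,000-byte blocks, 1 ms per I/O):

```python
# Cost of a lookup through the single-level index (lecture's rough numbers).
DATA_BLOCKS = 10**5        # blocks holding the 1 million records
ENTRY_SIZE = 10            # bytes per index entry (assumed key + pointer)
BLOCK_SIZE = 4000          # usable bytes per file block
IO_MS = 1                  # milliseconds per block read

entries_per_block = BLOCK_SIZE // ENTRY_SIZE        # 400 index entries/block
index_blocks = DATA_BLOCKS // entries_per_block     # 250 blocks for the index
lookup_ms = index_blocks * IO_MS + 1                # scan index + 1 data read

print(entries_per_block, index_blocks, lookup_ms)   # -> 400 250 251
```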

I am sure the calculations may have overwhelmed some of you. I am going to quickly summarize, and it will be completely clear — just be patient with me. And if you still have doubts, go back and try the calculations yourself; it would be amazing if you could work out for yourself how we optimized from 100 seconds to 250 milliseconds, because this is real stuff we are understanding. Okay, so let's quickly summarize: first, we
okay so let's quickly summarize first we

saw that we have 1 million records we

said that every record is roughly taking

400 bytes and because every file block

can store 4 KBS we can we saw that one

block is going to store 10 records that

means that the number of blocks that we

are going to require to store 1 million

records are going to be 10 far five

blocks which is going to require us 100

seconds so how many blocks are used to

store all the 1 million records that is

10 power five blocks so on the disk 10

power five blocks are used to store all

the data then we said let's optimize the

time and that is why we drew this index

table now in this index table what did

we see we saw that because each entry is

much smaller it's not like 400 bytes

right because each entry is much smaller

say 10 bytes here we can store 400

entries in one block because we can

store 400 entries in one block that

means this these entire entries are

going to go in one block these entries

are going to go in another block these


entries are going to go in another block

these entries are going to go in another

block and so on so we saw that 250

blocks are needed for the index entri

firstly if you still have the doubt how

many entries are going to be there in

this table 10 power five entries why see

10 power five blocks are used to store

the data what are we pointing to the

blocks that have data right how many

blocks have data 10 power five so that

is why 10 power 5 entri so first of all

this should be clear 10 power five

entries why because we are pointing to

data how many blocks are pointing to

data 10 power five blocks I'm again and

again repeating so that you don't have

any doubts I'm sorry if you're getting

getting annoyed by my reputation just

making sure that there are no doubts

okay now if you also want to do the

cross calculation you can see that see

uh 400 entries and then we I'm saying

250 blocks right so this will be 10

power 5 right so this is 100 and then

three more zeros so 10 power five so

that is why see this is one block on the

hard disk this is one five block on the

hard disk this is one F block on the

hard disk and so on so that is how we


optimized from 100 seconds to 250

milliseconds but now the question is can

we optimize this even further let's see

if it is possible so what I am going to

do now is I'm going to create yet

another

table and I'm again going to store index

only in this but now this is going to be

pointing to one block that is storing

our index entries and not the data

entries so there are 2 power five blocks

for data 250 blocks for IND X entries

right so how many entries are going to

be in this table 250 entries correct so

this is pointing to this block this

block this block so on how many blocks

are there 250 blocks so there are going

to be 250 entries inside this right

let's see how much space is this table

going to require so again the space for

each record is same only as over here

right what did we see 10 bytes so if we

come back over here and so 10 bytes for

every entry and then there are 250

entries right so how much space do we

need to store this entire table 250

entries into 10 bytes which is just 2500

bytes which is actually less than 4 KB

and this entire table is not even going


to require one entire file block size

correct so now if we have to find

something we can just take out what

whatever one file block is there for

this one table we will go through this

then we we will see that okay if we

looking for something over here we will

see that where is this particular block

then we will get that one block over

here and then we will go through this

one block and we'll see where exactly is

the data so how many IO operations are

we doing with the dis 1 2 and three so

three IO operations each taking 1

millisecond so now we have optimized to

what 3 milliseconds so we went from 100

seconds to 250 milliseconds and now to 3

milliseconds by just adding one more

table and this my friend is called

multi-level

indexing and guys we didn't just

understand the concept of multi level

indexing but we actually also do our

very first P Tre and let me show you how
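Before that, the multilevel-index cost we just derived can be written as a tiny loop: keep adding index levels until the top level fits in one block, and a lookup then costs one I/O per index level plus one for the data block (sizes are again the lecture's rough estimates):

```python
import math

def index_levels(data_blocks, entries_per_block=400):
    """Count index levels needed until the top level fits in one block."""
    levels = 0
    blocks = data_blocks
    while blocks > 1:
        blocks = math.ceil(blocks / entries_per_block)  # one entry per block below
        levels += 1
    return levels

levels = index_levels(10**5)   # 10^5 data blocks -> 250 -> 1: 2 index levels
lookup_ios = levels + 1        # read each index level, then the data block
print(levels, lookup_ios)      # -> 2 3
```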

This is going to be very interesting. I am going to copy this diagram to a new page, paste it, and rotate it — and this is exactly a B-tree. Look: there are many keys, and there are child pointers; the child pointers point to other nodes, which in turn contain pointers of their own. At a high level, this is what a B-tree looks like. We are going to go into a lot of detail, and we will also do a dry run, but before that, I think it is very important to answer a very important question: is the B-tree the only solution? What if we used other data structures that are commonly used?
structures that are commonly used so

what we're going to do is we're going to

draw a table and do proper comparison we

going to take the example of 1 million

records and we're going to compare the

time complexities so we are going to be

comparing using three most common DB

operations that we perform that is we

usually perform search we perform insert

and then we perform deltion correct so

we are going to consider these three

operations and then we are going to

consider all the data structures one by

one the most common solution that is

going to come to your head like if you

did not know about B would be balanced


PST why because it is the most common

solution right it offers us logarithmic

time for searching insertion and

deletion so if we consider 1 million

records what would be the answer so if

we have 1 million records log base 2 10^

6 this roughly comes out to be 20 so

that means we would have to do 20 IO

operations with the hard disk for search

insert and delete how it would look like

is like suppose we have like a root and

we know that where it is on hard disk we

get this and then we will compare and

we'll see that okay is it on the left or

on the right and then we going to say

that oh it is on the left then we are

going to access another one again from

the DB right so like this we will have

to do how many IU operations because for

every single node we will have to do one

IU operation right so it there is going

to be 20 IO operations for 1 million

records obviously unbalanced BST would

be worse than this but let's do a quick

comparison for that too, so that it is completely clear and no confusion is left. What is the difference between a balanced and an unbalanced BST? If we store 1, 2, 3, the correct balanced way is 2 at the root with 1 and 3 as children; but the chain 1 → 2 → 3 is also a valid BST — just not balanced. So with 1 million records, an unbalanced BST could end up storing them essentially linearly, and in the worst case searching would take 10^6 operations — 1 million disk operations just to search through 1 million records, which simply doesn't make sense. In that fully skewed worst case, insertion and deletion are also 10^6, so this is obviously not an efficient approach. Now let's do the comparison for
efficient approach now let's compare for

a sorted array because now you might be

thinking that KY what if we just store

all the data in a sorted way one by one

so basically in a disk how we would

store is 1 2 3 like for the ID we would

just store like this let's see what

would be the complexities in this case

see search would be very simple it would

be binary search like balance BST itself

which would have logarithmic time so 20


IU operations just like over here right

because it is sorted so searching

becomes very efficient but now insertion

and deletion is a bit tricky now let me

tell you how if you have blocks like

this which are sorted if you want to

insert a new block over here what you

would have to do is you would have to

make space for it and you would have to

shift these blocks you would have to

shift these elements right so in the

worst case you might end up shifting all

the elements so in the worst case you

would have to actually do 10^ 6 which is

1 million operations for both insertion

and deletion similarly for deletion also

right if I'm deleting this I would have

to shift all the other elements or all

the other blocks if you want to store it

in sorted order and if you want to

insert in between you would have to ship

the rest of them correct that is how it

works so this is an interesting point to

keep in mind when we talk about

databases that operations are not

similar to data structure operations

that we usually consider in Ram here we

are talking about hard disk and we have

to think that every IU operation is

expensive and finally it's time that we
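To see concretely why insertion into a sorted array is the expensive part, here is a toy sketch that counts how many elements have to shift to make room — on disk, every shifted element can mean rewriting another block:

```python
def sorted_insert(arr, x):
    """Insert x into the sorted list arr in place; return how many
    elements had to shift right to make room."""
    i = len(arr)
    arr.append(None)              # grow by one slot
    while i > 0 and arr[i - 1] > x:
        arr[i] = arr[i - 1]       # shift one element right
        i -= 1
    arr[i] = x
    return len(arr) - 1 - i       # number of shifted elements

arr = [1, 2, 4, 5]
shifts = sorted_insert(arr, 3)
print(arr, shifts)  # -> [1, 2, 3, 4, 5] 2; inserting at the front shifts all n
```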


Finally, it's time to talk about our star: the B-tree. In a B-tree, there is a minimum and a maximum number of keys that every node can have — as we saw, a node holds multiple keys and has multiple child pointers. A B-tree is a special kind of m-way tree (we will look at this in more detail): where a binary tree has up to two children per node, an m-way tree can have up to m — which makes sense when we have a lot of data. In a B-tree of order M, the maximum number of children any node can have is M. There are ways to prove this, but to keep things easy for now, just take it as given: search, insertion, and deletion in a B-tree cost log of n to the base M — in our case, log of 10^6 base M. If we consider a B-tree whose nodes can have at most 100 children, that is log of 10^6 base 100, which equals 3. So we would need essentially just three I/O operations for search, insertion, and deletion — and we actually saw this in our example: we did the calculations and found that with 1 million records, three I/O operations can be enough for a search. So this really is possible, and now the comparison is complete. For the proper table, let me just draw

draw the proper table let me just draw

it properly and show you I created this

nice table for your reference you can

either take a screenshot or you can

share it on LinkedIn or anywhere that

you studied something nice if you're

interested otherwise let's do a quick

recap a quick summary of whatever we

understood so far see we have seen

number of comparisons that are required

for 1 million records so in a balanced

PST everything is logarithmic so log

base 2 of 10^ 6 is roughly equal to 20

it is 19 something so search insert and

delete will require 20 IU comparisons on

the hard disk versus an unbalanced BST

because it is unbalanced it can be

completely skewed so the height of the

bstd can actually be 10^ 6 so the number

of operations will be 10^ 6 itself which


is 1 million in a sorted array for

search you can do binary search but for

insertion and deletion you will need

shifting of the elements that is why it

will take 1 million IU operations versus

for a b tree of order 100 what does

order 100 mean it means that any node

can have maximum 100 children then we

can perform search insert and delete in

just three IO operations so you can see

this is the most efficient solution

after this there is definitely balanced

BST but if you think there is a huge

Improvement and that is why B3 is used

by so many databases now
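The two logarithms behind the table can be checked directly (this only verifies the I/O counts; the 1 ms-per-I/O cost is still the assumed constant):

```python
import math

N = 10**6  # records

bst_ios = math.ceil(math.log2(N))       # balanced BST: one I/O per level, 19.93 -> 20
btree_ios = round(math.log(N, 100))     # B-tree of order 100: log_100(10^6) = 3

print(bst_ios, btree_ios)  # -> 20 3
```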

Now it's finally time to understand the structure of a B-tree in detail. Unlike a BST, where every node has only one key, here a node can have multiple keys. In a BST, if the root node holds the key 20, a search for something less than 20 goes left, otherwise right. But in a database, even when you are searching for a single key, you anyway have to fetch the entire file block from disk — that is the basic fact we need to understand. So instead of one key per node, why don't we store multiple keys in a node? How does this help, given that we now need more comparisons? Let me explain in detail. Suppose the keys stored in the root are 100, 200, 300, 400. Then the child pointers cover the ranges: key values less than 100; between 100 and 200; between 200 and 300; between 300 and 400; and more than 400. So one child might hold 10, 20, 30, 40; another 110, 150, 180, 190; and so on. Here too we do comparisons — in fact, per node we do more of them, because in a BST we only checked against a single key like 20, whereas here we have to check: is it between 100 and 200? Between 200 and 300? Between 300 and 400?

But even though the number of comparisons increases, our major problem is solved — and the major problem was the cost of getting a block from disk to RAM, which is much more expensive than the comparisons done in RAM. That is why this is more efficient: even though we store more keys, all of them fit within one block, and the best approach is to maximize the number of keys you fit in a block — since you have to fetch the entire file block anyway, it is better to fetch the maximum number of keys at once and then compare within it. So there is a trade-off: with one key per node we would still fetch an entire file block per step, but do just one comparison; here we do multiple comparisons per node, but we save on fetching blocks from disk to RAM — and that is why the B-tree is so much more efficient for databases.
core concept is clear let's understand

the major characteristics of a bet Tre

now when we say that a b tree is of

order M so I said this before also this


means that the maximum number of

children that any of the nodes in the B

tree can have the max number of children

is equal to M that means that the

maximum number of keys that any of the

nodes can have is equal to M minus one

why so why did this happen we did not

understand so let's see this example

here we saw that Max children are what

say five right so if the children is

five see the keys are only four which is

one less why because see here it is

pointing to all the values that are less

than 100 more than 100 and in the

between values right so even though the

keys are four the number of children are

4 plus one which is five so if we say

maximum children can be M then the

maximum number of keys can be M minus

one similarly B property says that the

minimum number of keys that any node

other than the root node other than root

node should have M by 2 C seal that

means that you round off to the next

bigger integer so you take by two and

then you round off to the next bigger

integer that is what taking seal means

now root is an exception it can have

minimum one key also so at least one key

is the is what the properties say so


root can have minimum one key but all

the other nodes should have minimum M

by2 keys so these are the main

properties if you just see the max

number of keys the Min number of keys

you should be able to understand that

this B tree will have maximum how many

keys minimum how many keys let's take

quick examples so say if M value is

equal to 4 that means the maximum number

of keys possible for every Noe is three

any of the nodes okay maximum number of

keys is three minimum number of keys

will be what 4x2 which is 2 if m is

equal to 5 the maximum number of nodes

that is possible is 5 - 1 which is 4 the

minimum number of nodes so 5 by 2 is

what 2.5 so you round it off to the

bigger integer so which is three so this

this is how you can see that the B tree

of order M can have minimum how many and

maximum how many notes now because of

these two characteristics what happens

is that as we go down one level in B

tree with every level the number of keys

that any level can store becomes

byproduct of M so it becomes M * so let

me explain what I mean so if from root I

have M keys so my next level will have M


into M keys because M can go over here M

can go over here like this m times right

so M into M similarly in the next level

it will again become M time so M into M

into M so as we go one level down our

number of keys that can be stored at

that level will become M times so if I

have to find the height of the tree

because my height of the tree is going

to be balanced a very very important

characteristic of B tree is that all the

leaves are at the same level all the

leaves are at same

level so if I have to find the height of

the tree how can we do that so if I have

to find height if the total number of

keys that I have to accommodate is n if

the total number of keys that I have to

accommodate is n then by every level if

I go up it is going to keep reducing by

m so if suppose the bottommost layer the

leaf W layer suppose it had like the

total is supposed to be n right so then

in the next layer there will be uh

suppose Here There Was X in the next

layer there will be X by m in the next

layer again it will be X by m divided by

m again so it will keep reducing by M

till we come across one right so if we

keep doing this our H will actually


become order of log n base M order of

log n base M so there are a lot of

proofs also behind this but this is the

easiest way to understand because total

number of keys are n and with every

level the number of keys that can be

stored are becoming M times so height

becomes the number of iterations while

searching or while going through the

entire tree so that is why h becomes log

of n base M and that is what we used

while figuring out the complexity for

searching inserting and deleting right

so in our case in the example that we

had taken it was log of 10^ 6 and the

order that I said was 100 that means

that every note can have maximum of 100

Keys which is easily doable in real case

scenario right so this would essentially

require just three IO operations and

And this is where we started from. I hope the characteristics and the main important points of B-trees are now completely clear. Let's discuss one more very, very interesting and cool feature of B-trees — but before that, just notice

something about a BST (binary search tree). If we have some particular nodes and have to insert a new node, what happens? We compare, and the new node is inserted below the present nodes. So we expand the tree towards the leaves — that is where new nodes are added. Versus what happens in a B-tree: once a node holds the maximum number of keys, we split it, and a new node can be created on top of it. Suppose the maximum we could insert is three and one more key arrives — a node may be added above, with a key moving up. So a B-tree grows towards the root.

Now why is this so important and so interesting? Because it matters a great deal for multi-level indexing in databases. Today we might have 100 entries in our database, tomorrow 1,000, the day after 1 million — and as the number of entries grows or shrinks, we want the level of indexing to change without having to manage it ourselves. That is what the B-tree does so beautifully. If we have 1,000 rows, we might need only one level of indexing; but as the number of entries increases, the tree creates a new node towards the root — which is exactly a new level of indexing being created. Remember how we visualized multi-level indexing: we keep adding index tables. If an extra index table is required, we add it; if it is not required, we remove it, and the existing level of indexing is enough. So this is the disk, this is the index table for it, this is the index table for that — and when we see the number of entries in the database increasing, that is when a new level of indexing is added. That is how B-trees make things a lot

simpler.

Coming to the final part of the video, let's quickly talk about the difference between B-trees and B+ trees. Obviously I could go into much more detail — and if you want me to, let me know in the comments; I would love to create a separate video on this — but for now, for your basic understanding: in a B+ tree there is some duplicated data, and it is made sure that all the keys are present in the leaf nodes, and all the leaf nodes are connected in a linked list. How does this help? In databases, range queries are very, very common, and for range queries B+ trees become more efficient than B-trees. Why? Because you can just traverse the leaf chain — okay, from 15 till 100, give me all the keys — as a simple linked-list walk. In a B-tree you would need multiple traversals, going up and coming back down, so range queries are not as straightforward. Because this is such a common operation in databases, B+ trees came into the picture, and they are very popular. The major difference, then: in B-trees, keys are present in the internal nodes as well, whereas in B+ trees all the keys are present in the leaf nodes.

"It is stored on a single file on disk." "This is how a node of the B-tree is going to look." "Getting the tokens — this is where the parsing is going to happen." "Can you see the flow: from the VDBE we call a B-tree function, and from the B-tree function we call a pager function." "If a crash happened, we would have used this journal file." "The list of dirty pages is in LRU order."

Welcome to the free course Databases In-Depth,

where we are not only going to discuss databases in theory, but also see the code in a lot of detail — and not just see the code, but debug the most common statements like INSERT, CREATE, SELECT and so on. It is my guarantee that this is not only the most detailed but also the simplest and coolest tutorial series on databases on YouTube so far. The only thing is that 77-78% of you have still not subscribed. Even though the course is completely free for you, hitting that one button is going to motivate me so much — it would mean so much to me, you have no idea. So if you could take a second to hit that subscribe button and share the series with anyone who is looking to become a better software engineer, or who is interested in software engineering, it would mean the world to me. Now, without wasting any time, let's get

[Music]

started.

Have you ever wondered what databases are used to store data on your mobile phone while you use mobile apps? I think all of you will agree that some data is stored on the phone itself — for offline usage, for example, and because you cannot make network calls for anything and everything; you do need to store some data on the front end. The same is the case for IoT devices. The basic requirement is a lightweight database: you obviously cannot use something like MySQL or Postgres, because those require a separate server running. So what do you use in such cases? In mobile apps and IoT devices, the most common solution is SQLite, and that is what this video is about. We are going to see the code in detail, but before we do, let us understand what SQLite is and why it is so famous.

SQLite is a lightweight, embedded, serverless DB. What does this mean? Lightweight: it has literally a few hundred KB of footprint — 200-300 KB is enough; the exact value depends on the OS you're using, but that's it, so it is very lightweight. Embedded: the SQLite library is embeddable in the same process as the app using it, so you don't need a separate process running for SQLite. And it needs zero configuration — literally none: just download the open-source code (which is also why we will be able to read it), compile it with any C compiler, and use it; there is no configuration required.

Talking about some cool features: it supports the ACID properties — atomicity, consistency, isolation, durability. If there are power failures or system crashes, it can recover the DB file; we will actually look in the code at how data is stored in the DB file and how this recovery happens. It is also thread-safe, taking care of DB-level thread concurrency. And it is cross-platform: it literally lets you move your DB file from one platform to another — say you created your DB file on Windows; you just copy it over to a Mac and it will work seamlessly, identically. That is so cool, right? But out of all the features I have mentioned so far, the coolest thing about SQLite will always be that it is one of the simplest databases out there — supporting so many features and still being so simple. The code and how it is structured are very interesting. But before we get into

the details of SQLite's architecture, it is very important for you to understand that databases like MySQL and Postgres are different from SQLite. Although all three support ACID transactions and are used with SQL queries, the first two are different. How? MySQL and Postgres are client-server databases: designed for large scale, for high performance and high concurrency. You must have seen them commonly used in high-level system design, in huge web apps like Netflix and YouTube, and you must have seen that we create a separate database server — because that is how they are designed. SQLite, by contrast, is designed for small-to-medium applications like mobile apps and IoT devices; it is embedded and doesn't need a separate process. So it is very important to understand the difference between them. We will be discussing Postgres in this course as well, but as we go ahead with SQLite, make sure you understand the use cases where it is a good fit.

This is the SQLite open-source code — this is what I will be cloning and showing you in detail. As you can see, there is the source code; there is an ext folder, where all the extensions live, and a test folder, where all the test-case scripts are written. Essentially, our main goal is to understand the source code. There is a lot of code, but the documentation is actually pretty awesome — I will add the links in the description, and the official documentation page is also very good. This is their official page, and if you go into the documentation it is quite detailed; whatever I explain in this video I have understood from this documentation, having gone through it in detail. Even inside the code, the documentation is thorough — I'll show it to you as we discuss each module and component; I will show you the documentation and the code as well. Let's get started.

Let's finally get started with understanding the SQLite architecture. To make things easier, we'll begin with a quick overview; then we will discuss everything in much more detail and I'll show you the code. To keep it simple, we are going to divide the entire SQLite architecture into components — seven, to be precise — understand the code component by component, and later stitch everything together while debugging. To make it even simpler, we'll group these components into two divisions. Let me explain what I mean. How do we talk to SQLite — how do we tell it, okay, do this, do that? We send it SQL commands, correct? These commands are basically in English: we say CREATE TABLE, or SELECT * FROM, or INSERT INTO, or UPDATE this, or DELETE that. So the first thing SQLite has to do is make sense of what we are saying — the commands arrive in a particular format, and for whatever instruction we give, SQLite has to generate bytecode. That is what our first division does. The second division receives this bytecode and is the part responsible for actually doing the data processing. So we are dividing our components into two divisions: the first has three components, the second has four. The first division only processes what we are trying to say and generates the bytecode; the second division picks up that bytecode and actually does the data processing. Okay, let's talk

about the first division of components first. As we said, we receive SQL commands — CREATE TABLE, INSERT INTO, and so on. We take this big command and divide it into chunks, understanding that CREATE is one token and TABLE is another, or INSERT is one token and INTO is another. Whatever command comes in, we tokenize it — so the first component is a simple tokenizer. Once we have the tokens, we have to understand, taken together: does this statement even make sense? Is it valid? What is it trying to say? Can we optimize it further? If I say INTO INSERT, does it make sense? No. TABLE CREATE? No. CREATE INTO? Doesn't make sense either. So — is it a valid statement, do these tokens together make sense, what is it saying, can we optimize it — all of this is done by our parser. So first we have the tokenizer, then the parser, and then, from the parse, we can finally do code generation, handled by our code generator. The code generator produces the bytecode, and that bytecode is interpreted by our second division.

These divisions can also be called the front end and the back end — but don't get confused: it is not the front end / back end of the huge systems we usually talk about; it is just a grouping of components. Because the first division makes sense of whatever we send, it is sort of like the front end of SQLite, and the remaining components do the processing work, which is why we can consider them the back end of SQLite. That is why I initially used the term divisions rather than front end and back end. Now let's understand the second division — the back end — of SQLite. As you must have understood by now, the


first thing this new division has to do is interpret — implement — whatever bytecode instruction is given to it. So the first component is the one that actually does the data processing. It is called the virtual database engine, or VDBE; you can also call it the VM, the virtual machine. This component is essentially the abstraction of a machine all by itself: it is the one actually doing the data processing, but it still views the data in terms of tables, rows and columns — the way we view data. Internally, of course, data is not stored as tables, rows and columns; it is stored in some other format, in the components below. That is what we'll understand next.

Before I move on, a reminder: we are going to discuss each of these components in a lot of detail and see the code, so things will become clearer — just stay patient with me; it is very interesting. This virtual machine is just a component — not a different machine; it lives in the same SQLite code — and it does the data processing. When I show you the code, you will see that it is essentially one huge switch statement: if the bytecode says select something, it actually does the selection; if it says insert, it actually does the insertion. That is where the data processing happens.

The components below it are where we understand how data is stored. The next component, which the VDBE uses, is the B/B+ tree. There is a separate video altogether where I have covered why B/B+ trees are commonly used to store indexes and tables in databases — the link is in the description; it is part of this free course. In SQLite, there is a separate B/B+ tree for every table and every index — one completely separate tree per table or index. The VDBE uses this B-tree component to insert, to delete, or for whatever else it has to do; the B-tree component in turn uses the next component, which uses the next one. Now,

before understanding the next two components, it is very important to understand how data is stored on disk. We are viewing the data in terms of a B-tree — okay, one entire table corresponds to one B-tree — but there is a lot of data inside it, so how is it actually stored on disk after all? Let me tell you first: in SQLite, whatever data is in a particular database is stored in a single file on disk. There is one file on disk corresponding to every database, and I will actually show you this — as I create a new database, a new file is created; another database, another file. If I delete the file, the database is gone. So every database is represented by a single file on disk.

Now what is a file, actually? What does it look like? It is just some data written out — some bytes, some bits, put together. What SQLite does is take a fixed size — by default 4 KB; it is configurable, but to keep the explanation simple, say 4 KB — and divide the disk file into chunks of that fixed size: 4 KB, another 4 KB, another 4 KB, and so on. In SQLite these chunks are called pages, and they are identified by page numbers. So in very simple terms: a single DB is represented by a single file on disk, and that file is nothing but an array of pages — page one, page two, page three, page four, page five, like that. That is how your disk file looks.

The next important thing to understand is that this file is on disk, and the pages are within the file. So how will processing happen? The disk file is huge; when you want to change a particular page — insert something, read something, write something — the change applies to a few pages, not to the entire disk file. So whenever you have to process something, you take out a few pages — maybe one or two — and put them into memory, into RAM; you can say into a cache. You take out the pages that need processing and put them into the cache, because your DB file is on disk, and to process anything you must bring it into memory. I will actually show you the code — and interestingly, if you have heard even a bit about caching and I ask you to name one caching technique, you will probably come up with LRU. That is exactly what SQLite uses: I will show you in the code that SQLite uses an LRU scheme to maintain its cache. We'll discuss it in much more detail, but for the overview: there is a disk file, there are pages, and whenever we want to process a few pages, those pages are taken out, put into the cache, and processed.

This processing is done by our next component, which is called the pager. The pager component is extremely important, and we're going to spend a lot of time on it — cache management is just one of the things it does. Our ACID transactions happen because of the pager; the rollback I was talking about happens because of the pager; the locking mechanism is handled by the pager. All of that is taken care of by the pager, and we'll discuss it in a lot of detail.

I know there might be a bit of confusion right now — we were talking about B-trees and suddenly we are talking about a pager; how do they connect? Let me give you some idea; it will become clearer when we go into detail. A B-tree is, in the end, a tree: there are nodes. And each node, in the end, corresponds to a page in the pager — remember, a page is identified by a page number. So when the B-tree component says, okay, get me this particular node, what it is essentially telling the pager is: get me this particular page, and process it. To visualize it: a table in SQLite is structured as a B-tree, but the B-tree nodes are stored on the disk file as pages. We have just one file; we will see how a page looks and how the pages together form the DB file, but each page is identified by a page number, and one page can be a node of the B-tree. That is how we will visualize it; we will see


the code. Coming to the last component — it might be obvious to a few of you. In the end, we have to deal with a disk file, and we perform a lot of operations on it: open the file, read the file, close the file. I will actually show you that when I say "create a new database", an open call is made to the operating system, because we are creating or opening a file; similarly we close a file — read, open, close, just as we normally do. But since SQLite is cross-platform, we need one common interface that can say: okay, this is the Windows platform, so we open the file like this; okay, this is the Mac platform, we close it like this; and so on. That is what our last component does. We call it the virtual file system, or VFS — you will see this name used many times in the code — or the operating system interface. It is an interface that says: for this operating system, this is how you open, read, or close the file, and so on.

So these are the main seven components we have to understand, and we have to see the code — but you have understood the flow, so congratulations: you now understand the overview of the SQLite architecture. From here it just becomes a lot more fun, because we will pick up each component, see it in detail, and see the code — and in the end, when I debug, say, the CREATE statement or the INSERT statement, you will see that first the tokenizer is hit, then the parser, then the code generator, then the VM, then the B-tree, then the pager, then the OS interface. That is the exact order in which the components, or modules, are called. The basics are now clear; let's get to the details. Before

we move on, I would just like to take a minute here to tell you that if you're preparing for interviews, or looking to become a better software engineer, I would love to be part of your learning journey and make it easier, faster, better and more fun. I teach live, and you also get access to the recordings, so you can learn at your own pace with a consistent teaching style — I teach HLD, LLD, DSA and project-based courses, where we cover industry-level projects like YouTube and WhatsApp; we cover full stack as well as DevOps, and we also have C++. If you sign up for any of our courses, you get lifetime access to all the batches — past, present, future, everything. You also get access to our mock-interview platform, where you can practice anytime and we will guide you; we obviously have doubt support as well. We are putting our heart and soul into teaching, so if you're interested, at least check out the site — the detailed curriculums, the testimonials, the answers to all the general FAQs — everything is there. If you still have any questions, feel free to reach out to us at support; we will guide you about what is best for you. At least check out the testimonials: "The best thing I like is that the courses are live — the way she teaches and the way the courses are structured are actually the best part: the live classes and the very helpful community. I feel much more confident as a software engineer compared to before these courses." "My experience has been really great; I feel more confident as a software engineer after enrolling." But for now, let's get back to the video.

This is the cloned SQLite project — the same project I showed you on GitHub; I've just cloned it onto my system, that's it. First let me show you some important files, so you get used to the code structure and understand how the code is written. We are in the src folder — the folder we'll be focusing on — so let's look at some important files. Here you can see os.c, and here are the btree files — the files related to the B-tree. The documentation is just amazing; the detail to which they have explained things is awesome — this is how I understood the entire code, literally going through all the files via the documentation. If you look at btreeInt.h, for example, they have literally drawn how a node of the B-tree is laid out and what is inside it; if you scroll down and read, you'll see how a page looks — each b-tree page is divided into sections, and you can see the file header, the page header and so on, explained in a really nice way. My goal is to make things simpler for you and summarize it as easily as possible. And interestingly, at the top of every single file, along with the copyright statement, they have written this blessing: "May you do good and not evil. May you find forgiveness for yourself and forgive others. May you share freely, never taking more than you give." Okay — amazing; cool. So, some very important files: the btree files will matter to us and we'll go through them in detail. What else? You can see some mutex files — and this is where our OS layer is handled, the last layer, the last component we just discussed.

This is the OS layer. Then you can see pager.c and pager.h — again, very important files; here you can see "the interface to the SQLite page cache subsystem", and here are the cache (pcache) files. Then — and we will cover this in a lot of detail — there is the sqlite3.c file. It is extremely important, the single most important file of the project, and I will tell you how it is generated and what it is; just give me some time. There is the table file, then some test files, and after that you can see the tokenizer file — that is where we will start, with tokenization. Then you can see update.c and upsert.c. And remember the VDBE, the virtual-machine component? This is the header file for the virtual database engine — that entire code is in these files. Other than that there are vdbetrace, vtab — these are the important files we will go through — a utility file, and so on. So let's get started with tokenize.c and understand the main functions that are hit during

tokenization. In tokenize.c, the most important function — it is going to be hit so many times you will be tired of it — is sqlite3RunParser. What happens here is that it receives the entire SQL string we pass in — like SELECT *, or INSERT INTO, or CREATE TABLE — and this is where the whole parsing is driven from. If we go inside — don't be overwhelmed by the amount of code — there is a very simple while(1) loop. Does it run forever? Of course not; we'll see. In every iteration of the loop it gets one token: for CREATE TABLE, first the token CREATE, then the token for the space, then the token TABLE, and so on. And for every token type there is, in effect, a switch statement — let me show you exactly what happens. You can see there are different token types — space and others; we will see them.

First, what happens in getToken? It returns the length of the token that begins at the current position. Here you can see things like CC_SPACE, CC_MINUS, CC_LP. What is CC_SPACE? It is a character class: we are just figuring out what kind of character we are dealing with — is it a digit, a dollar, a minus, a slash, a comma — or is it illegal? So this is where we classify the character we've got and, accordingly, assign a token type. And what is a token type? It can be, for example, TABLE or CREATE: you figure out that these characters together form this token. So when the parsing happens — and I will show you this literally when we debug — for CREATE TABLE you will see the identified tokens as numbers: say 17 for CREATE, then the token for the space (if I search for TK_SPACE you can see it is 184), then 16 for TABLE — 17, then 184, then 16. We will watch the parsing happen like this and identify the tokens. Then, corresponding to each token, a function is called — and this is exactly where that happens. As you can see, it is literally one huge switch statement — is it CC_PIPE, is it a comma, is it an ampersand — just assigning a token type.

back to our

parser. So now that you have a fair, rough idea of exactly how tokenization happens — we basically go through the query character by character, put runs of valid characters together, and say: okay, this is a keyword, this is a token — now we have tokens.

Next, we should understand a bit about how exactly parsing happens. After the tokenizer comes the parser, so let's look at a very, very important file: parse.y. Now you

might be thinking: Keerti, parse.c makes sense, but what is a .y file? Now let me explain. If you know a bit about parser generators this will be easier; if you don't, it's fine — let me tell you in very simple terms. There are some classic parser generators named yacc and bison. SQLite has written its own parser generator, called Lemon, which is easier and simpler, and this file contains SQLite's SQL parser — its grammar — in very simple terms. Now, what is happening inside this file, and what does it look like? I will make it very, very simple.

Just look at this one statement. Suppose this is where we are seeing the CREATE TABLE statement in the parser generator — what does it mean? See what exactly happens in parsing: we say, okay, these are the tokens, and now we need to know that these tokens must come in exactly this order — because "table create" and "create table" are different orders. So there are rules corresponding to this: if the tokens are coming in this order, if they are following this particular grammar rule, then we can call this function, or we can take this action. That is exactly what happens in parsing: if I have tokens that follow this particular grammar rule, call such-and-such function, or perform this particular action. See, here we are stating the grammar rule, in very simple terms, and we are saying: call this particular function.

Okay, and what is happening inside this function? If you see sqlite3StartTable — if I just search for it — this is what is happening over here. This is where the parser tells us: these were the tokens that were sent, and this is the function that we have to call. If I go inside this particular function, you can see this is where my StartTable work is going to happen. So there is a parser generator which is going to say: see, if tokens are coming in this order, then you call this particular function — the StartTable function — and this is where, you know, we will begin constructing a new table. And like this, there will be rules for all the statements — this one was for StartTable, and similarly for the rest. If you want to see more statements, let's

see — you can see the CREATE VIEW statement, the DROP TABLE statement, then the SELECT statement: see, a SELECT query should look like this, and after matching it you call sqlite3Select. Like this, we can keep going through a lot of statements in detail, but for now, to summarize in very simple terms: first we saw tokenization — what is tokenization? You take the incoming SQL query and divide it into tokens. And then parsing — what happens there? We have a parser generator which makes sense of these tokens put together: okay, this is the action that has to be performed; for example, we saw that StartTable is one such action. So if we come to StartTable over here and go down, there are a bunch of things happening, and I will explain them to you — we will debug this as well, so just stay with me. The first thing is that we are reading the schema; this is one crucial thing, and I will explain why it is crucial. Other than this, if you go down, you will see there are a lot of VDBE calls happening. So what was the order in which the components were called? There was the tokenizer, then there was the parser; then, before the VDBE, there should be one more thing, right? Which is the code generator. We have to make sure that whatever we are sending to the VDBE actually makes sense to it: it is bytecode. So first let me tell you exactly where that happens, and then we can see how the call to that bytecode generator happens. Let us go through that first.

For that, one very important file is prepare.c. If we come over here and go to the file: it says this file contains the implementation of the sqlite3_prepare() interface, and routines that contribute to loading the database schema from disk. And the most important structure is the sqlite3 prepared statement. Let me actually go through this — okay, let's go to this comment; this is very, very important: "An instance of this object represents a single SQL statement" — whatever statement is there, like CREATE TABLE, INSERT INTO, SELECT — "that has been compiled into binary form." So this is the binary code that we are generating and sending to the VDBE, the VM: the SQL statement compiled into binary form, "ready to be evaluated. Think of each SQL statement as a separate computer program. The original SQL text is source code." So this is exactly what our first division of components was doing, right?


The front end — what was it doing? It is just making sense of whatever SQL statement has come to us. And what is the final output? The output is this sqlite3_stmt: "A prepared statement object is the compiled object code." All SQL must be converted into a prepared statement before it can be run. And who is going to start running it? The first back-end component: the VDBE, the VM. The life cycle of a prepared statement object usually goes like this: create the prepared statement object using sqlite3_prepare_v2() — this seems like a very important function, right? So we'll look at sqlite3_prepare_v2. Then bind values to the parameters: prepare just establishes the meaning of the statement, and then we have to bind the values — for example, if there is a parameter, we say, okay, this is its value. And then we are going to run it: the actual running of the SQL happens in sqlite3_step(). So these are the very important functions: sqlite3_prepare_v2 and sqlite3_step. To run the statement again afterwards, sqlite3_reset() is called; but destroying the object happens in sqlite3_finalize(). So any statement that is run has to go through three very, very important functions that we should understand: prepare_v2, step, and finalize. Every time a particular statement is run, we should check whether these three functions are called, and it should be in this order: first we prepare the statement; then we execute the statement — this is where the running happens; and then we destroy the object — okay, it has been run, now you can destroy it.

Now that we have a rough idea of how the binary form of the SQL statement is generated, let's quickly go through the order once more. The first thing we saw was that tokenization happens — that was in the parser, right? First came getToken, so we generated the tokens. Then we saw that there is a parser generator, and it says: okay, if the tokens match this grammar rule, call this function. What was that function? Let's take the example of CREATE TABLE itself: when we create a table, it goes and calls the StartTable function — that is what the parser pointed to. Now that it has pointed there, I told you that one very important function inside it is the schema read. Let's see why it is so important and why we need to read the schema.

We are understanding the flow of CREATE TABLE right now, so focus on that. When we say CREATE TABLE, the first thing we need to make sure of is that the new table name does not collide with an existing index or table name — if I'm creating a new table, there should not be an existing table or index with the same name. So you can see: here I'm reading the schema, and then I say, find the table with this name — is it there? If it is there, then I'm going to return the error from here itself — see the error message: table already exists; maybe there's a view, maybe there's a table, maybe it already exists, right? So let's see what exactly happens inside this schema read. If I go inside it — see, there is a mutex being held, and there are a lot of things that are going to happen; I'm going to talk about the most important ones. There is one init function, and if we go further inside there is an "init one" function. See, a lot of things happen there — an in-memory representation of the schema tables is created. We will talk about this in detail, but for now, the most important thing for you to understand

is the exec call. If I go inside this particular function — see, over here, this is what is called from the initialize routine — this is where you will be able to see the statements being run. The first one is sqlite3_prepare_v2: this is where you are preparing the statement. Then this is where you are executing the statement — here you should be able to see a step call... where is it... yes, this is the step call, right? So the first call was the prepare call, then there is a step call, and after that we should be able to see a finalize call — see, there is the finalize call. So this was the order we saw before, but now you can see how and exactly where it happens. So inside exec — you can see this is the function that is responsible for executing the SQL code. Whatever SQL code we had: we first tokenized it, we parsed it, then we said okay, StartTable; and while initializing the table we are saying: execute the statements — first prepare the statement, meaning prepare the binary form of whatever SQL statement was given to us; then you are going to execute the statement; and then you can destroy it. So when we say finalize, if we go inside it you can see: clean up and delete the VDBE after execution.

Now — clean up and delete the VDBE after execution? Where did the VDBE come into the picture? Okay, so we are finally leveling up, because we are talking about the VDBE, which is in the back end — the core — of SQLite, correct? Before I move ahead, let me just quickly remind you that this video is not supposed to be easy. We are talking about the internals of a database — it is one of the coolest things as a software engineer, right? You will feel very proud, and if you tell anyone, it is just so cool that you understood the internals of a database. It is not supposed to be easy, so I am only talking about the main functions right now; but the more you go through this video again and again — and also when you watch the debugging video, the next video — things will become just so much easier. So stay patient. I do not want you to leave the video right now, because it doesn't make sense to leave here: we talked about the first division, the front end of SQLite, and we are finally getting to the core, the back end. So it doesn't make sense for you to leave right now — don't leave, okay? Stay till the end of the video. Let's understand what is happening in the VDBE — why did we get into the VDBE suddenly?

To understand, let's quickly go to the start of the file and see what's written. It says: this file contains code used for creating, destroying, and populating a VDBE — "or an sqlite3_stmt, as it is known to the outside world." What does that mean? It means that the VDBE and the sqlite3_stmt are actually the same thing. It is just that when we talk about the front end — the first division of components — we refer to it as an sqlite3_stmt, and when we come to the back end, the same thing is the VDBE. See, the VDBE, or VM, is a simple component. What is it supposed to do? It is just supposed to execute some instructions, and those instructions are in binary form — in bytecode; it is just supposed to execute them. So let me show you this and make things much clearer. If we go back


to where we called sqlite3VdbeFinalize — which was our exec path, right? Inside it there were all the prepare_v2, step, and finalize calls. So if I go back again into this step function, let me show you something very, very interesting. Can you see this? What does it mean? We are typecasting over here: so actually, the Vdbe and the statement are exactly the same thing — it is just that the front end and the back end use different terms. And if you look, this is there in all the functions over here. Let me actually show you — we are in vdbeapi.c right now — and you will be able to see this in a lot of functions. Can you see it over here, in sqlite3_reset? The same thing is happening, correct? We'll be able to see it in more places, I'm pretty sure — let me actually search for these statements so it's easier. See: it is happening in step, in the statement-explain and statement-busy functions, in finalize. You will see this happening in a lot of places, not just in this file but in other files as well: the statement is being converted to a Vdbe. This actually proves the point that a statement and a VDBE are essentially the same thing.

I think it's finally time to see how the VDBE, or VM, executes whatever bytecode — whatever prepared statement — we have. Just as normal machines use assembly language, there is one language that is very specific to SQLite's VDBE. You are not supposed to know it, and it will not run anywhere outside — even the SQLite developers say it is specific to the VDBE; it is not for the front end, it is specific to the core. So let's see what exactly happens. First, let's look at the Vdbe

structure. If we go inside, this is how the structure looks. Over here you can see the linked list of VDBEs associated with the same database connection, and something called aOp — space to hold the virtual machine's program. And if you go back, one very important structure that you are going to see is VdbeOp — let me show it to you. Let's go into it; this is very, very important. If you see, over here there is one opcode, and then there are some operands. In very simple terms, this is the operation that the VDBE is supposed to perform, and these are its operands. Suppose, in very simple terms, I have to do 2 + 5: two and five are the operands, and the addition — the plus — is the operation we have to perform. So you can think of it like this: whatever the VDBE, the virtual machine, is supposed to do is, in the end, some instructions — do this, do this, do this. Every single instruction of the VM has an opcode and as many as three operands — so there can be at most three; most operations use only two, and there's one more that is used in some operations. The instruction is recorded as an instance of the following structure — this is the operation to perform, and these are its

operands. Now you're thinking: Keerti, where does all of this happen? So if we look at the VDBE code — do you see the switch statement? Let's see how big this switch statement is, and what the switch is on: the switch is on one opcode. If you see, this is a massive switch statement where each case implements a separate instruction in the virtual machine — this is in vdbe.c — and if I show you how big the switch statement is: in the editor it runs from around line 972 down to line 29055. That is how big this one statement is. In actual, simple terms, the VDBE is nothing but a big switch statement: oh, this is the instruction to be run? I will run this, and this is how I'm going to run it. It doesn't do any optimization, nothing — it just has its own language in which it is going to implement each particular operation. So if you see over here, this is the switch case for OP_Goto. These are the operations that are possible — goto, if, if-not, and so on — and you can see there are so many operations; corresponding to each of these operations there is a case in the switch statement. You can see this, right? If I close this one, there's another case for OP_Gosub, and another case after that. It also shows, for each one: this is the operation you are supposed to perform, and this is what it looks like — "write the current address into register..." for a gosub; if it is a goto, what are you supposed to do? You're supposed to jump; then there is the return operation. So there is proper documentation, too, of how each of these is done.

So what is happening is: whatever bytecode was in the prepared statement is now divided into small, small instructions — one opcode per instruction — and then it is run, step by step, inside the switch statement: okay, this is what we are supposed to do. You will actually see some of these run a lot: OP_Halt means stop — sqlite3VdbeHalt relates internally to OP_Halt, and it means stop the machine; then OP_Integer, and so on — you can see these are all the switch-case statements, right? And what is this switch inside? It is inside VdbeExec, I think — if I just search... yes, sqlite3VdbeExec — and if we go inside, this is where it starts, correct? This is all happening inside VdbeExec, and if we see what its comment says: "Execute as much of a VDBE program as we can. This is the core of sqlite3_step()." So sqlite3_step is internally calling sqlite3VdbeExec. Now the flow is a bit clearer: when we have sqlite3_prepare, we are preparing the statement; then, when we execute, what is it calling internally? VdbeExec — that is where we are actually executing the statements. And if we want to confirm this, let's go and see where it is being called from: as you can see, it is called from sqlite3_step — inside step, the exec function is called, and this is how execution happens. You will see this while debugging: first we prepare; then, during execution, what are we calling internally? VdbeExec. So now we have moved from the front end to the back end, and the VDBE execution is going to happen.

Since we are now discussing the back-end part of SQLite — the core features — each of the components is going to be pretty huge. The pager, especially, has a lot of things to do, and the B-tree can also be a huge discussion in itself. So what we are going to do, to keep things simple, is first make sure we are able to visualize the entire flow of components from the component diagram we discussed in the SQLite architecture at the beginning. What did we see? The VDBE uses the B-tree, the B-tree uses the pager, and below that there is the OS layer. If we are at least able to see this flow happening in the code, we will have some confidence, and then we can look at each of these components in a lot more detail.

So for that, let me open one very important function for you: vdbeCommit, in the VDBE code. This is a very, very important function. Whenever changes have to be made to the database, we usually say that we are committing the changes to disk, right? If we look inside vdbeCommit, you will see there are a lot of B-tree calls that are going to happen — like BtreeEnter — but the two most important calls you should be thinking about are BtreeCommitPhaseOne and BtreeCommitPhaseTwo. We will discuss in a lot of detail what phase one and phase two are, but to give you quick context: do you remember, in the beginning we discussed that the entire DB file is on disk; we bring a few pages up into memory, then we make the changes in memory, and then we push them to disk. What you should be thinking about right now is: what if, you know, the database crashes in between? Because we are making the changes in memory, right — how do we make sure that the entire set of changes reaches the disk, and how do we support recovery in case of a crash? In order to answer all of this, we will understand phase one and phase two properly. This is a very common solution in many databases, not only in SQLite; we will talk about it in detail, but for now just notice that there are two things written over here: phase one and phase two. So the VDBE is internally making B-tree calls, right? BtreeCommitPhaseOne and BtreeCommitPhaseTwo. If I go inside BtreeCommitPhaseOne — see, internally it is calling PagerCommitPhaseOne. Can you see the flow? From the VDBE we called a B-tree function, and from the B-tree function we are calling a pager function — the component flow is happening like that. Similarly, if I go to BtreeCommitPhaseTwo and go inside it — see, from BtreeCommitPhaseTwo we are calling PagerCommitPhaseTwo. So at least that flow should be completely clear now, from the very start: first we tokenize, then we parse, then we generate bytecode — we prepare the statement, which is nothing but a Vdbe pointer itself; then the VDBE internally uses the B-tree, the B-tree uses the pager, and beneath the pager is the last layer, the OS layer. Now that you have understood this, let's finally see how the data is stored on disk and how things happen — let's understand in a lot more detail.

Since we just saw in the code that the commit happens in two phases — phase one and phase two — it is time we discuss why SQLite does it in two phases and how it actually happens. This will introduce us to a new concept which is not just applicable to SQLite — a lot of databases use it — so this is pretty important; listen very carefully. What did we discuss in the beginning? That SQLite supports ACID transactions, right? What does the A in ACID stand for? Atomicity. What does atomicity mean? In very simple terms: a transaction should either happen completely, or it should not happen at all — it should never be that a few steps of the transaction happened and the others did not. A very common example that everyone gives, and that I also give: suppose there are two people, A and B, and 100 rupees need to be transferred from A to B. There are two steps here: 100 rupees should be deducted from A, and 100 rupees should be given to B — an increment in B's account. So there are two steps, and it should never happen that the money got deducted from A but never reached B; it should also never happen that the money got incremented for B but never got deducted from A — both of these scenarios can be disastrous. We need to make sure that the transaction either completely happens or it does not happen at all.

Now let's connect this to what we saw in SQLite. What happens is that there is a disk file, and whenever we want to make some changes to it, we bring the pages into memory. In this example, suppose we have to make the transfer from A to B, and suppose there are two pages where this data lives: we bring the pages into memory and we update the values for A and B — so we did the transaction, minus 100 and plus 100, in memory. Now suppose that while we were putting these changes back into the disk file, we had decremented A but had not yet incremented B — we did just that much, and then a crash happened: maybe the disk failed, maybe the machine lost power, whatever. So in case of a crash, what do we do? When the database comes back up, it will have no idea what happened: the file would have the data showing A was deducted, but B never got the money. So what happens now? This is not the expected behavior — what do we expect? We expect atomicity from SQLite. So how does SQLite take this into account; how does it handle this scenario? To handle it, there is one more component that finally comes into the picture, which is called the journal. So there are three things: one is the cache, one is the journal, and the other is the disk file. Let's see what exactly happens in phase one and phase two. As you can see here, I have opened the documentation for commit phase one — BtreeCommitPhaseOne — and let's see

what it says. Okay — "this routine does the first phase of a two-phase commit." So there are two phases, right? Phase one and phase two — let's quickly write that down: commit phase one and phase two. What happens in phase one? "This routine causes a rollback journal to be created" — so we are creating the journal; creating the journal is what happens here — "(if it does not already exist) and populated with enough information so that if a power loss occurs, the database can be restored to its original state by playing back the journal." So the journal has the original state: when we load the pages into memory, whatever the original state is gets pushed into the journal, so that in case a power loss — a crash — occurs, the original content is there in the journal, and your disk file can always be rolled back from the journal, because the original data is already there, right? So what are we doing? If a power loss occurs, the database can be restored to its original state by playing back the journal — meaning the original data, the original state, is in the journal, and you can put it back into your disk file. This is called rollback: you are rolling back to the original state. (There is also something called writing ahead — logging the new changes instead of the old — but in this case I'm talking about rolling back.) "Then the contents of the journal are flushed to disk." When we say flushed to disk, we do not mean the disk file here — don't get confused. The journal, too, is initially written in memory — obviously, whenever you make any changes, you make them in memory first — so what you do is push the journal itself to disk. I will actually show this to you while debugging: there is a journal file that gets created, and you will see that a new journal file appears right when we make the changes. "After the journal is safely on oxide" — meaning it is safely on the disk surface — "the changes to the database are written into the database file and flushed to oxide." So you make the changes over here and then push them to your DB file. "At the end of this call, the rollback journal still exists on the disk." So after this phase — after commit phase one — the journal file is still there on disk, and we are still holding all the locks: the locks are there, and the transaction has not committed. After this comes commit phase two, the second phase of the commit process. Let's see what happens in phase two.

"Commit the transaction currently in progress" — you can say this is phase two. Let's see what is written: "This routine implements the second phase of a two-phase commit. The phase-one routine does the first phase and should be invoked prior to calling this routine." So it specifically says: first phase, then second phase. The first phase did what? It "did all the work of writing information out to disk and flushing the contents so that they are written onto the disk platter" — you made sure the changes were done in memory, and you made sure they were put to the disk as well. "All this routine has to do is delete or truncate or zero the header in the rollback journal (which causes the transaction to commit) and drop locks." So it is saying: that journal file you created — dude, now you can delete it; it is not needed anymore. See, if a crash had happened, we would have used this journal file; but because there was no crash — we are talking about the happy flow right now — we have nicely put the changes into the disk file, and after we have done that, we can delete the journal. That is what is being done over here: all this routine has to do is delete or truncate or zero the header in the rollback journal. This is the rollback journal, right? If a crash had happened, what would have happened? When SQLite came back up, it would have seen: oh, the state of the journal and the state of the disk file are not the same. So a rollback can happen: the original values are there in the journal — maybe A was deducted and B was not, so half the changes are in the disk file — and what SQLite will do is take the original state from the journal, and the disk file will be rolled back to that original state. This is how we can handle crashes, correct? "Normally, if an error occurs while the pager layer is attempting to finalize the underlying journal file, this function returns an error and the upper layer will attempt a rollback. However, if the second argument is non-zero..." — this is quite code-level, so I don't think we need to note it. This routine also releases the write lock on the DB file, and if there are no active readers, it releases the read lock as well.

So this is very, very important to understand: there are three components — one is the disk file, one is memory, which is the cache, and one is the journal.


For revision purposes and better understanding, I have also created notes, which are available for free. I'll be refining them a bit more and adding them to the course page — you can check it out; the link is in the description. I have summarized it in a very easy way. In SQLite, how does the update actually happen? There are three things, right? There is the cache, then there is the journal, then there is the DB file, correct? So what exactly happens, and how? Let's see. The cache is where the pages of the DB file are temporarily stored and modified in memory — it is in RAM, by the way. Don't get confused between this cache and the OS's own file cache, because this one is specific to SQLite.

to sq light okay when a transaction is

initiated the necessary pages are loaded

into the cash and any changes are made

here are made here first in general what

happens before any changes in the cash

are returned to the DB file a record of

the original pages is written to the

journal that we saw so that we can roll

back this ensures that if an error

occurs the changes can be undone


maintaining atomicity this is what you

need to understand the DB file once all

the changes are safely recorded in the

journal the modified pages from the cash

are returned to the DB file this is the

final step to make the transaction

permanent and this is where phase one

ends and in the phase two what do you do

you delete your Journal right so the

pager is the component that manages all

three it handles loading and modifying

pages in the cache it writes the

original pages to the journal before any

changes are made ensuring that it can

roll back if needed it manages the final

writing of modified pages from cash to

DB file ensuring data consistency and

durability so all of this is managed by

whom it is managed by P right so this is

how we are ensuring atomicity in SQ

light for reference I have added simple

notes for Journal management as well so
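As a side note, you can watch this two-phase journal lifecycle from the outside with Python's built-in `sqlite3` module; the file names below are made up just for the demo. In rollback-journal mode, the `-journal` file appears on disk while a write transaction is open, and it is gone again after the commit (phase two).

```python
import os
import sqlite3
import tempfile

# Made-up throwaway paths, just for this demo.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
journal = path + "-journal"

# isolation_level=None: autocommit mode, so we issue BEGIN/COMMIT ourselves.
conn = sqlite3.connect(path, isolation_level=None)
conn.execute("PRAGMA journal_mode=DELETE")   # classic rollback-journal mode
conn.execute("CREATE TABLE t(x)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (1)")
journal_during_txn = os.path.exists(journal)   # journal created for the write

conn.execute("COMMIT")                         # phase two
journal_after_commit = os.path.exists(journal) # journal deleted again
conn.close()

print(journal_during_txn, journal_after_commit)
```

On a standard SQLite build this should show `True` and then `False`: the rollback journal only lives for the duration of the write transaction.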

The journal is a repository of information that is used to recover a DB when aborting a transaction or a statement subtransaction, and also when recovering after an application, system, or power failure. SQLite uses a single journal file per DB: just as there is a single disk file, there is also a single journal file per DB. It assures only rollback (undo) of transactions, not redo. By the way, we will also cover what is not supported in SQLite so that you have a basic understanding, because SQLite is designed to be simple and easy, correct? The journal file is often called the rollback journal. The journal always resides in the same directory as the DB file and has the same name with "-journal" appended; I will actually show this journal file being created to you. SQLite permits at most one write transaction on a DB file at a time. It creates the journal file on the fly for every write transaction, so for every write transaction it is created, and it deletes the file after the transaction is complete. We saw phase two, right? That is when it is deleted. This follows the write-ahead logging principle; I told you, right, that this concept is not used just by SQLite but by many DBs, and we'll see it in the future as well, to ensure durability of DB changes and recoverability of DBs across occurrences of application, system, or power failures. Writing log records to the journal is lazy: SQLite does not force them to the disk surface immediately. However, before writing the next page to the DB file, it forces the corresponding journal records to disk first; this is called flushing the journal. So flushing the journal means writing the buffered journal records out to disk, not writing the DB file itself. I told you, right, that first you make changes in memory and then you flush them to disk; whenever you move changes from memory to disk, the term that is used is flushing. Now that we have covered some really interesting theory about journaling and caching, let's see some really interesting code as well. I'm sure you must be wondering how cache management happens in SQLite, so let's get to some really interesting code now. Here there is sqlite3PagerCommitPhaseOne; let's go inside it and see exactly what is happening. Firstly, we can see that pagerFlushOnCommit is happening, and all of that is fine. Here, can you see the cache dirty list? What do we mean by the cache's dirty list? Let's go inside, and if we look, we see:

"Return a list of all dirty pages in the cache, sorted by page number." What do we mean by dirty pages? Dirty pages in the cache are pages we have modified, so their state is different from the disk file. When you have made some changes in the cache, but the changes are not yet made on disk, those pages in the cache are called dirty pages. Let's see how they are managed, because there is a list of dirty pages. So let me show you

something very, very interesting. Can you all see this? This is pcache, and it says "List of dirty pages in LRU order". If you have ever heard of cache management, if you have ever read about it in either DSA or LLD, you must have come across the LRU caching question, correct? And this is how SQLite caches as well: it uses LRU order, and this is how it keeps the list of dirty pages. Let's see exactly what is happening. "A complete page cache is an

instance of this structure. Every entry in the cache holds a single page of the DB file." The B-tree layer only operates on the cached copy of DB pages. Remember, I told you the B-tree component uses the pager: the B-tree doesn't talk to the disk file. Which component talks to the disk file? The OS layer, and above that there is the pager, so the B-tree only operates on the cached copy. "The page cache entry is clean if it exactly matches what is currently on disk. A page is dirty if it has been modified and needs to be persisted to disk." This is important: if it has been modified and needs to be persisted to disk, that is called a dirty page, and it has all these states. The dirty pages are linked into a doubly linked list: you can see PgHdr, which is the header, and there are next and previous pointers, so you can actually see the entire linked-list implementation here. If we go through the file, you can see the linked-list management, the entire linked-list code for managing the dirty list, with next and previous pointers. If you do DSA, this is going to really excite you, because all of you keep asking where we see DSA in real code. See, this is database code. This is awesome, right?
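If you want to play with the idea yourself, here is a toy Python sketch of a dirty-page list kept as a doubly linked list, in the spirit of the next/previous dirty pointers you can see in PgHdr. Every name below is invented for illustration; this is not SQLite's code.

```python
# Toy sketch: a dirty-page list as a doubly linked list, with a
# move-to-front policy so the head is the most recently dirtied page.

class Page:
    def __init__(self, pgno):
        self.pgno = pgno
        self.dirty = False
        self.prev = None   # previous element in the dirty list
        self.next = None   # next element in the dirty list

class DirtyList:
    def __init__(self):
        self.head = None   # most recently dirtied page

    def mark_dirty(self, page):
        if page.dirty:
            self.remove(page)      # re-dirtying moves the page to the front
        page.dirty = True
        page.prev = None
        page.next = self.head
        if self.head:
            self.head.prev = page
        self.head = page

    def remove(self, page):
        # Unlink the page from the doubly linked list.
        if page.prev:
            page.prev.next = page.next
        else:
            self.head = page.next
        if page.next:
            page.next.prev = page.prev
        page.prev = page.next = None

    def pages(self):
        # Walk the list from the head and collect page numbers.
        p, out = self.head, []
        while p:
            out.append(p.pgno)
            p = p.next
        return out

dl = DirtyList()
p1, p2, p3 = Page(1), Page(2), Page(3)
for p in (p1, p2, p3):
    dl.mark_dirty(p)
dl.mark_dirty(p2)        # page 2 modified again, so it moves to the front
order = dl.pages()
print(order)             # most recently modified first
```

Running this prints `[2, 3, 1]`: the re-dirtied page 2 jumps to the head, which is exactly the property an LRU-ordered list gives you.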

There's another function for the number of cached pages; then there is where you initialize the cache, where you can see some configuration being done; then there is shutdown; then page size; then creating a new page cache object; then setting the page cache size; then there is cache fetch. So there are a lot of functions here. Obviously this is too much detail and I cannot cover everything in this video, but this is just awesome, right? You have to agree. If you also want to see PgHdr, this is how it looks: "Every page in the cache is controlled by an instance of the following structure." In simple terms, this is like a node in the linked list. You can see the cache object and the page handle, and you can also see dirty next, dirty previous, the number of users of this page, the next element in the list of dirty pages, and the previous element in the list of dirty pages. So it's like the node of the doubly linked list. You can go through this code; it is pretty awesome. As you

must have understood by now, one of the most important components of the SQLite architecture is the pager. It does so many things that understanding the pager itself can be divided into some modules, so I thought, let's make a small diagram of that, which will give you further clarity on exactly what is being managed by the pager. What did we discuss? There is a component on top for B or B+ trees, then there is the pager component, so let's visualize our pager component properly now, and below the pager component there is the OS layer, correct? So the B-tree layer is going to call the pager component, and below the pager component there is the OS layer. Now, what does the pager component do? First of all, there is the page cache, because it does cache management, right? This is like the most important thing, and through the OS layer we do the interaction with the DB file, so we can visualize it like this. Other than this, the pager is responsible for taking care of our ACID transactions, so we can say that it is also like a transaction manager; it is playing the role of a transaction manager for us, correct? Then it also plays the role of a lock manager, because it is the pager that takes the locks over the DB: whatever read and write locks we take, you will be able to see that it happens in the pager component. Other than this, it also plays the role of a log manager. Now what is a log manager? We discussed journaling, right? We have a journal file here; we talked about write-ahead logging, and that is what is called log management. So it is also playing the role of a log manager. So if you see, the pager itself takes care of transactions, it takes care of locks, it takes care of cache management, and it takes care of log management; there are so many things that the pager does. It's time that we get

into the code of the pager and really understand it in a lot more detail. I have now opened the official documentation of SQLite, because honestly it is just pretty awesome and it is very easy to understand as well. I've gone through a lot of it, and obviously the more you read, the more you will understand. Let's look at the most important points. Here you can see that they are talking about the DB file: the complete state of an SQLite DB is usually contained in a single file on disk, which, as we know, is called the main DB file. During a transaction, so when a transaction is happening, SQLite stores additional info in a second file called the rollback journal, as we know. They have also explained write-ahead logging in a lot of detail: how it happens, what concurrency is, how they handle concurrency, what rollback is, and how crash handling happens, so you can read the documentation in a lot of detail. What I wanted to focus on right now is the pages. See, the main DB file consists of one or more pages; we had visualized this in the beginning, right, that there is a disk file and it is divided into small pages. All pages within the same DB are the same size, and the numbering starts with one. They have mentioned the maximum size and all of this, but the main thing you should understand right now is that at any point in time, every page in the main DB has a single use, which is one of the following. Every page in your disk file is one of these types: it is either a B-tree page, a freelist page, a payload overflow page, a pointer map page, or the lock-byte page. These are the only types of pages that are possible.
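You can observe one of these page types, the freelist, from the outside with PRAGMAs in Python's built-in `sqlite3` module; the table name and row counts below are made up for the demo. Deleting all rows from a table releases its leaf pages onto the freelist, the "currently unused" page type described next.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(payload TEXT)")
# Insert enough data that the table B-tree needs several leaf pages.
conn.executemany("INSERT INTO t VALUES (?)", [("x" * 100,) for _ in range(500)])
conn.commit()

pages_total = conn.execute("PRAGMA page_count").fetchone()[0]
free_before = conn.execute("PRAGMA freelist_count").fetchone()[0]

conn.execute("DELETE FROM t")   # table kept, but its leaf pages are freed
conn.commit()
free_after = conn.execute("PRAGMA freelist_count").fetchone()[0]

print(pages_total, free_before, free_after)
```

With the default settings (no auto_vacuum) the freelist starts at zero and grows after the delete, because the freed pages are kept around for reuse rather than returned to the filesystem.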

Let's understand them in detail. What is a B-tree page? See, B-trees can represent either a table or an index, so it can be either a table interior page or a table leaf page, and similarly it can be either an index interior page or an index leaf page. Why are they important? Because interior pages (the root is also an interior page, by the way) are different from leaf pages. Then there is the freelist page: a freelist page is currently unused, and it can be reused; all the other pages are active, while freelist pages are not active. Then we have the payload overflow page. What does this mean, and why is it needed? See, because a page is of fixed size, sometimes the payload is huge and it cannot fit within one page; for that we have payload overflow pages. This is an interesting thing, right? Because sometimes the payload can be huge, and if it doesn't fit within one page, then what? That is why there are overflow pages. Then there is the pointer map page, the lock-byte page, and all of that. This part about pages is very interesting, because when we look at the code you can actually recognize, okay, this is a B-tree page, this is a freelist page, and so on, so understanding this really helps a lot. Now, coming to the DB

file: the file itself, which we will actually look at, has a format. What happens is that the first 100 bytes of the DB file comprise the DB file header. This header is divided into fields. I will actually open the DB file and show you the first 100 bytes; you will be able to see that at the top it says "SQLite format 3", then it gives the write version, the read version, everything. The layout of the first 100 bytes is given very nicely: the first 16 bytes are the header string, then at offset 16 there is a 2-byte field, then offset 18, then offset 19, and so on; the whole 100 bytes are specified precisely. They have given the page size, the file format version numbers; they explained everything in a lot of detail.
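You can check a couple of these documented fields yourself from Python (the file path below is a throwaway): read the first 100 bytes of a database file and decode the 16-byte header string plus the big-endian 2-byte page size at offset 16.

```python
import os
import sqlite3
import struct
import tempfile

path = os.path.join(tempfile.mkdtemp(), "header-demo.db")  # throwaway file
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE t(x)")   # force the file to actually be written
conn.commit()
conn.close()

with open(path, "rb") as f:
    header = f.read(100)            # the 100-byte database file header

magic = header[:16]                 # bytes 0-15: the header string
page_size = struct.unpack(">H", header[16:18])[0]  # offset 16, big-endian

print(magic, page_size)
```

The header string comes out as `b"SQLite format 3\x00"`, and the page size is a power of two (the special value 1 encodes a 65536-byte page).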

So obviously this is too much detail; we cannot discuss at that level, but I hope you get some idea. See what B-tree pages are and how they look; the header format is documented, and you will be able to see this again; then what a leaf cell is and what an interior cell is, again it is documented there, along with what a pointer map is, what the cell payload is, and what overflow pages are. Going into that much detail is obviously not possible, but you should be able to understand the main things, and I hope you now understand how an SQL table is represented. See this: each ordinary SQL table in the DB schema is represented on disk by a table B-tree, so each table has its own B-tree, right? Each entry in the table B-tree corresponds to a row of the SQL table; the rowid of the SQL table is the 64-bit signed integer key for each entry in the table B-tree. The content of each table row is stored in the DB file by first combining the values of the various columns; that is fine, and then there is the primary key and all of that representation. Another important thing to see is the storage of the SQL database schema. What happens is that whenever we create a table, there is one special schema table, and we will see this table when we create a table; this schema will be updated. Why? You remember, we read the schema and check whether there is already a table that exists with this name, or with this schema, or something like that. That "read schema" step is actually reading this schema table. When we create a table internally (and I will actually debug and show you), just one CREATE TABLE internally runs around eight to nine SQL statements, so you will see the parser entry point that we saw being called around eight to nine times. That is why that is a completely different video altogether, because within one CREATE TABLE statement we will parse a statement eight to nine times and see exactly what is happening; there will be a lot of things that we do internally, like updating the SQLite schema, and we will be talking about virtual tables and things like that. There are also alternate names: there is sqlite_master, there is sqlite_temp_schema, there is sqlite_temp_master, so you will actually see sqlite_master and the temp master; we will see this properly. So this gives you a rough idea of how things are stored internally.
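A quick way to peek at this schema table from Python (the table name is made up for the demo): after a CREATE TABLE with a TEXT primary key, sqlite_master holds one row for the table itself and one for the automatic index SQLite creates to enforce the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courses(name TEXT PRIMARY KEY, fee INT)")

# sqlite_master records every table, index, trigger, and view in the schema.
rows = conn.execute("SELECT type, name FROM sqlite_master ORDER BY type").fetchall()
print(rows)

has_table = ("table", "courses") in rows
has_index = any(t == "index" for t, _ in rows)   # the automatic PK index
```

You should see the `courses` table row alongside an index named something like `sqlite_autoindex_courses_1`, which is SQLite's internal index backing the TEXT primary key.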

Considering how important the pager is, we should definitely look at the code quickly. You can see pager.c: "This is the implementation of the page cache subsystem. The pager is used to access the database disk file. It implements atomic commit and rollback through the use of a journal file that is separate from the database file. The pager also implements file locking to prevent two processes from writing the same database file simultaneously, or one process from reading the database while another is writing." Now this is very, very important to understand. Firstly, as we discussed, the pager acts like a transaction manager: it helps us with atomic commit and rollback. And because it is also doing journaling, it is also acting like a log manager. Now this statement is helping us understand that it is also acting like a lock manager. Why a lock manager? See, firstly you need to understand that it implements the lock on the entire DB file: not on a table, not on a row of a table, but on the entire DB file. And when does it take the lock? It prevents two processes from writing the same DB file, or one from reading the DB while another is writing. That means that while one process is writing the DB file, no other process can write or read, so the write lock is exclusive. I'm repeating this because it is very, very important to understand: what the pager does is lock the entire DB file, so when one process is writing to the DB file, no other process can read or write; the lock is exclusive. So our pager is acting like a lock manager, like a log manager, like a transaction manager.
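The writer-exclusion part is easy to observe from Python with two connections to the same file (the path below is a throwaway): while one connection holds a write transaction, a second connection's write fails with "database is locked". Note that readers are still allowed until the commit briefly takes the exclusive lock; this demo shows the write/write case.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lock-demo.db")

writer = sqlite3.connect(path, isolation_level=None)
writer.execute("PRAGMA journal_mode=DELETE")
writer.execute("CREATE TABLE t(x)")

writer.execute("BEGIN IMMEDIATE")        # take the write lock right away
writer.execute("INSERT INTO t VALUES (1)")

# timeout=0: fail immediately instead of waiting for the lock to clear.
other = sqlite3.connect(path, timeout=0)
try:
    other.execute("INSERT INTO t VALUES (2)")
    second_writer_blocked = False
except sqlite3.OperationalError:
    second_writer_blocked = True          # "database is locked"

writer.execute("COMMIT")
print(second_writer_blocked)
```

Once the first connection commits, the second writer would succeed if retried; the lock exists only for the lifetime of the write transaction.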

Here they have also explained the design of the pager, just like we saw how the header looks and all of that. If you go down, you will be able to see that the states are also given very nicely: the states of the pager can be OPEN, READER, WRITER_LOCKED, WRITER_CACHEMOD, WRITER_DBMOD, and so on, and they have also explained from which state you can transition to which state, like from OPEN to READER, READER to OPEN, READER to WRITER_LOCKED, and how it happens: it starts in this state, and then what happens next. See, I can create a separate video altogether covering just the pager; let me know if this is something you would be interested in. I thought this is a very good overview, but if you want me to go into the details of each of these components, I can actually do that. We have a lot more databases coming up and I would love feedback, because this is an entire series: are you enjoying it? If you would like me to go into more detail on any particular component, let me know; I would love to do that. Here you can also see the different locking states that are possible and how the locking happens: NOLOCK, SHARED lock, RESERVED lock, EXCLUSIVE lock. Exactly when each lock happens is also mentioned, including what happens on error, when you can actually let a lock go, when the exclusive lock happens, and when the shared lock happens. It is just awesome; again, I can go on about this. One more thing I

want to show before we end the video. This file is obviously huge; this is the pager, because the pager is like the most important part. Here you can actually see the code for moving pages, how the journal management happens, and how the cache management happens: cache move, mark as dirty, drop, all of this, exactly how it happens, we will be able to see. Other than this, there is the pager's role for the B-tree as well. Although we have a separate video altogether covering B-trees and why these databases usually use B-trees, this documentation is also very nice: how a node has n+1 pointers to subpages, and the format details are given. As we just saw, first we have a header string which shows the SQLite format, and then there is the page size and file format. By the way, can you see more light over here? Because it's morning; I have been recording the entire night, so you can see some more light from this side, because it is 6:00 a.m.


right now. Cool, anyway, I hope the entire effort of the entire night (I keep re-recording this, and I keep trying to make the explanations as simple as possible) is going to be worth it if you can just hit that subscribe button and share, just share with your friends or on LinkedIn or on Insta, wherever you are comfortable. It would mean just so much to me; it literally is morning, I can show it to you, I will put up an Insta status right now. But cool, let's get back to it. So can you see: file header, page header, cell pointer array. This is how it looks

like. Now, an OS-level call: if you see the file here, we have the entire command with us; we are back to the exact command. But now, what is the statement that is going to be run? SELECT * FROM sqlite_master. Because we have given a primary key, it is internally creating an index. What is the SQL statement? INSERT INTO sqlite_master, and the values are this index; as soon as we saw the primary key, here was the second part of the execution. See, what is the statement: an UPDATE that sets the type to table. There is a demo.db-journal, there is a demo.db-journal, there is no... When I started my journey as a

software engineer, I used to always think that only the super cool software engineers would be able to understand such complex database code. So yes, this is a dream come true for me. Also, as a teacher, this has always been on my to-do list, because my ultimate goal, as a software engineer and as a teacher, is to make such complex topics simple and to cover them in depth. So yes, welcome to another video in the free course Databases In Depth by Edu courses, where we not only cover databases in theory but also debug the internal code. And yes, we are starting with SQLite, which is one of the simplest but also one of the most famous databases in the industry; this is the perfect database to get started with. We are going to be debugging the code; this is going to be super interesting. Thank you for showing up. Without wasting any time, let's get started. But before we get started, just one small thing: although this is completely free for you, if you could just hit that subscribe button, and if you could just share it with your friends or anyone who is looking to become a better software engineer, it would mean the world to me. It would mean so much; all the effort behind this entire series would be completely worth it. So thank you so much for that, and now let's get

started. Since we are going to be debugging the code, the first question that comes up is how we are going to compile and run such a huge project, because if you look at the source code, there are so many C files; it looks huge. How do we compile this? How do we run it? Because SQLite is supposed to be a very simple database: it is supposed to be embeddable, you are not required to run a different process for it, you are not supposed to need any extra configuration, nothing. It is supposed to be simple to use for small to medium-sized applications. So how is that going to happen? If we open the official documentation of SQLite and just go to the download button, we get the SQLite download page, and the first zip file that we can download says "SQLite amalgamation": source code as an amalgamation. What does that mean?

Let's see some more documentation. Here it says "How To Compile SQLite", and if we read about it, it says the use of the amalgamation is recommended for all applications. Building SQLite directly from the individual source code files, that means from the GitHub source code that we just saw, is certainly possible, but it is not recommended. Then what is recommended? For some specialized applications it might be necessary to modify the build process in ways that cannot be done using the pre-built amalgamation from the website; for those situations building from the individual sources is fine, but other than that, use the amalgamation. What does that mean? Let's see. SQLite is available as a prepackaged amalgamation source code file, which is sqlite3.c. I know you might be a bit confused right now; just stay with me for a bit. The amalgamation is a single file of C code that implements the entire SQLite library. This is much easier to deal with: everything is contained within a single code file, so it is easy to drop into the source tree of a larger C or C++ program. What does this mean? Let's actually look at the sqlite3.c file and try to understand. So I've

opened the sqlite3.c file, and if you look over here, it is written "Begin file sqliteInt.h". What does that mean? If I just search for "Begin file" (do you remember, we saw this at the start of every single file when we were going through the code), can you see there are 138 matches? And if I keep going down you can see vxworks.h, sqlite3.h, sqlite3rtree.h, sqlite3session.h, os_setup.h, pager.h, vdbe.h, pcache.h, mutex.h, os_common.h, ctime.c, global.c, status.c, and so on: all the .c and .h files together. I think you are getting my point. And if we go to the top of the file, you can see what is written over here: "This file is an amalgamation of many separate C source files from SQLite. By combining all the individual code files into a single large file, the entire code can be compiled as a single translation unit. This allows many compilers to do optimizations that would not be possible if the files were compiled separately. Performance improvements of 5% or more are commonly seen when SQLite is compiled as a single translation unit." So even I was trying to


compile all the source files separately and doing all of that, but when I read the documentation I was like, oh, this is just so cool and so easy to use, and that is why SQLite is famous. So this file is all you need to compile SQLite. To use SQLite in other programs, you need this file and sqlite3.h, which I have right over here; sqlite3.h defines the programming interface to the SQLite library, and there are more details mentioned that we can keep reading about. And if we come back to our official documentation, you can see that a build of the command-line interface requires three source files: sqlite3.c, sqlite3.h, and it says it requires one more, shell.c, which contains the definition of the main routine. What I have done is download from here, and when I downloaded there were three files, but I have written my own main so that I can show you whatever commands I want to run. So let me show you that. Here you can see three files: sqlite3.c, sqlite3.h, and then there is one main file, which I have written. So


this is a simple main function, and inside it, this is what I am doing and what we are going to be debugging: open the database, create table, insert, all of this. I have commented out the insert, update, and delete for now, but what we are going to debug first is the create table, then insert, then select; we are going to debug the entire thing like this. As you can see, I have put breakpoints after literally every single statement in main, so it is going to be very clear; we are going to go one by one. Our major focus is going to be on the CREATE TABLE statement: although it looks very simple, there is a lot more that goes on behind a CREATE TABLE statement, and we are going to see how much there is behind one single statement. For now, let's get started with a simple open-database statement; I am going to build it and run it. You can see I have a lot of breakpoints, and I am going to show you the important functions that are going to be hit, and I am going to be using the cursor as much as possible so that you know exactly what I am doing. Here, can you see: right

now, inside the DB object, if you look, there are already VFS and VDBE. Do you remember these terms, VFS and VDBE? A bit? Yeah. Let's see exactly what is happening. Let's go inside open: whether you use sqlite3_open or sqlite3_open_v2, internally openDatabase is called, so let's go inside this and see what's happening. If you look, the file name was demo.db; that is what is sent, that you have to open a database with the name demo.db. There are a lot of things that happen (you can see there is initialization and all of this), but let me show you the most important things. First there is sqlite3BtreeOpen, correct? So you are opening the B-tree, sort of. And if I run it again, here comes the interesting part: this is an OS-level call, and the file path here is under the user's Library/Developer/Xcode build folder, in the sqlite-test Build/Products/Debug directory. So let us see what is in that folder right now. Let me just go to the command line quickly and see what is inside. Right now there is only the sqlite-test binary. Now let's run just this one line and see what happens: we are inside robust_open, and if we go back and run it again, can you see demo.db? That is where the exact file was created. If I show it to you again, this is where, because you are opening the file, right? If I just step like this, you will be able to see: this is where the B-tree open and then the pager open happen (remember, B-tree to pager), so here you can actually see it happening, and this is the Unix open that got run. And if you look in the Unix open, the name that was given was the Debug path, and if you see the entire name over here, demo.db, this is the one that got created, correct? And we are inside this right now.
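The same effect is easy to reproduce with Python's `sqlite3` wrapper (a throwaway path below): the open call itself creates the file on disk, and it stays empty (zero bytes) until the first write transaction commits.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")  # throwaway location

conn = sqlite3.connect(path)          # the open call creates the file
created = os.path.exists(path)
empty = os.path.getsize(path) == 0    # nothing is written until a transaction

conn.execute("CREATE TABLE courses(id INTEGER PRIMARY KEY)")
conn.commit()
grew = os.path.getsize(path) > 0      # now the header and first pages exist
conn.close()

print(created, empty, grew)
```

This mirrors what we just saw in the debugger: the Unix open creates demo.db immediately, but the "SQLite format 3" header and the first pages only appear once something is actually written.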

So right now I just wanted to show you where it is being created. And if I just quickly run on, we are going to be in the create table step, so let's quickly check what else is there. Right now our state is: there is demo.db and there is sqlite-test. Before we run the CREATE TABLE statement, let me tell you that internally there are a lot of commands that are going to be run for CREATE TABLE, and it can be overwhelming if we debug it directly. So let me give you some context and tell you exactly what is going to happen, so that you have some idea and you know why so many commands are being run. Let's do that.


Before we move on, I would just like to take a minute here to tell you that if you're preparing for interviews, or if you're looking to become a better software engineer, I would love to be part of your learning journey and make it easier, faster, better, and more fun. I teach live, plus you also get access to the recordings so you can learn at your own pace, and you get a consistent teaching style, because I teach all the HLD, LLD, DSA, and project-based courses, where we cover industry-level projects like YouTube, WhatsApp, and Zerodha; we cover the full stack as well as DevOps, and we also have C++. Plus, if you sign up for any of our courses, you get lifetime access to all the batches: past, present, future, everything. You also get access to our mock-interview platform; you can come practice anytime, and we are going to guide you. We obviously have doubt support as well. We are putting our heart and soul into teaching, so if you are interested, at least check out the site: all the detailed curriculums, the testimonials, and the answers to all the general FAQs are present there. Just check it out; educourses.com is the site. Plus, if you still have any questions, feel free to reach out to us at the support email; we will also guide you about what is better for you. At least check out the testimonials: "The best thing that I like about Edu courses is that their courses are live... the way she teaches and the way the courses are structured are actually the best part... their live classes and the very helpful community... I feel very confident compared to what I was before these courses as a software engineer... my experience at Edu courses has been really great; I feel more confident as a software engineer after enrolling." But for now, let's get back to the

video. This is the exact CREATE statement that we are going to be running and debugging. So what is going to happen? First, we are going to tokenize, understand, and parse it. First we are going to come across the CREATE token, and then we are going to come across TABLE, so we know we are supposed to create a table. Then, when we come across the opening bracket, we know that we now have our table name; what is the table name in our case? It is courses. So we know that we are supposed to create a table with the name courses. The first thing that we are supposed to do is make sure that in whatever DB we have right now, there is no table or index with the same name. It is possible, right? While we are debugging, this is the first table that we are creating, but in reality it is possible that there was another table or index that was created with the same name, and if that was the case, SQLite is supposed to throw you an error saying that there is already a table or index with the same name. So we are supposed to do this check. Now how do we do this check? The first thing we have to do is go through our database and see what the present tables and indexes are; there can also be triggers and views, so we have to first go through our database. How do we do that? For that, there is another special table that is created in SQLite, which is called sqlite_master. This is a special table that is created, and it has the schema of all the tables, indexes, triggers, and views that have been created; I will actually show you. The first statement that is supposed to be run is a select statement: we are going to run a select query on sqlite_master, and we are going to see what the present tables and indexes are and what their schemas are. Is there any table or index with the name courses already, or not?

when is this site Master created so

because till now there is no table or no

index that has been created for now

right in our debugging we just created

the database nothing else has been

created so in that case there is yet

another query that is going to be run

and the first query that is going to be

run is going to be a create table query

so first we're going to create the table

and we are going to create what we are

going to create the SQ light master and

I will actually show this to you and

then we will run a select query to see

all the tables and indexes that are

there to read the schema so this is the

create table SQL command that we are

going to be executing so here we use SQ

3 open now we are going to be running

this command sqid 3 exit here you can

see we'll be passing the query as well

so let's do that let's go inside this

exact function so so if we come over


Here the entire query has been passed in. Initially, the first thing we are supposed to do is tokenize it. For that, a prepare function is hit first: we take a lock and prepare — there is a B-tree lock, we take the mutex and all of that — and inside prepare, the main function we care about is sqlite3RunParser. Do you remember, we saw this earlier. Here we have the entire command with us; what we are going to do now is watch the while loop and confirm that CREATE TABLE gets tokenized and that we can see the tokens.

So let's continue to the while loop — it starts from here — and step inside; now you can see the entire SQL statement. What is the token type? As soon as we get the first token, the token type is 17. Why 17? If we go inside and look up token type 17, we see that 17 is CREATE. And what did we do in getToken? You remember, we


actually saw this code earlier: in getToken we examine the input one piece at a time to classify each token. Right now the token is CREATE; the next one to come is a space, and after that TABLE. We go one by one like this inside the while loop — you can see the while loop is still here. Let me continue to the next iteration: the next token type is 184, and if we look it up, 184 is a space. So: CREATE, space — the next one should be TABLE. If we scroll up and check the number for TABLE, it is 16, so the next token type should be 16. If I continue and hit the breakpoint again, you can see it is indeed 16.

Let's step on and see what happens: the last token is saved over here, and you can see the statement is also being trimmed — we have already consumed "CREATE " and are now dealing with "TABLE". Here, n is the size of the current token, and that is how we advance: a simple while loop that moves us to the end of the next token. Now we are at " courses". If I step again, the token type is 16; on the next line it is 184 — another space — and if I continue once more we start on courses. What happens for courses? The token type is 59. Jumping to the definition, 59 is ID — so courses is an identifier.
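The token loop just described can be sketched in miniature. This is an illustrative toy, not SQLite's real sqlite3GetToken (the numeric codes like 17 and 184 are generated constants in the real source); it only classifies keywords, whitespace, and identifiers the same way the debugger showed:

```python
KEYWORDS = {"CREATE", "TABLE", "SELECT", "FROM", "INSERT", "INTO"}

def tokenize(sql):
    """Toy version of the getToken loop: emit one classified token at a time."""
    tokens = []
    i = 0
    while i < len(sql):
        ch = sql[i]
        if ch.isspace():                       # run of whitespace -> SPACE token
            j = i
            while j < len(sql) and sql[j].isspace():
                j += 1
            tokens.append(("SPACE", sql[i:j]))
            i = j
        elif ch.isalpha() or ch == "_":        # keyword or identifier
            j = i
            while j < len(sql) and (sql[j].isalnum() or sql[j] == "_"):
                j += 1
            word = sql[i:j]
            kind = word.upper() if word.upper() in KEYWORDS else "ID"
            tokens.append((kind, word))
            i = j
        else:                                  # punctuation: '(', ')', ',', ...
            tokens.append(("PUNCT", ch))
            i += 1
    return tokens

kinds = [k for k, _ in tokenize("CREATE TABLE courses")]
print(kinds)
```

The sequence CREATE, SPACE, TABLE, SPACE, ID mirrors the 17, 184, 16, 184, 59 run we just watched in the debugger.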

Now let's see where we go next. The parser is also here: if you step inside, you can see what it looks like — remember, we saw earlier that this is the generated Lemon parser. We could keep hitting this again and again, but there is no need to single-step all the parsing code, so I'm moving ahead; you know exactly what is happening. If I move ahead now, we come across the bracket, so we know we have the table name and we are supposed to read the schema. Let's see how that happens. The token type right now is 22; if we go back and look it up, 22 is LP, which I think stands for left parenthesis. And if we go inside —

now see: when the parsing happened, I had put a breakpoint over here, and it says we have to start the table — this is where the start-table routine is entered. The whole Parse structure is there; inside it you can see the VDBE pointer, and all of this is NULL right now. Let's go inside and see what happens. Now comes the interesting part: we have tokenized CREATE, TABLE, courses, and the bracket, so we know we are supposed to start a new table, and you can see we try to initialize something. You can also see the name — let me just show it to you: here is the token, and the remaining token has been passed in. Let's go

ahead. We do some authorization checking, and then — can you see — we are going to read the schema. This is exactly what I was talking about: "make sure the new table name does not collide with an existing index or table name in the same DB; issue an error message if it does." Here you can see the error messages we can issue: the view already exists, the table already exists, or there is already an index with this name.

So first of all we have to read the schema. Let's go inside and see how that works. Here is where the schema is read; the comment says this routine is a no-op if the DB schema is already initialized. Because the schema is not initialized yet — this is the first statement we are running — we go ahead and run the initialization. Let's step inside, since this is the first time we will enter the init function, and you can see we are inside: "initialize all


DB files — the main DB file, the file used to store temporary tables, and additional DB files created using ATTACH statements." Now let's see exactly what happens next. The very important function to look at is the per-database init-one routine; let's go inside. You can see some transaction work — I'll just continue for now — and inside it some setup is being done. Here you can see we are forming a new statement: CREATE TABLE x(type text, name text, tbl_name text, rootpage int, sql text). This is the schema table that we are creating, and the name we are going to give it is sqlite_master.

In fact, if I search this file for sqlite_master, you can see there is a legacy schema table that gets created, and similarly a temp master — the legacy temp schema table. What we are creating right now is the legacy schema table. If I step next, you can see this statement saved in the argument, and in this callback we are actually going


to execute this new command. It is an entirely new SQL command, so everything we have seen so far — prepare, lock-and-prepare, run-parser — happens again for it. So within the one CREATE TABLE command we are running, internally a lot of commands get run. Here you can see the comment: we are constructing the in-memory representation of the schema tables — sqlite_schema or sqlite_temp_schema — by invoking the parser directly. So now this entire table is going to be created.

Interestingly, the schema of this schema table matters a great deal, because within this table we store the schema of all tables, indexes, views, and triggers. The type column signifies what kind of object it is — here we will pass "table", because we are creating a table right now — so type records whether the object is a table, an index, a trigger, or a view. Then there is the column for the table name, then the name of the object, and then the rootpage column — you remember, every B-tree


has a root page, and this column stores the page number of the root page of that B-tree. Finally, there is the column where the entire SQL statement that created this particular table or index is stored. You will see later, when we compare schemas, that the exact SQL statement you ran — the CREATE TABLE from our main — is stored as-is inside this schema table. The full SQL command that created each table, index, trigger, or view is kept so that its schema can be checked later.

Whenever you insert, SQLite can validate against it: this was the SQL command that created this table, and you are now inserting — does the schema match? How else would it check that you are not omitting a field that is supposed to be NOT NULL or PRIMARY KEY, or something like that? This table gives you the entire schema, which is why it is such an important table. So here we note that this will be run as a statement.
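The five columns just described — type, name, tbl_name, rootpage, sql — can be inspected directly, including the fact that the sql column keeps the original CREATE statement text. A small sketch using Python's built-in sqlite3 module:

```python
import sqlite3

con = sqlite3.connect(":memory:")
ddl = "CREATE TABLE courses(id INT PRIMARY KEY NOT NULL, title TEXT)"
con.execute(ddl)

# one row per schema object; the table's row carries the verbatim CREATE text
row = con.execute(
    "SELECT type, name, tbl_name, sql FROM sqlite_master WHERE name = 'courses'"
).fetchone()
print(row)
```

Note how tbl_name equals name for a table row; for an index row, tbl_name points back to the table the index belongs to.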

If I continue now, you can see we are in the callback, and going ahead you will see that we prepare again. What are we passing as the fourth argument? The entire SQL statement that we built there. This is a huge codebase, but I'm showing you the most important parts: we are in the init callback right now, passing that fourth argument, and this is where the statement lives.

And see — right now we are back in prepare. Where did we start from? From our main we went to exec, to prepare, to run-parser; now the parser will run again, but look at the statement: it is the new one, not the statement we gave in main — it is the statement that creates the sqlite_master table: type, name, tbl_name, rootpage, sql. If I continue, we parse this too, and again we can see the while loop and the tokenization: right now it is 17, CREATE; step again and it is a space, 184; step


again, and now it is 16 — so 17, 184, 16, just as we saw earlier. Step again and the type is 184, a space again; next it is 59, which was ID. If I continue now, let's see what happens: we are back in start-table — but which table are we starting? sqlite_master, not the courses table we originally set out to create. Interesting, right? If we go inside and look at the names, you can see the whole statement — type text and so on — and the x name is passed separately; let me show it to you.

Now, when we read the schema this time — I pressed continue, but what you saw was that init was not hit, because the condition is not true: we will not go inside it again, because we have already set the state and we are creating sqlite_master itself. If I move ahead, we have now said that we are going to start creating the table, and this is what we pass in. So now we are going to


tokenize the rest of the SQL statement. We are inside the while loop: there was "type", then 184 for the space, then 59 for "text", then a comma; then 59 again for "name" — and you can see the cursor moving — "text", comma, and so on through the rest of the statement. We can debug it like this: "rootpage", space, "int", comma, "sql", space, "text". Right now we are just tokenizing and parsing the rest of the statement.

What is the important moment? When we come across the closing bracket: that means we have everything and we are ready to create the table. If I continue now, you can see we are in end-table. What does this mean? The comment says this routine is called to report the final parenthesis that terminates the CREATE TABLE statement. So we have tokenized and parsed the rest of the statement and are now at the end of the CREATE TABLE — and which CREATE TABLE? The one for sqlite_master. Let's now see what happens inside, at the end. So


here we are just checking things and setting some flags, initializing the entire table — schema-to-index — and I'll just continue. Here we hit the parser finalize, because whatever engine was there, we are now done with the creation of sqlite_master. Continuing again, we are in sqlite3_finalize. Do you remember: while executing any SQLite command we go through sqlite3_exec and then finalize. Finalize for which statement? The creation of that table. And remember, whatever statement there is becomes the VDBE, the engine; right now there is no statement, so we just go ahead, and we will see finalize again with the next statement — that's fine.

Let's move ahead. So now we have created the table — which table? The sqlite_master table. This init callback is now done. A quick recap: from the CREATE TABLE of courses we went to read-schema, and there we saw that nothing has been created so far, so we have to


create the schema table. This is where we created the schema table, and we are done initializing it. So what is the next thing we do? If we move ahead, you can see a B-tree transaction being started. Now that we are clear on that much, remember that we are still inside init; I'm just going to continue to the next important thing. For now, just know that this is the huge function where we actually created the table — the sqlite_master table.

We are still inside the init-one routine, and if I step next we are still inside it, but now we are going to run yet another SQL command: SELECT * — from what? From sqlite_master. Only for the very first table or index ever created do we create sqlite_master itself; once that is done, the CREATE TABLE statement will not run again and again — it ran only because this was the first table ever created. If you look, the init-busy flag is what they


are setting to mark that this is the first time and the table has to be created. Now that it is created, we SELECT * from it and read the schema. From now on, every time you create a table or index — anything — in this database, this SELECT will run. So within a CREATE query we have now run another CREATE query, and next we are running the SELECT query.
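From the user's side this bootstrap is invisible: sqlite_master is always queryable, even on a brand-new database where it has no rows yet. A minimal sketch with Python's built-in sqlite3 module:

```python
import sqlite3

con = sqlite3.connect(":memory:")  # brand-new database

# sqlite_master can be selected from even before any user table exists
rows = con.execute("SELECT type, name FROM sqlite_master").fetchall()
print(rows)  # no entries yet

con.execute("CREATE TABLE demo(x INT)")
rows_after = con.execute("SELECT type, name FROM sqlite_master").fetchall()
print(rows_after)
```

The first SELECT succeeds and simply returns no rows, which is exactly the "create it once, then read it forever after" behavior being traced in the debugger.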

If I move ahead and continue, we are back in the exec command — but now the statement to be run is SELECT * FROM sqlite_master. Where did we call this from? Right here, where we just were — SELECT * from the master table; we just saw it, and there is an init callback inside this too. If I continue, this is the statement that will run: again we prepare; continuing, we are in lock-and-prepare, then in prepare, and now where are we again going to be?


Run-parser — but now what statement is being parsed? SELECT * FROM sqlite_master. If I continue and tokenize it, what is the first token? 138. Jumping to the definition, 138 is SELECT. The next one should be the asterisk, meaning all columns: continuing, the next token is 108, and looking it up, 108 is star. So SELECT, *, and the next should be FROM — searching for it would scan the whole file, so let's just step: the token type now is 108, and running again gives 142; jumping to the definition, 142 is FROM. So: SELECT * FROM. Continuing again, you see 141; the definition shows 141 is DOT — that is the dot in the main. prefix. Then we come across sqlite_master, and the next is 59 — 59 was, I think, ID,


correct — 59 is ID, because sqlite_master is an identifier. Next should be a space, which I think was 184 — yes, space is 184, and running it, the next token is 184. Then it will be ORDER, then 184 again, then BY. Continuing: it is 145, and 145 should be ORDER — checking, yes, 145 is ORDER; then 184 again, yes; and continuing once more it should be BY — it is 34, and 34 is BY.

Right now I am only showing you the tokenization part, but inside this while loop we also go to the parser on every single iteration. If I put a debugger inside it, you can see that on every statement we come in here, and within it there are the ifs and elses — so now you should understand how tokenization and parsing happen together. There is a while loop for parsing as well. If I continue, we are back here with another space; continuing again, there will be ROWID, and again we are


parsing. If I continue now and move ahead, we are in select. Where did we come to select from? From our reduce step, just as we earlier arrived at start-table. If you remember, somewhere in here there is start-table, and that is where all the if/elses live: drop table, create view, select — and you can see add-not-null, add-primary-key, create-index; all of these get hit from here. Right now we are in select, so let's move ahead.

What happens now is that we select the information — and where are we getting it from? From sqlite_master; that is the table we are reading. Moving ahead quickly, this is the select expander, and this is how selection happens; I think we can just continue. See — we are reading the schema again, but now it should not go into the if statement, because initialization has already been done.


Right — see, it did not go into the if statement. It goes inside that if only the first time, only when we create the very first table, index, trigger, or view; after that, it does not. Now we go to find-table: we are looking for a table named sqlite_master. You can step inside; find-table again checks in the database, comparing names one by one — "sqlite_" — to determine whether we are finding the schema table or the temp schema table. We are finding the schema table. Going ahead, we locate the table.

I'll continue from here; we hit parser-finalize, and running on, we are now going to execute the statement. Which statement? SELECT *. So what have we done until now? We only prepared the statement — all the initialization happened. Where does the actual execution happen? In the


step. If we go inside, you can see this is where the execution happens. We now have the prepared statement, so we are moving from the front end to the back end of SQLite. The comment says this is the top-level implementation of sqlite3_step, which calls the inner step routine to do most of the work. This is where the actual work starts: you can see the prepared statement is now typecast to a VDBE pointer. The VDBE is the first component in SQLite's back end, so we are now in the VDBE; we take the mutex lock and move ahead.

Do you remember what happens in the VM? The VM is nothing but a huge switch statement over opcodes — there are operands and so on as well, but we won't go down to that level. You can see the switch statement being executed opcode by opcode; if I move ahead and show you, this is the switch — this is where we go through every opcode one by one. "What follows is a massive switch statement." Do you remember — in the VM


there is only one huge switch statement — that is why understanding the basics is so important. So here are the opcodes, and here is the switch statement in the VM. Let's continue; I just wanted to show you the switch once, so I'll remove the debugger from there and continue. This is the VDBE halt — the VDBE part is about to finish.
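The opcode list that the big switch walks over can be seen without a C debugger: SQLite's EXPLAIN (not EXPLAIN QUERY PLAN) dumps the VDBE program for a statement. A quick sketch via Python's sqlite3 module:

```python
import sqlite3

con = sqlite3.connect(":memory:")

# EXPLAIN returns one row per VDBE instruction:
# (addr, opcode, p1, p2, p3, p4, p5, comment)
program = con.execute("EXPLAIN SELECT * FROM sqlite_master").fetchall()
opcodes = [row[1] for row in program]
print(opcodes)
```

Each opcode name in that list is one case label of the switch statement we just stepped through, and every program ends by reaching Halt.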

part is finished. Let's move ahead: you can see B-tree commit phase one. A quick recap of what is happening: we are executing the SELECT * statement on sqlite_master. We were in exec and went to step — where the actual execution happens — with the prepared statement; in step the prepared statement is typecast to the VDBE pointer, and then the huge switch over the opcodes executed. After the VDBE component, which component is hit next? The B-tree. In the B-tree there are two phases, phase one and phase two, and the B-tree component in turn uses which component? The pager


component. Within B-tree commit phase one you can see the pager's commit-phase-one being called, and similarly within B-tree phase two you will see the pager's phase-two call. Right now this is a SELECT statement; we will look at the commit phase one and phase two execution in detail later. For now I'm moving past phase one and on to phase two of the B-tree; going ahead, we are outside, the commit happens, and continuing we come out of the VDBE part and return — we are coming out of the step.

Continuing: VDBE finalize — we do the cleanup and delete the VDBE, then free it. We were executing the SELECT * statement, remember — again phase one, phase two, init-one — and now let's see where we are: the CREATE TABLE part is going to execute again. Moving ahead: callback, prepare, lock-and-prepare, parser. What is the statement


right now? It is CREATE TABLE again. I know the question: how did we get into CREATE TABLE all over again, with type text, name text and so on — this looks like the schema of sqlite_master; are we creating sqlite_master again? No — we are not creating sqlite_master, we are creating the temp master. I know it can be confusing; just stay with me. The tokenizer part is completely clear by now, and slowly the rest will become clear too.

A quick recap so you fully understand: we started from CREATE TABLE courses; then we did a CREATE TABLE of sqlite_master; then a SELECT * FROM sqlite_master. We were still in init, and CREATE TABLE is called yet again — but what x value is passed now? Let's see: go ahead to the parser; what should the value of x be? Let me remove the breakpoint here and go directly to start-table. Now, in start-table, what is the name of the


table that has come in? Let's move ahead: the name tells us it is a schema table. Moving on — we check whether it already exists, do some authorization checking, and now we are reading the schema again. I know it can be confusing — stay with me: read-schema, find-table, and what is the name? sqlite_temp_master. Remember when we searched for sqlite_master and saw the legacy schema table? There was also a legacy temp schema table.

When is it created and why is it needed? It stores the schema of temporary objects. Whenever we create a new schema entry, this temp schema table is created, and it is responsible for temporarily holding the schema information: because we are creating a brand-new entry in sqlite_master, we temporarily store it in sqlite_temp_


master — and that is why we first have to create sqlite_temp_master. The first CREATE statement of the entire database is actually a lot more complex than later ones, but once all of this happens you don't have to keep creating sqlite_master over and over again, because it already exists; after that, things become simpler. Because this is the first time, though, it is also the most interesting — you get to see everything.

So what are we creating now? sqlite_temp_master. If I run, we are back to "type text" and the rest — we are creating the temp master — and running on you see "text", comma, "name"; notice the space, "text", comma; keep watching the table name. The schema is the same, it is just for temporary objects — the schema of sqlite_master and sqlite_temp_master is identical, which is why all of this looks the same.
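sqlite_temp_master can be observed directly: it lives on the temp database and carries the same columns as sqlite_master. A sketch with Python's sqlite3 module:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TEMP TABLE scratch(x INT)")  # goes into the temp database

# same shape as sqlite_master, but listing temporary objects
temp_rows = con.execute(
    "SELECT type, name FROM sqlite_temp_master"
).fetchall()
print(temp_rows)
```

The temp table shows up here rather than in sqlite_master, which is the split between the two legacy schema tables being traced in the debugger.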

So: "text", comma, "rootpage", space, "int", comma, "sql", space, "text" — and as soon as we come across the closing bracket, what should happen? The end of the creation: end-table. We are at the end of the CREATE TABLE statement — ending the creation of which table? sqlite_temp_master. I'll run now: parser-finalize, then sqlite3_finalize — we are finally done creating the temp master. Moving ahead — where are we now? Finally we are done with the creation and selection of sqlite_master and sqlite_temp_master; now we get started

with the rest of our courses table. If you go back — in case you have forgotten — this was the command we were running; so far we have only consumed "CREATE TABLE courses". Next come id INT PRIMARY KEY NOT NULL, title, and duration_in_weeks — three columns: id, title, duration. Back here, stepping to the next statement you can see id, then title, then duration. Right now we are at the bracket, then 59 — going inside, 59 is ID, so we are at id. Let's run. Now, when we come across PRIMARY KEY, what happens? We have to register a primary key, because when you say CREATE TABLE it has to


register somewhere that this is a primary key — that has to happen internally in SQLite too. Currently the token is 184, and stepping to the next statement and moving ahead through the parser — that was the space — as soon as I move on, can you see add-primary-key? We came here from the parser: we reduced to add-primary-key. As soon as the parser saw PRIMARY KEY followed by the space, it knew it had to register the primary key.

And inside add-primary-key, what are we supposed to do? Create a new index. This is exactly what happens internally in SQLite: because we declared a primary key, it internally creates an index. Let's go inside and see how the index is created: now we are going to create the index; moving ahead, again we read the schema, again we are in init — give it one second — we are in init right now, and init is

done. See schema-to-index — if we go inside, the comment says: if the same DB is attached more than once, the first attached DB is returned. OK, continuing. Now comes the next important part: within create-index, what is the next thing to do? We are supposed to insert this index — and where does all the information about tables, indexes, triggers, and views live? In our schema table. So in the schema table we have to record that there is a new index.

Inside our CREATE TABLE of courses, because we specified a primary key, SQLite creates an index, and when it creates an index it inserts that index into the schema table — sqlite_master. And what exactly is stored? If I run again, what do you think gets called from the nested part? Continuing: run-parser, and the SQL statement is INSERT INTO sqlite_master, with these values: this is an index, and the name is sqlite_autoindex_


courses1 so for the courses table it is

creating an auto index and the uh name

is underscore 1 see if you remember the

scheme of SQ light Master how did it

look like so first was type so what was

the type initially it was table right

now it is index because what are we

inserting the index right and this is

the name of the index this is the table

name the courses right and this is the

SQL command that is going to be run and

this is what it is the root page number

so that is four so let's move ahead for
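What the debugger just showed — a primary key quietly inserting an sqlite_autoindex_… row into sqlite_master — is easy to reproduce from the outside. A minimal sketch using Python's built-in sqlite3 module (a TEXT primary key is used deliberately, since an INTEGER PRIMARY KEY becomes the rowid and gets no separate index):

```python
import sqlite3

# In-memory database; the non-INTEGER PRIMARY KEY forces SQLite to
# build a separate index, just like the "courses" table above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courses (id TEXT PRIMARY KEY, title TEXT)")

# sqlite_master now holds two rows: the table, and the auto-index
# SQLite created for the primary key.
for type_, name, tbl_name, sql in conn.execute(
    "SELECT type, name, tbl_name, sql FROM sqlite_master"
):
    print(type_, name, tbl_name, sql)
```

Note that for the auto-index row the sql column is NULL — exactly the value we are about to see inserted in the walkthrough — because the index was not created by any user-visible CREATE INDEX statement.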

And we are going to parse. So let's again go inside the while loop: what is the token type? It is 127 — and what does 127 stand for? 127 is INSERT, because this is the first time we are seeing an INSERT statement. We have already seen CREATE — we created sqlite_master, we created sqlite_temp_master — and we also did a SELECT * FROM sqlite_master; now we are doing the index insert, and we are running the INSERT statement. So I'm going to run now: see, INSERT, space, and then INTO — 151, if you check, 151 is INTO, correct. And if I continue: INSERT INTO, then main.sqlite_master, space, VALUES, and then the values — and I'm just continuing and moving ahead.

As soon as the INSERT statement finishes, what happens? Here you can see sqlite3Insert — from here we came to insert, just like how we went to select; right now we are in insert. Here you can see there's a huge insert routine, and if we come inside it, what are we going to do? "If this is not a view, open the table and all the indexes." So this is where we open the table and its indices; if I go inside you can see everything that is happening — schemaToIndex — and I'm going to continue for now. Again we are back, and we are finishing the INSERT statement: parser finalize, moving ahead.

The primary key is now done, so back to our original statement: primary key done, now we are at NOT NULL. Okay, let's continue. You can see NOT is done, then the space, now NULL is being executed, and as soon as I do this you can see the NOT NULL clause parsing happen. If we go inside this, it is going to mark the column as NOT NULL, because the schema is supposed to record that this column cannot be null. That is why, the next time there is any insert into the courses table, SQLite will be able to check the schema and say: oh, this is not supposed to be null. So we update the schema now, right? I'm going to continue, and you can see TEXT, space, NOT, space, NULL — again NOT NULL will be hit. Then the duration column — again NOT NULL will be hit. See, NOT NULL is hit, and if you go inside you can see it adds the NOT NULL flag — it just updates the column to record that constraint. Now we are finally finishing — what? — the CREATE TABLE of courses; finally we are in the end-table handling of CREATE.
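The NOT NULL flag the parser just recorded on the column is exactly what later inserts are checked against. A small sketch with Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courses (id TEXT PRIMARY KEY, title TEXT NOT NULL)")

# The schema now says title cannot be NULL, so this insert is rejected
# at execution time.
try:
    conn.execute("INSERT INTO courses VALUES ('c1', NULL)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)  # NOT NULL constraint failed: courses.title
```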

If we just continue for now — okay, okay, I know you're overwhelmed; let's do a quick revision of exactly what has been happening till now. This was our original statement: we went to CREATE, we went to TABLE, then we went to courses, and as soon as we came over here a lot started happening, so let's quickly write down everything that happened.

When we came over here — because this is the first time we are creating any table or index or anything — the first thing that happened was a CREATE TABLE of, what? sqlite_master (sorry for the handwriting). Now, within sqlite_master, what is the schema? How does it look? There is type (which is "index" or "table" and so on), then there is name, then there is the table name, after that there is the root page number, and then there is the SQL text, correct? So that is the schema of sqlite_master.

Now, after we created the table, what did we do? We did a SELECT * FROM sqlite_master. Why did we run this command? Because we had to check for conflicts and gather the current schema state — that is what we get from the SELECT statement. So first we did the CREATE statement, then the SELECT * statement. After this, we did one more CREATE TABLE. Why? Because we are creating a new schema object itself, so we may need temporary objects, and for that we need the temp table. So we did CREATE TABLE for sqlite_temp_master, and the schema of sqlite_temp_master is exactly the same: type, name, table name, root page, and SQL. The schema is exactly the same — just that this table is for temporary objects, and sqlite_master is for the main objects.

One more important thing to remember: in the SELECT statement you actually saw that within step we did the VDBE calls — there was VdbeExec and VdbeHalt — and after that we did the VDBE commit; after that came phase one and phase two, and in phase one and phase two we saw the B-tree and then the pager. So you saw the flow of what happens in a SELECT: first we prepared the statement, which is like the front end of SQLite; then, once we had the VDBE pointer, we hit the opcode switch statement, then VDBE commit, and then it used the B-tree layer, the B-tree layer used the pager layer, and then the OS layer got called. So all of this happened by the time we were over here.
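That prepare-then-execute flow — the parser building a VDBE program which VdbeExec then walks opcode by opcode — can be peeked at from outside with EXPLAIN, which dumps the prepared opcode list instead of running it. A sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courses (id TEXT PRIMARY KEY, title TEXT)")

# EXPLAIN returns one row per VDBE instruction: this is the program the
# big switch statement in the VDBE steps through at execution time.
for row in conn.execute("EXPLAIN SELECT * FROM courses"):
    # columns: addr, opcode, p1, p2, p3, p4, p5, comment
    print(row[0], row[1])
```

Every program starts with an Init opcode and finishes with Halt, which matches the step/halt calls seen in the debugger.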

So this was the first part — all of that was the first-part execution. Then we executed all of these tokens, and then we saw PRIMARY KEY. As soon as we saw primary key, here came the second part of execution: because we have seen a primary key, what needs to be done? We need to insert a new index. That is why the call that was made was an INSERT INTO — insert into sqlite_master. And what were the values? I am literally just reading them off: the first value was "index", because that is the type — here you can see the type. Then the name was sqlite_autoindex_courses_1 — that was the index name. Then there was the table name, which was courses. Then there was the root page — if you saw, we saw something like this; that was the one. And then there was the SQL statement, which was NULL. So we did the INSERT INTO statement, and this INSERT INTO happened because of the primary key. Once this happened, then there was NOT NULL and all of that, and once this finished, all of this is done — now we have reached the end

table. And from here we now see yet another command: the UPDATE command. Coming back to our update statement — why is this happening? A slot for the record has already been allocated in the schema table; we just need to update that slot with all the information we have collected — for example the SQL statement and the root page, which we had not set properly yet. So now we are updating the schema table — again the legacy schema table itself, which is sqlite_master — and here we are saying: see, this was the SQL statement which created the table. Let's actually go ahead with this. From here, again a nested parse, and from here what should be called, according to you? Run-parser. See what the statement is: UPDATE ... SET type = table, name = courses, table name = courses, root page = 2, and the entire SQL command. If you look over here you can actually see the entire SQL command — if I show it again, this is our SQL command: CREATE TABLE courses (id PRIMARY KEY NOT ...). So whatever SQL command we used to create this table, that entire command is going to be saved in sqlite_master.
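You can confirm from the outside that the sql column of sqlite_master keeps the text of the CREATE TABLE statement. A minimal sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
ddl = "CREATE TABLE courses (id TEXT PRIMARY KEY NOT NULL, title TEXT NOT NULL)"
conn.execute(ddl)

# The schema table stores the CREATE TABLE text itself; this is what the
# UPDATE we just watched was filling in.
(saved,) = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'courses'"
).fetchone()
print(saved)
```

This is also why, when a database is reopened, SQLite can rebuild its in-memory schema simply by re-parsing the statements stored in sqlite_master — which is exactly what we will see happen in a moment.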

So now if I just go ahead, again tokenization and parsing are going to happen. See, now the token is 129 — let's look up 129: UPDATE. So within one CREATE TABLE you saw CREATE, you saw INSERT, you saw SELECT, and now we are also seeing UPDATE. If I just move ahead: UPDATE, then main.sqlite_master, SET type = table, space, name = courses, table name = courses, root page = 2, then SQL = the entire SQL statement — this should be one text token. If I just do this: see, that was one text, space, WHERE, space, rowid — again the rowid is there — and now we are done with the execution of the UPDATE. If I just move ahead: see, parser finalize. If I go ahead: VDBE set SQL — we are setting the SQL over here — and let's move ahead. After that we are in prepare; if I just move ahead for now — see where we are right now: we are back to the exact statement — which one? Our initial SQL statement. So now we are done with the preparing of our first statement, the CREATE TABLE statement — see it properly: CREATE TABLE courses and so on. Now that we are done preparing the entire thing, we are going to execute it: we go to step, and we execute. We go inside this, and if I go ahead, now we are in step. See: the statement is now completely prepared, the VDBE is ready. Now this VDBE again goes from VDBE to B-tree to pager, and now phase one and phase two should happen. If I move ahead — see, step.

Just move ahead: VdbeExec is going to take care of our opcodes and all, and I'm going to go ahead. See: BtreeBeginTrans — so from the VDBE to the B-tree, correct. And you can read what is happening in the B-tree here: we are starting a new transaction, because we are creating the table, right? So I'm just moving ahead for now — new database — pager write. See, now from the B-tree to the pager; now we are actually writing to the page. Let's see what is going to happen. Let's go ahead: pager write — and here, see, now I have opened a journal. Now, something interesting — okay, let's go to our terminal and check. If I do ls right now, there's demo.db and our test program; that's what was there initially. Okay, let's run now.

Let's go ahead — see, VFS. You remember the VFS layer; we have hit that. And if I just go ahead: see, journal open. Okay, again let's check what is there: only the two files. Let's come back, run again, and now: open file. What file is being opened? Here you can see it: demo.db-journal. The journal is created as a temporary file — we create it, and it should be deleted once the transaction is over. So right now, again, I'm showing you: there are only the two files. If I just move ahead, now we have opened it, and if I do ls now — see, there is demo.db-journal, and this should be deleted once we are done with this entire transaction. So once I come back, see, I go ahead, and I'm going to continue, and continue.

Open is done, okay — and here we are going to mark the page as dirty. Remember what a dirty page is: a page that is different from whatever is on disk in the DB file. So we are marking it as dirty; you can see it over here — "make sure the page is marked as dirty; if it isn't dirty already, make it so". That is what is happening: we are managing the dirty list. We go ahead — see, this is the linked-list code; this is where the entire dirty list is managed. Magic header: do you remember, at the top of the page the header contents were described? There is a magic header too, if you remember. And if I just go ahead, this is how we are setting up the page. Here you can see, for every page we are making the changes, setting what was mentioned in the offsets. See: "SQLite format 3". Do you remember, at the top of the file there is "SQLite format 3"? This is also mentioned in the documentation — let me show it to you. See, this is how the format of the file header looks: the header string is "SQLite format 3", which is what we just saw; after this there should be the page size, the file format write version, and the file format read version, right? So if I go next and we just look at the data, there is "SQLite format 3" right now, then there is the page size and all of that. If we just go ahead: zero page — we added the zero-page setup — and this is how we are setting all of that information: cell offset, data offset, overflow, init. We are recording whether the page is an overflow page or a B-tree page, and so on; all of that is also happening. So we are in a transaction right now. Let's move ahead.
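That "SQLite format 3" string is the documented 16-byte magic at offset 0 of every SQLite database file. A quick sketch that reads it back:

```python
import os
import sqlite3
import tempfile

# Create a throwaway database file so there is something on disk to inspect.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE courses (id TEXT PRIMARY KEY)")
conn.commit()
conn.close()

# The first 16 bytes are the header string "SQLite format 3" plus a NUL;
# the page size, write version, and read version follow right after it.
with open(path, "rb") as f:
    header = f.read(16)
print(header)  # b'SQLite format 3\x00'
```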

Schema version; BtreeBeginTrans opcode — so we are in the VDBE right now. Let's just move ahead: pager write. I'm removing the breakpoint from here, because we're going to be making changes on multiple pages. Now we are in btreeCreateTable — "create a new B-tree" — so this is where we are finally creating the table. There's a new B-tree for every table, right? So this is where we create it. And see: allocate B-tree page, pager write — this is where all of that is happening. Then the page-cache make-dirty again — make dirty, manage the dirty list — zero page; I'm just removing the breakpoint now. See, now we are going to insert into the B-tree: first we created the B-tree, now we are inserting into it. Create table, pager write, make dirty.
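Since every table and every index gets its own B-tree, the rootpage column of sqlite_master records where each B-tree starts. A sketch showing the root pages allocated for the courses table and its auto-index:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courses (id TEXT PRIMARY KEY, title TEXT)")

# Page 1 holds the file header and the root of the sqlite_master B-tree,
# so the B-trees for user tables and indexes land on page 2 onwards.
for type_, name, rootpage in conn.execute(
    "SELECT type, name, rootpage FROM sqlite_master"
):
    print(type_, name, rootpage)
```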

Okay, so now we are back to preparing yet another statement, which is the SELECT * FROM sqlite_master WHERE... and all of this. What we are doing right now — we are in VdbeExec — is a final verification that we have created the entire table properly. It is going to go into the legacy schema table, which is again sqlite_master, retrieve everything, and check properly that everything was done right — that there was no corruption, that nothing went wrong. See, there is even an error message for corruption. So here we are just making sure everything is working fine and the creation of the table was successful: we are going to select everything that is not a trigger, and we are going to continue. So here again: prepare, again we are in the parser, again we are going to tokenize. See: SELECT * FROM main.sqlite_master WHERE tbl_name = 'courses' AND type != 'trigger' ORDER BY rowid. This is where we are doing the select. So again select was called — how did this happen? From our parsing we went to select. So now we are going to select from sqlite_master; we are finalizing this particular execution. Now that we are done with that, we can move ahead: now we are finally executing the statement. If we go ahead into this: again VdbeExec, so there will be opcodes; I'm just moving ahead. By the way, we should check — if we do ls, you can see that the journal file is still there. Let's move ahead.

Continue. So: exec callback, init callback, init callback, prepare. Now what statement is being prepared? CREATE TABLE courses — the same one. Why? Because now we're reading from the schema. Let's again do this: run-parser, again CREATE TABLE, again primary key, NOT NULL, create index — all of that is going to happen again, because now we are actually reading from the schema table and executing the entire SQL all over again; we are just making sure that the statement runs fine. And end table — so now we are done with that. sqlite3_finalize, step — now we are going to finally execute it. So we are in the step part, and from here where do we go? VdbeExec. And from here where should we go? The B-tree; from the B-tree to pager write. Let's just move ahead for now — I just remove this — init callback, step, VdbeExec, VdbeHalt, VDBE finalize, DB free, VdbeHalt commit. See: Btree commit phase one. If I go inside this and move ahead: see, pager commit phase one — now we are finally doing pager commit phase one. Okay, so we are inside this. See, we are writing the super-journal, and we can go ahead — this is where we are going to be dealing with that file. Let's move ahead: we are finally writing to the page, finally writing to all the concerned pages. Phase one is complete. B-tree — it is checking all the B-trees, done — now we'll go to phase two. What happens in phase two? Phase two is when the journal gets deleted: in phase one we just flush all the information — you can see the journal is still there — and phase two is when we delete the journal, correct. So we go inside this. Let's go over here: we are ending the transaction — end transaction, Btree leave. Let's go ahead: VDBE commit will happen, and let's actually check now — see, there is no journal, because phase two got executed. So what do we do? We delete the journal in phase two.
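The journal lifecycle we just watched — created at the start of a write transaction, deleted in commit phase two — can be observed from Python as well, assuming the default journal_mode=DELETE:

```python
import os
import sqlite3
import tempfile

# isolation_level=None puts the connection in autocommit, so we control
# BEGIN/COMMIT ourselves; the paths here are just for illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path, isolation_level=None)
conn.execute("CREATE TABLE t (x)")

journal = path + "-journal"  # the rollback journal file name
conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (1)")
print("during txn:", os.path.exists(journal))    # expected True in DELETE mode
conn.execute("COMMIT")                           # phase one flushes, phase two deletes
print("after commit:", os.path.exists(journal))  # expected False
```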

Right — so our understanding is clear. Congratulations, this is a great part to understand. Right now we are coming out of VDBE commit, coming out of VDBE halt, VDBE exec — so now we are finally done with executing our statement. I think we'll be out now: VDBE finalize, VDBE delete, done; freeing the DB, coming out. Finally we are finishing with the CREATE TABLE

statement, and we are done! Give yourself a huge pat on the back for staying till here and understanding the entire execution of a CREATE TABLE statement. It must be really interesting to see that within one statement there are so many things happening: the indexes are being created, the schemas are being saved internally, and all of that. Now, although we have already seen insert, select, and update happening, let's quickly see one insertion into courses as well. So let's go ahead with that, and let's go inside exec. When we go inside exec you can see the SQL statement; obviously the first thing that's going to happen is lock-and-prepare, prepare, then run-parser — by now you must be able to follow this properly, right? So now the insertion is going to happen: tokenization, parsing — INSERT INTO courses (id, title, ...). Once all of this happens, once we are into the insertion part — right now we are going through the values — let's move ahead. So we are in insert. Where did we come from? From the reduce step — from the parser we came to insert. Now we are going to move ahead with the insertion; this is just to show you one final flow. See how we go ahead from here to the VDBE to the B-tree — I just wanted to see that once. So: "open the table and all the indices", we go ahead, parser finalize, VDBE set SQL. Now we are going to execute the statement. When we go inside the execution we now have the VDBE part: we have prepared the statement, all of that has happened, the front end is done, and now we are in the back-end part. So now we get started with the VDBE and we move ahead: VDBE part, VdbeExec — now we have the opcodes, there will be the switch statement — and now the B-tree part, right? BtreeBeginTrans — here you can see the B-tree is there, and the B-tree has the pages, correct. If we go ahead you'll be able to see that the VDBE accesses the B-tree component and the B-tree uses the pager, so the component picture should be completely clear. See: cache table lock, then


we move ahead. See, now we have a MemPage. Schema version — we are getting it from the page, and the schema version is what? Three; we already saw that, right? We can just go ahead right now: see, pager write. If we go ahead for now: open journal. If we come back to the command line and check, right now there are only the two files. And here you can see which component the pager is using: the VFS component — that part you're also able to see. "This function is called at the start of every write transaction" — so this is the one where we open the journal. So if I just move ahead: journal open — and as soon as I go ahead with this, if I go back and show you: see, now there is the journal. So for every write transaction this opens. Let's move ahead again: we are marking the page dirty, and if we go back you can see we came from pager add-page-to-rollback-journal — so now we have added the page to the rollback journal as well. And if we go ahead, we are going to write: the B-tree insert is happening, the pager write is happening, we are marking pages dirty, we are adding them to the journal — all of it is happening properly, right? So now the understanding is clear. VDBE commit — and from VDBE commit where do we go? Btree commit phase one; as soon as we go inside this we will go to pager phase one. Let's go inside that: see, pager phase one. As soon as we go inside pager phase one, what all is supposed to happen? See: pcache flush on commit, pager write, pager add page to rollback journal, OS write. So I can move ahead. Now we are going to be in phase two — if you look, right now the journal is there, and after phase two is completed it should not be there. So let's come over here: see, pager phase two. As soon as I go inside this, we are ending the transaction: we commit, we end the transaction, and we come out — Btree leave — we are just leaving from everywhere, and as soon as we come out of here: see, the journal is gone. So in phase two the journal is deleted. VDBE finalize, DB free — we are done with the insert statement, we are finally closing our DB, and we are done with one entire flow: we first opened the DB, we created the table, we inserted into the table. The SELECT statement from sqlite_master we have already seen multiple times — it's the same as that; you can try it out yourself.
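One last thing worth trying yourself: the schema version we saw being read from page 1 is exposed as a pragma. Every CREATE/ALTER/DROP bumps this cookie, which is how other connections notice that the schema changed and must be re-read from sqlite_master. A sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
(before,) = conn.execute("PRAGMA schema_version").fetchone()

# Any DDL statement bumps the schema-version cookie stored in the header.
conn.execute("CREATE TABLE courses (id TEXT PRIMARY KEY)")
(after,) = conn.execute("PRAGMA schema_version").fetchone()
print(before, after)
```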

I would now like to really, really thank you for staying till the end of the video. I hope you enjoyed this — it's a very interesting and very different kind of video. If you have stayed till here, you're genuinely interested in software engineering and not just doing interview prep or doing it for the sake of work; you must be really interested, because only then would you go through the entire video, and I'm really proud of you for that. Thank you so much for letting me be part of your learning journey. If you liked the video, if you're finding the series helpful, please do let me know in the comments; also tell me if you have any feedback, or if you would like me to cover any more parts of SQLite itself — obviously I would go into a lot more detail. So here's the plan: we started with SQLite, which is one of the simplest databases, and after this we'll be covering MongoDB, Postgres, Cassandra, Redis, and a lot more. Once we cover all the databases — at least the basic flows and the debugging — then we can also start going through the minute details of each of these databases, and I would love to do that. But I can do that only if I know that you all are enjoying the series, only if it reaches more people — because you have no idea the amount of effort that goes into this: you first debug the entire thing, you analyze the entire thing, you read through the documentation, then you think about how to make it simpler, and then you make sense of why so many calls are happening. Trust me, the first time I was debugging the CREATE statement I was literally pulling my hair out over why the parser was being called so many times, why so many internal commands were being called — then you keep reading about it, and then it makes sense: okay, the master table, the temp master, and all of that. So it has taken a lot of effort, but if you think it was worth it, please, please do comment, and please consider sharing it with your friends or with your social media people — it would really, really help me a lot; you have no idea how much it would mean to me. It is literally morning by the time I finish recording; it takes me an entire night to record one video — just being honest over here and sharing something with you. And if you would like me to be part of your further learning journey — if you would like me to help you with interview prep or with project building — do consider signing up for any one of our courses (LLD, HLD, DSA, and more); there's a lot on Ed courses, go check it out. I would love to teach you live, to be part of your learning journey, and to make it faster and quicker, because YouTube has its own limitations — I don't know you personally. If you would like that connection with me, please do consider signing up for Ed courses. Otherwise, see you next time, see you in the next video; please do comment and let me know if you found this helpful. Bye!
