SQL Day1
FOR GROWTH
BASICS OF SQL
WHO ARE WE?
Data science is a vast field that focuses on extracting insights and
knowledge from data. It integrates various disciplines, including statistics,
mathematics, advanced computing, artificial intelligence (AI), and machine
learning, to analyze, model, and interpret large datasets.
Several branches of data science are involved in getting insights from data
and then using those insights to make decisions or predictions. A few of
these branches are:
• We cover Data Analyst, Business Analyst, and Data Science + Machine Learning +
Deep Learning, in that order.
• We also provide electives such as Tableau, Entrepreneurship, and Gen AI, which
learners are free to choose from.
Who here has appeared for analyst interviews?
Which skill was asked about the most?
In this series, we will discuss SQL, as it is expected in almost 80% of
analyst openings, such as product analyst, growth analyst, and business
analyst roles.
[Sample input and sample output tables]
ARENA Q: Identify Similar Customer Profiles Based on Purchased Product Categories
Let’s look at an Amazon SQL interview question
Given the data of sales, mark each store’s average daily sales in April 2024
as “lower”, “higher” or “same” when compared to average daily sales of the
company in April 2024.
[Sample input and sample output tables]
ARENA Q: Comparing Store Revenue to Company Average for April 2024
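One possible way to approach this question is sketched below. It is only an illustration: the `sales` table, its columns, and the rows are hypothetical (the slide's sample tables are not reproduced here), SQLite via Python's `sqlite3` stands in for the MySQL playground, and the company benchmark is assumed to be the mean of all store-day totals in April 2024.

```python
import sqlite3

# Hypothetical schema and data, used only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (store_id TEXT, sale_date TEXT, amount REAL);
INSERT INTO sales VALUES
  ('A', '2024-04-01', 100), ('A', '2024-04-02', 100),
  ('B', '2024-04-01', 200), ('B', '2024-04-02', 200),
  ('C', '2024-04-01', 300), ('C', '2024-04-02', 300),
  ('C', '2024-05-01', 900);
""")

query = """
WITH daily AS (          -- one row per store per day, April 2024 only
    SELECT store_id, sale_date, SUM(amount) AS day_sales
    FROM sales
    WHERE sale_date >= '2024-04-01' AND sale_date < '2024-05-01'
    GROUP BY store_id, sale_date
),
store_avg AS (           -- each store's average daily sales
    SELECT store_id, AVG(day_sales) AS avg_sales
    FROM daily
    GROUP BY store_id
),
company_avg AS (         -- assumed benchmark: mean over all store-days
    SELECT AVG(day_sales) AS avg_sales FROM daily
)
SELECT s.store_id,
       CASE WHEN s.avg_sales > c.avg_sales THEN 'higher'
            WHEN s.avg_sales < c.avg_sales THEN 'lower'
            ELSE 'same' END AS comparison
FROM store_avg AS s
CROSS JOIN company_avg AS c
ORDER BY s.store_id
"""
result = conn.execute(query).fetchall()
print(result)  # [('A', 'lower'), ('B', 'same'), ('C', 'higher')]
```

The May row is filtered out by the date range, so it does not affect either average. If the real question defines the company average differently, only the `company_avg` step would need to change.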
This is the level of questions asked at major firms like Google and Amazon,
so one needs to be really good at SQL and problem solving.
Here, just knowing concepts like CTEs, joins, etc. won’t be enough. You also
need to understand how to solve problems, how to calculate metrics, and so on,
to work effectively in a competitive analytics team.
You can think of DBMS as the Library and SQL as the librarian!
HOW DOES EXCEL FALL SHORT?
• Dealing with large datasets: Excel may struggle to handle large datasets efficiently,
leading to slow performance and potential crashes.
That’s where databases and SQL come in, to make operations on data efficient.
What is a DATABASE?
• You will be able to rapidly search for and find items (especially missing socks). [SEARCH]
• Easily modify each drawer without affecting the others. [ALTER AND UPDATE]
• Get rid of a particular set of clothing without a second thought. [DELETE]
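The three operations above map roughly onto SQL statements as in the sketch below. The `wardrobe` table and its rows are hypothetical, invented to match the analogy, and SQLite via Python's `sqlite3` is used as a stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Hypothetical "wardrobe" table, one row per kind of clothing.
cur.execute("CREATE TABLE wardrobe (item TEXT, drawer TEXT, quantity INTEGER)")
cur.executemany(
    "INSERT INTO wardrobe VALUES (?, ?, ?)",
    [("socks", "top", 7), ("t-shirt", "middle", 5), ("old jeans", "bottom", 2)],
)

# SEARCH: find a specific item quickly.
socks = cur.execute(
    "SELECT drawer, quantity FROM wardrobe WHERE item = 'socks'"
).fetchall()

# ALTER: change the structure (add a column); existing rows are kept.
cur.execute("ALTER TABLE wardrobe ADD COLUMN colour TEXT")

# UPDATE: change the contents of one row only.
cur.execute("UPDATE wardrobe SET quantity = 6 WHERE item = 'socks'")

# DELETE: remove a whole set of clothing.
cur.execute("DELETE FROM wardrobe WHERE drawer = 'bottom'")

remaining = cur.execute(
    "SELECT item, quantity, colour FROM wardrobe ORDER BY item"
).fetchall()
print(socks)      # [('top', 7)]
print(remaining)  # [('socks', 6, None), ('t-shirt', 5, None)]
```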
LIBRARY, LIBRARIAN, FACILITATOR
• There are multiple platforms like METABASE, GOOGLE BIGQUERY, MYSQL
WORKBENCH, DBEAVER that allow querying on hosted or local databases.
• For our sessions we will use Newton’s MySQL Playground which already
has the hosted database.
BASIC SQL QUERIES
b. WHERE
d. ORDER BY
e. LIMIT
• Get all the payments that satisfy both of the following conditions:
a. Payment was made after July 2023
b. Payment amount exceeds 10000
• Get all the products that satisfy either of the following conditions:
a. Category is either Clothing or Footwear
b. Stock is less than 100
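A minimal sketch of both exercises follows. The `payments` and `products` tables, their column names, and the rows are assumptions made for illustration (the playground's actual schema may differ), and SQLite via Python's `sqlite3` stands in for MySQL. "After July 2023" is read here as on or after 1 August 2023.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and rows; the real playground schema may differ.
conn.executescript("""
CREATE TABLE payments (payment_id INTEGER, payment_date TEXT, amount REAL);
INSERT INTO payments VALUES
  (1, '2023-06-15', 15000),
  (2, '2023-08-02', 9000),
  (3, '2023-08-10', 12000),
  (4, '2023-09-01', 25000);
CREATE TABLE products (product_id INTEGER, category TEXT, stock INTEGER);
INSERT INTO products VALUES
  (1, 'Clothing', 500),
  (2, 'Electronics', 50),
  (3, 'Footwear', 80),
  (4, 'Electronics', 300);
""")

# Both conditions must hold, so they are combined with AND.
big_recent = conn.execute("""
    SELECT payment_id, amount
    FROM payments
    WHERE payment_date >= '2023-08-01' AND amount > 10000
    ORDER BY amount DESC
    LIMIT 5
""").fetchall()

# Either condition is enough, so they are combined with OR.
flagged = conn.execute("""
    SELECT product_id
    FROM products
    WHERE category IN ('Clothing', 'Footwear') OR stock < 100
    ORDER BY product_id
""").fetchall()

print(big_recent)  # [(4, 25000.0), (3, 12000.0)]
print(flagged)     # [(1,), (2,), (3,)]
```

`IN (...)` is shorthand for several `OR`-ed equality checks on the same column, which keeps the second query readable.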
SQL FLOW
There are many SQL clauses, such as SELECT, FROM, ORDER BY, LIMIT, and GROUP BY. How
does SQL know which to execute first and which to execute later?
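The slide's own answer is not part of this text, so the sketch below relies on the standard logical order of evaluation: FROM, WHERE, GROUP BY, HAVING, SELECT, ORDER BY, LIMIT. The `orders` table and its rows are hypothetical, and SQLite via Python's `sqlite3` is used as a stand-in for MySQL. The numbered comments show the order in which each clause is logically applied, which differs from the order in which it is written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data for illustration.
conn.executescript("""
CREATE TABLE orders (customer TEXT, amount REAL);
INSERT INTO orders VALUES
  ('asha', 50), ('asha', 200), ('asha', 300),
  ('ravi', 500), ('ravi', 20),
  ('meera', 150), ('meera', 250), ('meera', 90);
""")

rows = conn.execute("""
    SELECT customer, COUNT(*) AS big_orders   -- 5. pick the output columns
    FROM orders                               -- 1. choose the source table
    WHERE amount > 100                        -- 2. filter individual rows
    GROUP BY customer                         -- 3. group the surviving rows
    HAVING COUNT(*) >= 2                      -- 4. filter whole groups
    ORDER BY customer                         -- 6. sort the result
    LIMIT 10                                  -- 7. cap the number of rows
""").fetchall()

print(rows)  # [('asha', 2), ('meera', 2)]
```

Because WHERE runs before GROUP BY, ravi's single order above 100 is the only row left in his group, and HAVING then drops that group.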
3. Coding Problems:
d. Here you are given either a database or a description of the data, and
asked to write a query that meets a requirement.
IDEAL METHODOLOGY
Expectations from an analyst in SQL:
1. Should be able to code
2. Should be able to explain and optimize
Explanation:
1. First, break the problem down into steps.
2. Then get the output of each step in stages.
3. Finally, get the overall output, either by combining the steps or by executing the
final step.
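As an illustration of this staged way of working, the sketch below finds the top-spending customer in each city. The question, the `purchases` table, and its rows are all hypothetical, and SQLite via Python's `sqlite3` stands in for MySQL. Stage 1 is run on its own first so its output can be checked, then the stages are combined with CTEs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data for illustration.
conn.executescript("""
CREATE TABLE purchases (customer TEXT, city TEXT, amount REAL);
INSERT INTO purchases VALUES
  ('asha', 'Pune', 100), ('asha', 'Pune', 250),
  ('ravi', 'Pune', 300),
  ('meera', 'Delhi', 80), ('kabir', 'Delhi', 60), ('kabir', 'Delhi', 70);
""")

# Stage 1: total spend per customer, run alone and inspected.
stage1 = conn.execute("""
    SELECT customer, city, SUM(amount) AS total
    FROM purchases
    GROUP BY customer, city
    ORDER BY customer
""").fetchall()

# Final: combine the stages, one CTE per step.
final = conn.execute("""
    WITH totals AS (      -- step 1: total spend per customer
        SELECT customer, city, SUM(amount) AS total
        FROM purchases
        GROUP BY customer, city
    ),
    best AS (             -- step 2: highest total in each city
        SELECT city, MAX(total) AS best_total
        FROM totals
        GROUP BY city
    )
    SELECT t.city, t.customer, t.total   -- step 3: who achieved that total
    FROM totals AS t
    JOIN best AS b ON b.city = t.city AND b.best_total = t.total
    ORDER BY t.city
""").fetchall()

print(stage1)
print(final)  # [('Delhi', 'kabir', 130.0), ('Pune', 'asha', 350.0)]
```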
In the next session, when we solve a few questions live, we will discuss
this in more depth, because everyone needs to be on the same page in their
understanding of SQL.
Practice Components
Let’s try to solve a few mandatory assignment questions on Newton’s Platform.
Completing the questions won’t take you long (30 minutes at most), so do
solve them!
ARENA
We’ve given all the learners access to Arena.
Let’s look at one of the questions: The cheap flier connection (present in Arena)
STEPWISE APPROACH
First of all, I hope you all realise that this is not a straightforward question that a simple
GROUP BY or a JOIN will solve.
STEPS:
1. First, get every possible total cost with 0, 1, or 2 stops for each pair of origin
and destination cities.
2. We can get these using self joins.
3. Then combine the 0-stop, 1-stop, and 2-stop results using set operations.
4. Finally, group the combined data from step 3 by origin and destination to get the
minimum cost for each pair.
• In the next session, we will discuss solving questions in this manner.
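The four steps above can be sketched as follows. The Arena problem's actual schema is not shown in these slides, so the `flights` table, its columns, and the rows are assumptions, and SQLite via Python's `sqlite3` stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flight data; the real Arena schema may differ.
conn.executescript("""
CREATE TABLE flights (origin TEXT, destination TEXT, cost INTEGER);
INSERT INTO flights VALUES
  ('DEL', 'BOM', 500),
  ('DEL', 'JAI', 100),
  ('JAI', 'BOM', 150),
  ('JAI', 'AMD', 50),
  ('AMD', 'BOM', 60);
""")

query = """
WITH direct AS (          -- 0 stops
    SELECT origin, destination, cost FROM flights
),
one_stop AS (             -- 1 stop: self join on the connecting city
    SELECT a.origin AS origin, b.destination AS destination,
           a.cost + b.cost AS cost
    FROM flights AS a
    JOIN flights AS b ON a.destination = b.origin
    WHERE a.origin <> b.destination
),
two_stop AS (             -- 2 stops: chain a third leg
    SELECT a.origin AS origin, c.destination AS destination,
           a.cost + b.cost + c.cost AS cost
    FROM flights AS a
    JOIN flights AS b ON a.destination = b.origin
    JOIN flights AS c ON b.destination = c.origin
    WHERE a.origin <> c.destination
),
all_routes AS (           -- combine the three results with a set operation
    SELECT origin, destination, cost FROM direct
    UNION ALL
    SELECT origin, destination, cost FROM one_stop
    UNION ALL
    SELECT origin, destination, cost FROM two_stop
)
SELECT origin, destination, MIN(cost) AS cheapest
FROM all_routes
GROUP BY origin, destination
ORDER BY origin, destination
"""
cheapest = {(o, d): c for o, d, c in conn.execute(query)}
print(cheapest[("DEL", "BOM")])  # 210, via JAI and AMD
```

In this made-up data the direct DEL to BOM fare is 500, the one-stop route through JAI costs 250, and the two-stop route through JAI and AMD costs 210, so the final grouping picks 210.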
E-Commerce (test_db on NS Playground)
PROBLEM STATEMENT FOR THE NEXT SESSION
Analyze the month-by-month spending trend of different customer
segments (such as Occasional Shoppers, Regular Buyers, and
VIP Customers).