Data and Business Analytics Interview Questions
Analytics Interview Questions & Answers
Satyajit Pattnaik
Agenda
Business Analyst vs Data Analyst
Types of questions expected
Behavioral Questions
Functional Questions
Analytical/Brain Teasers
Technical Questions
Theoretical
SQL Basics
SQL Advanced
Python Basics
Python Assignment: Hands on
Statistical Questions
Data Visualization Questions
Power BI Questions
Basic ML Concepts
Business vs Data Analysts
Analytical & Scenario based questions
Technical questions
Behavioral questions.
How to handle difficult clients & stakeholders
Analytical & Scenario based questions.
These scenario-based questions are usually asked to test the candidate's analytical ability to solve problems and puzzles.
Puzzle questions.
Bag of coins
You have 10 bags of coins, each holding an unlimited supply. One of the 10 bags contains fake coins, but you don't know which one. A genuine coin weighs 1 gram and a fake coin weighs 1.1 grams. Identify the bag with fake coins in the minimum number of readings.
Note: You have a digital weighing machine.
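One well-known approach needs a single reading: take 1 coin from bag 1, 2 coins from bag 2, and so on up to 10 coins from bag 10, then weigh them all together. The extra weight over 55 grams, divided by 0.1, gives the bag number. The sketch below simulates this; the `weigh` helper, the bag numbering 1 to 10, and the `fake_bag` argument (used only to drive the simulated scale) are assumptions made for illustration.

```python
GENUINE_WEIGHT = 1.0   # grams, from the puzzle statement
FAKE_WEIGHT = 1.1      # grams, from the puzzle statement


def weigh(coins_taken, fake_bag):
    """Simulated digital scale: total weight of the coins taken from each bag."""
    total = 0.0
    for bag, count in coins_taken.items():
        unit = FAKE_WEIGHT if bag == fake_bag else GENUINE_WEIGHT
        total += count * unit
    return total


def find_fake_bag(fake_bag, n_bags=10):
    """Identify the fake bag with one reading by taking i coins from bag i."""
    coins_taken = {bag: bag for bag in range(1, n_bags + 1)}
    reading = weigh(coins_taken, fake_bag)
    all_genuine = sum(coins_taken.values()) * GENUINE_WEIGHT  # 55 g for 10 bags
    # Each fake coin adds 0.1 g, so the excess tells us how many fake coins
    # were on the scale, which equals the number of the fake bag.
    return round((reading - all_genuine) / (FAKE_WEIGHT - GENUINE_WEIGHT))


print(find_fake_bag(fake_bag=7))  # prints 7
```

Rounding is used because 0.1 cannot be represented exactly in floating point.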
Puzzle questions.
Shake Hands
At a party, everyone shook hands with everybody else. There were 66 handshakes in total. How many people were at the party?
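If n people each shake hands with everyone else once, the number of handshakes is n * (n - 1) / 2. Setting this equal to 66 and solving the quadratic gives n = 12. A minimal sketch of that calculation follows; the function name is hypothetical.

```python
import math


def people_for_handshakes(handshakes):
    """Solve n * (n - 1) / 2 == handshakes for a whole number n."""
    # Rearranged: n^2 - n - 2h = 0, so n = (1 + sqrt(1 + 8h)) / 2.
    discriminant = 1 + 8 * handshakes
    root = math.isqrt(discriminant)
    if root * root != discriminant:
        raise ValueError("no whole number of people gives this handshake count")
    return (1 + root) // 2


print(people_for_handshakes(66))  # prints 12
```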
Technical
questions.
Questions
Business Analysts (BAs) are responsible for documentation and the initial requirement gathering for a project, so they need to create documents such as Stakeholder Analysis and Scope Statement and, if required, also work on Competitor Analysis.
SQL
Basics.
Retrieve all the records from a table
Retrieve two columns from a table
Retrieve customers having age more than 25
Retrieve customers having age more than 25 and from Mumbai
Retrieve customers having age more than 25 OR churn: Y
Retrieve customers having age between 15 and 35
Retrieve customers staying in Delhi or Mumbai
Find the total number of records in the table
Insert a customer to the table
Update the customer details as Churned whose customer id is 7
Find the minimum and maximum age from the list of customers
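The tasks above can be tried out with Python's built-in sqlite3 module. This is only a sketch: the table name `customers`, its columns (`customer_id`, `name`, `age`, `city`, `churn`), and the sample rows are all assumptions, since the slides do not show the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, name TEXT, age INTEGER, city TEXT, churn TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Asha", 22, "Mumbai", "N"),
        (2, "Ravi", 31, "Delhi", "Y"),
        (3, "Meera", 45, "Mumbai", "N"),
        (7, "Kabir", 19, "Pune", "N"),
    ],
)

# Insert a customer, then mark customer 7 as churned.
conn.execute("INSERT INTO customers VALUES (8, 'Zoya', 28, 'Delhi', 'N')")
conn.execute("UPDATE customers SET churn = 'Y' WHERE customer_id = 7")

# One query per retrieval task in the list.
queries = {
    "all_records": "SELECT * FROM customers",
    "two_columns": "SELECT name, city FROM customers",
    "age_over_25": "SELECT * FROM customers WHERE age > 25",
    "age_over_25_and_mumbai": "SELECT * FROM customers WHERE age > 25 AND city = 'Mumbai'",
    "age_over_25_or_churned": "SELECT * FROM customers WHERE age > 25 OR churn = 'Y'",
    "age_15_to_35": "SELECT * FROM customers WHERE age BETWEEN 15 AND 35",
    "delhi_or_mumbai": "SELECT * FROM customers WHERE city IN ('Delhi', 'Mumbai')",
    "record_count": "SELECT COUNT(*) FROM customers",
    "min_max_age": "SELECT MIN(age), MAX(age) FROM customers",
}

results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)
```

Note that BETWEEN is inclusive on both ends, so ages 15 and 35 are both returned.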
Answers
SQL Advanced
Facebook (Source: Leetcode)
Employee table: Id is the primary key for this table. Each row contains the ID, name, salary, and department of one employee.
Department table: Id is the primary key for this table. Each row contains the ID and the name of one department.
SELECT
d.Name AS 'Department', e1.Name AS 'Employee',
e1.Salary
FROM
Employee e1
JOIN
Department d ON e1.DepartmentId = d.Id
WHERE
3 > (SELECT
COUNT(DISTINCT e2.Salary)
FROM
Employee e2
WHERE
e2.Salary > e1.Salary
AND e1.DepartmentId = e2.DepartmentId
)
;
SQL Advanced
Facebook (Source: Leetcode)
Write a SQL query to get the nth highest salary from the Employee table.
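One common way to approach this is to sort the distinct salaries in descending order and skip the first n - 1 of them. The sketch below runs that idea against SQLite from Python. It is an illustration, not the slide's own solution: the reduced schema (only `Id` and `Salary`), the sample rows, and the helper name are assumptions, and n is expected to be 1 or greater.

```python
import sqlite3


def nth_highest_salary(conn, n):
    """Return the nth highest distinct salary, or None if there are fewer than n."""
    row = conn.execute(
        "SELECT DISTINCT Salary FROM Employee "
        "ORDER BY Salary DESC LIMIT 1 OFFSET ?",
        (n - 1,),  # skip the n - 1 higher salaries
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [(1, 100), (2, 200), (3, 300), (4, 300)],
)

print(nth_highest_salary(conn, 2))  # prints 200
print(nth_highest_salary(conn, 4))  # prints None
```

DISTINCT matters here: without it, the two employees earning 300 would occupy both the first and second positions.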
SQL Advanced
Facebook (Source: Leetcode)
Write a SQL query to find employees who have the highest salary in each of the departments.
SELECT
Department.name AS 'Department',
Employee.name AS 'Employee',
Salary
FROM
Employee
JOIN
Department ON Employee.DepartmentId = Department.Id
WHERE
(Employee.DepartmentId , Salary) IN
( SELECT
DepartmentId, MAX(Salary)
FROM
Employee
GROUP BY DepartmentId
)
;
SQL Advanced
Facebook (Source: Leetcode)
Write an SQL query to find all numbers that appear at least three times consecutively.
Return the result table in any order.
Solution
SELECT DISTINCT
l1.Num AS ConsecutiveNums
FROM
Logs l1,
Logs l2,
Logs l3
WHERE
l1.Id = l2.Id - 1
AND l2.Id = l3.Id - 1
AND l1.Num = l2.Num
AND l2.Num = l3.Num
;
SQL Advanced
Facebook (Source: Leetcode)
Write a SQL query to rank scores. If there is a tie between two scores, both should have the same ranking. Note that after a tie, the next ranking number should be the next consecutive integer value. In other words, there should be no "holes" between ranks.
Solution:
select
score, dense_rank() over (order by score desc) as 'Rank'
from Scores
order by score desc
SQL Advanced
Facebook (Source: Leetcode)
Write a SQL query to get the second highest salary from the Employee table.
Solution
SELECT
(SELECT DISTINCT
Salary
FROM
Employee
ORDER BY Salary DESC
LIMIT 1 OFFSET 1) AS
SecondHighestSalary
;
SQL Advanced
Facebook (Source: Leetcode)
Solution
select distinct t1.*
from stadium t1, stadium t2, stadium t3
where t1.people >= 100 and t2.people >= 100 and t3.people >= 100
;
Python Basics
Deep copy: Shallow copy:
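In Python, a shallow copy creates a new outer object but keeps references to the same nested objects, while a deep copy recursively copies the nested objects too. A minimal sketch using the standard `copy` module (the sample list is made up for illustration):

```python
import copy

original = [[1, 2], [3, 4]]
shallow = copy.copy(original)    # new outer list, same inner lists
deep = copy.deepcopy(original)   # new outer list and new inner lists

# Mutate a nested list through the original.
original[0].append(99)

print(shallow[0])  # prints [1, 2, 99] because the inner list is shared
print(deep[0])     # prints [1, 2] because the inner list was copied
```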
Python Basics
What is the difference between pass, continue & break?
Pass: It is used when a block of code is required syntactically but nothing should be executed. It is a null operation: nothing happens when it runs.
Continue: It skips the rest of the current iteration when a specific condition is met and transfers control back to the beginning of the loop. The loop does not terminate but continues with the next iteration.
Break: It terminates the loop when a condition is met, and control flows to the statement immediately after the body of the loop. If the break statement is inside a nested loop (a loop inside another loop), it terminates only the innermost loop.
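The three statements can be seen side by side in one small loop. This is just an illustrative sketch; the function name and the sample numbers are made up.

```python
def demo(numbers):
    seen = []
    for n in numbers:
        if n < 0:
            pass      # placeholder: nothing happens, execution falls through
        if n % 2 == 0:
            continue  # skip even numbers and move on to the next iteration
        if n > 7:
            break     # stop the loop entirely
        seen.append(n)
    return seen


print(demo([1, 2, -3, 4, 5, 9, 11]))  # prints [1, -3, 5]
```

The value 11 is never reached because the loop ends as soon as 9 triggers the break.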
Python Basics
What does the enumerate() function do?
The enumerate() function pairs each item in an iterable object with an index, which can be used to reference the item later. It makes it easier to keep track of the position of each item while iterating.
Example:
list2 = ["apple", "ball", "cat"]
e1 = enumerate(list2)
print(list(e1))
Output: [(0, 'apple'), (1, 'ball'), (2, 'cat')]
Python
Assignment
Data
Questions
Statistical Questions
What's your knowledge of statistics, and how have you used it in your work?
If possible, describe how you've used stats to solve a problem.
Example: You can tell them how you imputed null values and which statistical strategies you used, or explain how you performed EDA, categorical vs numerical analysis, etc. (Don't memorize an answer; answer as openly from personal experience as possible.)
Which step of a data analysis project did you enjoy the most?
It's situational, and depends on personal interests.
Example: I have enjoyed performing the EDA steps, so that I can learn more about the data. I have performed categorical vs numerical analysis, correlation on numerical variables, and so on.
Dashboard Building Questions
What's your experience in creating dashboards?
Try explaining to the interviewer why dashboards are important, and how we can capture the KPIs and metrics, make them visually appealing, and track business goals. You can then talk about one of the tools that you have used (it could be Power BI, MS Excel, Tableau, or anything else) and, if possible, about some of the features of that BI tool.
Explain some of the important charts, and where to use them
There are many basic charts you should be aware of, such as area charts, bar charts, column charts, doughnut charts, pie charts, gauge charts, etc.
Also explain which chart should be used for which particular scenario.
Data Visualization Charts
Power BI Questions
FILTER FUNCTION
Let's say you want to find the total count of customers present in Mumbai:
Count_Mumbai =
CALCULATE(COUNT('table'[CUST_ID]),
FILTER('table', 'table'[CITY]="MUMBAI"))
FILTER acts as the WHERE clause.
How would you create trailing X month metrics via DAX against a non-standard calendar?
The solution will involve:
1. CALCULATE function to control (take over) filter context of measures.
2. ALL to remove existing filters on the date dimension.
3. FILTER to identify which rows of the date dimension to use.
Alternatively, CONTAINS may be used:
CALCULATE(FILTER(ALL('DATE'),…….))
Power BI Questions
Can we have more than one active relationship between two tables in the data model of Power Pivot?
No, we cannot have more than one active relationship between two tables. We can have more than one relationship between two tables, but only one of them will be active and the rest will be inactive. Dotted lines are inactive relationships and continuous lines are active ones.
Explain When Do You Use SUMX() Instead Of SUM()?
When the expression to be summed consists of anything other than a column name. Typically, when you want to add or multiply the values in different columns:
SUMX(Orderline, Orderline[quantity] * Orderline[price])
SUMX() first creates a row context over the Orderline table. It then iterates through this table one row at a time. SUM() is optimized for reducing over column segments and, as such, is not an iterator.
Power BI Questions
Name Any 3 Most Useful Text Functions In DAX?
The text functions in DAX include the following:
CONCATENATE
REPLACE
SEARCH
UPPER
FIXED
What Is The Difference Between DISTINCT() And VALUES() In DAX?
Both return the distinct values, but VALUES() also returns a possible implicit virtual empty row caused by non-matching values in a child table. This is usually in a dimension table.
Which Function Should You Use Rather Than COUNTROWS(DISTINCT())?
DISTINCTCOUNT()
Power BI Questions
What is Power Query Editor?
Power Query is an ETL tool used to shape, clean and transform data using intuitive interfaces, without having to write code. It helps the user to:
Import data from a wide range of sources: files, databases, big data, social media data, etc.
Join and append data from multiple data sources.
Shape data as per requirement by removing and adding data.
Common Power Query Transforms?
Changing Data Types, Filtering Rows, Choosing/Removing Columns, Grouping, Splitting a column into multiple columns, Adding new Columns, etc.
Basic ML Questions
How to handle missing values & null values?
One of the following methods can be applied when handling missing values, depending on the nature of the data and the data type we are working with:
Drop rows with more than 50% missing values
Impute missing values with central tendency measures such as mean or median
For categorical attributes, impute missing values with the most frequent category (mode)
Predict the missing values with regression or classification models
Use machine learning algorithms such as K-NN that support missing values while making a prediction
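The first three strategies can be sketched with nothing but Python's standard library. In practice a library such as pandas would usually be used; the helper names, the use of `None` to mark a missing value, and the sample data below are assumptions for illustration.

```python
from statistics import mean, median, mode


def impute(values, strategy):
    """Replace None entries using a central-tendency measure of the known values."""
    known = [v for v in values if v is not None]
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](known)
    return [fill if v is None else v for v in values]


def drop_sparse_rows(rows, max_missing_ratio=0.5):
    """Keep only rows whose share of missing fields does not exceed the threshold."""
    return [
        row for row in rows
        if sum(v is None for v in row) / len(row) <= max_missing_ratio
    ]


ages = [25, None, 35, 40, None]              # numerical attribute
cities = ["Mumbai", "Delhi", None, "Mumbai"]  # categorical attribute

print(impute(ages, "mean"))
print(impute(ages, "median"))
print(impute(cities, "mode"))
print(drop_sparse_rows([(1, None, None), (1, 2, None)]))
```

Median is often preferred over mean when the column has outliers, since a few extreme values can pull the mean away from typical values.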
Basic ML Concepts
Explain univariate, bivariate & multivariate analysis
Univariate analysis: Univariate analysis is a form of data analysis where a single variable is analyzed to describe it and find patterns that exist within it. It is the simplest form of data analysis, as it doesn't deal with causes or relationships.
Bivariate analysis: Bivariate analysis examines the relationship (for example, the correlation) between two variables. This technique is used by researchers when they aim to draw comparisons between two variables.
Multivariate analysis: Multivariate analysis is used to study complex data sets. In this form of analysis, a dependent variable is represented in terms of several independent variables, using the available observations to establish such a relationship.
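As a small illustration of the first two, the sketch below describes one variable on its own (univariate) and then measures the Pearson correlation between two variables (bivariate). The `age` and `spend` values are made-up sample data and the `pearson` helper is written out by hand for clarity; multivariate analysis is not covered here.

```python
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


age = [22, 31, 45, 19, 28]
spend = [200, 340, 520, 150, 300]

# Univariate: summarize a single variable.
print("mean age:", mean(age), "std dev:", round(stdev(age), 2))

# Bivariate: how strongly do two variables move together?
print("correlation(age, spend):", round(pearson(age, spend), 3))
```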
Basic ML Concepts
Different types of Machine Learning techniques?
Supervised Learning: Dealing with labeled data.
Unsupervised Learning: Dealing with unlabeled data.
Reinforcement Learning: In reinforcement learning, decisions are made by the system based on the feedback it receives for its actions. In this approach, the algorithm learns from its mistakes and improves to return better results over time.
Explain any Machine Learning project you did
Talk about any project you did in the past.
Always explain 4 major things:
1. Business Understanding: What the use case is all about, and what business problem it is solving.
2. Exploratory Data Analysis: Share some important insights that you got by performing the EDA steps.
3. Model Building: Talk about the algorithms you used.
4. Model Deployment
More advanced questions.
Stay tuned!!