Data and Business Analytics Interview Questions

The document outlines different types of questions that may be asked in a business analyst interview, including behavioral, functional, analytical, technical, and scenario-based questions. It provides examples of questions in each category and recommends preparing answers related to your background and experience. The technical questions section covers topics like SQL, Python, statistics, machine learning, and data visualization tools.

Uploaded by

Nafis Reza Karim

Business Analytics Interview Questions & Answers
Satyajit Pattnaik
Agenda
Business Analyst vs Data Analyst
Types of questions expected
Behavioral Questions
Functional Questions
Analytical/Brain Teasers
Technical Questions
Theoretical
SQL Basics
SQL Advanced
Python Basics
Python Assignment: Hands on
Statistical Questions
Data Visualization Questions
Power BI Questions
Basic ML Concepts
Business vs Data Analysts

Business Analyst:
Understands and solves a business problem
Validates business requirements
Mediates between the IT team and the stakeholders

Data Analyst:
Gathers and processes large data sets
Analyzes the collected data
Unravels useful business insights for the company
Types of questions expected
Behavioral questions
Functional questions
Technical questions
Analytical & Scenario based questions
Behavioral questions.
How to handle difficult clients & stakeholders
Any mistakes you made in the past, and how you rectified them
How to pitch an idea to higher management or seniors
Functional questions.
Why should we hire you?
What's your educational background?
What does your typical day look like?
Where do you see yourself in the next 2-5 years?
Analytical & Scenario based questions.
These are scenario-based questions, usually asked to test the candidate's analytical ability to resolve problems and puzzles.
Puzzle questions: Bag of coins
You have 10 bags full of coins, each holding an unlimited number of coins. One of the 10 bags contains fake coins, and you don't remember which one. Genuine coins weigh 1 gram each and fake coins weigh 1.1 grams each. Identify the bag with the fake coins in the minimum number of readings.
Note: You have a digital weighing machine.
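The slide does not give the answer. One well-known approach needs a single reading: take 1 coin from bag 1, 2 coins from bag 2, and so on up to 10 coins from bag 10, then weigh them together. If every coin were genuine the reading would be 55 g, and each fake coin adds 0.1 g, so the excess over 55 g identifies the bag. The sketch below simulates this; the function names are illustrative, and weights are kept in tenths of a gram to avoid floating-point rounding.

```python
def weigh_sample(fake_bag: int, bags: int = 10) -> int:
    """Simulate the single reading, in tenths of a gram.

    Bag i contributes i coins. A genuine coin weighs 10 tenths, a fake one 11.
    """
    total = 0
    for bag in range(1, bags + 1):
        per_coin = 11 if bag == fake_bag else 10
        total += bag * per_coin
    return total


def find_fake_bag(reading_tenths: int, bags: int = 10) -> int:
    """Recover the fake bag number from the single reading."""
    genuine_total = 10 * bags * (bags + 1) // 2  # 550 tenths (55 g) for 10 bags
    # Every fake coin adds one tenth of a gram, and bag i contributed i coins.
    return reading_tenths - genuine_total


reading = weigh_sample(fake_bag=7)
print("reading:", reading / 10, "g; fake bag:", find_fake_bag(reading))
```

The design choice is to make each bag's contribution to the reading unique, so one measurement carries enough information to single out the bag.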
Puzzle questions: Shake Hands
At a party, everyone shook hands with everybody else. There were 66 handshakes in total. How many people were at the party?
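The slide leaves the answer open. With n people, each pair shakes hands once, giving n(n-1)/2 handshakes, so the puzzle reduces to solving n(n-1)/2 = 66. A small sketch of that arithmetic follows; the function names are illustrative.

```python
import math


def handshakes(people: int) -> int:
    """Number of handshakes when every pair of people shakes hands once."""
    return people * (people - 1) // 2


def people_for(total_handshakes: int) -> int:
    """Solve n*(n-1)/2 = total for a whole number n, or raise ValueError."""
    # Quadratic formula: n = (1 + sqrt(1 + 8 * total)) / 2
    disc = 1 + 8 * total_handshakes
    root = math.isqrt(disc)
    if root * root != disc:
        raise ValueError("no whole number of people gives this total")
    return (1 + root) // 2


print(people_for(66))
```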
Technical questions.
Theoretical & Practical
Basics of Python & Statistics
SQL, Advanced SQL
Dashboard Building
Power BI/Tableau
Basics of Machine Learning & AI
Types of SQL Statements
Data definition (DDL): CREATE, ALTER, DROP
Data manipulation (DML): DELETE, INSERT, SELECT, UPDATE
Control: transaction control statements (COMMIT, ROLLBACK, SET TRANSACTION) and session control statements (ALTER SESSION, SET ROLE, etc.)

Theoretical Questions
Experience with technical & functional documents
BAs are responsible for the documentation and the initial requirement gathering for the project, so they need to create documents such as Stakeholder Analysis and Scope Statement and, if required, also work on Competitor Analysis.

How do you convey complex, technical information to non-tech stakeholders?
The way you answer will showcase your communication skills, so you have to be relatable and able to create simple mockups and answers. Your storytelling skills are going to be tested here.
Theoretical Questions
Describe your experience with UAT?
Planning
Execution
Documentation
Evaluation
Reporting & Lessons Learned

PaaS, SaaS, IaaS
Platform as a service: allows developers to build apps over the internet
Software as a service: a third party hosts applications and gives access to them
Infrastructure as a service: a form of cloud computing that provides virtual computing resources through the internet

After researching, you come across two possible solutions: one is cloud based and the other is on-premises. Which one would you recommend and why?
No concrete answer.....
Theoretical Questions
Which visualization tools have you used?
Power BI
Tableau
QlikSense
Alteryx
MS Excel

How to build a predictive model?
Give examples and explain, and be ready to tackle questions related to the use case you explain.

Explain one of the ML use cases you worked on
Based on your resume
SQL Basics.
1. Retrieve all the records from a table
2. Retrieve two columns from a table
3. Retrieve customers with age more than 25
4. Retrieve customers with age more than 25 and from Mumbai
5. Retrieve customers with age more than 25 OR churn = Y
6. Retrieve customers with age between 15 and 35
7. Retrieve customers staying in Delhi or Mumbai
8. Find the total number of records in the table
9. Insert a customer into the table
10. Update the customer whose customer id is 7 as Churned
11. Find the minimum and maximum age from the list of customers
Answers
1. select * from tablename
2. select column_a, column_b from tablename
3. select * from tablename where age > 25
4. select * from tablename where age > 25 and city = 'Mumbai'
5. select * from tablename where age > 25 or churn = 'Y'
6. select * from tablename where age between 15 and 35
7. select * from tablename where city in ('Delhi', 'Mumbai')
8. select count(*) from tablename
9. insert into tablename values (...)
10. update tablename set churn = 'Y' where cust_id = 7
11. select min(age), max(age) from tablename
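To try a few of these answers without a database server, the sketch below runs them against an in-memory SQLite database from Python. The `customers` table, its columns (`cust_id`, `name`, `age`, `city`, `churn`) and the sample rows are assumptions made up for illustration; the slides only refer to a generic `tablename`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "cust_id INTEGER PRIMARY KEY, name TEXT, age INTEGER, city TEXT, churn TEXT)"
)
# Hypothetical sample rows, not taken from the slides.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Asha", 22, "Mumbai", "N"),
        (2, "Ravi", 31, "Delhi", "N"),
        (3, "Meena", 45, "Mumbai", "Y"),
        (7, "Kiran", 19, "Pune", "N"),
    ],
)


def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


older_in_mumbai = run("SELECT name FROM customers WHERE age > 25 AND city = 'Mumbai'")
delhi_or_mumbai = run(
    "SELECT name FROM customers WHERE city IN ('Delhi', 'Mumbai') ORDER BY cust_id"
)
total_rows = run("SELECT COUNT(*) FROM customers")[0][0]

# Mark customer 7 as churned, then read the flag back.
conn.execute("UPDATE customers SET churn = 'Y' WHERE cust_id = 7")
churn_of_7 = run("SELECT churn FROM customers WHERE cust_id = 7")[0][0]

min_age, max_age = run("SELECT MIN(age), MAX(age) FROM customers")[0]
print(older_in_mumbai, delhi_or_mumbai, total_rows, churn_of_7, min_age, max_age)
```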
SQL Advanced.
Facebook
Let's say you have a table with date, wifi id, wifi speed, latency, and country.
1. Calculate the average download speed per wifi for a particular date
2. Same question, but show it for the last 7 days
SQL Advanced.
Answers
1. select avg(wifi_speed), wifi_id from tablename where date = 'specified_date' group by wifi_id
2. select date, avg(wifi_speed), wifi_id from tablename where date >= date_add(sysdate(), interval -7 day) group by date, wifi_id order by date desc
SQL Advanced
Facebook (Source: Leetcode)
Write a SQL query to find the cancellation rate of requests with unbanned users (both client and driver must not be banned) each day between "2013-10-01" and "2013-10-03".
The cancellation rate is computed by dividing the number of canceled (by client or driver) requests with unbanned users by the total number of requests with unbanned users on that day.
Return the result table in any order. Round Cancellation Rate to two decimal points.

Trips table:
Id is the primary key for this table.
The table holds all taxi trips. Each trip has a unique Id, while Client_Id and Driver_Id are foreign keys to the Users_Id at the Users table.
Status is an ENUM type of ('completed', 'cancelled_by_driver', 'cancelled_by_client').

Users table:
Users_Id is the primary key for this table.
The table holds all users. Each user has a unique Users_Id, and Role is an ENUM type of ('client', 'driver', 'partner').
Banned is an ENUM type of ('Yes', 'No').
SQL Advanced
Facebook (Source: Leetcode)
Solution
with r1 as (
select sum(case when t.status like 'cancelled%' then 1 else 0 end) as sum_canc, count(*) as tot_trips, t.request_at
from trips t
where client_id not in (select users_id from users where banned = 'Yes')
and driver_id not in (select users_id from users where banned = 'Yes')
and t.request_at between '2013-10-01' and '2013-10-03'
group by t.request_at
)
select r1.request_at as Day, round(sum_canc/tot_trips, 2) as "Cancellation Rate" from r1
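A quick way to sanity-check the cancellation-rate logic is to run it on a tiny data set. The sketch below uses Python's built-in sqlite3 module with made-up rows (the sample users and trips are assumptions, not Leetcode's data). Because SQLite performs integer division on integers, the query multiplies by 1.0 before dividing, a small adaptation of the MySQL solution above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (users_id INTEGER PRIMARY KEY, banned TEXT, role TEXT);
    CREATE TABLE trips (id INTEGER PRIMARY KEY, client_id INTEGER,
                        driver_id INTEGER, status TEXT, request_at TEXT);
    -- Hypothetical sample data: user 2 is a banned client.
    INSERT INTO users VALUES (1, 'No', 'client'), (2, 'Yes', 'client'),
                             (10, 'No', 'driver'), (11, 'No', 'driver');
    INSERT INTO trips VALUES
      (1, 1, 10, 'completed',           '2013-10-01'),
      (2, 2, 10, 'cancelled_by_client', '2013-10-01'),
      (3, 1, 11, 'cancelled_by_driver', '2013-10-01'),
      (4, 1, 10, 'completed',           '2013-10-02'),
      (5, 1, 11, 'cancelled_by_client', '2013-10-03'),
      (6, 1, 10, 'completed',           '2013-10-03'),
      (7, 1, 10, 'completed',           '2013-10-03');
    """
)

QUERY = """
SELECT t.request_at AS day,
       ROUND(1.0 * SUM(CASE WHEN t.status LIKE 'cancelled%' THEN 1 ELSE 0 END)
             / COUNT(*), 2) AS cancellation_rate
FROM trips t
WHERE t.client_id NOT IN (SELECT users_id FROM users WHERE banned = 'Yes')
  AND t.driver_id NOT IN (SELECT users_id FROM users WHERE banned = 'Yes')
  AND t.request_at BETWEEN '2013-10-01' AND '2013-10-03'
GROUP BY t.request_at
ORDER BY t.request_at
"""

rates = conn.execute(QUERY).fetchall()
for day, rate in rates:
    print(day, rate)
```

Trip 2 is excluded because its client is banned, so the first day counts one cancellation out of two trips.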
SQL Advanced
Facebook (Source: Leetcode)
A company's executives are interested in seeing who earns the most money in each of the company's departments. A high earner in a department is an employee who has a salary in the top three unique salaries for that department.
Write an SQL query to find the employees who are high earners in each of the departments.
Return the result table in any order.

Employee table:
Id is the primary key for this table.
Each row contains the ID, name, salary, and department of one employee.

Department table:
Id is the primary key for this table.
Each row contains the ID and the name of one department.
SQL Advanced
Facebook (Source: Leetcode)
Solution
SELECT
d.Name AS 'Department', e1.Name AS 'Employee', e1.Salary
FROM
Employee e1
JOIN
Department d ON e1.DepartmentId = d.Id
WHERE
3 > (SELECT
COUNT(DISTINCT e2.Salary)
FROM
Employee e2
WHERE
e2.Salary > e1.Salary
AND e1.DepartmentId = e2.DepartmentId
)
;
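The correlated subquery keeps an employee when fewer than three distinct salaries in the same department are higher than theirs. The sketch below runs the same query on a small in-memory SQLite database to show the effect; the employee and department rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Department (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Name TEXT,
                           Salary INTEGER, DepartmentId INTEGER);
    -- Hypothetical sample data.
    INSERT INTO Department VALUES (1, 'IT'), (2, 'Sales');
    INSERT INTO Employee VALUES
      (1, 'Joe',   85000, 1),
      (2, 'Henry', 80000, 2),
      (3, 'Sam',   60000, 2),
      (4, 'Max',   90000, 1),
      (5, 'Janet', 69000, 1),
      (6, 'Randy', 85000, 1),
      (7, 'Will',  70000, 1);
    """
)

QUERY = """
SELECT d.Name AS Department, e1.Name AS Employee, e1.Salary
FROM Employee e1
JOIN Department d ON e1.DepartmentId = d.Id
WHERE 3 > (SELECT COUNT(DISTINCT e2.Salary)
           FROM Employee e2
           WHERE e2.Salary > e1.Salary
             AND e1.DepartmentId = e2.DepartmentId)
"""

high_earners = {(dept, name) for dept, name, _salary in conn.execute(QUERY)}
print(sorted(high_earners))
```

Janet is left out because three distinct IT salaries (90000, 85000, 70000) are higher than hers, while Joe and Randy both qualify since ties share a salary level.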
SQL Advanced
Facebook (Source: Leetcode)
Write a SQL query to get the nth highest salary from the Employee table.
SQL Advanced
Facebook (Source: Leetcode)
Solution
CREATE FUNCTION getNthHighestSalary(N INT)
RETURNS INT
BEGIN
RETURN (
select distinct salary as getNthHighestSalary from
(select dense_rank() over (order by salary desc) rnk,
salary from employee) x
where x.rnk = N
);
END
SQL Advanced
Facebook (Source: Leetcode)
Write a SQL query to find employees who have the highest salary in each of the departments.
SQL Advanced
Facebook (Source: Leetcode)
Solution
SELECT
Department.name AS 'Department',
Employee.name AS 'Employee',
Salary
FROM
Employee
JOIN
Department ON Employee.DepartmentId = Department.Id
WHERE
(Employee.DepartmentId, Salary) IN
( SELECT
DepartmentId, MAX(Salary)
FROM
Employee
GROUP BY DepartmentId
)
;
SQL Advanced
Facebook (Source: Leetcode)
Write an SQL query to find all numbers that appear at least three times consecutively.
Return the result table in any order.

Solution
SELECT DISTINCT
l1.Num AS ConsecutiveNums
FROM
Logs l1,
Logs l2,
Logs l3
WHERE
l1.Id = l2.Id - 1
AND l2.Id = l3.Id - 1
AND l1.Num = l2.Num
AND l2.Num = l3.Num
;
SQL Advanced
Facebook (Source: Leetcode)
Write a SQL query to rank scores. If there is a tie between two scores, both should have the same ranking. Note that after a tie, the next ranking number should be the next consecutive integer value. In other words, there should be no "holes" between ranks.

Solution:
select score, dense_rank() over (order by score desc) as 'Rank'
from Scores
order by score desc
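To make the "no holes" rule concrete, here is a small pure-Python sketch of what a dense ranking computes. It is an illustration of the ranking rule only, not of how a database implements DENSE_RANK, and the function name is made up.

```python
def dense_rank(scores):
    """Return (score, rank) pairs, highest score first, with no gaps after ties."""
    ordered = sorted(scores, reverse=True)
    ranks = {}
    for score in ordered:
        if score not in ranks:
            # A new distinct score gets the next consecutive rank.
            ranks[score] = len(ranks) + 1
    return [(score, ranks[score]) for score in ordered]


for score, rank in dense_rank([3.5, 3.65, 4.0, 3.85, 4.0, 3.65]):
    print(score, rank)
```

Tied scores share a rank, and the rank after a tie is the next integer, so the ranks here run 1, 1, 2, 3, 3, 4.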
SQL Advanced
Facebook (Source: Leetcode)
Write a SQL query to get the second highest salary from the Employee table.

Solution
SELECT
(SELECT DISTINCT
Salary
FROM
Employee
ORDER BY Salary DESC
LIMIT 1 OFFSET 1) AS SecondHighestSalary
;
SQL Advanced
Facebook (Source: Leetcode)
Write an SQL query to display the records with three or more rows with consecutive ids, where the number of people is greater than or equal to 100 for each.
Return the result table ordered by visit_date in ascending order.

Stadium table:
visit_date is the primary key for this table.
Each row of this table contains the visit date and visit id to the stadium with the number of people during the visit.
No two rows will have the same visit_date, and as the id increases, the dates increase as well.
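SQL Advanced
Facebook (Source: Leetcode)
Solution
select distinct t1.*
from stadium t1, stadium t2, stadium t3
where t1.people >= 100 and t2.people >= 100 and t3.people >= 100
and ((t1.id = t2.id - 1 and t2.id = t3.id - 1)
or (t2.id = t1.id - 1 and t1.id = t3.id - 1)
or (t3.id = t2.id - 1 and t2.id = t1.id - 1))
order by t1.visit_date
;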
Python Basics
Deep copy:
It constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
It copies the nested objects themselves rather than the references to them, so the copy is independent of the original.
Changes made to the original won't affect the deep copy.
It makes execution of the program slower, because a copy is made of every nested object.

Shallow copy:
It constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
It copies the reference pointers rather than the values they point to.
These references point to the original objects, so changes made to a nested mutable member will also show up in the original.
It allows faster execution of the program; the cost depends on the size of the data that is used.
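The difference shows up as soon as a nested mutable object changes. The sketch below uses the standard library's `copy.copy` and `copy.deepcopy`; the `original` dictionary is made-up sample data.

```python
import copy

original = {"name": "Ram", "scores": [80, 90]}

shallow = copy.copy(original)    # new dict, but the same inner list object
deep = copy.deepcopy(original)   # new dict and a new inner list

# Mutate the nested list through the original.
original["scores"].append(100)
# Rebind a top-level key in the original only.
original["name"] = "Shyam"

print(shallow["scores"])  # shares the list with the original
print(deep["scores"])     # unaffected by the change
print(shallow["name"])    # top-level rebinding does not leak into the copy
```

The shallow copy sees the appended score because it holds a reference to the same list, while the deep copy keeps its own list.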
Python Basics
Map → Utility function that maps a collection to another collection object based on a given function.
map(function, iterable object)
For example, if we have a list of people like:
firstname = ["Ram", "Shyam", "Vinay", "Gopal"]
Map the list to obtain the names in upper case:
list(map(lambda x:x.upper(), firstname))
Filter → Similar function, but it requires the function to test a condition and then returns only those elements from the collection that satisfy the condition.
Reduce → An operation that breaks down the entire process into pair-wise operations and uses the result from each operation with the successive element.
Video for reference: https://www.youtube.com/watch?v=kr9QnYf2zK4
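A short sketch of all three on the same list may help. In Python 3, `reduce` lives in `functools`; the filter condition and the reduction chosen here are illustrative assumptions, not part of the slide.

```python
from functools import reduce

firstname = ["Ram", "Shyam", "Vinay", "Gopal"]

# map: apply a function to every element.
upper_names = list(map(lambda x: x.upper(), firstname))

# filter: keep only the elements for which the function returns True.
short_names = list(filter(lambda x: len(x) <= 3, firstname))

# reduce: fold the list pair-wise into one value, starting from 0.
total_letters = reduce(lambda acc, name: acc + len(name), firstname, 0)

print(upper_names, short_names, total_letters)
```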
Python Basics
What is the Lambda Function?
Lambda functions are anonymous (nameless) functions.
They are called anonymous because they are not declared in the standard manner with the def keyword. They don't require the return keyword either; the return is implicit in the function.
A lambda can have any number of parameters but only a single expression, whose value it returns. It cannot contain statements or multiple expressions.
In Python 3, print is a function, so a lambda can call print directly; in Python 2, print was a statement and could not appear in a lambda.
Lambda functions have their own local namespace. They can read variables from their parameter list, from enclosing function scopes, and from the global namespace.
Example: x = lambda i,j: i*j
print(x(2,3))
Output: 6
Python Basics
Lists vs Tuple

Python Basics
What is the difference between pass, continue & break?
Pass: It is used when a block of code is required syntactically but you want nothing to be executed. This is basically a null operation. Nothing happens when it is executed.
Continue: It skips the rest of the current iteration when some specific condition is met, and control is transferred to the beginning of the loop. The loop does not terminate but continues with the next iteration.
Break: It terminates the loop when some condition is met, and control flows to the statement immediately after the body of the loop. If the break statement is inside a nested loop (a loop inside another loop), it terminates only the innermost loop.
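The three statements can be seen side by side in one loop. The function below is a made-up example: it ignores even numbers, stops at the first odd number greater than 7, and uses `pass` as a do-nothing placeholder.

```python
def scan(numbers):
    """Collect odd numbers until one greater than 7 is reached."""
    seen = []
    for n in numbers:
        if n < 0:
            pass        # placeholder: nothing happens, execution falls through
        if n % 2 == 0:
            continue    # skip the rest of this iteration
        if n > 7:
            break       # leave the loop entirely
        seen.append(n)
    return seen


print(scan([1, 2, -3, 4, 5, 9, 11]))
```

Here 2 and 4 are skipped by `continue`, -3 passes through `pass` and is still collected, and the loop ends at 9, so 11 is never examined.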
Python Basics
What does enumerate() function do?
The enumerate() function pairs each item in an iterable object with an index that can be used to reference the item later. It makes it easier to keep track of the content of an iterable object.
Example:
list2 = ["apple", "ball", "cat"]
e1 = enumerate(list2)
print(list(e1))
Output: [(0, 'apple'), (1, 'ball'), (2, 'cat')]
Python Assignment
Data
Questions
Statistical Questions
What's your knowledge of statistics, and how have you used it in your work?
If possible, describe how you've used stats to solve a problem.
Example: You can tell them how you imputed the null values and what statistical strategies you took, and you can explain how to perform EDA, categorical vs numerical analysis, etc. (Don't memorize answers; try to answer as openly from personal experience as possible.)

Which step of a data analysis project did you enjoy the most?
It's situational and depends on personal interests.
Example: I have enjoyed performing the EDA steps, so that I can learn more about the data; I have performed categorical vs numerical analysis, correlation on numerical variables, and so on.
Dashboard Building Questions
What's your experience in creating dashboards?
Try explaining to the interviewer how dashboards are important, and how we can capture the KPIs and metrics, make them visually appealing, and track business goals. You can then explain one of the tools that you have used (it could be Power BI, MS Excel, Tableau, or anything else) and, if possible, talk about some of the features of the BI tool you used.

Explain some of the important charts, and where to use them
There are many basic charts you should be aware of, such as area charts, bar charts, column charts, doughnut charts, pie charts, gauge charts, etc.
Also explain which chart should be used for which particular scenario.
Data Visualization Charts

Power BI Questions
FILTER FUNCTION
Let's say you want to find the total count of customers present in Mumbai:
Count_Mumbai = CALCULATE(COUNT('table'[CUST_ID]), FILTER('table', 'table'[CITY]="MUMBAI"))
FILTER acts as the WHERE clause.

How would you create trailing X month metrics via DAX against a non-standard calendar?
The solution will involve:
1. CALCULATE function to control (take over) filter context of measures.
2. ALL to remove existing filters on the date dimension.
3. FILTER to identify which rows of the date dimension to use.
Alternatively, CONTAINS may be used:
CALCULATE(FILTER(ALL('DATE'),…….))
Power BI Questions
Can we have more than one active relationship between two tables in the data model of Power Pivot?
No, we cannot have more than one active relationship between two tables. However, we can have more than one relationship between two tables, but only one of them will be active and the rest inactive. Dotted lines are inactive relationships and continuous lines are active.

Explain when do you use SUMX() instead of SUM()?
When the expression to sum consists of anything other than a column name, typically when you want to add or multiply the values in different columns:
SUMX(Orderline, Orderline[quantity] * Orderline[price])
SUMX() first creates a row context over the Orderline table. It then iterates through this table one row at a time. SUM() is optimized for reducing over column segments and as such is not an iterator.
Power BI Questions
Name any 3 most useful text functions in DAX?
The text functions in DAX include the following:
CONCATENATE
REPLACE
SEARCH
UPPER
FIXED

What is the difference between DISTINCT() and VALUES() in DAX?
Both return the distinct values, but VALUES() also returns a possible implicit virtual blank row caused by non-matching values in a child table. This blank row usually appears in a dimension table.

Which function should you use rather than COUNTROWS(DISTINCT())?
DISTINCTCOUNT()
Power BI Questions
What is Power Query Editor?
Power Query is an ETL tool used to shape, clean, and transform data using intuitive interfaces without having to write code. It helps the user to:
Import data from a wide range of sources, including files, databases, big data, social media data, etc.
Join and append data from multiple data sources.
Shape data as per requirement by removing and adding data.

Common Power Query transforms?
Changing data types, filtering rows, choosing/removing columns, grouping, splitting a column into multiple columns, adding new columns, etc.
Basic ML Questions
How to handle missing values & null values?
One of the following methods can be applied while handling missing values, depending on the nature of the data and the data type we are working with:
Drop rows with more than 50% missing values
Impute missing values with central tendency measures such as mean or median
For categorical attributes, impute missing values with the most frequent category (mode)
Predict the missing values with regression or classification models
Use machine learning algorithms such as K-NN that support missing values while making a prediction
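The first three strategies can be sketched with the Python standard library alone. The helper names, the use of `None` for a missing value, and the sample columns below are assumptions for illustration; in practice a data-frame library would usually be used instead.

```python
from statistics import mean, median, mode


def impute(values, strategy):
    """Replace None entries with the mean, median, or mode of the observed values."""
    observed = [v for v in values if v is not None]
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in values]


def drop_sparse_rows(rows, threshold=0.5):
    """Keep only rows whose share of missing values does not exceed the threshold."""
    return [r for r in rows if sum(v is None for v in r) / len(r) <= threshold]


# Hypothetical sample data.
ages = [25, None, 35, 30, None]
cities = ["Mumbai", "Delhi", None, "Mumbai"]
rows = [(1, None, None), (2, 3, None)]

print(impute(ages, "mean"))
print(impute(ages, "median"))
print(impute(cities, "mode"))
print(drop_sparse_rows(rows))
```

Mean and median suit numerical columns such as `ages`, while the mode is the natural choice for a categorical column such as `cities`.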
Basic ML Concepts
Explain univariate, bivariate & multivariate analysis
Univariate analysis: Univariate analysis is a form of data analysis where a single variable is analyzed to describe it and find patterns that exist within it. It is the simplest form of data analysis, as it doesn't deal with causes or relationships.
Bivariate analysis: Bivariate analysis measures the correlation between two variables. This technique is used by researchers when they aim to draw comparisons between two variables.
Multivariate analysis: Multivariate analysis is used to study complex data sets. In this form of analysis, a dependent variable is represented in terms of several independent variables, using the available observations to establish such a relationship.
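As a rough illustration of the first two, the sketch below summarizes one variable on its own (univariate) and then measures the Pearson correlation between two variables (bivariate). The `age` and `spend` values are made-up sample data and the `pearson` helper is written out by hand; multivariate analysis is not covered here.

```python
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical sample data.
age = [20, 30, 40, 50]
spend = [200, 300, 400, 500]

# Univariate: describe a single variable.
print("mean age:", mean(age), "stdev:", stdev(age))

# Bivariate: relationship between two variables.
print("correlation:", pearson(age, spend))
```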
Basic ML Concepts
Different types of Machine learning techniques?
Supervised Learning: Dealing with labeled data.
Unsupervised Learning: Dealing with unlabeled data.
Reinforcement Learning: Decisions are made by the system based on the feedback it receives for its actions. In this approach, the algorithm learns from its mistakes and improves to return better results over time.

Explain any of the Machine Learning projects you did
Talk about any project you did in the past.
Always explain 4 major things:
1. Business Understanding: What the use case is all about and what business problem it is solving
2. Exploratory Data Analysis: Share some important insights that you got by performing EDA steps
3. Model Building: Talk about the algorithms you used.
4. Model Deployment
More advanced questions. Stay tuned!!