BA Questions

The document outlines various analytical approaches and metrics for optimizing business operations in supermarkets, fast-food restaurants, and loan prediction. It discusses the importance of data collection, analysis techniques, and algorithms like Decision Trees and K-Means clustering for predictive modeling and customer segmentation. Additionally, it touches on recommendation systems, statistical analysis, and the significance of sampling in data analysis, particularly in the context of COVID-19 diagnosis.

Uploaded by

sk24msg1r43

PREVIOUS YEAR PAPERS

1. Business Analytics for Supermarket Optimization

Q1a. How might business analytics help the supermarket? What data would be
needed to facilitate good decisions?

a) Business analytics can help the supermarket by analyzing historical transaction
data, foot traffic patterns, and employee shift schedules to identify peak hours and
optimize staffing decisions. Predictive modeling can be used to anticipate
high-demand periods and allocate resources accordingly.

Data Required:

• Transaction timestamps
• Number of active registers
• Customer entry and exit times
• Sales volume by day and hour
• Employee shift schedules

Example: If analytics indicate that Mondays from 6 PM to 8 PM experience high
checkout congestion, the supermarket can proactively assign more employees to
checkout duty during that period.
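
As a minimal sketch of how such peak periods could be found, the snippet below counts
transactions per weekday and hour. The timestamps and the `busiest_slots` helper are
made up for illustration; a real analysis would read them from the point-of-sale log.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical checkout timestamps (assumption, not data from the question).
timestamps = [
    "2024-03-04 18:05", "2024-03-04 18:40", "2024-03-04 19:15",
    "2024-03-04 19:50", "2024-03-05 10:20", "2024-03-05 18:30",
    "2024-03-06 12:10", "2024-03-11 18:45", "2024-03-11 19:05",
]

def busiest_slots(stamps, top=2):
    """Count transactions per (weekday, hour) slot and return the busiest ones."""
    counts = Counter()
    for stamp in stamps:
        t = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        counts[(DAYS[t.weekday()], t.hour)] += 1
    return counts.most_common(top)

peaks = busiest_slots(timestamps)
for (day, hour), n in peaks:
    print(f"{day} {hour:02d}:00-{hour + 1:02d}:00: {n} transactions")
```

With this toy data the two busiest slots are the Monday 6 PM and 7 PM hours, which is
the kind of signal that would justify extra checkout staff.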

Q1b. Do you agree with the statement "Data Scientist: The Sexiest Job of the
21st Century"? Justify.

b) Yes. The statement, which comes from Davenport and Patil's 2012 Harvard Business
Review article "Data Scientist: The Sexiest Job of the 21st Century", is justified by
the increasing reliance on data-driven decision-making. The demand for skilled data
scientists continues to rise, making it one of the most sought-after careers.

2. Metrics for Fast-Food Restaurant Management

Q2a. Suggest some metrics a fast-food restaurant manager might want to collect.
How might the manager use the data to facilitate better decisions?

a) Key metrics for a fast-food restaurant include:

• Average Order Time: Helps in optimizing service speed.
• Customer Satisfaction Scores: Provides insights into service quality.
• Inventory Turnover Rate: Ensures efficient stock management.
• Revenue per Hour: Determines peak business periods.
• Employee Efficiency: Tracks productivity levels.

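Managers can use these metrics to streamline operations, reduce waste, and improve
customer experience, for example by scheduling more staff in the hours where revenue
per hour peaks or by reordering stock according to the inventory turnover rate.

Q2b. How do you perform exploratory analysis using R? What package do you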
use and how do you analyze the data through exploratory analysis? What
descriptive techniques might you want to use?

b) In R, exploratory data analysis (EDA) can be performed using packages like
ggplot2 and dplyr, both of which are part of the tidyverse. Descriptive techniques
include summary statistics (mean, median, quartiles, standard deviation), box plots,
histograms, and correlation matrices.

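Example Code:

library(ggplot2)
library(dplyr)
# mtcars (a data set built into R) stands in for the real data here
data <- mtcars
data %>% summary()                 # summary statistics for every column
ggplot(data, aes(x = mpg)) +       # histogram of one numeric variable
  geom_histogram(bins = 10)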

3. Loan Default Prediction & Customer Segmentation

Q3a. How do you build an algorithm that predicts loan defaulters in advance?
Should you use a Decision Tree or a regression model? What will be the
procedure?

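a) Defaulting is a yes/no outcome, so this is a classification problem. A Decision
Tree or Logistic Regression can be used; ordinary linear regression is not suitable
because it predicts a continuous value. The steps include:

• Data Collection (Loan history, credit score, income, etc.)
• Data Cleaning (Handling missing values, standardization)
• Feature Selection
• Model Training using Decision Tree
• Evaluation on held-out data using accuracy and recall (recall matters because a
  missed defaulter is the costly error)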
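
The training and evaluation steps can be sketched with the smallest possible decision
tree: a single split (a "stump") chosen by Gini impurity. The applicant records and
helper names below are hypothetical, and for brevity the stump is evaluated on the same
rows it was trained on; a real project would use a held-out test set and a full tree
library.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, feature):
    """Return (threshold, weighted impurity) of the best split on one feature."""
    values = sorted({r[feature] for r in rows})
    best = (None, float("inf"))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [r["default"] for r in rows if r[feature] <= threshold]
        right = [r["default"] for r in rows if r[feature] > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if score < best[1]:
            best = (threshold, score)
    return best

def majority(labels):
    """Most common label in a leaf (ties go to 1 = default)."""
    return 1 if sum(labels) * 2 >= len(labels) else 0

# Hypothetical applicants: credit score and whether they defaulted (1) or not (0).
scores = [520, 560, 580, 600, 610, 640, 650, 700, 730]
labels = [1, 1, 1, 1, 0, 0, 1, 0, 0]
loans = [{"credit_score": s, "default": d} for s, d in zip(scores, labels)]

threshold, impurity = best_split(loans, "credit_score")
left_label = majority([r["default"] for r in loans if r["credit_score"] <= threshold])
right_label = majority([r["default"] for r in loans if r["credit_score"] > threshold])

actual = [r["default"] for r in loans]
predicted = [left_label if r["credit_score"] <= threshold else right_label for r in loans]

tp = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
fn = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))
accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
recall = tp / (tp + fn)
print(f"split at {threshold}, accuracy={accuracy:.2f}, recall={recall:.2f}")
```

On this toy data the stump splits at a credit score of 605 and misses one of the five
defaulters, which shows why recall is reported alongside accuracy.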

Q3b. A marketing team wants to target its set of customers and use an algorithm
that can divide them. What algorithm would you suggest and explain the steps
involved?

b) For customer segmentation, K-Means clustering is suitable. Steps:

• Collect customer purchase history
• Normalize data
• Determine optimal clusters using the Elbow method
• Apply K-Means and analyze cluster characteristics
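
A minimal sketch of these steps in plain Python follows. The customer figures (annual
spend, visits per month) are invented, and the centroids are initialised from the first
k points only to keep the example deterministic; production code would normally use a
library implementation with better initialisation.

```python
def normalize(points):
    """Min-max scale every column to the range [0, 1]."""
    cols = list(zip(*points))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        tuple((v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(p, lo, hi))
        for p in points
    ]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = list(points[:k])  # naive, deterministic initialisation (assumption)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, inertia

# Hypothetical customers: (annual spend, visits per month).
customers = [(200, 2), (250, 3), (220, 2), (900, 10), (950, 12), (1000, 11)]
scaled = normalize(customers)

# Elbow method: watch how the within-cluster sum of squares drops as k grows.
inertias = {k: k_means(scaled, k)[1] for k in (1, 2, 3)}
labels, _ = k_means(scaled, 2)
print(inertias)
print(labels)
```

The large drop in inertia from k = 1 to k = 2, followed by a much smaller drop at
k = 3, is the "elbow" that suggests two segments for this toy data.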

4. Recommendation Systems & Sampling Challenges

Q4a. How do recommendations appear on platforms like Amazon or Netflix?
Explain the process and algorithm used.

a) Amazon and Netflix use collaborative filtering and content-based filtering for
recommendations. The process:

• Collect user interactions (views, purchases, ratings)
• Compute similarity scores (user-based or item-based)
• Generate recommendations based on preferences
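
The item-based variant of collaborative filtering can be sketched as below. The users,
titles, and ratings are made up, and cosine similarity with a similarity-weighted
average is only one common choice; it is not a description of what Amazon or Netflix
actually run.

```python
from math import sqrt

# Hypothetical ratings (1-5); a missing entry means the user has not rated the item.
ratings = {
    "ana":   {"Inception": 5, "Interstellar": 4, "Notebook": 1},
    "ben":   {"Inception": 4, "Interstellar": 5},
    "chloe": {"Notebook": 5, "Titanic": 4, "Inception": 1},
    "dev":   {"Inception": 5, "Titanic": 1},
}

def item_vector(item):
    """Ratings for one item, keyed by user."""
    return {user: r[item] for user, r in ratings.items() if item in r}

def cosine(a, b):
    """Cosine similarity of two item vectors, treating missing ratings as 0."""
    dot = sum(a[u] * b[u] for u in set(a) & set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(user):
    """Score each unseen item by a similarity-weighted average of the user's ratings."""
    seen = ratings[user]
    all_items = {item for r in ratings.values() for item in r}
    scored = []
    for item in sorted(all_items - set(seen)):
        sims = {j: cosine(item_vector(item), item_vector(j)) for j in seen}
        total = sum(sims.values())
        if total > 0:
            score = sum(sims[j] * seen[j] for j in seen) / total
            scored.append((item, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

recommendations = recommend("dev")
print(recommendations)
```

For the user "dev", who liked Inception and disliked Titanic, the sketch ranks
Interstellar above Notebook because Interstellar's ratings resemble Inception's.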

Q4b. How important is sampling in data analysis? What challenges might arise
in diagnosing COVID-19 data, and how do you handle them?

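b) Sampling is crucial in data analysis because conclusions drawn from a sample hold
for the population only if the sample represents it. In COVID-19 diagnosis, challenges
include:

• Biased sample representation (for example, testing mostly symptomatic patients)
• Data collection errors
• Incomplete patient history

Stratified sampling addresses representation bias by drawing from every subgroup so
that the data stay diverse and representative; collection errors and incomplete
histories additionally call for validation checks and explicit handling of missing
values.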
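
A minimal sketch of stratified sampling follows: the same fraction is drawn from every
stratum, so small groups are not lost. The patient records, the `age_group` field, and
the 20% fraction are assumptions chosen for illustration.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(records, key, fraction, seed=0):
    """Draw the same fraction from every stratum (at least one record each)."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    strata = defaultdict(list)
    for rec in records:
        strata[rec[key]].append(rec)
    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample

# Hypothetical patient records, heavily skewed toward one age group.
patients = (
    [{"id": i, "age_group": "18-39"} for i in range(70)]
    + [{"id": i, "age_group": "40-64"} for i in range(70, 90)]
    + [{"id": i, "age_group": "65+"} for i in range(90, 100)]
)

sample = stratified_sample(patients, "age_group", fraction=0.2)
sample_counts = Counter(rec["age_group"] for rec in sample)
print(sample_counts)
```

Each age group keeps its 70/20/10 share in the sample, whereas a small simple random
sample could easily contain no patients aged 65+ at all.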

Q4c. You have run a classification model, namely logistic regression. How will
you present the effectiveness of the model to the business team?

c) Logistic Regression Model Effectiveness Report:

• Accuracy & Precision: Measure predictive performance
• ROC Curve (and the area under it, AUC): Shows how well the model separates the two
  classes across all probability thresholds
• Business Impact: Explain how the model helps reduce financial risk
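
The figures in such a report can be computed as sketched below. The outcomes and
predicted probabilities are invented, the 0.5 cut-off is an assumption, and the AUC is
calculated from its rank interpretation (the chance that a random positive case scores
higher than a random negative one) rather than by plotting the curve.

```python
def confusion(actual, predicted):
    """Return (tp, fp, fn, tn) for 0/1 labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)
    fp = sum(a == 0 and p == 1 for a, p in pairs)
    fn = sum(a == 1 and p == 0 for a, p in pairs)
    tn = sum(a == 0 and p == 0 for a, p in pairs)
    return tp, fp, fn, tn

def auc(actual, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical true outcomes and model probabilities.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
probs = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
predicted = [1 if p >= 0.5 else 0 for p in probs]  # 0.5 cut-off is an assumption

tp, fp, fn, tn = confusion(actual, predicted)
accuracy = (tp + tn) / len(actual)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
area = auc(actual, probs)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} AUC={area:.4f}")
```

For a business audience, the confusion-matrix counts translate directly into money:
each false negative is a risk that was missed, each false positive a customer wrongly
flagged.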

5. Health Data Analysis & ANOVA Table

Q5a. Given patient health data, what kind of algorithm can you use to predict
fever? Why? Explain the process.

a) Given patient health data, a Naïve Bayes classifier can be used for fever
prediction. Steps:

• Convert categorical variables to numerical
• Apply probability rules to classify fever occurrence
• Evaluate using confusion matrix
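
The probability step can be sketched with a small categorical Naïve Bayes classifier.
The symptom records are hypothetical, and this version counts category values directly
(with add-one Laplace smoothing) instead of first encoding them as numbers, which is a
simplification for illustration.

```python
from collections import Counter, defaultdict
from math import log

def train(rows, target):
    """Count classes and per-class feature values."""
    class_counts = Counter(r[target] for r in rows)
    feature_counts = defaultdict(Counter)  # (feature, class) -> value counts
    values = defaultdict(set)              # feature -> distinct values seen
    for r in rows:
        for feature, value in r.items():
            if feature != target:
                feature_counts[(feature, r[target])][value] += 1
                values[feature].add(value)
    return class_counts, feature_counts, values

def predict(model, row):
    """Pick the class with the highest log posterior under the naive assumption."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for cls, n_cls in class_counts.items():
        score = log(n_cls / total)  # log prior
        for feature, value in row.items():
            count = feature_counts[(feature, cls)][value]
            # Laplace smoothing avoids zero probabilities for unseen combinations.
            score += log((count + 1) / (n_cls + len(values[feature])))
        scores[cls] = score
    return max(scores, key=scores.get)

# Hypothetical patient records.
patients = [
    {"chills": "yes", "headache": "yes", "fever": "yes"},
    {"chills": "yes", "headache": "no",  "fever": "yes"},
    {"chills": "yes", "headache": "yes", "fever": "yes"},
    {"chills": "no",  "headache": "yes", "fever": "no"},
    {"chills": "no",  "headache": "no",  "fever": "no"},
    {"chills": "no",  "headache": "yes", "fever": "yes"},
    {"chills": "no",  "headache": "no",  "fever": "no"},
    {"chills": "yes", "headache": "no",  "fever": "no"},
]

model = train(patients, target="fever")
with_symptoms = predict(model, {"chills": "yes", "headache": "yes"})
without_symptoms = predict(model, {"chills": "no", "headache": "no"})
print(with_symptoms, without_symptoms)
```

Comparing such predictions with the known outcomes on held-out patients gives the
confusion matrix used for evaluation.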

Q5b. Construct an ANOVA table for a Multiple Regression Problem and
interpret the output (No. of Response Variables: 1, No. of Predictors: 4).

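b) ANOVA Table for Multiple Regression (illustrative values, n = 300 observations):

Source       SS     df    MS      F       P-value
Regression   1200   4     300     177.0   < 0.001
Residual     500    295   1.695   -       -
Total        1700   299   -       -       -

Here df(Regression) = 4 predictors, df(Residual) = n - 4 - 1 = 295, MS = SS/df, and
F = MS(Regression)/MS(Residual) = 300/1.695 ≈ 177.

Interpretation: A low p-value (< 0.05) indicates that at least one predictor significantly
affects the response variable. R² = 1200/1700 ≈ 0.71, so the model explains about 71%
of the variation in the response.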
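
The arithmetic behind an ANOVA table can be checked with a few lines of code. The
sketch below takes the sums of squares from the table, assumes n = 300 observations
(inferred from the total df of 299), and derives the mean squares, the F statistic, and
R². The p-value is left out because the standard library has no F distribution.

```python
def anova_table(ss_regression, ss_residual, n_obs, n_predictors):
    """Derive df, mean squares, F and R-squared for a multiple regression."""
    df_reg = n_predictors
    df_res = n_obs - n_predictors - 1
    ms_reg = ss_regression / df_reg
    ms_res = ss_residual / df_res
    return {
        "df": (df_reg, df_res, n_obs - 1),
        "MS": (ms_reg, ms_res),
        "F": ms_reg / ms_res,
        "R2": ss_regression / (ss_regression + ss_residual),
    }

table = anova_table(1200, 500, n_obs=300, n_predictors=4)
print(table)
```

With SS(Regression) = 1200 and SS(Residual) = 500, the mean squares are 300 and about
1.695, so F works out to 177 on (4, 295) degrees of freedom.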

6. Statistical Analysis Questions

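Q6a. Compute the 30th Percentile and Five-Number Summary for BA Quiz
Scores.

• Given Scores (n = 19): 95, 81, 81, 55, 68, 111, 88, 100, 94, 87, 65, 93,
  85, 79, 106, 92, 15, 67, 83
• Sorted: 15, 55, 65, 67, 68, 79, 81, 81, 83, 85, 87, 88, 92, 93, 94, 95, 100,
  106, 111
• 30th Percentile: location = 0.30 × (n + 1) = 0.30 × 20 = 6, so it is the 6th
  sorted value, 79
• Five-Number Summary (same location method): Min = 15, Q1 = 68 (5th value),
  Median = 85 (10th value), Q3 = 94 (15th value), Max = 111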
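
The percentile and the five-number summary can be computed as sketched below. The
`"exclusive"` method of `statistics.quantiles` uses the location p × (n + 1), which is
the textbook rule assumed here; other software (spreadsheet percentile functions, for
instance) interpolates differently and returns slightly different values.

```python
from statistics import quantiles

scores = [95, 81, 81, 55, 68, 111, 88, 100, 94, 87, 65, 93,
          85, 79, 106, 92, 15, 67, 83]

# Cut points at 10%, 20%, ..., 90%; index 2 is the 30th percentile.
deciles = quantiles(scores, n=10, method="exclusive")
p30 = deciles[2]

# Quartiles with the same p * (n + 1) location rule.
q1, median, q3 = quantiles(scores, n=4, method="exclusive")
five_number = (min(scores), q1, median, q3, max(scores))

print("30th percentile:", p30)
print("five-number summary:", five_number)
```

Under this rule the 30th percentile is 79 and the five-number summary is 15, 68, 85,
94, 111.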

Q6b. Dataset Analysis (Tablet Specifications Table)

• Elements: 8
• Variables: 5 (Cost, OS, Display Size, Battery Life, CPU Manufacturer)
• Categorical: OS, CPU Manufacturer
• Quantitative: Cost, Display Size, Battery Life

Q6c. Compute Sample Covariance and Correlation.

• Given X = [4, 6, 11, 3, 16], Y = [50, 50, 40, 60, 30]
• Sample Covariance = -60.0
• Sample Correlation = -0.969
• Interpretation: Strong negative correlation
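
These two values can be reproduced from the definitions, dividing by n - 1 for the
sample covariance. The helper names below are illustrative.

```python
from math import sqrt

def sample_cov(x, y):
    """Sample covariance: sum of deviation products divided by n - 1."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)

def sample_corr(x, y):
    """Pearson correlation: covariance divided by both standard deviations."""
    return sample_cov(x, y) / sqrt(sample_cov(x, x) * sample_cov(y, y))

x = [4, 6, 11, 3, 16]
y = [50, 50, 40, 60, 30]

cov_xy = sample_cov(x, y)
corr_xy = sample_corr(x, y)
print(f"covariance={cov_xy:.1f}, correlation={corr_xy:.3f}")
```

With means of 8 and 46, the deviation products sum to -240, giving a covariance of
-240/4 = -60 and a correlation of about -0.969.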

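Q6d. IPL 2023 Runs Analysis

• Kohli’s Runs: [82, 21, 61, 50, 6, 59, 0, 54, 31, 55, 1, 18, 100,
  101]
• Du Plessis’ Runs: [73, 23, 79, 22, 62, 84, 62, 17, 44, 45, 65, 55,
  71, 28]

Findings:

• Kohli’s scores fluctuate more: they run from 0 to 101 (range 101) with a mean of
  about 45.6 and a sample standard deviation of about 34.2.
• Du Plessis has a more consistent range: 17 to 84 (range 67), with a higher mean of
  about 52.1 and a sample standard deviation of about 22.5.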
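
The comparison of consistency can be backed by the usual spread measures. The sketch
below computes the mean, sample standard deviation, range, and coefficient of variation
for both batters; the `spread` helper is illustrative.

```python
from statistics import mean, stdev

kohli = [82, 21, 61, 50, 6, 59, 0, 54, 31, 55, 1, 18, 100, 101]
du_plessis = [73, 23, 79, 22, 62, 84, 62, 17, 44, 45, 65, 55, 71, 28]

def spread(runs):
    """Summary measures of centre and variability for one batter."""
    avg = mean(runs)
    sd = stdev(runs)  # sample standard deviation (divides by n - 1)
    return {
        "mean": avg,
        "sd": sd,
        "range": max(runs) - min(runs),
        "cv": sd / avg,  # coefficient of variation: spread relative to the mean
    }

kohli_stats = spread(kohli)
du_plessis_stats = spread(du_plessis)
for name, stats in (("Kohli", kohli_stats), ("Du Plessis", du_plessis_stats)):
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

A larger standard deviation, range, and coefficient of variation all point the same
way: Kohli's innings vary more from match to match than Du Plessis's.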

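Q6e. Sleep Data Analysis using Empirical Rule

• Mean = 6.9 hours, SD = 1.2 hours
• 68-95-99.7 Rule (assumes a bell-shaped distribution):
  ◦ 4.5 to 9.3 hours (mean ± 2 SD): about 95% (95.45% under an exact normal model)
  ◦ More than 9.3 hours: about 2.5% (2.28% under an exact normal model)
• Z-score for 8 hours sleep: (8 - 6.9) / 1.2 ≈ 0.92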
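
The exact normal-model figures can be checked with `statistics.NormalDist`, under the
assumption that sleep hours are normally distributed with the stated mean and standard
deviation. The empirical rule itself only gives the rounded values 95% and 2.5%.

```python
from statistics import NormalDist

sleep = NormalDist(mu=6.9, sigma=1.2)

within_2sd = sleep.cdf(9.3) - sleep.cdf(4.5)  # share sleeping 4.5 to 9.3 hours
above_9_3 = 1 - sleep.cdf(9.3)                # share sleeping more than 9.3 hours
z_8_hours = (8 - 6.9) / 1.2                   # standardised value for 8 hours

print(f"within 2 SD: {within_2sd:.4f}")
print(f"above 9.3 h: {above_9_3:.4f}")
print(f"z-score for 8 h: {z_8_hours:.2f}")
```

The results are about 0.9545, 0.0228, and 0.92.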
