BUSI2045 Midterm Questions 2024 Spring

This document contains a midterm test question paper for the course BUSI 2045 DATA ANALYTICS FOR BUSINESS DECISION MAKING. It has two parts - multiple choice questions worth 32% and empirical questions worth 68%. For the multiple choice section, there are 16 questions testing concepts related to data visualization, sampling, distributions, and basic R operations. The empirical questions section involves loading datasets, conducting descriptive analysis, creating visualizations, and answering questions based on the analysis. Submission requires including name, ID, R code, and results in one Moodle file upload.

Uploaded by

rinniechan630


BUSI 2045 DATA ANALYTICS FOR BUSINESS DECISION MAKING

Midterm Test Question Paper

1. This is a 3-hour open-book test, which accounts for 20% of your final score.
2. Total Score is 100.
   1. Part I Multiple Choices (32%),
   2. Part II Empirical Questions (68%).
3. Submission format.
   1. Include your name and ID in the first line of your answer sheet.
   2. You can upload only one file via the Moodle submission link.
   3. You need to report both R code and results in the answer sheet.

Part I: Multiple Choice Questions (32 points)

There are 16 questions in total, and each question has only one correct answer. Please list your answers
one by one, giving the question number with each answer.

Q1. Which of the following plots is used to test whether a variable is normally distributed?
A. Pie chart
B. Error bars
C. Box plot
D. QQ plot

Q2. If the median value for a variable is larger than its mean, and its mode value is larger than its median,
then the distribution of values of this variable tends to be
A. Negatively skewed
B. Positively skewed
C. Symmetrically distributed
D. None of the above


Q3. Which of the terms below refers to the procedure of random sampling with replacement to create
multiple re-samples from sample data?
A. Central Limit Theorem
B. Random Sampling
C. Selection Bias
D. Bootstrapping

Q4. Which of the following statements is NOT correct?

A. The distribution of sample means tends to resemble a bell-shaped normal curve, if we draw multiple
samples (of the same size) from a population repeatedly.
B. Standard deviation measures the variability of individual data points in a sample, while standard error
measures the variability of a sample statistic (e.g., mean) from multiple samples.
C. Standard error would be a good estimate of the standard deviation of the population.
D. The distribution of sample means would be more normally distributed when sample size gets larger.

Q5. Given x1 = 1:4, x2 = 5:8 and x3 = 9:12, you want to create a matrix with 4 rows and 3 columns named
m1 by combining the three vectors. Which of the following statements is correct?
A. You can create m1 by m1 = rbind(x1, x2, x3).
B. You can create m1 by m1 = cbind(x1, x2, x3).
C. You can create m1 by m1 = matrix(x1, x2, x3).
D. The output of length(m1) is 4.

Q6. Which of the following codes will produce FALSE as the output?
A. is.numeric(1:5)
B. is.numeric(c(1,2,3))
C. is.numeric('123')
D. none of the above


Q7. Which of the following scenarios fulfils the principle of random sampling?
A. Estimating the average GPA of HKBU students with only BUSI2045 students in the sample.
B. Estimating the average income of all Hong Kong workers with only doctors included in the sample.
C. Estimating the average housing price in Hong Kong with only houses on Hong Kong Island.
D. None of the above.

Q8. If you would like to produce a scatter plot and add a small amount of random variation to the location of
each point, which of the following functions should you use?
A. geom_point()
B. geom_jitter()
C. geom_boxplot()
D. geom_pointrange()

Please answer Q9 to Q11 using the data frame df below.


Q9. Which of the following statements can produce a plot like the one below?

A. ggplot(df, aes(period, sales)) + geom_point()

B. ggplot(df, aes(period, sales, colour = "red")) + geom_point()
C. ggplot(mpg, aes(period, sales)) +geom_point(colour = "red")
D. ggplot(mpg, aes(period, sales)) +geom_point(aes(colour = "red"))

Q10. Which of the following statements can produce a plot as below?

A. ggplot(df, aes(period, sales)) + geom_line()

B. ggplot(df, aes(period, sales)) + geom_path()
C. ggplot(df, aes(period, sales)) + geom_smooth()
D. ggplot(df, aes(period, sales)) + geom_area()


Q11. What is the output for the code IQR(df$sales)?

A. 16
B. 16.25
C. 16.6
D. 17

Load the iris data from the package datasets, which is preinstalled with base R, and answer Q12 to Q14
accordingly. (Hint: you may simply run the code data(iris) to load the data)
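
The loading step in the hint can be sketched as follows. This is a minimal illustration only: the column Sepal.Length and the name long_sepal are chosen here as examples and are not the variables or subsets asked for in Q12 to Q14.

```r
# iris ships with the datasets package, which base R attaches by default
data(iris)

str(iris)   # variable names and data types
dim(iris)   # number of rows and columns

# Illustrative only: summary statistics and row filtering on a column
# that is NOT the one used in Q12 to Q14
summary(iris$Sepal.Length)
long_sepal <- iris[iris$Sepal.Length > 5, ]   # hypothetical subset name
table(long_sepal$Species)
```

The same pattern (logical condition inside the row index, then table() on a column) should carry over to other columns of the data frame.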

Q12. What is the median value for the variable Petal.Length?

A. 3.66
B. 3.76
C. 3.81
D. 4.35

Q13. Create a subset of the iris data in which Sepal.Width values are larger than 2.5. For the variable
Species, how many times does the value ‘setosa’ appear in this subset?
A. 41
B. 49
C. 50
D. 139

Q14. Create another subset of the iris data in which Petal.Width values are larger than 1. How many unique
Species values are there in this subset?
A. 3
B. 2
C. 1
D. 0


Load the data set Duncan from the package carData, and answer Q15 to Q16. This dataset records information on
the prestige and other characteristics of 45 U.S. occupations in 1950, based on social survey data. Occupation
names are set as row names.
(Hint: you may need to install and load the package carData before loading its data Duncan into R)

Q15. The variable prestige records the percentage of respondents who rated the occupation as “good” or
better in prestige. What is the max prestige value and which occupation receives the highest prestige?
A. 97, physician
B. 17, engineer
C. 17, professor
D. 3, shoe.shiner

Q16. The variable type records the type of occupation, with “prof” representing professional and managerial,
“wc” representing white-collar, and “bc” representing blue-collar. Which occupation type occurs most often
in this dataset?
A. Professional and managerial
B. Blue-collar
C. White-collar
D. none of the above


Part II Empirical Questions (68 Points)

Question 1 Data Processing and Description (20 Points)

Read the file simulated_data.csv into R and answer the following questions.

(a) How many variables and observations are there? What are the data types for these variables?
(b) Find the mean, the 0.25 and 0.75 quantiles of the variable alpha.
(c) Construct a frequency table of the variable delta as below.

First    Second    Third
?        ?         ?

Create a subset named subset1 in which the variable beta contains no missing values, then answer the questions below.
(d) How many observations are there in subset1 ?
(e) Find the mean, the 0.25 and 0.75 quantiles of variable alpha in subset1.
(f) With subset1, visualize the average gamma value in each delta level. Your output should look similar
to the graph below. (Hints: pay attention to the axis labels, legend title, and plot title)


Question 2 Data Description and Visualization (24 Points)

Read the file marketing_campaign.csv into R and answer the following questions. The dataset records information
on 1560 customers; each row represents one customer.

(a) What are the unique values in the variable Marital_Status?

(b) Create a two-way table to show the number of customers separated by Education levels and
Marital_Status. How many customers are both “married” and with a master’s degree?

(c) Display the number of customers across different Marital_Status and Education levels with a bar plot.
The plot should look like the below. (Hints: pay attention to plot title, legend title, and axis labels)

(d) Create a subset named subset2 in which the “Divorced” people are excluded (variable
Marital_Status). How many customers are there in the subset?

(e) With subset2, visualize the distribution of variable Income across different Marital_Status and
Education with a boxplot. Your result should look like the below. (Hints: Pay attention to the title,
x-axis, and y-axis labels)


(f) Based on the boxplot in step (e), answer the following two questions:
i. Which education level tends to have lower income in general? Explain your answer.
ii. Is the income of the customers with education level ‘Graduation’ higher after getting married in
general? Explain your answer.


Question 3 Data and Sampling Distribution (20 Points)

Run the following code to generate a random sample (named X) of 400 values from a normal distribution
with mean 172 and standard deviation 10. (Note: set the seed to 2024)

set.seed(2024)
X <- rnorm(400, mean=172, sd=10)

(a) Visualize values in X with a density plot and mark their mean with a red vertical line. Your result should
look like the below. (Hint: you may need to convert the vector X to a data frame before plotting)

(b) Assume X is a random sample representing the height of all residents in Hong Kong. If we collect
multiple random samples, each with the same sample size (i.e., n = 400), from the Hong Kong
population, will these sample means be normally distributed? Why?

(c) Calculate the standard error (of the mean) with the mathematical approximation based on sample standard
deviation and sample size. What does the standard error measure?

(d) Calculate the 95% confidence interval (of the mean) via bootstrapping with 5000 resamples. What does it
tell us? (Note: set the random seed as 2024)
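
One possible way to set up the calculations in (c) and (d) is sketched below. It assumes a percentile bootstrap interval, which the paper does not specify, and the names se_formula, n_boot, boot_means, and ci_95 are illustrative rather than required.

```r
# Reproduce the sample defined in the question
set.seed(2024)
X <- rnorm(400, mean = 172, sd = 10)

# (c) Approximate standard error of the mean:
#     sample standard deviation divided by the square root of n
se_formula <- sd(X) / sqrt(length(X))

# (d) Percentile bootstrap (assumed method): resample X with replacement,
#     keeping the original sample size, and record each resample's mean
set.seed(2024)
n_boot <- 5000
boot_means <- replicate(
  n_boot,
  mean(sample(X, size = length(X), replace = TRUE))
)

# The 2.5% and 97.5% quantiles of the bootstrap means give a 95% interval
ci_95 <- quantile(boot_means, probs = c(0.025, 0.975))

se_formula
ci_95
```

As a sanity check, sd(boot_means) should be close to se_formula, since both estimate the variability of the sample mean.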
