ADVANCED DATA ANALYTICS TUTORIAL PRACTICE QUESTIONS Session 2
Task 1
With the aid of examples, identify and briefly explain the 4 V's of big data.
List three types of applications of the t-test.
The following data represent hemoglobin values in gm/dl for 10 patients:
Perform a t-test to determine whether the mean value for the patients differs significantly from the
mean value of the general population (12 gm/dl). Evaluate the role of chance.
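As a rough sketch of the calculation this question calls for, the one-sample t statistic can be computed as below. The function name one_sample_t is hypothetical, and the code contains no patient values, because none are listed above; it assumes the ten measurements are passed in as a list.

```python
import math
import statistics


def one_sample_t(values, population_mean):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(values)
    sample_mean = statistics.mean(values)
    # statistics.stdev uses the n - 1 denominator (sample standard deviation).
    sample_sd = statistics.stdev(values)
    standard_error = sample_sd / math.sqrt(n)
    t_stat = (sample_mean - population_mean) / standard_error
    return t_stat, n - 1
```

Calling one_sample_t(values, 12) with the ten hemoglobin values would give the statistic to compare against the tabulated t value with 9 degrees of freedom.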
Task 3
Century Office Supplies, an electronic retail company in Francistown, has recorded the
number of flat-screen TVs sold each week and the number of advertisements placed weekly
for a period of 12 weeks.
Advertisements 4 4 3 2 5 2 4 3 5 5 3 4
Sales 26 28 24 18 35 24 36 25 31 37 30 32
Comment on the strength and direction of the association between TV sales and
the number of advertisements.
Find the linear regression model that Century Office Supplies can use to estimate sales for
each week, based on the number of advertisements placed.
Estimate the mean sales of flat-screen TVs when three advertisements are placed.
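One possible way to check hand calculations for this task is the short sketch below. It assumes the Pearson correlation coefficient is the intended measure of association and that the model is a simple least-squares line; the variable names and the helper predict_sales are illustrative only.

```python
import math

# Weekly data copied from the table in Task 3.
ads = [4, 4, 3, 2, 5, 2, 4, 3, 5, 5, 3, 4]
sales = [26, 28, 24, 18, 35, 24, 36, 25, 31, 37, 30, 32]

n = len(ads)
mean_x = sum(ads) / n
mean_y = sum(sales) / n

# Corrected sums of squares and cross-products.
s_xx = sum((x - mean_x) ** 2 for x in ads)
s_yy = sum((y - mean_y) ** 2 for y in sales)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ads, sales))

# Pearson correlation: its sign gives the direction, its size the strength.
r = s_xy / math.sqrt(s_xx * s_yy)

# Least-squares line: sales = intercept + slope * advertisements.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x


def predict_sales(num_ads):
    """Estimated mean weekly sales for a given number of advertisements."""
    return intercept + slope * num_ads


print(f"r = {r:.3f}")
print(f"sales = {intercept:.3f} + {slope:.3f} * advertisements")
print(f"estimate for 3 advertisements = {predict_sales(3):.2f}")
```

The printed correlation, fitted line and estimate for three advertisements can then be compared with the values worked out by hand.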
Task 4
What is a parametric test and what is the difference between a parametric test and a
non-parametric test?
Give two examples of non-parametric tests and two examples of parametric tests.
Identify and discuss three different use cases of the Chi-Square test.
The table below shows the number of accidents that occurred throughout the week on the A1 road.
Day of week          Sun  Mon  Tue  Wed  Thu  Fri  Sat
Number of accidents   11   13   14   13   15   14   18
Table 1.0
You are required to use the data in Table 1.0 to test whether the accidents occur uniformly
over the week. You should answer the following questions in order to achieve your goal:
State the null and alternative hypothesis
Determine the tabulated Chi-Square value at the 5% significance level
Compute the Chi-Square value
State your conclusion
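The four steps above can be sketched in code as a check on the manual working. This is a minimal goodness-of-fit sketch, assuming the null hypothesis is that accidents are spread equally over the seven days; the constant CRITICAL_5PCT_DF6 is the standard tabulated Chi-Square value for 6 degrees of freedom at the 5% level and is typed in by hand, not computed.

```python
# Observed accident counts copied from Table 1.0.
accidents = {
    "Sun": 11, "Mon": 13, "Tue": 14, "Wed": 13,
    "Thu": 15, "Fri": 14, "Sat": 18,
}

observed = list(accidents.values())
total = sum(observed)

# Under the null hypothesis every day has the same expected count.
expected = total / len(observed)

# Chi-Square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((o - expected) ** 2 / expected for o in observed)

# Degrees of freedom: number of categories minus one.
df = len(observed) - 1

# Tabulated value for df = 6 at the 5% level (from a standard table).
CRITICAL_5PCT_DF6 = 12.592

# Reject the null hypothesis only if the statistic exceeds the tabulated value.
reject_null = chi_square > CRITICAL_5PCT_DF6

print(f"expected per day = {expected}")
print(f"chi-square = {chi_square:.3f}, df = {df}")
print(f"reject null hypothesis: {reject_null}")
```

If the computed statistic is below the tabulated value, the data give no evidence against a uniform spread of accidents over the week.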
Task 5
The table below shows data from a random sample of average earnings per share (EPS) for
commercial banks, retailing companies, transport and logistics companies and utility
companies (variable Companies) for the same time period in 2019. You are required to
carry out a one-way analysis of variance (ANOVA) test to determine whether there are any
differences in the mean earnings per share among the 4 different groups of companies.
Answer the sub-questions below to perform the ANOVA test.