Question Bank
Unit 1
2 Mark Questions:
10 Mark Questions:
1. Discuss the applications of data science and big data with suitable examples.
2. Illustrate the overview of the data science process.
3. Elaborate on any five application domains of data science.
4. Describe the categories of data for data mining.
5. Discuss the significance of setting the research goal for a data science project.
6. Discuss the categories involved in retrieving relevant data from different sources of
data.
7. Explain the different stages of the data preparation phase.
8. Elucidate the techniques involved in data cleansing.
9. Illustrate the steps involved in combining data from different data sources.
10. Explain the impact of variable reduction on a data science project, highlighting its pros
and cons.
11. Elaborate on the steps involved in model building with suitable diagrams.
2 Mark Questions:
1. What is meant by frequency distribution?
2. What is meant by qualitative data? Give examples.
3. What is meant by quantitative data? Give examples.
4. Differentiate between qualitative and quantitative data.
10 Mark Questions:
1. Explain the different types of frequency distribution with suitable examples
and diagrams.
2. Elaborate on the different ways to describe or represent data using tables with
suitable examples.
3. Explain the various ways in which data can be represented or described using
graphs with suitable examples.
4. Elaborate on the different measures of central tendency and describe the
suitable measures for the different types of data distribution.
5. Construct the frequency table and draw a bar graph and stem-and-leaf display for
the following data:
6. The following data are the shoe sizes of 50 male students. The sizes are
discrete data since shoe size is measured in whole and half units only.
Construct a histogram and calculate the width of each bar or class interval.
Suppose you choose six bars.
9; 9; 9.5; 9.5; 10; 10; 10; 10; 10; 10; 10.5; 10.5; 10.5; 10.5; 10.5; 10.5; 10.5;
10.5;
11; 11; 11; 11; 11; 11; 11; 11; 11; 11; 11; 11; 11; 11.5; 11.5; 11.5; 11.5; 11.5;
11.5; 11.5;
12; 12; 12; 12; 12; 12; 12; 12.5; 12.5; 12.5; 12.5; 14
7. The following data are the heights (in inches to the nearest half inch) of 100
male semiprofessional soccer players. The heights are continuous data, since
height is measured.
60; 60.5; 61; 61; 61.5; 63.5; 63.5; 63.5; 64; 64; 64; 64; 64; 64; 64; 64.5; 64.5;
64.5; 64.5; 64.5; 64.5; 64.5; 64.5; 66; 66; 66; 66; 66; 66; 66; 66; 66; 66; 66.5;
66.5; 66.5; 66.5; 66.5; 66.5; 66.5; 66.5; 66.5; 66.5; 66.5; 67; 67; 67; 67; 67;
67; 67; 67; 67; 67; 67; 67; 67.5; 67.5; 67.5; 67.5; 67.5; 67.5; 67.5; 68; 68; 69;
69; 69; 69; 69; 69; 69; 69; 69; 69; 69.5; 69.5; 69.5; 69.5; 69.5; 70; 70; 70; 70;
70; 70; 70.5; 70.5; 70.5; 71; 71; 71; 72; 72; 72; 72.5; 72.5; 73; 73.5; 74
8. Compute the mean, median and mode for the following data sets.
I) 9, 10, 12, 13, 13, 13, 15, 15, 16, 16, 18, 22, 23, 24, 24, 25
1 43 99
2 21 65
3 25 79
4 42 75
5 57 87
6 59 81
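As a worked check for question 8 (data set I) and for the class-width step in question 6, the arithmetic can be sketched with Python's standard `statistics` module. This is a minimal sketch, not a model answer: the variable names are illustrative, and the class width uses the plain (largest - smallest) / number-of-bars rule. Some textbooks first pad the range by half a unit of precision, which is not assumed here.

```python
import statistics

# Question 8, data set I (values copied from the question).
data_set_1 = [9, 10, 12, 13, 13, 13, 15, 15, 16, 16, 18, 22, 23, 24, 24, 25]

mean_1 = statistics.mean(data_set_1)      # sum of the values / number of values
median_1 = statistics.median(data_set_1)  # n is even, so the two middle values are averaged
mode_1 = statistics.mode(data_set_1)      # most frequently occurring value

# Question 6: class width for six bars, using only the smallest (9) and
# largest (14) shoe sizes in the listed data. Assumption: no padding of the range.
smallest_size, largest_size, bars = 9, 14, 6
class_width = (largest_size - smallest_size) / bars

print(mean_1, median_1, mode_1, round(class_width, 2))
```

With these inputs the mean is 16.75, the median is 15.5, the mode is 13, and the unpadded class width is about 0.83.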
2 Mark Questions:
10 Mark Questions:
7. An online food delivery company claims that the mean delivery time is less
than 30 minutes with a standard deviation of 10 minutes. Is there enough
evidence to support this claim at a 0.05 significance level if 49 orders were
examined with a sample mean of 20 minutes?
8. A company wants to improve the quality of products by reducing defects and
monitoring the efficiency of assembly lines. In assembly line A, there were 9
defects reported out of 100 samples and in line B, 25 defects out of 600
samples were identified. Check whether there is a difference in the procedures at a
0.05 alpha level.
9. Explain estimation in detail and the significance of point estimates.
10. Elaborate on the confidence interval and the level of confidence.
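Questions 7 and 8 above can be checked numerically. The sketch below assumes z-tests are intended: a left-tailed one-sample z-test for question 7 (treating the stated 10 minutes as the population standard deviation) and a two-tailed, pooled two-proportion z-test for question 8. The variable names are illustrative, and a course may expect a different test or a table lookup instead.

```python
import math
from statistics import NormalDist

ALPHA = 0.05
standard_normal = NormalDist()  # mean 0, standard deviation 1

# Question 7: H0: mu >= 30 versus H1: mu < 30 (left-tailed).
claimed_mean, sigma, n, sample_mean = 30, 10, 49, 20
z_mean = (sample_mean - claimed_mean) / (sigma / math.sqrt(n))
p_mean = standard_normal.cdf(z_mean)   # left-tail probability
support_claim = p_mean < ALPHA         # True means H0 is rejected

# Question 8: H0: p_A == p_B versus H1: p_A != p_B (two-tailed).
defects_a, n_a = 9, 100
defects_b, n_b = 25, 600
p_a, p_b = defects_a / n_a, defects_b / n_b
p_pooled = (defects_a + defects_b) / (n_a + n_b)  # pooled proportion under H0
std_error = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_a + 1 / n_b))
z_prop = (p_a - p_b) / std_error
p_prop = 2 * (1 - standard_normal.cdf(abs(z_prop)))  # two-tailed p-value
lines_differ = p_prop < ALPHA

print(f"Q7: z = {z_mean:.2f}, reject H0: {support_claim}")
print(f"Q8: z = {z_prop:.2f}, p = {p_prop:.4f}, reject H0: {lines_differ}")
```

Under these assumptions the statistic for question 7 is z = -7, far below the left-tail critical value of about -1.645, and for question 8 it is about 2.08, outside ±1.96, so both null hypotheses are rejected at the 0.05 level.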
2 Mark Questions:
10 Mark Questions:
2 Mark Questions:
1. How do you calculate least squares?
2. List the methods available to calculate least squares.
3. Define the principle of least squares.
4. Define least squares.
5. What is least squares curve fitting?
6. Why do we need time series analysis?
7. Give some examples of time series analysis.
8. Mention the types of time series analysis.
9. Mention the applications of time series analysis.
10. Give the limitations of time series analysis.
11. List the data types of time series.
12. What does goodness of fit mean?
13. Why is goodness of fit important?
14. List the most common goodness-of-fit tests.
15. Why do we test goodness of fit?
16. Define multiple linear regression.
17. How is the error calculated in a linear regression model?
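For the least squares and regression-error questions above, a small numerical sketch may help when checking hand calculations. It fits a straight line y = a + bx by ordinary least squares and then computes the residuals and their sum of squares. The five data points are hypothetical, invented only for illustration, and the closed-form slope and intercept formulas are one common way to do the fit, not the only method a syllabus might cover.

```python
# Hypothetical (x, y) observations, invented purely for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least squares estimates: slope = S_xy / S_xx, intercept = mean_y - slope * mean_x.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Error of the fitted model: residual = observed y - predicted y,
# summarised here as the sum of squared errors (SSE).
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
sse = sum(r ** 2 for r in residuals)

print(f"y = {intercept:.2f} + {slope:.2f}x, SSE = {sse:.2f}")
```

For this made-up data the fitted line is y = 2.20 + 0.60x with an SSE of 2.40. The least squares principle is that no other straight line gives a smaller SSE for the same points.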