NSE Sample Test
IMS Proschool Certification in Business Analytics
Sample Test: CBA
Basic Data Exploration with Statistics
Question 1 (1)
When would we say that we have a left-tailed distribution, based on the following observations in the data?
I. Mean < Median
II. Mean = Median
III. Mean > Median
IV. Median = Mode
Question 2 (1)
Given the sample data below, compute the mean, median, mode and standard deviation of the Age variable.
Age: 17 34 23 28 20 25 37 21 11 19 39 37 37 32 32
I. Mean = 27, Median = 28, Mode = 37, Standard Deviation = 8.7
II. Mean = 18, Median = 25, Mode = 37, Standard Deviation = 8.7
III. Mean = 27, Median = 28, Mode = 37, Standard Deviation = 7.8
IV. Mean = 27, Median = 28, Mode = 37, Standard Deviation = 8.7
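The summary statistics asked for here can be checked with a short sketch using Python's standard `statistics` module. The variable names are illustrative, and the sketch assumes the question intends the sample standard deviation (divisor n - 1); the population version (divisor n) is computed alongside for comparison.

```python
import statistics

# Age observations as listed in the question.
ages = [17, 34, 23, 28, 20, 25, 37, 21, 11, 19, 39, 37, 37, 32, 32]

mean_age = statistics.mean(ages)        # arithmetic mean
median_age = statistics.median(ages)    # middle value of the sorted data
mode_age = statistics.mode(ages)        # most frequent value
sample_sd = statistics.stdev(ages)      # assumption: sample SD (divisor n - 1)
population_sd = statistics.pstdev(ages) # population SD (divisor n), for comparison

print(round(mean_age, 2), median_age, mode_age,
      round(sample_sd, 2), round(population_sd, 2))
```

The options quote the mean as a whole number and the standard deviation to one decimal place, so compare after rounding.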
Question 3 (2)
Temperature (in degrees Fahrenheit) as a variable in any study, with sample observations such as 35 °F, 75 °F,
98.3 °F, etc., would be
I. having an interval scale of measurement, as zero is assigned arbitrarily
II. having an interval scale of measurement, as we cannot take ratios of two measurements
III. having a ratio scale of measurement, as zero is absolute
IV. Both I & II
Question 4 (3)
The lower quartile of a box plot created from a dataset with observations {2, 4, 40, 44, 46, 66, 56, 33, 45}
is 33, while the upper quartile is 46. Where should we see the lower whisker?
I. 2
II. 4
III. 40
IV. None of the Above
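The whisker positions can be worked out with a minimal sketch, assuming the common Tukey convention in which each whisker extends to the most extreme observation lying within 1.5 × IQR of the nearer quartile. The question does not state which convention it uses, so treat this rule, and the variable names, as assumptions.

```python
# Observations and quartiles exactly as stated in the question.
data = [2, 4, 40, 44, 46, 66, 56, 33, 45]
q1, q3 = 33, 46

iqr = q3 - q1                    # interquartile range
lower_fence = q1 - 1.5 * iqr     # assumption: Tukey 1.5 * IQR rule
upper_fence = q3 + 1.5 * iqr

# Each whisker ends at the most extreme observation still inside its fence.
lower_whisker = min(x for x in data if x >= lower_fence)
upper_whisker = max(x for x in data if x <= upper_fence)

# Observations outside the fences are drawn as individual outlier points.
outliers = sorted(x for x in data if x < lower_fence or x > upper_fence)

print(lower_fence, upper_fence, lower_whisker, upper_whisker, outliers)
```

Under this convention the fences themselves are not plotted; only the whisker ends and the outlier points appear on the chart.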
Sampling and Hypothesis Testing
Question 5 (1)
If you get data on the efficacy of a fertilizer from field tests in which the quantity administered and
the size of the field under study differ across samples, which of the following would be an ideal step
before you go ahead with data analysis?
I. Using percentage ratios
II. Transforming the data using normalization techniques
III. Using average values and comparing the tests
IV. Using absolute values as available, so as to keep the data unchanged
Question 6 (1)
The probability of event A is 0.4, and the probability of event B is 0.3. Assuming the two events are
independent of each other, what is the conditional probability of A given B, denoted by P(A|B)?
I. 0.4
II. 0.12
III. 0.1
IV. None of the above
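The relationship between independence and conditional probability can be illustrated with a small sketch. It uses the definition P(A|B) = P(A and B) / P(B) together with the product rule P(A and B) = P(A) × P(B) for independent events; exact fractions are used to avoid floating-point rounding, and the variable names are illustrative.

```python
from fractions import Fraction

p_a = Fraction(4, 10)   # P(A) = 0.4
p_b = Fraction(3, 10)   # P(B) = 0.3

# For independent events the joint probability is the product of the marginals.
p_a_and_b = p_a * p_b

# Definition of conditional probability.
p_a_given_b = p_a_and_b / p_b

print(float(p_a_and_b), float(p_a_given_b))
```

Because the factor P(B) cancels, conditioning on an independent event leaves the probability of A unchanged.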
Question 7 (1)
A sampling distribution is the probability distribution for which one of the following?
I. A sample
II. A population
III. A sample statistic
IV. A population parameter
Question 8 (2)
Given that you have specified the confidence level at 95%, if the p-value is less than α, then specify α and
the maximum probability of Type I error, respectively
I. 0.95 and 0.85
II. 5 % and 0.95
III. 0.05 and 0.05
IV. None of the above
Question 9(3)
Select the hypothesis formulation and the corresponding best value for α in a judiciary scenario, so as to
avoid punishing an innocent person, in exchange for which it is acceptable to pronounce a truly guilty defendant not guilty
I. H0: Defendant is Guilty, H1: Defendant is not Guilty, α = 10%
II. H0: Defendant is Innocent, H1: Defendant is not Innocent, α = 5%
III. H0: Defendant is Guilty, H1: Defendant is not Guilty, α = 1%
IV. H0: Defendant is Innocent, H1: Defendant is not Innocent, α = 1%
Predictive Analytics: Linear Regression
Question 10(1)
The degree or strength of correlation between an independent variable age and dependent variable salary is
measured by
I. Coefficient of determination
II. Coefficient of correlation
III. Standard error of estimate
IV. All of above
Question 11(1)
Percent total variation of the dependent variable Y explained by the set of independent variables
X1,X2,...,Xn is measured by
I. Coefficient of correlation
II. Coefficient of skewness
III. Coefficient of determination
IV. Standard deviation
Question 12(1)
Coefficient of correlation between age and mortality rate is 0.9 indicating
I. a weak relationship between age and mortality rate
II. a weak relationship between age and mortality rate which is positive
III. a strong relationship between age and mortality rate
IV. a strong relationship between age and mortality rate which is positive
Question 13 (2)
Given the ANOVA output, compute the missing values
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square    F Ratio
Regression            321.5            ????                 107.1666667    4.351945854
Error                 ????             4                    XXXXXX
Total                 420              7
I. Error sum of squares is 300 and Degrees of freedom for Regression is 3
II. Error sum of squares is 210 and Degrees of freedom for Regression is 6
III. Error sum of squares is 80.5 and Degrees of freedom for Regression is 6
IV. Error sum of squares is 98.5 and Degrees of freedom for Regression is 3
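The missing ANOVA entries follow from two additive identities: the sums of squares add up to the total (SS total = SS regression + SS error), and so do the degrees of freedom. The sketch below applies them to the figures printed in the table, where the regression sum of squares (321.5) is given and the error sum of squares is blank; the variable names are illustrative.

```python
# Values printed in the ANOVA table.
ss_regression = 321.5
ss_total = 420.0
df_error = 4
df_total = 7

# Sums of squares and degrees of freedom are additive.
ss_error = ss_total - ss_regression
df_regression = df_total - df_error

# Mean squares and the F ratio, as a consistency check against the table.
ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error
f_ratio = ms_regression / ms_error

print(ss_error, df_regression, ms_regression, ms_error, f_ratio)
```

The recomputed mean square for regression and the F ratio agree with the values printed in the table, which confirms the reconstruction is consistent.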
Question 14(3)
From the following ANOVA output compute the total number of observations and number of variables
respectively
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square    F Ratio
Regression            400              12                   33.33333333    2.666666667
Error                 100              8                    12.5
Total                 500              20
I. n = 20 and k =8
II. n = 21 and k = 12
III. n = 19 and k = 8
IV. n = 18 and k = 12
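For a regression with an intercept, n observations and k independent variables, the standard degrees-of-freedom bookkeeping is: regression df = k, error df = n - k - 1, total df = n - 1. The sketch below inverts these relations for the table in this question; it assumes the model includes an intercept, and the variable names are illustrative.

```python
# Degrees of freedom printed in the ANOVA table.
df_regression = 12
df_error = 8
df_total = 20

# Assumption: the model has an intercept, so total df = n - 1 and regression df = k.
n = df_total + 1
k = df_regression

# Consistency check: the error df should equal n - k - 1.
error_df_consistent = (n - k - 1 == df_error)

print(n, k, error_df_consistent)
```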
Classification
Question 15(1)
Naive Bayes algorithm is a
I. Supervised learning model
II. Unsupervised learning model
III. Both of the Above
IV. None of the Above
Question 16(1)
Decision tree algorithm is a
I. Supervised learning model
II. Unsupervised learning model
III. Both of the Above
IV. None of the Above
Question 17(1)
Naive Bayes algorithm is a
I. Prediction model
II. Classification model
III. Both of the Above
IV. None of the Above
Question 18(2)
Which of these target variable types are used by CHAID for decision making?
I. Numeric
II. Integer
III. Interval
IV. Class
Market Basket Analysis
Question 19 (1)
Market Basket Analysis is a study of
I. Association between products
II. Link between products
III. Relation between numbers
IV. Association between dependent variable and independent variable
Question 20 (1)
Association rule mining is a
I. Supervised learning model
II. Unsupervised learning model
III. Classification model
IV. None of the above
Question 21 (1)
Market basket analysis is used for
I. Up selling Only
II. Cross selling Only
III. Up selling and cross selling
IV. None of the above
Question 22 (2)
Would you expect a good number of rules from a transaction set of 100 records as compared to 100000
records?
I. Yes
II. No
III. Can't say
IV. None of the Above
Question 23 (3)
At any point in time for a specific customer, is it possible to see more than one consequent as a
recommendation?
I. Yes
II. No
III. Can't say
IV. None of the Above
Predictive Analytics: Forecasting Time Series Analysis
Question 24 (1)
What is plotted on the x-axis for time series data?
I. Time
II. Sales
III. Both of the Above
IV. None of the Above
Question 25(1)
What would we call an ordered set of data arranged in accordance with their time of occurrence?
I. Arithmetic series
II. Time Series
III. Both of the Above
IV. None of the Above
Question 26(1)
What would a time series indicate?
I. Short term variation
II. Irregular variation
III. Both of the Above
IV. None of the Above
Question 27(2)
What would be the systematic component of a time series, which follows a regular pattern of variation?
I. Noise
II. Signal
III. Correlation
IV. None of the Above
Question 28(3)
Which of the following describe a time series as a weakly stationary process?
a. Constant mean
b. Constant variance
c. Constant auto covariance for given lags
d. Constant probability distributions
I. a
II. a&c
III. a,b & c
IV. a,b,c & d
Clustering
Question 29(1)
What is clustering?
I. Prediction of data
II. Classification of data
III. Partition of data
IV. None of the Above
Question 30(1)
Do we identify a set of independent variables and a dependent variable when we do clustering?
I. Yes
II. No
III. Can't say
IV. None of the Above
Question 31(1)
Will we call cluster analysis a variable reduction technique?
I. Yes
II. No
III. Can't say
Question 32 (2)
Which of these would be a dependent variable in a clustering algorithm?
I. Numerical
II. Categorical
III. Both of the Above
IV. None of the Above
Logistic Regression
Question 33(1)
In a Logistic regression model, the level of significance for a variable in the model indicates
I. The probability of accepting the null hypothesis when it is actually true
II. The probability of rejecting the null hypothesis when it is actually true
III. The probability of accepting the null hypothesis when it is actually false
IV. The probability of rejecting the null hypothesis when it is actually false
Question 34(1)
What is the relation between the level of confidence and the significance level α?
I. Level of confidence = α
II. Level of significance = 1 - α
III. Level of confidence = 1 - α
IV. Level of confidence = Level of significance
Question 35(2)
The likelihood term in logistic regression, statistically,
I. Is the probability of observing a particular parameter value given a set of data
II. Is the same as the p-value
III. Is the parameter value which is most likely given the observed data
IV. Minimises the difference between the model and the data