Data Visualization Question Bank eDBDA Sept 21

This document contains a question bank on data visualization and exploratory data analysis concepts. It includes 20 multiple-choice and true/false questions on EDA topics such as choosing appropriate graphs for different types of data, correlation measures, and identifying distribution shapes. It also has 18 questions on NumPy, pandas, and Matplotlib basics and 10 questions on pandas plotting, covering creating plots, adjusting layouts, and visualizing relationships between variables.

Uploaded by Somesh Rewadkar

USM’s Shriram Mantri Vidyanidhi Info Tech Academy

Data Visualization Question Bank eDBDA Sept 21

Contents
EDA
Numpy, Pandas and Data Visualization
Matplot and seaborn

EDA

1. Exploratory data analysis should be used to
A. help you search for patterns in your data.
B. spot serious defects in your data that may warrant taking corrective action.
C. help determine whether assumptions of the inferential tests you intend to use may have been violated.
D. all of the above

2. A bar graph is the best graph to use when
A. your dependent variable was measured on at least a ratio scale.
B. your independent variable is categorical.
C. your independent and dependent variables are both continuous.
D. you want to show ordered trends in your data.

3. To show a functional relationship between your independent and dependent variables, the graph of choice would be a
A. line graph. B. histogram. C. pie chart. D. scatterplot.

4. The Spearman Rank Order Correlation is used when
A. your data are scaled on an ordinal scale.
B. your data are scaled on an interval scale.
C. one measure is scaled on a nominal scale and the other on an ordinal scale.
D. one measure is scaled on an interval scale and the other on an ordinal scale.

5. In which of the following situations would you not want to use a Pearson correlation coefficient?
A. when the relationship between variables is nonlinear
B. when both of your variables are measured on at least an interval scale
C. when the variances of your distributions are very similar
D. all of the above
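
The two correlation measures in questions 4 and 5 can be compared on a small example. The sketch below is only an illustration: the data are made up (an exponential curve), and Spearman's coefficient is computed as the Pearson correlation of the ranks, which is its definition, so that no extra library is assumed.

```python
import numpy as np
import pandas as pd

# Hypothetical data: y always increases with x, but exponentially, not linearly.
x = np.arange(1, 21)
df = pd.DataFrame({"x": x, "y": np.exp(x / 2.0)})

# Pearson measures the strength of a linear relationship.
pearson = df["x"].corr(df["y"])

# Spearman is the Pearson correlation of the ranks, so it depends
# only on the ordering of the values.
spearman = df["x"].rank().corr(df["y"].rank())

print(f"Pearson:  {pearson:.3f}")
print(f"Spearman: {spearman:.3f}")
```

Comparing the two printed values shows how each coefficient reacts to a relationship that is monotonic but not linear.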

6. A curve showing a functional relationship that starts off flat, becomes progressively steeper, and shows a single direction of change is
A. negatively accelerated. B. monotonic.
C. positively accelerated. D. both B and C

7. A ________ distribution has most scores collected about the center and is symmetrical about its midpoint.
A. functional B. normal C. monotonic D. bimodal


8. _______ are used to represent category values (e.g., gender) as numeric values.
A. Unstacked formats B. Dummy codes C. Stacked formats D. Codes

9. A functional graph that shows a uniformly increasing or decreasing functional relationship is said to be
A. monotonic. B. negatively skewed. C. normal. D. positively skewed.

10. For discrete grouped data, such as months of the year, age groups, shoe sizes, or animals, which plot is best suited?
A. Boxplot B. Histogram C. Bar D. Scatterplot

11. Which graph is best used when data needs to be classified or categorized?
A. Stacked bar B. Pie chart C. Histogram D. None of the above

12. Which plot best shows the relationship between a target and a feature?
A. Scatterplot B. Bar C. Pareto chart D. All of the above

13. How can you check for outliers in a data set?
A. Using a scatterplot B. Using a histogram C. Using a boxplot D. All of the above

14. From which plot will you come to know the distribution of the target variable?
A. Histogram B. Pie chart C. Bar D. Pareto chart

15. True/False: The quantile-quantile (Q-Q) plot is a graphical technique for determining if two data sets come from populations with a common distribution.
A. True B. False

16. True/False: In a boxplot, the middle line inside the box displays the mean of the distribution.
A. True B. False
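
One way to check what the line inside the box represents is to read it back from Matplotlib and compare it with the mean and the median. The sketch below assumes Matplotlib's default boxplot settings and uses a small made-up sample with one extreme value so that the two statistics differ clearly.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical sample with one extreme value.
data = np.array([1, 2, 3, 4, 100])

fig, ax = plt.subplots()
parts = ax.boxplot(data)

# boxplot() returns a dict of artists; "medians" holds the line drawn inside each box.
line_y = float(parts["medians"][0].get_ydata()[0])

print("line inside the box:", line_y)
print("median of the data: ", float(np.median(data)))
print("mean of the data:   ", float(data.mean()))
plt.close(fig)
```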

17. True/False: For numeric vs. numeric data, a scatterplot is the best representation.
A. True B. False

18. True/False: For bivariate data, a correlogram or corr plot shows the correlation of each variable.
A. True B. False

19. True/False: The height of the bar corresponds to the value of each category.
A. True B. False

20. True/False: The height of the resulting stacked bar shows the combined result of the groups.
A. True B. False

Numpy, Pandas and Data Visualization

1) Pandas is designed to work with _______ data.
A. Relational B. Labeled C. Both of these D. None of these

2) DataFrame is a _______ labeled data structure.
A. 1-dimensional B. 2-dimensional C. 3-dimensional D. n-dimensional


3) Pandas provides easy handling of missing data in floating-point as well as non-floating-point data.
A. True B. False

4) Columns can be deleted and inserted from:
A. DataFrame B. Higher dimensional objects.
C. All of the above D. None of the above

5) Shape property in pandas is used to
A. Visualise the distribution of the data
B. See the number of rows and columns of the data
C. Visualise the shape of skewness of the data
D. See the spread of data (mean, median etc.)

6) The _______ method allows us to retrieve rows and columns by position.
A. head B. getloc C. iloc D. locate
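
The `shape` property and positional indexing from questions 5 and 6 can be tried on a tiny frame. This is a minimal sketch; the column names, row labels, and values are invented for illustration.

```python
import pandas as pd

# Hypothetical frame with string row labels.
df = pd.DataFrame(
    {"name": ["a", "b", "c"], "score": [10, 20, 30]},
    index=["r1", "r2", "r3"],
)

# shape is a (rows, columns) tuple.
n_rows, n_cols = df.shape

# iloc selects by integer position: row 1, column 1.
by_position = df.iloc[1, 1]

# loc selects by label: row "r2", column "score".
by_label = df.loc["r2", "score"]

print(n_rows, n_cols, by_position, by_label)
```

Here `iloc` and `loc` reach the same cell, one by position and the other by label.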

7) Pivot table can aggregate the data and summarize it by grouping the columns.
A. True B. False

8) _______ is a convenient method for combining the columns of two potentially differently-indexed DataFrames into a single result DataFrame.
A. Concatenate B. Merge C. Join D. Collaborate

9) Dimensions should match along the axis you are _______ on.
A. concatenating B. merging C. joining D. collaborating
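
The combining operations mentioned in questions 8 and 9 can be sketched with two small frames. The frames and their labels are hypothetical; the calls shown (`DataFrame.join` and `pd.concat`) are standard pandas functions, used here with their default settings.

```python
import pandas as pd

# Two hypothetical frames whose indexes only partly overlap.
left = pd.DataFrame({"a": [1, 2, 3]}, index=["x", "y", "z"])
right = pd.DataFrame({"b": [10, 30]}, index=["x", "z"])

# join() aligns on the index; by default it keeps every row of `left`
# and fills rows missing from `right` with NaN.
joined = left.join(right)

# concat() along axis=0 stacks rows; the frames share the same column here.
stacked = pd.concat([left, left], axis=0)

print(joined)
print(stacked.shape)
```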

10) Series can have axis labels and it can be indexed by a label
A. True B. False

11) MatplotLib is a _______ library for data visualisation.
A. 1-dimensional B. 2-dimensional C. 3-dimensional D. n-dimensional

12) Select the proper sequence to create a plot:
A. Set plot parameters, import required libraries, define the required dataset, display plot.
B. Define the required dataset, set plot parameters, import required libraries, display plot.
C. Set plot parameters, define the required dataset, import required libraries, display plot.
D. Import required libraries, define the required dataset, set plot parameters, display plot.

13) The plt.subplots() object acts as a more automatic axis manager?
A. True B. False

14) To avoid the overlapping of subplots we use
A. fig.tight_layout() B. sub.tight_layout() C. flt.tight_layout()
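
A grid of subplots with titles and axis labels is where overlapping usually shows up. The sketch below is one possible arrangement, with made-up data and a 2x2 grid chosen only for illustration; it creates the axes with `plt.subplots()` and then asks the figure to adjust the spacing.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical data shared by all panels.
x = [0, 1, 2, 3]

# plt.subplots() returns the figure and an array of axes in one call.
fig, axes = plt.subplots(2, 2)

for i, ax in enumerate(axes.flat, start=1):
    ax.plot(x, [i * v for v in x])
    ax.set_title(f"slope {i}")
    ax.set_xlabel("x")
    ax.set_ylabel("y")

# Adjust the spacing so titles and labels of neighbouring panels do not overlap.
fig.tight_layout()

# In an interactive session, plt.show() would display the figure here.
```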

15) We cannot create a horizontal bar plot in matplotlib
A. True B. False


16) We use plot.barh() to adjust the height of the plot
A. True B. False
Explanation: We use it to create a horizontal barplot

17) We use ____ to create a horizontal bar plot.
A. axesh.bar() B. haxis.bar() C. axes.barh() D. hor.barh()
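
Following the explanation given for question 16, a horizontal bar plot can be sketched as below. The category names and values are invented, and `ax` is just the conventional name for a Matplotlib Axes object.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical categories and values.
labels = ["a", "b", "c"]
values = [3, 5, 2]

fig, ax = plt.subplots()

# barh() draws one horizontal bar per category; the value sets the bar's width.
bars = ax.barh(labels, values)
ax.set_xlabel("value")

widths = [bar.get_width() for bar in bars]
print(widths)
```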

18) _______ is a visualisation library that provides a high-level interface to draw attractive statistical graphics.
A. Scrapy B. Seaborn C. Airborn D. Statistica

Matplot and seaborn


1. The plot method on Series and DataFrame is just a simple wrapper around:
A. gplt.plot() B. plt.plot() C. plt.plotgraph() D. none of the Mentioned
Explanation: If the index consists of dates, it calls gcf().autofmt_xdate() to try to format the x-axis nicely.

2. Point out the correct combination with regards to kind keyword for graph plotting:
A. ‘hist’ for histogram B. ‘box’ for boxplot
C. ‘area’ for area plots D. all of the Mentioned
Explanation: The kind keyword argument of plot() accepts a handful of values for plots other than the default line plot.
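
The `kind` keyword described in the explanation can be exercised on a small frame. This is a sketch with made-up data; it covers only a few of the accepted values and leaves out `'kde'`, which needs SciPy to be installed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical frame with two numeric columns.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [4, 3, 2, 1]})

# Each call draws the same data as a different kind of plot
# and returns the Matplotlib Axes it drew on.
kinds = ["line", "hist", "box", "area", "bar", "barh"]
axes_by_kind = {kind: df.plot(kind=kind) for kind in kinds}

for kind, ax in axes_by_kind.items():
    print(kind, type(ax).__name__)

plt.close("all")
```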

3. Which of the following values is accepted by the kind keyword for a bar plot?
A. barh B. kde C. hexbin D. none of the Mentioned
Explanation: bar can also be used for barplot.

4. You can create a scatter plot matrix using the __________ method in pandas.tools.plotting (pandas.plotting in current pandas versions).
A. sca_matrix B. scatter_matrix C. DataFrame.plot D. all of the Mentioned
Explanation: You can create density plots using the Series/DataFrame.plot.
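
A scatter plot matrix can be sketched as follows. The example assumes a recent pandas version, where the plotting helpers are imported from `pandas.plotting`; the data are random numbers generated only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Hypothetical data: 100 rows, three numeric columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 3)), columns=["a", "b", "c"])

# One scatter plot per pair of columns, with a histogram of each
# column on the diagonal. The result is a 2-D array of Axes.
axes = scatter_matrix(df, diagonal="hist")

print(axes.shape)
plt.close("all")
```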

5. Point out the wrong combination with regards to kind keyword for graph plotting:
A. ‘scatter’ for scatter plots B. ‘kde’ for hexagonal bin plots
C. ‘pie’ for pie plots D. none of the Mentioned
Explanation: kde is used for density plots.

6. Which of the following plots is used to check whether a data set or time series is random?
A. Lag B. Random C. Lead D. None of the Mentioned
Explanation: Random data should not exhibit any structure in the lag plot.
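
The explanation above can be tried out by drawing lag plots for a random series and for a series with obvious structure. The sketch assumes `lag_plot` from `pandas.plotting`; both series are synthetic, and the random walk is simply the cumulative sum of the noise.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import lag_plot

rng = np.random.default_rng(0)
noise = pd.Series(rng.normal(size=500))  # hypothetical random series
walk = noise.cumsum()                    # hypothetical structured series

fig, (ax_noise, ax_walk) = plt.subplots(1, 2)

# Each point is (value at time t, value at time t + lag).
lag_plot(noise, lag=1, ax=ax_noise)
lag_plot(walk, lag=1, ax=ax_walk)

ax_noise.set_title("random noise")
ax_walk.set_title("random walk")
fig.tight_layout()
```

Saving or showing the figure lets the two panels be compared side by side.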

7. Plots may also be adorned with error bars or tables.
A. True B. False
Explanation: There are several plotting functions in pandas.tools.plotting.

8. Which of the following plots are often used for checking randomness in time series?
A. Autocausation B. Autorank C. Autocorrelation D. None of the Mentioned
Explanation: If time series is random, such autocorrelations should be near zero for any and all time-lag separations.
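
The "near zero" statement in the explanation can be checked numerically with `Series.autocorr` before drawing anything. The sketch below uses synthetic data (random noise and its cumulative sum) and assumes `autocorrelation_plot` from `pandas.plotting` for the plot itself.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import autocorrelation_plot

rng = np.random.default_rng(0)
noise = pd.Series(rng.normal(size=1000))  # hypothetical random series
walk = noise.cumsum()                     # hypothetical non-random series

# Lag-1 autocorrelation: correlation of the series with itself shifted by one step.
r_noise = noise.autocorr(lag=1)
r_walk = walk.autocorr(lag=1)
print(f"noise: {r_noise:.3f}")
print(f"walk:  {r_walk:.3f}")

# The plot shows the autocorrelation for every lag at once.
ax = autocorrelation_plot(noise)
plt.close("all")
```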

9. __________ plots are used to visually assess the uncertainty of a statistic.
A. Lag B. RadViz C. Bootstrap D. None of the Mentioned
Explanation: Resulting plots and histograms are what constitutes the bootstrap plot.
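
A bootstrap plot can be sketched with `bootstrap_plot` from `pandas.plotting`, which repeatedly draws random subsets of the series and plots the mean, median, and mid-range of each subset. The data, the subset size, and the number of samples below are arbitrary choices made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import bootstrap_plot

# Hypothetical measurements centred around 5.
rng = np.random.default_rng(0)
data = pd.Series(rng.normal(loc=5.0, size=200))

# Draw 200 random subsets of 50 values each and plot the statistics of each subset.
fig = bootstrap_plot(data, size=50, samples=200)

print(type(fig).__name__)
plt.close("all")
```

The spread of each histogram in the resulting figure gives a visual impression of how much the statistic varies from sample to sample.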

10. Andrews curves allow one to plot multivariate data.
A. True B. False
Explanation: Curves belonging to samples of the same class will usually be closer together and form larger structures.
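
Andrews curves can be sketched with `andrews_curves` from `pandas.plotting`, which draws one curve per row and colours it by class. The two classes, the feature names `f1` to `f3`, and the `label` column below are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import andrews_curves

rng = np.random.default_rng(0)
features = ["f1", "f2", "f3"]

# Two hypothetical classes whose features are centred at different values.
class_a = pd.DataFrame(rng.normal(loc=0.0, size=(20, 3)), columns=features)
class_a["label"] = "A"
class_b = pd.DataFrame(rng.normal(loc=3.0, size=(20, 3)), columns=features)
class_b["label"] = "B"
df = pd.concat([class_a, class_b], ignore_index=True)

# Each row becomes one curve; "label" names the column holding the class.
ax = andrews_curves(df, "label")

print(len(ax.lines))
plt.close("all")
```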
