Quizzes Module 1
These are centralized data containers in a purpose-built space that support business intelligence and
reporting but restrict robust analyses.
Data marts
Data warehouses
Analytic Sandbox
None of the Above
Which of the following are problems encountered in traditional data architecture?
High-value data is hard to reach and leverage, and predictive analytics and data mining activities
are last in line for data.
Data scientists are limited to performing in-memory analytics which will restrict the size of the
datasets they can use.
Data Science projects will remain isolated and ad hoc, rather than centrally managed.
All of the Above
Which of the following is TRUE about the differences between Business Intelligence (BI) and Data Science?
I. Where Data Science problems tend to require highly structured data organized in rows and columns
for accurate reporting, BI projects tend to use many types of data sources, including large or
unconventional datasets.
II. Data Science tends to be more exploratory in nature and may use scenario optimization to deal with
more open-ended questions.
I only
II only
both I and II
neither I nor II
Among the business drivers that push businesses to become more analytical and data-driven, this one
involves customer churn, fraud, and default.
Optimize Business Operations
Identify Business Risk
Predict New Business Opportunities
Comply with Regulatory Requirements
Which of the following is true about the current analytical architecture?
I. Data sources are first loaded into the data warehouse where data needs to be well understood,
structured, and normalized with the appropriate data type definitions. This kind of centralization enables
security, backup, and failover of highly critical data.
II. Once in the data warehouse, data is read by additional applications across the enterprise for BI and
reporting purposes. These are high-priority operational processes getting critical data feeds from the
data warehouses and repositories.
I only
II only
both I and II
neither I nor II
Quiz 2
Examples that fall under this group include financial analysts, market research analysts, life scientists,
operations managers, and business and functional managers.
Data Savvy Professionals
Deep Analytical Talent
Technology and Data Enablers
None of the Above
Which of the following describes the decade beyond 2010 with regard to big data?
I. In this era, everyone and everything is leaving a digital footprint.
II. Data volumes in this decade are measured in terms of petabytes.
I only
II only
both I and II
neither I nor II
The following are recurring sets of activities that data scientists perform EXCEPT
Reframe business challenges as analytics challenges.
Design, implement, and deploy statistical models and data mining techniques on Big Data.
Provide technical expertise to support analytical projects such as provisioning and administrating
analytical sandboxes.
Develop insights that lead to actionable recommendations.
Which of the following groups of players in the data value chain makes sense of the data collected from
various entities?
Data Devices
Data Collectors
Data Aggregators
Data Users and Buyers
The data now is said to come from many sources including
Photos and video footage uploaded to the World Wide Web
Nontraditional IT devices, including the use of radio-frequency identification (RFID) readers, GPS
navigation systems, and seismic processing
Medical information, such as genomic sequencing and diagnostic imaging
All of the Above
Which of the following key roles in the new big data ecosystem has members who possess a combination
of skills to handle raw, unstructured data and to apply complex analytical techniques at massive scales?
Data Savvy Professionals
Deep Analytical Talent
Technology and Data Enablers
None of the Above
The following are the skillsets and behavioral characteristics a data scientist must possess EXCEPT
Qualitative skill
Curious and creative
Skeptical mindset and critical thinking
Communicative and collaborative
Quiz 3
This refers to the process of cleaning data, normalizing datasets, and performing transformations on the
data.
Data Preparation
Data Transformation
Data Conditioning
Data Visualizing
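The three activities named in the question above (cleaning, normalizing, and transforming data) can be sketched in a few lines. This is only an illustration under stated assumptions: the records in `raw` are made up, and the helper names `clean` and `min_max_normalize` are hypothetical, not taken from the course material. Min-max scaling is just one common way to normalize a dataset.

```python
def clean(values):
    """Drop missing or blank entries, trim whitespace, and parse to float."""
    cleaned = []
    for v in values:
        if v is None:
            continue
        text = str(v).strip()
        if text == "":
            continue
        cleaned.append(float(text))
    return cleaned


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up raw records with stray whitespace and missing entries.
raw = [" 2", "4 ", None, "", "6"]
prepared = min_max_normalize(clean(raw))
```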
In this phase of the data analytics life cycle, the team assesses the resources available to support the
project in terms of people, technology, time, and data.
Discovery
Data Preparation
Model Building
Model Planning
The following activities are part of the discovery phase EXCEPT
The team determines how much business or domain knowledge the data scientist needs to
develop models.
The team catalogs the data sources that the team has access to and identifies additional data
sources that the team can leverage.
The team identifies the main objectives of the project, identifies what needs to be achieved in
business terms, and identifies what needs to be done to meet the needs.
The team identifies the key stakeholders and their interests in the project.
Which of the following describes the key role of the Data Engineer?
provides access to key databases or tables and ensures the appropriate security levels are in place
related to the data repositories.
executes the actual data extractions and performs substantial data manipulation to facilitate the
analytics.
provides subject matter expertise for analytical techniques, data modeling, and applying valid
analytical techniques to given business problems.
gives business domain expertise based on a deep understanding of the data, key performance
indicators (KPIs), key metrics, and business intelligence from a reporting perspective.
Which of the following activities is NOT involved in identifying potential data sources?
Capture aggregate data sources
Evaluate the data structures and tools needed
Perform extract, transform, load processes to data
Scope the sort of data infrastructure needed
In this phase of the data analytics life cycle, the team delivers final reports, briefings, code, and technical
documents.
Model Building
Model Planning
Communicate Results
Operationalize
Quiz 4
Which of the following is a free or open source tool available to data analytics practitioners?
SAS Enterprise Miner
SPSS Modeler
Octave
Alpine Miner
Which of the following is a deliverable under the operationalize phase?
Presentation for project sponsors
Presentation for analysts
Technical specifications of implementing the code
All of the Above
The following activities are involved under the model planning phase EXCEPT
Assess the structure of the datasets.
Ensure that the analytical techniques enable the team to meet the business objectives and accept
or reject the working hypotheses.
Evaluate whether similar, existing approaches are available or if the team will need to create
something new.
Assess the validity of the model and its results.
Which of the following is TRUE about model planning?
I. Under this phase, the team develops datasets for training, testing, and production purposes.
II. Data Exploration, Variable and Model selection characterize this phase.
I only
II only
both I and II
neither I nor II
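Statement I above mentions developing datasets for training and testing. A minimal sketch of what such a split can look like is given below; the function name `split_dataset`, the 20% test fraction, and the seed are assumptions chosen for illustration, and the sketch says nothing about which life cycle phase the activity belongs to.

```python
import random


def split_dataset(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows reproducibly and split into (train, test)."""
    shuffled = list(rows)
    # A dedicated Random instance keeps the shuffle repeatable for a given seed.
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Ten made-up records, identified by the integers 0..9.
train, test = split_dataset(range(10))
```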
Which of the following is TRUE about the final phase of data analytics life cycle?
I. In the final phase, the team communicates the benefits of the project more broadly and sets up a
pilot project to deploy the work in a controlled way before broadening the work to a full enterprise or
ecosystem of users.
II. Under this phase, the team reflects on the project and considers what obstacles were in the project
and what can be improved in the future, and makes recommendations for future work or
improvements to existing processes.
I only
II only
both I and II
neither I nor II
Quiz 5
Prior to any regression modelling, the data should always be inspected for the following EXCEPT
Data-entry errors
Expected pattern
Outliers
Missing values
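A quick pre-modelling inspection of a single column might look like the sketch below. It counts missing values and flags outliers with the 1.5 x IQR rule, which is one common convention and not something the quiz itself prescribes. The function name `inspect_column` and the sample values are hypothetical.

```python
import statistics


def inspect_column(values):
    """Count missing entries and flag outliers using the 1.5 * IQR rule."""
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)

    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr

    outliers = [v for v in present if v < low_fence or v > high_fence]
    return {"missing": missing, "outliers": outliers}


# Made-up column with one missing entry and one suspicious value.
report = inspect_column([10, 12, 11, None, 13, 12, 95])
```

A value flagged this way is only a candidate for review; it may be a data-entry error or a legitimate extreme observation.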
Which of the following statements is/are ALWAYS TRUE?
I. Inferential statistics consists of Estimation and Hypothesis Testing
II. The link between inferential and descriptive statistics is probability
I only
II only
both I and II
neither I nor II
In predicting Sales Revenue using TV and Radio Ads Expenses, we have the
following regression results:
Estimate the predicted sales if TV and radio ads expenses are 200 and 50, respectively.
19.3
21.5
23.7
25.9
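A prediction from a fitted multiple linear regression is the intercept plus each coefficient times its predictor value. The regression output for this question is not reproduced in the text, so the coefficients in the sketch below are hypothetical placeholders and the result is not one of the answer choices; substitute the estimates from the actual output to answer the item.

```python
def predict_sales(tv, radio, intercept, b_tv, b_radio):
    """Evaluate sales = intercept + b_tv * tv + b_radio * radio."""
    return intercept + b_tv * tv + b_radio * radio


# Hypothetical coefficients, NOT taken from the quiz's regression output.
INTERCEPT = 2.0
B_TV = 0.04
B_RADIO = 0.1

predicted = predict_sales(tv=200, radio=50,
                          intercept=INTERCEPT, b_tv=B_TV, b_radio=B_RADIO)
```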
Quiz 6
Based on the following results of logistic regression, which of the following statements is/are TRUE?
I. For every 1-unit increase in Age, the value of the logistic function increases by 0.16.
II. The regression coefficient for the Married variable is not significant.
I only
II only
both I and II
neither I nor
II
Based on the following results of logistic regression, what is the likelihood of churning when Age = 40
and Churned_contacts = 5? (Note: Round coefficients to 2 decimal places)
0.714
0.623
0.357
0.269
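In logistic regression the predicted probability is obtained by computing the linear combination z of the predictors and passing it through the logistic function, p = 1 / (1 + e^(-z)). The coefficient table for this question is not reproduced in the text, so the values in the sketch below are hypothetical placeholders chosen only to show the mechanics; the printed result is not one of the answer choices.

```python
import math


def churn_probability(age, churned_contacts, intercept, b_age, b_contacts):
    """Return p = 1 / (1 + exp(-z)) for z = intercept + b_age*age + b_contacts*contacts."""
    z = intercept + b_age * age + b_contacts * churned_contacts
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, NOT taken from the quiz's logistic regression output.
INTERCEPT = 1.0
B_AGE = -0.05
B_CONTACTS = 0.2

p = churn_probability(age=40, churned_contacts=5,
                      intercept=INTERCEPT, b_age=B_AGE, b_contacts=B_CONTACTS)
```

Note that a coefficient describes the change in the log-odds (z) per unit change in its predictor, not a fixed change in the probability itself.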