
DP-100

Uploaded by

sadiq
Copyright © All Rights Reserved

Microsoft DP-100

Designing and Implementing a Data Science Solution on Azure
Version: 7.0
Microsoft DP-100 Exam
Topic 1, Define and prepare the development environment

QUESTION NO: 1 DRAG DROP

You are planning to host practical training to acquaint staff with Docker for Windows.

Staff devices must support the installation of Docker.

Which of the following are requirements for this installation? Answer by dragging the correct
options from the list to the answer area.

Answer:

"Pass Any Exam. Any Time." - www.actualtests.com 2



Explanation:

References:

https://docs.docker.com/toolbox/toolbox_install_windows/

https://blogs.technet.microsoft.com/canitpro/2015/09/08/step-by-step-enabling-hyper-v-for-use-on-windows-10/

https://docs.docker.com/docker-for-windows/install/

QUESTION NO: 2

You have been tasked with constructing an intelligent solution with machine learning models.

You have to make sure that the environment allows for data scientists to build notebooks in a
cloud environment, and enforce the use of automatic feature engineering and model building in
machine learning pipelines. The environment should also allow for notebooks to be deployed to
retrain via Spark instances with dynamic worker allocation. Furthermore, notebooks have to be
exportable for local version control purposes.

You have created an Azure HDInsight cluster that includes the Apache Spark MLlib library.

Which of the following is the action you must take NEXT?

A.
Create and execute the Zeppelin notebooks on the cluster.

B.
Create and execute a Jupyter notebook on the cluster.

C.
Install the Microsoft Machine Learning SDK for Python on the cluster.

D.
Install Microsoft Machine Learning for Apache Spark.

Answer: D
Explanation:

References:

https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-zeppelin-notebook

https://azuremlbuild.blob.core.windows.net/pysparkapi/intro.html

QUESTION NO: 3

You have configured a Deep Learning Virtual Machine for Windows.

You need to use tools and frameworks to build deep neural network (DNN) models.

Which tools and frameworks should you use?

A.
Caffe2

B.
LightGBM

C.
Azure Data Factory (ADF)

D.
The Microsoft Cognitive Toolkit (CNTK)

Answer: D
Explanation:

The Microsoft Cognitive Toolkit (CNTK) is an open-source toolkit for commercial-grade distributed
deep learning. It describes neural networks as a series of computational steps via a directed
graph. CNTK allows the user to easily realize and combine popular model types such as
feed-forward DNNs, convolutional neural networks (CNNs) and recurrent neural networks
(RNNs/LSTMs).

References:

https://docs.microsoft.com/en-us/cognitive-toolkit/

QUESTION NO: 4

This question is included in a number of questions that depict the identical set-up.
However, every question has a distinctive result. Establish if the recommendation satisfies
the requirements.

You have been tasked with employing a machine learning model, which makes use of a
PostgreSQL database and needs GPU processing, to forecast prices.

You are preparing to create a virtual machine that has the necessary tools built into it.

You need to make use of the correct virtual machine type.

Recommendation: You make use of a Geo AI Data Science Virtual Machine (Geo-DSVM)
Windows edition.

Will the requirements be satisfied?

A.
Yes

B.
No

Answer: B
Explanation:

The Azure Geo AI Data Science VM (Geo-DSVM) delivers geospatial analytics capabilities from
Microsoft's Data Science VM. Specifically, this VM extends the AI and data science toolkits in the
Data Science VM by adding ESRI's market-leading ArcGIS Pro Geographic Information System.

References:

https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview

QUESTION NO: 5

This question is included in a number of questions that depict the identical set-up.
However, every question has a distinctive result. Establish if the recommendation satisfies
the requirements.

You have been tasked with employing a machine learning model, which makes use of a
PostgreSQL database and needs GPU processing, to forecast prices.

You are preparing to create a virtual machine that has the necessary tools built into it.

You need to make use of the correct virtual machine type.

Recommendation: You make use of a Deep Learning Virtual Machine (DLVM) Windows edition.

Will the requirements be satisfied?

A.
Yes

B.
No

Answer: B
Explanation:

The DLVM is a template built on top of the DSVM image. The packages, GPU drivers, and so on are
all already present in the DSVM image. The DLVM exists mostly for convenience during creation,
because it can only be created on GPU VM instances on Azure.

References:

https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview

QUESTION NO: 6

This question is included in a number of questions that depict the identical set-up.
However, every question has a distinctive result. Establish if the recommendation satisfies
the requirements.

You have been tasked with employing a machine learning model, which makes use of a
PostgreSQL database and needs GPU processing, to forecast prices.

You are preparing to create a virtual machine that has the necessary tools built into it.

You need to make use of the correct virtual machine type.

Recommendation: You make use of a Data Science Virtual Machine (DSVM) Windows edition.

Will the requirements be satisfied?

A.
Yes

B.
No

Answer: A
Explanation:

In the DSVM, your training models can use deep learning algorithms on hardware that's based on
graphics processing units (GPUs).

PostgreSQL is available for the following operating systems: Linux (all recent distributions),
macOS (OS X) version 10.6 and newer (64-bit installers available), and Windows (installers
available for the 64-bit version; tested on the latest versions and back to Windows 2012 R2).

References:

https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview

QUESTION NO: 7 DRAG DROP

You have been tasked with moving data into Azure Blob Storage for the purpose of supporting
Azure Machine Learning.

Which of the following can be used to complete your task? Answer by dragging the correct options
from the list to the answer area.

Answer:

Explanation:

You can move data to and from Azure Blob storage using different technologies:

References:

https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/move-azure-blob

QUESTION NO: 8 HOTSPOT

Complete the sentence by selecting the correct option in the answer area.

Answer:

Explanation:

Use the Convert to ARFF module in Azure Machine Learning Studio, to convert datasets and
results in Azure Machine Learning to the attribute-relation file format used by the Weka toolset.
This format is known as ARFF.

The ARFF data specification for Weka supports multiple machine learning tasks, including data
preprocessing, classification, and feature selection. In this format, data is organized by entities and
their attributes, and is contained in a single text file.
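
As a rough illustration of the single-text-file layout described above, the following Python sketch writes a tiny ARFF document by hand. The helper `to_arff`, the relation name, the attribute names, and the rows are all made up for this example; it is not how the Convert to ARFF module is implemented.

```python
def to_arff(relation, attributes, rows):
    """Render rows as ARFF text.

    attributes: list of (name, kind) pairs, where kind is the string
    'NUMERIC' or a list of allowed nominal values.
    """
    lines = [f"@RELATION {relation}", ""]
    for name, kind in attributes:
        if isinstance(kind, list):
            # Nominal attributes list their allowed values in braces.
            kind = "{" + ",".join(kind) + "}"
        lines.append(f"@ATTRIBUTE {name} {kind}")
    lines += ["", "@DATA"]
    for row in rows:
        # One comma-separated line per entity, in attribute order.
        lines.append(",".join(str(v) for v in row))
    return "\n".join(lines) + "\n"


arff_text = to_arff(
    "weather",  # hypothetical relation name
    [("temperature", "NUMERIC"), ("outlook", ["sunny", "rainy"])],
    [(21.5, "sunny"), (14.0, "rainy")],
)
print(arff_text)
```

The header section declares the entity attributes and the @DATA section holds one line per entity, which is what "organized by entities and their attributes" amounts to in practice.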

References:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/convert-to-arff

QUESTION NO: 9

You have been tasked with designing a deep learning model, which accommodates the most
recent edition of Python, to recognize language.

You have to include a suitable deep learning framework in the Data Science Virtual Machine
(DSVM).

Which of the following actions should you take?

A.
You should consider including Rattle.

B.
You should consider including TensorFlow.

C.
You should consider including Theano.

D.
You should consider including Chainer.

Answer: B
Explanation:

References:

https://www.infoworld.com/article/3278008/what-is-tensorflow-the-machine-learning-library-explained.html

QUESTION NO: 10 HOTSPOT

Complete the sentence by selecting the correct option in the answer area.

Answer:

Explanation:

A Deep Learning Virtual Machine is a pre-configured environment for deep learning using GPU
instances.

References:

https://azuremarketplace.microsoft.com/en-au/marketplace/apps/microsoft-ads.dsvm-deep-learning

QUESTION NO: 11

You need to implement a Data Science Virtual Machine (DSVM) that supports the Caffe2 deep
learning framework.

Which of the following DSVMs should you create?

A.
Windows Server 2012 DSVM

B.
Windows Server 2016 DSVM

C.
Ubuntu 16.04 DSVM

D.
CentOS 7.4 DSVM

Answer: C
Explanation:

Caffe2 is supported by Data Science Virtual Machine for Linux.

Microsoft offers Linux editions of the DSVM on Ubuntu 16.04 LTS and CentOS 7.4.

However, only the DSVM on Ubuntu is preconfigured for Caffe2.

References:

https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview

QUESTION NO: 12

You create a real-time service endpoint using Azure Machine Learning designer.

After training the model and preparing the real-time pipeline for deployment into a designated
Azure Machine Learning compute resource, you are required to publish the inference pipeline as a
web service.

You need to make use of the correct compute target to achieve your goal.

Which of the following is the compute target you should use?

A.
Azure Container Instances

B.
Azure Kubernetes Service (AKS)

C.
Azure Machine Learning compute clusters

D.
Local web service

Answer: B
Explanation:

Azure Kubernetes Service (AKS) can be used for real-time inference.

References:

https://docs.microsoft.com/en-us/azure/machine-learning/concept-compute-target

QUESTION NO: 13

After creating a multi-class image classification deep learning model, you are required to make
sure that the model is retrained once-a-month with the new image data retrieved from a public web
portal.

You have created an Azure Machine Learning pipeline to retrieve new data, normalize the size of
images, as well as retrain the model.

To configure the schedule for the pipeline, you must make use of the Azure Machine Learning
SDK.

Which of the following actions should you take FIRST?

A.
Define an Azure Machine Learning pipeline schedule.

B.
Retrieve the pipeline ID.

C.
Publish the pipeline.

D.
Create a ScheduleRecurrence.

Answer: C
Explanation:

References:

https://docs.microsoft.com/en-us/azure/machine-learning/how-to-schedule-pipelines

Topic 2, Prepare data for modeling

QUESTION NO: 14

This question is included in a number of questions that depict the identical set-up.
However, every question has a distinctive result. Establish if the recommendation satisfies
the requirements.

You are in the process of creating a machine learning model. Your dataset includes rows with null
and missing values.

You plan to make use of the Clean Missing Data module in Azure Machine Learning Studio to
detect and fix the null and missing values in the dataset.

Recommendation: You make use of the Replace with median option.

Will the requirements be satisfied?

A.
Yes

B.
No

Answer: B
Explanation:

References:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/clean-missing-data

QUESTION NO: 15

This question is included in a number of questions that depict the identical set-up.
However, every question has a distinctive result. Establish if the recommendation satisfies
the requirements.

You are in the process of creating a machine learning model. Your dataset includes rows with null
and missing values.

You plan to make use of the Clean Missing Data module in Azure Machine Learning Studio to
detect and fix the null and missing values in the dataset.

Recommendation: You make use of the Custom substitution value option.

Will the requirements be satisfied?

A.
Yes

B.
No

Answer: B
Explanation:

References:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/clean-missing-data

QUESTION NO: 16

This question is included in a number of questions that depict the identical set-up.
However, every question has a distinctive result. Establish if the recommendation satisfies
the requirements.

You are in the process of creating a machine learning model. Your dataset includes rows with null
and missing values.

You plan to make use of the Clean Missing Data module in Azure Machine Learning Studio to
detect and fix the null and missing values in the dataset.

Recommendation: You make use of the Remove entire row option.

Will the requirements be satisfied?

A.
Yes

B.
No

Answer: A
Explanation:

Remove entire row: Completely removes any row in the dataset that has one or more missing
values. This is useful if the missing value can be considered randomly missing.
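
The effect of the Remove entire row option can be sketched in a few lines of plain Python. This is only an illustration of the idea, assuming missing values are represented as `None`; the sample records and the helper name `remove_rows_with_missing` are hypothetical and are not part of Azure Machine Learning Studio.

```python
rows = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 61000},  # missing age
    {"age": 45, "income": None},     # missing income
    {"age": 29, "income": 48000},
]


def remove_rows_with_missing(records):
    """Keep only the records in which every field has a value."""
    return [r for r in records if all(v is not None for v in r.values())]


cleaned = remove_rows_with_missing(rows)
print(cleaned)
```

Half of this toy dataset is discarded, which shows the cost of the option: it is simple and leaves no nulls behind, but it can shrink the dataset considerably when missing values are common.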

References:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/clean-missing-data

QUESTION NO: 17

You need to consider the underlined segment to establish whether it is accurate.

To transform a categorical feature into a binary indicator, you should make use of the Clean
Missing Data module.

A.
No adjustment required.

B.
Convert to Indicator Values

C.
Apply SQL Transformation

D.
Group Categorical Values

Answer: B
Explanation:

Use the Convert to Indicator Values module in Azure Machine Learning Studio. The purpose of
this module is to convert columns that contain categorical values into a series of binary indicator
columns that can more easily be used as features in a machine learning model.
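
A minimal sketch of what converting a categorical column into binary indicator columns means, written in plain Python. The helper `to_indicator_columns`, the `prefix-value` column naming, and the sample data are assumptions made for this illustration, not the module's own code.

```python
def to_indicator_columns(values, prefix):
    """Turn one categorical column into one 0/1 column per category."""
    categories = sorted(set(values))
    return [
        {f"{prefix}-{c}": int(v == c) for c in categories}
        for v in values
    ]


colors = ["red", "green", "red", "blue"]
indicators = to_indicator_columns(colors, "color")
for row in indicators:
    print(row)
```

Each input value produces exactly one column set to 1 and all others set to 0, so a model that expects numeric inputs can consume the feature directly.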

References:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/convert-to-indicator-values

QUESTION NO: 18

You need to consider the underlined segment to establish whether it is accurate.

To increase the number of low-incidence cases in a dataset, you should make use of the SMOTE
module.

Select “No adjustment required” if the underlined segment is accurate. If the underlined segment is
inaccurate, select the accurate option.

A.
No adjustment required.

B.
Remove Duplicate Rows

C.
Join Data

D.
Edit Metadata

Answer: A
Explanation:

Use the SMOTE module in Azure Machine Learning Studio to increase the number of
underrepresented cases in a dataset used for machine learning. SMOTE is a better way of
increasing the number of rare cases than simply duplicating existing cases.
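
To show why SMOTE differs from duplicating rows, here is a simplified sketch of its core idea: each synthetic sample is placed somewhere on the line segment between a minority sample and one of its nearest minority neighbours. The function `smote_like`, its parameters, and the sample points are hypothetical, and this is not the Studio module's implementation.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest neighbours of base among the other minority samples,
        # ranked by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


minority = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.9, 2.1)]
new_points = smote_like(minority, n_new=4)
print(new_points)
```

Because the new points are interpolated rather than copied, they stay inside the region occupied by the rare class without repeating any existing case exactly.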

References:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/smote

QUESTION NO: 19 HOTSPOT

You need to consider the underlined segment to establish whether it is accurate.

Answer:

Explanation:

The box-plot algorithm can be used to display outliers.
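
A box plot conventionally draws as outliers the points that lie more than 1.5 times the interquartile range (IQR) beyond the quartiles. The sketch below applies that rule with Python's standard library; the function name `iqr_outliers` and the sample values are made up for the example, and the quartiles use the default method of `statistics.quantiles`.

```python
import statistics


def iqr_outliers(values):
    """Return the values outside the box-plot whiskers (1.5 * IQR rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


data = [10, 12, 11, 13, 12, 11, 95]
outliers = iqr_outliers(data)
print(outliers)  # → [95]
```

For this sample the quartiles are 11 and 13, so the whiskers end at 8 and 16 and only the value 95 is flagged.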

References:

https://medium.com/analytics-vidhya/what-is-an-outliers-how-to-detect-and-remove-them-which-algorithm-are-sensitive-towards-outliers-2d501993d59

QUESTION NO: 20

You are making use of Azure Machine Learning Studio to analyze a dataset.

You want to produce a statistical summary that includes the following:

The p-value.

The unique count for all feature columns.

Which of the following actions should you take?

A.
Evaluate Probability Function

B.
Export Count Table

C.
Execute Python Script

D.
Build Counting Transform

Answer: B
Explanation:

The Export Count Table module is provided for backward compatibility with experiments that use
the Build Count Table (deprecated) and Count Featurizer (deprecated) modules.

References:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/export-count-table

QUESTION NO: 21

You are in the process of constructing a multi-class classifier via Azure Machine Learning Studio.

You are currently executing a filter-based feature selection for a dataset. You have to make use of
a metric that calculates the statistical relationship between two continuous variables.

Which of the following is the metric described above?

A.
Linear discriminant analysis

B.
Mutual Information

C.
Count Based

D.
Pearson correlation

Answer: D
Explanation:

Pearson's correlation coefficient is computed by taking the covariance of two variables and
dividing by the product of their standard deviations. The coefficient is not affected by changes of
scale in the two variables.
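
The definition above translates directly into code. The following sketch computes the coefficient as covariance divided by the product of the standard deviations, and then rescales one variable to check the scale-invariance claim. The function name `pearson` and the sample data (study hours and test scores) are invented for this illustration.

```python
import math


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


hours = [1.0, 2.0, 3.0, 4.0, 5.0]
score = [52.0, 55.0, 61.0, 70.0, 72.0]

r = pearson(hours, score)
# Expressing the same durations in minutes changes the scale, not the coefficient.
r_scaled = pearson([h * 60 for h in hours], score)
print(round(r, 4), round(r_scaled, 4))
```

Both calls give the same value (about 0.98 for this data), because the scale factor appears in both the covariance and the standard deviation and cancels out.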

References:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/filter-based-feature-selection

https://www.statisticssolutions.com/pearsons-correlation-coefficient/

QUESTION NO: 22

You are planning to host practical training to acquaint learners with data visualization creation
using Python. Learner devices are able to connect to the internet.

Learner devices are currently NOT configured for Python development. Also, learners are unable
to install software on their devices as they lack administrator permissions. Furthermore, they are
unable to access Azure subscriptions.

It is imperative that learners are able to execute Python-based data visualization code.

Which of the following actions should you take?

A.
You should consider configuring the use of Azure Container Instance.

B.
You should consider configuring the use of Azure Batch AI.

C.
You should consider configuring the use of Azure Notebooks.

D.
You should consider configuring the use of Azure Kubernetes Service.

Answer: C
Explanation:

References:

https://notebooks.azure.com/

QUESTION NO: 23 HOTSPOT

Complete the sentence by selecting the correct option in the answer area.

Answer:

Explanation:

Replace using Probabilistic PCA: Compared to other options, such as Multiple Imputation using
Chained Equations (MICE), this option has the advantage of not requiring the application of
predictors for each column. Instead, it approximates the covariance for the full dataset. Therefore,
it might offer better performance for datasets that have missing values in many columns.

References:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/clean-missing-data

QUESTION NO: 24

You have recently concluded the construction of a binary classification machine learning model.

You are currently assessing the model. You want to make use of a visualization that allows for
precision to be used as the measurement for the assessment.

Which of the following actions should you take?

A.
You should consider using Venn diagram visualization.

B.
You should consider using Receiver Operating Characteristic (ROC) curve visualization.

C.
You should consider using Box plot visualization.

D.
You should consider using the Binary classification confusion matrix visualization.

Answer: D
Explanation:
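
Precision can be read straight off a binary confusion matrix as TP / (TP + FP). The sketch below counts the four cells for a small set of labels and derives precision from them; the helper `confusion_counts` and the label vectors are hypothetical.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification result."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    return tp, fp, fn, tn


actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(actual, predicted)
precision = tp / (tp + fp)
print(tp, fp, fn, tn, precision)  # → 3 1 1 3 0.75
```

Three of the four positive predictions are correct, so precision is 0.75.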

References:

https://docs.microsoft.com/en-us/azure/machine-learning/how-to-understand-automated-ml#confusion-matrix

QUESTION NO: 25

This question is included in a number of questions that depict the identical set-up.
However, every question has a distinctive result. Establish if the recommendation satisfies
the requirements.

You have been tasked with evaluating your model on a partial data sample via k-fold cross-
validation.

You have already configured a k parameter as the number of splits. You now have to configure the
k parameter for the cross-validation with the usual value choice.

Recommendation: You configure the use of the value k=1.

Will the requirements be satisfied?

A.
Yes

B.
No

Answer: B
Explanation:

QUESTION NO: 26

This question is included in a number of questions that depict the identical set-up.
However, every question has a distinctive result. Establish if the recommendation satisfies
the requirements.

You have been tasked with evaluating your model on a partial data sample via k-fold cross-
validation.

You have already configured a k parameter as the number of splits. You now have to configure the
k parameter for the cross-validation with the usual value choice.

Recommendation: You configure the use of the value k=3.

Will the requirements be satisfied?

A.
Yes

B.
No

Answer: B
Explanation:

QUESTION NO: 27

This question is included in a number of questions that depict the identical set-up.
However, every question has a distinctive result. Establish if the recommendation satisfies
the requirements.

You have been tasked with evaluating your model on a partial data sample via k-fold cross-
validation.

You have already configured a k parameter as the number of splits. You now have to configure the
k parameter for the cross-validation with the usual value choice.

Recommendation: You configure the use of the value k=10.

Will the requirements be satisfied?

A.
Yes

B.
No

Answer: A
Explanation:

Leave One Out (LOO) cross-validation

Setting K = n (the number of observations) yields n-fold cross-validation, called leave-one-out
cross-validation (LOO), a special case of the K-fold approach.

LOO CV is sometimes useful but typically doesn’t shake up the data enough. The estimates from
each fold are highly correlated and hence their average can have high variance.

This is why the usual choice is K=5 or 10. It provides a good compromise for the bias-variance
tradeoff.
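
The role of k as the number of splits can be made concrete with a small index generator. This is a simplified sketch that uses contiguous, unshuffled folds; the function `k_fold_indices` is hypothetical and stands in for whatever cross-validation utility is actually used.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds."""
    if not 2 <= k <= n_samples:
        # k=1 would leave no data to train on; k > n leaves empty folds.
        raise ValueError("k must be between 2 and the number of samples")
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


folds = list(k_fold_indices(20, k=10))  # the usual choice
loo = list(k_fold_indices(20, k=20))    # K = n: leave-one-out
print(len(folds), len(folds[0][1]), len(loo), len(loo[0][1]))
```

With 20 samples, k=10 gives ten folds that each hold out two samples, while k=20 degenerates into leave-one-out with a single held-out sample per fold.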

QUESTION NO: 28

You construct a machine learning experiment via Azure Machine Learning Studio.

You would like to split data into two separate datasets.

Which of the following actions should you take?

A.
You should make use of the Split Data module.

B.
You should make use of the Group Categorical Values module.

C.
You should make use of the Clip Values module.

D.
You should make use of the Group Data into Bins module.

Answer: A
Explanation:

The Split Data module divides a dataset into two distinct sets, for example a training set and a
test set. The Group Data into Bins module, by contrast, bins numeric values: it supports multiple
options for binning data, and you can customize how the bin edges are set and how values are
apportioned into the bins.

References:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/group-data-into-bins

QUESTION NO: 29

You have been tasked with creating a new Azure pipeline via the Machine Learning designer.

You have to make sure that the pipeline trains a model using data in a comma-separated values
(CSV) file that is published on a website. A dataset for this file does not exist.

Data from the CSV file must be ingested into the designer pipeline with as little administrative
effort as possible.

Which of the following actions should you take?

A.
You should make use of the Convert to TXT module.

B.
You should add the Copy Data object to the pipeline.

C.
You should add the Import Data object to the pipeline.

D.
You should add the Dataset object to the pipeline.

Answer: D
Explanation:

The preferred way to provide data to a pipeline is a Dataset object. The Dataset object points to
data that lives in or is accessible from a datastore or at a Web URL. The Dataset class is abstract,
so you will create an instance of either a FileDataset (referring to one or more files) or a
TabularDataset that's created from one or more files with delimited columns of data.

Example:

from azureml.core import Dataset

iris_tabular_dataset = Dataset.Tabular.from_delimited_files([(def_blob_store, 'train-dataset/iris.csv')])

References:

https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-your-first-pipeline

Topic 3, Perform Feature Engineering

QUESTION NO: 30 DRAG DROP

You are in the process of constructing a regression model.

You would like to make it a Poisson regression model. To achieve your goal, the feature values
need to meet certain conditions.

Which of the following are relevant conditions with regards to the label data? Answer by dragging
the correct options from the list to the answer area.

Answer:

Explanation:

Poisson regression is intended for use in regression models that are used to predict numeric
values, typically counts. Therefore, you should use this module to create your regression model
only if the values you are trying to predict fit the following conditions:

References:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/poisson-regression

QUESTION NO: 31

This question is included in a number of questions that depict the identical set-up.
However, every question has a distinctive result. Establish if the recommendation satisfies
the requirements.

You are in the process of carrying out feature engineering on a dataset.

You want to add a feature to the dataset and fill the column value.

Recommendation: You must make use of the Group Categorical Values Azure Machine Learning
Studio module.

Will the requirements be satisfied?

A.
Yes

B.
No

Answer: B
Explanation:

QUESTION NO: 32

This question is included in a number of questions that depict the identical set-up.
However, every question has a distinctive result. Establish if the recommendation satisfies
the requirements.

You are in the process of carrying out feature engineering on a dataset.

You want to add a feature to the dataset and fill the column value.

Recommendation: You must make use of the Join Data Azure Machine Learning Studio module.

Will the requirements be satisfied?

A.
Yes

B.
No

Answer: B
Explanation:

QUESTION NO: 33

This question is included in a number of questions that depict the identical set-up.
However, every question has a distinctive result. Establish if the recommendation satisfies
the requirements.

You are in the process of carrying out feature engineering on a dataset.

You want to add a feature to the dataset and fill the column value.

Recommendation: You must make use of the Edit Metadata Azure Machine Learning Studio
module.

Will the requirements be satisfied?

A.
Yes

B.
No

Answer: A
Explanation:

Typical metadata changes might include marking columns as features.

References:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/edit-metadata

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/join-data

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/group-categorical-values

QUESTION NO: 34

You have been tasked with ascertaining if two sets of data differ considerably. You will make use
of Azure Machine Learning Studio to complete your task.

You plan to perform a paired t-test.

Which of the following are conditions that must apply to use a paired t-test? (Choose all that
apply.)

A.
All scores are independent from each other.

B.
You have matched pairs of scores.

C.
The sampling distribution of d is normal.

D.
The sampling distribution of x1 - x2 is normal.

Answer: B,C
Explanation:
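
A paired t-test works on the per-pair differences d, which is why it needs matched pairs of scores and a normal sampling distribution of d. The sketch below computes the test statistic t = mean(d) / (sd(d) / sqrt(n)) with n - 1 degrees of freedom. The function `paired_t_statistic` and the before/after scores are made up for this example, and the p-value lookup is left out.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for matched pairs of scores."""
    d = [a - b for a, b in zip(after, before)]  # one difference per pair
    n = len(d)
    mean_d = statistics.mean(d)
    sd_d = statistics.stdev(d)  # sample standard deviation (n - 1 denominator)
    return mean_d / (sd_d / math.sqrt(n)), n - 1


before = [72, 65, 80, 58, 90]
after = [75, 70, 82, 63, 91]

t_stat, dof = paired_t_statistic(before, after)
print(round(t_stat, 4), dof)
```

For these five pairs the differences are 3, 5, 2, 5 and 1, giving a mean of 3.2, a standard error of 0.8, and therefore t = 4.0 with 4 degrees of freedom.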

References:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/test-hypothesis-using-t-test

QUESTION NO: 35

You want to train a classification model using data located in a comma-separated values (CSV)
file.

The classification model will be trained via the Automated Machine Learning interface using the
Classification task type.

You have been informed that automated machine learning must assess only linear models.

Which of the following actions should you take?

A.
You should disable deep learning.

B.
You should enable automatic featurization.

C.
You should disable automatic featurization.

D.
You should set the task type to Forecasting.

Answer: C
Explanation:

References:

https://econml.azurewebsites.net/spec/estimation/dml.html

https://docs.microsoft.com/en-us/azure/machine-learning/how-to-use-automated-ml-for-ml-models

QUESTION NO: 36

You are preparing to train a regression model via automated machine learning. The data available
to you has features with missing values, as well as categorical features with few discrete values.

You want to make sure that automated machine learning is configured as follows:

missing values must be automatically imputed.

categorical features must be encoded as part of the training task.

Which of the following actions should you take?

A.
You should set the featurization parameter to 'auto'.

B.
You should set the featurization parameter to 'off'.

C.
You should set the featurization parameter to 'on'.

D.
You should set the featurization parameter to a FeaturizationConfig object.

Answer: A
Explanation:

featurization: str or FeaturizationConfig

Values: 'auto' / 'off' / FeaturizationConfig

Indicates whether the featurization step should be done automatically ('auto'), skipped ('off'),
or customized with a FeaturizationConfig object.

The column type is detected automatically. Based on the detected column type,
preprocessing/featurization is done as follows:

Categorical: Target encoding, one hot encoding, drop high cardinality categories, impute missing
values.

Numeric: Impute missing values, cluster distance, weight of evidence.

DateTime: Several features such as day, seconds, minutes, hours etc.

Text: Bag of words, pre-trained Word embedding, text target encoding.
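
To illustrate two of the steps listed above, the following sketch imputes missing numeric values
with the column mean and one-hot encodes a categorical column. It uses only the Python standard
library. The helper names impute_mean and one_hot are hypothetical, and the exact strategies
that automated machine learning applies under 'auto' may differ.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def one_hot(values):
    """Encode a categorical column as one 0/1 indicator column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows
```

For example, impute_mean([1.0, None, 3.0]) fills the gap with 2.0, and
one_hot(["red", "blue", "red"]) produces the indicator columns for "blue" and "red".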

References:

https://docs.microsoft.com/en-us/python/api/azureml-train-automl-client/azureml.train.automl.automlconfig.automlconfig

Topic 4, Develop models

QUESTION NO: 37

You are using a two-class logistic regression model to perform binary classification.

You want to evaluate the model's results for class imbalance. You need to configure the use of a
suitable evaluation metric.

Which of the following metrics should you choose?

A.
Recall

B.
Area Under the Curve (AUC)

C.
Precision

D.
Root Mean Square Error

Answer: B
Explanation:

One can inspect the true positive rate vs. the false positive rate in the Receiver Operating
Characteristic (ROC) curve and the corresponding Area Under the Curve (AUC) value. The closer
this curve is to the upper-left corner, the better the classifier’s performance (that is, the
true positive rate is maximized while the false positive rate is minimized). Curves that are close
to the diagonal of the plot result from classifiers whose predictions are close to random guessing.
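
As a rough illustration of the AUC value described above, the sketch below computes it as the
fraction of positive/negative pairs in which the positive example receives the higher score,
which is equivalent to the area under the ROC curve. It uses only the Python standard library.
The function name roc_auc is illustrative and is not the Studio implementation.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0  # positive correctly ranked above negative
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))
```

A value of 1.0 means every positive is ranked above every negative, while 0.5 corresponds to the
diagonal of the ROC plot, that is, random guessing.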

References:

https://docs.microsoft.com/en-us/azure/machine-learning/studio/evaluate-model-performance#evaluating-a-binary-classification-model

QUESTION NO: 38

You make use of Azure Machine Learning Studio to develop a linear regression model. You
perform an experiment to assess various algorithms.

Which of the following algorithms minimizes the differences between actual and predicted
values?

A.
Fast Forest Quantile Regression

B.
Poisson Regression

C.
Boosted Decision Tree Regression

D.
Linear Regression

Answer: C
Explanation:

Mean absolute error (MAE) measures how close the predictions are to the actual outcomes; thus,
a lower score is better.
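
The MAE described above can be sketched in a few lines of standard-library Python. The function
name mean_absolute_error is illustrative here and does not refer to any particular library.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one value is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

For actual values [3, 5, 2] and predictions [2, 5, 4], the absolute errors are 1, 0 and 2, so
the MAE is 1.0.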

References:

https://docs.microsoft.com/en-us/azure/machine-learning/algorithm-module-reference/boosted-decision-tree-regression

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/evaluate-model

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/linear-regression

QUESTION NO: 39

This question is included in a number of questions that depict the identical set-up.
However, every question has a distinctive result. Establish if the recommendation satisfies
the requirements.

You have been tasked with constructing a machine learning model that translates language text
into a different language text.

The machine learning model must be constructed and trained to learn the sequence of the text.

Recommendation: You make use of Convolutional Neural Networks (CNNs).

Will the requirements be satisfied?

A.
Yes

B.
No

Answer: B
Explanation:

QUESTION NO: 40

This question is included in a number of questions that depict the identical set-up.
However, every question has a distinctive result. Establish if the recommendation satisfies
the requirements.

You have been tasked with constructing a machine learning model that translates language text
into a different language text.

The machine learning model must be constructed and trained to learn the sequence of the text.

Recommendation: You make use of Generative Adversarial Networks (GANs).

Will the requirements be satisfied?

A.
Yes

B.
No

Answer: B
Explanation: