March Sale Special - Limited Time 65% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: top65certs

Microsoft DP-100 Dumps

Page: 1 / 10
Total 407 questions

Designing and Implementing a Data Science Solution on Azure Questions and Answers

Question 1

You need to identify the methods for dividing the data according to the testing requirements.

Which properties should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 2

You need to replace the missing data in the AccessibilityToHighway columns.

How should you configure the Clean Missing Data module? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 3

You need to set up the Permutation Feature Importance module according to the model training requirements.

Which properties should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 4

You need to visually identify whether outliers exist in the Age column and quantify the outliers before the outliers are removed.

Which three Azure Machine Learning Studio modules should you use in sequence? To answer, move the appropriate modules from the list of modules to the answer area and arrange them in the correct order.

Options:

Question 5

You need to configure the Permutation Feature Importance module for the model training requirements.

What should you do? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 6

You need to configure the Edit Metadata module so that the structure of the datasets match.

Which configuration options should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 7

You need to produce a visualization for the diagnostic test evaluation according to the data visualization requirements.

Which three modules should you recommend be used in sequence? To answer, move the appropriate modules from the list of modules to the answer area and arrange them in the correct order.

Options:

Question 8

You need to implement early stopping criteria as suited in the model training requirements.

Which three code segments should you use to develop the solution? To answer, move the appropriate code segments from the list of code segments to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

Options:

Question 9

You need to identify the methods for dividing the data according, to the testing requirements.

Which properties should you select? To answer, select the appropriate option-, m the answer area. NOTE: Each correct selection is worth one point.

Options:

Question 10

You need to select a feature extraction method.

Which method should you use?

Options:

A.

Mutual information

B.

Mood’s median test

C.

Kendall correlation

D.

Permutation Feature Importance

Question 11

You need to correct the model fit issue.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 12

You need to select a feature extraction method.

Which method should you use?

Options:

A.

Spearman correlation

B.

Mutual information

C.

Mann-Whitney test

D.

Pearson’s correlation

Question 13

You need to configure the Feature Based Feature Selection module based on the experiment requirements and datasets.

How should you configure the module properties? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 14

You need to select an environment that will meet the business and data requirements.

Which environment should you use?

Options:

A.

Azure HDInsight with Spark MLlib

B.

Azure Cognitive Services

C.

Azure Machine Learning Studio

D.

Microsoft Machine Learning Server

Question 15

You need to build a feature extraction strategy for the local models.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 16

You need to modify the inputs for the global penalty event model to address the bias and variance issue.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 17

You need to implement a scaling strategy for the local penalty detection data.

Which normalization type should you use?

Options:

A.

Streaming

B.

Weight

C.

Batch

D.

Cosine

Question 18

You need to define a modeling strategy for ad response.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 19

You need to implement a feature engineering strategy for the crowd sentiment local models.

What should you do?

Options:

A.

Apply an analysis of variance (ANOVA).

B.

Apply a Pearson correlation coefficient.

C.

Apply a Spearman correlation coefficient.

D.

Apply a linear discriminant analysis.

Question 20

You need to define a process for penalty event detection.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 21

You need to implement a model development strategy to determine a user’s tendency to respond to an ad.

Which technique should you use?

Options:

A.

Use a Relative Expression Split module to partition the data based on centroid distance.

B.

Use a Relative Expression Split module to partition the data based on distance travelled to the event.

C.

Use a Split Rows module to partition the data based on distance travelled to the event.

D.

Use a Split Rows module to partition the data based on centroid distance.

Question 22

You need to implement a new cost factor scenario for the ad response models as illustrated in the

performance curve exhibit.

Which technique should you use?

Options:

A.

Set the threshold to 0.5 and retrain if weighted Kappa deviates +/- 5% from 0.45.

B.

Set the threshold to 0.05 and retrain if weighted Kappa deviates +/- 5% from 0.5.

C.

Set the threshold to 0.2 and retrain if weighted Kappa deviates +/- 5% from 0.6.

D.

Set the threshold to 0.75 and retrain if weighted Kappa deviates +/- 5% from 0.15.

Question 23

You need to resolve the local machine learning pipeline performance issue. What should you do?

Options:

A.

Increase Graphic Processing Units (GPUs).

B.

Increase the learning rate.

C.

Increase the training iterations,

D.

Increase Central Processing Units (CPUs).

Question 24

You need to define a process for penalty event detection.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 25

You need to use the Python language to build a sampling strategy for the global penalty detection models.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 26

You need to define an evaluation strategy for the crowd sentiment models.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 27

You need to define an evaluation strategy for the crowd sentiment models.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 28

You create an experiment in Azure Machine Learning Studio. You add a training dataset that contains 10,000 rows. The first 9,000 rows represent class 0 (90 percent).

The remaining 1,000 rows represent class 1 (10 percent).

The training set is imbalances between two classes. You must increase the number of training examples for class 1 to 4,000 by using 5 data rows. You add the Synthetic Minority Oversampling Technique (SMOTE) module to the experiment.

You need to configure the module.

Which values should you use? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 29

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are a data scientist using Azure Machine Learning Studio.

You need to normalize values to produce an output column into bins to predict a target column.

Solution: Apply a Quantiles binning mode with a PQuantile normalization.

Does the solution meet the goal?

Options:

A.

Yes

B.

No

Question 30

You have an Azure Machine Learning workspace named workspace1 that is accessible from a public endpoint. The workspace contains an Azure Blob storage datastore named store1 that represents a blob container in an Azure storage account named account1. You configure workspace1 and account1 to be accessible by using private endpoints in the same virtual network.

You must be able to access the contents of store1 by using the Azure Machine Learning SDK for Python. You must be able to preview the contents of store1 by using Azure Machine Learning studio.

You need to configure store1.

What should you do? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 31

You create an Azure Databricks workspace and a linked Azure Machine Learning workspace.

You have the following Python code segment in the Azure Machine Learning workspace:

import mlflow

import mlflow.azureml

import azureml.mlflow

import azureml.core

from azureml.core import Workspace

subscription_id = 'subscription_id'

resourse_group = 'resource_group_name'

workspace_name = 'workspace_name'

ws = Workspace.get(name=workspace_name,

subscription_id=subscription_id,

resource_group=resource_group)

experimentName = "/Users/{user_name}/{experiment_folder}/{experiment_name}"

mlflow.set_experiment(experimentName)

uri = ws.get_mlflow_tracking_uri()

mlflow.set_tracking_uri(uri)

Instructions: For each of the following statements, select Yes if the statement is true. Otherwise, select No.

NOTE: Each correct selection is worth one point.

Options:

Question 32

You create an Azure Machine Learning compute resource to train models. The compute resource is configured as follows:

  • Minimum nodes: 2
  • Maximum nodes: 4

You must decrease the minimum number of nodes and increase the maximum number of nodes to the following values:

  • Minimum nodes: 0
  • Maximum nodes: 8

You need to reconfigure the compute resource.

What are three possible ways to achieve this goal? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

Options:

A.

Azure Machine Learning designer

B.

Azure CLI ml extension v2

C.

Azure Machine Learning studio

D.

BuildContext class in Python SDK v2

E.

MLCIient class in Python SDK v2

Question 33

You have the following Azure subscriptions and Azure Machine Learning service workspaces:

You need to obtain a reference to the ml-project workspace.

Solution: Run the following Python code:

Does the solution meet the goal?

Options:

A.

Yes

B.

No

Question 34

You are implementing a machine learning model to predict stock prices.

The model uses a PostgreSQL database and requires GPU processing.

You need to create a virtual machine that is pre-configured with the required tools.

What should you do?

Options:

A.

Create a Data Science Virtual Machine (DSVM) Windows edition.

B.

Create a Geo Al Data Science Virtual Machine (Geo-DSVM) Windows edition.

C.

Create a Deep Learning Virtual Machine (DLVM) Linux edition.

D.

Create a Deep Learning Virtual Machine (DLVM) Windows edition.

E.

Create a Data Science Virtual Machine (DSVM) Linux edition.

Question 35

You are developing deep learning models to analyze semi-structured, unstructured, and structured data types.

You have the following data available for model building:

  • Video recordings of sporting events
  • Transcripts of radio commentary about events
  • Logs from related social media feeds captured during sporting events

You need to select an environment for creating the model.

Which environment should you use?

Options:

A.

Azure Cognitive Services

B.

Azure Data Lake Analytics

C.

Azure HDInsight with Spark MLib

D.

Azure Machine Learning Studio

Question 36

You create an Azure Machine Learning workspace.

You must configure an event-driven workflow to automatically trigger upon completion of training runs in the workspace. The solution must minimize the administrative effort to configure the trigger.

You need to configure an Azure service to automatically trigger the workflow.

Which Azure service should you use?

Options:

A.

Event Grid subscription

B.

Azure Automation runbook

C.

Event Hubs Capture

D.

Event Hubs consumer

Question 37

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You plan to use a Python script to run an Azure Machine Learning experiment. The script creates a reference to the experiment run context, loads data from a file, identifies the set of unique values for the label column, and completes the experiment run:

from azureml.core import Run

import pandas as pd

run = Run.get_context()

data = pd.read_csv('data.csv')

label_vals = data['label'].unique()

# Add code to record metrics here

run.complete()

The experiment must record the unique labels in the data as metrics for the run that can be reviewed later.

You must add code to the script to record the unique label values as run metrics at the point indicated by the comment.

Solution: Replace the comment with the following code:

for label_val in label_vals:

run.log('Label Values', label_val)

Does the solution meet the goal?

Options:

A.

Yes

B.

No

Question 38

You use Azure Machine Learning designer to create a training pipeline for a regression model.

You need to prepare the pipeline for deployment as an endpoint that generates predictions asynchronously for a dataset of input data values.

What should you do?

Options:

A.

Clone the training pipeline.

B.

Create a batch inference pipeline from the training pipeline.

C.

Create a real-time inference pipeline from the training pipeline.

D.

Replace the dataset in the training pipeline with an Enter Data Manually module.

Question 39

You are conducting feature engineering to prepuce data for further analysis.

The data includes seasonal patterns on inventory requirements.

You need to select the appropriate method to conduct feature engineering on the data.

Which method should you use?

Options:

A.

Exponential Smoothing (ETS) function.

B.

One Class Support Vector Machine module

C.

Time Series Anomaly Detection module

D.

Finite Impulse Response (FIR) Filter module.

Question 40

You have an Azure Machine learning workspace. The workspace contains a dataset with data in a tabular form.

You plan to use the Azure Machine Learning SDK for Python vl to create a control script that will load the dataset into a pandas dataframe in preparation for model training The script will accept a parameter designating the dataset

You need to complete the script.

How should you complete the script? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Page: 1 / 10
Total 407 questions