Latest Microsoft DP-100 Dumps PDF Questions Answers 2025

Designing and Implementing a Data Science Solution on Azure Questions and Answers

Question 1

You need to define a process for penalty event detection.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Buy Now

Question 2

You need to implement a model development strategy to determine a user’s tendency to respond to an ad.

Which technique should you use?

Options:

Use a Relative Expression Split module to partition the data based on centroid distance.

Use a Relative Expression Split module to partition the data based on distance travelled to the event.

Use a Split Rows module to partition the data based on distance travelled to the event.

Use a Split Rows module to partition the data based on centroid distance.

Question 3

You need to use the Python language to build a sampling strategy for the global penalty detection models.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 4

You need to implement a new cost factor scenario for the ad response models as illustrated in the

performance curve exhibit.

Which technique should you use?

Options:

Set the threshold to 0.5 and retrain if weighted Kappa deviates +/- 5% from 0.45.

Set the threshold to 0.05 and retrain if weighted Kappa deviates +/- 5% from 0.5.

Set the threshold to 0.2 and retrain if weighted Kappa deviates +/- 5% from 0.6.

Set the threshold to 0.75 and retrain if weighted Kappa deviates +/- 5% from 0.15.

Question 5

You need to resolve the local machine learning pipeline performance issue. What should you do?

Options:

Increase Graphic Processing Units (GPUs).

Increase the learning rate.

Increase the training iterations,

Increase Central Processing Units (CPUs).

Question 6

You need to select an environment that will meet the business and data requirements.

Which environment should you use?

Options:

Azure HDInsight with Spark MLlib

Azure Cognitive Services

Azure Machine Learning Studio

Microsoft Machine Learning Server

Question 7

You need to implement a scaling strategy for the local penalty detection data.

Which normalization type should you use?

Options:

Streaming

Weight

Batch

Cosine

Question 8

You need to modify the inputs for the global penalty event model to address the bias and variance issue.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 9

You need to define a process for penalty event detection.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 10

You need to define a modeling strategy for ad response.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 11

You need to implement a feature engineering strategy for the crowd sentiment local models.

What should you do?

Options:

Apply an analysis of variance (ANOVA).

Apply a Pearson correlation coefficient.

Apply a Spearman correlation coefficient.

Apply a linear discriminant analysis.

Question 12

You need to define an evaluation strategy for the crowd sentiment models.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 13

You need to define an evaluation strategy for the crowd sentiment models.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 14

You need to build a feature extraction strategy for the local models.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 15

You plan to implement an Azure Machine Learning solution. You have the following requirements:

• Run a Jupyter notebook to interactively tram a machine learning model.

• Deploy assets and workflows for machine learning proof of concept by using scripting rather than custom programming.

You need to select a development technique for each requirement

Which development technique should you use? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 16

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are creating a new experiment in Azure Machine Learning Studio.

One class has a much smaller number of observations than tin- other classes in the training set.

You need to select an appropriate data sampling strategy to compensate for the class imbalance.

Solution: You use the Principal Components Analysis (PCA) sampling mode.

Does the solution meet the goal?

Options:

Yes

Question 17

You manage an Azure Machine Learning workspace by using the Python SDK v2.

You must create a compute cluster in the workspace. The compute cluster must run workloads and properly handle interruptions. You start by calculating the maximum amount of compute resources required by the workloads and size the cluster to match the calculations.

The cluster definition includes the following properties and values:

• name="mlcluster1’’

• size="STANDARD.DS3.v2"

• min_instances=1

• maxjnstances=4

• tier="dedicated"

The cost of the compute resources must be minimized when a workload is active Of idle. Cluster property changes must not affect the maximum amount of compute resources available to the workloads run on the cluster.

You need to modify the cluster properties to minimize the cost of compute resources.

Which properties should you modify? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 18

You use the Azure Machine learning SDK foe Python to create a pipeline that includes the following step:

The output of the step run must be cached and reused on subsequent runs when the source.directory value has not changed.

You need to define the step.

What should you include in the step definition?

Options:

allow.reuse

hash_path

data-as_input(name-)

version

Question 19

You have a feature set containing the following numerical features: X, Y, and Z.

The Poisson correlation coefficient (r-value) of X, Y, and Z features is shown in the following image:

Use the drop-down menus to select the answer choice that answers each question based on the information presented in the graphic.

NOTE: Each correct selection is worth one point.

Options:

Question 20

You manage an Azure Al Foundry project.

You need to develop a solution that uses an Azure OpenAI Service model designed to support reasoning and problem solving. Which model should you use?

Options:

o1-mini

Whisper

GPT-4

Embeddirigs

Question 21

You create an Azure Machine Learning pipeline named pipeline1 with two steps that contain Python scripts. Data processed by the first step is passed to the second step.

You must update the content of the downstream data source of pipeline1 and run the pipeline again

You need to ensure the new run of pipeline1 fully processes the updated content.

Solution: Set the allow_reuse parameter of the PythonScriptStep object of both steps to False

Does the solution meet the goal?

Options:

Yes

Question 22

You are a lead data scientist for a project that tracks the health and migration of birds. You create a multi-class image classification deep learning model that uses a set of labeled bird photographs collected by experts.

You have 100,000 photographs of birds. All photographs use the JPG format and are stored in an Azure blob container in an Azure subscription.

You need to access the bird photograph files in the Azure blob container from the Azure Machine Learning service workspace that will be used for deep learning model training. You must minimize data movement.

What should you do?

Options:

Create an Azure Data Lake store and move the bird photographs to the store.

Create an Azure Cosmos DB database and attach the Azure Blob containing bird photographs storage to the database.

Create and register a dataset by using TabularDataset class that references the Azure blob storagecontaining bird photographs.

Copy the bird photographs to the blob datastore that was created with your Azure Machine Learningservice workspace.

Question 23

You are performing feature scaling by using the scikit-learn Python library for x.1 x2, and x3 features.

Original and scaled data is shown in the following image.

Use the drop-down menus to select the answer choice that answers each question based on the information presented in the graphic.

NOTE: Each correct selection is worth one point.

Options:

Question 24

You are analyzing a dataset containing historical data from a local taxi company. You arc developing a regression a regression model.

You must predict the fare of a taxi trip.

You need to select performance metrics to correctly evaluate the- regression model.

Which two metrics can you use? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

Options:

an F1 score that is high

an R Squared value dose to 1

an R-Squared value close to 0

a Root Mean Square Error value that is high

a Root Mean Square Error value that is low

an F 1 score that is low.

Question 25

You have fine-tuned an Azure OpenAI Service model by using the Azure Ai Foundry portal. The fine-tuned model is overfitting.

You plan to correct overfitting by fine-tuning the model again

You need to modify the default value of a fine-tuning task parameter to minimize the possibility of overfitting. Which modification should you apply?

Options:

Increase the learning_rate_multiplier.

increase the batch_size.

Decrease the batch_size.

Decrease the leaming_rate_multiplier.

Question 26

You manage an Azure Machine Learning won pace named workspace 1 by using the Python SDK v2. You create a Gene-al Purpose v2 Azure storage account named mlstorage1. The storage account includes a pulley accessible container name micOTtalnerl. The container stores 10 blobs with files in the CSV format.

You must develop Python SDK v2 code to create a data asset referencing all blobs in the container named mtcontamer1.

You need to complete the Python SDK v2 code.

How should you complete the code? To answer select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 27

You have an Azure Machine Learning workspace.

You plan to run a job to tram a model as an MLflow model output.

You need to specify the output mode of the MLflow model.

Which three modes can you specify? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

Options:

rw_mount

ro mount

upload

download

direct

Question 28

You use Azure Machine Learning Designer to load the following datasets into an experiment:

Dataset1

Dataset2

You use Azure Machine Learning Designer to load the following datasets into an experiment:

You need to create a dataset that has the same columns and header row as the input datasets and contains all rows from both input datasets.

Solution: Use the Join Data component.

Does the solution meet the goal?

Options:

Yes

Question 29

space and set up a development environment. You plan to train a deep neural network (DNN) by using the Tensorflow framework and by using estimators to submit training scripts.

You must optimize computation speed for training runs.

You need to choose the appropriate estimator to use as well as the appropriate training compute target configuration.

Which values should you use? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 30

You have an Azure Machine Learning (ML) model deployed to an online endpoint.

You need to review container logs from the endpoint by using Azure Ml Python SDK v2. The logs must include the console log from the inference server with print/log statements from the models scoring script.

What should you do first?

Options:

Create an instance of the the MLCIient class.

Create an instance of the OnlineDeploymentOperations class.

Connect by using SSH to the inference server.

Connect by using Docker tools to the inference server.

Question 31

You build a data pipeline in an Azure Machine Learning workspace by using the Azure Machine Learning SDK for Python.

You need to run a Python script as a pipeline step.

Which two classes could you use? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

Options:

PythonScriptStep

AutoMLStep

CommandStep

StepRun

Question 32

You are with a time series dataset in Azure Machine Learning Studio.

You need to split your dataset into training and testing subsets by using the Split Data module.

Which splitting mode should you use?

Options:

Regular Expression Split

Split Rows with the Randomized split parameter set to true

Relative Expression Split

Recommender Split

Question 33

You create a binary classification model.

You need to evaluate the model performance.

Which two metrics can you use? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

Options:

relative absolute error

precision

accuracy

mean absolute error

coefficient of determination

Question 34

You are the owner of an Azure Machine Learning workspace.

You must prevent the creation or deletion of compute resources by using a custom role. You must allow all other operations inside the workspace.

You need to configure the custom role.

How should you complete the configuration? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 35

You use Azure Machine Learning studio to analyze an mltable data asset containing a decimal column named column1. You need to verify that the column1 values are normally distributed.

Which statistic should you use?

Options:

Max

Type

Profile

Mean

Question 36

You create a binary classification model using Azure Machine Learning Studio.

You must use a Receiver Operating Characteristic (RO C) curve and an F1 score to evaluate the model.

You need to create the required business metrics.

How should you complete the experiment? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 37

You manage an Azure Machine Learning workspace. The Pylhon scrip! named scriptpy reads an argument named training_data. The trainlng.data argument specifies the path to the training data in a file named datasetl.csv.

You plan to run the scriptpy Python script as a command job that trains a machine learning model.

You need to provide the command to pass the path for the datasct as a parameter value when you submit the script as a training job.

Solution: python script.py –training_data dataset1,csv

Does the solution meet the goal?

Options:

Yes

Question 38

You have an Azure Machine Learning workspace.

You plan to use automated machine learning in the workspace to train a natural language processing model for multi-class classification. You need to provide a dataset for training the model. How should you format the data?

Options:

exactly one text column and one or more label columns

one or more text columns and exactly one label column

one or more text columns and one or more label columns

exactly one text column and exactly one label column

Question 39

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure Machine Learning workspace. You connect to a terminal session from the Notebooks page in Azure Machine Learning studio.

You plan to add a new Jupyter kernel that will be accessible from the same terminal session.

You need to perform the task that must be completed before you can add the new kernel.

Solution: Create an environment.

Does the solution meet the goal?

Options:

Yes

Question 40

You are planning to register a trained model in an Azure Machine Learning workspace.

You must store additional metadata about the model in a key-value format. You must be able to add new metadata and modify or delete metadata after creation.

You need to register the model.

Which parameter should you use?

Options:

description

model_framework

cags

properties

Question 41

You manage are Azure Machine Learning workspace by using the Python SDK v2.

You must create an automated machine learning job to generate a classification model by using data files stored in Parquet format. You must configure an auto scaling compute target and a data asset for the job.

You need to configure the resources for the job.

Which resource configuration should you use? to answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 42

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

An IT department creates the following Azure resource groups and resources:

The IT department creates an Azure Kubernetes Service (AKS)-based inference compute target named aks-cluster in the Azure Machine Learning workspace.

You have a Microsoft Surface Book computer with a GPU. Python 3.6 and Visual Studio Code are installed.

You need to run a script that trains a deep neural network (DNN) model and logs the loss and accuracy metrics.

Solution: Install the Azure ML SDK on the Surface Book. Run Python code to connect to the workspace and then run the training script as an experiment on local compute.

Options:

Yes

Question 43

You are using an Azure Machine Learning workspace. You set up an environment for model testing and an environment for production.

The compute target for testing must minimize cost and deployment efforts. The compute target for production must provide fast response time, autoscaling of the deployed service, and support real-time inferencing.

You need to configure compute targets for model testing and production.

Which compute targets should you use? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 44

You are analyzing a raw dataset that requires cleaning.

You must perform transformations and manipulations by using Azure Machine Learning Studio.

You need to identify the correct modules to perform the transformations.

Which modules should you choose? To answer, drag the appropriate modules to the correct scenarios. Each module may be used once, more than once, or not at all.

You may need to drag the split bar between panes or scroll to view content.

NOTE: Each correct selection is worth one point.

Options:

Question 45

You load data from a notebook in an Azure Machine Learning workspace into a pandas dataframe named df. The data contains 10.000 patient records. Each record includes the Age property for the corresponding patient.

You must identify the mean age value from the differentially private data generated by SmartNoise SDK.

You need to complete the Python code that will generate the mean age value from the differentially private data.

Which code segments should you use? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 46

You develop a chat flow in an Azure Al Foundry project

You plan to include a Jinja language-based prompt template in the How

You need to complete the provided template to display a list of inputs and outputs included in the flow.

How should you complete the provided template? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 47

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You create an Azure Machine Learning service datastore in a workspace. The datastore contains the following files:

• /data/2018/Q1 .csv

• /data/2018/Q2.csv

• /data/2018/Q3.csv

• /data/2018/Q4.csv

• /data/2019/Q1.csv

All files store data in the following format:

id,M,f2,l

1,1,2,0

2,1,1,1

32,10

You run the following code:

You need to create a dataset named training_data and load the data from all files into a single data frame by using the following code:

Solution: Run the following code:

Does the solution meet the goal?

Options:

Yes

Question 48

You are moving a large dataset from Azure Machine Learning Studio to a Weka environment.

You need to format the data for the Weka environment.

Which module should you use?

Options:

Convert to CSV

Convert to Dataset

Convert to ARFF

Convert to SVMLight

Question 49

You manage an Azure Machine Learning workspace. The development environment for managing the workspace is configured to use Python SDK v2 in Azure Machine Learning Notebooks.

A Synapse Spark Compute is currently attached and uses system-assigned identity.

You need to use Python code to update the Synapse Spark Compute to use a user-assigned identity.

Solution: Initialize the DefaultAzureCredential class.

Does the solution meet the goal?

Options:

Yes

Question 50

You are working on a classification task. You have a dataset indicating whether a student would like to play soccer and associated attributes. The dataset includes the following columns:

You need to classify variables by type.

Which variable should you add to each category? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 51

You have an Azure Machine Learning workspace and a data source file ./data/cc_data.csv in the local storage.

You plan to use Azure Machine Learning Python SDK v2 to store the content of the cc.data.csv file in a data asset named cc_data_asset in the workspace.

You write code to connect to the workspace and import all required libraries.

You need to complete the remaining code to ensure it will result in the cc_data_asset that contains the data from cc_data.csv.

How should you complete the code? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 52

You need to set up the Permutation Feature Importance module according to the model training requirements.

Which properties should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 53

You need to select a feature extraction method.

Which method should you use?

Options:

Mutual information

Mood’s median test

Kendall correlation

Permutation Feature Importance

Question 54

You need to implement early stopping criteria as suited in the model training requirements.

Which three code segments should you use to develop the solution? To answer, move the appropriate code segments from the list of code segments to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

Options:

Question 55

You need to replace the missing data in the AccessibilityToHighway columns.

How should you configure the Clean Missing Data module? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 56

You need to produce a visualization for the diagnostic test evaluation according to the data visualization requirements.

Which three modules should you recommend be used in sequence? To answer, move the appropriate modules from the list of modules to the answer area and arrange them in the correct order.

Options:

Question 57

You need to identify the methods for dividing the data according, to the testing requirements.

Which properties should you select? To answer, select the appropriate option-, m the answer area. NOTE: Each correct selection is worth one point.

Options:

Question 58

You need to visually identify whether outliers exist in the Age column and quantify the outliers before the outliers are removed.

Which three Azure Machine Learning Studio modules should you use in sequence? To answer, move the appropriate modules from the list of modules to the answer area and arrange them in the correct order.

Options:

Question 59

You need to configure the Edit Metadata module so that the structure of the datasets match.

Which configuration options should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 60

You need to correct the model fit issue.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 61

You need to select a feature extraction method.

Which method should you use?

Options:

Spearman correlation

Mutual information

Mann-Whitney test

Pearson’s correlation

Question 62

You need to identify the methods for dividing the data according to the testing requirements.

Which properties should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 63

You need to configure the Feature Based Feature Selection module based on the experiment requirements and datasets.

How should you configure the module properties? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 64

You need to configure the Permutation Feature Importance module for the model training requirements.

What should you do? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Exam Detail

Vendor: Microsoft

Certification: Microsoft Azure

Exam Code: DP-100

Exam Name: Designing and Implementing a Data Science Solution on Azure

Last Update: Aug 18, 2025

DP-100 Question Answers

Summer Special - Limited Time 65% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: top65certs

Free and Premium Microsoft DP-100 Dumps Questions Answers

Designing and Implementing a Data Science Solution on Azure Questions and Answers

Options:

Answer:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Options:

Answer:

Options:

Answer:

Explanation:

Options:

Answer:

Options:

Answer:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Options:

Answer:

Options:

Answer:

Explanation:

Options:

Answer:

Options:

Answer:

Options:

Answer:

Explanation:

Options:

Answer:

Options:

Answer:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Explanation:

Options:

Answer:

Options:

Answer:

Options:

Answer:

Options:

Answer:

Options:

Answer:

Explanation:

Options:

Answer:

Options:

Answer:

Explanation: