
MLS-C01 Review Questions

Page: 5 / 20
Total 281 questions

AWS Certified Machine Learning - Specialty Questions and Answers

Question 17

A gaming company has launched an online game where people can start playing for free, but they need to pay if they choose to use certain features. The company needs to build an automated system to predict whether or not a new user will become a paid user within 1 year. The company has gathered a labeled dataset from 1 million users.

The training dataset consists of 1,000 positive samples (from users who ended up paying within 1 year) and 999,000 negative samples (from users who did not use any paid features). Each data sample consists of 200 features, including user age, device, location, and play patterns.

Using this dataset for training, the Data Science team trained a random forest model that converged with over 99% accuracy on the training set. However, the prediction results on a test dataset were not satisfactory.

Which of the following approaches should the Data Science team take to mitigate this issue? (Select TWO.)

Options:

A.

Add more deep trees to the random forest to enable the model to learn more features.

B.

Include a copy of the samples in the test dataset in the training dataset.

C.

Generate more positive samples by duplicating the positive samples and adding a small amount of noise to the duplicated data.

D.

Change the cost function so that false negatives have a higher impact on the cost value than false positives.

E.

Change the cost function so that false positives have a higher impact on the cost value than false negatives.
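
Illustration: a minimal scikit-learn sketch of what options C and D look like in practice, using synthetic data. The array sizes are scaled down from the question's 1,000/999,000 split, and the noise scale and 10:1 class weight are assumed values chosen only for demonstration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_pos = rng.normal(1.0, 1.0, size=(100, 200))    # minority class: paid users (scaled down)
X_neg = rng.normal(0.0, 1.0, size=(9_900, 200))  # majority class: free users (scaled down)

# Option C: oversample the minority class by duplicating the positive
# samples and adding a small amount of noise to each duplicate.
X_pos_aug = np.vstack(
    [X_pos + rng.normal(0, 0.05, size=X_pos.shape) for _ in range(10)]
)

X = np.vstack([X_neg, X_pos, X_pos_aug])
y = np.concatenate([np.zeros(len(X_neg)), np.ones(len(X_pos) + len(X_pos_aug))])

# Option D: weight the cost function so that missing a future payer
# (a false negative) is penalized more heavily than a false positive.
clf = RandomForestClassifier(n_estimators=100, class_weight={0: 1.0, 1: 10.0}, n_jobs=-1)
clf.fit(X, y)
```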

Question 18

A Machine Learning Specialist uploads a dataset to an Amazon S3 bucket protected with server-side encryption using AWS KMS.

How should the ML Specialist define the Amazon SageMaker notebook instance so it can read the same dataset from Amazon S3?

Options:

A.

Define security group(s) to allow all HTTP inbound/outbound traffic and assign those security group(s) to the Amazon SageMaker notebook instance.

B.

Configure the Amazon SageMaker notebook instance to have access to the VPC. Grant permission in the KMS key policy to the notebook’s KMS role.

C.

Assign an IAM role to the Amazon SageMaker notebook with S3 read access to the dataset. Grant permission in the KMS key policy to that role.

D.

Assign the same KMS key used to encrypt data in Amazon S3 to the Amazon SageMaker notebook instance.
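
Illustration: a minimal boto3 sketch of the pattern in option C. The role ARN, account ID, bucket, and object key below are hypothetical placeholders; the key-policy statement is shown as a plain dict for readability.

```python
import boto3

# Assumed: the notebook's execution role is granted kms:Decrypt in the
# KMS key policy with a statement along these lines (placeholder ARN).
key_policy_statement = {
    "Sid": "AllowNotebookRoleDecrypt",
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::123456789012:role/SageMakerNotebookRole"},
    "Action": ["kms:Decrypt", "kms:DescribeKey"],
    "Resource": "*",
}

# With S3 read access on the role and kms:Decrypt in the key policy, a plain
# GetObject works: S3 calls KMS on the caller's behalf, so no decryption
# parameters are needed in the request.
s3 = boto3.client("s3")
obj = s3.get_object(Bucket="my-training-data", Key="dataset/train.csv")
data = obj["Body"].read()
```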

Question 19

A Data Scientist needs to migrate an existing on-premises ETL process to the cloud. The current process runs at regular time intervals and uses PySpark to combine and format multiple large data sources into a single consolidated output for downstream processing.

The Data Scientist has been given the following requirements for the cloud solution:

* Combine multiple data sources

* Reuse existing PySpark logic

* Run the solution on the existing schedule

* Minimize the number of servers that will need to be managed

Which architecture should the Data Scientist use to build this solution?

Options:

A.

Write the raw data to Amazon S3. Schedule an AWS Lambda function to submit a Spark step to a persistent Amazon EMR cluster based on the existing schedule. Use the existing PySpark logic to run the ETL job on the EMR cluster. Output the results to a "processed" location in Amazon S3 that is accessible for downstream use.

B.

Write the raw data to Amazon S3. Create an AWS Glue ETL job to perform the ETL processing against the input data. Write the ETL job in PySpark to leverage the existing logic. Create a new AWS Glue trigger to trigger the ETL job based on the existing schedule. Configure the output target of the ETL job to write to a "processed" location in Amazon S3 that is accessible for downstream use.

C.

Write the raw data to Amazon S3. Schedule an AWS Lambda function to run on the existing schedule and process the input data from Amazon S3. Write the Lambda logic in Python and implement the existing PySpark logic to perform the ETL process. Have the Lambda function output the results to a "processed" location in Amazon S3 that is accessible for downstream use.

D.

Use Amazon Kinesis Data Analytics to stream the input data and perform real-time SQL queries against the stream to carry out the required transformations within the stream. Deliver the output results to a "processed" location in Amazon S3 that is accessible for downstream use.
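
Illustration: a minimal AWS Glue PySpark job skeleton in the spirit of option B. The S3 paths, join column, and transform are hypothetical placeholders; the existing PySpark logic would slot in where noted. (This runs inside Glue, where the awsglue library is available, not locally.)

```python
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

# Standard Glue job bootstrap: resolve arguments and initialize the job.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Combine multiple data sources (hypothetical raw input paths).
df_a = spark.read.parquet("s3://raw-bucket/source-a/")
df_b = spark.read.parquet("s3://raw-bucket/source-b/")

# The existing PySpark combine/format logic would be reused here.
combined = df_a.join(df_b, on="id", how="inner")

# Write to a "processed" location for downstream use.
combined.write.mode("overwrite").parquet("s3://processed-bucket/output/")
job.commit()
```

A Glue trigger with a cron schedule then replaces the existing time-interval scheduler, keeping the solution serverless.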

Question 20

A Machine Learning Specialist is implementing a full Bayesian network on a dataset that describes public transit in New York City. One of the random variables is discrete and represents the number of minutes New Yorkers wait for a bus, given that the buses cycle every 10 minutes with a mean of 3 minutes.

Which prior probability distribution should the ML Specialist use for this variable?

Options:

A.

Poisson distribution

B.

Uniform distribution

C.

Normal distribution

D.

Binomial distribution
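
Illustration: a short scipy sketch comparing the shapes of the discrete candidate distributions over 0 to 10 minutes. The parameterizations (mu=3, n=10 with p=0.3) are illustrative choices that match the stated mean of 3, not an answer key; note the normal distribution is continuous and so is omitted.

```python
from scipy import stats

minutes = range(11)
poisson_pmf = [stats.poisson.pmf(k, mu=3) for k in minutes]     # discrete, mean 3
uniform_pmf = [1 / 11] * 11                                     # discrete uniform over 0..10, mean 5
binom_pmf = [stats.binom.pmf(k, n=10, p=0.3) for k in minutes]  # discrete, mean n*p = 3

for k, p, u, b in zip(minutes, poisson_pmf, uniform_pmf, binom_pmf):
    print(f"{k:2d} min  Poisson={p:.3f}  Uniform={u:.3f}  Binomial={b:.3f}")
```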
