Vce DY0-001 Questions Latest

Page: 2 / 6
Total 85 questions

CompTIA DataX Exam Questions and Answers

Question 5

A data scientist is designing a real-time machine-learning model that classifies a user based on initial behavior. The run times of these models are provided in the following table:

Which of the following models should the data scientist recommend for deployment?

Options:

A. XGBoost

B. Random forest

C. Decision trees

D. Artificial neural network

Question 6

A statistician notices gaps in data associated with age-related illnesses and wants to further aggregate these observations. Which of the following is the best technique to achieve this goal?

Options:

A. Label encoding

B. Linearization

C. Binning

D. Imputing
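
For reference, aggregating individual ages into ranges (binning) can be sketched as below. This is a minimal illustration only: the bin edges, labels, and sample ages are hypothetical and do not come from the question.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical bin edges and labels, chosen purely for illustration.
EDGES = [40, 60, 80]
LABELS = ["<40", "40-59", "60-79", "80+"]


def age_bin(age):
    """Return the label of the age range that contains `age`."""
    # bisect_right counts how many edges are <= age, which is the bin index.
    return LABELS[bisect_right(EDGES, age)]


# Made-up sample ages; sparse individual values collapse into a few groups.
ages = [23, 41, 59, 60, 67, 79, 80, 91]
counts = Counter(age_bin(a) for a in ages)

for label in LABELS:
    print(label, counts[label])
```

Grouping in this way trades per-year detail for fuller counts per group, which is why it is commonly used when individual values are thinly populated.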

Question 7

Which of the following does k represent in the k-means model?

Options:

A. Number of model tests

B. Number of data splits

C. Number of clusters

D. Distance between features
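
To make the role of k concrete, here is a minimal sketch of the standard k-means (Lloyd's) iteration in plain Python. The sample points, the naive "first k points" initialization, and the fixed iteration count are assumptions for illustration, not part of the question.

```python
def kmeans(points, k, iterations=10):
    """Plain Lloyd's algorithm on 2-D points; k is the number of clusters."""
    # Naive initialization (an assumption): use the first k points as centroids.
    centroids = list(points[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centroids[i][0]) ** 2
                + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, clusters


# Made-up data with two visually obvious groups.
points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, clusters = kmeans(points, k=2)
print(len(centroids), [len(c) for c in clusters])
```

The value passed as `k` fixes how many centroids, and therefore how many groups, the algorithm produces.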

Question 8

A data scientist receives an update on a business case about a machine that has thousands of error codes. The data scientist creates the following summary statistics profile while reviewing the logs for each machine:

| Statistic | Value |
|---|---|
| Number of machines observed | 3,000,000 |
| Number of unique error codes observed | 19,000 |
| Median number of unique codes per machine | 7 |
| Median number of error transactions | 45 |

Which of the following is the most likely concern with respect to data design for model ingestion?

Options:

A. Sparse matrix

B. Granularity misalignment

C. Insufficient features

D. Multivariate outliers
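
The arithmetic behind the summary statistics can be worked through as follows. This sketch assumes a layout with one row per machine and one column per error code, which the question does not specify; the variable names are hypothetical.

```python
# Figures taken from the summary statistics in the question.
machines = 3_000_000
unique_codes = 19_000
median_codes_per_machine = 7

# Assumed layout: one row per machine, one column per error code.
total_cells = machines * unique_codes  # 57,000,000,000 cells

# Share of columns that are non-zero for a machine at the median.
typical_density = median_codes_per_machine / unique_codes

print(f"total cells: {total_cells:,}")
print(f"typical row density: {typical_density:.4%}")
```

Under that assumed layout, a machine at the median would populate only 7 of 19,000 columns, so nearly every cell in its row would be empty.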
