Microsoft Certified: Azure Data Engineer Associate DP-203 Exam Questions and Answers PDF

Data Engineering on Microsoft Azure Questions and Answers

Question 21

You are designing a partition strategy for a fact table in an Azure Synapse Analytics dedicated SQL pool. The table has the following specifications:

• Contain sales data for 20,000 products.

• Use hash distribution on a column named ProductID.

• Contain 2.4 billion records for the years 2019 and 2020.

Which number of partition ranges provides optimal compression and performance of the clustered columnstore index?

Options:

40

240

400

2,400
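The sizing arithmetic behind this question can be sketched as follows. This is a rough illustration, not an official formula: it assumes the commonly cited guidance that a dedicated SQL pool spreads every table over 60 distributions and that a clustered columnstore index compresses best with roughly 1 million rows per distribution and partition. The function names are hypothetical.

```python
# Assumption: a dedicated SQL pool always uses 60 distributions.
DISTRIBUTIONS = 60
# Assumption: about 1 million rows per distribution and partition is the
# commonly cited target for good columnstore compression.
TARGET_ROWS = 1_000_000


def rows_per_distribution_and_partition(total_rows: int, partitions: int) -> float:
    """Average rows that land in one partition of one distribution."""
    return total_rows / (DISTRIBUTIONS * partitions)


def max_partitions(total_rows: int) -> int:
    """Largest partition count that still keeps TARGET_ROWS per slice."""
    return total_rows // (DISTRIBUTIONS * TARGET_ROWS)


total = 2_400_000_000  # 2.4 billion records, from the question
print(max_partitions(total))  # 40

# Compare a few candidate partition counts against the target.
for candidate in (40, 240, 400, 2400):
    rows = rows_per_distribution_and_partition(total, candidate)
    print(candidate, round(rows), rows >= TARGET_ROWS)
```

Under these assumptions, any partition count above 40 leaves fewer than 1 million rows in each slice, so the rowgroups would be underfilled.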

Question 22

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You plan to create an Azure Databricks workspace that has a tiered structure. The workspace will contain the following three workloads:

A workload for data engineers who will use Python and SQL.

A workload for jobs that will run notebooks that use Python, Scala, and SQL.

A workload that data scientists will use to perform ad hoc analysis in Scala and R.

The enterprise architecture team at your company identifies the following standards for Databricks environments:

The data engineers must share a cluster.

The job cluster will be managed by using a request process whereby data scientists and data engineers provide packaged notebooks for deployment to the cluster.

All the data scientists must be assigned their own cluster that terminates automatically after 120 minutes of inactivity. Currently, there are three data scientists.

You need to create the Databricks clusters for the workloads.

Solution: You create a High Concurrency cluster for each data scientist, a High Concurrency cluster for the data engineers, and a Standard cluster for the jobs.

Does this meet the goal?

Options:

Yes

No
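One way to reason about the proposed solution is to check each workload's languages against what its cluster mode supports. The sketch below assumes the language support Databricks documented for its legacy cluster modes (Standard: Python, R, Scala, SQL; High Concurrency: Python, R, SQL). The dictionaries and the function name are hypothetical and exist only for illustration.

```python
# Assumption: language support per legacy cluster mode, as documented by
# Databricks when these modes were offered.
SUPPORTED_LANGUAGES = {
    "Standard": {"Python", "R", "Scala", "SQL"},
    "High Concurrency": {"Python", "R", "SQL"},
}

# Languages each workload needs, taken from the scenario.
WORKLOADS = {
    "data engineers": {"Python", "SQL"},
    "jobs": {"Python", "Scala", "SQL"},
    "data scientists": {"Scala", "R"},
}

# Cluster mode the proposed solution assigns to each workload.
PROPOSED = {
    "data engineers": "High Concurrency",
    "jobs": "Standard",
    "data scientists": "High Concurrency",
}


def unsupported_languages(workloads, assignment):
    """Return, per workload, the languages its assigned mode cannot run."""
    gaps = {}
    for name, languages in workloads.items():
        missing = languages - SUPPORTED_LANGUAGES[assignment[name]]
        if missing:
            gaps[name] = sorted(missing)
    return gaps


print(unsupported_languages(WORKLOADS, PROPOSED))
# {'data scientists': ['Scala']}
```

Under that assumption, the only gap is Scala for the data scientists. The 120-minute auto-termination requirement is a separate cluster setting and is not modeled here.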

Question 23

You have an Azure Databricks workspace and an Azure Data Lake Storage Gen2 account named storage1.

New files are uploaded daily to storage1.

You need to recommend a solution that configures storage1 as a structured streaming source. The solution must meet the following requirements:

• Incrementally process new files as they are uploaded to storage1.

• Minimize implementation and maintenance effort.

• Minimize the cost of processing millions of files.

• Support schema inference and schema drift.

Which should you include in the recommendation?

Options:

Auto Loader

Apache Spark FileStreamSource

COPY INTO

Azure Data Factory
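For orientation, here is a minimal sketch of how the Auto Loader option is typically wired up as a structured streaming source in a Databricks notebook. It assumes a `spark` session supplied by the Databricks runtime. The helper names, the JSON file format, and the paths in the usage note are hypothetical. The `cloudFiles` source and the option keys shown are the documented Auto Loader ones.

```python
def autoloader_options(file_format: str, schema_location: str) -> dict:
    """Build Auto Loader options for schema inference and schema drift."""
    return {
        # Format of the incoming files (for example "json", "csv", "parquet").
        "cloudFiles.format": file_format,
        # Where Auto Loader persists the inferred schema between runs.
        "cloudFiles.schemaLocation": schema_location,
        # Evolve the schema when new columns appear in later files.
        "cloudFiles.schemaEvolutionMode": "addNewColumns",
    }


def read_new_files(spark, source_path: str, file_format: str, schema_location: str):
    """Return a streaming DataFrame that picks up new files incrementally.

    `spark` is assumed to be the SparkSession provided by Databricks.
    """
    return (
        spark.readStream.format("cloudFiles")
        .options(**autoloader_options(file_format, schema_location))
        .load(source_path)
    )
```

In a notebook, this might be called as `read_new_files(spark, "abfss://<container>@storage1.dfs.core.windows.net/<path>", "json", "<schema-path>")`, where the container and path placeholders must be replaced with real values.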

Question 24

You have an Azure subscription that contains an Azure Synapse Analytics dedicated SQL pool named Pool1.

You use Azure Monitor.

You need to monitor the performance of queries executed in Pool1.

Which log should you query?

Options:

SynapseSqlPoolSqlRequests

SynapseSqlPoolExecRequests

SynapseSqlPoolWaits

SynapseSqlPoolRequestSteps
