Databricks Certification Databricks-Certified-Professional-Data-Engineer Dumps PDF

Databricks Certified Data Engineer Professional Exam Questions and Answers

Question 29

The data governance team is reviewing code used for deleting records for compliance with GDPR. The following logic has been implemented to propagate delete requests from the user_lookup table to the user_aggregates table.

Assuming that user_id is a unique identifying key and that all users that have requested deletion have been removed from the user_lookup table, which statement describes whether successfully executing the above logic guarantees that the records to be deleted from the user_aggregates table are no longer accessible, and why?

Options:

A.

No: files containing deleted records may still be accessible with time travel until a VACUUM command is used to remove invalidated data files.

B.

Yes: Delta Lake ACID guarantees provide assurance that the DELETE command succeeded fully and permanently purged these records.

C.

No: the change data feed only tracks inserts and updates, not deleted records.

D.

No: the Delta Lake DELETE command only provides ACID guarantees when combined with the MERGE INTO command.
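The difference between logically deleting rows and physically removing data files can be explored with a toy, in-memory sketch. This is not Delta Lake's API: the class `ToyDeltaTable` and its methods are hypothetical names invented for illustration. It only mimics the idea that a delete commits a new table version while older data files stay readable through time travel until a vacuum drops them. Real Delta Lake VACUUM also applies a retention threshold, which this sketch simplifies to "keep only files referenced by the latest version".

```python
class ToyDeltaTable:
    """Hypothetical in-memory stand-in for a versioned table (not the Delta Lake API)."""

    def __init__(self, rows):
        # Each "data file" is an immutable tuple of rows, stored by file id.
        self.files = {0: tuple(rows)}
        # Each version lists the file ids that make up the table at that version.
        self.versions = [[0]]
        self._next_file_id = 1

    def read(self, version=None):
        """Read the latest version, or an older one (time travel)."""
        file_ids = self.versions[-1 if version is None else version]
        missing = [f for f in file_ids if f not in self.files]
        if missing:
            raise FileNotFoundError(f"data files {missing} were vacuumed")
        return [row for f in file_ids for row in self.files[f]]

    def delete(self, predicate):
        """Commit a new version without the matching rows; old files are kept."""
        kept = tuple(r for r in self.read() if not predicate(r))
        file_id = self._next_file_id
        self._next_file_id += 1
        self.files[file_id] = kept
        self.versions.append([file_id])

    def vacuum(self):
        """Physically drop files that the latest version no longer references."""
        live = set(self.versions[-1])
        self.files = {f: rows for f, rows in self.files.items() if f in live}


table = ToyDeltaTable([{"user_id": 1}, {"user_id": 2}])
table.delete(lambda r: r["user_id"] == 2)

latest = table.read()                   # deleted row is absent from the current version
before_vacuum = table.read(version=0)   # the old version is still readable

table.vacuum()
try:
    table.read(version=0)
    old_version_readable = True
except FileNotFoundError:
    old_version_readable = False
```

In this model the deleted row disappears from the current version immediately, but it only becomes unreachable once the old file is vacuumed.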

Question 30

A member of the data engineering team has submitted a short notebook that they wish to schedule as part of a larger data pipeline. Assume that the commands provided below produce the logically correct results when run as presented.

Which command should be removed from the notebook before scheduling it as a job?

Options:

A.

Cmd 2

B.

Cmd 3

C.

Cmd 4

D.

Cmd 5

E.

Cmd 6

Question 31

A data architect has designed a system in which two Structured Streaming jobs will concurrently write to a single bronze Delta table. Each job is subscribing to a different topic from an Apache Kafka source, but they will write data with the same schema. To keep the directory structure simple, a data engineer has decided to nest a checkpoint directory to be shared by both streams.

The proposed directory structure is displayed below:

Which statement describes whether this checkpoint directory structure is valid for the given scenario and why?

Options:

A.

No; Delta Lake manages streaming checkpoints in the transaction log.

B.

Yes; both of the streams can share a single checkpoint directory.

C.

No; only one stream can write to a Delta Lake table.

D.

Yes; Delta Lake supports infinite concurrent writers.

E.

No; each of the streams needs to have its own checkpoint directory.
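For experimenting with this scenario, here is a minimal PySpark sketch of two Kafka streams appending to one Delta table. The function names `checkpoint_path` and `start_bronze_stream`, the `_checkpoints/<stream name>` layout, and all paths and topic names are assumptions made for illustration; the directory structure proposed in the question is not reproduced here. The sketch gives each query its own checkpoint location, because a Structured Streaming checkpoint records the offsets and state of a single query.

```python
def checkpoint_path(table_path, stream_name):
    """Build a per-stream checkpoint location nested under the table path.

    The "_checkpoints/<stream_name>" layout is an assumed convention.
    """
    return f"{table_path.rstrip('/')}/_checkpoints/{stream_name}"


def start_bronze_stream(spark, topic, table_path, bootstrap_servers):
    """Start one Kafka-to-Delta stream that uses its own checkpoint directory.

    `spark` is an existing SparkSession with the Kafka and Delta connectors available.
    """
    source = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        .load()
    )
    return (
        source.writeStream.format("delta")
        .outputMode("append")
        # One checkpoint per query: offsets of different topics are never mixed.
        .option("checkpointLocation", checkpoint_path(table_path, topic))
        .start(table_path)
    )
```

Calling `start_bronze_stream` once per topic with the same `table_path` yields two queries that write to the same table while tracking their progress separately.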

Question 32

A data engineer is optimizing a managed Delta table that suffers from data skew and frequently changing query filter columns. The engineer wants to avoid costly data rewrites when query patterns evolve. The table size is under 1 TB.

How should the data engineer meet this requirement?

Options:

A.

Apply Z-ordering, since it allows flexible reorganization of data layout without rewriting existing files and adapts easily to new filter columns.

B.

Use Hive-style partitioning, as it provides efficient data skipping and is easy to change partition columns at any time.

C.

Enable liquid clustering, as it efficiently handles data skew, allows clustering keys to be changed without rewriting existing data, and adapts to evolving query patterns.

D.

Combine partitioning and Z-ordering to maximize flexibility and minimize maintenance as query patterns change.
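As a point of reference for the liquid clustering option, the sketch below builds the Databricks SQL statement that sets or changes clustering keys on an existing Delta table. The helper name `cluster_by_sql` and the table and column names are hypothetical; running the statement through `spark.sql(...)` on a Databricks runtime that supports liquid clustering is an assumption, not something shown in the question.

```python
def cluster_by_sql(table, columns):
    """Return an ALTER TABLE statement that sets liquid clustering keys.

    An empty column list produces CLUSTER BY NONE, which turns clustering off.
    """
    if not columns:
        return f"ALTER TABLE {table} CLUSTER BY NONE"
    return f"ALTER TABLE {table} CLUSTER BY ({', '.join(columns)})"


# Hypothetical table and columns: initial keys, then new keys after query patterns shift.
initial_keys = cluster_by_sql("sales", ["region"])
changed_keys = cluster_by_sql("sales", ["customer_id", "order_date"])
```

Under these assumptions, each statement would be passed to `spark.sql`, with a later `OPTIMIZE sales` clustering data according to the current keys.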