Pre-Summer Sale 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: save70

Databricks-Certified-Professional-Data-Engineer Exam Questions Tutorials

Databricks Certified Data Engineer Professional Exam Questions and Answers

Question 57

Although the Databricks Utilities Secrets module provides tools to store sensitive credentials and avoid accidentally displaying them in plain text users should still be careful with which credentials are stored here and which users have access to using these secrets.

Which statement describes a limitation of Databricks Secrets?

Options:

A.

Because the SHA256 hash is used to obfuscate stored secrets, reversing this hash will display the value in plain text.

B.

Account administrators can see all secrets in plain text by logging on to the Databricks Accounts console.

C.

Secrets are stored in an administrators-only table within the Hive Metastore; database administrators have permission to query this table by default.

D.

Iterating through a stored secret and printing each character will display secret contents in plain text.

E.

The Databricks REST API can be used to list secrets in plain text if the personal access token has proper credentials.

Question 58

A junior data engineer is working to implement logic for a Lakehouse table named silver_device_recordings. The source data contains 100 unique fields in a highly nested JSON structure.

The silver_device_recordings table will be used downstream for highly selective joins on a number of fields, and will also be leveraged by the machine learning team to filter on a handful of relevant fields, in total, 15 fields have been identified that will often be used for filter and join logic.

The data engineer is trying to determine the best approach for dealing with these nested fields before declaring the table schema.

Which of the following accurately presents information about Delta Lake and Databricks that may Impact their decision-making process?

Options:

A.

Because Delta Lake uses Parquet for data storage, Dremel encoding information for nesting can be directly referenced by the Delta transaction log.

B.

Tungsten encoding used by Databricks is optimized for storing string data: newly-added native support for querying JSON strings means that string types are always most efficient.

C.

Schema inference and evolution on Databricks ensure that inferred types will always accurately match the data types used by downstream systems.

D.

By default Delta Lake collects statistics on the first 32 columns in a table; these statistics are leveraged for data skipping when executing selective queries.