AWS Certified Data Engineer - Associate (DEA-C01) Questions and Answers

Question 33

A company has multiple applications that use datasets that are stored in an Amazon S3 bucket. The company has an ecommerce application that generates a dataset that contains personally identifiable information (PII). The company has an internal analytics application that does not require access to the PII.

To comply with regulations, the company must not share PII unnecessarily. A data engineer needs to implement a solution that will redact PII dynamically, based on the needs of each application that accesses the dataset.

Which solution will meet the requirements with the LEAST operational overhead?

Options:

A.

Create an S3 bucket policy to limit the access each application has. Create multiple copies of the dataset. Give each dataset copy the appropriate level of redaction for the needs of the application that accesses the copy.

B.

Create an S3 Object Lambda endpoint. Use the S3 Object Lambda endpoint to read data from the S3 bucket. Implement redaction logic within an S3 Object Lambda function to dynamically redact PII based on the needs of each application that accesses the data.

C.

Use AWS Glue to transform the data for each application. Create multiple copies of the dataset. Give each dataset copy the appropriate level of redaction for the needs of the application that accesses the copy.

D.

Create an API Gateway endpoint that has custom authorizers. Use the API Gateway endpoint to read data from the S3 bucket. Initiate a REST API call to dynamically redact PII based on the needs of each application that accesses the data.
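For readers unfamiliar with the mechanism that option B describes, the following is a minimal sketch of an S3 Object Lambda function that redacts fields per application. It rests on several assumptions that are not in the question: the dataset is JSON Lines, each application reads through its own Object Lambda access point whose configuration payload carries an application name, and the field names and the `REDACTED_FIELDS` policy are hypothetical.

```python
import json

# Hypothetical policy: fields each application must NOT see.
ALL_PII = {"name", "email", "address"}
REDACTED_FIELDS = {
    "analytics": ALL_PII,   # internal analytics does not need PII
    "ecommerce": set(),     # the producing application sees everything
}


def redact(record, app):
    """Return a copy of record with the PII fields hidden for this app masked."""
    # Unknown applications get every known PII field masked (fail closed).
    hidden = REDACTED_FIELDS.get(app, ALL_PII)
    return {k: ("REDACTED" if k in hidden else v) for k, v in record.items()}


def redact_json_lines(body, app):
    """Redact every non-empty line of a JSON Lines document."""
    out = []
    for line in body.splitlines():
        if line.strip():
            out.append(json.dumps(redact(json.loads(line), app)))
    return "\n".join(out)


def handler(event, context):
    """Entry point invoked by S3 Object Lambda on a GetObject request."""
    # Imported here so the redaction helpers can be tested without boto3.
    import urllib.request
    import boto3

    ctx = event["getObjectContext"]
    # Assumption: the access point's payload holds the application name.
    app = event["configuration"].get("payload") or ""

    # Fetch the original object through the presigned URL supplied by S3.
    with urllib.request.urlopen(ctx["inputS3Url"]) as resp:
        original = resp.read().decode("utf-8")

    # Return the transformed object to the caller.
    boto3.client("s3").write_get_object_response(
        Body=redact_json_lines(original, app).encode("utf-8"),
        RequestRoute=ctx["outputRoute"],
        RequestToken=ctx["outputToken"],
    )
    return {"status_code": 200}
```

In this sketch only one copy of the dataset is stored; the redaction happens at read time, so the same object yields different views for different applications.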

Question 34

A company uses AWS Glue ETL pipelines to process data. The company uses Amazon Athena to analyze data in an Amazon S3 bucket.

To better understand shipping timelines, the company decides to collect and store shipping dates and delivery dates in addition to order data. The company adds a data quality check to ensure that the shipping date is later than the order date and that the delivery date is later than the shipping date. Orders that fail the quality check must be stored in a second Amazon S3 bucket.

Which solution will meet these requirements in the MOST cost-effective way?

Options:

A.

Use AWS Glue DataBrew DATEDIFF functions to create two additional columns. Validate the new columns. Write failed records to a second S3 bucket.

B.

Use Amazon Athena to query the three date columns and compare the values. Export failed records to a second S3 bucket.

C.

Use AWS Glue Data Quality to create a custom rule that validates the three date columns. Route records that fail the rule to a second S3 bucket.

D.

Use an AWS Glue crawler to populate the AWS Glue Data Catalog. Use the three date columns to create a filter.
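The quality check in this question reduces to a strict ordering of three dates per record. The sketch below illustrates that comparison and the routing of failed records in plain Python. It is not AWS Glue code: in AWS Glue Data Quality the rule would be written in DQDL, and the column names `order_date`, `shipping_date`, and `delivery_date` are assumptions made for illustration.

```python
from datetime import date


def passes_date_check(order):
    """True when order date < shipping date < delivery date (strictly later)."""
    return order["order_date"] < order["shipping_date"] < order["delivery_date"]


def split_by_quality(orders):
    """Separate orders into those that pass the check and those that fail."""
    passed, failed = [], []
    for order in orders:
        (passed if passes_date_check(order) else failed).append(order)
    return passed, failed


orders = [
    {"id": 1, "order_date": date(2024, 1, 1),
     "shipping_date": date(2024, 1, 2), "delivery_date": date(2024, 1, 5)},
    # Shipped before it was ordered, so this record fails the check.
    {"id": 2, "order_date": date(2024, 1, 3),
     "shipping_date": date(2024, 1, 2), "delivery_date": date(2024, 1, 5)},
]
passed, failed = split_by_quality(orders)
```

In a pipeline, the `failed` list corresponds to the records that must be written to the second S3 bucket.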

Question 35

A global company currently uses Amazon Redshift to store data and Amazon Quick Suite (previously known as Amazon QuickSight) to generate reports.

A team of business analysts has varying levels of technical expertise. Some analysts lack SQL knowledge. All the analysts need to create new reports frequently. The company wants to use natural language queries to create dashboards and reports more efficiently.

Which solution will meet these requirements with the LEAST operational effort?

Options:

A.

Use Quick Suite dashboards that have zero-ETL access to Amazon Redshift.

B.

Enable Amazon Q in Quick Suite. Generate Quick Suite dashboards and reports.

C.

Integrate Tableau with Amazon Redshift to give Tableau direct access to the data.

D.

Use Quick Suite dashboards that have federated query access to Amazon Redshift.

Question 36

A data engineer maintains a materialized view that is based on an Amazon Redshift database. The view has a column named load_date that stores the date when each row was loaded.

The data engineer needs to reclaim database storage space by deleting all the rows from the materialized view.

Which command will reclaim the MOST database storage space?

Options:

A.

Option A

B.

Option B

C.

Option C

D.

Option D