Spring Sale 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: save70

AWS Certified Associate Data-Engineer-Associate Book

AWS Certified Data Engineer - Associate (DEA-C01) Questions and Answers

Question 37

A data engineer must manage the ingestion of real-time streaming data into AWS. The data engineer wants to perform real-time analytics on the incoming streaming data by using time-based aggregations over a window of up to 30 minutes. The data engineer needs a solution that is highly fault tolerant.

Which solution will meet these requirements with the LEAST operational overhead?

Options:

A.

Use an AWS Lambda function that includes both the business and the analytics logic to perform time-based aggregations over a window of up to 30 minutes for the data in Amazon Kinesis Data Streams.

B.

Use Amazon Managed Service for Apache Flink (previously known as Amazon Kinesis Data Analytics) to analyze the data that might occasionally contain duplicates by using multiple types of aggregations.

C.

Use an AWS Lambda function that includes both the business and the analytics logic to perform aggregations for a tumbling window of up to 30 minutes, based on the event timestamp.

D.

Use Amazon Managed Service for Apache Flink (previously known as Amazon Kinesis Data Analytics) to analyze the data by using multiple types of aggregations to perform time-based analytics over a window of up to 30 minutes.

Question 38

A company runs an AWS Glue workflow every day to process time series data from an Amazon S3 bucket. The workflow loads the data into an Amazon Redshift Serverless table. The company observes that some of the jobs in the workflow occasionally fail.

A data engineer must receive a notification when the Redshift table does not contain the most recent data.

Which solution will meet this requirement in the MOST operationally efficient way?

Options:

A.

Configure an Amazon EventBridge Scheduler to run an Amazon Macie job to scan the Redshift table for data freshness. Configure Macie to notify an Amazon Simple Notification Service (Amazon SNS) topic when an AWS Glue job fails.

B.

Schedule an AWS Glue Data Quality job to check the freshness of the data. Create an Amazon EventBridge rule to notify an Amazon Simple Notification Service (Amazon SNS) topic when a data quality rule fails.

C.

Load AWS Glue job logs to an Amazon S3 bucket. Configure an Amazon CloudWatch alarm to send a notification when the job logs in the S3 bucket contain Job.State=FAILED.

D.

Create an Amazon CloudWatch dashboard that displays a metric named Failed AWS Glue Jobs that counts AWS Glue job failures during the previous day. Set a CloudWatch alarm to send a notification when the metric value exceeds zero.

Question 39

A company stores logs in an Amazon S3 bucket. When a data engineer attempts to access several log files, the data engineer discovers that some files have been unintentionally deleted.

The data engineer needs a solution that will prevent unintentional file deletion in the future.

Which solution will meet this requirement with the LEAST operational overhead?

Options:

A.

Manually back up the S3 bucket on a regular basis.

B.

Enable S3 Versioning for the S3 bucket.

C.

Configure replication for the S3 bucket.

D.

Use an Amazon S3 Glacier storage class to archive the data that is in the S3 bucket.

Question 40

A hotel management company receives daily data files from each of its hotels. The company wants to upload its data to AWS. The company plans to use Amazon Athena to access the files. The company needs to protect the files from accidental deletion. The company will develop an application on its on-premises servers to automatically forward the files to a fully managed AWS ingestion service.

Which solution will meet these requirements with the LEAST operational overhead?

Options:

A.

Use AWS DataSync to replicate data from the on-premises servers to Amazon Elastic File System (Amazon EFS). Configure automatic backups in AWS Backup.

B.

Use the Amazon Kinesis Agent on the on-premises servers to send data to Amazon Data Firehose. Store the data in an Amazon S3 bucket that has versioning enabled.

C.

Use AWS Glue jobs to ingest data from the on-premises servers into Amazon RDS. Enable automated backups for data protection.

D.

Use a self-managed Apache Kafka agent on the on-premises servers to stream data to Amazon Managed Streaming for Apache Kafka (Amazon MSK). Store the data in an Amazon S3 bucket with versioning enabled.