Google Professional Data Engineer Exam Questions and Answers

Question 21

You have data pipelines running on BigQuery, Cloud Dataflow, and Cloud Dataproc. You need to perform health checks and monitor their behavior, and then notify the team managing the pipelines if they fail. You also need to be able to work across multiple projects. Your preference is to use managed products or features of the platform. What should you do?

Options:

A.

Export the information to Cloud Stackdriver, and set up an Alerting policy

B.

Run a Virtual Machine in Compute Engine with Airflow, and export the information to Stackdriver

C.

Export the logs to BigQuery, and set up App Engine to read that information and send emails if you find a failure in the logs

D.

Develop an App Engine application to consume logs using GCP API calls, and send emails if you find a failure in the logs
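The managed route described in option A relies on a Cloud Monitoring (formerly Stackdriver) alerting policy, which can watch pipeline metrics across multiple projects through a metrics scope. A minimal sketch of such a policy payload — the metric filter, project ID, and notification channel ID are illustrative assumptions, not values from the question:

```python
import json

# Hypothetical alerting-policy payload for Cloud Monitoring (formerly
# Stackdriver); the project and channel IDs are placeholders.
def build_alert_policy(project_id: str, channel_id: str) -> str:
    policy = {
        "displayName": "Pipeline failure alert",
        "combiner": "OR",
        "conditions": [{
            "displayName": "Dataflow job failed",
            "conditionThreshold": {
                # Fires when any Dataflow job reports failure; the exact
                # metric type here is an assumption for illustration.
                "filter": 'metric.type="dataflow.googleapis.com/job/is_failed" '
                          'resource.type="dataflow_job"',
                "comparison": "COMPARISON_GT",
                "thresholdValue": 0,
                "duration": "60s",
            },
        }],
        "notificationChannels": [
            f"projects/{project_id}/notificationChannels/{channel_id}"
        ],
    }
    return json.dumps(policy, indent=2)
```

In practice this JSON would be submitted to the Monitoring API (or built in the console); the point is that alerting and notification are handled by the platform rather than by custom App Engine code.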

Question 22

Your company uses Looker Studio connected to BigQuery for reporting. Users are experiencing slow dashboard load times due to complex queries on a large table. The queries involve aggregations and filtering on several columns. You need to optimize query performance to decrease the dashboard load times. What should you do?

Options:

A.

Configure Looker Studio to use a shorter data refresh interval to ensure fresh data is always displayed.

B.

Create a materialized view in BigQuery that pre-calculates the aggregations and filters used in the Looker Studio dashboards.

C.

Implement row-level security in BigQuery to restrict data access and reduce the amount of data processed by the queries.

D.

Use BigQuery BI Engine to accelerate query performance by caching frequently accessed data.
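Options B and D both attack query latency directly: a materialized view pre-computes the aggregations so each dashboard query scans far less data. A minimal sketch of the DDL such a view might use — the dataset, table, and column names are invented for illustration:

```python
# Builds a BigQuery materialized-view DDL statement. All dataset, table,
# and column names below are hypothetical, not taken from the question.
def materialized_view_ddl(dataset: str, source_table: str) -> str:
    return f"""
CREATE MATERIALIZED VIEW `{dataset}.daily_sales_mv` AS
SELECT
  region,
  DATE(order_ts) AS order_date,
  SUM(amount) AS total_amount,
  COUNT(*) AS order_count
FROM `{dataset}.{source_table}`
GROUP BY region, order_date
""".strip()
```

Looker Studio would then query the view (or BigQuery would rewrite matching queries against it automatically), avoiding repeated full-table aggregation.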

Question 23

You have a data pipeline with a Dataflow job that aggregates and writes time series metrics to Bigtable. You notice that data is slow to update in Bigtable. This data feeds a dashboard used by thousands of users across the organization. You need to support additional concurrent users and reduce the amount of time required to write the data. What should you do?

Choose 2 answers

Options:

A.

Configure your Dataflow pipeline to use local execution.

B.

Modify your Dataflow pipeline to use the Flatten transform before writing to Bigtable.

C.

Modify your Dataflow pipeline to use the CoGroupByKey transform before writing to Bigtable.

D.

Increase the maximum number of Dataflow workers by setting maxNumWorkers in PipelineOptions.

E.

Increase the number of nodes in the Bigtable cluster.
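Options D and E address write throughput at both ends of the pipeline: maxNumWorkers raises the Dataflow autoscaling ceiling, while more Bigtable nodes raise the cluster's write capacity. A sketch of how the worker cap is typically passed as runner arguments (the flag names follow the Apache Beam convention; the values are placeholders):

```python
# Assembles illustrative Dataflow runner arguments. In a real Apache Beam
# job these strings would be parsed into PipelineOptions; the region and
# worker count here are placeholder values.
def dataflow_args(max_workers: int, region: str = "us-central1") -> list:
    return [
        "--runner=DataflowRunner",
        f"--region={region}",
        # Raises the autoscaling ceiling so more workers can write
        # to Bigtable in parallel.
        f"--maxNumWorkers={max_workers}",
        "--autoscalingAlgorithm=THROUGHPUT_BASED",
    ]
```

Scaling the Bigtable cluster itself is done separately (console, gcloud, or autoscaling), since Dataflow workers can only write as fast as the cluster accepts.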

Question 24

You have a data analyst team member who needs to analyze data by using BigQuery. The data analyst wants to create a data pipeline that would load 200 CSV files with an average size of 15MB from a Cloud Storage bucket into BigQuery daily. The data needs to be ingested and transformed before being accessed in BigQuery for analysis. You need to recommend a fully managed, no-code solution for the data analyst. What should you do?

Options:

A.

Create a Cloud Run function and schedule it to run daily using Cloud Scheduler to load the data into BigQuery.

B.

Use the BigQuery Data Transfer Service to load files from Cloud Storage to BigQuery, create a BigQuery job which transforms the data using BigQuery SQL and schedule it to run daily.

C.

Build a custom Apache Beam pipeline and run it on Dataflow to load the file from Cloud Storage to BigQuery and schedule it to run daily using Cloud Composer.

D.

Create a pipeline by using BigQuery pipelines and schedule it to load the data into BigQuery daily.
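Option B pairs the BigQuery Data Transfer Service (a fully managed, no-code ingestion path from Cloud Storage) with a scheduled SQL transformation inside BigQuery. A sketch of how such a transfer configuration might look — the bucket, dataset, and table names are placeholders, not values from the question:

```python
# Illustrative transfer-config payload for the BigQuery Data Transfer
# Service Cloud Storage connector; every resource name is hypothetical.
def gcs_transfer_config(bucket: str, dataset: str, table: str) -> dict:
    return {
        "destinationDatasetId": dataset,
        "displayName": "Daily CSV load",
        "dataSourceId": "google_cloud_storage",
        # Run once a day to pick up the ~200 new CSV files.
        "schedule": "every 24 hours",
        "params": {
            "data_path_template": f"gs://{bucket}/exports/*.csv",
            "destination_table_name_template": table,
            "file_format": "CSV",
            "skip_leading_rows": "1",
        },
    }
```

A scheduled query in BigQuery would then run the SQL transformation over the loaded table, keeping the whole pipeline managed and code-free for the analyst.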