
Sure Pass Exam Professional-Data-Engineer PDF

Google Professional Data Engineer Exam Questions and Answers

Question 45

You have a petabyte of analytics data and need to design a storage and processing platform for it. You must be able to perform data warehouse-style analytics on the data in Google Cloud and expose the dataset as files for batch analysis tools in other cloud providers. What should you do?

Options:

A.

Store and process the entire dataset in BigQuery.

B.

Store and process the entire dataset in Cloud Bigtable.

C.

Store the full dataset in BigQuery, and store a compressed copy of the data in a Cloud Storage bucket.

D.

Store the warm data as files in Cloud Storage, and store the active data in BigQuery. Keep this ratio as 80% warm and 20% active.
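For context on exposing a BigQuery dataset as files for external batch tools, here is a minimal sketch that builds a `bq extract` invocation writing sharded, compressed files to a Cloud Storage bucket. The project, dataset, table, and bucket names are placeholders, not values from the question.

```python
# Sketch: export a BigQuery table to Cloud Storage as gzip-compressed,
# sharded CSV files that batch tools in other clouds can consume.
# All identifiers below are hypothetical placeholders.
def build_extract_command(project, dataset, table, bucket, prefix):
    """Build a `bq extract` CLI invocation for a compressed export."""
    # The '*' wildcard tells BigQuery to shard the output across files.
    destination = f"gs://{bucket}/{prefix}-*.csv.gz"
    return [
        "bq", "extract",
        "--compression=GZIP",
        "--destination_format=CSV",
        f"{project}:{dataset}.{table}",
        destination,
    ]

cmd = build_extract_command("my-project", "analytics", "events",
                            "export-bucket", "events")
print(" ".join(cmd))
```
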

Question 46

You need to look at BigQuery data from a specific table multiple times a day. The underlying table you are querying is several petabytes in size, but you want to filter your data and provide simple aggregations to downstream users. You want to run queries faster and get up-to-date insights quicker. What should you do?

Options:

A.

Run a scheduled query to pull the necessary data at specific intervals daily.

B.

Create a materialized view based on the query being run.

C.

Use a cached query to accelerate time to results.

D.

Limit the query columns being pulled in the final result.
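For context on materialized views, here is a minimal sketch of the DDL for a BigQuery materialized view that pre-computes a filtered aggregation over a large base table, so repeated queries read the smaller, incrementally maintained result. All table and column names are hypothetical placeholders.

```python
# Sketch: generate DDL for a BigQuery materialized view that filters
# and aggregates a petabyte-scale base table. Identifiers are
# placeholders, not from the question.
def materialized_view_ddl(view, base_table):
    """Return a CREATE MATERIALIZED VIEW statement as a string."""
    return (
        f"CREATE MATERIALIZED VIEW `{view}` AS\n"
        "SELECT user_id, COUNT(*) AS event_count\n"
        f"FROM `{base_table}`\n"
        "WHERE event_type = 'purchase'\n"
        "GROUP BY user_id"
    )

ddl = materialized_view_ddl("proj.dataset.purchases_mv",
                            "proj.dataset.events")
print(ddl)
```

BigQuery maintains the view incrementally as the base table changes, which is what delivers both faster queries and up-to-date results.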

Question 47

You are deploying a batch pipeline in Dataflow. This pipeline reads data from Cloud Storage, transforms the data, and then writes the data into BigQuery. The security team has enabled an organizational constraint in Google Cloud, requiring all Compute Engine instances to use only internal IP addresses and no external IP addresses. What should you do?

Options:

A.

Ensure that the firewall rules allow access to Cloud Storage and BigQuery. Use Dataflow with only internal IPs.

B.

Ensure that your workers have network tags to access Cloud Storage and BigQuery. Use Dataflow with only internal IP addresses.

C.

Create a VPC Service Controls perimeter that contains the VPC network, and add Dataflow, Cloud Storage, and BigQuery as allowed services in the perimeter. Use Dataflow with only internal IP addresses.

D.

Ensure that Private Google Access is enabled in the subnetwork. Use Dataflow with only internal IP addresses.
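For context on running Dataflow with internal IPs only, here is a minimal sketch that assembles the relevant pipeline flags. `--no_use_public_ips` is a real Dataflow pipeline option; the region and subnetwork values are hypothetical placeholders, and the subnetwork would need Private Google Access enabled for workers to reach Cloud Storage and BigQuery.

```python
# Sketch: network-related flags for a Dataflow job whose workers must
# use only internal IP addresses. Values are placeholders.
def dataflow_network_flags(region, subnetwork):
    """Return Dataflow pipeline flags for internal-IP-only workers."""
    return [
        f"--region={region}",
        f"--subnetwork={subnetwork}",
        # Workers get no external IPs; with Private Google Access on
        # the subnetwork they can still reach Google APIs.
        "--no_use_public_ips",
    ]

flags = dataflow_network_flags(
    "us-central1",
    "regions/us-central1/subnetworks/private-subnet",
)
print(flags)
```
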

Question 48

You are working on a niche product in the image recognition domain. Your team has developed a model that is dominated by custom C++ TensorFlow ops your team has implemented. These ops are used inside your main training loop and are performing bulky matrix multiplications. It currently takes up to several days to train a model. You want to decrease this time significantly and keep the cost low by using an accelerator on Google Cloud. What should you do?

Options:

A.

Use Cloud TPUs without any additional adjustment to your code.

B.

Use Cloud TPUs after implementing GPU kernel support for your custom ops.

C.

Use Cloud GPUs after implementing GPU kernel support for your custom ops.

D.

Stay on CPUs, and increase the size of the cluster you’re training your model on.