
Google Professional Data Engineer Exam Questions and Answers

Question 25

You work for a global shipping company. You want to train a model on 40 TB of data to predict which ships in each geographic region are likely to cause delivery delays on any given day. The model will be based on multiple attributes collected from multiple sources. Telemetry data, including location in GeoJSON format, will be pulled from each ship and loaded every hour. You want to have a dashboard that shows how many and which ships are likely to cause delays within a region. You want to use a storage solution that has native functionality for prediction and geospatial processing. Which storage solution should you use?

Options:

A. BigQuery
B. Cloud Bigtable
C. Cloud Datastore
D. Cloud SQL for PostgreSQL
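For context on the "native functionality for prediction and geospatial processing" requirement: among Google Cloud storage options, BigQuery stands out because it offers built-in ML (BigQuery ML) and built-in geospatial functions (BigQuery GIS), including direct ingestion of GeoJSON geometries. A minimal sketch of the kind of hourly telemetry record the question describes; the ship ID, region name, and coordinates are made-up illustrations:

```python
import json

# Hypothetical hourly telemetry payload from one ship. The GeoJSON
# "location" field is the geometry format BigQuery GIS can consume natively.
telemetry = json.dumps({
    "ship_id": "IMO-9074729",  # hypothetical identifier
    "region": "north-atlantic",
    "location": {"type": "Point", "coordinates": [-24.5, 48.1]},
})

# Parse the record and pull out the longitude/latitude pair, the same
# values a geospatial function would operate on after loading.
record = json.loads(telemetry)
lon, lat = record["location"]["coordinates"]
print(record["region"], lon, lat)
```

GeoJSON coordinates are ordered longitude first, then latitude, which is worth remembering when loading location data into any geospatial system.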

Question 26

Each analytics team in your organization is running BigQuery jobs in their own projects. You want to enable each team to monitor slot usage within their projects. What should you do?

Options:

A. Create a Stackdriver Monitoring dashboard based on the BigQuery metric query/scanned_bytes
B. Create a Stackdriver Monitoring dashboard based on the BigQuery metric slots/allocated_for_project
C. Create a log export for each project, capture the BigQuery job execution logs, create a custom metric based on totalSlotMs, and create a Stackdriver Monitoring dashboard based on the custom metric
D. Create an aggregated log export at the organization level, capture the BigQuery job execution logs, create a custom metric based on totalSlotMs, and create a Stackdriver Monitoring dashboard based on the custom metric
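For context: totalSlotMs in a BigQuery job's execution statistics is the total number of slot-milliseconds the job consumed, so dividing it by the job's elapsed time gives the average number of slots the job held. A minimal sketch of that arithmetic; the numbers are made-up illustrations, not real job statistics:

```python
# Sketch: how totalSlotMs from a BigQuery job log translates into
# average slot usage over the job's runtime.
def average_slots(total_slot_ms: int, elapsed_ms: int) -> float:
    """Average number of slots a job held while it ran."""
    return total_slot_ms / elapsed_ms

# A job that consumed 1,200,000 slot-milliseconds over a 60-second run
# averaged 20 slots.
usage = average_slots(total_slot_ms=1_200_000, elapsed_ms=60_000)
print(usage)  # 20.0
```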

Question 27

You migrated a data backend for an application that serves 10 PB of historical product data for analytics. Only the last known state for a product, which is about 10 GB of data, needs to be served through an API to the other applications. You need to choose a cost-effective persistent storage solution that can accommodate the analytics requirements and the API performance of up to 1000 queries per second (QPS) with less than 1 second latency. What should you do?

Options:

A.
1. Store the historical data in BigQuery for analytics.
2. In a Cloud SQL table, store the last state of the product after every product change.
3. Serve the last state data directly from Cloud SQL to the API.

B.
1. Store the historical data in Cloud SQL for analytics.
2. In a separate table, store the last state of the product after every product change.
3. Serve the last state data directly from Cloud SQL to the API.

C.
1. Store the products as a collection in Firestore with each product having a set of historical changes.
2. Use simple and compound queries for analytics.
3. Serve the last state data directly from Firestore to the API.

D.
1. Store the historical data in BigQuery for analytics.
2. Use a materialized view to precompute the last state of a product.
3. Serve the last state data directly from BigQuery to the API.
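For context, the "precompute the last state" step several options describe boils down to keeping only the newest change record per product. A minimal pure-Python sketch of that reduction; the change records, field names, and timestamps are hypothetical:

```python
# Sketch: reduce a product-change history to the last known state per
# product, i.e. the computation a precomputed "last state" table or view
# would maintain. All records below are made up.
changes = [
    {"product_id": "p1", "ts": 1, "price": 10},
    {"product_id": "p2", "ts": 1, "price": 5},
    {"product_id": "p1", "ts": 3, "price": 12},
    {"product_id": "p1", "ts": 2, "price": 11},  # arrives out of order
]

last_state: dict[str, dict] = {}
for change in changes:
    current = last_state.get(change["product_id"])
    # Keep the change with the highest timestamp for each product.
    if current is None or change["ts"] > current["ts"]:
        last_state[change["product_id"]] = change

print(last_state["p1"]["price"])  # 12
```

Note that the comparison is on the timestamp, not arrival order, so late-arriving updates do not clobber newer state.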

Question 28

Your team is working on a binary classification problem. You have trained a support vector machine (SVM) classifier with default parameters and obtained an area under the curve (AUC) of 0.87 on the validation set. You want to increase the AUC of the model. What should you do?

Options:

A. Perform hyperparameter tuning
B. Train a classifier with deep neural networks, because neural networks would always beat SVMs
C. Deploy the model and measure the real-world AUC; it’s always higher because of generalization
D. Scale predictions you get out of the model (tune a scaling factor as a hyperparameter) in order to get the highest AUC
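For context: AUC is the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, so it depends only on the ranking of scores. Hyperparameter tuning can change the ranking; multiplying every score by a positive constant (option D) cannot. A minimal sketch with made-up scores:

```python
# Sketch: AUC computed as the fraction of positive/negative pairs where
# the positive example outscores the negative one (ties count half).
def auc(scores_pos, scores_neg):
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in scores_pos
        for n in scores_neg
    )
    return wins / (len(scores_pos) * len(scores_neg))

base = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
# Scaling every score by the same positive factor preserves the ranking,
# so the AUC is unchanged.
scaled = auc([1.8, 1.6, 0.8], [1.4, 0.6, 0.4])
print(base, scaled)
```

Because monotone rescaling leaves every pairwise comparison intact, base and scaled are equal, which is why tuning a scaling factor cannot raise AUC.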