You are training an ML model on a large dataset. You are using a TPU to accelerate the training process. You notice that the training process is taking longer than expected, and you discover that the TPU is not reaching its full capacity. What should you do?
Increase the learning rate
Increase the number of epochs
Decrease the learning rate
Increase the batch size
The best option when training an ML model on a large dataset with a TPU that is not reaching its full capacity is to increase the batch size. A TPU is a custom-developed application-specific integrated circuit (ASIC) that accelerates machine learning workloads. TPUs provide high performance and scalability for many model types, such as linear regression, logistic regression, k-means clustering, matrix factorization, and deep neural networks, and they support frameworks such as TensorFlow, PyTorch, and JAX. The batch size is the number of training examples processed in one forward/backward pass, and it affects both the speed and the accuracy of training. A larger batch size lets you exploit the TPU's parallel processing power, reduces the per-example communication overhead between the TPU and the host CPU, and lowers the variance of the gradient updates. By increasing the batch size, you keep the TPU fully utilized and train on a large dataset faster and more efficiently [1].
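As an illustration, here is a minimal, hypothetical TensorFlow sketch of scaling the global batch size with the number of TPU cores; make_dataset() and build_model() are placeholder functions, and the batch size value is illustrative only:

```python
# Hypothetical sketch: scaling the batch size to fill a TPU.
# make_dataset() and build_model() are placeholders, not real APIs.
import tensorflow as tf

resolver = tf.distribute.cluster_resolver.TPUClusterResolver()  # locate the TPU
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)

# Scale the per-core batch size by the number of TPU cores so every
# core receives a full batch on each training step.
PER_CORE_BATCH_SIZE = 128  # illustrative value
global_batch_size = PER_CORE_BATCH_SIZE * strategy.num_replicas_in_sync

train_ds = make_dataset().batch(global_batch_size, drop_remainder=True)

with strategy.scope():
    model = build_model()
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

model.fit(train_ds, epochs=10)
```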
The other options are not as good as option D, for the following reasons:
Option A: Increasing the learning rate would not improve TPU utilization and could hurt training. The learning rate controls how much the model weights are updated in each iteration. A larger learning rate can speed up convergence, but it can also cause instability, divergence, or oscillation, so the model may never reach a good solution and may perform poorly on the validation or test data [2].
Option B: Increasing the number of epochs would not improve TPU utilization and would increase the cost and duration of training. An epoch is one full pass over all of the training examples. More epochs let the model learn more from the data, but they can also lead to overfitting or diminishing returns, so the training process would take longer and consume more resources without necessarily improving performance [3].
Option C: Decreasing the learning rate would not improve TPU utilization and would slow down training. A smaller learning rate can yield a more precise solution, but it can also cause slow convergence or trap the model in a local minimum, so the training process would take even longer [2].
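Note that batch size and learning rate interact: a common heuristic (not part of the question) is to scale the learning rate linearly when the global batch size grows. A small illustrative sketch, with hypothetical values:

```python
# Hypothetical sketch of the linear learning-rate scaling heuristic:
# if the global batch size grows by a factor k, scale the base learning
# rate by the same factor. All values here are illustrative only.
import tensorflow as tf

BASE_BATCH_SIZE = 256
BASE_LEARNING_RATE = 1e-3

global_batch_size = 2048  # e.g., after scaling up to fill the TPU
scaled_lr = BASE_LEARNING_RATE * (global_batch_size / BASE_BATCH_SIZE)

optimizer = tf.keras.optimizers.SGD(learning_rate=scaled_lr, momentum=0.9)
```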
References:
Preparing for Google Cloud Certification: Machine Learning Engineer, Course 2: ML Models and Architectures, Week 1: Introduction to ML Models and Architectures
Google Cloud Professional Machine Learning Engineer Exam Guide, Section 2: Architecting ML solutions, 2.1 Designing ML models
Official Google Cloud Certified Professional Machine Learning Engineer Study Guide, Chapter 4: ML Models and Architectures, Section 4.1: Designing ML Models
Use TPUs
Cloud TPU performance guide
Google TPU: Architecture and Performance Best Practices - Run
You are deploying a new version of a model to a production Vertex AI endpoint that is serving traffic. You plan to direct all user traffic to the new model. You need to deploy the model with minimal disruption to your application. What should you do?
1. Create a new endpoint.
2. Create a new model. Set it as the default version. Upload the model to Vertex AI Model Registry.
3. Deploy the new model to the new endpoint.
4. Update Cloud DNS to point to the new endpoint.
1. Create a new endpoint.
2. Create a new model. Set the parentModel parameter to the model ID of the currently deployed model, and set it as the default version. Upload the model to Vertex AI Model Registry.
3. Deploy the new model to the new endpoint, and set the new model to 100% of the traffic.
1. Create a new model. Set the parentModel parameter to the model ID of the currently deployed model. Upload the model to Vertex AI Model Registry.
2. Deploy the new model to the existing endpoint, and set the new model to 100% of the traffic.
1. Create a new model. Set it as the default version. Upload the model to Vertex AI Model Registry.
2. Deploy the new model to the existing endpoint.
The best option for deploying a new version of a model to a production Vertex AI endpoint with minimal disruption is option C: create a new model with the parentModel parameter set to the model ID of the currently deployed model, upload it to Vertex AI Model Registry, deploy it to the existing endpoint, and route 100% of the traffic to it. Vertex AI is a unified platform for building and deploying machine learning solutions on Google Cloud, and it serves low-latency online predictions from an endpoint. A model resource can have multiple versions, each with different parameters, code, or data, which lets you iterate on a model without creating a separate resource. The parentModel parameter registers the new upload as a version of the existing model, so it inherits that model's settings and metadata instead of duplicating the configuration. Vertex AI Model Registry stores and organizes your models and tracks their versions and metadata. An endpoint provides the service URL used to request predictions and can host one or more deployed models with a configurable traffic split. Because the new version is deployed to the existing endpoint and then given 100% of the traffic, users are switched over without any change to the endpoint URL, so the application is not disrupted [1].
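As a rough illustration of option C, here is a hypothetical sketch using the Vertex AI Python SDK (google-cloud-aiplatform); the project, region, model and endpoint IDs, and URIs are placeholders:

```python
# Hypothetical sketch using the Vertex AI Python SDK.
# Project, region, IDs, and URIs below are placeholders.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Upload the new model as a version of the currently deployed model by
# passing the existing model's resource name as parent_model.
new_model = aiplatform.Model.upload(
    display_name="my-model",
    parent_model="projects/my-project/locations/us-central1/models/1234567890",
    artifact_uri="gs://my-bucket/model/",
    serving_container_image_uri=(
        "us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-12:latest"
    ),
)

# Deploy to the existing endpoint and send all traffic to the new version.
endpoint = aiplatform.Endpoint(
    "projects/my-project/locations/us-central1/endpoints/0987654321"
)
new_model.deploy(
    endpoint=endpoint,
    machine_type="n1-standard-4",
    traffic_percentage=100,  # shifts 100% of traffic to the new model
)
```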
The other options are not as good as option C, for the following reasons:
Option A: Creating a new endpoint, deploying the new model to it, and updating Cloud DNS to point to the new endpoint would work, but it requires more steps and more maintenance than option C. Cloud DNS is a reliable, scalable Domain Name System (DNS) service on Google Cloud that resolves domain names to IP addresses, and updating it would redirect user traffic to the new endpoint without breaking the application. However, you would have to create and configure a second endpoint, manage the DNS change, and then maintain the extra endpoint, all of which adds cost and operational overhead compared with deploying to the existing endpoint [2].
Option B: This option sets the parentModel parameter correctly, so the new model inherits the settings and metadata of the existing one, and making it the default version means prediction requests do not need to specify a version. However, it still deploys the new model to a new endpoint rather than the existing one, so it shares the drawbacks of option A: more steps, a second endpoint to configure and maintain, and higher management cost [2].
Option D: Creating a new model without setting the parentModel parameter registers it as an entirely separate model rather than a new version of the deployed one, so it does not inherit the existing model's settings and metadata, which can cause inconsistencies or conflicts between the model versions. Deploying it to the existing endpoint without an explicit traffic split also leaves the cutover ambiguous, so this option does not guarantee a clean, disruption-free switch of all user traffic [2].
References:
Preparing for Google Cloud Certification: Machine Learning Engineer, Course 3: Production ML Systems, Week 2: Serving ML Predictions
Google Cloud Professional Machine Learning Engineer Exam Guide, Section 3: Scaling ML models in production, 3.1 Deploying ML models to production
Official Google Cloud Certified Professional Machine Learning Engineer Study Guide, Chapter 6: Production ML Systems, Section 6.2: Serving ML Predictions
Vertex AI
Cloud DNS
While running a model training pipeline on Vertex AI, you discover that the evaluation step is failing because of an out-of-memory error. You are currently using TensorFlow Model Analysis (TFMA) with a standard Evaluator TensorFlow Extended (TFX) pipeline component for the evaluation step. You want to stabilize the pipeline without downgrading the evaluation quality while minimizing infrastructure overhead. What should you do?
Add tfma.MetricsSpec() to limit the number of metrics in the evaluation step.
Migrate your pipeline to Kubeflow hosted on Google Kubernetes Engine, and specify the appropriate node parameters for the evaluation step.
Include the flag --runner=DataflowRunner in beam_pipeline_args to run the evaluation step on Dataflow.
Move the evaluation step out of your pipeline and run it on custom Compute Engine VMs with sufficient memory.
The best option to stabilize the pipeline without downgrading the evaluation quality while minimizing infrastructure overhead is to use Dataflow as the runner for the evaluation step. Dataflow is a fully managed service for executing Apache Beam pipelines that scales up and down with the workload. It can handle large-scale, distributed data processing tasks such as model evaluation, and it integrates with Vertex AI Pipelines and TensorFlow Extended (TFX). By including the flag --runner=DataflowRunner in beam_pipeline_args, you instruct the Evaluator component to run the evaluation step on Dataflow instead of the default DirectRunner, which runs locally and can cause out-of-memory errors.
Option A is incorrect because limiting the number of metrics with tfma.MetricsSpec() may downgrade the evaluation quality, since important metrics may be omitted; it also may not fix the out-of-memory error, because memory consumption depends mainly on the size and complexity of the data and the model. Option B is incorrect because migrating the pipeline to Kubeflow hosted on Google Kubernetes Engine (GKE) increases infrastructure overhead: you must provision, manage, and monitor the GKE cluster yourself and find appropriate node parameters for the evaluation step by trial and error. Option D is incorrect because moving the evaluation step onto custom Compute Engine VMs also increases infrastructure overhead: you must create, configure, and delete the VMs yourself and find a machine type with sufficient memory by trial and error.
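As an illustration, a hypothetical TFX pipeline definition that passes the Dataflow runner flag through beam_pipeline_args might look like this; the project, region, and Cloud Storage paths are placeholders:

```python
# Hypothetical sketch: pointing the TFX Evaluator's Beam execution at
# Dataflow. Project, region, and GCS paths below are placeholders.
from tfx import v1 as tfx

beam_pipeline_args = [
    "--runner=DataflowRunner",
    "--project=my-project",
    "--region=us-central1",
    "--temp_location=gs://my-bucket/tmp",
]

components = []  # placeholder: your components, including the standard Evaluator

pipeline = tfx.dsl.Pipeline(
    pipeline_name="my-pipeline",
    pipeline_root="gs://my-bucket/pipeline-root",
    components=components,
    beam_pipeline_args=beam_pipeline_args,  # Evaluator's TFMA job runs on Dataflow
)
```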
References:
Dataflow documentation
Using DataflowRunner
Evaluator component documentation
Configuring the Evaluator component
You are tasked with building an MLOps pipeline to retrain tree-based models in production. The pipeline will include components related to data ingestion, data processing, model training, model evaluation, and model deployment. Your organization primarily uses PySpark-based workloads for data preprocessing. You want to minimize infrastructure management effort. How should you set up the pipeline?
Set up a TensorFlow Extended (TFX) pipeline on Vertex AI Pipelines to orchestrate the MLOps pipeline. Write a custom component for the PySpark-based workloads on Dataproc.
Set up Vertex AI Pipelines to orchestrate the MLOps pipeline. Use the predefined Dataproc component for the PySpark-based workloads.
Set up Cloud Composer to orchestrate the MLOps pipeline. Use Dataproc workflow templates for the PySpark-based workloads in Cloud Composer.
Set up Kubeflow Pipelines on Google Kubernetes Engine to orchestrate the MLOps pipeline. Write a custom component for the PySpark-based workloads on Dataproc.
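As a rough illustration of the predefined-component approach, here is a hypothetical Vertex AI Pipelines (KFP) sketch that runs the PySpark preprocessing step with DataprocPySparkBatchOp from google-cloud-pipeline-components; the project, region, batch ID, and file URIs are placeholders:

```python
# Hypothetical sketch: a Vertex AI Pipelines definition that runs the
# PySpark preprocessing step with the predefined Dataproc serverless
# batch component. Project, region, batch ID, and URIs are placeholders.
from kfp import dsl
from google_cloud_pipeline_components.v1.dataproc import DataprocPySparkBatchOp

@dsl.pipeline(name="tree-model-retraining")
def pipeline(project: str = "my-project", location: str = "us-central1"):
    # Data preprocessing on Dataproc Serverless; no cluster to manage.
    preprocess = DataprocPySparkBatchOp(
        project=project,
        location=location,
        batch_id="preprocess-batch",  # must be unique per pipeline run
        main_python_file_uri="gs://my-bucket/preprocess.py",
    )
    # Downstream training, evaluation, and deployment components would
    # chain off preprocess here (omitted in this sketch).
```

Because the predefined component submits a Dataproc Serverless batch, there is no cluster lifecycle to manage, which matches the requirement to minimize infrastructure management effort.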