NCP-AAI Leak Questions

Page: 8 / 9
Total 121 questions

NVIDIA Agentic AI Questions and Answers

Question 29

You’re employing an LLM to automate the generation of email responses for a customer service team. The generated responses frequently miss the mark, failing to address the customer’s underlying concerns.

What’s the most crucial element to add to the prompt to enhance the quality of the email responses?

Options:

A. Instructing the LLM with a detailed prompt containing instructions on how to format and compose the response in an easy-to-understand structure.

B. Instructing the LLM to use a simple template for all email replies before generating a response.

C. Instructing the LLM to “understand the customer’s issue” before generating a response.

D. Instructing the LLM to provide a response that “is the most helpful” before generating a response.

Question 30

You are designing the architecture for a RAG (Retrieval-Augmented Generation) system, and you are concerned about ensuring data freshness and minimizing latency.

Which of the following is the most important consideration when designing the architecture?

Options:

A. Employing a consolidated architecture with a large service handling all data retrieval and LLM interaction. This ensures consistent performance and simplifies debugging.

B. Using a synchronous, block-level approach, where the LLM continuously monitors the database for updates and retrieves the entire dataset with each prompt.

C. Implementing a single, centralized database for all data, updated with a synchronous polling mechanism for the LLM to retrieve the latest information.

D. Using a loosely coupled, event-driven microservice architecture in which separate services handle data indexing, retrieval, and LLM prompting.
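
To make the arrangement described in option D concrete, here is a minimal in-process sketch, assuming a toy event bus in place of a real message broker and keyword overlap in place of vector search. All names (`EventBus`, `DocumentUpdated`, `IndexingService`, `RetrievalService`, `PromptingService`) are hypothetical and do not come from any particular framework.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass
class DocumentUpdated:
    """Hypothetical event emitted whenever a source document changes."""
    doc_id: str
    text: str


class EventBus:
    """In-process stand-in for a message broker (an assumption of this sketch)."""

    def __init__(self) -> None:
        self._handlers: dict[type, list[Callable]] = defaultdict(list)

    def subscribe(self, event_type: type, handler: Callable) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event: object) -> None:
        # Deliver the event to every handler registered for its type.
        for handler in self._handlers[type(event)]:
            handler(event)


class IndexingService:
    """Keeps the index fresh by reacting to update events rather than polling."""

    def __init__(self, bus: EventBus) -> None:
        self.index: dict[str, str] = {}
        bus.subscribe(DocumentUpdated, self.on_update)

    def on_update(self, event: DocumentUpdated) -> None:
        # Only the changed document is re-indexed.
        self.index[event.doc_id] = event.text


class RetrievalService:
    """Returns the best-matching documents, never the whole dataset."""

    def __init__(self, indexer: IndexingService) -> None:
        self._indexer = indexer

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        terms = set(query.lower().split())
        scored = [
            (len(terms & set(text.lower().split())), text)
            for text in self._indexer.index.values()
        ]
        # Drop documents that share no words with the query.
        scored = [item for item in scored if item[0] > 0]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [text for _, text in scored[:k]]


class PromptingService:
    """Builds the LLM prompt from retrieved context; the LLM call is omitted."""

    def __init__(self, retriever: RetrievalService) -> None:
        self._retriever = retriever

    def build_prompt(self, question: str) -> str:
        context = "\n".join(self._retriever.retrieve(question))
        return f"Context:\n{context}\n\nQuestion: {question}"
```

In this sketch, publishing a `DocumentUpdated` event is the only thing a data source has to do; the indexing, retrieval, and prompting pieces can each be replaced or scaled without touching the others.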

Question 31

What benefits does a Kubernetes deployment offer over Slurm?

Options:

A. Kubernetes provides autoscaling, auto-restarts, dynamic task scheduling, error isolation with containers, and integrated monitoring.

B. Kubernetes is the best option for both training and inference, offering advantages for resource management and workload visibility over traditional HPC schedulers like Slurm.

C. Kubernetes is more optimized for batch jobs to achieve high throughput, and also provides for monitoring and failover in large-scale workloads.

Question 32

You’ve deployed an agent that helps users troubleshoot technical issues with their devices. After several weeks in production, user feedback indicates a decline in response accuracy, especially for newer issues.

Which monitoring method is most appropriate for identifying the root cause of declining agent performance?

Options:

A. Review output token counts across sessions to detect unusual model behavior.

B. Analyze logs of tool usage frequency and error rates during inference.

C. Compare average prompt length over time to analyze common input patterns.

D. Schedule a weekly re-deployment cycle to reset the model and improve freshness.
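
As an illustration of the kind of log analysis option B refers to, the following is a minimal sketch that aggregates tool calls and error rates per week. The record fields (`week`, `tool`, `ok`) and the function name `tool_error_rates` are assumptions made for this example, not the schema of any real logging system.

```python
from collections import defaultdict


def tool_error_rates(records):
    """Aggregate call counts and error rates per (week, tool).

    `records` is an iterable of dicts with the hypothetical keys
    'week' (int), 'tool' (str), and 'ok' (bool).
    """
    counts = defaultdict(lambda: [0, 0])  # (week, tool) -> [calls, errors]
    for record in records:
        entry = counts[(record["week"], record["tool"])]
        entry[0] += 1
        if not record["ok"]:
            entry[1] += 1
    return {
        key: {"calls": calls, "error_rate": errors / calls}
        for key, (calls, errors) in counts.items()
    }


# Hypothetical sample: the same tool fails more often in a later week.
sample_logs = [
    {"week": 1, "tool": "kb_search", "ok": True},
    {"week": 1, "tool": "kb_search", "ok": True},
    {"week": 4, "tool": "kb_search", "ok": False},
    {"week": 4, "tool": "kb_search", "ok": True},
]
weekly_stats = tool_error_rates(sample_logs)
```

Comparing `weekly_stats[(1, "kb_search")]` with `weekly_stats[(4, "kb_search")]` shows whether a tool's error rate has changed between the two weeks.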
