The retail company wants to build an ML model for product recommendations while following responsible AI practices to reduce model bias. Collecting balanced data from a diverse group of customers ensures the model does not favor specific groups, reducing bias and promoting fairness, a core responsible AI practice.
Exact Extract from AWS AI Documents:
From the AWS AI Practitioner Learning Path:
"To reduce model bias, it is critical to collect balanced and diverse data that represents various demographics and user groups. This practice ensures fairness and prevents the model from disproportionately favoring certain populations."
(Source: AWS AI Practitioner Learning Path, Module on Responsible AI)
Detailed Explanation:
Option A: Use data from only customers who match the demographics of the company's overall customer base. Limiting data to the existing customer demographic can reinforce existing biases: underrepresented groups stay underrepresented, so bias increases rather than decreases.
Option B: Collect data from customers who have a past purchase history. Focusing only on customers with a purchase history excludes new users, which can introduce its own bias, and does nothing to improve demographic diversity.
Option C: Ensure that the data is balanced and collected from a diverse group. This is the correct answer. A balanced and diverse dataset reduces bias by ensuring the model learns from a representative sample, aligning with responsible AI practices (a simple balance check is sketched after the option analysis below).
Option D: Ensure that the data is from a publicly available dataset. Public datasets may not be diverse or representative of the company's customer base and could introduce unrelated biases, so they do not address fairness.
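To make "balanced" concrete, the sketch below measures how evenly a demographic facet is represented in training data using the Class Imbalance (CI) formula, which is also one of the pre-training bias metrics reported by SageMaker Clarify. This is a minimal standalone illustration, not the Clarify API itself; the column names (age_group, purchased) and the sample data are hypothetical.

```python
# Minimal sketch: checking demographic balance before training a
# recommendation model. Column names and data are hypothetical.
import pandas as pd

def group_representation(df: pd.DataFrame, facet: str) -> pd.Series:
    """Share of each demographic group in the dataset."""
    return df[facet].value_counts(normalize=True)

def class_imbalance(df: pd.DataFrame, facet: str,
                    group_a: str, group_b: str) -> float:
    """Class Imbalance (CI) between two groups: (n_a - n_b) / (n_a + n_b).
    Values near 0 indicate balanced representation; values near +/-1
    indicate that one group dominates the training data."""
    n_a = (df[facet] == group_a).sum()
    n_b = (df[facet] == group_b).sum()
    return (n_a - n_b) / (n_a + n_b)

if __name__ == "__main__":
    data = pd.DataFrame({
        "age_group": ["18-25", "18-25", "26-40", "41-60", "18-25", "26-40"],
        "purchased": [1, 0, 1, 0, 1, 1],
    })
    print(group_representation(data, "age_group"))
    print(f"CI(18-25 vs 41-60): "
          f"{class_imbalance(data, 'age_group', '18-25', '41-60'):+.2f}")
```

Running a check like this before training flags skewed facets early, so the team can collect more data from underrepresented groups rather than discover the imbalance after the model is deployed.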
References:
- AWS AI Practitioner Learning Path: Module on Responsible AI
- Amazon SageMaker Developer Guide: Bias and Fairness in ML (https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-bias.html)
- AWS Documentation: Responsible AI Practices (https://aws.amazon.com/machine-learning/responsible-ai/)