Unsupervised learning is the correct methodology for classifying customers into tiers when the data is unlabeled, as it does not require predefined labels or outputs.
Unsupervised Learning:
This type of machine learning is used when the data has no labels or pre-defined categories. The goal is to identify patterns, clusters, or associations within the data.
In this case, the company has petabytes of unlabeled customer data and needs to classify customers into different tiers. Unsupervised learning techniques like clustering (e.g., K-Means, Hierarchical Clustering) can group similar customers based on their attributes without any labeled examples. Some choices are still made up front: K-Means, for instance, requires choosing the number of clusters in advance.
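To make the clustering idea concrete, here is a minimal sketch of K-Means in plain Python. The customer records, the two features (annual spend and orders per year), and the helper names `farthest_first` and `kmeans` are all hypothetical and chosen only for illustration. The seeding step is a simple deterministic farthest-first pick, swapped in for brevity; it is not the k-means++ seeding that libraries commonly use. Features are left unscaled here, although in practice they would usually be normalized so that one attribute does not dominate the distance.

```python
import math

def farthest_first(points, k):
    """Deterministic seeding: repeatedly pick the point farthest from the chosen centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, iterations=20):
    """Plain K-Means on a list of equal-length tuples; returns (centroids, labels)."""
    centroids = farthest_first(points, k)
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels

# Hypothetical customers: (annual spend, orders per year). No labels are supplied.
customers = [
    (120, 2), (150, 3), (130, 2),
    (900, 12), (950, 14), (1000, 15),
    (5000, 40), (5200, 45), (4800, 38),
]
centroids, labels = kmeans(customers, k=3)
print(labels)  # customers 0-2, 3-5, and 6-8 each share a cluster id
```

The cluster ids themselves carry no meaning; only the grouping does. No customer was ever tagged with a tier, which is the point of the unsupervised setting.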
Why Option B is Correct:
Handling Unlabeled Data: Unsupervised learning is specifically designed to work with unlabeled data, making it ideal for the company’s need to classify customer data.
Customer Segmentation: Techniques in unsupervised learning can be used to find natural groupings within customer data, such as identifying high-value vs. low-value customers or segmenting based on purchasing behavior.
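Clustering returns anonymous group ids rather than named tiers, so a small follow-up step is usually needed to turn groupings into something like high-value vs. low-value. The sketch below shows one possible way to do that by ranking clusters on average spend. The function `name_tiers`, the tier names, the spend figures, and the cluster ids are hypothetical; the ids are assumed to come from an earlier clustering run.

```python
from collections import defaultdict

def name_tiers(spend, labels, tier_names):
    """Map each cluster id to a tier name, ordered by the cluster's mean spend."""
    by_cluster = defaultdict(list)
    for amount, label in zip(spend, labels):
        by_cluster[label].append(amount)
    # Sort cluster ids from highest to lowest average spend.
    ranked = sorted(
        by_cluster,
        key=lambda c: sum(by_cluster[c]) / len(by_cluster[c]),
        reverse=True,
    )
    return {cluster: tier for cluster, tier in zip(ranked, tier_names)}

# Hypothetical inputs: annual spend per customer and cluster ids from a prior clustering run.
spend = [120, 150, 130, 900, 950, 1000, 5000, 5200, 4800]
labels = [0, 0, 0, 2, 2, 2, 1, 1, 1]

tiers = name_tiers(spend, labels, ["gold", "silver", "bronze"])
customer_tiers = [tiers[label] for label in labels]
print(tiers)  # {1: 'gold', 2: 'silver', 0: 'bronze'}
```

Ranking on a single attribute is a deliberate simplification; a real segmentation might rank on several attributes or have an analyst review each cluster before naming it.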
Why Other Options are Incorrect:
A. Supervised learning: Requires labeled training data (input-output pairs) to train the model. The company's data is unlabeled, so there are no target tiers for a supervised model to learn from.
C. Reinforcement learning: Focuses on training an agent to make decisions by maximizing some notion of cumulative reward, which does not align with the company's need for customer classification.
D. Reinforcement learning from human feedback (RLHF): A form of reinforcement learning in which human preference feedback shapes the reward signal used to refine a model’s behavior. It is designed to align a model's outputs with human preferences, not to find groupings in unlabeled customer data.