Comprehensive and Detailed Explanation:
This question is mainly about cost optimization for two different workload patterns:
A constant-rate ingestion workload
A noncritical nightly batch analytics workload that can tolerate interruption or rerun
The most cost-effective architecture is to decouple storage from the ingestion servers, store the incoming data in Amazon S3, and run the nightly calculations by using AWS Batch on Spot Instances.
Why B is correct
Amazon Data Firehose is a fully managed service that continuously ingests streaming data and delivers it to Amazon S3. With Firehose handling a constant-rate stream, the company does not need to keep a fleet of EC2 instances running just to collect and retain data on attached EBS volumes. Using S3 as durable storage is operationally simpler and generally more cost-effective than maintaining long-running EC2 ingestion servers with EBS-based storage for this type of streaming landing zone.
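As a rough illustration of the ingestion side, the sketch below builds the request parameters for a Firehose stream that delivers directly to S3. The stream name, bucket ARN, role ARN, prefix, and buffering values are all hypothetical placeholders, not values from the question. The code only constructs the dictionary and does not call AWS.

```python
def firehose_to_s3_params(stream_name, bucket_arn, role_arn):
    """Build keyword arguments for the Firehose CreateDeliveryStream API."""
    return {
        "DeliveryStreamName": stream_name,
        # DirectPut: producers write records straight to the stream.
        "DeliveryStreamType": "DirectPut",
        "ExtendedS3DestinationConfiguration": {
            "RoleARN": role_arn,
            "BucketARN": bucket_arn,
            "Prefix": "raw/",  # assumed landing prefix
            # Assumed buffering: flush every 64 MB or 300 s, whichever comes first.
            "BufferingHints": {"SizeInMBs": 64, "IntervalInSeconds": 300},
            "CompressionFormat": "GZIP",
        },
    }


# Hypothetical names and ARNs, for illustration only.
params = firehose_to_s3_params(
    "market-data-ingest",
    "arn:aws:s3:::example-market-data",
    "arn:aws:iam::123456789012:role/example-firehose-role",
)
print(params["DeliveryStreamType"])
```

With credentials configured, a dictionary like this could be passed to boto3 as `boto3.client("firehose").create_delivery_stream(**params)`. No EC2 instance or EBS volume appears anywhere in the ingestion path.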
For the nightly processing, the job is:
batch oriented,
takes only 4 hours,
not critical to the business,
acceptable to rerun later if interrupted.
These characteristics make Spot Instances an ideal fit. AWS Batch is specifically designed to run batch computing workloads and can provision and manage the compute resources for them, including Spot Instances. Because the job can tolerate interruption, Spot capacity is the lowest-cost compute option here: AWS advertises Spot discounts of up to 90% compared with On-Demand prices.
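A minimal sketch of how the Spot-based batch side might be described follows. It builds parameters for the AWS Batch CreateComputeEnvironment and SubmitJob APIs without calling AWS. Every name, subnet, security group, role, and the vCPU ceiling are assumptions made for illustration.

```python
def spot_compute_environment(name, subnets, security_groups, instance_role):
    """Build keyword arguments for the AWS Batch CreateComputeEnvironment API."""
    return {
        "computeEnvironmentName": name,
        "type": "MANAGED",
        "computeResources": {
            "type": "SPOT",
            "allocationStrategy": "SPOT_CAPACITY_OPTIMIZED",
            "minvCpus": 0,     # scale to zero outside the nightly window
            "maxvCpus": 256,   # assumed ceiling, not from the question
            "instanceTypes": ["optimal"],
            "subnets": subnets,
            "securityGroupIds": security_groups,
            "instanceRole": instance_role,
        },
    }


def nightly_job(job_queue, job_definition, attempts=3):
    """Build keyword arguments for SubmitJob, retrying after Spot interruptions."""
    return {
        "jobName": "nightly-aggregate",  # hypothetical job name
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "retryStrategy": {"attempts": attempts},
    }


# Hypothetical identifiers, for illustration only.
env = spot_compute_environment(
    "nightly-spot-env", ["subnet-0example"], ["sg-0example"], "ecsInstanceRole"
)
job = nightly_job("nightly-queue", "aggregate-job-def")
print(env["computeResources"]["type"], job["retryStrategy"]["attempts"])
```

Setting `minvCpus` to 0 means no instances run between nightly jobs, and the retry strategy reflects the fact that the job may simply be rerun if Spot capacity is reclaimed.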
This option reduces cost in both major areas:
storage and ingestion are moved to managed services and object storage,
nightly compute moves from On-Demand Instances to Spot-based batch processing.
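To make the compute saving concrete, here is a small worked calculation. All figures are hypothetical: the hourly rate, the 70% Spot discount, the instance count, and the assumption that 20% of nights need a full rerun are illustrative only and are not AWS prices or values from the question. Only the 4-hour run time comes from the scenario.

```python
def monthly_compute_cost(hourly_rate, instances, hours_per_night,
                         nights=30, rerun_fraction=0.0):
    """Monthly cost of the nightly job, padded for nights that are fully rerun."""
    base = hourly_rate * instances * hours_per_night * nights
    return base * (1.0 + rerun_fraction)


ON_DEMAND_RATE = 1.00                 # assumed USD per instance-hour
SPOT_RATE = ON_DEMAND_RATE * 0.30     # assumed 70% Spot discount

on_demand = monthly_compute_cost(ON_DEMAND_RATE, instances=10, hours_per_night=4)
# Pessimistic assumption: 20% of nights are interrupted and rerun from scratch.
spot = monthly_compute_cost(SPOT_RATE, instances=10, hours_per_night=4,
                            rerun_fraction=0.2)

print(f"On-Demand: {on_demand:.2f}  Spot with reruns: {spot:.2f}")
```

Under these assumptions, Spot remains well below the On-Demand cost even after paying for reruns, which is why tolerance for interruption matters so much in this question.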
Why A is incorrect
A improves the design by moving ingestion to Data Firehose and storage to S3, which is a good step. However, it still uses EC2 On-Demand Instances for nightly processing. Because the nightly workload is not business-critical and can tolerate failure, Spot Instances are a better fit and more cost-effective than On-Demand Instances. Therefore, A is not the most cost-effective choice.
Why C is incorrect
This option keeps ingestion on a fleet of EC2 Reserved Instances with 3-year reservations behind a Network Load Balancer. That is not the most economical or simplest design for constant ingestion when a managed ingestion service such as Data Firehose can deliver directly to S3.
Also, a Network Load Balancer does not add meaningful value for the ingestion requirement described here. The issue is not distributing interactive network traffic to an application fleet. The better architecture is to use a managed data ingestion service instead of recommitting to long-term EC2 reservations.
While AWS Batch with Spot Instances is good for the nightly analytics portion, the ingestion side of this option is still more expensive and less efficient than B.
Why D is incorrect
Amazon Redshift is a data warehouse and is useful for analytical querying, but this option is not the most cost-effective for the stated use case. The company needs to ingest constant-rate streaming market data and run a nightly aggregate job. Storing all incoming data in Redshift is generally a more specialized and potentially more expensive architecture than landing the raw stream in S3 and running elastic batch compute only for the hours it is needed.
The Lambda part is also problematic. The batch processing currently takes 4 hours, while a single AWS Lambda invocation can run for at most 15 minutes, so Lambda is not suited to long-running heavy batch jobs of this kind. Even if the aggregate query could be expressed in SQL, this option is still not the most cost-effective and does not align as well with the workload pattern as S3 plus AWS Batch on Spot Instances.
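The mismatch can be checked with simple arithmetic against the Lambda maximum timeout of 900 seconds (15 minutes). The sketch below is only an illustration; the invocation count assumes the work could be split perfectly into independent chunks, which the question does not state.

```python
import math

LAMBDA_MAX_TIMEOUT_SECONDS = 900  # 15 minutes, the Lambda per-invocation limit


def fits_in_single_lambda(job_seconds):
    """True if the job could finish within one Lambda invocation."""
    return job_seconds <= LAMBDA_MAX_TIMEOUT_SECONDS


def min_invocations(job_seconds):
    """Lower bound on invocations, assuming perfectly divisible work."""
    return math.ceil(job_seconds / LAMBDA_MAX_TIMEOUT_SECONDS)


nightly_job_seconds = 4 * 60 * 60  # the 4-hour batch job from the scenario

print(fits_in_single_lambda(nightly_job_seconds))
print(min_invocations(nightly_job_seconds))
```

A 4-hour job exceeds the limit many times over, so it would have to be re-engineered into chained or fanned-out invocations, adding complexity that AWS Batch avoids.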
Key design principles being tested
This question tests whether you can identify:
S3 as low-cost, durable storage for incoming streamed data
Managed ingestion services instead of self-managed EC2 ingestion servers
Spot Instances for interruptible, fault-tolerant batch jobs
AWS Batch as the managed orchestration service for batch workloads
Final justification
The most cost-effective redesign is to:
ingest the streaming data with Amazon Data Firehose,
store the data in Amazon S3,
run nightly analytics through AWS Batch on Spot Instances.
That makes B the best answer.