This is synthetic/demo data, not a real client report and not a guaranteed savings claim. It shows the deliverable structure Olive-One uses to connect AWS, LLM, API, and workflow spend with margin risk before a founder accepts delivery, scales usage, reprices, or rebuilds.
Executive summary
This is a hypothetical sample. A product studio is preparing to hand off an AI support assistant hosted on AWS. The MVP works, but AWS compute commitments, EC2 rightsizing, RDS capacity, logging cost, model usage, retries, vector/database usage, and handoff controls are not fully documented.
Decision needed: keep the support assistant in production, but reduce obvious AWS waste, evaluate eligible EC2 commitment discounts, and cap runaway AI cost paths before expanding adoption.
Find steady-state EC2/RDS usage, rightsizing opportunities, log retention waste, NAT/egress drag, idle storage, and commitment candidates such as Compute Savings Plans, EC2 Instance Savings Plans, or Reserved Instances.
Find retry storms, context inflation, premium-model defaults, RAG payload bloat, vector drift, and missing cost-per-resolution reporting.
Client context
Project: AI support assistant MVP for a B2B services client.
Stage: Pre-handoff. Production pilot expected in three weeks.
Stack: AWS ALB, ECS on EC2 Auto Scaling, RDS PostgreSQL, ElastiCache, S3, NAT Gateway, CloudWatch Logs, OpenAI, vector search, webhook automation.
Input data used
Assumptions
AWS documentation describes Savings Plans as commitment-based discounts for eligible compute usage. Compute Savings Plans are the more flexible option and apply across EC2, Fargate, and Lambda usage, while EC2 Instance Savings Plans and EC2 Reserved Instances are narrower commitments that give up flexibility in exchange for deeper discounts. This sample shows where a review would evaluate them; it is not a purchase recommendation.
Cost baseline
Period: 30 days
Scope: EC2/ECS compute · RDS PostgreSQL · NAT/egress · CloudWatch Logs · S3/EBS storage · LLM API usage · Vector search/retrieval · Webhook automation
| Layer | Monthly baseline | Observed issue | Review decision |
|---|---|---|---|
| EC2 / ECS compute | $8,200 | Stable On-Demand baseline with low commitment coverage and oversized instances | Rightsize first, then evaluate Compute Savings Plan or EC2 Instance Savings Plan |
| RDS PostgreSQL | $2,600 | Steady production database, overprovisioned storage/IOPS | Rightsize storage and evaluate DB commitment after utilization review |
| NAT / egress | $1,150 | Private subnet routing sends repeat traffic through NAT path | Review VPC endpoints, caching, and data transfer shape |
| CloudWatch Logs | $920 | Verbose request/tool logs and long retention | Reduce debug logs, set retention, sample traces |
| LLM API | $4,850 | Retries, context inflation, and premium model default | Cap retries, compress context, route low-risk tasks to cheaper model |
| Vector/search | $1,200 | Oversized retrieval payloads and stale index growth | Prune index, limit retrieval, cache repeated context |
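As a quick check on the table above, the baseline can be totaled and split into AWS and AI spend. This is a minimal sketch over the synthetic figures; treating "LLM API" and "Vector/search" as the AI layers is an assumption made for illustration, not something the sample states.

```python
# Monthly baseline figures copied from the synthetic table above (USD).
BASELINE = {
    "EC2 / ECS compute": 8200,
    "RDS PostgreSQL": 2600,
    "NAT / egress": 1150,
    "CloudWatch Logs": 920,
    "LLM API": 4850,
    "Vector/search": 1200,
}

# Assumption: these two layers are counted as AI spend, the rest as AWS spend.
AI_LAYERS = {"LLM API", "Vector/search"}


def summarize(baseline):
    """Return the total, the AWS/AI split, and each layer's share of the total."""
    total = sum(baseline.values())
    ai = sum(cost for layer, cost in baseline.items() if layer in AI_LAYERS)
    shares = {layer: round(100 * cost / total, 1) for layer, cost in baseline.items()}
    return {"total": total, "aws": total - ai, "ai": ai, "shares_pct": shares}


summary = summarize(BASELINE)
print(summary["total"], summary["aws"], summary["ai"])
```

On these figures the six layers sum to $18,920 per month, of which EC2/ECS compute is the largest single layer.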
Workflows reviewed
Top findings
About 62% of EC2/ECS compute spend appears steady-state in this synthetic window, but it is paid as On-Demand. The review separates flexible usage from stable baseline before any commitment decision.
Decision: Rightsize first. Then model a conservative Compute Savings Plan for flexible baseline usage and an EC2 Instance Savings Plan or RI only for stable instance-family usage.
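The order of operations in this decision can be sketched as simple arithmetic. The 62% steady-state share and the $8,200 compute baseline come from this synthetic sample, and the $1,100 rightsizing figure is the example exposure listed for EC2 app nodes in the decision table. The coverage target and discount rate below are hypothetical placeholders, not AWS rates; a real review would take them from Cost Explorer recommendations.

```python
HOURS_PER_MONTH = 730  # common approximation of average hours per month


def model_commitment(compute_monthly, steady_share, rightsizing_cut,
                     coverage, discount):
    """Rough commitment scenario: rightsize first, then cover part of the baseline.

    coverage and discount are hypothetical inputs, not published AWS rates.
    """
    steady = compute_monthly * steady_share          # steady-state On-Demand spend
    steady_after = max(steady - rightsizing_cut, 0)  # baseline left after rightsizing
    covered = steady_after * coverage                # On-Demand spend the plan would cover
    committed = covered * (1 - discount)             # discounted cost of that usage
    return {
        "steady_on_demand": round(steady, 2),
        "covered_on_demand": round(covered, 2),
        "hourly_commitment": round(committed / HOURS_PER_MONTH, 2),
        "monthly_savings": round(covered - committed, 2),
    }


scenario = model_commitment(
    compute_monthly=8200,   # EC2 / ECS baseline from the synthetic table
    steady_share=0.62,      # steady-state share from the finding above
    rightsizing_cut=1100,   # example rightsizing exposure from the decision table
    coverage=0.80,          # hypothetical: commit to 80% of the remaining baseline
    discount=0.25,          # hypothetical discount rate, for illustration only
)
print(scenario)
```

Covering less than 100% of the post-rightsizing baseline is the conservative part: it leaves room for usage to drop without stranding the commitment.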
Several always-on app nodes show low average CPU utilization and ample unused memory. Buying a commitment before rightsizing would lock in unnecessary spend.
Decision: Downsize or consolidate low-utilization nodes, validate performance, then re-run Savings Plans / RI scenarios.
Verbose AI tool logs and private subnet routing add avoidable CloudWatch, NAT Gateway, and data-transfer cost that does not improve customer outcomes.
Decision: Set log retention, sample traces, remove debug payloads, and evaluate VPC endpoints or route changes for repeat service calls.
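One way to plan the retention change is to map each log group to the shortest retention period that still meets its need. The sketch below is illustrative only: the log group names and the per-group day counts are hypothetical, and the list holds only a subset of the retention values CloudWatch Logs accepts.

```python
# A subset of the retention periods (in days) that CloudWatch Logs accepts.
ALLOWED_RETENTION_DAYS = [1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365]


def plan_retention(log_groups, policy, default_days=30):
    """Map each log group to the shortest allowed retention that meets its need."""
    plan = {}
    for name in log_groups:
        # First matching prefix wins; unmatched groups get the default.
        wanted = next(
            (days for prefix, days in policy.items() if name.startswith(prefix)),
            default_days,
        )
        plan[name] = min(
            (d for d in ALLOWED_RETENTION_DAYS if d >= wanted),
            default=ALLOWED_RETENTION_DAYS[-1],
        )
    return plan


# Hypothetical log group names and retention needs, for illustration only.
groups = [
    "/ecs/support-assistant/tool-debug",
    "/ecs/support-assistant/app",
    "/ecs/support-assistant/audit",
]
policy = {
    "/ecs/support-assistant/tool-debug": 7,   # verbose debug payloads
    "/ecs/support-assistant/app": 45,         # rounds up to the next allowed value
}
plan = plan_retention(groups, policy)
print(plan)
```

The plan is deliberately separate from applying it. In a real account the values would be set through the CloudWatch Logs retention policy API (for example boto3's `put_retention_policy` with `logGroupName` and `retentionInDays`) after the owner approves them.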
Uncontrolled retries increased LLM calls by 38%. Malformed tool responses caused repeated calls on the same user action with no circuit breaker or max retry cap.
Decision: Add retry caps, failure budgets, cheaper fallback model, and owner escalation when retry rate breaches threshold.
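A retry cap with a failure budget might look like the following minimal sketch. The class and function names are hypothetical, `ValueError` stands in for a malformed tool response, and the thresholds are placeholders rather than recommended values.

```python
class CircuitOpen(Exception):
    """Raised when the failure budget is spent and an owner should be alerted."""


class RetryGuard:
    def __init__(self, max_retries=2, failure_budget=5):
        self.max_retries = max_retries        # extra attempts allowed per user action
        self.failure_budget = failure_budget  # total failures before the circuit opens
        self.failures = 0

    def call(self, primary, fallback):
        """Try primary a bounded number of times, then use the cheaper fallback."""
        if self.failures >= self.failure_budget:
            raise CircuitOpen("failure budget spent; escalate to workflow owner")
        for _ in range(1 + self.max_retries):
            try:
                return primary()
            except ValueError:  # stand-in for a malformed tool response
                self.failures += 1
                if self.failures >= self.failure_budget:
                    break
        return fallback()


calls = {"primary": 0}


def flaky_primary():
    calls["primary"] += 1
    raise ValueError("malformed tool response")


def cheap_fallback():
    return "fallback answer"


guard = RetryGuard(max_retries=2, failure_budget=5)
results = [guard.call(flaky_primary, cheap_fallback) for _ in range(2)]
try:
    guard.call(flaky_primary, cheap_fallback)
    circuit_opened = False
except CircuitOpen:
    circuit_opened = True
print(results, calls["primary"], circuit_opened)
```

This sketch never resets the failure count. A production version would typically reset it over a time window and record which workflow tripped the circuit so the owner escalation has context.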
Average prompt size increased from 3.2k tokens to 9.8k tokens after adding long support history. Premium model routing is also used for low-risk classification tasks.
Decision: Summarize history, cap retrieval windows, add model routing, and measure cost per resolved ticket.
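The context cap and model routing can be illustrated with a small sketch. It trims history to a token budget rather than summarizing it, which is a simpler stand-in for the summarization step named above. The token estimate is a crude heuristic, and the model names and low-risk task list are hypothetical.

```python
def estimate_tokens(text):
    # Crude heuristic (about 4 characters per token); a real review would use
    # the provider's tokenizer instead.
    return max(1, len(text) // 4)


def trim_history(messages, budget_tokens):
    """Keep the most recent messages that fit inside budget_tokens."""
    kept, used = [], 0
    for msg in reversed(messages):
        cost = estimate_tokens(msg)
        if used + cost > budget_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept)), used


# Hypothetical task labels and model tier names, for illustration only.
LOW_RISK_TASKS = {"classification", "tagging"}


def pick_model(task_type):
    """Route low-risk tasks to the cheaper tier, everything else to premium."""
    return "economy-model" if task_type in LOW_RISK_TASKS else "premium-model"


history = ["a" * 400, "b" * 400, "c" * 400]  # about 100 estimated tokens each
kept, used = trim_history(history, budget_tokens=250)
growth = round(9.8 / 3.2, 2)  # prompt growth factor from the finding above
print(len(kept), used, pick_model("classification"), growth)
```

On the sample's numbers the average prompt roughly tripled, so input token spend per call would scale by about the same factor before any retries are counted.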
The team tracks AWS and API cost separately, but not cost per resolved ticket or gross margin exposure by workflow. As a result, Finance cannot tell whether AI usage is profitable.
Decision: Add cost per workflow, cost per resolved ticket, AWS+AI blended cost, and gross margin exposure.
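The blended metric is a straightforward ratio, sketched below. The AWS and AI totals come from the synthetic cost baseline table, with LLM API and vector search counted as AI spend (an assumption). The resolved ticket count and revenue per ticket are hypothetical placeholders, since the sample does not provide them.

```python
def unit_economics(aws_cost, ai_cost, resolved_tickets, revenue_per_ticket):
    """Return blended monthly cost, cost per resolved ticket, and gross margin."""
    if resolved_tickets <= 0 or revenue_per_ticket <= 0:
        raise ValueError("resolved_tickets and revenue_per_ticket must be positive")
    blended = aws_cost + ai_cost
    cost_per_ticket = blended / resolved_tickets
    margin = (revenue_per_ticket - cost_per_ticket) / revenue_per_ticket
    return {
        "blended_cost": blended,
        "cost_per_resolved_ticket": round(cost_per_ticket, 2),
        "gross_margin_pct": round(100 * margin, 1),
    }


metrics = unit_economics(
    aws_cost=12870,           # AWS layers from the synthetic baseline table
    ai_cost=6050,             # LLM API plus vector/search from the same table
    resolved_tickets=12000,   # hypothetical volume, not from the sample
    revenue_per_ticket=4.00,  # hypothetical price, not from the sample
)
print(metrics)
```

Running the same function per workflow, with tagged costs, would give the cost-per-workflow view this decision calls for.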
AWS commitment decision guardrails
AWS documentation describes Compute Savings Plans as the more flexible option and EC2 Instance Savings Plans / Reserved Instances as more specific commitment tools. Real client recommendations should draw on AWS Cost Explorer recommendations, Cost and Usage Report (CUR) data, utilization and coverage metrics, and business constraints.
Executive decision table
Hypothetical sample. Real decision memos use actual billing exports, usage data, workflow traces, and owner mapping.
| Area | Owner | Cost driver | Estimated exposure | Recommended decision | Priority |
|---|---|---|---|---|---|
| EC2 / ECS baseline | Platform / FinOps | On-Demand steady-state compute | $2,900/month AWS savings candidate | Rightsize, then evaluate Compute Savings Plan vs EC2 Instance Savings Plan / RI | P0 |
| EC2 app nodes | Platform | Oversized always-on instances | $1,100/month example exposure | Rightsize before commitment purchase | P0 |
| RDS / storage | Database owner | Overprovisioned storage/IOPS and stable DB usage | $740/month example exposure | Rightsize storage and evaluate database commitment separately | P1 |
| NAT / CloudWatch | Platform | Verbose logs, trace payloads, and repeat NAT path | $620/month example exposure | Set retention, sample traces, review VPC endpoints | P1 |
| Support triage | Support Ops | Retry storm after malformed tool responses | $2,400/month AI exposure | Cap retries and add circuit breaker | P0 |
| RAG support search | Engineering | Context inflation and oversized retrieval payloads | $3,100/month AI exposure | Optimize context and reroute retrieval | P0 |
| Sales research agent | Sales Ops | Premium model overuse for low-risk tasks | $950/month AI exposure | Reroute to cheaper model tier | P1 |
| Ticket summarization | Finance / Ops | No blended AWS+AI cost-per-resolution reporting | $1,200/month decision exposure | Keep, but add unit economics reporting | P1 |
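For triage, the exposure column above can be rolled up by priority. This is a minimal sketch over the synthetic figures; the sums are naive, because some exposures may overlap (for example, the EC2 baseline savings candidate and the app node rightsizing), so the totals should not be read as additive savings.

```python
from collections import defaultdict

# (area, priority, estimated monthly exposure in USD) from the table above.
ROWS = [
    ("EC2 / ECS baseline", "P0", 2900),
    ("EC2 app nodes", "P0", 1100),
    ("RDS / storage", "P1", 740),
    ("NAT / CloudWatch", "P1", 620),
    ("Support triage", "P0", 2400),
    ("RAG support search", "P0", 3100),
    ("Sales research agent", "P1", 950),
    ("Ticket summarization", "P1", 1200),
]


def rollup(rows):
    """Sum exposure per priority and rank areas within each priority."""
    totals = defaultdict(int)
    ranked = defaultdict(list)
    for area, priority, exposure in rows:
        totals[priority] += exposure
        ranked[priority].append((exposure, area))
    for priority in ranked:
        ranked[priority].sort(reverse=True)  # largest exposure first
    return dict(totals), dict(ranked)


totals, ranked = rollup(ROWS)
print(totals)
print([area for _, area in ranked["P0"]])
```

Ranking within P0 gives a starting order for the pre-handoff work, with RAG support search and the EC2/ECS baseline at the top on these figures.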
Margin impact
Owner map
Owns retry limits, context limits, routing logic, and quick wins before handoff.
Owns AWS Cost Explorer review, EC2 rightsizing, Savings Plans / RI scenario modeling, log retention, and network cost cleanup.
Owns handoff docs, scope boundaries, and Phase 2 remediation list.
Owns provider accounts, budget approvals, and production monitoring after delivery.
Owns commitment approval, monthly cost targets, margin thresholds, and whether to reprice or limit high-cost workflows.
Owns resolution quality, escalation rules, and workflow usage assumptions.
Handoff risks
Handoff checklist
30-day action plan
Export Cost Explorer/CUR data, tag workflows, add retry caps, and assign AWS/AI owners.
Rightsize EC2/RDS, reduce logs/NAT drag, summarize support history, and cap retrieval windows.
Run Savings Plans / RI scenarios after rightsizing and add blended AWS+AI cost per resolved ticket.
Decide what to commit, keep On-Demand, reduce, reprice, cap, or govern before handoff.
Final executive recommendation
Proceed with handoff only after P0 controls are added and the client receives an operating-cost appendix. Keep the AI assistant, but first rightsize obvious AWS waste, model conservative Savings Plans / RI scenarios, cap risky AI workflows, assign budget owners, and add blended AWS+AI cost-per-resolution reporting.
This page shows the deliverable format with representative data. Real client reviews use actual billing exports, usage data, workflow traces, handoff docs, and owner mapping.
Olive-One reviews AWS spend and AI workflow cost together, then turns savings opportunities and operating risk into a client-ready handoff report.