Synthetic sample report

Sample AI App Cost Emergency Memo.

This is synthetic/demo data, not a real client report and not a guaranteed savings claim. It shows the deliverable structure Olive-One uses to connect AWS, LLM, API, and workflow spend with margin risk before a founder accepts delivery, scales usage, reprices, or rebuilds.

Executive summary

This is a hypothetical sample. A product studio is preparing to hand off an AI support assistant hosted on AWS. The MVP works, but AWS compute commitments, EC2 rightsizing, RDS capacity, logging cost, model usage, retries, vector/database usage, and handoff controls are not fully documented.

Executive memo sample

Decision needed: keep the support assistant in production, but reduce obvious AWS waste, evaluate eligible EC2 commitment discounts, and cap AI runaway paths before expanding adoption.

Audience: Agency founder, platform lead, client CTO, client finance owner, client operations owner

Primary issue: The app is functional, but the client may inherit avoidable AWS On-Demand spend, unclear AI unit economics, weak controls, and no owner map after handoff.

Recommended move: Fix waste before buying commitments. Rightsize EC2/RDS, evaluate Savings Plans or Reserved Instances only for stable baseline usage, cap retry/context paths, add cheaper model routing, and publish a client-facing operating guide.

Decision horizon: 7 days for P0 guardrails, 30 days for AWS commitment decision and workflow-level AI cost reporting.

AWS savings track

Find steady-state EC2/RDS usage, rightsizing opportunities, log retention waste, NAT/egress drag, idle storage, and commitment candidates such as Compute Savings Plans, EC2 Instance Savings Plans, or Reserved Instances.

AI margin track

Find retry storms, context inflation, premium-model defaults, RAG payload bloat, vector drift, and missing cost-per-resolution reporting.

Client context

Project: AI support assistant MVP for a B2B services client.

Stage: Pre-handoff. Production pilot expected in three weeks.

Stack: AWS ALB, ECS on EC2 Auto Scaling, RDS PostgreSQL, ElastiCache, S3, NAT Gateway, CloudWatch Logs, OpenAI, vector search, webhook automation.

Input data used

Assumptions

AWS documentation describes Savings Plans as commitment-based discounts for eligible compute usage. Compute Savings Plans are the most flexible option and apply across EC2, Fargate, and Lambda, while EC2 Instance Savings Plans and EC2 Reserved Instances are narrower commitments that typically offer deeper discounts in exchange for less flexibility. This sample shows where a review would evaluate them; it is not a purchase recommendation.

Monthly spend reviewed: $18,920 (synthetic baseline)
AWS savings candidate: $4,860 (validate before commitment)
AI margin exposure: $6,740 (workflow risk reviewed)

Cost baseline

Period: 30 days

Scope: EC2/ECS compute · RDS PostgreSQL · NAT/egress · CloudWatch Logs · S3/EBS storage · LLM API usage · Vector search/retrieval · Webhook automation

Layer | Monthly baseline | Observed issue | Review decision
EC2 / ECS compute | $8,200 | Stable On-Demand baseline with low commitment coverage and oversized instances | Rightsize first, then evaluate Compute Savings Plan or EC2 Instance Savings Plan
RDS PostgreSQL | $2,600 | Steady production database, overprovisioned storage/IOPS | Right-size storage and evaluate DB commitment after utilization review
NAT / egress | $1,150 | Private subnet routing sends repeat traffic through NAT path | Review VPC endpoints, caching, and data transfer shape
CloudWatch Logs | $920 | Verbose request/tool logs and long retention | Reduce debug logs, set retention, sample traces
LLM API | $4,850 | Retries, context inflation, and premium model default | Cap retries, compress context, route low-risk tasks to cheaper model
Vector/search | $1,200 | Oversized retrieval payloads and stale index growth | Prune index, limit retrieval, cache repeated context
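
As a quick arithmetic check on the baseline, the sketch below sums the six synthetic layer figures and reports each layer's share of the total. The function name `layer_shares` is illustrative only and not part of any Olive-One tooling.

```python
# Synthetic monthly baseline figures (USD) copied from the cost baseline table.
baseline = {
    "EC2 / ECS compute": 8200,
    "RDS PostgreSQL": 2600,
    "NAT / egress": 1150,
    "CloudWatch Logs": 920,
    "LLM API": 4850,
    "Vector/search": 1200,
}


def layer_shares(costs):
    """Return (total, {layer: percent of total}) for a dict of monthly costs."""
    total = sum(costs.values())
    shares = {layer: round(100 * cost / total, 1) for layer, cost in costs.items()}
    return total, shares


total, shares = layer_shares(baseline)
# Print layers from largest to smallest share.
for layer, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{layer:<20} ${baseline[layer]:>6,}  {pct:>5.1f}%")
print(f"{'Total':<20} ${total:>6,}")
```

The total matches the $18,920 monthly spend reviewed, with EC2 / ECS compute and the LLM API as the two largest layers.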

Workflows reviewed

Top findings

Finding 1: EC2 On-Demand commitment gap

About 62% of EC2/ECS compute spend appears steady-state in this synthetic window, but all of it is billed at On-Demand rates. The review separates flexible usage from the stable baseline before any commitment decision.

Decision: Rightsize first. Then model a conservative Compute Savings Plan for flexible baseline usage and an EC2 Instance Savings Plan or RI only for stable instance-family usage.
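
A minimal sketch of how the steady-state split could feed a conservative commitment scenario. The 62% share and the $8,200 compute figure come from this sample; `commit_fraction` and `discount_rate` are placeholder assumptions, not AWS rates, and a real review would take them from Cost Explorer recommendations.

```python
def commitment_scenario(monthly_compute, steady_share, commit_fraction, discount_rate):
    """Split compute spend into steady and flexible parts, then estimate
    savings from committing to only a fraction of the steady part."""
    steady = monthly_compute * steady_share
    flexible = monthly_compute - steady
    # On-Demand-equivalent spend that would move under a commitment.
    committed = steady * commit_fraction
    savings = committed * discount_rate
    return {
        "steady": round(steady, 2),
        "flexible": round(flexible, 2),
        "committed": round(committed, 2),
        "estimated_savings": round(savings, 2),
    }


# commit_fraction=0.8 and discount_rate=0.25 are hypothetical inputs
# chosen only to illustrate the calculation.
scenario = commitment_scenario(8200, 0.62, commit_fraction=0.8, discount_rate=0.25)
print(scenario)
```

Committing to less than the full steady share leaves room for the baseline to shrink after rightsizing without stranding commitment.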

Finding 2: EC2 rightsizing before Savings Plans

Several always-on app nodes show low average CPU and memory utilization, leaving substantial unused headroom. Buying a commitment before rightsizing would lock in unnecessary spend.

Decision: Downsize or consolidate low-utilization nodes, validate performance, then re-run Savings Plans / RI scenarios.
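
One way to shortlist nodes for the downsizing step is a simple utilization filter. The sketch below is an assumption-heavy illustration: the `NodeStats` fields, the thresholds, and the node names are all hypothetical, and a real review would pull these metrics from monitoring data and validate performance after each change.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    name: str
    avg_cpu_pct: float   # average CPU utilization over the review window
    peak_cpu_pct: float  # highest observed CPU utilization
    avg_mem_pct: float   # average memory utilization


def rightsizing_candidates(nodes, max_avg_cpu=20.0, max_peak_cpu=60.0, max_avg_mem=40.0):
    """Return names of nodes whose average and peak usage all sit under the thresholds."""
    return [
        n.name
        for n in nodes
        if n.avg_cpu_pct <= max_avg_cpu
        and n.peak_cpu_pct <= max_peak_cpu
        and n.avg_mem_pct <= max_avg_mem
    ]


# Hypothetical fleet for illustration only.
fleet = [
    NodeStats("app-node-1", avg_cpu_pct=9.0, peak_cpu_pct=35.0, avg_mem_pct=28.0),
    NodeStats("app-node-2", avg_cpu_pct=12.0, peak_cpu_pct=85.0, avg_mem_pct=30.0),
    NodeStats("app-node-3", avg_cpu_pct=55.0, peak_cpu_pct=90.0, avg_mem_pct=70.0),
]
print(rightsizing_candidates(fleet))
```

Including a peak threshold keeps bursty nodes such as `app-node-2` off the list even when their average looks low.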

Finding 3: Logging and NAT drag

Verbose AI tool logs and private subnet routing add avoidable CloudWatch, NAT Gateway, and data-transfer cost that does not improve customer outcomes.

Decision: Set log retention, sample traces, remove debug payloads, and evaluate VPC endpoints or route changes for repeat service calls.
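
The retention step can be planned before anything is changed. The sketch below assumes log group records shaped like the entries returned by the CloudWatch Logs `describe_log_groups` API (`logGroupName`, optional `retentionInDays`); the group names and the 30-day target are hypothetical. Applying the plan would use a call such as boto3's `put_retention_policy`, which is left out here so the example runs without AWS access.

```python
def retention_changes(log_groups, target_days=30):
    """Return (name, current_retention, target) for groups that never expire
    or retain logs longer than the target."""
    changes = []
    for group in log_groups:
        current = group.get("retentionInDays")  # a missing key means logs never expire
        if current is None or current > target_days:
            changes.append((group["logGroupName"], current, target_days))
    return changes


# Hypothetical log groups for illustration only.
groups = [
    {"logGroupName": "/app/support-assistant/requests"},
    {"logGroupName": "/app/support-assistant/tools", "retentionInDays": 365},
    {"logGroupName": "/app/support-assistant/audit", "retentionInDays": 14},
]
for name, current, target in retention_changes(groups):
    print(f"{name}: {current or 'never expires'} -> {target} days")
```

Producing the change list first gives the platform owner something to review, since some groups (audit trails, for example) may need longer retention for compliance reasons.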

Finding 4: Retry Storm

Uncontrolled retries increased LLM calls by 38%. Malformed tool responses caused repeated calls on the same user action with no circuit breaker or max retry cap.

Decision: Add retry caps, failure budgets, cheaper fallback model, and owner escalation when retry rate breaches threshold.
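
The retry cap and circuit breaker described above can be sketched as a small guard object. This is a minimal illustration under stated assumptions: `RetryGuard`, `CircuitOpen`, the thresholds, and the use of `ValueError` as a stand-in for a malformed tool response are all hypothetical, and a production version would also reset the breaker after a cool-down period.

```python
class CircuitOpen(Exception):
    """Raised when the failure budget is spent and an owner should be alerted."""


class RetryGuard:
    def __init__(self, max_retries=2, failure_threshold=5):
        self.max_retries = max_retries
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    def call(self, primary, fallback):
        # Refuse new work once too many failures have piled up in a row.
        if self.consecutive_failures >= self.failure_threshold:
            raise CircuitOpen("failure budget exhausted; escalate to workflow owner")
        # One initial attempt plus at most max_retries retries.
        for _ in range(1 + self.max_retries):
            try:
                result = primary()
                self.consecutive_failures = 0
                return result
            except ValueError:  # stand-in for a malformed tool response
                self.consecutive_failures += 1
        # Retries are spent: use the cheaper fallback path instead of looping.
        return fallback()


attempts = {"count": 0}


def malformed_tool_call():
    attempts["count"] += 1
    raise ValueError("malformed tool response")


guard = RetryGuard(max_retries=2, failure_threshold=5)
print(guard.call(malformed_tool_call, lambda: "fallback answer"))
print("primary attempts:", attempts["count"])
```

Each user action costs at most three primary calls here, and repeated failures across actions trip the breaker rather than generating more LLM spend.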

Finding 5: Context inflation and premium model default

Average prompt size increased from 3.2k tokens to 9.8k tokens after adding long support history. Premium model routing is also used for low-risk classification tasks.

Decision: Summarize history, cap retrieval windows, add model routing, and measure cost per resolved ticket.
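
A minimal sketch of two of these controls: capping how much history enters the prompt and routing low-risk tasks to a cheaper tier. The tier names, the task list, and the word-count token estimate are assumptions for illustration; a real implementation would use the provider's tokenizer and the client's own risk classification.

```python
# Hypothetical set of task types considered low-risk.
LOW_RISK_TASKS = {"classification", "tagging", "language_detection"}


def route_model(task_type):
    """Send low-risk tasks to a cheaper tier; everything else stays on premium."""
    return "cheap-tier" if task_type in LOW_RISK_TASKS else "premium-tier"


def estimate_tokens(text):
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


def trim_history(messages, budget_tokens):
    """Keep the most recent messages that fit in the token budget, in original order."""
    kept, used = [], 0
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > budget_tokens:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))


history = [
    "customer reported login failure last month",
    "agent reset the password",
    "customer now reports a billing error",
]
print(route_model("classification"))
print(trim_history(history, budget_tokens=10))
```

Trimming from the oldest end preserves the latest turns; older history that falls outside the budget is a natural candidate for the summarization step.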

Finding 6: No workflow margin reporting

The team tracks AWS and API costs separately but does not track cost per resolved ticket or gross margin exposure by workflow. As a result, finance cannot tell whether AI usage is profitable.

Decision: Add cost per workflow, cost per resolved ticket, AWS+AI blended cost, and gross margin exposure.
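
The blended metric can be sketched as a small calculation. The AWS and AI cost inputs below are sums of the synthetic baseline layers in this sample (AWS: compute, RDS, NAT, logs; AI: LLM API and vector/search); the ticket count and revenue per ticket are hypothetical placeholders, since the sample does not state them.

```python
def unit_economics(aws_cost, ai_cost, resolved_tickets, revenue_per_ticket):
    """Return blended cost, cost per resolved ticket, and gross margin per ticket."""
    if resolved_tickets <= 0:
        raise ValueError("resolved_tickets must be positive")
    blended = aws_cost + ai_cost
    cost_per_ticket = blended / resolved_tickets
    gross_margin = (revenue_per_ticket - cost_per_ticket) / revenue_per_ticket
    return {
        "blended_cost": blended,
        "cost_per_resolved_ticket": round(cost_per_ticket, 2),
        "gross_margin": round(gross_margin, 3),
    }


report = unit_economics(
    aws_cost=12870,           # synthetic AWS layers from the baseline table
    ai_cost=6050,             # synthetic LLM API + vector/search layers
    resolved_tickets=4000,    # hypothetical volume
    revenue_per_ticket=12.0,  # hypothetical revenue attributed per resolution
)
print(report)
```

Reporting the same figures per workflow, rather than for the whole app, is what lets finance see which workflows to reprice or cap.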

AWS commitment decision guardrails

Official AWS docs describe Compute Savings Plans as more flexible and EC2 Instance Savings Plans / Reserved Instances as more specific commitment tools. Real client recommendations should use AWS Cost Explorer recommendations, CUR data, utilization, coverage, and business constraints.
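
Utilization and coverage pull in opposite directions, which a small sketch can make concrete. This is a simplified model, not the Cost Explorer calculation: hourly usage and the commitment are expressed in the same units, and all figures are hypothetical.

```python
def commitment_metrics(hourly_usage, hourly_commitment):
    """Return (utilization, coverage) for a flat hourly commitment.

    utilization: share of the purchased commitment that was actually used
    coverage:    share of eligible usage that the commitment paid for
    """
    used = sum(min(usage, hourly_commitment) for usage in hourly_usage)
    total_commitment = hourly_commitment * len(hourly_usage)
    total_usage = sum(hourly_usage)
    return used / total_commitment, used / total_usage


# Hypothetical hourly eligible usage with one quiet hour.
usage = [10, 8, 12, 4]
for commitment in (4, 8, 12):
    utilization, coverage = commitment_metrics(usage, commitment)
    print(f"commitment={commitment:>2}  utilization={utilization:.0%}  coverage={coverage:.0%}")
```

A larger commitment raises coverage but lowers utilization once it exceeds the quiet hours, which is why the review sizes commitments against the stable baseline rather than the peak.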

Executive decision table

Hypothetical sample. Real decision memos use actual billing exports, usage data, workflow traces, and owner mapping.

Area | Owner | Cost driver | Estimated exposure | Recommended decision | Priority
EC2 / ECS baseline | Platform / FinOps | On-Demand steady-state compute | $2,900/month AWS savings candidate | Rightsize, then evaluate Compute Savings Plan vs EC2 Instance Savings Plan / RI | P0
EC2 app nodes | Platform | Oversized always-on instances | $1,100/month example exposure | Rightsize before commitment purchase | P0
RDS / storage | Database owner | Overprovisioned storage/IOPS and stable DB usage | $740/month example exposure | Right-size storage and evaluate database commitment separately | P1
NAT / CloudWatch | Platform | Verbose logs, trace payloads, and repeat NAT path | $620/month example exposure | Set retention, sample traces, review VPC endpoints | P1
Support triage | Support Ops | Retry storm after malformed tool responses | $2,400/month AI exposure | Cap retries and add circuit breaker | P0
RAG support search | Engineering | Context inflation and oversized retrieval payloads | $3,100/month AI exposure | Optimize context and reroute retrieval | P0
Sales research agent | Sales Ops | Premium model overuse for low-risk tasks | $950/month AI exposure | Reroute to cheaper model tier | P1
Ticket summarization | Finance / Ops | No blended AWS+AI cost-per-resolution reporting | $1,200/month decision exposure | Keep, but add unit economics reporting | P1

Margin impact

Owner map

Agency engineering

Owns retry limits, context limits, routing logic, and quick wins before handoff.

Platform / FinOps

Owns AWS Cost Explorer review, EC2 rightsizing, Savings Plans / RI scenario modeling, log retention, and network cost cleanup.

Agency PM / operator

Owns handoff docs, scope boundaries, and Phase 2 remediation list.

Client owner

Owns provider accounts, budget approvals, and production monitoring after delivery.

Client finance

Owns commitment approval, monthly cost targets, margin thresholds, and whether to reprice or limit high-cost workflows.

Client ops / support

Owns resolution quality, escalation rules, and workflow usage assumptions.

Handoff risks

Handoff checklist

30-day action plan

Export Cost Explorer/CUR data, tag workflows, add retry caps, and assign AWS/AI owners.

Rightsize EC2/RDS, reduce logs/NAT drag, summarize support history, and cap retrieval windows.

Run Savings Plans / RI scenarios after rightsizing and add blended AWS+AI cost per resolved ticket.

Decide what to commit, keep On-Demand, reduce, reprice, cap, or govern before handoff.

Final executive recommendation

Proceed with handoff only after P0 controls are added and the client receives an operating-cost appendix. Keep the AI assistant, but first rightsize obvious AWS waste, model conservative Savings Plans / RI scenarios, cap risky AI workflows, assign budget owners, and add blended AWS+AI cost-per-resolution reporting.

This page shows the deliverable format with representative data. Real client reviews use actual billing exports, usage data, workflow traces, handoff docs, and owner mapping.

Want this before you accept delivery or scale usage?

Olive-One reviews AWS spend and AI workflow cost together, then turns savings opportunities and operating risk into a client-ready handoff report.