Synthetic sample report

Sample AI Margin Map memo.

This is synthetic/demo data, not a real client report and not a guaranteed savings claim. It shows the deliverable structure Olive One uses to connect AWS, LLM, API, and workflow spend with margin risk before a founder accepts delivery, scales usage, reprices, or rebuilds.

Executive summary

This is a hypothetical sample. A product studio is preparing to hand off an AI support assistant hosted on AWS. The MVP works, but its cost drivers and controls are not fully documented: AWS compute commitments, EC2 rightsizing, RDS capacity, logging cost, model usage, retries, vector/database usage, and handoff controls.

Executive memo sample

Decision needed: keep the support assistant in production, but reduce obvious AWS waste, evaluate eligible EC2 commitment discounts, and cap AI runaway paths before expanding adoption.

Audience: Agency founder, platform lead, client CTO, client finance owner, client operations owner
Primary issue: The app is functional, but the client may inherit avoidable AWS On-Demand spend, unclear AI unit economics, weak controls, and no owner map after handoff.
Recommended move: Fix waste before buying commitments. Rightsize EC2/RDS, evaluate Savings Plans or Reserved Instances only for stable baseline usage, cap retry/context paths, add cheaper model routing, and publish a client-facing operating guide.
Decision horizon: 7 days for P0 guardrails, 30 days for AWS commitment decision and workflow-level AI cost reporting.

AWS savings track

Find steady-state EC2/RDS usage, rightsizing opportunities, log retention waste, NAT/egress drag, idle storage, and commitment candidates such as Compute Savings Plans, EC2 Instance Savings Plans, or Reserved Instances.

AI margin track

Find retry storms, context inflation, premium-model defaults, RAG payload bloat, vector drift, and missing cost-per-resolution reporting.

Client context

Project: AI support assistant MVP for a B2B services client.

Stage: Pre-handoff. Production pilot expected in three weeks.

Stack: AWS ALB, ECS on EC2 Auto Scaling, RDS PostgreSQL, ElastiCache, S3, NAT Gateway, CloudWatch Logs, OpenAI, vector search, webhook automation.

Input data used

AWS Cost Explorer export, synthetic CUR-style line items, EC2/RDS inventory, and CloudWatch usage summary.
Architecture walkthrough and workflow map.
Synthetic LLM usage logs, vector-search usage, and support workflow assumptions.
Provider screenshots for AWS and AI cost categories.
Handoff checklist draft.
Known production concerns from founder, agency, and technical stakeholders.

Assumptions

40 pilot users and 400 rollout users.
Support workflows average 2-4 AI calls per successful task.
AWS baseline currently runs mostly On-Demand with no active EC2 commitment coverage in this synthetic example.
Commitment recommendations are scenario candidates only; real purchase decisions require Cost Explorer coverage/utilization data and rightsizing first.
Numbers are directional examples and must be validated from real exports.

AWS documentation describes Savings Plans as commitment-based discounts for eligible compute usage. Compute Savings Plans are the most flexible option, applying across EC2, Fargate, and Lambda, and offer a smaller discount in exchange; EC2 Instance Savings Plans and EC2 Reserved Instances are more specific commitments that trade flexibility for lower rates. This sample shows where a review would evaluate them; it is not a purchase recommendation.

Monthly spend reviewed: $18,920 (synthetic baseline)

AWS savings candidate: $4.3k–5.4k (rows sum to $5,360; validate before commitment)

AI margin exposure: $6.5k–7.7k (incl. $1,200 reporting gap; workflow risk reviewed)

Cost baseline

Period: 30 days

Scope: EC2/ECS compute · RDS PostgreSQL · NAT/egress · CloudWatch Logs · S3/EBS storage · LLM API usage · Vector search/retrieval · Webhook automation

Layer	Monthly baseline	Observed issue	Review decision
EC2 / ECS compute	$8,200	Stable On-Demand baseline with low commitment coverage and oversized instances	Rightsize first, then evaluate Compute Savings Plan or EC2 Instance Savings Plan
RDS PostgreSQL	$2,600	Steady production database, overprovisioned storage/IOPS	Rightsize storage and evaluate DB commitment after utilization review
NAT / egress	$1,150	Private subnet routing sends repeat traffic through NAT path	Review VPC endpoints, caching, and data transfer shape
CloudWatch Logs	$920	Verbose request/tool logs and long retention	Reduce debug logs, set retention, sample traces
LLM API	$4,850	Retries, context inflation, and premium model default	Cap retries, compress context, route low-risk tasks to cheaper model
Vector/search	$1,200	Oversized retrieval payloads and stale index growth	Prune index, limit retrieval, cache repeated context
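As a quick arithmetic check, the layer rows above can be summed to confirm they reconcile with the $18,920 synthetic baseline. This is a minimal sketch: the figures are copied from the table, while the variable names and the grouping of "LLM API" and "Vector/search" as AI layers are assumptions made for illustration.

```python
# Monthly baseline per layer, copied from the synthetic table above (USD).
baseline = {
    "EC2 / ECS compute": 8200,
    "RDS PostgreSQL": 2600,
    "NAT / egress": 1150,
    "CloudWatch Logs": 920,
    "LLM API": 4850,
    "Vector/search": 1200,
}

# Assumption for this sketch: these two rows are treated as the AI layers.
AI_LAYERS = ("LLM API", "Vector/search")

total = sum(baseline.values())
ai_total = sum(baseline[layer] for layer in AI_LAYERS)
aws_total = total - ai_total

# Share of total spend per layer, in percent, rounded to one decimal.
shares = {layer: round(100 * cost / total, 1) for layer, cost in baseline.items()}

print(f"Total: ${total:,}")           # Total: $18,920
print(f"AWS layers: ${aws_total:,}")  # AWS layers: $12,870
print(f"AI layers: ${ai_total:,}")    # AI layers: $6,050
```

In a real review the same reconciliation would run against the actual Cost Explorer or CUR export rather than hand-copied rows.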

Workflows reviewed

Support triage on ECS/EC2
Refund assistant with approval workflow
RAG support search backed by vector index
Sales research agent using premium model calls
Ticket summarization and CloudWatch trace/log pipeline

Top findings

Finding 1: EC2 On-Demand commitment gap

Roughly 55–65% of EC2/ECS compute spend appears steady-state in this synthetic window, but it is paid as On-Demand. The review separates flexible usage from stable baseline before any commitment decision.

Decision: Rightsize first. Then model a conservative Compute Savings Plan for flexible baseline usage and an EC2 Instance Savings Plan or RI only for stable instance-family usage.
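To make the 55–65% range concrete, the sketch below applies it to the $8,200 compute row of the synthetic baseline. The 20% haircut is a hypothetical safety margin introduced only for this example; it is not a figure from the review, and the result is a starting point for scenario modeling, not a commitment amount.

```python
# EC2/ECS row of the synthetic baseline (USD per month).
compute_monthly = 8200

# Steady-state share from Finding 1, kept as integer percentages
# so the arithmetic stays exact.
steady_low_pct, steady_high_pct = 55, 65

steady_low = compute_monthly * steady_low_pct // 100
steady_high = compute_monthly * steady_high_pct // 100

# Hypothetical assumption: model commitments below the low end of the band,
# because rightsizing is expected to shrink the baseline first.
safety_haircut_pct = 20
conservative_base = steady_low * (100 - safety_haircut_pct) // 100

print(f"Steady-state band: ${steady_low:,} to ${steady_high:,}")  # $4,510 to $5,330
print(f"Conservative modeling base: ${conservative_base:,}")      # $3,608
```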

Finding 2: EC2 rightsizing before Savings Plans

Several always-on app nodes show low average CPU and memory utilization. Buying a commitment before rightsizing would lock in unnecessary spend.

Decision: Downsize or consolidate low-utilization nodes, validate performance, then re-run Savings Plans / RI scenarios.

Finding 3: Logging and NAT drag

Verbose AI tool logs and private subnet routing add avoidable CloudWatch, NAT Gateway, and data-transfer cost that does not improve customer outcomes.

Decision: Set log retention, sample traces, remove debug payloads, and evaluate VPC endpoints or route changes for repeat service calls.
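One way the trace sampling and debug-payload removal could look in application code is sketched below. The function names, the dropped field names, and the sampling rate are all hypothetical; log retention itself is configured on the log group and is not shown here.

```python
import hashlib


def keep_trace(trace_id: str, sample_rate: float) -> bool:
    """Deterministically keep roughly `sample_rate` of traces, keyed by trace ID.

    Hashing the ID means every log line of a given trace gets the same decision.
    """
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
    return bucket < sample_rate


def strip_debug_payload(record: dict, drop_keys=("prompt", "tool_response", "debug")) -> dict:
    """Return a copy of a log record without large debug fields (hypothetical keys)."""
    return {key: value for key, value in record.items() if key not in drop_keys}


# Example: keep about 10% of traces and log only the slim record.
record = {"trace_id": "trace-42", "status": 200, "prompt": "long text", "latency_ms": 812}
if keep_trace(record["trace_id"], sample_rate=0.10):
    print(strip_debug_payload(record))
```

Sampling by trace ID rather than per line keeps sampled traces complete, which matters when they are used to debug retry behavior.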

Finding 4: Retry storm

Uncontrolled retries increased LLM calls by roughly 30–45% in the reviewed window. Malformed tool responses caused repeated calls on the same user action with no circuit breaker or max retry cap.

Decision: Add retry caps, failure budgets, cheaper fallback model, and owner escalation when retry rate breaches threshold.
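A retry cap with a simple circuit breaker could be sketched as follows. The class name, default limits, and the idea of counting consecutive failed user actions are assumptions for illustration; a production version would also need backoff, metrics, and the owner escalation path described above.

```python
class RetryBudget:
    """Caps attempts per user action and opens a circuit after repeated failed actions."""

    def __init__(self, max_retries: int = 2, failure_threshold: int = 5):
        self.max_retries = max_retries              # retries after the first attempt
        self.failure_threshold = failure_threshold  # failed actions before the circuit opens
        self.consecutive_failed_actions = 0

    @property
    def circuit_open(self) -> bool:
        return self.consecutive_failed_actions >= self.failure_threshold

    def call(self, fn, is_valid):
        """Run fn() at most 1 + max_retries times. Returns (result, attempts).

        Returns (None, attempts) when every attempt produced an invalid result.
        """
        if self.circuit_open:
            raise RuntimeError("circuit open: stop calling the model and escalate to the owner")
        for attempt in range(1, self.max_retries + 2):
            result = fn()
            if is_valid(result):
                self.consecutive_failed_actions = 0
                return result, attempt
        self.consecutive_failed_actions += 1
        return None, self.max_retries + 1


# Example: a tool that returns malformed output twice, then a valid response.
responses = iter(["malformed", "malformed", "ok"])
budget = RetryBudget(max_retries=2)
print(budget.call(lambda: next(responses), lambda r: r == "ok"))  # ('ok', 3)
```

The hard cap bounds the worst-case number of LLM calls per user action, which is the property the uncontrolled path in this finding lacked.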

Finding 5: Context inflation and premium model default

Average prompt size increased from 3.2k tokens to 9.8k tokens after adding long support history. Premium model routing is also used for low-risk classification tasks.

Decision: Summarize history, cap retrieval windows, add model routing, and measure cost per resolved ticket.
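The prompt growth in this finding works out to roughly a threefold increase in input tokens per call. The sketch below computes that ratio and shows one possible shape for a routing rule. The tier names, the task list, and the 4,000-token cutoff are hypothetical, and the cost implication assumes per-token pricing is flat across prompt sizes.

```python
# Average prompt size before and after adding long support history (tokens).
old_tokens, new_tokens = 3200, 9800
growth = new_tokens / old_tokens
print(f"Prompt growth: {growth:.2f}x")  # Prompt growth: 3.06x

# Hypothetical set of task types considered low-risk enough for a cheaper tier.
LOW_RISK_TASKS = {"classification", "tagging", "language_detection"}


def pick_model_tier(task_type: str, prompt_tokens: int, max_economy_tokens: int = 4000) -> str:
    """Route low-risk, short-prompt tasks to a cheaper tier; everything else stays premium."""
    if task_type in LOW_RISK_TASKS and prompt_tokens <= max_economy_tokens:
        return "economy"
    return "premium"


print(pick_model_tier("classification", 1200))  # economy
print(pick_model_tier("refund_decision", 1200)) # premium
```

Routing on both task type and prompt size keeps long or high-stakes requests on the premium tier while the routing rule is still being validated.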

Finding 6: No workflow margin reporting

Team tracks AWS and API cost separately, but not cost per resolved ticket or gross margin exposure by workflow. Finance cannot tell whether AI usage is profitable.

Decision: Add cost per workflow, cost per resolved ticket, AWS+AI blended cost, and gross margin exposure.
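The missing metric is simple to compute once costs and resolutions are attributed to a workflow. In the sketch below, the AWS and AI totals come from the synthetic baseline, while the ticket count and revenue per ticket are invented purely for illustration, as is the simplification that all reviewed spend belongs to one workflow.

```python
def cost_per_resolved_ticket(aws_cost: float, ai_cost: float, resolved_tickets: int) -> float:
    """Blended AWS+AI cost divided by the number of resolved tickets."""
    if resolved_tickets <= 0:
        raise ValueError("resolved_tickets must be positive")
    return (aws_cost + ai_cost) / resolved_tickets


def gross_margin_pct(revenue_per_ticket: float, cost_per_ticket: float) -> float:
    """Gross margin as a percentage of revenue per resolved ticket."""
    if revenue_per_ticket <= 0:
        raise ValueError("revenue_per_ticket must be positive")
    return 100 * (revenue_per_ticket - cost_per_ticket) / revenue_per_ticket


# AWS and AI totals from the synthetic baseline; the other inputs are hypothetical.
blended = cost_per_resolved_ticket(aws_cost=12870, ai_cost=6050, resolved_tickets=4000)
margin = gross_margin_pct(revenue_per_ticket=10.0, cost_per_ticket=blended)
print(f"Blended cost per resolved ticket: ${blended:.2f}")
print(f"Gross margin: {margin:.1f}%")
```

Tracking this per workflow, rather than per provider invoice, is what lets finance see whether a given workflow is profitable.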

AWS commitment decision guardrails

Do not buy commitments before rightsizing. A Savings Plan or RI can make waste cheaper, but it does not remove waste.
Use Compute Savings Plans for flexible baseline compute. Best candidate when EC2 family, size, Region, or compute service may change.
Use EC2 Instance Savings Plans or RIs for stable family/Region usage. Best candidate only after utilization and architecture are stable.
Keep spiky, experimental, and pre-handoff usage outside commitments. AI rollout patterns can change fast after client adoption.
Re-run coverage after AI controls. Retry caps and model routing can reduce infrastructure demand, which changes the safe commitment level.
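The guardrails above can be read as a small decision rule. The sketch below encodes that ordering; the class, field names, and return labels are hypothetical, and a real decision would also weigh coverage, utilization, and business constraints.

```python
from dataclasses import dataclass


@dataclass
class UsageProfile:
    rightsized: bool                # has rightsizing been done and validated?
    steady_state: bool              # stable baseline, not spiky or experimental
    family_and_region_stable: bool  # instance family and Region unlikely to change


def commitment_candidate(usage: UsageProfile) -> str:
    """Apply the guardrails in order: rightsize, exclude spiky usage, then pick a tool."""
    if not usage.rightsized:
        return "rightsize first"
    if not usage.steady_state:
        return "keep On-Demand"
    if usage.family_and_region_stable:
        return "EC2 Instance Savings Plan or RI"
    return "Compute Savings Plan"


print(commitment_candidate(UsageProfile(rightsized=True, steady_state=True,
                                        family_and_region_stable=False)))
# Compute Savings Plan
```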

Official AWS docs describe Compute Savings Plans as more flexible and EC2 Instance Savings Plans / Reserved Instances as more specific commitment tools. Real client recommendations should use AWS Cost Explorer recommendations, CUR data, utilization, coverage, and business constraints.

Executive decision table

Hypothetical sample. Real decision memos use actual billing exports, usage data, workflow traces, and owner mapping.

Area	Owner	Cost driver	Estimated exposure	Confidence	Recommended decision	Priority
EC2 / ECS baseline	Platform / FinOps	On-Demand steady-state compute	$2,900/month AWS savings candidate	Medium (validate in CUR)	Rightsize, then evaluate Compute Savings Plan vs EC2 Instance Savings Plan / RI	P0
EC2 app nodes	Platform	Oversized always-on instances	$1,100/month example exposure	High (utilization metrics)	Rightsize before commitment purchase	P0
RDS / storage	Database owner	Overprovisioned storage/IOPS and stable DB usage	$740/month example exposure	Medium (needs storage metrics)	Rightsize storage and evaluate database commitment separately	P1
NAT / CloudWatch	Platform	Verbose logs, trace payloads, and repeat NAT path	$620/month example exposure	High (billing line items)	Set retention, sample traces, review VPC endpoints	P1
Support triage	Support Ops	Retry storm after malformed tool responses	$2,400/month AI exposure	High (usage logs)	Cap retries and add circuit breaker	P0
RAG support search	Engineering	Context inflation and oversized retrieval payloads	$3,100/month AI exposure	Medium (token logs, partial window)	Optimize context and reroute retrieval	P0
Sales research agent	Sales Ops	Premium model overuse for low-risk tasks	$950/month AI exposure	Medium (model mix sample)	Reroute to cheaper model tier	P1
Ticket summarization	Finance / Ops	No blended AWS+AI cost-per-resolution reporting	$1,200/month decision exposure	Unknown (needs cost-per-resolution telemetry)	Keep, but add unit economics reporting	P1
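The exposure column can be reconciled with the headline figures by summing the rows. This is a minimal sketch using the synthetic values from the table; the grouping into AWS rows, AI rows, and a separate reporting gap is how this example reads the table, not a prescribed method.

```python
# Estimated monthly exposure per row, copied from the table above (USD).
aws_rows = {
    "EC2 / ECS baseline": 2900,
    "EC2 app nodes": 1100,
    "RDS / storage": 740,
    "NAT / CloudWatch": 620,
}
ai_rows = {
    "Support triage": 2400,
    "RAG support search": 3100,
    "Sales research agent": 950,
}
reporting_gap = 1200  # Ticket summarization row: decision exposure, not direct spend

aws_total = sum(aws_rows.values())
ai_without_gap = sum(ai_rows.values())
ai_with_gap = ai_without_gap + reporting_gap

print(f"AWS rows sum: ${aws_total:,}")                              # AWS rows sum: $5,360
print(f"AI exposure: ${ai_without_gap:,} to ${ai_with_gap:,}")      # AI exposure: $6,450 to $7,650
```

The AWS sum matches the upper end of the stated savings candidate range, and the AI range matches the margin exposure with and without the reporting gap.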

Margin impact

Total monthly spend reviewed: $18,920 synthetic baseline
AWS optimization candidate: $4,300–5,400/month before validation and commitment decision
AI margin exposure: $6,450–7,650/month across retry, context, model routing, and reporting gaps
Primary risk: AWS and AI costs scaling faster than resolved tickets and client value
Key metric missing: blended AWS+AI cost per resolved ticket
Decision required: rightsize, cap, commit, optimize, reprice, or govern before usage scales further

Owner map

Agency engineering

Owns retry limits, context limits, routing logic, and quick wins before handoff.

Platform / FinOps

Owns AWS Cost Explorer review, EC2 rightsizing, Savings Plans / RI scenario modeling, log retention, and network cost cleanup.

Agency PM / operator

Owns handoff docs, scope boundaries, and Phase 2 remediation list.

Client owner

Owns provider accounts, budget approvals, and production monitoring after delivery.

Client finance

Owns commitment approval, monthly cost targets, margin thresholds, and whether to reprice or limit high-cost workflows.

Client ops / support

Owns resolution quality, escalation rules, and workflow usage assumptions.

Handoff risks

AWS payer account, Cost Explorer access, billing owners, and commitment approval path are not documented.
Commitment purchase timing is unclear; buying before rightsizing may lock in waste.
Provider account ownership is not documented for AWS, OpenAI, vector search, and observability.
Budget alert response path is unclear.
Client has no operating cost assumptions for pilot vs rollout usage.
Support scope does not define who investigates bill spikes after handoff.
Retries, logs, model calls, vector usage, NAT paths, and background jobs need caps before delivery.

Handoff checklist

Document AWS payer account owner, billing admins, Cost Explorer access, and commitment approval process.
Document which usage is eligible for Savings Plans / RI evaluation and which usage must remain On-Demand.
Document expected monthly cost range and assumptions.
Document spend alert thresholds and escalation owner.
Document retry caps, model routing, rate limits, EC2 scaling limits, log retention, and background job limits.
Document what to rightsize, commit, monitor, reprice, cap, rebuild, or kill in Phase 2.
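The spend alert thresholds in the checklist could be documented alongside a rule like the one sketched below, which projects month-end spend from month-to-date spend. The 80% and 100% thresholds, the linear run-rate projection, and the level names are hypothetical choices for illustration.

```python
def projected_month_end(month_to_date: float, day_of_month: int, days_in_month: int = 30) -> float:
    """Linear run-rate projection of month-end spend (a deliberate simplification)."""
    if not 1 <= day_of_month <= days_in_month:
        raise ValueError("day_of_month must be within the month")
    return month_to_date / day_of_month * days_in_month


def alert_level(projected: float, monthly_budget: float,
                warn_at: float = 0.8, escalate_at: float = 1.0) -> str:
    """Map projected spend to a documented response: ok, warn, or escalate."""
    ratio = projected / monthly_budget
    if ratio >= escalate_at:
        return "escalate"
    if ratio >= warn_at:
        return "warn"
    return "ok"


# Example against the $18,920 synthetic baseline used as a monthly budget.
projection = projected_month_end(month_to_date=8000, day_of_month=15)
print(alert_level(projection, monthly_budget=18920))  # warn
```

Each level should map to a named escalation owner in the handoff docs, so that an alert has someone responsible for responding to it.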

30-day action plan

Export Cost Explorer/CUR data, tag workflows, add retry caps, and assign AWS/AI owners.

Rightsize EC2/RDS, reduce logs/NAT drag, summarize support history, and cap retrieval windows.

Run Savings Plans / RI scenarios after rightsizing and add blended AWS+AI cost per resolved ticket.

Decide what to commit, keep On-Demand, reduce, reprice, cap, or govern before handoff.

Final executive recommendation

Proceed with handoff only after P0 controls are added and the client receives an operating-cost appendix. Keep the AI assistant, but first rightsize obvious AWS waste, model conservative Savings Plans / RI scenarios, cap risky AI workflows, assign budget owners, and add blended AWS+AI cost-per-resolution reporting.

This page shows the deliverable format with representative data. Real client reviews use actual billing exports, usage data, workflow traces, handoff docs, and owner mapping.
