Olive One Bill Shock Pattern

Cloud Run Runaway Cost

When an AI workflow scales faster than its margin model.

Executive summary. Cloud Run is excellent for lightweight APIs and event-driven workloads, but teams can create bill shock when services scale without max instances, concurrency controls, request throttling, budget ownership, or workflow-level cost attribution.

Use case. An AI support workflow receives a traffic spike. Each request triggers model calls, vector lookups, logging, and retries. The Cloud Run service scales correctly from an engineering perspective, but nobody owns the unit economics of the workflow.

A. The symptom

Finance sees margin compression, but product metrics look healthy.

Cloud Run spend spikes unexpectedly.
LLM/API spend rises in parallel.
Log volume increases dramatically.
Support automation looks successful in product metrics.
Finance sees margin compression but cannot trace it to a workflow.

B. The hidden mechanism

The service scales, then every downstream dependency scales with it.

Request volume increases.
Concurrency is misconfigured or too low.
Max instances are not capped.
Retries multiply requests.
Every request triggers downstream paid services.
Logs grow with request volume.
No workflow owner is accountable for cost per resolved ticket or agent run.

C. Example cost shape

A workflow that appears to cost $900/month can become a $3,400/month workflow.

Hypothetical example numbers only:

Monthly support tickets: 30,000
AI calls per ticket: 2.5
Retries on failed requests: 3
Illustrative monthly cost: $3.4K

The visible model cost may look acceptable at first. After retry amplification, freely scaling Cloud Run instances, retained logs, vector lookups, database calls, and downstream LLM/API calls, the same support workflow can move from an apparent $900/month cost to a hypothetical $3,400/month run-rate.
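The retry amplification in this example can be made concrete with a small call-count model. The sketch below is illustrative only: it assumes each failed attempt is retried independently up to the retry cap, and the failure rates it loops over are hypothetical inputs, not figures from the example. It counts paid calls rather than dollars, because no unit prices are given here.

```python
def expected_attempts(failure_rate: float, max_retries: int) -> float:
    """Expected attempts per call when every failure is retried, up to max_retries.

    Assumption: attempts fail independently with probability failure_rate.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    # One first attempt, plus a retry for each failure, up to the cap:
    # 1 + p + p^2 + ... + p^max_retries
    return sum(failure_rate ** k for k in range(max_retries + 1))


def monthly_paid_calls(tickets: int, calls_per_ticket: float,
                       failure_rate: float, max_retries: int) -> float:
    """Expected downstream paid calls per month, including retries."""
    return tickets * calls_per_ticket * expected_attempts(failure_rate, max_retries)


if __name__ == "__main__":
    # Ticket volume, calls per ticket, and retry cap come from the example above;
    # the failure rates are hypothetical.
    for rate in (0.0, 0.1, 0.5, 1.0):
        calls = monthly_paid_calls(30_000, 2.5, rate, 3)
        print(f"failure rate {rate:.0%}: about {calls:,.0f} paid calls/month")
```

With 30,000 tickets and 2.5 calls per ticket, the baseline is 75,000 paid calls per month. If every call failed and used all 3 retries, the ceiling would be 300,000, which is why a retry cap matters as much as the per-call price.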

D. Detection signals

Look for request amplification plus downstream paid service fanout.

Cloud Run request count spike.
Instance count spike.
Concurrency too low.
Max instances missing.
5xx/429 rate increases.
Retry count increases.
Log volume increases.
Downstream LLM/vector/database cost increases.
Missing owner tag or workflow tag.
No cost-per-resolved-ticket or cost-per-agent-run metric.
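The metric-based signals above could be checked by comparing a baseline window with the current one. This is a minimal sketch, not a production detector: the WindowMetrics fields, the signal names, and the 2x spike threshold are all hypothetical, and it assumes the metrics have already been exported from monitoring and billing.

```python
from dataclasses import dataclass


@dataclass
class WindowMetrics:
    """Aggregated metrics for one time window (all field names are hypothetical)."""
    requests: int
    instances: int
    errors_5xx_429: int
    retries: int
    log_bytes: int
    downstream_cost: float  # LLM + vector + database spend, in dollars


def error_rate(m: WindowMetrics) -> float:
    """Share of requests that returned 5xx or 429."""
    return m.errors_5xx_429 / m.requests if m.requests else 0.0


def detect_signals(baseline: WindowMetrics, current: WindowMetrics,
                   spike_ratio: float = 2.0) -> list[str]:
    """Return the names of metrics that grew by at least spike_ratio."""
    def spiked(before: float, after: float) -> bool:
        # A zero baseline counts as a spike as soon as anything appears.
        return after > 0 if before == 0 else after / before >= spike_ratio

    checks = {
        "request_count_spike": (baseline.requests, current.requests),
        "instance_count_spike": (baseline.instances, current.instances),
        "error_rate_spike": (error_rate(baseline), error_rate(current)),
        "retry_spike": (baseline.retries, current.retries),
        "log_volume_spike": (baseline.log_bytes, current.log_bytes),
        "downstream_cost_spike": (baseline.downstream_cost, current.downstream_cost),
    }
    return [name for name, (before, after) in checks.items() if spiked(before, after)]


def is_runaway_pattern(signals: list[str]) -> bool:
    """True when request amplification coincides with downstream paid fanout."""
    amplification = {"request_count_spike", "retry_spike"} & set(signals)
    return bool(amplification) and "downstream_cost_spike" in signals
```

The configuration signals (missing max instances, missing owner or workflow tags) are not covered here; they need a look at the service definition rather than at time-series metrics.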

E. Scanner rule

Cloud Run Runaway Cost

Risk: High
Pattern: Cloud Run Runaway Cost
Cost Shape: Request amplification + downstream paid service fanout.
Business Risk: Margin compression and unowned AI workflow spend.
Recommended Action: Add max instances, tune concurrency, cap retries, add workflow cost attribution, add budget ownership, and track cost per successful outcome.

Scanner checklist: flag any Cloud Run service with no max instances, no explicit concurrency setting, no request timeout, no budget or service ownership labels, no log retention policy, a public unauthenticated endpoint where none is expected, a retry policy without a cap or backoff, or calls to paid AI/model/vector APIs without a budget guardrail.
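One way to picture the checklist is as a rule over a flattened service description. The sketch below is an assumption-heavy illustration: the dictionary keys (max_instances, log_retention_days, budget_guardrail, and so on) are hypothetical names for this example and are not Cloud Run API fields, and a real scanner would first have to populate them from the service definition and related settings.

```python
def scan_service(cfg: dict) -> list[str]:
    """Return checklist findings for one service description.

    cfg uses hypothetical keys; a missing key is treated as "not configured".
    """
    findings = []
    labels = cfg.get("labels", {})
    retry = cfg.get("retry_policy") or {}

    if cfg.get("max_instances") is None:
        findings.append("max instances not set")
    if cfg.get("concurrency") is None:
        findings.append("concurrency not set explicitly")
    if cfg.get("timeout_seconds") is None:
        findings.append("request timeout not set")
    for label in ("owner", "workflow", "budget"):
        if label not in labels:
            findings.append(f"missing label: {label}")
    if cfg.get("log_retention_days") is None:
        findings.append("log retention not set")
    if cfg.get("allow_unauthenticated") and not cfg.get("public_expected"):
        findings.append("unexpected public unauthenticated endpoint")
    # Only judge the retry policy when one is declared.
    if retry and (retry.get("max_attempts") is None or not retry.get("backoff")):
        findings.append("retry policy without cap or backoff")
    if cfg.get("calls_paid_apis") and not cfg.get("budget_guardrail"):
        findings.append("paid API calls without a budget guardrail")
    return findings


if __name__ == "__main__":
    example = {
        "max_instances": 20,
        "labels": {"owner": "support-platform"},
        "retry_policy": {"max_attempts": None},
        "calls_paid_apis": True,
    }
    for finding in scan_service(example):
        print("-", finding)
```

Any non-empty result would map to the High risk rating in the rule above; weighting individual findings is left out of this sketch.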

F. Recommended fix

Put economic guardrails around the workflow, not just the service.

Set max instances.
Tune concurrency.
Set timeouts.
Cap retries with exponential backoff.
Add rate limiting.
Tag by workflow, owner, and environment.
Add log retention policy.
Create budget alerts.
Track cost per business outcome.
Route low-risk tasks to a cheaper model/API where applicable.
Define a kill switch.
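Two of the fixes above, capped retries with exponential backoff and a kill switch, can live in the client code that calls paid downstream services. The following is a minimal sketch under stated assumptions: call_with_capped_retries, KillSwitchEngaged, and the kill_switch callable are hypothetical names, the default delays are placeholders, and the jitter strategy (a random delay up to the backoff ceiling) is one common choice rather than a prescribed one.

```python
import random
import time


class KillSwitchEngaged(RuntimeError):
    """Raised when the workflow has been switched off (hypothetical guardrail)."""


def call_with_capped_retries(fn, *, max_retries=3, base_delay=0.5, max_delay=8.0,
                             kill_switch=lambda: False, sleep=time.sleep):
    """Call fn, retrying at most max_retries times with exponential backoff.

    kill_switch is checked before every attempt so that an operator can stop
    paid calls mid-incident. sleep is injectable to keep tests fast.
    """
    for attempt in range(max_retries + 1):
        if kill_switch():
            raise KillSwitchEngaged("workflow disabled by kill switch")
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # Cap reached: surface the error instead of retrying.
            # The backoff ceiling doubles each attempt, bounded by max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
```

In practice fn would wrap a single model, vector, or database call, and kill_switch might read a feature flag or budget state. Retrying on every exception is a simplification; real code would usually retry only on errors known to be transient.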

G. Executive interpretation

This is a workflow economics issue.

This is not merely a Cloud Run configuration issue. The service is scaling, but the business has not defined the acceptable cost per resolved ticket, document, customer, or agent run.
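Defining an acceptable cost per outcome starts with computing the actual one. The sketch below shows the arithmetic only: the function names are hypothetical, and the cost split, resolved-ticket count, and target in the usage example are made-up placeholders (only the $3,400 total echoes the illustrative run-rate above).

```python
def cost_per_outcome(costs: dict[str, float], successful_outcomes: int) -> float:
    """Total workflow cost divided by the number of successful outcomes."""
    if successful_outcomes <= 0:
        raise ValueError("successful_outcomes must be positive")
    return sum(costs.values()) / successful_outcomes


def within_target(costs: dict[str, float], successful_outcomes: int,
                  target: float) -> bool:
    """True when the workflow meets the agreed cost per outcome."""
    return cost_per_outcome(costs, successful_outcomes) <= target


if __name__ == "__main__":
    # Hypothetical monthly split of a $3,400 run-rate; not real billing data.
    monthly_costs = {
        "cloud_run": 600.0,
        "llm_api": 2000.0,
        "vector_and_database": 400.0,
        "logging": 400.0,
    }
    resolved_tickets = 24_000  # hypothetical count of successful outcomes
    target = 0.10              # hypothetical acceptable dollars per resolved ticket
    unit_cost = cost_per_outcome(monthly_costs, resolved_tickets)
    print(f"cost per resolved ticket: ${unit_cost:.3f}")
    print("within target:", within_target(monthly_costs, resolved_tickets, target))
```

Dividing by resolved tickets rather than total requests is deliberate: retries and failed runs add cost without adding outcomes, so they show up as a higher unit cost instead of being hidden in request volume.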

H. Olive One teardown angle

How Olive One would diagnose it.

Map the workflow from request to downstream paid services.
Identify cost fanout across Cloud Run, logs, model calls, vector lookups, database calls, and retries.
Estimate unit economics by resolved ticket, agent run, customer, or successful outcome.
Detect missing controls: max instances, concurrency, retry caps, budget ownership, labels, and kill switches.
Rank fixes by margin impact, implementation effort, and operational risk.
Produce an executive decision memo: keep, optimize, cap, reroute, reprice, or kill.
