2023 · bug · Corporation

Amazon Prime Video — The Per-Frame State Machine

Roughly ten times the necessary infrastructure cost, given that the rewrite cut costs by 90%. Published as a 'success story' rather than a post-mortem.

Root Cause

The video quality monitoring service processed every frame through individual AWS Step Functions state transitions. Step Functions is designed for orchestration, not high-frequency data processing.

Aftermath

The team moved to a monolith and reduced costs by 90%, then published a blog post about it. ThePrimeagen's reaction video went viral, highlighting the irony of Amazon not understanding its own products.

The Incident

Amazon Prime Video's audio/video quality monitoring service was built on AWS Step Functions and Lambda. The service checked every video stream for quality defects — dropped frames, corruption, block artifacts.

The architecture processed every frame of every stream through individual Step Functions state transitions. Step Functions charges per state transition. At video scale (24 to 30 frames per second per stream), this meant on the order of 100,000 state transitions per stream per hour, and millions over a day of continuous streaming.
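The scale is easier to see as arithmetic. The sketch below assumes exactly one state transition per frame and uses an illustrative per-transition price. Both `transitions_per_frame` and `ASSUMED_PRICE_PER_TRANSITION` are assumptions for illustration, not figures from the Prime Video post, so check current AWS pricing before relying on any dollar amount.

```python
SECONDS_PER_HOUR = 3600

# Assumption: an illustrative price per state transition in USD.
# This is NOT taken from the incident write-up; verify against current pricing.
ASSUMED_PRICE_PER_TRANSITION = 0.000025


def transitions_per_stream_hour(fps: int, transitions_per_frame: int = 1) -> int:
    """State transitions generated by one stream in one hour."""
    return fps * SECONDS_PER_HOUR * transitions_per_frame


def cost_per_stream_hour(
    fps: int,
    price_per_transition: float = ASSUMED_PRICE_PER_TRANSITION,
    transitions_per_frame: int = 1,
) -> float:
    """Orchestration cost for one stream-hour under the assumed price."""
    return transitions_per_stream_hour(fps, transitions_per_frame) * price_per_transition


if __name__ == "__main__":
    for fps in (24, 30):
        count = transitions_per_stream_hour(fps)
        cost = cost_per_stream_hour(fps)
        print(f"{fps} fps: {count} transitions/hour, about ${cost:.2f}/hour (assumed price)")
```

The per-stream number looks small in isolation. It only becomes a problem once it is multiplied by the number of concurrent streams and hours of monitoring, which is exactly the multiplication a per-frame design forces.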

The Architecture

```
Video stream → Step Function → Lambda (per frame) → S3 → Lambda → SNS
```

Each frame triggered a state machine transition. Each transition cost money. Each Lambda invocation could incur a cold start. Step Functions is designed for orchestration workflows (approve this order, route this ticket), not high-frequency data processing.
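One way to see the mismatch is to count how many orchestrated steps a minute of video needs at different granularities. This is a minimal sketch: `orchestration_steps` and the batch sizes are hypothetical and are not the team's actual design.

```python
from math import ceil


def orchestration_steps(frame_count: int, frames_per_step: int) -> int:
    """Number of orchestrated steps when each step handles a batch of frames."""
    if frames_per_step < 1:
        raise ValueError("frames_per_step must be at least 1")
    return ceil(frame_count / frames_per_step)


FRAMES_PER_MINUTE = 30 * 60  # one minute of video at 30 fps

per_frame = orchestration_steps(FRAMES_PER_MINUTE, 1)       # one step per frame
per_second = orchestration_steps(FRAMES_PER_MINUTE, 30)     # one step per second of video
per_minute = orchestration_steps(FRAMES_PER_MINUTE, 1800)   # one step per minute of video
```

The work done on the frames is identical in all three cases. Only the number of billed, orchestrated hand-offs changes, which is why the granularity of orchestration matters more than the choice of orchestrator.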

The "Fix"

The team collapsed the distributed architecture into a single monolithic process. Same logic. Same quality checks. 90% cost reduction.
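As a rough illustration of what "a single process" means here, the sketch below runs every detector over every frame inside one in-memory loop, with no per-frame orchestration step or intermediate storage hop. Everything in it is hypothetical: the `Frame` type, the two toy detectors, and `monitor` are stand-ins, not the Prime Video team's code or its real defect checks.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Optional, Tuple


@dataclass(frozen=True)
class Frame:
    index: int
    pixels: bytes  # stand-in for decoded frame data


# A detector returns a defect label, or None if the frame looks fine.
Detector = Callable[[Frame], Optional[str]]


def detect_empty_frame(frame: Frame) -> Optional[str]:
    """Hypothetical check: a frame with no data at all."""
    return "empty_frame" if len(frame.pixels) == 0 else None


def detect_black_frame(frame: Frame) -> Optional[str]:
    """Hypothetical check: a non-empty frame whose bytes are all zero."""
    if frame.pixels and not any(frame.pixels):
        return "black_frame"
    return None


def monitor(
    frames: Iterable[Frame], detectors: List[Detector]
) -> Iterator[Tuple[int, str]]:
    """Run all detectors on each frame in-process and yield (index, defect)."""
    for frame in frames:
        for detect in detectors:
            defect = detect(frame)
            if defect is not None:
                yield frame.index, defect


sample = [Frame(0, b"\x10\x20"), Frame(1, b""), Frame(2, b"\x00\x00")]
defects = list(monitor(sample, [detect_empty_frame, detect_black_frame]))
```

The per-frame hand-off is now a function call rather than a billed state transition, which is the structural change the 90% figure reflects.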

They published this as a blog post titled "Scaling up the Prime Video audio/video monitoring service and reducing costs by 90%." The framing was: we discovered monoliths can be better than microservices for some workloads.

The Reaction

ThePrimeagen's response captured what the blog post didn't say: this wasn't a discovery about microservices vs monoliths. This was Amazon — the company that built and sells AWS — not understanding which of their own products was appropriate for this workload. Step Functions are for state machines with infrequent transitions, not per-frame video processing.

Why It Matters

The "microservices for everything" best practice of 2015 was the design assumption that created this disaster. The architecture made sense on a whiteboard. It made sense in a design review. It didn't make sense when applied to a data flow that emits an event for every frame of every stream. Right-size your architecture to your data flow.

Techniques
microservices overuse · per item orchestration · cost explosion