Pipeline Seams
Where did the contract break?
Every pipeline is a chain of implicit contracts between stages. Locate the broken handoff — what each stage assumed it was receiving, and what it actually got.
Generation → Ingestion
The source makes assumptions the ingestor doesn't know about — client-side clocks, undeclared currencies, non-unique IDs. What the source owes the pipeline starts here.
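One way to make these source assumptions visible is to check them at the ingestion boundary. The sketch below is illustrative only: the `Event` fields, the `contract_violations` helper, and the five-minute skew threshold are all assumptions, not part of any particular pipeline.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Event:
    # Hypothetical record shape, chosen for illustration.
    event_id: str
    amount: float
    currency: Optional[str]  # None means the source never declared one
    client_ts: datetime      # timestamp taken from the client's clock


def contract_violations(events, received_at, max_skew_seconds=300):
    """Return (event_id, reason) pairs for each broken source assumption."""
    violations = []
    seen_ids = set()
    for e in events:
        if e.event_id in seen_ids:
            violations.append((e.event_id, "duplicate id"))
        seen_ids.add(e.event_id)
        if e.currency is None:
            violations.append((e.event_id, "undeclared currency"))
        if e.client_ts.tzinfo is None:
            # A naive timestamp cannot be compared safely with server time.
            violations.append((e.event_id, "naive client timestamp"))
        elif abs((received_at - e.client_ts).total_seconds()) > max_skew_seconds:
            violations.append((e.event_id, "client clock skew"))
    return violations


# Example input: a repeated id, a missing currency, a stale clock, a naive clock.
received = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
sample = [
    Event("a", 10.0, "USD", received),
    Event("a", 5.0, None, datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc)),
    Event("b", 1.0, "EUR", datetime(2024, 1, 1, 12, 0)),
]
```

Calling `contract_violations(sample, received)` reports each violated assumption per event, which turns the implicit contract into something the ingestor can log or reject on.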
Ingestion → Storage
The delivery boundary. An ACK doesn't mean the write was committed, and committed doesn't mean durable. This is where duplicates, loss, and ordering violations first enter the system.
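A common response to duplicates at this seam is to make the write idempotent, so a redelivered message has no additional effect. The following is a minimal in-memory sketch; `IdempotentSink` and `deliver_at_least_once` are hypothetical names, and a real store would need a durable unique key rather than a dictionary.

```python
class IdempotentSink:
    """Toy storage layer that ignores writes for keys it has already stored."""

    def __init__(self):
        self._rows = {}

    def write(self, key, value):
        """Return True on first write, False when the key was already stored."""
        if key in self._rows:
            return False
        self._rows[key] = value
        return True

    def count(self):
        return len(self._rows)


def deliver_at_least_once(sink, messages, lost_acks):
    """Simulate a producer that resends any message whose ACK was lost.

    `lost_acks` is the set of keys whose first ACK never reached the producer.
    Returns the total number of write attempts made.
    """
    attempts = 0
    for key, value in messages:
        sink.write(key, value)
        attempts += 1
        if key in lost_acks:
            # The write succeeded, but the producer cannot know that, so it retries.
            sink.write(key, value)
            attempts += 1
    return attempts
```

With three messages and one lost ACK, the producer makes four attempts while the sink still holds three rows. Without the key check, the retry would have become a duplicate.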
Storage → Processing
Processing inherits the layout decisions storage made. Partition key mismatches, format assumptions, and schema drift become cost and correctness problems at query time.
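To illustrate why a partition key mismatch turns into cost, the sketch below counts how many partitions a simple equality filter must read. It is a toy model under stated assumptions: `partition_by` and `partitions_scanned` are hypothetical helpers, and real engines prune partitions using metadata and statistics that are not modeled here.

```python
from collections import defaultdict


def partition_by(rows, key):
    """Group rows into partitions keyed by the value of `key`."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[key]].append(row)
    return dict(parts)


def partitions_scanned(parts, partition_key, filter_key, filter_value):
    """Count partitions an equality filter has to read in this toy model."""
    if filter_key == partition_key:
        # The filter matches the layout: at most one partition is read.
        return 1 if filter_value in parts else 0
    # The filter ignores the layout: every partition must be read.
    return len(parts)


# Example data laid out by date, as storage might have decided.
rows = [
    {"event_date": "2024-01-01", "customer_id": "c1"},
    {"event_date": "2024-01-02", "customer_id": "c1"},
    {"event_date": "2024-01-03", "customer_id": "c2"},
]
by_date = partition_by(rows, "event_date")
```

Filtering `by_date` on `event_date` touches one partition, while filtering on `customer_id` touches all three, even though both queries return a small result.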
Processing → Modeling
What arrives at the model layer — duplicates, nulls, ambiguous grain — was decided upstream. The modeling layer can't fix what processing already delivered.
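A grain check is one way to see what processing actually delivered before modeling builds on it. The sketch below assumes rows are dictionaries and that the grain is declared as a tuple of column names; `grain_report` is a hypothetical helper.

```python
from collections import Counter


def grain_report(rows, grain):
    """Summarize null keys and duplicate keys for a declared grain.

    `grain` is a tuple of column names that should uniquely identify a row.
    """
    keys = [tuple(row.get(col) for col in grain) for row in rows]
    null_keys = sum(1 for key in keys if any(v is None for v in key))
    counts = Counter(key for key in keys if all(v is not None for v in key))
    duplicates = {key: n for key, n in counts.items() if n > 1}
    return {"rows": len(rows), "null_keys": null_keys, "duplicates": duplicates}


# Example: order lines where one key repeats and one key is incomplete.
order_lines = [
    {"order_id": 1, "line": 1},
    {"order_id": 1, "line": 1},
    {"order_id": 1, "line": 2},
    {"order_id": None, "line": 1},
]
```

A non-empty `duplicates` map or a non-zero `null_keys` count means the declared grain does not hold, and the fix belongs upstream rather than in the model.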
Modeling → Serving
Consumers assume freshness, stable semantics, and correctness the model layer never explicitly promised. Freshness SLAs and metric definitions are the contracts at this seam.
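A freshness SLA only becomes a contract once something measures it. As a minimal sketch, assuming the serving layer can report when a dataset was last loaded, the hypothetical `freshness_status` helper below compares that lag with an agreed threshold.

```python
from datetime import datetime, timedelta, timezone


def freshness_status(last_loaded_at, now, sla):
    """Compare the data's age against an agreed freshness SLA."""
    lag = now - last_loaded_at
    return {"lag": lag, "within_sla": lag <= sla}


# Example: data last loaded 90 minutes before the check.
checked_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
loaded_at = datetime(2024, 1, 1, 10, 30, tzinfo=timezone.utc)
```

With these example values, a one-hour SLA is breached and a two-hour SLA is met. The threshold itself is the part consumers and producers have to agree on explicitly.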
Orchestration ↔ All Stages
Orchestration manages timing and dependencies across every seam. Undeclared dependencies, schedule assumptions, and retry logic are what silently corrupt pipelines: each can let a stage run on partial or stale input without raising an error.
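Undeclared dependencies can be surfaced by comparing what each task reads with what it says it runs after. The sketch below assumes a hypothetical task description with `reads`, `writes`, and `after` sets, and it checks direct declarations only, not transitive ones. `graphlib.TopologicalSorter` is from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter


def undeclared_dependencies(tasks):
    """Find tasks that read a dataset without declaring its producer.

    `tasks` maps a task name to {"reads": set, "writes": set, "after": set}.
    Returns (task, producing_task, dataset) triples.
    """
    producer = {ds: name for name, t in tasks.items() for ds in t["writes"]}
    missing = []
    for name, t in tasks.items():
        for ds in sorted(t["reads"]):
            upstream = producer.get(ds)
            if upstream is not None and upstream != name and upstream not in t["after"]:
                missing.append((name, upstream, ds))
    return missing


def run_order(tasks):
    """Return one valid execution order based only on declared dependencies."""
    graph = {name: t["after"] for name, t in tasks.items()}
    return list(TopologicalSorter(graph).static_order())


# Example: "report" reads the clean table but relies on the schedule to run last.
example_tasks = {
    "ingest": {"reads": set(), "writes": {"raw"}, "after": set()},
    "clean": {"reads": {"raw"}, "writes": {"clean_table"}, "after": {"ingest"}},
    "report": {"reads": {"clean_table"}, "writes": set(), "after": set()},
}
```

Here `report` has no declared upstream, so a scheduler is free to run it before `clean` finishes. The check flags that gap before a late or retried run exposes it.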