Lens — Why

Every system you'll ever debug follows the same rules

Tools change. Frameworks get replaced. But the forces underneath — how computation distributes, how time misleads, how state complicates, how failures hide — those don't change. Each concept here applies regardless of which tool you're using.

Root cause trace
SparkException: OOM in sort phase
sort needs memory to complete
shuffle state is unbounded
partition count wrong for data skew
skew was never measured — assumed uniform
The fix is in the model, not the config.
Execution
Data
Distributed Systems