Lens — How

How the tool actually behaves — not how the docs say it should

Each page is a behavioral deep-dive into one system: the execution model you need to predict what it'll do, the failure modes you'll hit in production, and the operational characteristics the getting-started guide never mentions.

What docs don't say

sort-merge join

left side

held in memory

⟷

right side

held in memory

docs say "join" — not "hold both sides in memory"
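To make the panel concrete, here is a minimal sketch of a sort-merge join in plain Python. It is an illustration under stated assumptions, not how any particular engine implements it: the function name `sort_merge_join` and the tuple-shaped rows are hypothetical. Note that `sorted()` materializes both inputs, which is the memory cost the panel points at.

```python
def sort_merge_join(left, right, key):
    """Inner join two lists of rows on key(row) by sorting both, then merging."""
    left = sorted(left, key=key)    # full copy of the left side in memory
    right = sorted(right, key=key)  # full copy of the right side in memory
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = key(left[i]), key(right[j])
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of equal keys on each side, then emit their cross product.
            i_end = i
            while i_end < len(left) and key(left[i_end]) == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and key(right[j_end]) == lk:
                j_end += 1
            for l_row in left[i:i_end]:
                for r_row in right[j:j_end]:
                    out.append((l_row, r_row))
            i, j = i_end, j_end
    return out


# Hypothetical rows: (join_key, payload)
left_rows = [(2, "b"), (1, "a"), (2, "c")]
right_rows = [(2, "x"), (3, "y")]
joined = sort_merge_join(left_rows, right_rows, key=lambda row: row[0])
print(joined)  # [((2, 'b'), (2, 'x')), ((2, 'c'), (2, 'x'))]
```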

Execution

Apache Spark

Stages, tasks, shuffles, and spills. How Spark distributes work, where it breaks, and why the query plan matters more than the cluster size.

89% of job time in one shuffle

Read →

Apache Flink

Streaming-first execution with checkpoints, backpressure, and state. How Flink manages continuous load differently from batch engines.

upstream: 100k msg/s

backpressure

downstream: 38k msg/s

state size: 14 GB → growing

Read →

Hadoop MapReduce

The batch execution model that shaped everything after it. Why modern engines still carry its trade-offs.

map phase: 4 min

↓ shuffle write: 48 GB to disk

↓ shuffle read: 48 GB from disk

reduce phase: 52 min

disk I/O dominates: every shuffled byte is written to disk, then read back

Read →

Continuity

Apache Kafka

Partitions, consumer groups, offsets, retention — and where exactly-once delivery gets complicated.

topic: orders partition: 0

producer offset: 4,821,044

consumer offset: 4,819,211

lag: 1,833 ← growing

delivery guarantee: at-least-once
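The lag figure in the panel is the gap between the newest offset written to the partition and the offset the consumer group has committed. A minimal sketch of that arithmetic follows. The helper name `consumer_lag` is hypothetical, and a real check would fetch both offsets from the broker rather than hard-code them.

```python
def consumer_lag(producer_offset: int, consumer_offset: int) -> int:
    """Messages written to the partition that the consumer has not yet committed."""
    return max(producer_offset - consumer_offset, 0)


# Values taken from the panel (topic: orders, partition: 0).
lag = consumer_lag(4_821_044, 4_819_211)
print(lag)  # 1833
```

A single lag reading says little on its own. The "growing" annotation only shows up when you sample it repeatedly and compare.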

Read →

RabbitMQ

Queue-based routing, acknowledgments, prefetch, and dead-letter exchanges. The queue model versus the log model — and why the distinction matters.

exchange: orders (direct)

→ queue: processing ✓ acked

→ queue: notify ✗ nack → DLX

prefetch: 10

unacked: 10 ← blocked
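The "blocked" state in the panel follows from how a prefetch limit works: the broker stops pushing messages to a consumer once the number of unacknowledged deliveries reaches the limit. The sketch below is a toy model of that rule, not a RabbitMQ client. The class `PrefetchChannel` and its methods are hypothetical names.

```python
from collections import deque


class PrefetchChannel:
    """Toy model of one consumer channel with a prefetch limit."""

    def __init__(self, prefetch: int):
        self.prefetch = prefetch
        self.ready = deque()   # messages waiting in the queue
        self.unacked = set()   # delivered but not yet acknowledged

    def publish(self, msg):
        self.ready.append(msg)

    def deliver(self):
        """Push messages to the consumer until the prefetch window is full."""
        delivered = []
        while self.ready and len(self.unacked) < self.prefetch:
            msg = self.ready.popleft()
            self.unacked.add(msg)
            delivered.append(msg)
        return delivered

    def ack(self, msg):
        self.unacked.remove(msg)


ch = PrefetchChannel(prefetch=10)
for i in range(15):
    ch.publish(i)

first = ch.deliver()      # 10 messages go out; the window is now full
blocked = ch.deliver()    # nothing is delivered: unacked == prefetch
ch.ack(0)                 # one acknowledgment frees one slot
after_ack = ch.deliver()  # exactly one more message is delivered
print(len(first), len(blocked), after_ack)  # 10 0 [10]
```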

Read →

Storage

HDFS

Block replication, NameNode limits, and why this file system shaped the first generation of data lakes.

NameNode heap: 97% ← single point

block: events-2024-01.parquet

datanode-04: ✓ healthy

datanode-07: ✓ healthy

datanode-12: ~ stale replica

Read →

Object Storage

Eventual consistency, listing latency, small-file costs — the cloud storage substrate under most modern data platforms.

PUT bucket/file.parquet → 200 OK

LIST bucket/ immediately after

→ file.parquet not yet visible

consistency window: ms to seconds on eventually consistent stores (Amazon S3 has been strongly consistent since December 2020)

10k files × 1 KB = 10k API calls
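The small-file cost in the last line of the panel is a request-count problem, not a byte-count problem. A rough sketch, assuming one GET per object: the 128 MB compaction target is a hypothetical value chosen for illustration, and `request_count` is not part of any storage SDK.

```python
import math

KB = 1024
MB = 1024 * KB


def request_count(total_bytes: int, object_size: int) -> int:
    """Number of GET requests needed to read total_bytes, at one GET per object."""
    return math.ceil(total_bytes / object_size)


total = 10_000 * KB                          # 10k files of 1 KB each, about 10 MB
small = request_count(total, 1 * KB)         # layout from the panel
compacted = request_count(total, 128 * MB)   # hypothetical compacted layout
print(small, compacted)  # 10000 1
```

The same 10 MB costs ten thousand round trips in one layout and one in the other, which is why compaction shows up in nearly every table format built on object storage.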

Read →

Table Formats

Apache Iceberg

Snapshot isolation, partition evolution, metadata scaling — transactional guarantees on top of object storage, and the fan-out that grows as your table does.

snapshot id: 8842732

partition evolved: no rewrite needed

concurrent writes:

writer_a: committed → 8842731

writer_b: retry → 8842732

Read →

Delta Lake

Transaction log mechanics, write conflicts, and the compaction trade-off — what ACID on object storage actually costs, and where it diverges from Iceberg.

_delta_log/

00000.json op: WRITE v0

00001.json op: WRITE v1

concurrent write:

writer_a: committed

writer_b: conflict → retry
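The commit-or-retry pattern in the panel can be sketched as optimistic concurrency over a versioned log: each writer tries to create the next version, and whoever loses the race re-reads and tries again. This is a simplified in-memory model, not Delta Lake's implementation. `InMemoryLog`, `put_if_absent`, and `commit` are hypothetical names, and the sketch skips the step where a real writer checks whether the winning commit logically conflicts with its own changes.

```python
class InMemoryLog:
    """Stand-in for a log directory where each version may be written once."""

    def __init__(self):
        self._entries = {}  # version number -> committed action

    def latest_version(self) -> int:
        return max(self._entries, default=-1)

    def put_if_absent(self, version: int, action: str) -> bool:
        """Atomically create the entry; fail if that version already exists."""
        if version in self._entries:
            return False
        self._entries[version] = action
        return True


def commit(log: InMemoryLog, action: str, read_version: int, max_retries: int = 3) -> int:
    """Try to commit at read_version + 1, retrying at the next free version on conflict."""
    version = read_version + 1
    for _ in range(max_retries + 1):
        if log.put_if_absent(version, action):
            return version
        # Lost the race: another writer took this version. Re-read and retry.
        version = log.latest_version() + 1
    raise RuntimeError("gave up after repeated write conflicts")


log = InMemoryLog()
log.put_if_absent(0, "WRITE")  # v0, as in the panel
log.put_if_absent(1, "WRITE")  # v1

# Both writers start from the same snapshot (version 1).
a = commit(log, "writer_a", read_version=1)
b = commit(log, "writer_b", read_version=1)
print(a, b)  # 2 3
```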

Read →

Apache Hudi

Copy-on-Write vs Merge-on-Read — the write/read trade-off that determines performance at every layer, and why the incremental query model is different from the others.

table type: MOR

write: append log (fast)

read (no compaction): base + 4 logs

read (after compaction): base only

CoW rewrites entire file per update
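The read-side cost of Merge-on-Read in the panel can be sketched as a merge of a base file with its log files at query time. This is a toy model using dictionaries keyed by record key, not Hudi's file formats or API. `read_mor` and `compact` are hypothetical names.

```python
def read_mor(base: dict, logs: list) -> dict:
    """Merge-on-Read: apply each log, oldest first, on top of the base at read time."""
    merged = dict(base)
    for log in logs:
        merged.update(log)  # later writes for the same key win
    return merged


def compact(base: dict, logs: list):
    """Compaction: fold the logs into a new base so later reads skip the merge."""
    return read_mor(base, logs), []


base = {"k1": "v1", "k2": "v1"}
logs = [{"k1": "v2"}, {"k3": "v1"}, {"k1": "v3"}, {"k2": "v2"}]  # 4 log files

before = read_mor(base, logs)            # reader pays for base + 4 logs
new_base, new_logs = compact(base, logs)
after = read_mor(new_base, new_logs)     # reader pays for base only
print(before == after, len(new_logs))    # True 0
```

The query result is identical before and after compaction. What changes is how much work each read has to do to produce it.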

Read →

Analysis

PostgreSQL

Planner behavior, vacuum mechanics, connection limits — and where Postgres stops scaling for analytical workloads.

EXPLAIN SELECT * FROM events WHERE user_id = 42

Seq Scan on events (18M rows)

index exists — planner chose scan

dead tuples: 34% bloat

connections: 498 / 500

Read →

BigQuery

Slot-based execution, columnar scanning, and the cost model that makes full table scans expensive in ways you don't see until the bill.

SELECT * WHERE date = '2024-01-01'

no partitioning: 2.1 TB billed

PARTITION BY date: 0.04 TB billed

WHERE clause ≠ partition pruning

Read →

Snowflake

Virtual warehouse behavior, auto-suspend, clustering, and where the separation of storage and compute leaks.

warehouse: X-Large, 16 credits/h

query ran: 2 min

idle (auto-suspend: 10m): +8 min

credits (query): 0.53

credits (idle): 2.13 ← 4×
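Working the panel's inputs through (16 credits/h, a 2-minute query, 8 minutes of idle time before suspend) shows where the credits go. The sketch assumes billing that is linear in running time and ignores any minimum billing increment. The function name `credits_used` is hypothetical.

```python
CREDITS_PER_HOUR = 16  # X-Large, from the panel


def credits_used(minutes: float, credits_per_hour: float = CREDITS_PER_HOUR) -> float:
    """Credits consumed by a warehouse that stays running for the given minutes."""
    return credits_per_hour * minutes / 60


query = credits_used(2)  # time spent actually running the query
idle = credits_used(8)   # time spent running with nothing to do
print(round(query, 2), round(idle, 2), round(idle / query))  # 0.53 2.13 4
```

Under linear billing the idle-to-query credit ratio equals the idle-to-query time ratio, so shortening auto-suspend is the direct lever.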

Read →

Redshift

Distribution styles, sort keys, WLM queues — the operational surface area that managed services don't fully manage.

join: events ⟷ users

diststyle EVEN: 48 GB shuffle

diststyle KEY(user_id): 0.2 GB shuffle

WLM slots: 5 total

queued queries: 2 waiting

Read →

Trino

Federation across heterogeneous sources, predicate pushdown limits, and why the coordinator becomes the bottleneck before anything else does.

federated query across 2 sources

iceberg connector: pushdown applied

postgres connector: full scan, 3.2M rows

coordinator heap: 91% ← bottleneck

pushdown only as good as the connector

Read →

Orchestration

Apache Airflow

Execution dates, catchup, task retries — and why the scheduler is often the bottleneck nobody suspects.

dag: daily_pipeline catchup=True

today: 2024-03-15

last success: 2024-02-09

backfill runs queued: 34

34 runs triggered — scheduler stalled
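The count of 34 can be reproduced from the two dates in the panel. The sketch assumes a daily schedule, that "last success" is the logical date of the last successful run, and that the run for logical date D only becomes schedulable once day D is over. `missed_daily_runs` is a hypothetical helper, not an Airflow API.

```python
from datetime import date, timedelta


def missed_daily_runs(last_success: date, today: date) -> list:
    """Logical dates a daily DAG with catchup=True would queue on its next scheduler pass."""
    runs = []
    day = last_success + timedelta(days=1)
    while day < today:  # today's own interval has not finished yet
        runs.append(day)
        day += timedelta(days=1)
    return runs


runs = missed_daily_runs(date(2024, 2, 9), date(2024, 3, 15))
print(len(runs), runs[0], runs[-1])  # 34 2024-02-10 2024-03-14
```

All 34 become eligible at the same moment, which is the burst the scheduler has to absorb.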

Read →

Dagster

Assets instead of tasks — why defining what your pipeline produces changes debugging, dependency tracking, and what breaks when you modify the graph.

asset: user_metrics (modified)

↑ depends on: cleaned_events

↑ depends on: raw_events

materialization impact:

user_metrics: stale → re-run

cleaned_events: unchanged → skip

raw_events: unchanged → skip

Read →

Transformation

dbt

Ref resolution, incremental models, test contracts — and where dbt's simplicity creates hidden coupling.

model: fct_orders (incremental)

ref('stg_orders'): ✓ tested

ref('int_discounts'): ✗ no test

strategy: merge, key: order_id

order_id not unique → duplicates

Read →

SQLMesh

State-aware execution that knows what changed and only re-runs what must — a genuinely different behavioral contract from dbt's compile-and-run model.

model: fct_orders [logic changed]

impacted models:

fct_orders: re-run

fct_revenue: re-run (downstream)

dim_users: skip (unchanged)

state-aware — only reruns what changed
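The re-run/skip decision in the panel amounts to a downstream walk of the model graph starting from whatever changed. The sketch below shows that idea only. It is not SQLMesh's API or its actual change-categorization logic, and the `DOWNSTREAM` mapping and `impacted` function are hypothetical.

```python
from collections import deque

# Hypothetical dependency graph: model -> models that read from it.
DOWNSTREAM = {
    "fct_orders": ["fct_revenue"],
    "fct_revenue": [],
    "dim_users": [],
}


def impacted(changed, downstream):
    """Return the changed models plus everything reachable downstream of them."""
    seen = set(changed)
    queue = deque(changed)
    while queue:
        model = queue.popleft()
        for child in downstream.get(model, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


to_run = impacted(["fct_orders"], DOWNSTREAM)
plan = {model: ("re-run" if model in to_run else "skip") for model in DOWNSTREAM}
print(plan)  # {'fct_orders': 're-run', 'fct_revenue': 're-run', 'dim_users': 'skip'}
```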

Read →

Discovery

Apache Atlas

Lineage tracking and classification — and where metadata graphs become stale faster than you expect.

lineage (Atlas):

raw_events → clean → metrics_v2