postgres can be your data lake (pg_lake)

An in-depth engineering conversation around pg_lake, Iceberg, Postgres, DuckDB, OLAP/OLTP and more, with Postgres expert Marco Slot

Stanislav Kozlovski

Apr 09, 2026

This is an engineering conversation around pg_lake - a new OSS Postgres extension that lets you query and manage Iceberg tables directly from Postgres.

Marco Slot, who has EXTENSIVE experience, shares with us various engineering internals, like:
• how pg_lake makes analytics (literally) 100x faster
• why Postgres is architecturally terrible at analytical queries (and how vectorized execution fixes this)
• how (and why) pg_lake intercepts query plans and delegates parts of the query tree to DuckDB
• Marco's hard-won experience through a decade+ career in Postgres
• versatility as the real moat of Postgres
• the practical differences in engineering between OLTP and OLAP
• and a lot more

TIMELINE

0:02 What is pg_lake?
2:23 Postgres' 100x-slower problem and the columnar storage experiments aimed at making Postgres fast for analytics
6:00 practical examples and internals
16:20 perf internals - vectorized execution & CPU Optimization
23:00 pg_lake architecture (why DuckDB isn't embedded) and the connection-per-process issue
29:16 how pg_lake intercepts the query plan tree and delegates parts to DuckDB
41:09 Iceberg catalogs
48:24 postgres to iceberg ingestion patterns (and pg_incremental)
53:40 Marco's (long) career: early AWS, Citus, Microsoft, Crunchy Data & Snowflake
1:04:20 Marco's observations on the merging of OLTP and OLAP (and the subtle dev differences there)
1:15:30 reverse ETL
1:33:08 Iceberg as the TCP/IP for tables
1:35:00 Marco's thoughts on the "Just Use Postgres" fever

Marco

You can find Marco on:

LinkedIn: https://www.linkedin.com/in/marcoslot/
X: https://x.com/marcoslot
GitHub: https://github.com/marcoslot

Transcript

Feed this into your favorite AI for summarization, or to ask it specific questions:
https://gist.githubusercontent.com/stanislavkozlovski/65c037a8963e49d8121b25003ec94715/raw/4f51f5dcd562b42e8d511b8bc58f0fff6ad5302e/foo.md

OTHER PLATFORMS

If you found anything useful in this episode, please consider supporting our growth (so we can continue delivering valuable content). You can do this by simply liking the post and sharing it with a friend. It takes 8 seconds to do, and recording/producing this takes us 8hrs+
