Resources — books, papers, blogs

Curated, not exhaustive. The temptation is to bookmark 200 things; you'll read 5. Pick the canonical 1 per area and go deep.

Books — buy/print these 5

Book	Author	Why
Designing Data-Intensive Applications (DDIA)	Martin Kleppmann	The single most important book for distributed systems interviews. Read all 12 chapters; chapters 5–9 are core.
Designing Machine Learning Systems	Chip Huyen	Great companion to ML system design interviews. Less depth than DDIA but covers the ML angle DDIA doesn't.
AI Engineering	Chip Huyen (2024)	The 2024 follow-up; covers LLM serving, RAG, fine-tuning, evals. Best one-stop for "how do real LLM products work."
Cracking the Coding Interview / Elements of Programming Interviews	McDowell / Aziz	One of these for pattern review. EPI is harder.
Hands-On Large Language Models	Alammar & Grootendorst	Visual-first walk-through of every transformer/LLM concept. Great refresher.

Books — borrow / skim

Database Internals (Petrov) — for storage engine depth (LSM, B-tree internals).
Designing Distributed Systems (Burns) — patterns at the K8s/cloud level.
Operating Systems: Three Easy Pieces (Remzi & Andrea Arpaci-Dusseau) — free online. Best OS book for refresh.
Java Concurrency in Practice (Goetz) — concurrency depth, even if you don't use Java.
C++ Concurrency in Action (Williams) — for C++ memory model depth.
Deep Learning (Goodfellow, Bengio, Courville) — old but the math foundations are timeless.
The Hundred-Page Machine Learning Book (Burkov) — quick refresh.
ML System Design Interview (Aminian & Khoa) — practice book.
Machine Learning System Design Interview (Xu) — alternate practice book.

Papers — must-read, ranked by ROI

Attention Is All You Need (Vaswani 2017, arxiv 1706.03762) — the transformer paper. Re-read.
GPT-3 / Language Models Are Few-Shot Learners (Brown 2020, arxiv 2005.14165) — scaling and ICL.
Chinchilla / Training Compute-Optimal LLMs (Hoffmann 2022, arxiv 2203.15556) — scaling laws.
InstructGPT (Ouyang 2022, arxiv 2203.02155) — original RLHF.
Constitutional AI (Bai 2022, arxiv 2212.08073) — Anthropic.
DPO (Rafailov 2023, arxiv 2305.18290) — preference optimization without RM.
FlashAttention 1/2/3 (Dao 2022, 2023, 2024) — IO-aware attention.
vLLM / PagedAttention (Kwon 2023, arxiv 2309.06180) — modern LLM serving.
Speculative Decoding (Leviathan 2023, arxiv 2211.17192).
ZeRO (Rajbhandari 2019, arxiv 1910.02054) — distributed training memory.
Megatron-LM (Shoeybi 2019, arxiv 1909.08053) — tensor parallel.
RoPE (Su 2021, arxiv 2104.09864) — rotary positional.
YaRN (Peng 2023, arxiv 2309.00071) — RoPE extension.
GRPO / DeepSeek Math (arxiv 2402.03300) — RL without critic.
DeepSeek V3 (arxiv 2412.19437) — MoE + MLA + FP8 + DualPipe.
DeepSeek R1 (arxiv 2501.12948) — reasoning-model RL recipe.
Llama 3 (arxiv 2407.21783).
Mamba (Gu & Dao 2023, arxiv 2312.00752) — SSMs.
DLRM (Naumov 2019, arxiv 1906.00091) — Meta's open recsys arch.
HSTU / Generative Recommenders (Zhai 2024, arxiv 2402.17152) — Meta's gen recsys.
TIGER (Rajput 2023, arxiv 2305.05065) — generative retrieval.
MMoE (Ma 2018, KDD) — multi-task with experts.
PLE / CGC (Tang 2020, RecSys) — extension of MMoE.
DCN-v2 (Wang 2020, arxiv 2008.13535) — feature crosses.
Two-tower / sampled-softmax debiasing (Yi 2019, RecSys).
Raft (Ongaro & Ousterhout 2014, USENIX ATC) — consensus.
Spanner (Corbett 2012) — TrueTime.
Dynamo (DeCandia 2007) — Amazon KV.
BigTable (Chang 2006).
FLP impossibility (Fischer, Lynch, Paterson 1985).

Blogs — high signal

Lilian Weng — every post. The single best ML deep-dive blog.
Cameron R. Wolfe — comprehensive paper deep-dives.
Sebastian Raschka — Ahead of AI — weekly summaries + tutorials.
Transformer Circuits Thread (Anthropic) — interpretability.
vLLM blog — inference deep dives.
EleutherAI blog — open LLM science.
Hugging Face blog — practical recipes.
Chip Huyen blog — ML systems + careers.
Nathan Lambert — Interconnects — RLHF / post-training news.
Martin Kleppmann blog — distributed systems.
Marc Brooker (AWS) blog — distributed systems / AWS.
Murat Demirbas — paper reviews.
Dan Luu — engineering deep dives.
High Scalability — case studies.
ByteByteGo — system design illustrations.
Hello Interview blog — current company-specific intel.

Newsletters / aggregators

Hugging Face daily papers
arxiv-sanity
The Algorithmic Bridge
SemiAnalysis (paid) — hardware + datacenter deep dives
Runtime (Tom Krazit) — enterprise infra
The Information (paid) — tech business intel
Stratechery (Ben Thompson) — strategy

Video / courses

Andrej Karpathy — "Let's build GPT" — and the rest of his Zero to Hero series.
MIT 6.824 Distributed Systems — free, with hands-on Raft labs.
Martin Kleppmann's distributed systems lecture series — Cambridge, free YouTube.
Hugging Face course — practical NLP.
Stanford CS336 — Language Modeling from Scratch
Stanford CS329S — Machine Learning Systems Design (Chip Huyen)
Full Stack Deep Learning
Andrej Karpathy YouTube — everything
Umar Jamil — paper-from-scratch implementations

Practice platforms

LeetCode — buy 1 month premium, filter by company tag.
NeetCode — curated 75/150 lists with explanations.
Deep-ML — ML coding practice problems.
Pramp — free peer mock interviews.
interviewing.io — paid mocks with ex-FAANG.
CoderPad sandbox — practice in the actual environment OpenAI/Anthropic use.
Exercism — pattern drills.

Forums / community

1Point3Acres — Chinese tech interview forum, freshest leaks.
Blind — anonymous tech, best comp data + culture takes.
Glassdoor — interview questions, ratings.
LeetCode Discuss — interview experiences.
r/MachineLearning
r/cscareerquestions
Hacker News — for company news + culture.

Comp / leveling tools

Levels.fyi — comp data. The reference.
Candor — negotiation help.
Rora — negotiation help.

Tools you should be fluent in

Python — always. Both for coding rounds and ML.
PyTorch — mandatory. Drill: write a model from scratch in 30 min.
NumPy — for ML coding rounds where they want zero ML libraries.
Linux CLI — basic file/process/network. top, strace, tcpdump.
Git — beyond the basics: rebase, bisect, reflog.
SQL — for any analytics interview part.
Docker basics — what an image is, how it runs, layered FS.
Kubernetes basics — pod / deployment / service abstractions; HPA.

What to skip

Long-form crash courses. You don't have time. Targeted reading on the topics this site identifies.
Becoming a Kaggle master. Not what frontier labs hire for.
Front-end / React deep dives — no one will ask you.
Memorizing Linux internals beyond what DDIA covers — diminishing returns.
Reading every ICLR/NeurIPS paper — pick a curator (Lilian, Cameron, Sebastian) and let them filter.