Resources — books, papers, blogs
Curated, not exhaustive. The temptation is to bookmark 200 things; you'll read 5. Pick the canonical one per area and go deep.
Books — buy/print these 5
| Book | Author | Why |
|---|---|---|
| Designing Data-Intensive Applications (DDIA) | Martin Kleppmann | The single most important book for distributed systems interviews. Read all 12 chapters; chapters 5–9 are core. |
| Designing Machine Learning Systems | Chip Huyen | Great companion to ML system design interviews. Less depth than DDIA but covers the ML angle DDIA doesn't. |
| AI Engineering | Chip Huyen (2024) | The 2024 follow-up; covers LLM serving, RAG, fine-tuning, evals. Best one-stop for "how do real LLM products work." |
| Cracking the Coding Interview / Elements of Programming Interviews | McDowell / Aziz | One of these for pattern review. EPI is harder. |
| Hands-On Large Language Models | Alammar & Grootendorst | Visual-first walk-through of every transformer/LLM concept. Great refresher. |
Books — borrow / skim
- Database Internals (Petrov) — for storage engine depth (LSM, B-tree internals).
- Designing Distributed Systems (Burns) — patterns at the K8s/cloud level.
- Operating Systems: Three Easy Pieces (Remzi & Andrea Arpaci-Dusseau) — free online. Best OS book for refresh.
- Java Concurrency in Practice (Goetz) — concurrency depth, even if you don't use Java.
- C++ Concurrency in Action (Williams) — for C++ memory model depth.
- Deep Learning (Goodfellow, Bengio, Courville) — old but the math foundations are timeless.
- The Hundred-Page Machine Learning Book (Burkov) — quick refresh.
- ML System Design Interview (Aminian & Khoa) — practice book.
- Machine Learning System Design Interview (Xu) — alternate practice book.
Papers — must-read, ranked by ROI
- Attention Is All You Need (Vaswani 2017, arxiv 1706.03762) — the transformer paper. Re-read.
- GPT-3 / Language Models Are Few-Shot Learners (Brown 2020, arxiv 2005.14165) — scaling and ICL.
- Chinchilla / Training Compute-Optimal LLMs (Hoffmann 2022, arxiv 2203.15556) — scaling laws.
- InstructGPT (Ouyang 2022, arxiv 2203.02155) — original RLHF.
- Constitutional AI (Bai 2022, arxiv 2212.08073) — Anthropic.
- DPO (Rafailov 2023, arxiv 2305.18290) — preference optimization without RM.
- FlashAttention 1/2/3 (Dao 2022, 2023, 2024) — IO-aware attention.
- vLLM / PagedAttention (Kwon 2023, arxiv 2309.06180) — modern LLM serving.
- Speculative Decoding (Leviathan 2023, arxiv 2211.17192).
- ZeRO (Rajbhandari 2019, arxiv 1910.02054) — distributed training memory.
- Megatron-LM (Shoeybi 2019, arxiv 1909.08053) — tensor parallel.
- RoPE (Su 2021, arxiv 2104.09864) — rotary positional.
- YaRN (Peng 2023, arxiv 2309.00071) — RoPE extension.
- GRPO / DeepSeek Math (arxiv 2402.03300) — RL without critic.
- DeepSeek V3 (arxiv 2412.19437) — MoE + MLA + FP8 + DualPipe.
- DeepSeek R1 (arxiv 2501.12948) — reasoning-model RL recipe.
- Llama 3 (arxiv 2407.21783).
- Mamba (Gu & Dao 2023, arxiv 2312.00752) — SSMs.
- DLRM (Naumov 2019, arxiv 1906.00091) — Meta's open recsys arch.
- HSTU / Generative Recommenders (Zhai 2024, arxiv 2402.17152) — Meta's gen recsys.
- TIGER (Rajput 2023, arxiv 2305.05065) — generative retrieval.
- MMoE (Ma 2018, KDD) — multi-task with experts.
- PLE / CGC (Tang 2020, RecSys) — extension of MMoE.
- DCN-v2 (Wang 2020, arxiv 2008.13535) — feature crosses.
- Two-tower / sampled-softmax debiasing (Yi 2019, RecSys).
- Raft (Ongaro & Ousterhout 2014, USENIX ATC) — consensus.
- Spanner (Corbett 2012) — TrueTime.
- Dynamo (DeCandia 2007) — Amazon KV.
- BigTable (Chang 2006).
- FLP impossibility (Fischer, Lynch, Paterson 1985).
Blogs — high signal
- Lilian Weng — every post. The single best ML deep-dive blog.
- Cameron R. Wolfe — comprehensive paper deep-dives.
- Sebastian Raschka — Ahead of AI — weekly summaries + tutorials.
- Transformer Circuits Thread (Anthropic) — interpretability.
- vLLM blog — inference deep dives.
- EleutherAI blog — open LLM science.
- Hugging Face blog — practical recipes.
- Chip Huyen blog — ML systems + careers.
- Nathan Lambert — Interconnects — RLHF / post-training news.
- Martin Kleppmann blog — distributed systems.
- Marc Brooker (AWS) blog — distributed systems / AWS.
- Murat Demirbas — paper reviews.
- Dan Luu — engineering deep dives.
- High Scalability — case studies.
- ByteByteGo — system design illustrations.
- Hello Interview blog — current company-specific intel.
Newsletters / aggregators
- Hugging Face daily papers
- arxiv-sanity
- The Algorithmic Bridge
- SemiAnalysis (paid) — hardware + datacenter deep dives
- Runtime (Tom Krazit) — enterprise infra
- The Information (paid) — tech business intel
- Stratechery (Ben Thompson) — strategy
Video / courses
- Andrej Karpathy — "Let's build GPT" — and the rest of his Zero to Hero series.
- MIT 6.824 Distributed Systems — free, with hands-on Raft labs.
- Martin Kleppmann's distributed systems lecture series — Cambridge, free YouTube.
- Hugging Face course — practical NLP.
- Stanford CS336 — Language Modeling from Scratch
- Stanford CS329S — Machine Learning Systems Design (Chip Huyen)
- Full Stack Deep Learning
- Andrej Karpathy YouTube — everything
- Umar Jamil — paper-from-scratch implementations
Practice platforms
- LeetCode — buy 1 month premium, filter by company tag.
- NeetCode — curated 75/150 lists with explanations.
- Deep-ML — ML coding practice problems.
- Pramp — free peer mock interviews.
- interviewing.io — paid mocks with ex-FAANG.
- CoderPad sandbox — practice in the actual environment OpenAI/Anthropic use.
- Exercism — pattern drills.
Forums / community
- 1Point3Acres — Chinese tech interview forum, freshest leaks.
- Blind — anonymous tech, best comp data + culture takes.
- Glassdoor — interview questions, ratings.
- LeetCode Discuss — interview experiences.
- r/MachineLearning
- r/cscareerquestions
- Hacker News — for company news + culture.
Comp / leveling tools
- Levels.fyi — comp data. The reference.
- Candor — negotiation help.
- Rora — negotiation help.
Tools you should be fluent in
- Python — always. Both for coding rounds and ML.
- PyTorch — mandatory. Drill: write a model from scratch in 30 min.
- NumPy — for ML coding rounds where they want zero ML libraries.
- Linux CLI — basic file/process/network: top, strace, tcpdump.
- Git — beyond the basics: rebase, bisect, reflog.
- SQL — for any analytics interview part.
- Docker basics — what an image is, how it runs, layered FS.
- Kubernetes basics — pod / deployment / service abstractions; HPA.
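For the PyTorch drill above, this is roughly the shape of what a 30-minute "model from scratch" round expects — a small module, a loss, and one optimizer step. The architecture, sizes, and data here are arbitrary placeholders, not a prescribed solution:

```python
import torch
import torch.nn as nn

# A tiny two-layer MLP -- the kind of thing ML coding rounds
# expect you to write quickly from memory.
class MLP(nn.Module):
    def __init__(self, d_in: int, d_hidden: int, d_out: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_in, d_hidden),
            nn.ReLU(),
            nn.Linear(d_hidden, d_out),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = MLP(d_in=8, d_hidden=16, d_out=3)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# One full training step on random data: zero grads, forward,
# backward, optimizer step.
x = torch.randn(4, 8)
y = torch.randint(0, 3, (4,))
opt.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
opt.step()
```

The drill is less about this exact model and more about producing the zero_grad / forward / backward / step loop without hesitation.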
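For the NumPy-only rounds, a classic warm-up is scaled dot-product attention with no ML libraries. A minimal sketch — shapes and the random inputs are purely illustrative:

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    # Subtract the row max for numerical stability before exponentiating.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
    d_k = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d_k)
    return softmax(scores, axis=-1) @ v

# Illustrative shapes: sequence length 5, head dim 4.
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((5, 4)) for _ in range(3))
out = attention(q, k, v)
```

Knowing why the max-subtraction and the sqrt(d_k) scaling are there is exactly the kind of follow-up these rounds probe.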
What to skip
- Long-form crash courses. You don't have time; do targeted reading on the topics this site identifies instead.
- Becoming a Kaggle master. Not what frontier labs hire for.
- Front-end / React deep dives — no one will ask you.
- Memorizing Linux internals beyond what DDIA covers — diminishing returns.
- Reading every ICLR/NeurIPS paper — pick a curator (Lilian, Cameron, Sebastian) and let them filter.