Resources — books, papers, blogs

Curated, not exhaustive. The temptation is to bookmark 200 things; you'll read 5. Pick the canonical one per area and go deep.

Books — buy/print these 5

| Book | Author | Why |
| --- | --- | --- |
| Designing Data-Intensive Applications (DDIA) | Martin Kleppmann | The single most important book for distributed systems interviews. Read all 12 chapters; chapters 5–9 are core. |
| Designing Machine Learning Systems | Chip Huyen | Great companion for ML system design interviews. Less depth than DDIA but covers the ML angle DDIA doesn't. |
| AI Engineering | Chip Huyen (2024) | The 2024 follow-up; covers LLM serving, RAG, fine-tuning, evals. Best one-stop for "how do real LLM products work." |
| Cracking the Coding Interview / Elements of Programming Interviews | McDowell / Aziz et al. | One of these for pattern review. EPI is harder. |
| Hands-On Large Language Models | Alammar & Grootendorst | Visual-first walk-through of every transformer/LLM concept. Great refresher. |

Books — borrow / skim

Papers — must-read, ranked by ROI

  1. Attention Is All You Need (Vaswani 2017, arxiv 1706.03762) — the transformer paper. Re-read it until you can write scaled dot-product attention from memory; see the sketch after this list.
  2. GPT-3 / Language Models Are Few-Shot Learners (Brown 2020, arxiv 2005.14165) — scaling and ICL.
  3. Chinchilla / Training Compute-Optimal LLMs (Hoffmann 2022, arxiv 2203.15556) — scaling laws.
  4. InstructGPT (Ouyang 2022, arxiv 2203.02155) — the RLHF recipe behind instruction-tuned GPT models.
  5. Constitutional AI (Bai 2022, arxiv 2212.08073) — Anthropic's RLAIF approach to harmlessness.
  6. DPO (Rafailov 2023, arxiv 2305.18290) — preference optimization without a separate reward model; loss sketched after this list.
  7. FlashAttention 1/2/3 (Dao 2022, 2023, 2024) — IO-aware attention.
  8. vLLM / PagedAttention (Kwon 2023, arxiv 2309.06180) — modern LLM serving.
  9. Speculative Decoding (Leviathan 2023, arxiv 2211.17192) — draft-and-verify sampling that preserves the target model's output distribution; acceptance rule shown below the list.
  10. ZeRO (Rajbhandari 2019, arxiv 1910.02054) — distributed training memory.
  11. Megatron-LM (Shoeybi 2019, arxiv 1909.08053) — tensor parallelism.
  12. RoPE (Su 2021, arxiv 2104.09864) — rotary positional embeddings; rotation sketched after this list.
  13. YaRN (Peng 2023, arxiv 2309.00071) — RoPE extension.
  14. GRPO / DeepSeek Math (arxiv 2402.03300) — RL without critic.
  15. DeepSeek V3 (arxiv 2412.19437) — MoE + MLA + FP8 + DualPipe.
  16. DeepSeek R1 (arxiv 2501.12948) — reasoning-model RL recipe.
  17. Llama 3 (arxiv 2407.21783) — Meta's open-weights model family report.
  18. Mamba (Gu & Dao 2023, arxiv 2312.00752) — SSMs.
  19. DLRM (Naumov 2019, arxiv 1906.00091) — Meta's open recsys arch.
  20. HSTU / Generative Recommenders (Zhai 2024, arxiv 2402.17152) — Meta's gen recsys.
  21. TIGER (Rajput 2023, arxiv 2305.05065) — generative retrieval.
  22. MMoE (Ma 2018, KDD) — multi-task with experts.
  23. PLE / CGC (Tang 2020, RecSys) — extension of MMoE.
  24. DCN-v2 (Wang 2020, arxiv 2008.13535) — feature crosses.
  25. Two-tower / sampled-softmax debiasing (Yi 2019, RecSys).
  26. Raft (Ongaro & Ousterhout 2014, USENIX ATC) — consensus.
  27. Spanner (Corbett 2012) — TrueTime.
  28. Dynamo (DeCandia 2007) — Amazon KV.
  29. Bigtable (Chang 2006) — Google's wide-column store.
  30. FLP impossibility (Fischer, Lynch, Paterson 1985) — no deterministic consensus in a fully asynchronous system with even one faulty process.
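
A few of these come up as implement-from-memory questions, so minimal PyTorch sketches of four of them follow. First, scaled dot-product attention from Attention Is All You Need (#1). The tensor shapes and the boolean causal mask are common implementation choices, not the paper's exact pseudocode.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, causal=False):
    """q, k, v: (batch, heads, seq, head_dim). Returns the same shape as v."""
    # Similarity scores, scaled by sqrt(d_k) as in the paper.
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if causal:
        # Mask out future positions (strictly upper-triangular entries).
        mask = torch.triu(torch.ones(scores.shape[-2:], dtype=torch.bool), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v
```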
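
Next, the DPO objective from #6. This assumes you already have per-sequence log-probs for the chosen and rejected responses under both the policy and a frozen reference model; the variable names are mine, and beta=0.1 is just a typical value.

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Each input: (batch,) tensor of summed response log-probs."""
    pi_logratio = policy_chosen_logp - policy_rejected_logp
    ref_logratio = ref_chosen_logp - ref_rejected_logp
    # -log sigmoid(beta * margin): push the policy to prefer chosen over
    # rejected by more than the frozen reference does. No reward model, no RL loop.
    return -F.logsigmoid(beta * (pi_logratio - ref_logratio)).mean()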
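
Then the acceptance test at the core of speculative decoding (#9), for a single drafted token. This is the rejection-sampling rule from the paper, which keeps the output distribution exactly that of the target model; the multi-token draft loop around it is omitted.

```python
import torch

def accept_or_resample(draft_probs, target_probs, draft_token):
    """draft_probs, target_probs: (vocab,) distributions; draft_token: int."""
    p, q = target_probs[draft_token], draft_probs[draft_token]
    # Accept the drafted token with probability min(1, p_target / p_draft).
    if torch.rand(()) < torch.clamp(p / q, max=1.0):
        return draft_token
    # Otherwise resample from the renormalized residual max(0, p_target - p_draft).
    residual = torch.clamp(target_probs - draft_probs, min=0.0)
    return int(torch.multinomial(residual / residual.sum(), 1))
```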
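
Finally, RoPE (#12). Conventions vary between the paper's interleaved channel pairing and the half-split used in some codebases; this sketch uses the interleaved form, with the paper's default base of 10000.

```python
import torch

def rope(x, base=10000.0):
    """x: (..., seq, dim), dim even. Rotate channel pairs by position-dependent angles."""
    seq, dim = x.shape[-2], x.shape[-1]
    # One frequency per channel pair, geometrically spaced: theta_i = base^(-2i/dim).
    inv_freq = base ** (-torch.arange(0, dim, 2, dtype=torch.float32) / dim)
    angles = torch.arange(seq, dtype=torch.float32)[:, None] * inv_freq[None, :]
    cos, sin = angles.cos(), angles.sin()   # each (seq, dim/2)
    x1, x2 = x[..., 0::2], x[..., 1::2]     # interleaved pairs
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin    # 2-D rotation of each pair
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```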

Blogs — high signal

Newsletters / aggregators

Video / courses

Practice platforms

Forums / community

Comp / leveling tools

Tools you should be fluent in

What to skip