12-week study plan
Calibrated to "rusty Staff RS rebuilding for Sr Staff at frontier labs." Adjust as you go — the plan is a tool, not a contract.
Phase 1 · Bootstrap (weeks 1–2)
Goal: shake off rust without breaking confidence. Diagnose what's actually weak vs what just feels weak.
Week 1 — Diagnostic
- Day 1: take 3 LeetCode mediums cold (Two Sum II, Longest Substring Without Repeating Characters, Number of Islands). Time yourself. Note where you stalled.
- Day 2: implement self-attention from scratch in NumPy. Write multi-head. No internet, no LLM.
- Day 3: take a mock ML system design ("Design YouTube recs"). 45 min. Self-record.
- Day 4: read DDIA Ch 1 (Reliable, Scalable, Maintainable). Take notes here under Distributed systems.
- Day 5: implement a Raft leader election skeleton (just election timer + RequestVote, no log). Forces you back into concurrency.
- Day 6–7: brutal honesty review. Score yourself 1–5 on: coding speed · DSA recall · DL fundamentals · LLM internals · distributed systems · system design narrative · behavioral storytelling. Anything ≤3 → main focus weeks 3–8.
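The Day 6–7 triage rule can be sketched in a few lines of Python. The scores below are hypothetical placeholders, not a real assessment; only the seven areas and the "≤3" threshold come from the plan.

```python
# Hypothetical self-scores (1-5) for the seven areas in the Day 6-7 review.
# Replace the numbers with your own; these are placeholders.
scores = {
    "coding speed": 3,
    "DSA recall": 2,
    "DL fundamentals": 4,
    "LLM internals": 4,
    "distributed systems": 2,
    "system design narrative": 3,
    "behavioral storytelling": 5,
}

FOCUS_THRESHOLD = 3  # plan rule: anything <= 3 becomes a main focus for weeks 3-8


def focus_areas(scores, threshold=FOCUS_THRESHOLD):
    """Return (area, score) pairs at or below the threshold, weakest first."""
    weak = [(area, s) for area, s in scores.items() if s <= threshold]
    return sorted(weak, key=lambda pair: pair[1])


for area, score in focus_areas(scores):
    print(f"{score}/5  {area}")
```

Re-running this at the end of week 8 with fresh scores gives a rough check on whether Phase 2 moved the weak areas.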
Week 2 — Tooling + foundations
- Set up LeetCode premium (1 month, cancel later). Filter by company tag for Anthropic/OpenAI/Meta/Google/Anduril.
- Set up a scratch/ repo for ML coding drills. Push everything to a private GitHub.
- Pick a reference textbook per pillar (see resources). Don't buy more than 4 books.
- Update résumé. One version emphasizing pretraining, one emphasizing recsys/ranking. Get 2 experienced ML engineers (ideally Sr Staff+) to review.
- LinkedIn: open to work (private), update headline, write a 250-word "what I'm looking for" post (don't publish yet).
- Identify 5 warm intros: ex-coworkers who are now at places like Anthropic, OpenAI, TML, Anduril, Pinterest, Databricks. Don't ask for referrals yet, just say "want to catch up."
Phase 2 · Depth (weeks 3–8)
Goal: repair weak areas; sharpen strong ones into Sr Staff narratives.
| Week | Mon | Tue | Wed | Thu | Fri | Sat/Sun |
|---|---|---|---|---|---|---|
| 3 Transformers + concurrency | Re-derive attention; LeetCode 2× medium | RoPE + KV cache notes; concurrency: Python GIL deep-dive | FlashAttention paper; LeetCode 2× medium | Implement multi-head + RoPE in PyTorch from scratch | Mock interview (LC hard); DDIA Ch 2 (Data Models) | Write up notes; 1 ML system design |
| 4 Distributed training | ZeRO 1/2/3 paper; LC 2× medium | FSDP code-walk; concurrency: asyncio internals | Megatron tensor parallel; LC 1× hard | Pipeline parallel (1F1B, interleaved); DDIA Ch 3 (Storage) | Implement a 2-GPU DDP toy in PyTorch | Mock ML design (Design pretraining infra) |
| 5 RLHF + reasoning | InstructGPT; LC 2× medium | DPO paper; concurrency: lock-free data structures | GRPO + DeepSeek R1; LC 1× hard | Constitutional AI + RLAIF; DDIA Ch 5 (Replication) | Implement DPO loss in PyTorch | 1 ML system design (Design RLHF pipeline) |
| 6 Inference + serving | vLLM blog + paged attention; LC 2× medium | Speculative decoding; concurrency: producer-consumer in C++ | Quantization (GPTQ, AWQ, FP8); LC 1× hard | Disaggregated prefill/decode; DDIA Ch 6 (Partitioning) | Build a toy KV cache + simple speculative decoder | 1 ML system design (Design ChatGPT serving) |
| 7 RecSys + ranking | DLRM paper review; LC 2× medium | Two-tower + ANN (HNSW); concurrency: distributed locks | Sequence models (SASRec, BERT4Rec); LC 1× hard | Generative recsys (TIGER); DDIA Ch 7 (Transactions) | Implement two-tower in PyTorch with in-batch negatives | 1 ML system design (Design Pinterest home feed) |
| 8 System design + behavioral | 3 worked classic system designs; LC 2× medium | Distributed rate limiter, distributed counter; concurrency: read-write locks | RAG at scale; LC 1× hard | Vector DB internals (HNSW, IVF-PQ); DDIA Ch 8–9 (Trouble + Consistency) | Write 5 STAR stories (impact, conflict, failure, leadership, ambiguity) | Mock behavioral with friend; refine stories |
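For the week 5 Friday drill (DPO loss in PyTorch), a framework-free reference for a single preference pair can help sanity-check your tensor version. This is a minimal sketch, assuming each input is the summed log-probability of a full response; the function names and the default `beta=0.1` are illustrative choices, not something the plan prescribes.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x)) for a single float."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full response
    under the trainable policy or the frozen reference model.
    beta=0.1 is an assumed default for illustration.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    return -log_sigmoid(margin)


# When the policy equals the reference, the margin is 0 and the loss is log(2).
print(dpo_loss(-10.0, -12.0, -10.0, -12.0))
```

A useful check for the PyTorch version: the loss should equal log 2 at initialization (policy identical to reference) and drop below it once the policy favors the chosen response more than the reference does.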
Phase 3 · Apply + iterate (weeks 9–12)
Goal: open the funnel strategically. Tier-B first to recalibrate; Tier-A after 1–2 onsites; top tier last.
Week 9 — Funnel open
- Apply to 5 Tier-B companies (Pinterest, Anduril, Databricks, Notion, Snowflake — see companies).
- Reach out to all 5 warm contacts for referrals. Specific ask, attached résumé, 2-line "why this team."
- Continue daily LeetCode + 1 ML coding drill.
Week 10 — First loops
- Take whatever phone screens come. Treat as paid mocks. Decline take-homes only from companies you've ruled out.
- Apply to 5 more companies (Tier-A: Cursor, Perplexity, Sierra, Cohere, Adobe Firefly).
- Day-after-interview: write a postmortem. What went well, what to drill.
Week 11 — Top tier
- Apply to wishlist: Anthropic, OpenAI, Thinking Machines. Use referrals where possible.
- Apply to: DeepMind, World Labs, Physical Intelligence, Periodic Labs, Mistral.
- Schedule onsites with 2-week spacing if you can — not 3 in one week.
Week 12 — Negotiate
- Get all offers in writing before responding to any.
- Use negotiation playbook. Levels.fyi for benchmarks. candor.co / rora.co if you want help.
- Decision matrix: TC, equity liquidity risk, team fit, manager, learning rate, growth ceiling, RTO, commute. Weight per your priorities.
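The decision matrix can be sketched as a weighted sum. The criteria are the ones listed above; the weights, the two offers, and every score are hypothetical placeholders to be replaced with your own numbers.

```python
# Criteria come from the decision-matrix bullet. Weights and scores are
# hypothetical. Scores are 1-5 with higher = better, so score
# "equity liquidity risk", "RTO" and "commute" as 5 = least painful.
weights = {
    "TC": 0.25,
    "equity liquidity risk": 0.10,
    "team fit": 0.15,
    "manager": 0.15,
    "learning rate": 0.15,
    "growth ceiling": 0.10,
    "RTO": 0.05,
    "commute": 0.05,
}

offers = {
    "Offer A": {"TC": 5, "equity liquidity risk": 2, "team fit": 3,
                "manager": 3, "learning rate": 4, "growth ceiling": 4,
                "RTO": 2, "commute": 2},
    "Offer B": {"TC": 3, "equity liquidity risk": 4, "team fit": 5,
                "manager": 4, "learning rate": 4, "growth ceiling": 3,
                "RTO": 4, "commute": 5},
}


def weighted_score(scores, weights):
    """Weighted average of the criterion scores, normalized by total weight."""
    total_weight = sum(weights.values())
    return sum(weights[c] * scores[c] for c in weights) / total_weight


ranked = sorted(offers, key=lambda name: weighted_score(offers[name], weights),
                reverse=True)
for name in ranked:
    print(f"{name}: {weighted_score(offers[name], weights):.2f}")
```

Fixing the weights before the offers arrive keeps the matrix from being tuned to justify a decision you have already made.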
Estimated time budget
| Activity | Hrs/wk wks 1-2 | Hrs/wk wks 3-8 | Hrs/wk wks 9-12 |
|---|---|---|---|
| LeetCode / coding drills | 5 | 6 | 4 |
| ML coding from scratch | 2 | 3 | 1 |
| Reading (papers, DDIA) | 4 | 5 | 2 |
| System / ML design practice | 2 | 4 | 3 |
| Mock interviews | 1 | 2 | 3 |
| Apply / outreach / recruiter calls | 1 | 2 | 8 |
| Total | ~15 | ~22 | ~21 |
Reality check
You have a full-time Staff job and Selma. 22 hours/week is not realistic alongside both. Either drop Selma to ~5h/wk for 2 months, or extend the plan to 18 weeks. The plan above assumes the 12-week version.
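The trade-off between the 12-week and 18-week versions can be checked with quick arithmetic. The hours come straight from the time budget table; the even stretch to 18 weeks is an assumption, since the plan does not say how the extra weeks would be distributed.

```python
# Hours per week from the time budget table, in row order:
# LeetCode, ML coding, reading, design practice, mocks, apply/outreach.
phases = {
    "weeks 1-2": (2, [5, 2, 4, 2, 1, 1]),
    "weeks 3-8": (6, [6, 3, 5, 4, 2, 2]),
    "weeks 9-12": (4, [4, 1, 2, 3, 3, 8]),
}

total_hours = 0
for name, (weeks, hours) in phases.items():
    per_week = sum(hours)
    total_hours += weeks * per_week
    print(f"{name}: {per_week} h/wk x {weeks} wk = {weeks * per_week} h")
print(f"12-week total: {total_hours} h")

# Assumption: the 18-week option keeps the same total work and
# spreads it evenly across the longer schedule.
STRETCH_WEEKS = 18
print(f"12-week average: {total_hours / 12:.1f} h/wk")
print(f"18-week average: {total_hours / STRETCH_WEEKS:.1f} h/wk")
```

Under that assumption the plan totals 246 hours, which averages 20.5 h/wk over 12 weeks and about 13.7 h/wk over 18.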
What to skip
- LeetCode hards you've never seen. Diminishing returns past ~50 LC problems if you're sharp on patterns. Spend the time on system design instead.
- Reading every new paper. Pick a curated tracker (Lilian Weng, Cameron Wolfe, Sebastian Raschka). Skim the rest from titles.
- Going from 0 to 1 in front-end / web design. No one will ask you React. Don't waste a day on it.
- Cramming new languages. Use Python for everything. Use C++ only if a role explicitly needs it (rare for ML).