Claude Fable 5 worked my 12-hour night shift


Anthropic shipped a new tier last night, not a new version.

Claude Fable 5 is the first Mythos-class model open to everyone. Mythos now sits above Opus in Anthropic’s lineup. Same weights as Claude Mythos 5, which stays locked to vetted cyberdefense partners. The difference is a set of classifiers: requests touching cybersecurity, bio, or model distillation get rerouted to Opus 4.8. Anthropic says that affects under 5% of sessions. I hit zero reroutes in twelve hours of coding.


The numbers

SWE-bench Pro: 80.3% against 69.2% for Opus 4.8. Cognition’s FrontierCode Diamond: 29.3% against 13.4%. More than double, on a benchmark that measures production-grade work, not toy tasks. Context: 1M tokens in, 128K out. Price: $10 in, $50 out per million tokens. Twice Opus, half of Mythos Preview.
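To see what that pricing means for a single session, here is a minimal sketch. The Fable prices come from the figures above. The Opus prices are my assumption, derived only from "twice Opus", and the token counts are a made-up session, not a measurement.

```python
# Prices in USD per million tokens.
# Fable figures are from the text above; the Opus figures are an
# assumption derived from "twice Opus", not a published price list.
FABLE_PRICE = {"input": 10.00, "output": 50.00}
OPUS_PRICE_ASSUMED = {"input": 5.00, "output": 25.00}


def session_cost(input_tokens: int, output_tokens: int, price: dict) -> float:
    """Return the USD cost of one session at the given per-million prices."""
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


# Hypothetical session: 2M tokens in, 200K tokens out.
fable = session_cost(2_000_000, 200_000, FABLE_PRICE)
opus = session_cost(2_000_000, 200_000, OPUS_PRICE_ASSUMED)
print(f"Fable: ${fable:.2f}, Opus (assumed): ${opus:.2f}")
```

Swap in the token counts from your own usage dashboard. If a model also uses more tokens for the same job, the gap widens beyond the price ratio alone.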

The customer story that sticks: Stripe migrated a 50-million-line Ruby codebase in a day. Their manual estimate was two months with a full team.

Included in paid Claude plans until June 22. Then usage credits until capacity catches up. The API has no gate from day one.

My 12-hour night shift

I gave it a proper night shift. Twelve hours straight, half in Claude Code, half in Cursor. Four things stood out.

1. It takes problems whole. State logic I would normally slice into steps, a feature spanning a dozen files plus migrations. You hand it the problem, not the recipe. It plans, builds, and keeps the thread.

2. It spends most of its time testing. Right call. It writes tests I skip, runs them, fixes what breaks, runs again. Annoying if you want vibes. Correct if you ship.

3. The security audit hurt. The app was reviewed, tested, live. I called it clean. Fable returned several real bugs, some in code I signed off myself. Not linter noise. Real ones.

4. UX audits are underrated. My entire prompt: “lean, clean, intuitive UI.” It came back with flow-level findings worthy of a senior product designer. This one surprised me most.

An engineering fellowship to land your next job

Many engineers feel stalled because the role itself has not evolved. The work looks the same, but the market has moved.

Senior engineering in 2026 demands ownership, faster judgment, and comfort with ambiguity. If your role is not pushing you there, it may be holding you back.

Last cohort, 15 hiring partners sent 31 representatives to evaluate challengers through 246 live interviews. Gauntlet offers a reset. Apply now.

Cohort 6 starts July 6

Must be a US citizen to qualify.

What the field says


Karpathy rates it a step change on the order of Claude 4.5, strongest on long problem-solving sessions. Read the thread, including the part about not skipping code review in prod.

Matthew Berman’s week-long test matches mine: one “full code review” prompt fanned out hundreds of parallel agents, one per file, and surfaced bugs other models miss.

Boris Cherny, creator of Claude Code, calls it the biggest step up since Opus 4.5: Fable went from coding agent to “a thought and design partner.” His evidence matches my point 2. It measures, logs, verifies the fix, then declares victory. Unprompted. Personality.

And the playground is open. One builder asked for a humanoid robot in CAD and got it: one goal prompt, two hours, 1.4 million tokens. Another had it write a melody, then build the piano visualizer to play it. The word everyone reaches for: taste.


Where it stumbles

Three fronts.

First: the safeguards overreach. Harmless prompts bounce as cyber or bio risks, and SemiAnalysis reports the distillation filter tripping on ordinary GPU work. Anthropic shipped the blocks too broad on purpose and says they will narrow. My fix: feed the rejected prompt back and ask Fable what exactly looks risky, then let it rephrase itself. It passes its own review.

Second: it drinks tokens, roughly double Opus in my sessions, and Anthropic reset all usage limits on launch day. Budget for hungrier models from here.

Third: slower than Opus at default effort. Thoroughness costs minutes.

Do this tonight

Pick the repo you trust most. Two prompts: a security audit, then “lean, clean, intuitive UI.” One hour. You will close issues you did not know existed.

Anthropic’s launch notes match my night shift. Objectives, not tasks: describe what done looks like and how to verify it, then let the model find the path. Rewrite your agent instruction files: the old ones anchor Fable to stale patterns. Effort levels: Anthropic says high by default, Berman says lower. Both right. High for the hand-off, medium when you stay in the loop.
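One way to make "objectives, not tasks" a habit is a small template that forces you to write down the goal, the done criteria, and the checks before you hit enter. This is a sketch of my own, not Anthropic's format. The function name, the section labels, and the example feature are all hypothetical.

```python
def objective_prompt(goal: str, done_when: list[str], verify_with: list[str]) -> str:
    """Build a prompt that states the objective and how to check it, not the steps."""
    lines = [f"Objective: {goal}", "", "Done when:"]
    lines += [f"- {item}" for item in done_when]
    lines += ["", "Verify with:"]
    lines += [f"- {item}" for item in verify_with]
    # Leave the plan to the model; ask only for evidence of the result.
    lines += ["", "Choose your own plan. Report what you verified and what you could not."]
    return "\n".join(lines)


# Hypothetical example feature, for illustration only.
prompt = objective_prompt(
    goal="Users can reset their password by email",
    done_when=["Reset link expires after one use", "Existing sessions are revoked"],
    verify_with=["Run the full test suite", "Add tests for the two conditions above"],
)
print(prompt)
```

Note what is missing: no file names, no step order, no library choices. If you catch yourself adding those, you are writing the recipe again.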

Half my night ran in Cursor with Fable 5 selected. Strongest pairing I know for feature work right now. Not on Cursor yet? Start with my referral link. Two minutes, and I get a small kickback.

The window closes June 22. Test while it is part of your plan.

Will Fable 5 become your daily coding model?

Want the exact audit prompts from my night shift? Reply with “audit” and I will send them over. I read every reply.

Until next time,

Martin
