The Week the Future Stopped Pretending
Holodecks, hidden supermodels, and the week AI stopped feeling hypothetical
Friends,
Every big technological shift starts out looking small and slightly stupid. This week it’s a “game demo” that’s really a proto-holodeck, and a dark-launched model that quietly outperforms most of what’s on the market. Neither of these, on its own, feels like a revolution.
But taken together, they’re the tell: attention is getting cheaper to buy, worlds are getting cheaper to generate, and intelligence is getting cheaper to rent. The gap between “I have an idea” and “it exists in the world” is collapsing—and if you’re not actively exploiting that, you’re leaving absurd leverage on the table.
Tip: Roku quietly dropped a $5K ad-credit match for Q4. See ⬇️
Find your customers on Roku this Black Friday
As with any digital ad campaign, the important thing is to reach streaming audiences who will convert. To that end, Roku’s self-service Ads Manager stands ready with powerful segmentation and targeting options. After all, you know your customers, and we know our streaming audience.
Worried it’s too late to spin up new Black Friday creative? With Roku Ads Manager, you can easily import and augment existing creative assets from your social channels. We also have AI-assisted upscaling, so every ad is primed for CTV.
From there, you can easily set up A/B tests to flight different creative variants and Black Friday offers. If you’re a Shopify brand, you can even run shoppable ads directly on-screen so viewers can purchase with just a click of their Roku remote.
Bonus: we’re gifting you $5K in ad credits when you spend your first $5K on Roku Ads Manager. Just sign up and use code GET5K. Terms apply.
(✨ If you don’t want ads like these, Premium is the solution.)
Holodecks Just Got a Launch Date
In 48 hours we basically got the two missing pieces of a usable holodeck.
World Labs shipped Marble, a Large World Model that turns a single prompt into a navigable 3D world with layout, physics, lighting, and objects you can walk through, edit, and even drop into VR.
The next day, Google DeepMind unveiled SIMA 2, a Gemini-powered agent that lives inside 3D worlds, understands high-level goals (“build a base”, “explore and catalogue resources”), and plays as a reasoning co-player across many different games.
Two puzzle pieces, not competitors
Marble (World Labs) → the world generator: “Create the room / planet / city, with real physics and persistent space.”
SIMA 2 (DeepMind) → the inhabitant: “Give me a smart co-pilot who can plan, explain its thinking, and adapt to new worlds.”
Put them together and you get a simple loop:
Prompt a world (Marble) → Drop in an agent (SIMA 2) → You’re standing in an interactive holodeck with a thinking companion.
My read: by 2026–2027, this combination will be rough and early, but real.
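To make the loop concrete, here’s a minimal Python sketch of that world-then-agent pipeline. Neither Marble nor SIMA 2 exposes a public API today, so every class and method name below is invented purely to show the shape of the loop, not how either product is actually called.

```python
# Hypothetical sketch of the "prompt a world, drop in an agent" loop.
# Neither World Labs (Marble) nor DeepMind (SIMA 2) has a public API for
# this today; every name below is made up for illustration only.

from dataclasses import dataclass


@dataclass
class World:
    """A generated 3D scene: geometry, lighting, physics, objects."""
    scene_id: str


class WorldGenerator:
    """Stands in for a Marble-like Large World Model."""

    def generate(self, prompt: str) -> World:
        # Real system: text/image prompt -> navigable, persistent 3D world.
        return World(scene_id=f"world-for:{prompt[:32]}")


class EmbodiedAgent:
    """Stands in for a SIMA-2-like Gemini-powered co-player."""

    def act(self, world: World, goal: str) -> str:
        # Real system: perceive the scene, plan, explain, take actions.
        return f"In {world.scene_id}, working toward goal: {goal}"


def holodeck_loop(world_prompt: str, goal: str) -> str:
    generator = WorldGenerator()
    agent = EmbodiedAgent()
    world = generator.generate(world_prompt)   # 1. prompt a world
    return agent.act(world, goal)              # 2. drop in an agent


if __name__ == "__main__":
    print(holodeck_loop("a cluttered Mars habitat", "catalogue every resource"))
```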
How we’ll actually step into these worlds
I don’t think the smartphone is the endgame here. Its form factor is already choking these use cases. The entry points will be a stack of devices:
VR headsets for full immersion (first consumer holodeck vibes).
AR glasses for overlaying generated worlds onto your living room and city streets.
Spatial / room-scale displays (projectors, LED walls, mixed-reality “caves”) for shared holodeck rooms at work, school, or arcades.
Neural and haptic interfaces for the longer term: wristbands, non-invasive brain–computer interfaces, and richer haptics that make “touching” virtual objects feel less fake.
My bet: we’ll soon walk through AI-generated environments, talk to agents that feel like co-players, and switch devices depending on context—headset at home, lightweight AR glasses on the go, shared rooms at work.
The only real open question isn’t if this happens, but how far we let these virtual worlds bleed into our everyday reality.
Gemini 3.0 Pro: From Incremental to “Oh, Okay Then”
Gemini 3.0 Pro is still in the dark-launch phase — only reachable via odd glitches (Vertex AI preview, Gemini “Advanced” routing errors, random CLI access).
But the pattern from people using it is clear: it doesn’t feel like a small upgrade, it feels like a step change. Unofficial numbers suggest HLE in the mid-60s, GPQA Diamond around 90%, SWE-Bench Verified ~70%, MMMU in the low-80s, plus almost flawless long-context retrieval close to a million tokens. That's crazy.
On top of that, testers report much stronger reasoning, fewer blatant mistakes on hard problems, and full apps/games coming out of single prompts.
Equally interesting is Google’s “launch strategy”: no big keynote yet, just a drip of “accidental” leaks, restricted previews, and screenshots everyone somehow sees. It’s hype by controlled scarcity.
For a grounded walkthrough, see Wes’s breakdown:
If these results hold up once it’s public, the next shift won’t be “a better chatbot” — it’ll be systems quietly designing, building, and maintaining complex things in the background while we mostly specify goals.
GPT-5.1 & Kimi K2: Steep Climb, Not the Leap
OpenAI’s GPT-5.1 dropped just three months after GPT-5, in two flavours: Instant (fast, chatty, low-latency) and Thinking (slower, deeper reasoning).

Sam Altman, CEO of OpenAI
It’s a clear upgrade in reasoning, instruction-following, coding, and tool use — plus better prompt caching and structured outputs, which make it genuinely nicer to build software with. On the flip side, the updated system card tightens what you can ask about high-stakes topics (health, etc.), which means losing some of the brutally useful “ask the model like it’s a top specialist” behaviour people were actually relying on.
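If you want to kick the tyres on the structured-output side, here’s a minimal sketch using the OpenAI Python SDK’s JSON-schema response format. The “gpt-5.1” model identifier and the schema fields are my assumptions for illustration; check the current model list and docs before running it.

```python
# Minimal sketch: structured outputs with the OpenAI Python SDK.
# Assumes "gpt-5.1" is the public model identifier -- confirm against
# the model list. The schema below is purely illustrative.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "risk_level": {"type": "string", "enum": ["low", "medium", "high"]},
        "summary": {"type": "string"},
    },
    "required": ["title", "risk_level", "summary"],
    "additionalProperties": False,
}

response = client.chat.completions.create(
    model="gpt-5.1",
    messages=[
        {"role": "system", "content": "You review changelogs for breaking changes."},
        {"role": "user", "content": "Summarise the risk of upgrading from GPT-5 to GPT-5.1."},
    ],
    # Structured outputs: the model is constrained to this JSON schema.
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "upgrade_review", "schema": schema, "strict": True},
    },
)

print(json.loads(response.choices[0].message.content))
```

Prompt caching, by contrast, needs no code at all: it kicks in automatically for long, repeated prompt prefixes, so the practical move is keeping your system prompt and shared context stable across calls.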

Yang Zhilin, CEO of Moonshot AI
China’s Moonshot AI is pushing from the other side with Kimi K2, a 1T-parameter open Mixture-of-Experts model in which only ~32B parameters are active per token. The K2 Thinking variant is openly downloadable, heavily fine-tuned by the community, and already close to frontier models on coding and tool-using tasks — roughly GPT-5 level for code, if not 5.1. The bigger story: open source isn’t trailing by “a year or two” anymore; it’s sometimes only a few weeks behind closed models.
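If you want to poke at the weights yourself, here’s a minimal sketch using `huggingface_hub`. The repo id is my guess at the published name, so confirm it on Moonshot AI’s Hugging Face page before you start pulling roughly a trillion parameters of weights.

```python
# Minimal sketch: grabbing the open Kimi K2 weights from Hugging Face.
# The repo id below is an assumption -- confirm the exact name on the
# moonshotai organisation page before downloading.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="moonshotai/Kimi-K2-Thinking",
    local_dir="./kimi-k2-thinking",
)
print(f"Checkpoint downloaded to {local_dir}")

# A trillion-parameter MoE won't fit on one GPU; in practice you'd point a
# multi-GPU serving stack (vLLM, SGLang, etc.) at this directory and talk to
# it through an OpenAI-compatible endpoint rather than loading it directly.
```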
Together, GPT-5.1 and Kimi K2 show a fast, continuous slope of improvement — sharper, cheaper, more agentic every quarter. Impressive, but still evolutionary. If something qualifies as a real step change right now, it’s probably Gemini 3.0 rather than these two.
Our Weekly Jobs 👨‍🏫
This is just one of many jobs that we have:
Subscribe to Premium to See the Rest
Upgrade to Premium for exclusive demos, valuable insights, and an ad-free experience!
A subscription gets you:
- ✅ Full access to all content.
- ✅ Exclusive demos, reports, and other premium content.
- ✅ An ad-free experience.

