o3 and last tech updates of 2024


Quick holiday check-in!

While you are hopefully enjoying some well-deserved time with your loved ones, I wanted to share the last AI highlights of 2024.

Before we dive in: I am grateful that you follow my AI/tech journey. THANK YOU!
Big things are planned for next year.

We will build edge AI applications for you to follow along with (for instance, on the Jetson Orin Nano Super).
For Christmas, I got a PlayStation VR2 headset (🤩), so we will build VR applications.
And, of course, we will double down on AI Agents across many vertical use cases, e.g., customer service.
→ (I am also toying with the idea of building an AI Agent in the crypto space, but it is pretty complex. LMK if you have great use cases or want to team up.)

Find below:
- o3 → the new-gen model by OpenAI (is it AGI?),
- DOWNLOAD your customizable AI Agent script ⬇️⬇️
- Unitree’s 4-legged robot is exceptionally agile, fast, and strong.

Writer RAG tool: build production-ready RAG apps in minutes

RAG in just a few lines of code? We’ve launched a predefined RAG tool on our developer platform, making it easy to bring your data into a Knowledge Graph and interact with it using AI. With a single API call, Writer LLMs will intelligently call the RAG tool to chat with your data.

Integrated into Writer’s full-stack platform, it eliminates the need for complex vendor RAG setups, making it quick to build scalable, highly accurate AI workflows just by passing a graph ID of your data as a parameter to your RAG tool.

Learn more about our production-ready RAG tooling here.

(✨ If you don’t want ads like these, Premium is the solution. It is like you are buying me a Starbucks Iced Honey Apple Almondmilk Flat White a month.)

o3 - the AGI-ish model - has superhuman results across benchmarks

o3 is the successor to o1 (OpenAI skipped the name o2 to avoid confusion with O2, the company).

It uses what OpenAI calls a "private chain of thought," where it pauses to internally deliberate before generating responses, aiming to simulate human reasoning processes more effectively.

How good is it?

o3 excels in coding, math, and general reasoning.

It scores 71.7% on SWE-bench Verified (real GitHub issues) and reaches a 2727 Elo rating on Codeforces, ranking with top, top, top coders.
In math and science, it achieves 96.7% on AIME, 87.7% on GPQA, and 25.2% on EpochAI's Frontier Math, far surpassing other models.
On ARC-AGI (THE AGI BENCHMARK, which tests an AI’s ability to learn and generalize like humans), it scores 87.5% in its high-compute configuration, outperforming humans in learning from minimal examples.

Even experts from other fields are stunned by this model family.

See what Derya Unutmaz (MD), expert immunologist, has to say about the earlier o1-Pro:

❝

Today, I’m sharing another insanely good o1-Pro scientific insight! This one is particularly special to me to a point of making me emotional, its that profound🥹

I asked o1-Pro to critically evaluate a review my students & I had written about a specific subset of immune cells called MAIT cells and their role in cancer. The result? I’m simply shocked beyond belief at o1-Pro’s critiques! 😱They were more insightful than my own—and this is a topic where I’m one of the few top experts in the world, having made some of the key discoveries!

As I read through its feedback, I found myself staring at my computer screen, fixated, overwhelmed by a mixture of emotions: disbelief, awe, joy, and a profound sense of humility.☺️Every single point it made, every question it asked—everything was unbelievably insightful!

The depth of its analysis is truly hard to comprehend. Even though we believed we had written a great review on the topic, which was accepted with only minor critiques, I was deeply humbled, thinking, “I should have addressed and included all these insights in the review.” Ouch! The only solace is that it didn’t find any errors.

[… more detail in post …]

https://x.com/DeryaTR_/status/1870570750945673343

o3 is currently too expensive to use broadly (thousands of dollars per problem), but we all know that AI (and compute) costs are dropping faster than my productivity when I “just check” social media.

Systems like o3 will scale faster than anyone can imagine.

Build Your Custom AI Agent: Download The AI Agent Script

Last week, I gave a workshop in Seoul at the AI Summit on how to build AI agents, along with delivering a keynote on the same topic.

What people loved most was understanding how to build an AI agent with just one script.

To get started, you first need to understand the key elements of an AI agent.

Download the script here, add your OpenAI API key (or the key for any AI model you prefer), and enter the API key of the tool you’re using. By default, the tool is set to the Weather API.

Unlock Premium to download

Customizable AI Agent Script - Python.pdf

34.83 KB • PDF File

Now get started and experiment! 🔥
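If you want a feel for the moving parts before opening the script, here is a minimal sketch. It is not the downloadable script: it assumes an agent boils down to a model call, a tool registry, and a loop that feeds tool results back to the model. Every name in it (`Tool`, `fake_model`, `fake_weather`, `run_agent`, the `CALL`/`FINAL:` reply format) is hypothetical, and both the model and the weather lookup are offline stand-ins so it runs without any API key.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Tool:
    """A named capability the agent can invoke (hypothetical structure)."""
    name: str
    description: str
    run: Callable[[str], str]


def fake_weather(city: str) -> str:
    """Offline stand-in for a real weather API call; returns canned data."""
    canned = {"seoul": "3 degrees C, clear", "berlin": "1 degree C, cloudy"}
    return canned.get(city.lower(), "unknown")


def fake_model(prompt: str) -> str:
    """Offline stand-in for an LLM call.

    Replies "CALL <tool> <argument>" until it sees an observation,
    then replies "FINAL: <answer>". A real model would decide this itself.
    """
    if "Observation:" in prompt:
        observation = prompt.rsplit("Observation:", 1)[1].strip()
        return f"FINAL: {observation}"
    # Naive parsing: treat everything after the last " in " as the city.
    city = prompt.rsplit(" in ", 1)[-1].strip(" ?.")
    return f"CALL weather {city}"


TOOLS: Dict[str, Tool] = {
    "weather": Tool("weather", "Look up the weather for a city", fake_weather),
}


def run_agent(question: str,
              model: Callable[[str], str],
              tools: Dict[str, Tool],
              max_steps: int = 3) -> str:
    """Ask the model, run any tool it requests, and loop until a final answer."""
    prompt = question
    for _ in range(max_steps):
        reply = model(prompt)
        if reply.startswith("FINAL:"):
            return reply[len("FINAL:"):].strip()
        # Expected shape: "CALL <tool_name> <argument>"
        _, tool_name, argument = reply.split(" ", 2)
        tool = tools.get(tool_name)
        if tool is None:
            return f"Unknown tool requested: {tool_name}"
        observation = tool.run(argument)
        # Feed the tool result back so the model can answer.
        prompt = f"{question}\nObservation: {observation}"
    return "No answer within the step limit."


if __name__ == "__main__":
    print(run_agent("What is the weather in Seoul?", fake_model, TOOLS))
```

To move from this sketch toward something real, you would swap `fake_model` for a call to your model provider and `fake_weather` for a request to your weather service, each with its own API key. The loop itself can stay the same.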

Unitree B2-W (Robot) is ready to ship! A Robot Revolution


Run CTV Ads on Roku This Q5

“Q5” is a key post-holiday shopping period
Reach shoppers where they’re streaming – on Roku
You can run self-serve CTV ads for just $500

Discover CTV performance on Roku


Happy holidays and see you in 2025,
Martin 🙇

I recommend:
- Beehiiv if you write newsletters.
- Superhuman if you write a lot of emails.
- Cursor if you code a lot.
- Bolt.new for full-stack development.
- My book - Generative AI: Navigating the Course to AGI.
- Follow me on X.com.
AI for your org: We build custom AI solutions at half the market price and in half the time (we build with AI Agents). Contact us to learn more.

You might like our last episodes:

AI’s Biggest Week: These New Releases Point to an Agent-Driven 2025

Bonus: Download your customizable AI AGENT SCRIPT🤖

mail.generativeai.net/p/ai-s-biggest-week-these-new-releases-point-to-an-agent-driven-2025

Use these ChatGPT Prompts to learn anything 10x faster

Store these templates + grow your prompting library

mail.generativeai.net/p/use-these-chatgpt-prompts-to-learn-anything-10x-faster

What you need to know about o1 (pro)

Turn screenshots into code with o1 Pro, CIO Germany’s strategies, Gemini’s video lift, ElevenLabs voice

mail.generativeai.net/p/create-and-customize-apps-effortlessly-with-bolt-new-and-cursor-ai-1

Figure: bar chart “Research Math (EpochAI Frontier Math)”, accuracy in % (y-axis from 0 to 100). Previous SoTA (state of the art): approximately 2.0%. o3: 25.2%.
