Claude 3's Tool Use & Genie's Game Creation, and other highlights


Hey, it’s Martin.

Once again, I carefully curated the news, separating signal from noise.

The content:

  • Anthropic Unveils Tool Use Capability for Claude 3, Enhancing AI Interactions

  • Google Genie Lets Users Generate AI Outputs Resembling Video Games

  • Meet Me at the Open Data Science Conference - East

  • Quick AI Highlights

Once more, I built a 3-question quiz. 🤨 Last round was too easy, so I turned up the heat a bit. Can you answer all three?

Reading time: 4:58 min.

🟫 AI Can 10X Your Browsing Efficiency

Try out your Browsing Copilot Pro for free (by clicking on it, you support the newsletter🫥)

MaxAI.me - Outsmart Most People with 1-Click AI

MaxAI.me best AI features:

  • Chat with GPT-4, Claude 3, Gemini 1.5.

  • Perfect your writing anywhere.

  • Save 90% of your reading & watching time with AI summary.

  • Reply 10x faster on email & social media.

💯 Anthropic Unveils Tool Use Capability for Claude 3, Enhancing AI Interactions

The future is agentic. (Not a word in Webster yet.)

Anthropic has announced tool use, aka function calling, for Claude 3, making it an AI agent. It allows Claude to interact with external tools, APIs, and knowledge bases, extending its capabilities beyond conventional limits.

Claude can access and manipulate data from various sources, take actions through software tools, and receive results in natural language or a specified format.
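For the builders among you, here is a minimal sketch of what the application side of tool use can look like. It is not Anthropic's SDK: the tool `get_weather`, the `TOOLS` registry, and the shape of the `request` dictionary are all hypothetical stand-ins for whatever the model and your product actually exchange. The idea is only that the model names a tool and its arguments, your code runs it, and the result goes back as structured text.

```python
import json
from typing import Any, Dict


def get_weather(city: str) -> Dict[str, Any]:
    """Hypothetical tool; a real one would call an API or a database."""
    fake_data = {"Berlin": 14, "Florence": 22}
    return {"city": city, "temp_c": fake_data.get(city)}


# Hypothetical registry: the description and parameters are what you would
# show the model, the handler is what your code actually runs.
TOOLS: Dict[str, Dict[str, Any]] = {
    "get_weather": {
        "description": "Return the current temperature for a city.",
        "parameters": {"city": "string"},
        "handler": get_weather,
    },
}


def run_tool_call(tool_call: Dict[str, Any]) -> str:
    """Execute one tool request and return the result as a JSON string."""
    name = tool_call["name"]
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    handler = TOOLS[name]["handler"]
    try:
        result = handler(**tool_call["arguments"])
    except TypeError as exc:  # wrong or missing arguments
        return json.dumps({"error": str(exc)})
    return json.dumps(result)


# Assumed shape of a tool request coming back from the model.
request = {"name": "get_weather", "arguments": {"city": "Florence"}}
print(run_tool_call(request))  # {"city": "Florence", "temp_c": 22}
```

Returning errors as JSON instead of raising lets the model see what went wrong and try again with different arguments.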

Anthropic's research has shown that even when interfacing with hundreds of simple tools, the Claude 3 models maintain accuracy rates above 90%.

This means simpler architectures for AI products. To put it simply: you connect to Claude 3, integrate your documents and tools, and can start working with it through prompt engineering.

Side note: Prompt engineering will likely remain integral to professional products.

Does this make the more complex RAG architecture (Retrieval Augmented Generation) obsolete in the long run? In the short term, no. The great thing about a RAG system is that it includes a semantic search that ranks content by relevance. In the long term, maybe yes.

(Let me know if you have a strong opinion about this.)
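To make the "semantic search" part of RAG concrete, here is a toy sketch of the retrieval step. It is a simplification under stated assumptions: real systems use learned embeddings from a model, while this version uses plain word counts as a stand-in, and the function names `embed`, `cosine`, and `retrieve` are my own. The ranking idea is the same: score every document against the query and keep the top ones.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, docs: List[str], k: int = 2) -> List[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


docs = [
    "Claude 3 supports tool use for calling external APIs",
    "Genie generates playable 2D worlds from a single image",
    "Tesla announced a robotaxi reveal event",
]
top = retrieve("which model supports tool use", docs, k=1)
print(top)  # ['Claude 3 supports tool use for calling external APIs']
```

Only the retrieved documents would then be placed into the prompt, which is what keeps the context small and relevant.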

Models calling models.

Anthropic has also shown how the Claude 3 model, Opus, calls Haiku subagents. Haiku is the faster and smaller model.

They asked for the fastest implementation of Quicksort online. Opus called 100 Haikus to find the top 100 links and to write tests to determine how fast each implementation is. Roughly 90 seconds later, the fastest implementation was identified.
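The orchestrator-and-subagents pattern can be sketched in a few lines. This is not Anthropic's demo code: there are no model calls here, the "subagents" are plain functions running in threads, and the two candidate implementations are ones I picked for illustration. It only shows the shape of the workflow: fan out, let each worker test and time one candidate, then pick the winner.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def quicksort_simple(xs: List[int]) -> List[int]:
    """A straightforward (not in-place) quicksort."""
    if len(xs) <= 1:
        return list(xs)
    pivot = xs[len(xs) // 2]
    left = [x for x in xs if x < pivot]
    middle = [x for x in xs if x == pivot]
    right = [x for x in xs if x > pivot]
    return quicksort_simple(left) + middle + quicksort_simple(right)


def builtin_sort(xs: List[int]) -> List[int]:
    """Baseline candidate using Python's built-in sort."""
    return sorted(xs)


# Hypothetical candidates; in the demo these came from web search results.
CANDIDATES: Dict[str, Callable[[List[int]], List[int]]] = {
    "quicksort_simple": quicksort_simple,
    "builtin_sort": builtin_sort,
}


def subagent(name: str, data: List[int]) -> Tuple[str, float, bool]:
    """One worker: run a candidate, time it, and check its output."""
    start = time.perf_counter()
    output = CANDIDATES[name](data)
    elapsed = time.perf_counter() - start
    return name, elapsed, output == sorted(data)


def orchestrate(data: List[int]) -> str:
    """Fan out one subagent per candidate and return the fastest correct one."""
    with ThreadPoolExecutor(max_workers=len(CANDIDATES)) as pool:
        reports = list(pool.map(lambda n: subagent(n, data), CANDIDATES))
    valid = [r for r in reports if r[2]]
    return min(valid, key=lambda r: r[1])[0]


data = [random.randint(0, 10_000) for _ in range(2_000)]
print("fastest correct candidate:", orchestrate(data))
```

Timings measured in parallel threads are noisy, so treat the winner as indicative. A more careful benchmark would repeat each run and time the candidates in isolation.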

Super impressive!

What else does this bring?

What about synthesizing findings of research across 100 fields? Or analyzing ALL competitors? A news channel that monitors and reports on real-time events? Solving GitHub issues at scale?

The first amazing products and agents are already being built. Please look at the footnote below to see how developers have adopted tool use so far.

🎲 Google Genie Lets Users Generate AI Outputs Resembling Video Games

Google DeepMind introduces Genie, a foundation world model trained from Internet videos that can generate endless playable (action-controllable) worlds from synthetic images, photographs, and even sketches.

Similar to OpenAI's Sora, this technology generates subsequent frames, but uniquely, it also incorporates keyboard inputs in real-time.

What’s under the hood?

Going by the paper, these are the key points about the tech behind Google DeepMind's Genie:

  • It is trained on a large dataset of over 200,000 hours of publicly available Internet gaming videos, primarily from 2D platformer games. This allowed Genie to learn these games' underlying mechanics, physics, and design elements.

  • Genie is built with the DeepMind JAX ecosystem, a set of libraries used for model training.

  • It employs a latent action model (learns actions from videos), a video tokenizer (compresses videos into discrete tokens to reduce dimensionality and enable higher quality video generation), and a dynamics model (takes in video tokens and action embeddings and predicts future masked video tokens) to generate interactive 2D platformer video games from a single image prompt or text description.

  • Genie can recognize the main character within a game without being trained on action or text annotations.

  • The paper positions Genie as a new paradigm called "generative interactive environments", where entire interactive experiences can be generated from a single text or image prompt.

  • The goal is to enable Genie to imitate behaviors from unseen videos and potentially train generalist AI agents in the future.
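If "compressing videos into discrete tokens" sounds abstract, here is a toy illustration of the general idea. It is not Genie's tokenizer: the paper's version is learned, while this sketch uses a tiny hand-written codebook, and the names `quantize`, `tokenize_frame`, and `detokenize` are my own. Each image patch is replaced by the index of its nearest codebook entry, so a frame becomes a short list of integers.

```python
from typing import List

Patch = List[float]


def quantize(patch: Patch, codebook: List[Patch]) -> int:
    """Return the index of the codebook entry closest to the patch."""
    def squared_distance(entry: Patch) -> float:
        return sum((p - c) ** 2 for p, c in zip(patch, entry))

    return min(range(len(codebook)), key=lambda i: squared_distance(codebook[i]))


def tokenize_frame(patches: List[Patch], codebook: List[Patch]) -> List[int]:
    """Turn a frame (a list of patches) into discrete tokens."""
    return [quantize(p, codebook) for p in patches]


def detokenize(tokens: List[int], codebook: List[Patch]) -> List[Patch]:
    """Approximate reconstruction: look each token up in the codebook."""
    return [codebook[t] for t in tokens]


# Assumed toy values: three codebook entries, one frame of three 2-pixel patches.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
frame = [[0.1, 0.0], [0.9, 1.2], [0.2, 0.8]]

tokens = tokenize_frame(frame, codebook)
print(tokens)  # [0, 1, 2]
print(detokenize(tokens, codebook))  # [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
```

A dynamics model can then predict the next frame's integers instead of raw pixels, which is a much smaller problem.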

The next steps for enhancement include improving the output's visual quality, introducing audio capabilities, and increasing overall responsiveness and speed - converging with Sora’s capabilities.

“Movies are going to become video games, and video games are going to become something unimaginably better,” is what Sam Altman, the CEO of OpenAI, recently tweeted.

🙃 Meet Me at the Open Data Science Conference - East

[A ‘J’ is missing at the beginning of the Tweet.]

So pumped to speak at ODSC East about GenAI, AI Agents, and AGI.

Bonus: Following the main event, join me for a book signing session where you can ask me anything about my book and more. (Plus, you might get a book code 😉 )

We continue, as literally 100% of the votes last time were for keeping the Quick AI Highlights! 🤝 

(Source) Devika - Open Source Agentic AI Software Engineer

We explored Devin AI, the best closed-source AI engineer. Now, Devika makes a comparable AI engineering tool accessible to all.

(Source) Tesla’s Robotaxi Announcement

Shortly after denying Reuters' report that Tesla would scrap its $25,000 EV in favor of a robotaxi, Elon Musk announced on X a robotaxi reveal event for August 8.

(Source) Google just introduced ScreenAI, and it's wild

ScreenAI, a Vision-Language Model (VLM), understands UIs and infographics, handling tasks such as graphical QA with charts, pictures, maps, and more.

(Source) xAI Looking To Raise $3B

Elon Musk's backers are negotiating a $3 billion funding round for his AI startup xAI, aiming for an $18 billion valuation, according to the Wall Street Journal.

🫡 Quiz Time

3 Questions: light, medium, and hard! Can you answer them all?

What is the significant capability of Anthropic's Claude 3 as mentioned?


What distinguishes Google DeepMind's Genie from OpenAI's Sora in generating video game environments?


What technique is used to enhance the diversity of ScreenAI's pre-training data?


That was it, folks. I hope you enjoyed it.

Last time, I was in Lüneburg, my hometown. Today, I am in Tuscany, Italy. Friends are getting married this weekend. 🤎 

In this photo, I'm writing this particular episode beneath a fig tree—a wholesome setting.

To an agentic future,

Martin