Google has deployed its strongest AI model ever! Also, the $7 trillion ask, Multi-agents, and more

Keeping up with all the AI news and advancements would have taken roughly 78 hours this week (my rough estimate). So I condensed it to the top 6 bits, which take 5:07 min. to read. Fly through it and you are all set for the week.

  1. Google Launched Gemini Advanced, incl. its most powerful LLM Gemini Ultra 1.0

  2. Sam wants $7 trillion

  3. OpenAI’s Focus on Multi-AI Agents

  4. AI Learns Object Recognition from Infant POV Footage, Informing How Humans Learn

  5. More Apple Vision Pro Apps - Huge GenAI Potential

  6. Tool of the Week: ElevenLabs GPT

Reading time: 5:07 min. Go, go, go!

✋ [Recommendation] The Bay Area Times

We explain the latest business, finance, and tech news with visuals and data. 📊

All in one free newsletter that takes < 5 minutes to read. 🗞

Save time and become more informed today.👇

🌱 GenAI Updates

🥇 Google Launched Gemini Advanced, incl. its most powerful LLM Gemini Ultra 1.0

Gemini Advanced (Bard is no more), now integrated into Google's One AI Premium Plan, is globally available in English across 150+ countries, with more languages planned. It runs on the Gemini Ultra 1.0 model, the first model to surpass human experts on the MMLU benchmark across disciplines like physics and history. Priced at $19.99/month after a free trial, the plan also offers 2TB of storage and Google Workspace integration.

How good is Gemini Ultra 1.0?
Benchmarks indicate that it performs exceptionally well against ChatGPT, with its advanced coding capabilities standing out in particular. However, I haven't had the chance to test it myself, as it is not yet accessible in Germany.

💻️ Sam wants $7 Trillion

Sam Altman, OpenAI's CEO, is in talks with investors, including the UAE government, to raise $5-7 trillion to expand AI semiconductor production and address the global GPU shortage. That is roughly 7% of global GDP.
The plan includes constructing numerous chip plants in collaboration with chipmakers, energy firms, and investors. It also aligns with the Biden administration's $5 billion commitment to semiconductor R&D.
No one has ever raised that kind of money, but if anyone can, it's Sam.

🕵️ OpenAI’s Focus on Multi-AI Agents

There are no big announcements yet, but OpenAI, we caught you! 😎

OpenAI's next step involves developing AI agents that automate complex tasks across diverse environments. This includes transferring data between documents or spreadsheets and submitting expense reports via actions such as clicking and cursor movement, as well as coding, building apps, and performing market research, to name a few.

OpenAI focuses on multi-agents: superior performance is achieved when multiple agents with varied roles collaborate toward shared objectives. By combining the strengths of multiple specialized agents, such setups surpass single-agent systems in areas like gaming, simulations, and problem-solving.

OpenAI’s sophisticated approach: OpenAI employs algorithms like Proximal Policy Optimization (PPO) and Multi-Agent Deep Deterministic Policy Gradients (MADDPG) to train agents in these complex settings. These algorithms optimize the agents' policies for improved task performance, particularly in continuous control and decision-making scenarios. With strategies such as actor-critic methods and centralized training combined with decentralized execution, OpenAI tackles challenges inherent in multi-agent environments, like non-stationarity and the explosion of state and action spaces.
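To make the centralized-training, decentralized-execution idea concrete, here is a minimal PyTorch sketch of a MADDPG-style setup: each agent acts on only its own observation, while a shared critic scores the joint state and action during training. Network sizes and dimensions are my own illustrative assumptions, not OpenAI's actual code.

```python
# Minimal MADDPG-style sketch: each agent has its own actor, but the critic
# sees everyone's observations and actions (illustrative, not OpenAI's code).
import torch
import torch.nn as nn

N_AGENTS, OBS_DIM, ACT_DIM = 2, 8, 2

class Actor(nn.Module):  # decentralized execution: acts on local observation only
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM, 64), nn.ReLU(), nn.Linear(64, ACT_DIM), nn.Tanh()
        )

    def forward(self, obs):
        return self.net(obs)

class CentralCritic(nn.Module):  # centralized training: scores the joint action
    def __init__(self):
        super().__init__()
        in_dim = N_AGENTS * (OBS_DIM + ACT_DIM)
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, all_obs, all_acts):
        return self.net(torch.cat([all_obs, all_acts], dim=-1))

actors = [Actor() for _ in range(N_AGENTS)]
critic = CentralCritic()

obs = torch.randn(N_AGENTS, OBS_DIM)  # one local observation per agent
acts = torch.stack([actor(o) for actor, o in zip(actors, obs)])
q_value = critic(obs.flatten().unsqueeze(0), acts.flatten().unsqueeze(0))
print(q_value)  # the critic's estimate of how good the joint action is
```

In full MADDPG, each actor is updated through this centralized critic, which sidesteps the non-stationarity mentioned above: from the critic's point of view, the "environment" already includes all the other agents' actions.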

A look forward: these developments hint at AI agents managing entire departments or even companies with minimal human supervision, as suggested by early research from Stanford. With the S&P 500 at an all-time high, the productivity boost from AI is evident, but how do we balance that with the societal impacts?

Multi-agent systems are not a brand-new trend; see AutoGen, for instance.
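For a feel of how such frameworks look in practice, here is a minimal two-agent loop in the style of AutoGen's 0.2 Python API (the model name and API key are placeholders; treat this as a sketch, not a drop-in script):

```python
# pip install pyautogen -- a minimal two-agent sketch in the AutoGen 0.2 style
import autogen

llm_config = {"config_list": [{"model": "gpt-4", "api_key": "YOUR_API_KEY"}]}

# The assistant writes plans and code; the user proxy executes them and reports back.
assistant = autogen.AssistantAgent(name="assistant", llm_config=llm_config)
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",  # fully automated loop, no human in between
    code_execution_config={"work_dir": "workspace", "use_docker": False},
)

# The two agents converse until the task is done or a termination message is sent.
user_proxy.initiate_chat(assistant, message="Summarize the latest AI news into 6 bullet points.")
```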

How should we balance AI adoption in the workplace with societal impacts?


👼 AI Learns Object Recognition from Infant POV Footage, Informing How Humans Learn

NYU's new AI learning approach draws from a child's perspective, utilizing multimodal inputs captured by a head-mounted camera to mimic a child's daily visual and auditory experiences. The system, equipped with vision and text encoders, mastered around 40 words and concepts through contrastive learning from this limited dataset, making AI significantly less data-hungry.

It is called Child's View for Contrastive Learning (CVCL) and offers a more toddler-like learning trajectory by pairing sights with sounds. While it has so far only learned nouns, the next steps involve teaching this AI toddler verbs and the nuanced tones of speech, the essence of early human language acquisition.

The AI learned using video and audio from a helmet-mounted camera.
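At its core, CVCL's contrastive objective works like CLIP's: embeddings of a video frame and the utterance heard at the same moment are pulled together, while mismatched pairs are pushed apart. Here is a simplified PyTorch sketch of such a loss (my own illustration, not NYU's code):

```python
import torch
import torch.nn.functional as F

def contrastive_loss(frame_emb, utterance_emb, temperature=0.07):
    """CLIP-style symmetric contrastive loss over a batch of (frame, utterance) pairs."""
    # Normalize so the dot product becomes cosine similarity.
    frame_emb = F.normalize(frame_emb, dim=-1)
    utterance_emb = F.normalize(utterance_emb, dim=-1)
    # logits[i][j]: similarity of frame i to utterance j; the diagonal holds true pairs.
    logits = frame_emb @ utterance_emb.t() / temperature
    targets = torch.arange(logits.size(0))
    # Pull co-occurring pairs together, push everything else apart, in both directions.
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2

# Toy usage with random 64-dim embeddings for a batch of 8 moments
loss = contrastive_loss(torch.randn(8, 64), torch.randn(8, 64))
print(loss)
```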

🥽 More Apple Vision Pro Apps - Huge GenAI Potential

The Apple Vision Pro, Apple's mixed-reality headset, officially launched in the United States on February 2, 2024. This might be a defining moment not only for VR but potentially for applications across the board.

Since it opens up a whole new world for generative AI applications, let's shed some light on Vision Pro apps that already exist.

Last episode we covered a first batch of apps. What else to highlight:

  - Improved surgical precision with interactive AR medical imaging.

🦉 Tool of the Week

The ElevenLabs GPT lets you generate AI speech from your text. It is free to start with, though you may need to create an account at some point. The output quality is mind-boggling!
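If you would rather script it, ElevenLabs also offers the same text-to-speech through its Python SDK. A minimal sketch against the 0.x elevenlabs package (the API key and voice name are placeholders; newer SDK versions use a client object instead):

```python
# pip install elevenlabs -- sketch for the 0.x SDK; helper names may differ in newer versions
from elevenlabs import generate, play, set_api_key

set_api_key("YOUR_API_KEY")  # placeholder key
audio = generate(
    text="Fly through it and you are all set for the week.",
    voice="Bella",  # placeholder voice from the default voice list
)
play(audio)  # playback requires ffmpeg installed locally
```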

🔥 See You Next Week

The AI news is as hot as the slopes: 11°C at the lift, abnormal. The only way out is head first into the snow; see my selfie.

Thank you so much for reading.

Martin