🍿 Friday - AI Wrap-up #11
Competition is the fuel that ignites innovation.
As we see intense competition in the general AI space between OpenAI, Google, xAI, Meta, Anthropic, and the like, we also see innovative newcomers like Kyutai.
Its model, Moshi, is a real-time native multimodal foundation model that can listen and speak, similar to what OpenAI demoed with GPT-4o in May.
👋 I am Martin (if this is your first time reading this newsletter).
Like clockwork, every Friday, I share my top findings about AI and the exciting world of upcoming tech with you.
This time: Kyutai; LLM updates on GPT-4o, Llama 3, and Grok 2; my live AI training; Llama Agents; Groq; and more.
Reading time is exactly 203 sec; vamos!
Did Open Science just beat OpenAI? 🤯
Kyutai released Moshi, a real-time native multimodal foundation model that can listen and speak, similar to what OpenAI demoed with GPT-4o in May.
Moshi:
Expresses and understands emotions
Listens to and generates audio/speech (2 parallel streams to listen and speak)
Achieves an end-to-end latency of 200 ms
Will be released as open source
Moshi and Neil on stage giving some emotional improv.
— kyutai (@kyutai_labs)
7:56 PM • Jul 3, 2024
Learn AI in 5 Minutes a Day
AI Tool Report is one of the fastest-growing and most respected newsletters in the world, with over 550,000 readers from companies like OpenAI, Nvidia, Meta, Microsoft, and more.
Our research team spends hundreds of hours a week summarizing the latest news, and finding you the best opportunities to save time and earn more using AI.
LLM updates, besides Moshi
Last week: Anthropic > OpenAI. This week: OpenAI > Anthropic.
Over the past couple of days, GPT-4o (ChatGPT+) has been updated and has surpassed Claude 3.5 Sonnet on the LMSYS Leaderboard.
LMSYS Leaderboard on the 5th of July, 2024.
In the meantime, Meta has unveiled a preview of its latest Llama 3-405B model (via WhatsApp), which should allow it to handle more complex prompts in the future.
Elon Musk announced xAI’s next iteration of its LLM, Grok 2, which will be released in August.
Lastly, Claude 3.5 Opus, Anthropic’s biggest model, will also be launched this summer.
On July 25th, I will speak and train live about GenAI, AI Agents, and what is next for Artificial General Intelligence (AGI).
(My title on the banner is not entirely up to date anymore.)
A Flexible Framework for AI Agents: Llama Agents
Read about it in my last post.
Groq recently launched the fastest OpenAI Whisper API to date: it transcribes audio 164x faster than real time, for $0.03 per hour of audio.
I will write a more in-depth piece on how AI will disrupt customer support and client interaction in general. Stay tuned.
Thoughts welcome.
A good friend (a data scientist by day, investor by night) has an excellent newsletter combining value investing strategies and his tech/AI expertise.
Give it a good read: https://contrariancashflows.com
Best summer vibes from Sicily, Italy. 🌇
This is from my morning run. City: Castellammare del Golfo.
-Martin
My upcoming training on GenAI, AI Agents, and AGI.
Would you like to sponsor a post? → www.passionfroot.me/ai
Spread the word, get the perk! Referral program.
My book: https://a.co/d/eMosWDc
I am a Judge for The National AI Awards 2024. Apply!
My webpage has a fresher look and feel (a lot is planned 😉 ): https://generativeai.net