Sakana AI lets AI Models breed, Google VLOGGER photo-to-video, Mind-controlled gaming, and some rumors
😇 Here are this week's GenAI, AI, and tech updates. Despite holidays worldwide, the pace of development keeps up.
Novel Ways of Building Foundation Models by Sakana AI
Google’s VLOGGER
Mind-control (like sci-fi, but reality)
Some Rumors cooking
Reading time: 4:11 min 🐠
🐟️ Novel Ways of Building Foundation Models by Sakana AI
(Source)
Sakana AI is a new AI research company based in Tokyo, Japan, developing nature-inspired AI models.
It was founded by former Google researchers David Ha and Llion Jones. Jones co-authored the influential 2017 paper "Attention Is All You Need," which introduced the transformer architecture. Ha previously led Google's AI research in Japan.
"Sakana" means fish in Japanese, reflecting their approach of drawing inspiration from the collective intelligence seen in schools of fish.
Rather than building massive centralized models, Sakana aims to create swarms of smaller, specialized models that collaborate.
Sakana has developed an evolutionary model merge method: it uses evolutionary algorithms to automatically search for effective ways of combining diverse open-source models into powerful new models tailored to specific capabilities.
With it, they created a Japanese language model (LM) with strong math reasoning abilities and a Japanese vision LM. Both achieved state-of-the-art benchmark results despite not being explicitly optimized for those benchmarks.
Their 7B parameter LM even outperformed some 70B parameter models.
Leveraging the growing ecosystem of open models, Sakana believes its evolutionary approach can help organizations develop custom models faster and more cheaply before investing in building proprietary models from scratch.
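To make the evolutionary merging idea above a bit more concrete, here is a minimal toy sketch in Python. It is not Sakana's implementation: the "models" are tiny hand-written parameter vectors, the fitness function (closeness to a made-up target vector) stands in for a real benchmark score, and all names (merge, fitness, evolve_merge) are hypothetical. It only shows the general pattern of mutating merge weights and keeping the candidates that score better.

```python
import random

def merge(models, weights):
    """Weighted average of parameter vectors; weights are normalized to sum to 1."""
    total = sum(weights)
    norm = [w / total for w in weights]
    return [sum(w * m[i] for w, m in zip(norm, models)) for i in range(len(models[0]))]

def fitness(params, target):
    """Toy score (assumption): negative squared distance to an 'ideal' vector."""
    return -sum((p - t) ** 2 for p, t in zip(params, target))

def evolve_merge(models, target, generations=200, population=20, sigma=0.1, seed=0):
    """Simple (1+lambda) evolution strategy over the merge weights."""
    rng = random.Random(seed)
    parent = [1.0] * len(models)  # start from a uniform average
    parent_score = fitness(merge(models, parent), target)
    for _ in range(generations):
        # Mutate the parent with Gaussian noise; keep weights strictly positive.
        children = [[max(1e-6, w + rng.gauss(0, sigma)) for w in parent]
                    for _ in range(population)]
        scored = [(fitness(merge(models, c), target), c) for c in children]
        child_score, child = max(scored, key=lambda s: s[0])
        # Elitism: only replace the parent if the best child is better.
        if child_score > parent_score:
            parent, parent_score = child, child_score
    total = sum(parent)
    return [w / total for w in parent], parent_score

# Three made-up "source models" (stand-ins for real checkpoints).
math_model = [1.0, 0.0, 2.0, -1.0]
japanese_model = [0.0, 1.0, -1.0, 2.0]
vision_model = [3.0, 3.0, 3.0, 3.0]
models = [math_model, japanese_model, vision_model]

# Hypothetical ideal: mostly the math model, some Japanese, no vision.
target = merge(models, [0.7, 0.3, 0.0])

uniform_score = fitness(merge(models, [1.0, 1.0, 1.0]), target)
weights, score = evolve_merge(models, target)
print("evolved merge weights:", [round(w, 3) for w in weights])
print("score improved:", score > uniform_score)
```

In a real setting, the fitness call would be an evaluation run on a task benchmark, which is where most of the compute goes. The search itself stays cheap because no gradient training is involved.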
😎 Google’s VLOGGER Creates Lifelike Talking Videos from a Single Photo
(Source)
Google introduced VLOGGER, a model for holistic avatar video generation.
VLOGGER offers captivating videos with diverse subjects and lifelike motion:
→ Varied pixel colors: Vibrant spectrum enhancing visual experience.
→ Dynamic subject movement: Subjects bring energy and dynamism to scenes.
Create Dynamic and Expressive Characters
Experience the magic of animating characters by generating animated faces with just one image and a matching audio track.
Video Editing
- VLOGGER can be used to edit existing videos.
- Modify subjects' expressions by closing their mouths or eyes.
VLOGGER creates realistic talking videos in two stages:
→ Generate body motion controls from audio input.
→ Translate controls into frames using an image-to-image model with a reference image for identity.
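The two stages above can be pictured as a simple data flow. The sketch below is purely illustrative and assumes nothing about Google's actual code: MotionControl, Frame, audio_to_motion, and render_frames are hypothetical names, audio loudness stands in for a learned motion model, and "rendering" just pairs each control with the reference image instead of running an image-to-image model.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class MotionControl:
    """Hypothetical per-frame control signal (a real system predicts pose parameters)."""
    mouth_open: float  # 0.0 (closed) to 1.0 (fully open)

@dataclass
class Frame:
    identity: str            # reference image that fixes who appears in the frame
    control: MotionControl   # motion applied to that identity

def audio_to_motion(audio: List[float], samples_per_frame: int) -> List[MotionControl]:
    """Stage 1 stand-in: one control per video frame, derived from audio loudness."""
    controls = []
    for start in range(0, len(audio), samples_per_frame):
        chunk = audio[start:start + samples_per_frame]
        loudness = sum(abs(s) for s in chunk) / len(chunk)
        controls.append(MotionControl(mouth_open=min(1.0, loudness)))
    return controls

def render_frames(controls: List[MotionControl], reference_image: str) -> List[Frame]:
    """Stage 2 stand-in: combine each control with the reference identity."""
    return [Frame(identity=reference_image, control=c) for c in controls]

# Toy input: six audio samples, two samples per video frame.
audio = [0.0, 0.0, 0.5, -0.5, 1.0, -1.0]
controls = audio_to_motion(audio, samples_per_frame=2)
frames = render_frames(controls, reference_image="photo.png")
for f in frames:
    print(f.identity, f.control.mouth_open)
```

Splitting the work this way means the identity comes only from the reference image, while the audio only drives motion. That separation is also what makes the editing and translation use cases plausible: swap the audio, keep the person.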
Video Translation
One significant application of this model is video translation.
In this scenario, VLOGGER takes an existing video in one language and adapts the lip and facial movements to synchronize with the new audio.
Cutting Edge Tech
VLOGGER can also generate rich hand gestures, which makes it interesting for content creators and influencers.
This is the first version. A few more iterations, and we may have avatar video generation far richer than what exists today (perhaps even surpassing reality?).
🧠 Mind-Controlled Gaming, and Beyond - Thanks to Neuralink
Something beautiful has happened.
Noland Arbaugh, who's paralyzed from the shoulders down, showcased how he can play chess using just his thoughts, thanks to a brain implant. He received this implant in January and mentioned that the procedure was surprisingly smooth, with no cognitive issues afterward.
He managed to play online chess and the game Civilization VI for 8 hours straight using the implant.
Neuralink's demo isn't just sci-fi cool; it's showing us the real potential of brain-computer interfaces to change lives. Forget the doom and gloom—it's time to focus on what's legit happening.
Long-term, it is possible to shunt the signals from the brain motor cortex past the damaged part of the spine to enable people to walk again and use their arms normally.
Elon Musk recently announced that Neuralink's next venture is 'Blindsight,' which aims to restore vision for those born without it.
👵 Rumor Mill
iOS 18 may include Anthropic
(Source) iOS 18 is set to bring a strong AI focus to Apple's software, with Apple potentially collaborating with AI firms like Anthropic, known for its Claude 3 models. Why wouldn’t Apple build its own AI model, given that it A) has all the resources and B) already has a team and capabilities in place? I hope it is because Apple wants not just good but exceptional AI capabilities.
OpenAI's GPT-5 release could be as early as this summer
(Source) According to recent reports and leaks, OpenAI is preparing to release GPT-5, a significant upgrade to its language model, as early as this summer. Enterprise customers who have received demos of GPT-5 describe it as "materially better" than its predecessor, GPT-4. OpenAI has hinted at innovative features, including the ability to autonomously call upon specialized AI agents to handle complex tasks.
🦃 Tweet of the Week
Education will never be the same.
People are finding some incredible use cases with Apple Vision Pro and spatial computing.
10 wild use cases:
1. Learning how heart works
— Min Choi (@minchoi)
3:38 PM • Mar 24, 2024
That was it - I hope you have a great week ahead. ❣️
I feel grateful these days for two reasons.
For one, my book is the 14th bestselling AI book on Amazon. (Feel free to push the rating further.) Second, I celebrated my birthday, and I am so happy that I could spend some time with friends—an irreplaceable feeling.
Thank you so much for reading.
Martin