🇨🇳 China’s Major AI Developments, AI Music Videos, and more
Hey there!
From now on, I will scan more frequently for Chinese AI advancements. Some very talented teams over there have built major AI products.
Additionally, AI music video generation is on fire. Also, be sure to check out FrugalGPT and other quick highlights.
Enjoy reading it in 4:45 min.
🇨🇳 China’s Major AI Developments
Most of the AI advances we celebrate come out of the U.S., but China is also making significant strides.
Among them are a language model, a video generation model, and a humanoid robot, each comparable to high-profile projects from Anthropic, OpenAI, and Tesla.
The LLM - SenseNova 5.0
SenseTime, a leading Chinese AI company, unveiled its latest multi-modal LLM, SenseNova 5.0, during its Tech Day event in Shanghai on April 23, 2024.
Trained on about 10 TB of data, SenseNova 5.0 reportedly surpassed GPT-4 Turbo and other models on several benchmarks, including MMBench, MathVista, AI2D, and ChartQA.
SenseNova 5.0 demonstrates advanced scientific and quantitative reasoning abilities. It excels in mathematical computations, coding skills, and logical reasoning.
Science will benefit massively from that. In the not-too-distant future, research scientists, post-docs, and similar professionals will likely work with LLMs as sparring partners.
Employing a Mixture of Experts architecture, SenseNova 5.0 offers an effective context window of 200k tokens during inference.
At about 110 words per second, it processes information roughly 10 times faster than GPT-4 Turbo and Claude 3 Opus. 🧨
The Video Generation Model - Vidu
Vidu is a groundbreaking text-to-video AI model developed jointly by Tsinghua University and the Chinese AI startup ShengShu Technology.
This model is capable of generating 16-second video clips at 1080p resolution from textual descriptions with a single click.
The videos incorporate complex physical effects, such as lighting and shadows, demonstrating its advanced rendering capabilities.
Vidu is built on a proprietary architecture known as Universal Vision Transformer (U-ViT), which combines elements of Transformer and Diffusion models. This architecture was developed in September 2022, predating the Diffusion Transformer architecture used by OpenAI's Sora.
In comparison, while Vidu excels in terms of speed and complexity handling, Sora maintains a slight edge in visual fidelity.
The Robot - Astribot S1
Astribot S1 is a humanoid robot designed by Stardust Intelligence, a company based in Shenzhen that was established in 2022.
Astribot S1 is noted for its speed and precision: its arms reach a top movement speed of 10 meters per second, and each arm can handle payloads of up to 10 kg, surpassing average human strength. According to the company, the robot operates autonomously without any teleoperation.
One of Astribot S1's remarkable capabilities is its ability to perform delicate tasks swiftly, such as removing a tablecloth from under a set of wine glasses without causing any disturbances.
My Commentary:
The U.S. and China are at the forefront of the global AI race. Tortoise Media's Global AI Index, which assesses nations on their AI investment, innovation, and implementation capabilities, currently ranks them No. 1 and No. 2 respectively.
For LLMs, the Mixture of Experts (MoE) architecture looks like a promising path forward. An MoE model activates only a few of its expert sub-networks for each token, so it can grow in total capacity without a matching increase in compute per token.
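For the curious, here is a minimal sketch of the routing idea behind Mixture of Experts. It is a toy illustration, not how SenseNova 5.0 or any production model is implemented: the experts are simple scaling functions, and names such as `moe_forward`, `gate_weights`, and `make_scaling_expert` are made up for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input vector x to the top_k experts and mix their outputs."""
    # Gate: one logit per expert (dot product of x with that expert's gate vector).
    logits = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    # Keep only the top_k experts; all others are never evaluated.
    ranked = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)
    chosen = ranked[:top_k]
    # Renormalise the gate scores over the chosen experts only.
    mix = softmax([logits[i] for i in chosen])
    output = [0.0] * len(x)
    for weight, idx in zip(mix, chosen):
        expert_out = experts[idx](x)
        output = [o + weight * y for o, y in zip(output, expert_out)]
    return output, chosen


def make_scaling_expert(factor):
    """Hypothetical stand-in for an expert network: scales every input value."""
    return lambda x: [factor * v for v in x]


# Assumed toy setup: four experts, a 2-dimensional input, hand-picked gate vectors.
experts = [make_scaling_expert(f) for f in (0.5, 1.0, 2.0, 3.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]

output, used = moe_forward([1.0, 2.0], gate_weights, experts, top_k=2)
print("experts used:", used)
print("mixed output:", output)
```

With `top_k=2`, only two of the four experts run for this input. That is the whole trick: the model holds many parameters but spends compute on just a fraction of them per token.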
It's impressive how fast Stardust Intelligence has managed to develop such a sophisticated product, especially considering the company was founded in 2022. If you know more, please let me know.
At the start of the year, I stated that 2024 would be the "year of the robots." To clarify, I meant that 2024 would mark the first year of many dominated by robotics. If this pace continues, robots will become as ubiquitous as smartphones within the next seven years.
Everyone will have a humanoid robot at home by 2030.
🎸 AI Music Video is Getting Wild
You can create stunning music videos by pairing AI music tools such as Suno v3 and Udio with AI image and video generators. The lyrics, the video, and the song itself are all AI-generated!
Here are my top 3.
LEFT OF ME 💀🫀
(a music video about a girl, a ferrari, and zombies)
[lyrics + montage + story] @Starhand_io
[banger] #SunoAI
[ai tools] #midjourney #leonardoai #pikalabs #stablevideo
[SOUND ON] #ai #AIart #AiArtSociety #generativeai #AIArtwork #AIArtworks #AIArtistCommunity… twitter.com/i/web/status/1…
— starhand (@Starhand_io)
1:16 PM • Apr 25, 2024
"GenAIMon" 🔊
In the near future, every kid will be able to create his own Pokémon style collection of creatures, and trade them with his friends.
With a few inputs and a touch of a button, even 10 year old will be able to spawn 1000's of them.
They will likely mix and match… twitter.com/i/web/status/1…
— Steve Mills (@SteveMills)
8:18 AM • Apr 25, 2024
“Ⱨ₳₵₭ ł₦₮Ø ₮ⱧɆ ⱤⱧɎ₮Ⱨ₥” 💻 💚 ⚡️
Many thanks to @HaiperGenAI & @udiomusic
#aicinema #aimusic
— Stone Kaiju ᯅ ⚡ ³³º¹ (@stonekaiju)
6:37 AM • Apr 21, 2024
Midjourney’s Random Feature
--sref random is a Midjourney feature that applies a randomly selected style from the abstract range of styles Midjourney can generate, so you don't have to supply a style reference image. Try it out.
Alphabet (Google) is Worth $2 Trillion Now
The company's profits surged nearly 60% year-over-year, driven by robust performances in Google Search, YouTube, and Google Cloud. Over 1,000 new Google Cloud features, enhanced by the GenAI model Gemini, also boosted this growth.
Ready for a Chatbot Version of Your Favorite Instagram Influencers?
Instagram is testing a program that allows its top influencers to interact with their followers via chatbot in direct messages.
FrugalGPT: How to Use LLMs While Reducing Cost and Improving Performance
The paper discusses the cost of querying LLMs and proposes FrugalGPT, a framework that uses LLM APIs to process natural language queries within a budget constraint. It combines prompt adaptation, LLM approximation, and LLM cascade to reduce inference cost. Very helpful for my projects to come!
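To make the "LLM cascade" idea from the FrugalGPT highlight more concrete, here is a minimal sketch: try the cheapest model first and only escalate to a pricier one when a scorer does not trust the answer. Everything here is an assumption for illustration. The models are plain Python stubs rather than real API calls, the costs are arbitrary units, and `toy_score` is a placeholder heuristic, whereas the paper uses a learned scorer.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Tier:
    name: str
    cost: float                    # assumed cost per call, arbitrary units
    answer: Callable[[str], str]   # stand-in for a real LLM API call
    threshold: float               # accept the reply if its score >= threshold


def cascade(query: str, tiers: List[Tier],
            score: Callable[[str, str], float]) -> Tuple[str, str, float]:
    """Query tiers from cheapest to priciest; stop at the first trusted reply."""
    if not tiers:
        raise ValueError("cascade needs at least one tier")
    spent = 0.0
    reply, used = "", ""
    for tier in tiers:
        reply, used = tier.answer(query), tier.name
        spent += tier.cost
        if score(query, reply) >= tier.threshold:
            break  # good enough, no need to pay for a bigger model
    # If no tier passed its threshold, the last (priciest) reply is returned.
    return reply, used, spent


# Hypothetical stub models, not real APIs.
def small_model(query: str) -> str:
    return "4" if query == "What is 2 + 2?" else "I am not sure."


def large_model(query: str) -> str:
    return "A detailed answer."


def toy_score(query: str, reply: str) -> float:
    """Placeholder reliability score: distrust replies that admit uncertainty."""
    return 0.0 if "not sure" in reply.lower() else 1.0


tiers = [
    Tier("small", cost=1.0, answer=small_model, threshold=0.5),
    Tier("large", cost=20.0, answer=large_model, threshold=0.0),
]

easy = cascade("What is 2 + 2?", tiers, toy_score)
hard = cascade("Summarize this contract.", tiers, toy_score)
print(easy)
print(hard)
```

The easy question is answered by the small stub alone, while the hard one pays for both tiers. The savings come from how often the cheap model is good enough, which is why the quality of the scorer matters so much.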
Would you like me to extract the key findings of the FrugalGPT paper (saving costs while increasing performance when querying LLMs)?
It’s a wrap.
Last Thursday, I spoke at the AI in Marketing Masterclass. It was a blast! The conversations were great, and it took place at Chelsea's stadium (Stamford Bridge). ⚽️
Feel very free to recommend the newsletter to someone you love or don’t love, but like. 💗
To an agentic future,
Martin