AI Wrap-Up: DEMO of AI mindmapping, AMZN Q, The LLM benchmark problem, etc.
Happy Friday, friend,
This time, the Friday Wrap-Up is beyond interesting. 🤝
But first, could you do us a quick favor and support our newsletter? Adding us as a contact improves our deliverability. Simply hover your mouse over our email address and add us as a contact.
Much appreciated!
Reading time is 140 sec; let’s go!
85% of all AI Projects Fail, but AE Studio Delivers
If you have a big idea and think AI should be part of it, meet AE.
We’re a development, data science and design studio working with founders and execs on custom software solutions. We turn AI/ML ideas into realities, from chatbots to NLP and more.
Tell us about your visionary concept or work challenge and we’ll make it real. The secret to our success is treating your project as if it were our own startup.
The big announcement of Amazon Q for Business and Developer: a productivity boost from an AI assistant/AI agent that leverages your custom data
Nature Medicine: an AI alert reduced deaths by 31% in high-risk hospitalized patients, and deaths from heart issues by over 90%
The first randomized trial of medical #AI to show it saves lives
ECG-AI alert in 16,000 hospitalized patients
31% reduction of mortality (absolute 7 per 100 patients) in pre-specified high-risk group
nature.com/articles/s4159…
@NatureMedicine
— Eric Topol (@EricTopol)
1:24 PM • Apr 29, 2024
Most LLM benchmarks are publicly available, and most LLMs are trained on them! It is bad practice to test AI models on data they have been trained on, because the score then rewards memorization rather than real ability.
Every data scientist knows this, and yet it seems to be industry practice nowadays. Scale AI has built a new GSM8k-style test set that state-of-the-art LLMs have not seen, and benchmarked them on it. It’s interesting to see that Anthropic's models come out significantly better than the competition.
Data contamination is a huge problem for LLM evals right now. At Scale, we created a new test set for GSM8k *from scratch* to measure overfitting and found evidence that some models (most notably Mistral and Phi) do substantially worse on this new test set compared to GSM8k.
— Hugh Zhang (@hughbzhang)
3:40 AM • May 2, 2024
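For the curious, the idea behind this kind of check can be sketched in a few lines of Python. This is only a minimal illustration, not Scale's actual method: the helper names `accuracy` and `overfit_gap` are hypothetical, and the numbers are made-up toy data, not real benchmark results. You score the same model on the public test set and on a freshly written one, then look at the drop.

```python
def accuracy(predictions, answers):
    """Fraction of predictions that exactly match the reference answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def overfit_gap(public_acc, fresh_acc):
    """Accuracy drop, in percentage points, from the public to the fresh set."""
    return round((public_acc - fresh_acc) * 100, 1)


# Made-up toy data for a hypothetical model (NOT real benchmark results).
public_answers = [18, 3, 70, 540, 20]
public_preds = [18, 3, 70, 540, 20]   # all correct on the public set
fresh_answers = [12, 45, 8, 260, 33]
fresh_preds = [12, 45, 9, 260, 30]    # two mistakes on the fresh set

public_acc = accuracy(public_preds, public_answers)
fresh_acc = accuracy(fresh_preds, fresh_answers)
gap = overfit_gap(public_acc, fresh_acc)
print(f"public: {public_acc:.0%}, fresh: {fresh_acc:.0%}, gap: {gap} points")
```

A large positive gap hints that a model may have seen the public questions during training; a gap near zero suggests its score reflects real problem-solving ability.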
[Demo] Create Interactive Mindmaps with ChatGPT: I tried it and it works like a charm (2-step process)
Elon Musk scores a huge win in China on the path to self-driving cars
Tasha Keene, "The news is that Tesla might have gotten approval in China for FSD.(...) This announcement is a big deal because China is very protective their video data, their mapping data (...) the auto market is something that really matters to China. So, to let a foreign… twitter.com/i/web/status/1…
— Alex (@alex_avoigt)
9:07 AM • May 2, 2024
Now, weekend vibes only!
Your GenerativeAI.net team 🤎
What do you think about today's episode? (Share your thoughts by replying to this email.)