The Way We Interact with AI Has Changed Substantially

Meta's Orion and advanced voice modes.

Meta’s Orion kicks off a novel way of interacting with AI, and human-level voice AI supports this development.

Plus, some OpenAI drama.

Apologies for not sharing updates on the AI agent managing my e-store this time. 🙇 There was no time.

There are a couple of high-priority things going on, one of which is the training I will give at the UN.

Let’s get started.

Image: A woman on a sofa wearing AR glasses, interacting with holographic app screens, video calls, and messages floating in front of her.

The intelligent AR glasses that will change how we interact with AI - Meta’s Orion

At Connect 2024 last week, Meta presented Orion: AR glasses whose lenses use nano-scale 3D structures to overlay virtual objects, such as displays, at different depths in front of you.

All of this while weighing less than 100 g.

The AR glasses integrate a camera system and visual AI. You can interact with the AI via voice, hand tracking, eye tracking, or by lightly tapping your fingers together twice, even if your hands are in your pockets.

How is that possible? Through a neural interface: a wrist-based device that reads the muscle signals of your hand movements and sends commands to the glasses.

They are not ready for the consumer market yet. Meta will tweak them to make them more affordable and fashionable.

You might ask why it is such a big deal.

Because the way we interact with the digital world, the internet, and AI is long overdue for change. It is a huge interruption when someone looks at their phone or Apple Watch while talking to you, even if only for a second.

It hasn’t changed since the release of the iPhone in 2007, which is an eternity in tech terms.

Apple presented the Vision Pro this year. It is good, even though I don't know a single person who owns one. However, the Vision Pro is not fit for daily use: too big, too heavy, too sci-fi to walk around in.

But even the concept of glasses itself might not be right. Not everyone likes to wear glasses. How we interact with tech and AI will not be limited to glasses like Orion. AI Pins like Humane's or the Rabbit R1 aren't it, either.

They are often borderline nonfunctional (a fair review of the R1).

What will the future of web interaction be? There has been a lot of research on it.1,2,3

The papers I read indicate that the future of web interaction runs through IoT (the Internet of Things), SWoT (the Social Web of Things, which extends IoT by integrating smart objects with social networks), tangible interfaces, edge computing, and other cross-device applications.

Smart glasses like Meta's Orion may be key, but the broader trend is towards integrated, seamless interactions between digital and physical worlds powered by smart, socially-aware devices.

A holistic solution to interacting with AI and tech will come.

Even more so because it converges with another AI development: lightweight LLMs.

Meta also just released Llama 3.2 at Meta Connect. It is a family of four models: two super-lightweight text models with 1B and 3B parameters and two multimodal models with 11B and 90B parameters.

Lightweight AI models (SLMs, small language models) are suited for edge devices: they can answer a person's request immediately or hand it off to a larger model that steers the ubiquitous AI fleet.

SLMs will not only become multimodal; they will become more capable overall. Their costs will trend asymptotically towards zero, and they will get faster thanks to underlying hardware improvements (e.g., Groq).
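To make that edge pattern concrete, here is a minimal sketch of "answer locally, escalate when needed" routing. It assumes the Hugging Face transformers library and the meta-llama/Llama-3.2-1B-Instruct checkpoint; the routing heuristic and the cloud helper are my own placeholders, not Meta's reference setup:

```python
# Minimal sketch: a small on-device model answers simple requests
# immediately; harder ones escalate to a larger hosted model.
# Assumes `pip install transformers torch` and access to the
# meta-llama/Llama-3.2-1B-Instruct checkpoint on Hugging Face.
from transformers import pipeline

# The 1B model is small enough for edge hardware (phones, wearables).
edge_model = pipeline("text-generation", model="meta-llama/Llama-3.2-1B-Instruct")

def looks_hard(query: str) -> bool:
    # Toy routing heuristic (an assumption, not a real confidence score).
    return len(query.split()) > 50 or "analyze" in query.lower()

def ask_cloud_model(query: str) -> str:
    # Hypothetical helper: wire up a hosted 90B-class endpoint here.
    raise NotImplementedError

def answer(query: str) -> str:
    if looks_hard(query):
        return ask_cloud_model(query)
    return edge_model(query, max_new_tokens=128)[0]["generated_text"]

print(answer("What's the capital of France?"))
```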

We have never had as many technological convergences as we have today.

Non-typing interaction with AI is fostered not only by the AI understanding what we want but also by it responding in the most natural way possible for humans.

Today, advanced AI voice modes are indistinguishable from human speech - two examples:

Last week, OpenAI released its next-gen voice mode. You can interrupt it mid-sentence, which is remarkable for a large language model, and it speaks all of the internet's dominant languages.

And NotebookLM by Google, which generates a podcast from a given text. I demoed this in my episode, and the quality of these AI podcast emulations is hands down superb.
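For intuition, here is a toy sketch of the underlying idea: an LLM rewrites a text as a two-host script, and a TTS voice reads each line. This is my own approximation built on OpenAI's public Python SDK, not how NotebookLM actually works; the model choice, prompt, and file naming are assumptions:

```python
# Toy approximation of "text in, two-host podcast out" -- not NotebookLM's
# actual pipeline. Assumes `pip install openai` and OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()

def podcastify(source_text: str) -> None:
    # 1) Ask an LLM to rewrite the source as a short two-host dialogue.
    script = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "Rewrite this as a lively two-host podcast dialogue, "
                       "one line per speaker, prefixed HOST_A: or HOST_B:\n\n"
                       + source_text,
        }],
    ).choices[0].message.content

    # 2) Voice each line with a different synthetic speaker.
    voices = {"HOST_A": "alloy", "HOST_B": "onyx"}
    for i, line in enumerate(script.splitlines()):
        speaker, _, text = line.partition(":")
        if speaker.strip() in voices and text.strip():
            audio = client.audio.speech.create(
                model="tts-1", voice=voices[speaker.strip()], input=text.strip()
            )
            with open(f"segment_{i:03d}.mp3", "wb") as f:
                f.write(audio.content)
```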

There are many use cases, one of which stands out for me: taking care of lonely elderly people.

Korean researchers have found:

AI speakers can alleviate loneliness and improve human-machine interaction in single-person households, especially for the elderly.

Kim, H., Hwang, S., Kim, J., & Lee, Z. 4

How do you envision we will interact with the web? Reply to this email and let me know your thoughts.

Only Sam Altman is left from the original squad

Let’s start with Sam Altman's hypocrisy.

First, he claimed he holds no equity in OpenAI and doesn't need more money: supposedly it is so hard for people to understand, because he already has enough.

I agree; his current net worth is around $2B. Wanting more - no problem for me.

Four months later, OpenAI is being legally restructured, and Sam Altman is reportedly set to receive 7% of OpenAI's shares, worth $10.5B at a $150B valuation.

Why the lies? Just be straight about it.

The restructuring aims to position OpenAI for future growth and potentially an initial public offering (IPO). Meaning, even more money.

Sam Altman remains

Mira Murati, the CTO of OpenAI, recently announced her departure from the company.

Other than that, these essential figures have already left:

  • Ilya Sutskever - Co-founder, left in May 2024 to start Safe Superintelligence.

  • Jan Leike - Co-leader of 'superalignment,' joined Anthropic.

  • Bob McGrew - Chief Research Officer, resigned alongside Murati.

  • Barret Zoph - VP of Research, left with Murati and McGrew.

  • Greg Brockman - Co-founder and President of OpenAI, is on sabbatical.

With these very strong people gone, Sam Altman is the last member of the original team.

However, I believe OpenAI remains in pole position in the AI race.

Why? For three reasons.

First, o1 is obviously the strongest AI model, and it has started a new era of AI models. By now it is clear that OpenAI knows how to get to stage 3 of the five stages of AI: true AI agents.

Second, their fresh legal setup, current market valuation, and potential IPO will flood them with money. This will continue to attract the best AI experts in the world.

Third, Sam Altman himself. Maybe he is not always honest, maybe not likeable, but when it comes to partnerships, ecosystems, and building products, he is among the top three in the world.

He has all the characteristics a CEO needs to lead OpenAI to a bright future, especially with the model foundations they have.

That’s a wrap! I hope you enjoyed it.

Image: A heart-shaped smoke trail drawn by an aircraft in a clear blue sky.

Spread love!

Martin

Our webpage

1  Mashal, I., Alsaryrah, O., Chung, T., Yang, C., Kuo, W., & Agrawal, D. (2015). Choices for interaction with things on Internet and underlying issues. Ad Hoc Networks, 28, 68-90. https://doi.org/10.1016/j.adhoc.2014.12.006.

2  Gubbi, J., Buyya, R., Marusic, S., & Palaniswami, M. (2013). Internet of Things (IoT): A vision, architectural elements, and future directions. Future Generation Computer Systems, 29(7), 1645-1660. https://doi.org/10.1016/j.future.2013.01.010.

3  Vázquez, J., & López-de-Ipiña, D. (2008). Social Devices: Autonomous Artifacts That Communicate on the Internet. In The Internet of Things (IOT 2008), Lecture Notes in Computer Science (pp. 308-324). Springer. https://doi.org/10.1007/978-3-540-78731-0_20.

4  Kim, H., Hwang, S., Kim, J., & Lee, Z. (2022). Toward Smart Communication Components: Recent Advances in Human and AI Speaker Interaction. Electronics, 11(10), 1533. https://doi.org/10.3390/electronics11101533.