🤖All things AGI: What is it and how do we get there? + AI highlights


This issue is all things AGI.

What is it, and how do we get there?

There are some challenges we know of, but what is still missing on the path to AGI is a complex puzzle. Whether you are an AGI enthusiast or not, the truth is that teams worldwide are working on it.

(Would love to hear your thoughts. Are you afraid of it? Excited? In the middle?)

Let's dive right in. AI highlights are at the end.

Enjoy reading it in 5 min. 

🤖 What are the technical challenges to AGI?

What is it that stands between us and AGI systems?

I've been contemplating AGI since I first saw Star Wars (R2-D2 and the other droids) and wrote about it extensively in my book, Generative AI: Navigating the Course to the AGI Future.

Achieving AGI will be incredibly challenging.

What is AGI (Artificial General Intelligence)?

AGI aims to replicate human cognitive abilities, enabling machines to understand, learn, and perform any intellectual task a human can. It can execute tasks it wasn't specifically trained for, and it demonstrates autonomous self-control and self-understanding in the process.

There are several important considerations and challenges ahead.

Understanding the World

One of the most significant and controversial challenges is understanding the world. This debate is epitomized by Yann LeCun and Ilya Sutskever, who sit at opposite ends of the spectrum.

LeCun argues that AI models need grounding in reality. He believes language is an approximation and too ambiguous to be sufficient for real understanding.

Sutskever contends that training AI to accurately predict the next word in diverse texts leads to learning a world model.

He argues that although it appears we are learning statistical correlations in text, this process involves learning representations of the processes that produced the text, thus understanding the world through text.
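To make Sutskever's point concrete: "predicting the next word" is just minimizing a cross-entropy loss over tokens. A minimal PyTorch-style sketch (illustrative only; random tensors stand in for a real model and corpus):

```python
# Next-token prediction in a nutshell: the model is trained to minimize
# cross-entropy between its predicted distribution and the actual next token.
import torch
import torch.nn.functional as F

vocab_size, seq_len, batch = 50_000, 128, 4
logits = torch.randn(batch, seq_len, vocab_size)          # model outputs for each position
targets = torch.randint(0, vocab_size, (batch, seq_len))  # the actual next tokens

# Sutskever's claim: driving this single number down across enough diverse text
# forces the model to internalize the processes that generated the text.
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
```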

When you provide feedback to a model, such as through RLHF, the AI takes this into account and genuinely learns.

(I would like to add here that it is learning but only to a degree.)

However, it is true that text alone cannot fully replicate real-world experiences.

For instance, reading about life in a rainforest is not the same as experiencing it firsthand. Real understanding requires direct experience and sensory interaction.

LeCun believes that autoregressive models predicting words (e.g., GPT-4, Llama 3) are insufficient.

He advocates for the Joint Embedding Predictive Architecture (JEPA), which predicts concepts rather than words. Although promising, JEPA's performance is not yet at the forefront of AI models.
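The core JEPA idea, as a minimal sketch (my own illustrative PyTorch code, not FAIR's implementation; the encoder and predictor here are toy stand-ins):

```python
# Joint-embedding predictive objective: instead of predicting raw words or pixels,
# predict the *representation* of the hidden/target part from the visible context.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    def __init__(self, dim_in=256, dim_emb=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim_in, 256), nn.ReLU(), nn.Linear(256, dim_emb))

    def forward(self, x):
        return self.net(x)

context_encoder = Encoder()      # encodes what the model can see
target_encoder = Encoder()       # encodes the hidden region (kept fixed, e.g., an EMA copy)
predictor = nn.Linear(128, 128)  # maps context embedding to a predicted target embedding

def jepa_loss(context, target):
    z_ctx = context_encoder(context)
    with torch.no_grad():        # no gradients flow into the target encoder
        z_tgt = target_encoder(target)
    z_pred = predictor(z_ctx)
    # The loss lives in embedding ("concept") space, not in word or pixel space.
    return F.mse_loss(z_pred, z_tgt)

loss = jepa_loss(torch.randn(8, 256), torch.randn(8, 256))  # dummy batch
loss.backward()
```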

To truly understand the world, AI needs sensory inputs (smell, taste, touch, etc.) and a robotic body through which to interact with and explore its environment.

Reasoning

Another current limitation is reasoning.

Even though institutions like MaiNLP and the Munich Center for Machine Learning argue that reasoning capabilities in AI are not yet fully evident, I observe them almost every day.

AI models can develop a business strategy and reason through the rationale behind their proposed plan.

However, researchers from NYU emphasize the importance of self-consistency for reliable and accurate reasoning. This is crucial for tasks requiring multi-step reasoning.
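Self-consistency is simple to apply in practice: sample several reasoning chains at non-zero temperature and keep the answer most of them agree on. A sketch, where `ask_model` is a hypothetical stand-in for whatever LLM call you use:

```python
# Self-consistency: sample multiple reasoning chains and majority-vote the answer.
from collections import Counter

def ask_model(prompt: str, temperature: float = 0.8) -> str:
    """Return the final answer from one sampled reasoning chain (plug in your LLM client)."""
    raise NotImplementedError

def self_consistent_answer(prompt: str, n_samples: int = 10) -> str:
    answers = [ask_model(prompt, temperature=0.8) for _ in range(n_samples)]
    # The answer most chains converge on tends to be the most reliable one.
    return Counter(answers).most_common(1)[0][0]
```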

An additional challenge is ensuring that AI can hold and defend its reasoning consistently when challenged by users. It cannot simply change its stance like a "voltabandiera" (Italian for "turncoat").

Persistent Memory

Persistent memory is fundamental for AGI.

Researchers from Cisco and the University of Texas at Austin, in an October 2023 paper, highlighted the need to supplement LLMs with long-term memory.

Why? This helps overcome context window limitations and creates a foundation for sustained reasoning, cumulative learning (key for improving over time), and long-term user interaction.
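In practice, "long-term memory" today usually means an external store the model can write to and retrieve from. A minimal sketch with a toy vector store; `embed` is a hypothetical placeholder for a real embedding model:

```python
# Toy long-term memory: store past interactions as embeddings and pull the most
# relevant ones back into the prompt, sidestepping the context window limit.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding function; swap in a real embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)

class LongTermMemory:
    def __init__(self):
        self.texts, self.vectors = [], []

    def add(self, text: str) -> None:
        self.texts.append(text)
        self.vectors.append(embed(text))

    def recall(self, query: str, k: int = 3) -> list[str]:
        if not self.texts:
            return []
        q = embed(query)
        sims = [float(v @ q / (np.linalg.norm(v) * np.linalg.norm(q))) for v in self.vectors]
        top = np.argsort(sims)[::-1][:k]
        return [self.texts[i] for i in top]

memory = LongTermMemory()
memory.add("User prefers concise answers with bullet points.")
context = "\n".join(memory.recall("How should I format my reply?"))
```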

Long-term memory is actively being worked on and will likely be solved in the near term.

Planning

Planning requires reasoning, memory, and a comprehensive world model to be successful. It also requires agency and a goal.

Princeton and Microsoft Research are working on improving planning skills by taking inspiration from the human brain, where planning involves the recurrent interaction of specialized modules in the prefrontal cortex.

To enhance planning, multi-agent systems can be part of the solution: one AI agent coordinates others, each focused on a specific task, drastically improving problem-solving.
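A sketch of that coordinator/worker pattern (my own illustrative code; `call_llm` is a hypothetical stand-in for any chat-completion API):

```python
# One coordinator agent decomposes the goal, delegates subtasks to specialist
# agents, then merges their outputs into a single answer.
def call_llm(system: str, user: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

SPECIALISTS = {
    "research": "You gather the relevant facts.",
    "planning": "You turn facts into a step-by-step plan.",
    "writing":  "You write the final answer clearly and concisely.",
}

def run_coordinator(goal: str) -> str:
    # 1. Coordinator assigns one subtask per specialist ("name: subtask" per line).
    plan = call_llm(
        "You are a coordinator. Assign one subtask per specialist: " + ", ".join(SPECIALISTS),
        f"Goal: {goal}\nReturn one line per specialist as 'name: subtask'.",
    )
    # 2. Each specialist works only on its own subtask.
    results = {}
    for line in plan.splitlines():
        if ":" not in line:
            continue
        name, subtask = line.split(":", 1)
        name = name.strip().lower()
        if name in SPECIALISTS:
            results[name] = call_llm(SPECIALISTS[name], subtask.strip())
    # 3. Coordinator merges the partial results.
    return call_llm(
        "You are a coordinator. Merge the specialists' outputs into one coherent answer.",
        "\n\n".join(f"{n}:\n{r}" for n, r in results.items()) + f"\n\nGoal: {goal}",
    )
```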

Multitasking

Our brains constantly multitask, regulating autonomic functions, processing sensory information, maintaining homeostasis, managing attention and emotions, controlling movements, and processing memory and language.

AI models need an architecture that supports multitasking similarly to be on par with humans.

Larger Models and Data

Increasing the size of models and the data they are trained on has led to unforeseen capabilities - the so-called emergent capabilities.

This trend should continue, incorporating multi-modal data and physical interactions.

What ChatGPT thinks an AGI looks like. Kinda cool.

To approach AGI, we know that we must at least enhance world understanding, improve reasoning capabilities, build persistent memory, advance planning abilities, enable multitasking, and train larger models on diverse data, including sensory inputs.

A new architecture will be crucial. JEPA by FAIR might be a step, but something special may still be needed.

Each aspect above has much more depth than covered here, and we didn't even touch on motivation, autonomy, and other aspects of AI systems, or any safety considerations.

For a deeper understanding of what AGI not only needs but also what this will mean for us, consider reading Generative AI: Navigating the Course to the AGI Future.

🧶 Speaking of Intelligent Systems (Not Fully AGI Yet): AE Studio Can Help You Develop Your Solutions

85% of all AI Projects Fail, but AE Studio Delivers

If you have a big idea and think AI should be part of it, meet AE.

We’re a development, data science and design studio working with founders and execs on custom software solutions. We turn AI/ML ideas into realities, from chatbots to NLP and more.

Tell us about your visionary concept or work challenge and we’ll make it real. The secret to our success is treating your project as if it were our own startup.

(Source) AI Agents on mobile devices are next level - they will respond to messages for you

(Source) xAI has raised $6 billion in Series B funding, positioning itself to compete with major AI industry players (it will be a race for talent)

(Source) Simple, concise prompting is all you need to build full-fledged games with GPT-4o

(Source) Naver's office Starbucks: 100 robots deliver coffee and items (in South Korea)

Learning that robots deliver coffee at Naver’s office at scale reminded me of my recent visit to South Korea, where a robot delivered our Kimchi soup, but a waitress handed it to us.

In the past couple of weeks, I traveled extensively, mainly for conferences. First, I was in Korea (7 hours ahead), then attended a wedding over a weekend, flew to Los Angeles (-9 hours) to visit my friend Simeon, and now I'm in Phoenix for the Generative AI Applications Summit, which I'm really looking forward to.

But my circadian rhythm doesn’t even know who I am any more. 😆 

Phoenix by night from my hotel room.

Martin

Was the show a delight, or just alright?

(You can comment after voting)
