🔹 The Future is Flow Engineering

I stand by what I said a year ago. It's becoming evident that in three years, we will outsource everything that requires low to medium cognitive competence in the virtual space. We will focus on our life strategies, our relationships, and our next business moves.

For this, we will need strong AI models (add tool use and you have AI Agents) and AI Agents that can work hand in hand.

… which leads us to Flow Engineering.

Let us examine this dynamic field in the next 6 minutes. 🤎

P.S.: Below, in the highlights, you’ll find Claude 3.5 Artifacts, Mistral AI’s update, Musk’s next move, and Ilya Sutskever’s SSI Inc. 🤠

🔹 The Future is Flow Engineering

Source: My Midjourney creation.


When we look at the next evolutionary step, which is already present to some degree but not yet fully mature, it is clear that it’ll be AI Agents.

Plenty of ideas, projects, and products can already be seen.

They will first achieve widespread adoption across industries before we go about our lives with a fleet of AI Agents in the professional and private space.

What are AI Agents?

An AI agent is a language model with tool usage. It autonomously interacts with its environment, collects data, makes decisions, and performs tasks to achieve human-set goals without constant human input. 
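To make "a language model with tool usage" concrete, here is a minimal sketch of an agent loop. Everything in it is an assumption for illustration: `fake_model` stands in for a real LLM call, and the tools `add` and `multiply` are hypothetical. The point is only the shape of the loop: decide, call a tool, observe, repeat until done.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tools: plain Python functions the agent is allowed to call.
def add(a: float, b: float) -> float:
    return a + b

def multiply(a: float, b: float) -> float:
    return a * b

TOOLS: Dict[str, Callable[..., float]] = {"add": add, "multiply": multiply}

def fake_model(goal: str, history: List[Tuple[str, float]]) -> dict:
    """Stand-in for a language model (assumption, not a real API).

    A real agent would send the goal and the history to an LLM here
    and parse the tool call out of its answer.
    """
    if not history:
        return {"action": "add", "args": (2, 3)}
    if len(history) == 1:
        return {"action": "multiply", "args": (history[-1][1], 4)}
    return {"action": "finish", "result": history[-1][1]}

def run_agent(goal: str, max_steps: int = 5) -> float:
    """Let the model pick tools until it decides the goal is reached."""
    history: List[Tuple[str, float]] = []
    for _ in range(max_steps):
        decision = fake_model(goal, history)
        if decision["action"] == "finish":
            return decision["result"]
        tool = TOOLS[decision["action"]]
        observation = tool(*decision["args"])
        history.append((decision["action"], observation))
    raise RuntimeError("agent did not finish within max_steps")

print(run_agent("compute (2 + 3) * 4"))
```

The `max_steps` limit is a design choice: without it, an agent that never decides to finish would loop forever.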

AI Agents can take on many roles: frontend developer, virtual assistant, market researcher, or content writer, to name a few.

In the chapter about AI Agents in my book GenAI-Navigating AGI, I describe a day in the life with a fleet of agents (orchestrated by your AI Chief of Staff).

The skill of AI Agents is determined by their LLM’s quality and their tool use: the number and functionality of the APIs they include.

For instance, Claude 3.5 Sonnet is currently the most powerful language model, and with its artifact feature it comes with its own workspace. See below for more.

But there is more to them. For a deep-dive, see my GenAI book.

AI Agents can work as a single agent or in a multi-agent framework. There are different team setups for multiple agents: e.g. joint chat or hierarchical chat.

What is Flow Engineering?

Flow Engineering involves designing and optimizing the interactions between AI Agents so that they achieve the best possible outcome and solve the tasks at hand.

Flow Engineering improves accuracy by iteratively refining the solution between AI Agents. For example, a coding agent’s output could be tested thoroughly by a quality assurance agent.
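A minimal sketch of that coder-plus-QA loop, under loud assumptions: both agents are stubs (no LLM involved), the task "square a number" and the candidate list are made up for illustration, and the stub coder ignores the feedback text that a real coding agent would act on.

```python
from typing import Callable, List, Optional, Tuple

Solution = Callable[[int], int]

# Hypothetical candidates the coding agent proposes, one per round.
CANDIDATES: List[Solution] = [
    lambda n: n * 2,  # buggy: doubles instead of squaring
    lambda n: n * n,  # correct
]

def coding_agent(round_no: int, feedback: Optional[str]) -> Solution:
    """Stub coder: a real agent would rewrite its code based on the feedback.

    This stand-in ignores the feedback and simply returns its next candidate.
    """
    return CANDIDATES[min(round_no, len(CANDIDATES) - 1)]

def qa_agent(solution: Solution) -> Optional[str]:
    """Stub QA agent: runs test cases and returns feedback, or None if all pass."""
    for x, expected in [(0, 0), (2, 4), (3, 9)]:
        got = solution(x)
        if got != expected:
            return f"square({x}) returned {got}, expected {expected}"
    return None

def refine(max_rounds: int = 3) -> Tuple[Solution, List[str]]:
    """Alternate between coder and QA until the tests pass."""
    feedback: Optional[str] = None
    log: List[str] = []
    for round_no in range(max_rounds):
        solution = coding_agent(round_no, feedback)
        feedback = qa_agent(solution)
        if feedback is None:
            return solution, log
        log.append(feedback)
    raise RuntimeError("no passing solution within max_rounds")

solution, log = refine()
print(log)
```

The feedback log is kept on purpose: it is the trail that shows how the solution got better from round to round.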

Flow Engineering also helps produce consistent outputs, e.g. when another AI Agent compares a result against previous results.

Dynamic AI Agent Flows can solve specific problems, but also find generalizable solutions, making an approach robust. This depends on your constraints. (Again, much of this is still in research, and people are still trying out how it is done best.)


From a paper last month by Codium AI: the flow they proposed is powerful but rigid. A good first step (looking at the results they’ve achieved), but not the end of Flow Engineering.

When we distinguish between single AI Agents performing one role and multiple AI Agents functioning as teams, it is obvious that …

… what teams can achieve, no individual can.

A well-orchestrated team, fleet, cluster, or army makes the AI perform at its best. The GPT-5, the Claude 4 Opus, the Llama 4.

It is the next evolutionary step in AI.

So knowing Flow Engineering is as important as Prompt Engineering.

Flow Engineering for multiple AI Agents involves coordinating individual AI Agents so that they work together efficiently.

Flow Engineering of a single AI Agent, by contrast, involves:

  • Pre-Processing Dialogue: Before generating the solution, the model reasons about the problem in natural language to set up a clear understanding of the requirements.

  • Self-Reflection and Reasoning: The model self-reflects (perhaps including separate feedback) to understand the problem better, reasoning about the necessary tests to break down the problem into manageable components.

  • Tests: The AI generates additional tests to validate solutions, ensuring the code works for given examples and generalizes to unseen cases.
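The three single-agent steps above can be sketched as a small staged pipeline. This is an illustration under assumptions, not the flow from any particular paper: the stage functions are stubs with hard-coded output where a real flow would call the model, and `FlowState` and all its field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

@dataclass
class FlowState:
    problem: str
    examples: List[Tuple[int, int]]  # (input, expected) pairs given with the task
    understanding: str = ""
    reflections: List[str] = field(default_factory=list)
    tests: List[Tuple[int, int]] = field(default_factory=list)
    trace: List[str] = field(default_factory=list)

def pre_process(state: FlowState) -> FlowState:
    # A real flow would ask the model to restate the problem in natural language.
    state.understanding = "Restated: " + state.problem.strip()
    state.trace.append("pre_process")
    return state

def self_reflect(state: FlowState) -> FlowState:
    # Hard-coded reflections; a real flow would get these from the model.
    state.reflections.append("Check the boundary input 0.")
    state.reflections.append("Check a negative input.")
    state.trace.append("self_reflect")
    return state

def generate_tests(state: FlowState) -> FlowState:
    # Keep the given examples and add tests motivated by the reflections.
    state.tests = list(state.examples) + [(0, 0), (-3, 3)]
    state.trace.append("generate_tests")
    return state

STAGES: List[Callable[[FlowState], FlowState]] = [pre_process, self_reflect, generate_tests]

def run_flow(state: FlowState) -> FlowState:
    """Run the stages in a fixed order, passing the state along."""
    for stage in STAGES:
        state = stage(state)
    return state

state = run_flow(FlowState("Return the absolute value of an integer.", [(5, 5)]))
print(state.trace)
print(state.tests)
```

Because the stages sit in a plain list, reordering or swapping a step means editing `STAGES`, not the runner.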

Multi-agent interaction involves:

  • Specialized Agents (the Team): Each agent performs a specific function, such as data collection, analysis, writing, project management, quality assurance, and much more. Not fully yet, but in the future, pretty much every role that exists in the real world and beyond. When orchestrated right, this leads to more efficient and accurate outcomes. (This is what Flow Engineering is about.)

  • Coordination and Orchestration: A central coordinating agent, like a chief editor or project manager, manages the workflow by assigning tasks, monitoring progress, and integrating outputs from specialized agents.

  • Iterative Refinement: The workflow is iterative, with agents refining their outputs based on feedback. For example, a Reviewer agent identifies errors, and a Reviser agent addresses them, ensuring higher quality and accuracy.

  • Memory and State Management: Agents maintain context and state across interactions. LangGraph supports cyclical flows and built-in memory, enabling agents to remember previous interactions and maintain coherence and consistency. For Claude 3.5, Artifacts play this role.
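Here is how the four ingredients above could fit together, sketched in plain Python rather than in LangGraph or any other framework. All of it is assumed for illustration: the agents are stub functions, the shared `memory` dict stands in for real state management, and the typo in the draft is planted on purpose so the Reviewer has something to find.

```python
from typing import Any, Dict

Memory = Dict[str, Any]  # shared state that every agent can read and write

def researcher(memory: Memory) -> None:
    memory["notes"] = ["flow engineering coordinates agents"]

def writer(memory: Memory) -> None:
    # Typo planted on purpose so the review cycle has work to do.
    memory["draft"] = "Flow enginering coordinates agents."

def reviewer(memory: Memory) -> None:
    memory["issues"] = ["enginering"] if "enginering" in memory["draft"] else []

def reviser(memory: Memory) -> None:
    memory["draft"] = memory["draft"].replace("enginering", "engineering")

def orchestrator(max_cycles: int = 3) -> Memory:
    """Central coordinator: assigns tasks in order, then loops review and revise."""
    memory: Memory = {"log": []}
    for agent in (researcher, writer):
        agent(memory)
        memory["log"].append(agent.__name__)
    for _ in range(max_cycles):
        reviewer(memory)
        memory["log"].append("reviewer")
        if not memory["issues"]:
            break  # the Reviewer is satisfied, stop the cycle
        reviser(memory)
        memory["log"].append("reviser")
    return memory

result = orchestrator()
print(result["log"])
print(result["draft"])
```

The review loop is the cyclical part of the flow, and `max_cycles` caps it so that a Reviewer who is never satisfied cannot stall the team.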

Nowadays, there are many different ways of working in teams, from micromanaging to macromanaging, from SCRUM to waterfall, and everything in between.

These methods and setups, and their best practices, can inform the process of Flow Engineering.

The Human-in-the-loop

Lastly, the human-in-the-loop is currently relevant (even with one agent). But long-term, the human will be leaving the system. 

AIs will talk to clients, AIs will talk to APIs, and AIs will talk to AIs. This future, which hopefully is not far away (because just communicating does not equal getting things done; it’s not the value we add), will get a lot of things done!

How to set up AI Agents? And how to Flow Engineer?

Keeping it simple here, take a look at LangChain’s LangGraph.

To an agentic future!

(Source) Anthropic released Artifacts. Using Artifacts, Claude 3.5 Sonnet can build good output.

In literally 20 seconds, incl. typing, I let it build this functional (!) blood sugar tracking app.

My own creation.


(Source)  See more interesting showcases/ use cases of Claude 3.5 Sonnet.

One of many examples: spaceX landing game. Source: https://x.com/ProperPrompter/status/1804941907363303826


(Source) Dell & Super Micro to support Musk’s xAI Supercomputer

Elon Musk has partnered with Dell and Super Micro to provide the server racks to house his greatly anticipated supercomputer, for his AI startup xAI.

(Source) Ilya Sutskever is launching Safe Superintelligence Inc., an AI startup that will prioritize safety over ‘commercial pressures.’

And… a wrap!

Spoiler alert: this week I will explore what else I can build with Claude 3.5 Sonnet (and its Artifacts).

Martin