Explainer

What are AI agents, really?

An AI agent is software that takes a goal, decides what to do next, uses tools to do it, checks its work, and reports back. That is the whole idea. Everything else is implementation detail — and where most of the confusion lives.

The simplest possible definition

A chatbot answers a question. An agent works through a task. It might search, open files, write code, call APIs, drive a browser, send an email, or wait for an event before continuing. The interesting word is “decide”: the model is choosing the next step, not just generating text.
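The loop described above can be sketched in a few lines. This is a minimal illustration, not a real framework: the “model” is a stub function, and all names (`choose_step`, `TOOLS`, `run_agent`) are made up for the example. The shape is what matters: decide, act with a tool, record the result, repeat until done.

```python
# Minimal agent loop. The decision step is stubbed; a real agent would
# call an LLM here and parse a structured tool choice from its output.

def search(query: str) -> str:
    """Stub tool: pretend to search and return a snippet."""
    return f"results for {query!r}"

def write_file(text: str) -> str:
    """Stub tool: pretend to save text and confirm."""
    return f"saved {len(text)} chars"

TOOLS = {"search": search, "write_file": write_file}

def choose_step(goal: str, history: list) -> tuple:
    """Stand-in for the model deciding what to do next,
    based on the goal and what has happened so far."""
    if not history:
        return ("search", goal)             # first: gather information
    if len(history) == 1:
        return ("write_file", history[-1])  # then: act on the result
    return ("done", None)                   # finally: stop and report

def run_agent(goal: str, max_steps: int = 5) -> list:
    history = []
    for _ in range(max_steps):              # hard cap so the loop halts
        tool, arg = choose_step(goal, history)
        if tool == "done":
            break
        result = TOOLS[tool](arg)           # use the tool
        history.append(result)              # record, so it can be checked
    return history

print(run_agent("summarize the Q3 report"))
```

Everything a production agent adds (retries, validation of tool arguments, logging, timeouts) hangs off this same skeleton; the `max_steps` cap is the one piece you should never skip, because an agent that cannot halt is a bug, not a feature.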

The four things every agent needs

The three flavors people confuse

“Agent” is a stretched word. Three different things sit underneath it:

The mistake is to use “agent” to mean the third one when you actually want the first.

What 2026 changed

Three things finally lined up. Frontier models got reliably good at multi-step planning. Tool use, structured output, and long-context support all matured. And inference got cheap enough that an agent run that cost $4 a year ago now costs cents. Result: the question changed from “can an agent do this?” to “is the agent cheaper, faster, or more accurate than what we do today?”

What works in practice today

Where the hype goes wrong

Three places, repeatedly:

How to start, today

  1. Pick one workflow with a clear time or cost metric. The more boring the better.
  2. List the smallest set of tools the agent needs. Resist adding more.
  3. Write five test cases before you write the prompt.
  4. Build the simplest version. Run it behind a human approval gate for two weeks.
  5. Look at the failures. They tell you whether to expand the agent or shrink it.