AI agent

Agentic systems are LLM-based systems that can act autonomously to achieve goals. AI agents, a modern form of software agent, are a key component of many AI applications, from personal assistants to autonomous vehicles and software automation.

Common components of an AI agent include:

  • Models

  • Skills and tools

  • Data, knowledge, and memory

  • Input and output interfaces (for example, chat, audio, or APIs)

  • Guardrails

  • Orchestration

What makes an agent

The term agent predates large language models. In computer science, a software agent is a program that acts on behalf of a user or another program with some degree of authority to decide for itself what to do. What distinguishes an agent from an ordinary program — or from a single LLM call — is a cluster of properties:

  • Autonomy — it acts without constant human intervention, deciding its own next step.

  • Reactivity — it perceives its environment and responds to it. For an LLM agent, the "environment" is reached through tools (see MCP).

  • Proactiveness — it pursues goals on its own initiative, not only in direct response to a prompt.

  • Social ability — it can communicate and coordinate with other agents (see A2A).

  • Persistence — it runs as an ongoing process working toward a goal, rather than executing once and stopping.

  • Mobility (in some definitions) — it can relocate its execution across machines.

A plain script has none of these; a single model inference has only the beginnings of reactivity. An agent is what emerges when a model is wrapped in a loop that gives it tools, goals, and the autonomy to use them — see Models, agents, and harnesses, below.
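
The "model wrapped in a loop" idea can be sketched in a few lines. This is a minimal illustration only, not any particular framework's API: `stub_model`, `TOOLS`, and the `CALL`/`FINISH` text protocol are hypothetical names invented for this example, and the scripted stub stands in for a real LLM call so the code runs offline.

```python
from typing import Callable

# Hypothetical tool registry: the agent's only way to touch its environment.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}


def stub_model(goal: str, history: list[str]) -> str:
    """Stand-in for an LLM call: returns the next action as text.

    A real model would be reached over an API; this stub is scripted so
    the example runs offline.
    """
    if not history:
        return "CALL add 2+3"        # first step: use a tool
    return f"FINISH {history[-1]}"    # then report the last observation


def run_agent(goal: str, max_steps: int = 5) -> str:
    """Loop: ask the model for an action, execute it, feed back the result."""
    history: list[str] = []
    for _ in range(max_steps):
        action = stub_model(goal, history)
        verb, _, rest = action.partition(" ")
        if verb == "FINISH":
            return rest
        if verb == "CALL":
            tool_name, _, arg = rest.partition(" ")
            history.append(TOOLS[tool_name](arg))
    return "gave up: step budget exhausted"


print(run_agent("What is 2 + 3?"))
```

The `max_steps` cap is one simple way to bound autonomy: the loop decides its own next step, but only within a fixed budget.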

This idea has a long history. Its roots reach back to Carl Hewitt’s Actor model (1973), and it was popularized by Apple’s "Knowledge Navigator" concept video (1987), then developed through decades of research into distributed artificial intelligence and multi-agent systems. Today’s LLM-based agents are a powerful new implementation of this much older idea: the LLM supplies the reasoning that earlier agents lacked.

Models, agents, and harnesses

To understand agentic systems, it helps to distinguish three closely related concepts:

  • Model: The underlying LLM — a reasoning engine that transforms input tokens into output tokens. It has no persistent state, no tools, and no memory. It exists only during inference.

  • Agent: A model running within an active context — equipped with tools, a system prompt, and a conversation history. The agent is the model in operation.

  • Harness: The infrastructure layer that creates and manages agents. It provides the tools the agent can use (file system access, shell execution, web search, and others), manages the context window, enforces guardrails, handles memory persistence, and orchestrates multi-step task execution.

The harness shapes what an agent can do far more than the choice of model alone. Two agents powered by the same model but run inside different harnesses will behave very differently, because the harness defines the agent’s action space and constraints.
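
To make the point about action space concrete, the sketch below runs one stateless "model" inside two harnesses that differ only in the tools they expose. Everything here is an assumption made for illustration: `model`, `Harness`, and the tool names are hypothetical, and the model is a scripted function rather than a real LLM.

```python
from dataclasses import dataclass, field
from typing import Callable


def model(prompt: str, tool_names: list[str]) -> str:
    """Stand-in for a stateless LLM: picks one tool from those offered.

    Scripted for illustration: it prefers 'shell' whenever it is available.
    """
    return "shell" if "shell" in tool_names else tool_names[0]


@dataclass
class Harness:
    """Owns the tools and the history; the model owns neither."""
    tools: dict[str, Callable[[str], str]]
    history: list[str] = field(default_factory=list)

    def step(self, prompt: str) -> str:
        choice = model(prompt, sorted(self.tools))
        result = self.tools[choice](prompt)
        self.history.append(f"{choice}: {result}")
        return result


# Same model, two different action spaces.
read_only = Harness(tools={"read_file": lambda p: "read-only answer"})
full = Harness(tools={
    "read_file": lambda p: "read-only answer",
    "shell": lambda p: "ran a command",
})

print(read_only.step("fix the bug"))
print(full.step("fix the bug"))
```

The model function is identical in both runs; only the harness's tool set changes, and with it the behaviour.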

There are two broad categories of harness in practice:

  • Developer-facing harnesses — Tools used by software developers to delegate coding tasks to an agent: file editing, shell execution, running tests, reading documentation. The developer specifies a goal; the harness plans and executes a sequence of steps to reach it autonomously. Examples include Claude Code, OpenCode, and Aider. These are distinct from IDE coding assistants (GitHub Copilot, Cursor), which are inline tools where the developer drives the interaction at a fine-grained level. Developer-facing harnesses operate at the task level; IDE assistants operate at the line or block level.

  • Production/infrastructure harnesses — Platforms for deploying, running, and governing agents in production systems. These manage agent lifecycles, enforce permission boundaries, wire observability pipelines, and coordinate multi-agent workflows. Examples include Microsoft Agent Framework, n8n, and OpenClaw.

Harness engineering

Because the harness shapes an agent’s behaviour more than the choice of model does, designing and operating it well is a discipline in its own right — harness engineering. A well-designed harness gives the agent its tools, context, guardrails, and feedback loops, and constrains the solution space so the agent is more likely to do the right thing, and so that when it does go wrong, the problem is caught early and cheaply. See harness engineering for a full treatment.
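
As a rough illustration of a guardrail combined with a feedback loop, the sketch below vets each proposed shell command against an allowlist and returns any rejection to the proposer so it can try again. The names `ALLOWED_COMMANDS`, `guardrail`, `run_with_feedback`, and `scripted_agent` are hypothetical, and the commands are only inspected as strings, never executed.

```python
from typing import Callable

ALLOWED_COMMANDS = {"ls", "cat", "pytest"}  # hypothetical allowlist


def guardrail(command: str) -> None:
    """Reject any command whose program name is not on the allowlist."""
    parts = command.split()
    if not parts or parts[0] not in ALLOWED_COMMANDS:
        raise PermissionError(f"blocked: {command!r}")


def run_with_feedback(
    propose: Callable[[str], str],
    check: Callable[[str], str],
    max_attempts: int = 3,
) -> str:
    """Ask for a command, vet it, and feed any failure back to the proposer.

    `propose` stands in for the agent. `check` stands in for a cheap
    validator such as a linter or test run; it returns "" on success or
    an error message otherwise.
    """
    feedback = ""
    for _ in range(max_attempts):
        command = propose(feedback)
        try:
            guardrail(command)
        except PermissionError as err:
            feedback = str(err)      # caught early, before anything runs
            continue
        feedback = check(command)
        if not feedback:
            return command
    raise RuntimeError("no acceptable command within the attempt budget")


def scripted_agent(feedback: str) -> str:
    # First proposes something destructive, then corrects itself after feedback.
    return "pytest -q" if feedback else "rm -rf build"


print(run_with_feedback(scripted_agent, check=lambda cmd: ""))
```

An allowlist is only one possible constraint; the design point is that the rejection happens before execution and is returned to the agent as input for its next attempt.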

Agent development frameworks

Frameworks and SDKs provide the infrastructure for building and orchestrating agents. It’s important to distinguish between two types:

  • Development frameworks — Code libraries and SDKs used to write agent code. Examples: LangChain, LlamaIndex, Semantic Kernel. These are tools for building agents.

  • Agent harnesses — Runtime environments and orchestration systems that execute agents, manage their lifecycle, control permissions, provide observability, and coordinate multi-agent workflows. Some of these blur the line by offering both development and runtime capabilities. Examples: Microsoft Agent Framework, n8n, OpenClaw.

Many frameworks require a harness to run agents in production, while some frameworks include built-in harness capabilities. The distinction helps clarify whether you’re looking for a tool to write agent code or an environment to run agents. Designing and operating these harnesses is a discipline in its own right — see harness engineering.

General-purpose development frameworks

  • LangChain — Code library for building agents and LLM-powered applications. Emphasizes model interoperability, rapid prototyping, and an extensive ecosystem of integrations. Available in Python and JavaScript/TypeScript. Its LangGraph component handles multi-agent orchestration.

  • Pydantic AI — Typed agent framework from the team behind Pydantic. Uses Pydantic models to enforce safe, predictable, structured outputs from LLMs. Python.

  • PocketFlow — Deliberately minimalist agent framework whose core is around 100 lines of code. Models agentic applications as a graph of nodes, with no heavy dependencies.

  • Microsoft Semantic Kernel — Code library for building and orchestrating AI agents and multi-agent systems. Emphasizes plugin architecture and enterprise features. Available in Python, .NET, and Java.
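
The "graph of nodes" style mentioned in the list above can be sketched in plain Python. This mirrors the general idea only and is not the API of PocketFlow or any other listed framework; the node functions, `GRAPH`, and `run_graph` are hypothetical names for this example.

```python
from typing import Callable, Optional

# A node reads and updates a shared state dict, then names the next node
# (or returns None to stop).
Node = Callable[[dict], Optional[str]]


def plan(state: dict) -> Optional[str]:
    state["steps"] = ["draft", "review"]
    return "work"


def work(state: dict) -> Optional[str]:
    done = state.setdefault("done", [])
    done.append(state["steps"].pop(0))
    # Loop back to this node until no steps remain.
    return "work" if state["steps"] else "finish"


def finish(state: dict) -> Optional[str]:
    state["summary"] = " -> ".join(state["done"])
    return None


GRAPH: dict[str, Node] = {"plan": plan, "work": work, "finish": finish}


def run_graph(start: str, state: dict, max_hops: int = 20) -> dict:
    """Follow edges from `start` until a node returns None."""
    current: Optional[str] = start
    for _ in range(max_hops):
        if current is None:
            return state
        current = GRAPH[current](state)
    raise RuntimeError("graph did not terminate within max_hops")


print(run_graph("plan", {})["summary"])
```

In a real agent, a node such as `work` would typically wrap a model call or tool invocation; here the nodes are plain functions so the control flow is easy to follow.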

Data and RAG frameworks

  • LlamaIndex — Code library specialized in data ingestion, organization, and retrieval-augmented generation (RAG). Bridges enterprise data and LLM capabilities with 300+ integration packages.

  • Haystack — Open-source framework from deepset for building search, RAG, and agent pipelines from modular, swappable components. Python.

  • Docling — Document-ingestion library that parses PDF, DOCX, HTML, and other formats into structured output for RAG pipelines. A building block rather than an agent framework in its own right.
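
Retrieval-augmented generation, as mentioned in the list above, boils down to two steps: find the most relevant stored text, then place it in the prompt. The sketch below uses simple word overlap for ranking, which is an assumption made to keep the example dependency-free; production systems usually rank with embeddings or a search index. `DOCS`, `retrieve`, and `build_prompt` are hypothetical names and do not correspond to LlamaIndex, Haystack, or Docling APIs.

```python
import re

DOCS = {  # hypothetical corpus
    "harness": "A harness provides tools and guardrails to an agent.",
    "model": "A model transforms input tokens into output tokens.",
    "rag": "Retrieval-augmented generation adds retrieved text to the prompt.",
}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query; return the top k ids."""
    q = tokenize(query)
    ranked = sorted(DOCS, key=lambda d: len(q & tokenize(DOCS[d])), reverse=True)
    return ranked[:k]


def build_prompt(query: str) -> str:
    """Prepend the retrieved passages to the question."""
    context = "\n".join(DOCS[d] for d in retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"


print(retrieve("What tools does a harness provide?"))
```

The string returned by `build_prompt` is what would be sent to the model, so the model answers from the retrieved passage instead of relying only on its training data.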

Vendor-specific SDKs and harnesses

  • Anthropic Claude SDK / Agent SDK

  • OpenAI Agents SDK — Code library from OpenAI for building agents with their models.

  • Google Agent Development Kit (ADK) — Code library from Google for agent development.

  • Microsoft Agent Framework — Production-grade harness for running multi-agent workflows. Includes workflow orchestration, observability, YAML-based agent definitions, and cloud deployment options. Available in Python and C#/.NET.

Low-code platforms and foundations

  • n8n — Low-code automation harness with an AI starter kit for bootstrapping and running agent workflows.

  • OpenClaw — Open-source, self-hosted agent harness with 15+ messaging channel integrations and support for 40+ LLM providers.

  • Hermes Agent — OpenClaw-like agent harness, noted for its speed and self-improving capabilities.

  • Pi — Code library providing minimal core components for building custom agents. Requires extensions such as pi-schedule-prompt and @pi-agents/loop to implement autonomous tasks through recurring prompts, cron-style jobs, and dynamic pacing. Can also be used out of the box as a basic terminal-based coding assistant.

See also AI coding assistants, many of which have agentic properties, too.

Agent orchestration platforms

Above individual agents sits the orchestration layer, which coordinates teams of agents working together toward shared goals — managing scheduling, delegation, budgets, and governance across a whole system of autonomous workers. Platforms in this space include Claude Cowork, AG2, CrewAI, LangGraph, MetaGPT, and SuperAGI. See AI agent orchestration for the full landscape.
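
A toy version of delegation under a budget might look like the following. It is a sketch under stated assumptions rather than how any listed platform works: `Worker`, `orchestrate`, the skill labels, and the integer cost units are all hypothetical, and each worker is a plain function standing in for an agent.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Worker:
    name: str
    skill: str
    cost: int                       # hypothetical cost units per task
    run: Callable[[str], str]


def orchestrate(
    tasks: list[tuple[str, str]],
    workers: list[Worker],
    budget: int,
) -> list[str]:
    """Delegate each (skill, payload) task to a matching worker within budget."""
    by_skill = {w.skill: w for w in workers}
    log: list[str] = []
    for skill, payload in tasks:
        worker = by_skill.get(skill)
        if worker is None:
            log.append(f"skipped {payload!r}: no worker for {skill}")
        elif worker.cost > budget:
            log.append(f"skipped {payload!r}: budget exhausted")
        else:
            budget -= worker.cost
            log.append(f"{worker.name}: {worker.run(payload)}")
    return log


workers = [
    Worker("researcher", "search", cost=2, run=lambda p: f"notes on {p}"),
    Worker("writer", "draft", cost=3, run=lambda p: f"draft of {p}"),
]
tasks = [("search", "agents"), ("draft", "report"), ("draft", "appendix")]

for line in orchestrate(tasks, workers, budget=5):
    print(line)
```

With a budget of 5, the first two tasks are delegated and the third is skipped, which shows where an orchestration layer can enforce governance that no single agent sees.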

Benchmarks

  • Terminal-Bench measures how well AI agents, meaning a harness paired with a given model, complete tasks in terminal environments.

