Agents use AI technology in a few different ways:
The Plan
Agents are typically initiated by a textual command like “create an invoice for the order and send it to the buyer via email.” Agents take this text and use LLMs to create a step-by-step execution plan. The plan is validated, corrected where validation fails, and then executed. Using an LLM for planning lets the agent convert natural-language text into programmatic decisions.
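The validation step can be illustrated with a minimal sketch. It assumes the LLM returns the plan as a JSON list of steps. That plan format, the action names, and the helper functions are all hypothetical, chosen only for illustration, and the model response is hard-coded rather than fetched from a real model.

```python
import json

# Hypothetical stand-in for an LLM response: a plan serialized as JSON.
LLM_PLAN_JSON = """
[
  {"action": "create_invoice", "args": {"order_id": "A-100"}},
  {"action": "send_email", "args": {"to": "buyer@example.com"}}
]
"""

# Actions this agent knows how to execute (hypothetical names).
ALLOWED_ACTIONS = {"create_invoice", "send_email"}


def parse_plan(raw):
    """Parse the model's text output into a list of step dictionaries."""
    steps = json.loads(raw)
    if not isinstance(steps, list):
        raise ValueError("plan must be a list of steps")
    return steps


def validate_plan(steps):
    """Return a list of problems; an empty list means the plan is valid."""
    problems = []
    for i, step in enumerate(steps):
        if not isinstance(step, dict):
            problems.append(f"step {i}: not an object")
            continue
        action = step.get("action")
        if action not in ALLOWED_ACTIONS:
            problems.append(f"step {i}: unknown action {action!r}")
        if not isinstance(step.get("args"), dict):
            problems.append(f"step {i}: args must be an object")
    return problems


steps = parse_plan(LLM_PLAN_JSON)
problems = validate_plan(steps)
```

If the list of problems is not empty, one option is to send it back to the model and ask for a corrected plan before anything is executed.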
Tools
As a plan executes, AI agents can leverage other software tools (like APIs and databases) through two primary mechanisms: LLMs generating executable code (tool builders) or producing structured configurations (such as function calling specifications). This provides a flexible framework for AI systems to integrate with and control external software resources.
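The structured-configuration route can be sketched as follows. The example assumes the model emits a JSON object with a tool name and its arguments. The field names ("name", "arguments"), the get_order_total tool, and the dispatch helper are hypothetical and do not correspond to any particular vendor's function-calling format.

```python
import json


def get_order_total(order_id: str) -> float:
    """Hypothetical tool: look up an order total (hard-coded for the sketch)."""
    totals = {"A-100": 249.5}
    return totals[order_id]


# Registry mapping tool names the model may request to Python callables.
TOOLS = {"get_order_total": get_order_total}


def dispatch(call_json: str):
    """Run the tool named in a model-produced call specification."""
    call = json.loads(call_json)
    name = call["name"]
    if name not in TOOLS:
        # Refuse anything that is not explicitly registered.
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


# Stand-in for what a model might emit when asked for an order total.
result = dispatch('{"name": "get_order_total", "arguments": {"order_id": "A-100"}}')
```

Keeping an explicit registry means the model can only trigger functions the agent author has chosen to expose, which is harder to guarantee when the model generates executable code directly.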
Text Reasoning & Manipulation
Agents almost always use LLMs to analyze or transform text. Common tasks include breaking text into sections, summarizing, formatting, and rewriting in a specific style.
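In practice, each of these tasks often reduces to pairing an instruction with the input text. The sketch below shows one way an agent might build such prompts. The template wording, the task keys, and the build_prompt helper are assumptions made for illustration, and the call to an actual model is left out.

```python
# Hypothetical instruction templates, one per text task mentioned above.
TASK_TEMPLATES = {
    "sections": "Split the following text into titled sections.",
    "summarize": "Summarize the following text in three sentences.",
    "format": "Reformat the following text as a bulleted list.",
    "restyle": "Rewrite the following text in a formal tone.",
}


def build_prompt(task: str, text: str) -> str:
    """Combine a task instruction with the input text, fenced by delimiters."""
    if task not in TASK_TEMPLATES:
        raise ValueError(f"unsupported task: {task}")
    return f"{TASK_TEMPLATES[task]}\n\n---\n{text.strip()}\n---"


prompt = build_prompt("summarize", "  The order shipped on Monday.  ")
```

The delimiters around the input make it easier for the model to tell the instruction apart from the text it should operate on.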
Modality
Agents often use AI models for voice, images, video, and other domain-specific problems. For example, an agent might take a document as input and produce a spoken-audio summary as output.
Agents leverage LLMs to perform the ‘thinking’ for many of their tasks as well as to interact with the outside world by reading, speaking, listening, etc.
Text Processing
Fundamentally, LLMs were designed to be language processing engines. This is what they’re great at, and what they should be used for.