Agents¶
1. Why this matters¶
Chains are deterministic DAGs — the steps are fixed at build time. Agents are dynamic: the LLM decides at each step what to do next based on what it's seen so far. This lets you build assistants that handle open-ended tasks like:
- "Find the latest LangChain release notes and summarize the breaking changes."
- "Look up customer X, check their open tickets, and draft a reply."
- "Calculate this, search that, then write a report."
You couldn't write a single chain for those without overfitting to one example.
2. Mental model¶
The agent loop = ReAct: Reason, Act, Observe, repeat.
flowchart TD
Q[User question] --> T[LLM: Think — what should I do?]
T -->|need a tool| A[Call tool]
A --> O[Observation: tool result]
O --> T
T -->|enough info| F[Final answer]
F --> R[User]
Each iteration:
1. Reason — LLM looks at the conversation so far + tool results + the question.
2. Act — emits a tool_call (or decides it's done).
3. Observe — your code runs the tool, appends ToolMessage to history.
4. Repeat until the LLM stops calling tools.
A hard cap (max iterations) prevents runaway loops.
3. Architecture / Flow¶
The state machine inside create_react_agent:
flowchart TB
Start([Start]) --> Agent[Agent Node:<br/>LLM with bound tools]
Agent -->|tool_calls present| Tools[Tools Node:<br/>execute each tool_call]
Tools --> Agent
Agent -->|no tool_calls| End([END: return final answer])
That's it — two nodes, conditional edge based on whether the AI message contains tool calls.
This is exactly the kind of cyclic graph LangChain's plain Runnable chains can't express, which is why agents now live in LangGraph.
4. Core concepts¶
- Agent loop — the think→act→observe cycle, capped by max iterations.
create_react_agent(model, tools, ...)— the modern factory function (fromlanggraph.prebuilt). Returns a compiled LangGraph agent.- System prompt — sets agent personality and rules (when to use tools, how to format output).
- Checkpointer — persistent state across invocations. Lets agents resume, replay, or be paused for human approval.
- Recursion limit — hard cap on iterations (default ~25) so a buggy agent can't burn through tokens.
- Tool error handling —
handle_tool_errors=Truelets the agent see error strings and self-correct.
5. Code — minimal working example¶
The modern way — LangGraph's create_react_agent:
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
@tool
def add(a: int, b: int) -> int:
"""Add two integers."""
return a + b
@tool
def multiply(a: int, b: int) -> int:
"""Multiply two integers."""
return a * b
agent = create_react_agent(
model=ChatOpenAI(model="gpt-4o-mini", temperature=0),
tools=[add, multiply],
)
result = agent.invoke({
"messages": [("user", "What is (3 + 4) * 5?")]
})
print(result["messages"][-1].content) # "(3 + 4) * 5 = 35"
6. Code — real-world pattern¶
Web-search agent with persistent memory and a system prompt:
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import MemorySaver
from langchain_openai import ChatOpenAI
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.tools import tool
import requests
search = TavilySearchResults(max_results=4)
@tool
def get_weather(city: str) -> str:
"""Get current weather for a city. Use ONLY for weather questions."""
r = requests.get(f"https://wttr.in/{city}?format=3", timeout=5)
return r.text
agent = create_react_agent(
model=ChatOpenAI(model="gpt-4o-mini", temperature=0),
tools=[search, get_weather],
prompt=(
"You are a research assistant. Use tools when you need fresh data. "
"Always cite sources. If unsure, say so."
),
checkpointer=MemorySaver(), # in-memory state — swap for Postgres/Redis in prod
)
config = {"configurable": {"thread_id": "user-42"}}
print(agent.invoke(
{"messages": [("user", "What's the weather in Paris and the latest news there?")]},
config=config,
)["messages"][-1].content)
# Follow-up uses the same thread, so the agent remembers prior context
print(agent.invoke(
{"messages": [("user", "And in Tokyo?")]},
config=config,
)["messages"][-1].content)
Stream the agent's intermediate steps (great for UIs and debugging):
for event in agent.stream(
{"messages": [("user", "What is 23 * 17, then add 100?")]},
config=config,
stream_mode="updates",
):
print(event)
# Yields: tool calls, tool results, final answer — in order.
Legacy AgentExecutor (still in docs, deprecated for new code):
# DEPRECATED — kept here for recognition, not recommendation.
from langchain.agents import AgentExecutor, create_react_agent # langchain (not langgraph)
from langchain import hub
prompt = hub.pull("hwchase17/react")
agent = create_react_agent(llm=ChatOpenAI(), tools=[search], prompt=prompt)
executor = AgentExecutor(agent=agent, tools=[search], verbose=True, max_iterations=5)
executor.invoke({"input": "..."})
Prefer the LangGraph version above for anything new.
7. Common pitfalls¶
- ❗ Using agents when a chain would do. Agents are slower, costlier, and harder to debug. If the steps are known, write a chain.
- ❗ No
recursion_limit. A buggy agent can loop forever. Always cap iterations (LangGraph defaults to 25; lower for prod). - ❗ Vague tool descriptions. Agents pick wrong tools when descriptions overlap. Sharpen them.
- ❗ No human-in-the-loop for destructive tools.
delete_user,send_email,transfer_funds— these need a confirmation step. LangGraph supports interrupts cleanly. - ❗ In-memory checkpointer in production. State vanishes on restart. Use
PostgresSaver/SqliteSaver. - ❗ No tracing. Agents are emergent — without LangSmith you can't debug why it took 4 wrong turns.
- ❗ 15+ tools. Selection accuracy drops fast. Either route requests to a sub-agent with a smaller tool set, or split into multiple agents.
8. When to use vs not use¶
| Use an agent when | Use a plain chain when |
|---|---|
| Steps to solve aren't known in advance | The pipeline is the same for every request |
| The LLM needs to decide if and which tool to call | You know which tools to call and in what order |
| You want exploratory behavior | You need predictable latency / cost |
| Tasks span multiple domains | Single domain, simple I/O |
For anything beyond a basic single-agent loop — multi-agent, human-in-the-loop, durable workflows, complex routing — drop into LangGraph directly. create_react_agent is the easy on-ramp; LangGraph itself is what you graduate to.
9. Cheatsheet¶
# Modern — LangGraph
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import MemorySaver
from langgraph.checkpoint.postgres import PostgresSaver # production
agent = create_react_agent(
model=ChatOpenAI(model="gpt-4o-mini", temperature=0),
tools=[tool_a, tool_b],
prompt="System message string OR a callable that builds messages",
checkpointer=MemorySaver(), # persistent state
interrupt_before=["tools"], # human-in-the-loop
debug=True,
)
# Invoke
result = agent.invoke(
{"messages": [("user", "your question")]},
config={"configurable": {"thread_id": "abc"},
"recursion_limit": 15},
)
# Stream
for ev in agent.stream({"messages": [...]}, config=cfg, stream_mode="updates"):
...
# Inspect / replay
state = agent.get_state(cfg) # current state
history = agent.get_state_history(cfg) # all checkpoints
agent.update_state(cfg, {...}) # rewrite state (advanced)
# Legacy — kept only for understanding old code
from langchain.agents import (
AgentExecutor,
create_react_agent, # NOT the same as langgraph's
create_tool_calling_agent,
create_openai_tools_agent,
)
# AgentExecutor(agent=..., tools=..., max_iterations=..., handle_parsing_errors=True)
10. Q&A — recall test¶
-
Q: What is the ReAct pattern? A: A loop: the model Reasons (thinks aloud), Acts (calls a tool), Observes the result, then reasons again until it can give a final answer.
-
Q: When NOT to use an agent? A: Whenever you know the steps. Agents add cost, latency, and unpredictability. A deterministic LCEL chain is better when the pipeline is fixed.
-
Q:
create_react_agentinlangchain.agentsvslanggraph.prebuilt— which one? A:langgraph.prebuilt.create_react_agent. Thelangchain.agentsversion +AgentExecutoris deprecated in favor of the LangGraph implementation. -
Q: How do you cap an agent so it can't loop forever? A: Pass
config={"recursion_limit": N}to.invoke()/.stream(). Default is ~25; in production set lower (5–10) and surface a clear error if exceeded. -
Q: What does a checkpointer do? A: Persists agent state (message history, scratchpad, partial results) so the agent can be resumed across restarts, paused for human approval, or replayed for debugging.
-
Q: When is a single agent not enough? A: When the task spans distinct expertise (research + coding + writing), or needs > ~10 tools, or has long autonomous loops. That's multi-agent / supervisor-worker territory — build it as a LangGraph graph directly.
Practice¶
What does this print?
Expected: True
Set a max iteration limit so the agent doesn't loop forever
Expected: True
Quiz — Quick check¶
What you remember
Q1. What does a ReAct agent do at each step?
- Reasons about what to do (thought), calls a tool (action), observes the result, repeats until done
- Generates an answer directly
- Trains itself
- Caches results
Why: ReAct = Reasoning + Acting. The agent thinks "I need to know X", calls a tool to find X, sees the result, decides next. The loop continues until the agent has enough info to answer.
Q2. Why set a max_iterations limit on an agent?
- To save memory
- To prevent infinite loops when the agent gets confused or hits an unrecoverable error
- Required by LangChain
- To improve accuracy
Why: An agent can get stuck calling the same tool repeatedly, or interpret tool errors in a loop. Set
max_iterations=10(or similar) as a safety net. Better to fail with "couldn't complete" than to spin forever.
Q3. When should you build a multi-agent system instead of a single agent?
- When the task spans distinct expertise (research + coding + writing) or needs >10 tools or has long autonomous loops
- Never — single agents are always better
- When latency matters
- Required for production
Why: Multi-agent splits responsibility. A "research agent" with web tools and a "coding agent" with code tools each gets fewer, sharper tools — better tool choice quality. Build as a LangGraph graph for control.
Common doubts¶
ReAct, OpenAI function-calling, or tool-calling — which?
Modern OpenAI/Anthropic native tool-calling is more reliable than text-based ReAct. Use it when available. LangChain's create_tool_calling_agent wraps this. ReAct (text-based) is still useful for models that don't support tool-calling natively.
Why are agents so slow and expensive?
Each iteration is a full LLM call. A typical agent task takes 5-15 iterations = 5-15 API calls. Slow because LLMs are slow; expensive because each call has prompt + response tokens. Use cheaper models (gpt-4o-mini) for routing, expensive ones for hard reasoning.
Should I use LangChain or LangGraph for agents?
LangGraph for new projects. LangChain's older agent abstractions (AgentExecutor) are simpler but less flexible. LangGraph gives you explicit state, branching, retries, and human-in-the-loop. The LangChain team is moving everyone to LangGraph.