The AI landscape has shifted dramatically. While chatbots dominated for years, we're now witnessing something far more powerful: autonomous AI agents that don't just respond, but plan, execute, and accomplish goals.
Chatbot vs AI Agent
| Aspect    | Chatbot                      | AI Agent                                              |
|-----------|------------------------------|-------------------------------------------------------|
| Purpose   | Respond to prompts           | Achieve goals autonomously                            |
| Behavior  | Reactive (one-shot)          | Proactive (multi-step)                                |
| Planning  | None                         | Breaks goals into subtasks                            |
| Tools     | No external tools            | Uses APIs, DBs, code execution                        |
| Memory    | Limited context              | Short-term + long-term + episodic                     |
| Iteration | None (single response)       | ReAct loop until complete                             |
| Example   | "How to analyze sales data?" | Actually queries DB, creates charts, generates report |
ReAct Loop: The Cognitive Backbone
At its core, an agent runs a ReAct (Reason + Act) loop: the LLM produces a Thought about what to do next, takes an Action by calling a tool, receives an Observation containing the tool's result, and repeats until it can give a final answer or hits an iteration limit.
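Conceptually the loop fits in a few lines. Here is a minimal, framework-free sketch (not LangChain's implementation); `llm` is a hypothetical stand-in for any callable that takes the transcript so far and returns the model's next step as text:

```python
# Minimal, framework-free ReAct loop sketch. `llm` is a hypothetical callable;
# `tools` maps tool names to plain Python functions.
def parse_action(step: str) -> tuple[str, str]:
    """Pull 'Action: <tool>' and 'Action Input: <arg>' lines from model text."""
    fields = {}
    for line in step.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return fields.get("Action", ""), fields.get("Action Input", "")

def react_loop(goal: str, llm, tools: dict, max_iterations: int = 5) -> str:
    transcript = f"Goal: {goal}\n"
    for _ in range(max_iterations):
        step = llm(transcript)                  # Reason: Thought + proposed Action
        transcript += step + "\n"
        if "Final Answer:" in step:             # model signals completion
            return step.split("Final Answer:")[-1].strip()
        name, arg = parse_action(step)          # Act: pick and call a tool
        observation = tools[name](arg) if name in tools else f"Unknown tool: {name}"
        transcript += f"Observation: {observation}\n"  # Observe: feed result back
    return "Stopped: iteration limit reached"
```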
AI Agent System Architecture
Around that loop sits the rest of the system: a planner that decomposes the goal into subtasks, a tool layer (APIs, databases, code execution), and memory, both short-term conversational context and long-term storage. The sections below build up each of these pieces.
Building an Agent with LangChain
```python
from langchain.agents import Tool, AgentExecutor, create_react_agent
from langchain_openai import ChatOpenAI
from langchain import hub
from langchain_community.tools import DuckDuckGoSearchRun
import pandas as pd

# Initialize LLM
llm = ChatOpenAI(model="gpt-4", temperature=0)

# Define tools
def analyze_sales(query: str) -> str:
    """Analyze sales data from database."""
    df = pd.read_csv("sales_q3.csv")
    if "total" in query.lower():
        return f"Total Q3 sales: ${df['amount'].sum():,.2f}"
    elif "top" in query.lower():
        top_products = df.groupby('product')['amount'].sum().nlargest(5)
        return f"Top products: {top_products.to_dict()}"
    else:
        return df.describe().to_string()

def create_chart(data: str) -> str:
    """Create visualization from data."""
    return "Chart created: sales_chart.png"

# Register tools
tools = [
    Tool(
        name="AnalyzeSales",
        func=analyze_sales,
        description="Analyze Q3 sales data. Input: query like 'total' or 'top products'"
    ),
    Tool(
        name="CreateChart",
        func=create_chart,
        description="Create visualization. Input: data description"
    ),
    DuckDuckGoSearchRun()
]

# Create agent
prompt = hub.pull("hwchase17/react")
agent = create_react_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    max_iterations=5
)

# Execute
result = agent_executor.invoke({
    "input": "Analyze Q3 sales, find top products, and create a chart"
})
print(result["output"])
```
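With `verbose=True`, the executor prints each Thought, Action, and Observation as it happens, making the ReAct loop visible during development. In practice it is also worth passing `handle_parsing_errors=True` to `AgentExecutor`, so malformed model output is fed back to the LLM as an error message instead of raising an exception.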
Multi-Agent System with LangGraph
```python
from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated
import operator

# Define state. The operator.add annotation makes `results` an append-only
# channel: each node returns just its new items, and LangGraph concatenates
# them onto the existing list.
class AgentState(TypedDict):
    task: str
    plan: list[str]
    current_step: int
    results: Annotated[list, operator.add]
    final_output: str

# Specialized agents. Each node returns a partial state update rather than
# mutating and returning the full state; with the operator.add reducer,
# returning the whole accumulated list would duplicate earlier results.
def planner_agent(state: AgentState) -> dict:
    """Break task into subtasks."""
    plan = [
        "Query database for Q3 sales",
        "Analyze top performing products",
        "Create visualization",
        "Generate presentation"
    ]
    return {"plan": plan, "current_step": 0}

def data_analyst_agent(state: AgentState) -> dict:
    """Execute data analysis."""
    result = "Q3 sales: $1.2M, top product: Widget A"
    return {"results": [result], "current_step": state["current_step"] + 1}

def visualization_agent(state: AgentState) -> dict:
    """Create charts and graphs."""
    result = "Created: bar_chart.png, trend_line.png"
    return {"results": [result], "current_step": state["current_step"] + 1}

def presentation_agent(state: AgentState) -> dict:
    """Generate final presentation."""
    return {"final_output": "presentation.pptx created"}

# Build workflow graph
workflow = StateGraph(AgentState)
workflow.add_node("planner", planner_agent)
workflow.add_node("data_analyst", data_analyst_agent)
workflow.add_node("visualizer", visualization_agent)
workflow.add_node("presentation", presentation_agent)
workflow.set_entry_point("planner")
workflow.add_edge("planner", "data_analyst")
workflow.add_edge("data_analyst", "visualizer")
workflow.add_edge("visualizer", "presentation")
workflow.add_edge("presentation", END)
app = workflow.compile()

# Execute
result = app.invoke({
    "task": "Analyze Q3 sales and create presentation",
    "plan": [],
    "current_step": 0,
    "results": [],
    "final_output": ""
})
print(result["final_output"])
```
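This graph is a fixed pipeline, but LangGraph's real strength is branching and cycles. As a hedged sketch, the fixed visualizer → presentation edge above could be replaced with `add_conditional_edges` (a real LangGraph API) to loop back until every step of the plan is done; the routing function here is illustrative:

```python
# Sketch: dynamic routing instead of the fixed edge. The routing function
# below is an illustrative assumption, not part of the example above.
def route_next(state: AgentState) -> str:
    if state["current_step"] >= len(state["plan"]):
        return "presentation"   # plan exhausted: wrap up
    return "data_analyst"       # otherwise keep executing plan steps

workflow.add_conditional_edges(
    "visualizer",               # after visualizing, decide where to go next
    route_next,
    {"data_analyst": "data_analyst", "presentation": "presentation"}
)
```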
Memory Implementation with Vector DB
```python
from langchain.memory import ConversationBufferMemory, VectorStoreRetrieverMemory
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

# Short-term memory (full conversation buffer; use ConversationTokenBufferMemory
# or ConversationSummaryBufferMemory if you need a token cap)
short_term_memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)

# Long-term memory (vector store)
embeddings = OpenAIEmbeddings()
vectorstore = Chroma(
    collection_name="agent_memory",
    embedding_function=embeddings,
    persist_directory="./memory"
)
long_term_memory = VectorStoreRetrieverMemory(
    retriever=vectorstore.as_retriever(search_kwargs={"k": 5}),
    memory_key="long_term_context"
)

# Store information
long_term_memory.save_context(
    {"input": "What were our Q3 sales goals?"},
    {"output": "Q3 goal was $1.5M, we achieved $1.2M (80% of target)"}
)

# Retrieve relevant memories
relevant_memories = long_term_memory.load_memory_variables(
    {"prompt": "How did we perform this quarter?"}
)
print(relevant_memories)
```
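To hand both memory types to a single chain, LangChain's `CombinedMemory` can merge them. A brief sketch, assuming the two objects defined above:

```python
from langchain.memory import CombinedMemory

# Expose both memories to one chain: each contributes its own prompt variable
# ("chat_history" and "long_term_context" from above), giving the agent recent
# conversation plus semantically relevant past facts.
memory = CombinedMemory(memories=[short_term_memory, long_term_memory])
print(memory.memory_variables)  # ['chat_history', 'long_term_context']
```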
Production Frameworks Comparison
| Framework       | Best For                      | Key Features                              | Learning Curve |
|-----------------|-------------------------------|-------------------------------------------|----------------|
| LangChain       | General-purpose agents        | Huge ecosystem, many tools, RAG           | Medium         |
| LangGraph       | Complex multi-agent workflows | State management, cycles, checkpoints     | Medium-High    |
| AutoGen         | Multi-agent conversations     | Group chat, code execution, human-in-loop | Medium         |
| CrewAI          | Role-based teams              | Simple API, agent collaboration           | Low            |
| LlamaIndex      | Data-focused agents           | RAG, indexing, query engines              | Low-Medium     |
| Semantic Kernel | Enterprise .NET/Python        | Microsoft-backed, planner, plugins        | Medium         |
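As a taste of CrewAI's "simple API" claim, here is a hedged sketch of its role-based style, built on its Agent/Task/Crew primitives (exact parameters may differ across versions; check the CrewAI docs):

```python
from crewai import Agent, Task, Crew

# Role-based agents: each gets a role, a goal, and a backstory that shapes
# its behavior. (Sketch only; parameters are from memory, not verified.)
analyst = Agent(
    role="Data Analyst",
    goal="Analyze Q3 sales and surface the top products",
    backstory="A meticulous analyst who loves clean numbers."
)
task = Task(
    description="Summarize Q3 sales performance",
    expected_output="A short summary with totals and top products",
    agent=analyst
)
crew = Crew(agents=[analyst], tasks=[task])
print(crew.kickoff())
```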
Best Practices
- Limit iterations: cap runs at 5-10 steps to prevent infinite loops
- Tool descriptions matter: clear, specific descriptions help the LLM choose the right tool
- Add human-in-the-loop: require approval for destructive actions
- Implement retry logic: tools can fail; handle errors gracefully (see the sketch after this list)
- Monitor costs: every LLM call costs money; track usage
- Use structured output: Pydantic models make tool calls reliable (see the sketch after this list)
- Log everything: trace reasoning and actions for debugging
- Start simple: build a single agent first, add multi-agent later
- Test edge cases: what happens if a tool fails, or the LLM hallucinates?
- Memory pruning: periodically clean out old, irrelevant memories
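A minimal sketch combining the retry and structured-output practices (the Pydantic model and retry helper are illustrative, not from a specific framework):

```python
from pydantic import BaseModel, Field
import time

# Structured tool input: the model must produce these exact fields, which is
# far more reliable than parsing free-form text.
class SalesQuery(BaseModel):
    metric: str = Field(description="'total' or 'top_products'")
    quarter: str = Field(description="e.g. 'Q3'")

# Simple retry wrapper for flaky tools (illustrative; libraries like tenacity
# offer richer policies).
def with_retries(tool, attempts: int = 3, delay: float = 1.0):
    def wrapped(*args, **kwargs):
        for attempt in range(1, attempts + 1):
            try:
                return tool(*args, **kwargs)
            except Exception as exc:
                if attempt == attempts:
                    # Return the error as text so the agent can react to it
                    return f"Tool failed after {attempts} attempts: {exc}"
                time.sleep(delay * attempt)  # linear backoff between attempts
    return wrapped

safe_analyze = with_retries(lambda q: f"Analyzed: {q}")
print(safe_analyze(SalesQuery(metric="total", quarter="Q3")))
```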
Production Considerations
- Latency: each ReAct iteration is one LLM call (~2-5 s); complex tasks can take minutes.
- Cost: a 10-iteration task on GPT-4 runs roughly $0.30-0.50; multiply that by thousands of users.
- Reliability: LLMs can fail, get stuck, or hallucinate; you need fallbacks.
- Security: agents execute code and call APIs; sandbox and validate everything.
- Observability: use LangSmith, Weights & Biases, or custom logging.
- Rate limits: OpenAI and Anthropic enforce strict limits; implement queues (see the sketch after this list).
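One simple way to respect provider rate limits is to pace requests client-side. A minimal, illustrative sketch (an interval pacer, not a production-grade queue):

```python
import threading
import time

class RateLimiter:
    """Enforce a minimum spacing between calls: at most `rate` per `per` seconds."""
    def __init__(self, rate: int, per: float = 60.0):
        self.interval = per / rate            # minimum spacing between calls
        self.next_allowed = time.monotonic()
        self.lock = threading.Lock()

    def acquire(self) -> None:
        with self.lock:
            now = time.monotonic()
            if now < self.next_allowed:
                time.sleep(self.next_allowed - now)   # wait for the next slot
            self.next_allowed = max(now, self.next_allowed) + self.interval

# Gate every LLM call through the limiter, e.g. 60 requests per minute:
limiter = RateLimiter(rate=60, per=60.0)

def rate_limited_invoke(executor, payload):
    limiter.acquire()                         # blocks until a slot is free
    return executor.invoke(payload)
```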