Why Multi-Agent in Letta?
Letta’s persistent memory architecture makes it uniquely powerful for multi-agent systems. Unlike frameworks where agents lose state after each call, Letta agents maintain:
- Persistent identity — each agent has a consistent persona and history
- Shared memory blocks — agents can read/write shared state without external storage
- Cross-agent messaging — agents communicate asynchronously
This enables architectures where specialized agents collaborate over time, each maintaining their own expertise and context while sharing relevant knowledge.
Core Multi-Agent Concepts
In Letta, multi-agent collaboration happens through:
- Agent-to-Agent Messaging — one agent sends a message to another
- Shared Memory Blocks — agents read from the same memory block
- Orchestration Agents — a manager agent that routes tasks to sub-agents
- Tool Calls to Other Agents — wrapping agents as callable tools
Setting Up a Multi-Agent System
from letta import create_client
from letta.schemas.memory import ChatMemory
from letta.schemas.llm_config import LLMConfig
from letta.schemas.embedding_config import EmbeddingConfig
client = create_client() # connects to local Letta server
# Shared LLM config
llm_config = LLMConfig(
    model="gpt-4o-mini",
    model_endpoint_type="openai",
    model_endpoint="https://api.openai.com/v1",
    context_window=128000,
)
embed_config = EmbeddingConfig(
    embedding_model="text-embedding-3-small",
    embedding_endpoint_type="openai",
    embedding_endpoint="https://api.openai.com/v1",
    embedding_dim=1536,
)
Pattern 1: Orchestrator + Specialist Agents
The most common pattern: one orchestrator routes tasks to specialist agents.
# Create a shared memory block for task status
shared_state = client.create_block(
    label="shared_state",
    value="Task queue: []\nCompleted: []\nErrors: []",
    limit=2000,
)
# Create specialist agents
research_agent = client.create_agent(
    name="researcher",
    system=(
        "You are a specialized research agent. "
        "When asked to research a topic, search for information, "
        "summarize key findings, and store them in your archival memory. "
        "Report back with a structured summary."
    ),
    memory=ChatMemory(
        human="",
        persona="I am a research specialist with deep web search capabilities.",
    ),
    block_ids=[shared_state.id],  # shared access
    llm_config=llm_config,
    embedding_config=embed_config,
)
writer_agent = client.create_agent(
    name="writer",
    system=(
        "You are a specialized content writer. "
        "When given research findings, turn them into clear, engaging articles. "
        "Store completed drafts in your archival memory."
    ),
    memory=ChatMemory(
        human="",
        persona="I am a professional technical writer with a clear, direct style.",
    ),
    block_ids=[shared_state.id],
    llm_config=llm_config,
    embedding_config=embed_config,
)
# Orchestrator agent
orchestrator = client.create_agent(
    name="orchestrator",
    system=(
        "You are the task orchestrator. You coordinate research_agent and writer_agent. "
        "For any content creation task:\n"
        "1. First send the research task to the researcher\n"
        "2. Once research is done, send findings to the writer\n"
        "3. Compile the final output"
    ),
    memory=ChatMemory(
        human="",
        persona="I am an orchestrator who manages a team of specialized agents.",
    ),
    block_ids=[shared_state.id],
    llm_config=llm_config,
    embedding_config=embed_config,
)
Sending Messages Between Agents
# Step 1: Orchestrator receives the user task
orchestrator_response = client.send_message(
    agent_id=orchestrator.id,
    message="Create an article about LLM memory management techniques.",
    role="user",
)
print("Orchestrator:", orchestrator_response.messages[-1].text)
# Step 2: Research agent gets the research task
research_response = client.send_message(
    agent_id=research_agent.id,
    message="Research LLM memory management techniques. Cover: types of memory, popular frameworks, and best practices. Summarize in 500 words.",
    role="user",
)
research_findings = research_response.messages[-1].text
print("Research findings:", research_findings[:300])
# Step 3: Writer agent gets the research and produces content
writer_response = client.send_message(
    agent_id=writer_agent.id,
    message=f"Write a 1000-word article based on this research:\n\n{research_findings}",
    role="user",
)
final_article = writer_response.messages[-1].text
print("Final article:", final_article[:500])
Pattern 2: Shared Memory Blocks
Shared memory blocks are Letta's most distinctive multi-agent feature: multiple agents read and write the same mutable state, with no external storage layer required.
# Create a shared project memory block
project_memory = client.create_block(
    label="project_memory",
    value="""
# Project: AI Agent Cookbook Articles
## Assigned Topics
- LLM memory management: ASSIGNED to researcher
- RAG pipeline optimization: ASSIGNED to researcher
## Research Complete
(none yet)
## Articles Written
(none yet)
""",
    limit=5000,
)
# Both agents share this block
agent_a = client.create_agent(
    name="agent_a",
    block_ids=[project_memory.id],
    # ... other config
)
agent_b = client.create_agent(
    name="agent_b",
    block_ids=[project_memory.id],
    # ... other config
)
# Agent A updates the shared memory block
client.send_message(
    agent_id=agent_a.id,
    message="Update the project memory to mark 'LLM memory management' research as complete with key findings.",
    role="user",
)
# Agent B can now read Agent A's updates
client.send_message(
    agent_id=agent_b.id,
    message="Check the project memory and write an article for the first completed research topic.",
    role="user",
)
Pattern 3: Agents as Tools
Wrap agents as callable tools for seamless orchestration:
from letta.schemas.tool import Tool
def call_research_agent(topic: str) -> str:
    """
    Call the research agent to investigate a topic.
    Returns a structured research summary.

    Args:
        topic: The topic to research

    Returns:
        Research findings as a string
    """
    response = client.send_message(
        agent_id=research_agent.id,
        message=f"Research: {topic}",
        role="user",
    )
    return response.messages[-1].text
# Register as a tool
research_tool = client.create_tool(call_research_agent)
# Attach to orchestrator
client.add_tool_to_agent(
    agent_id=orchestrator.id,
    tool_id=research_tool.id,
)
# Now the orchestrator can call the research agent as a tool
orchestrator_response = client.send_message(
    agent_id=orchestrator.id,
    message="Create a comprehensive guide on RAG pipelines. Use your tools.",
    role="user",
)
Conversation History Across Agents
A key advantage of Letta multi-agent systems: each agent maintains its full conversation history independently.
# Check what each agent remembers
research_memory = client.get_core_memory(research_agent.id)
writer_memory = client.get_core_memory(writer_agent.id)
print("Researcher core memory:", research_memory.get_block("human").value)
print("Writer core memory:", writer_memory.get_block("human").value)
# Search archival memory of an agent
research_archive = client.get_archival_memory(
    agent_id=research_agent.id,
    query="LLM memory management",
    limit=5,
)
for passage in research_archive:
    print(passage.text[:200])
Practical Example: Editorial Team
# Build a 3-agent editorial team
agents = {
    "researcher": client.create_agent(
        name="editorial_researcher",
        system="Research topics thoroughly. Store findings in archival memory.",
        # ...
    ),
    "writer": client.create_agent(
        name="editorial_writer",
        system="Write clear technical articles from research briefs.",
        # ...
    ),
    "editor": client.create_agent(
        name="editorial_editor",
        system="Review articles for accuracy, clarity, and completeness. Return edited version.",
        # ...
    ),
}
# Pipeline
def create_article(topic: str) -> str:
    # Stage 1: Research
    research = client.send_message(
        agent_id=agents["researcher"].id,
        message=f"Research: {topic}",
        role="user",
    ).messages[-1].text
    # Stage 2: Write
    draft = client.send_message(
        agent_id=agents["writer"].id,
        message=f"Write article based on:\n{research}",
        role="user",
    ).messages[-1].text
    # Stage 3: Edit
    final = client.send_message(
        agent_id=agents["editor"].id,
        message=f"Edit this article:\n{draft}",
        role="user",
    ).messages[-1].text
    return final
article = create_article("Letta memory architecture for production AI agents")
print(article)
Frequently Asked Questions
How do I limit what each agent can see in shared memory?
Shared blocks are all-or-nothing per block: any agent attached to a block can read and write all of it. For finer-grained control, split state across several blocks and attach each agent only to the blocks it should see. For example, Agent A gets block_a and block_shared, while Agent B gets block_b and block_shared.
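One way to sketch that split in plain Python, using placeholder block IDs (block_a, block_b, block_shared are illustrative names, not real Letta IDs):

```python
# Map each agent to the block IDs it should be created with.
AGENT_BLOCKS = {
    "agent_a": ["block_a", "block_shared"],
    "agent_b": ["block_b", "block_shared"],
}

def blocks_for(agent_name: str) -> list:
    """Return the block IDs to pass as block_ids= when creating this agent."""
    return AGENT_BLOCKS[agent_name]

# Only block_shared is visible to both agents; block_a and block_b stay private.
shared = set(blocks_for("agent_a")) & set(blocks_for("agent_b"))
```

Each agent's private block never appears in the other agent's `block_ids`, so the access boundary is enforced at creation time.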
Can Letta agents run in parallel?
Yes. client.send_message() is synchronous but you can use Python’s asyncio or ThreadPoolExecutor to call multiple agents concurrently. The Letta server handles concurrent agent state safely.
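A minimal fan-out sketch with ThreadPoolExecutor; ask_agent is a stub here standing in for a wrapper around client.send_message, which in real code would block on the Letta server:

```python
from concurrent.futures import ThreadPoolExecutor

# Stub for a wrapper around client.send_message(agent_id=..., message=...).
def ask_agent(agent_id: str, message: str) -> str:
    return f"{agent_id} handled: {message}"

agent_ids = ["researcher", "writer", "editor"]

# Send the same prompt to several agents concurrently; map preserves order.
with ThreadPoolExecutor(max_workers=len(agent_ids)) as pool:
    results = list(pool.map(lambda aid: ask_agent(aid, "status report"), agent_ids))
```

Because each call targets a different agent, there is no contention over a single agent's state; the server serializes updates per agent.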
What happens if a sub-agent fails?
Exceptions from client.send_message() propagate normally. Wrap calls in try/except and implement retry logic. The orchestrator agent can also be instructed via its system prompt to handle sub-agent failures gracefully.
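A minimal retry wrapper along those lines; `send` stands in for any zero-argument closure over client.send_message:

```python
import time

def call_with_retry(send, attempts: int = 3, delay: float = 1.0) -> str:
    """Call a sub-agent, retrying on failure with a fixed delay between tries.

    `send` is a zero-argument callable, e.g.
    lambda: client.send_message(agent_id=..., message=..., role="user").
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return send()
        except Exception as exc:  # retry any sub-agent failure
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay)
    raise RuntimeError("sub-agent failed after retries") from last_error
```

A fixed delay keeps the sketch short; exponential backoff is a common refinement for production use.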
How many agents can share a memory block?
There’s no hard limit — any number of agents can reference the same block ID. However, concurrent writes to the same block may conflict. For high-concurrency systems, use separate blocks with periodic merge operations.
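One illustrative merge helper (plain Python, not a Letta API) that combines per-agent block values into a single view, assuming each agent writes only to its own block:

```python
def merge_blocks(block_values: dict) -> str:
    """Combine {agent_name: block_value} into one markdown-style document.

    Sorting by agent name makes the merged output deterministic.
    """
    sections = []
    for agent, value in sorted(block_values.items()):
        sections.append(f"## {agent}\n{value.strip()}")
    return "\n\n".join(sections)

merged = merge_blocks({
    "researcher": "Research Complete: LLM memory management",
    "writer": "Articles Written: (none yet)",
})
```

The merged string could then be written back to a single summary block on whatever schedule the system needs.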
Next Steps
- Letta Tool Use and External Integrations — Equip your multi-agent system with external capabilities
- Letta Deployment and Production — Deploy your agent network in production
- CrewAI Multi-Agent Workflows — Compare with CrewAI’s team-based approach