
Letta Multi-Agent Collaboration: Build Agent Networks

#letta #multi-agent #agent-networks #collaboration #memgpt #orchestration

Why Multi-Agent in Letta?

Letta’s persistent memory architecture makes it uniquely powerful for multi-agent systems. Unlike frameworks where agents lose state after each call, Letta agents maintain:

  • Persistent identity — each agent has a consistent persona and history
  • Shared memory blocks — agents can read/write shared state without external storage
  • Cross-agent messaging — agents communicate asynchronously

This enables architectures where specialized agents collaborate over time, each maintaining their own expertise and context while sharing relevant knowledge.

Core Multi-Agent Concepts

In Letta, multi-agent collaboration happens through:

  1. Agent-to-Agent Messaging — one agent sends a message to another
  2. Shared Memory Blocks — agents read from the same memory block
  3. Orchestration Agents — a manager agent that routes tasks to sub-agents
  4. Tool Calls to Other Agents — wrapping agents as callable tools

Setting Up a Multi-Agent System

from letta import create_client
from letta.schemas.memory import ChatMemory
from letta.schemas.llm_config import LLMConfig
from letta.schemas.embedding_config import EmbeddingConfig

client = create_client()  # connects to local Letta server

# Shared LLM config
llm_config = LLMConfig(
    model="gpt-4o-mini",
    model_endpoint_type="openai",
    model_endpoint="https://api.openai.com/v1",
    context_window=128000,
)

embed_config = EmbeddingConfig(
    embedding_model="text-embedding-3-small",
    embedding_endpoint_type="openai",
    embedding_endpoint="https://api.openai.com/v1",
    embedding_dim=1536,
)

Pattern 1: Orchestrator + Specialist Agents

The most common pattern: one orchestrator routes tasks to specialist agents.

# Create a shared memory block for task status
shared_state = client.create_block(
    label="shared_state",
    value="Task queue: []\nCompleted: []\nErrors: []",
    limit=2000,
)

# Create specialist agents
research_agent = client.create_agent(
    name="researcher",
    system=(
        "You are a specialized research agent. "
        "When asked to research a topic, search for information, "
        "summarize key findings, and store them in your archival memory. "
        "Report back with a structured summary."
    ),
    memory=ChatMemory(
        human="",
        persona="I am a research specialist with deep web search capabilities.",
    ),
    block_ids=[shared_state.id],  # shared access
    llm_config=llm_config,
    embedding_config=embed_config,
)

writer_agent = client.create_agent(
    name="writer",
    system=(
        "You are a specialized content writer. "
        "When given research findings, turn them into clear, engaging articles. "
        "Store completed drafts in your archival memory."
    ),
    memory=ChatMemory(
        human="",
        persona="I am a professional technical writer with a clear, direct style.",
    ),
    block_ids=[shared_state.id],
    llm_config=llm_config,
    embedding_config=embed_config,
)

# Orchestrator agent
orchestrator = client.create_agent(
    name="orchestrator",
    system=(
        "You are the task orchestrator. You coordinate research_agent and writer_agent. "
        "For any content creation task:\n"
        "1. First send the research task to the researcher\n"
        "2. Once research is done, send findings to the writer\n"
        "3. Compile the final output"
    ),
    memory=ChatMemory(
        human="",
        persona="I am an orchestrator who manages a team of specialized agents.",
    ),
    block_ids=[shared_state.id],
    llm_config=llm_config,
    embedding_config=embed_config,
)

Sending Messages Between Agents

# Step 1: Orchestrator receives the user task
orchestrator_response = client.send_message(
    agent_id=orchestrator.id,
    message="Create an article about LLM memory management techniques.",
    role="user",
)
print("Orchestrator:", orchestrator_response.messages[-1].text)

# Step 2: Research agent gets the research task
research_response = client.send_message(
    agent_id=research_agent.id,
    message="Research LLM memory management techniques. Cover: types of memory, popular frameworks, and best practices. Summarize in 500 words.",
    role="user",
)
research_findings = research_response.messages[-1].text
print("Research findings:", research_findings[:300])

# Step 3: Writer agent gets the research and produces content
writer_response = client.send_message(
    agent_id=writer_agent.id,
    message=f"Write a 1000-word article based on this research:\n\n{research_findings}",
    role="user",
)
final_article = writer_response.messages[-1].text
print("Final article:", final_article[:500])

Pattern 2: Shared Memory Blocks

Shared memory blocks are Letta's most powerful multi-agent feature: multiple agents can read and write the same mutable state, without any external storage.

# Create a shared project memory block
project_memory = client.create_block(
    label="project_memory",
    value="""
# Project: AI Agent Cookbook Articles

## Assigned Topics
- LLM memory management: ASSIGNED to researcher
- RAG pipeline optimization: ASSIGNED to researcher

## Research Complete
(none yet)

## Articles Written
(none yet)
""",
    limit=5000,
)

# Both agents share this block
agent_a = client.create_agent(
    name="agent_a",
    block_ids=[project_memory.id],
    # ... other config
)

agent_b = client.create_agent(
    name="agent_b",
    block_ids=[project_memory.id],
    # ... other config
)

# Agent A updates the shared memory block
client.send_message(
    agent_id=agent_a.id,
    message="Update the project memory to mark 'LLM memory management' research as complete with key findings.",
    role="user",
)

# Agent B can now read Agent A's updates
client.send_message(
    agent_id=agent_b.id,
    message="Check the project memory and write an article for the first completed research topic.",
    role="user",
)

Pattern 3: Agents as Tools

Wrap agents as callable tools for seamless orchestration:

def call_research_agent(topic: str) -> str:
    """
    Call the research agent to investigate a topic.
    Returns a structured research summary.

    Args:
        topic: The topic to research
    Returns:
        Research findings as a string
    """
    response = client.send_message(
        agent_id=research_agent.id,
        message=f"Research: {topic}",
        role="user",
    )
    return response.messages[-1].text

# Register as a tool
research_tool = client.create_tool(call_research_agent)

# Attach to orchestrator
client.add_tool_to_agent(
    agent_id=orchestrator.id,
    tool_id=research_tool.id,
)

# Now orchestrator can call the research agent as a tool
orchestrator_response = client.send_message(
    agent_id=orchestrator.id,
    message="Create a comprehensive guide on RAG pipelines. Use your tools.",
    role="user",
)

Conversation History Across Agents

A key advantage of Letta multi-agent systems: each agent maintains its full conversation history independently.

# Check what each agent remembers
research_memory = client.get_core_memory(research_agent.id)
writer_memory = client.get_core_memory(writer_agent.id)

print("Researcher core memory:", research_memory.get_block("human").value)
print("Writer core memory:", writer_memory.get_block("human").value)

# Search archival memory of an agent
research_archive = client.get_archival_memory(
    agent_id=research_agent.id,
    query="LLM memory management",
    limit=5,
)
for passage in research_archive:
    print(passage.text[:200])

Practical Example: Editorial Team

# Build a 3-agent editorial team
agents = {
    "researcher": client.create_agent(
        name="editorial_researcher",
        system="Research topics thoroughly. Store findings in archival memory.",
        # ...
    ),
    "writer": client.create_agent(
        name="editorial_writer",
        system="Write clear technical articles from research briefs.",
        # ...
    ),
    "editor": client.create_agent(
        name="editorial_editor",
        system="Review articles for accuracy, clarity, and completeness. Return edited version.",
        # ...
    ),
}

# Pipeline
def create_article(topic: str) -> str:
    # Stage 1: Research
    research = client.send_message(
        agent_id=agents["researcher"].id,
        message=f"Research: {topic}",
        role="user",
    ).messages[-1].text

    # Stage 2: Write
    draft = client.send_message(
        agent_id=agents["writer"].id,
        message=f"Write article based on:\n{research}",
        role="user",
    ).messages[-1].text

    # Stage 3: Edit
    final = client.send_message(
        agent_id=agents["editor"].id,
        message=f"Edit this article:\n{draft}",
        role="user",
    ).messages[-1].text

    return final

article = create_article("Letta memory architecture for production AI agents")
print(article)

Frequently Asked Questions

How do I limit what each agent can see in shared memory?

Shared blocks are all-or-nothing per block. For fine-grained access control, create separate blocks and attach a different set to each agent: Agent A gets block_a and block_shared; Agent B gets block_b and block_shared.
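A minimal sketch of that layout, as plain data before wiring it into create_agent(block_ids=...). The block ids here are hypothetical placeholders; in practice they come from client.create_block(...) as in the examples above:

```python
# Hypothetical block ids; in practice, use the .id of created blocks.
block_a, block_b, block_shared = "blk-a", "blk-b", "blk-shared"

# Each agent is attached only to its private block plus the shared one.
agent_blocks = {
    "agent_a": [block_a, block_shared],
    "agent_b": [block_b, block_shared],
}

def visible_blocks(agent_name):
    """Return the set of block ids an agent can read or write."""
    return set(agent_blocks[agent_name])
```

Each list would be passed as block_ids when creating the corresponding agent, so neither agent ever sees the other's private block.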

Can Letta agents run in parallel?

Yes. client.send_message() is synchronous, but you can use Python’s asyncio or a ThreadPoolExecutor to call multiple agents concurrently. The Letta server handles concurrent agent state safely.
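A minimal fan-out sketch with ThreadPoolExecutor. Here send_to_agent is a hypothetical stand-in for a thin wrapper around client.send_message:

```python
from concurrent.futures import ThreadPoolExecutor

def fan_out(send_to_agent, tasks):
    """Call several agents concurrently; return replies in task order.

    send_to_agent: callable (agent_id, message) -> str; in a real system
    this would wrap client.send_message (hypothetical stand-in here).
    tasks: list of (agent_id, message) tuples.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        futures = [pool.submit(send_to_agent, aid, msg) for aid, msg in tasks]
        # Collect in submission order, so results line up with tasks.
        return [f.result() for f in futures]
```

Because each result is gathered in submission order, the replies line up with the original task list even though the agents run concurrently.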

What happens if a sub-agent fails?

Exceptions from client.send_message() propagate normally. Wrap calls in try/except and implement retry logic. The orchestrator agent can also be instructed via its system prompt to handle sub-agent failures gracefully.
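A retry wrapper along those lines might look like this; send is a hypothetical stand-in for a function wrapping client.send_message:

```python
import time

def call_with_retry(send, agent_id, message, retries=3, backoff=1.0):
    """Retry a sub-agent call with exponential backoff.

    send: callable (agent_id, message) -> str; in a real system this
    would wrap client.send_message (hypothetical stand-in here).
    """
    for attempt in range(retries):
        try:
            return send(agent_id, message)
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries: let the caller handle it
            time.sleep(backoff * 2 ** attempt)
```

Transient failures are absorbed up to the retry budget; the final exception still propagates so the orchestration code can decide what to do.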

How many agents can share a memory block?

There’s no hard limit — any number of agents can reference the same block ID. However, concurrent writes to the same block may conflict. For high-concurrency systems, use separate blocks with periodic merge operations.
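One way to sketch the merge step, assuming each agent keeps line-oriented notes in its own block: read the per-agent block values, combine them while dropping duplicate lines, then write the result back to the shared block. The helper below is a hypothetical illustration of the merge step only:

```python
def merge_block_values(block_values):
    """Merge several agents' block texts into one, keeping first occurrences.

    block_values: list of block value strings, one per agent.
    Returns the combined text with blank and duplicate lines removed.
    """
    seen, merged = set(), []
    for value in block_values:
        for line in value.splitlines():
            if line.strip() and line not in seen:
                seen.add(line)
                merged.append(line)
    return "\n".join(merged)
```

Running this periodically keeps each agent writing to its own block at full speed while still producing one consolidated view.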
