What Is MetaGPT?
MetaGPT is an open-source multi-agent framework with a unique twist: it models a software company, not just an AI assistant. Each agent plays a specific corporate role — Product Manager, Architect, Engineer, QA Engineer — and they collaborate via a shared message board to build software from a single natural-language requirement.
You give MetaGPT one sentence: "Create a snake game in Python". It outputs:
- A PRD (Product Requirements Document)
- A system design document with architecture decisions
- API specifications
- Fully working Python code
- Unit tests
- A test report
All generated sequentially, by different agents, each checking the previous agent’s work.
MetaGPT was introduced in a 2023 paper by Sirui Hong et al. The repository reached 30,000+ GitHub stars within months, making it one of the fastest-growing AI agent projects on GitHub. The framework has since evolved significantly and is actively maintained.
The Software Company Metaphor
The key insight behind MetaGPT is that human teams use Standard Operating Procedures (SOPs). A product manager doesn’t just hand a vague idea to engineers — there’s a requirements document, design review, code review, and QA process. These SOPs exist because they prevent errors and miscommunication.
MetaGPT encodes the same SOPs for AI agents:
User Requirement
↓
Product Manager → PRD (requirements)
↓
Architect → System Design + API specs
↓
Engineer(s) → Code implementation
↓
QA Engineer → Unit tests + test report
Each role passes structured, typed outputs to the next — not raw text. This structured communication is what makes MetaGPT more reliable than “have one agent do everything.”
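The hand-off pattern above can be sketched in plain Python. This is a toy illustration of SOP-style typed outputs, not MetaGPT's actual API: each stage consumes a structured artifact and produces the next one, so downstream roles never have to parse free-form text. All class and function names here are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PRD:
    requirement: str
    user_stories: list[str]

@dataclass
class Design:
    prd: PRD
    components: list[str]

@dataclass
class Code:
    design: Design
    files: dict[str, str]

def product_manager(requirement: str) -> PRD:
    # Turns a one-line requirement into a structured requirements artifact
    return PRD(requirement, user_stories=[f"As a user, I want {requirement}"])

def architect(prd: PRD) -> Design:
    # Reads the PRD and decides on components
    return Design(prd, components=["cli", "core", "tests"])

def engineer(design: Design) -> Code:
    # Implements exactly the components the design specifies
    return Code(design, files={c + ".py": f"# implements {c}" for c in design.components})

code = engineer(architect(product_manager("a snake game")))
print(sorted(code.files))  # → ['cli.py', 'core.py', 'tests.py']
```

Because each stage receives a typed object rather than raw text, a malformed hand-off fails loudly instead of silently corrupting the next stage's input.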
Built-in Roles
Product Manager
Turns a one-line requirement into a detailed PRD with:
- User stories
- Competitive analysis
- Feature requirements
- UI/UX notes
Architect
Reads the PRD and produces:
- System design (components, data flow)
- Technology stack recommendations
- API interface definitions
Engineer
Reads the design documents and writes:
- Implementation code (Python by default; JavaScript or other languages if specified)
- Code that matches the API specs exactly
QA Engineer
Writes unit tests for the code and runs them, producing a test report.
Installation
Requirements: Python 3.9+ and Node.js 16+ (Node.js is used to render the Mermaid diagrams in the generated design documents)
pip install metagpt
Initialize the configuration file:
metagpt --init-config
This creates ~/.metagpt/config2.yaml. Edit it to add your API key:
llm:
  api_type: "openai"
  model: "gpt-4o-mini"
  api_key: "sk-your-key-here"
Or for Anthropic Claude:
llm:
  api_type: "anthropic"
  model: "claude-sonnet-4-20250514"
  api_key: "sk-ant-your-key-here"
Running Your First Project
Command Line
metagpt "Create a CLI tool that converts Markdown files to HTML"
MetaGPT will create a workspace directory with all the generated files:
workspace/
  cli_markdown_converter/
    docs/
      prd.md
      system_design.md
      api_spec.md
    src/
      converter.py
      utils.py
    tests/
      test_converter.py
    README.md
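Once a run completes, you can inspect the generated artifacts programmatically. A minimal sketch using pathlib (the directory names mirror the layout above but are illustrative; your actual workspace path depends on the project name MetaGPT derives from your requirement):

```python
import tempfile
from pathlib import Path

def list_workspace(root: str) -> list[str]:
    """Return all generated files relative to the workspace root, sorted."""
    base = Path(root)
    return sorted(str(p.relative_to(base)) for p in base.rglob("*") if p.is_file())

# Demonstrate with a mock workspace mirroring the layout above:
with tempfile.TemporaryDirectory() as tmp:
    for rel in ["docs/prd.md", "src/converter.py", "tests/test_converter.py"]:
        f = Path(tmp, rel)
        f.parent.mkdir(parents=True, exist_ok=True)
        f.write_text("placeholder")
    files = list_workspace(tmp)

print(files)  # → ['docs/prd.md', 'src/converter.py', 'tests/test_converter.py']
```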
Python API
import asyncio

from metagpt.software_company import generate_repo, ProjectRepo

async def main():
    repo: ProjectRepo = await generate_repo(
        "Create a REST API with FastAPI for a todo list app"
    )
    print(repo)  # prints the directory structure and files

asyncio.run(main())
Using Individual Roles
You can use MetaGPT roles individually instead of running the full company:
import asyncio

from metagpt.context import Context
from metagpt.roles import Engineer, ProductManager, QaEngineer

async def main():
    context = Context()
    pm = ProductManager(context=context)
    engineer = Engineer(context=context)
    qa = QaEngineer(context=context)

    # Generate PRD
    prd = await pm.run("Create a URL shortener service")
    print("PRD generated")

    # Generate code from PRD
    code = await engineer.run(prd)
    print("Code generated")

    # Generate tests
    test_report = await qa.run(code)
    print("Tests run")

asyncio.run(main())
Creating Custom Roles
MetaGPT’s real power is extensibility — you can define custom roles for your domain:
from metagpt.actions import Action
from metagpt.roles.role import Role

class WriteAPIDoc(Action):
    name: str = "WriteAPIDoc"

    async def run(self, code: str) -> str:
        prompt = f"Write API documentation in OpenAPI 3.0 YAML format for this code:\n\n{code}"
        return await self._aask(prompt)

class APIDocWriter(Role):
    name: str = "APIDocWriter"
    profile: str = "API Documentation Specialist"

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.set_actions([WriteAPIDoc])

    async def _act(self) -> str:
        code = self.get_memories()[0].content
        return await self.rc.todo.run(code)
MetaGPT vs Other Frameworks
| Feature | MetaGPT | CrewAI | AutoGen |
|---|---|---|---|
| Primary focus | Software development | General workflows | Conversational multi-agent |
| Agent coordination | SOP-based (structured) | Task-based | Conversation-based |
| Output type | Code + docs + tests | Any task output | Text + code |
| Best for | Full software projects | Business workflows | Research, Q&A |
| Setup complexity | Medium | Low | Medium |
MetaGPT shines when you want to generate complete, structured software artifacts — not just code snippets. For general task automation, CrewAI or LangChain agents are simpler.
Strengths and Limitations
Strengths:
- Produces structured, professional-quality documents alongside code
- Role separation prevents one agent from becoming a bottleneck
- Strong for greenfield projects where you have a clear requirement
- The SOP model catches more logical errors than single-agent approaches
Limitations:
- Heavier setup than single-agent tools
- Long generation time (5–15 minutes for a full project)
- LLM quality greatly affects output — GPT-4o produces significantly better results than GPT-4o-mini
- Struggles with brownfield (existing) codebases — designed for new projects
Frequently Asked Questions
How does MetaGPT differ from simply asking ChatGPT to write code?
ChatGPT writes code in one shot with no structured process. MetaGPT runs a multi-stage pipeline: requirements → design → code → tests. Each stage involves different “specialists” reviewing the previous stage’s output. This catches more errors and produces more complete, documented projects. The output quality difference is significant for anything more complex than a single script.
What language does MetaGPT generate code in?
Primarily Python, but it can generate JavaScript, TypeScript, Go, and others depending on what you specify in the requirement. Add “in TypeScript using React” or “in Go” to your requirement string.
Does MetaGPT run the generated code?
The QA Engineer role runs the generated tests. The framework doesn’t run the application itself — that’s up to you. However, since the code and tests are generated together, passing tests are a strong quality signal.
How much does a typical MetaGPT run cost?
With GPT-4o: a simple project (200–500 lines of code) costs roughly $0.50–2.00 in API tokens. Complex projects can cost $3–10. Using gpt-4o-mini reduces costs by ~10x but significantly reduces quality on architectural decisions.
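You can estimate the cost of a run yourself from token counts. A sketch with illustrative numbers (the token counts and per-million-token prices below are assumptions; check your provider's current rates):

```python
def run_cost(input_tokens: int, output_tokens: int,
             in_price_per_m: float, out_price_per_m: float) -> float:
    """Cost in USD, given token counts and prices per million tokens."""
    return (input_tokens * in_price_per_m + output_tokens * out_price_per_m) / 1_000_000

# e.g. ~300k input tokens and ~80k output tokens across all agent turns,
# at assumed prices of $2.50/M input and $10.00/M output:
cost = run_cost(300_000, 80_000, in_price_per_m=2.50, out_price_per_m=10.00)
print(f"${cost:.2f}")  # → $1.55
```

Because every stage re-reads the artifacts of earlier stages, input tokens usually dominate; a cheaper model cuts the bill roughly in proportion to its price.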
Can I run MetaGPT with a local LLM?
Yes. Configure api_type: "ollama" and a local model in config2.yaml. Quality will be noticeably lower than with GPT-4o, but it works for simple projects and is completely free. Qwen2.5-Coder models are a popular local choice.
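A config2.yaml for this setup might look like the following (the base_url and model name are assumptions; match them to your local Ollama install):

```yaml
llm:
  api_type: "ollama"
  base_url: "http://127.0.0.1:11434/api"
  model: "qwen2.5-coder:14b"
```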
Next Steps
- Getting Started with CrewAI — A more flexible multi-agent framework for general workflows
- What is OpenDevin — Another AI software engineer, but interactive
- LangChain vs AutoGen — Compare agent orchestration approaches