Mastering Claw Code: Creating Custom Tools for Complex Agentic Workflows

If you’ve worked through the earlier Claw Code guides and are ready to move beyond single prompts, custom tools are where the architecture pays off. Claw Code’s Rust workspace is intentionally modular: new capabilities live in new crates, and the CLI is one entry point into a larger composable system. This article shows you how to build on that system by adding your own crates, wiring them into multi-step pipelines, and validating everything with the built-in parity harness.

Prerequisites

This is an advanced tutorial. You should already have:

A working claw binary built from source (see Getting Started with Claw Code)
Rust toolchain via rustup (rustc 1.75+ recommended)
Docker or Podman installed for containerized steps
ANTHROPIC_API_KEY set in your environment

Verify your setup before continuing:

export ANTHROPIC_API_KEY="sk-ant-..."
./target/debug/claw doctor

A clean doctor output confirms that the required environment variables and system dependencies are in place. If a feature appears to be missing, consult PARITY.md in the repository root before debugging: it lists which features are complete in the current Rust port, so you don’t spend time on something that is intentionally unimplemented.

The Claw Workspace Architecture

The critical mental model: claw-code/rust/ is a Cargo workspace — a collection of related crates sharing a single Cargo.lock and build cache. The workspace root’s Cargo.toml lists every member. The claw binary is one of those members. Adding a custom tool means adding a new crate and wiring it in — no forking required.

flowchart TD
    A[rust/Cargo.toml<br/>Workspace Root] --> B[crates/claw-cli<br/>Binary Entry Point]
    A --> C[crates/claw-core<br/>Shared Logic]
    A --> D[crates/claw-tools<br/>Built-in Tools]
    A --> E[crates/your-tool<br/>Custom Extension]
    B --> C
    B --> D
    B --> E
    D --> C
    E --> C

This structure keeps your custom code isolated and independently testable while still participating in the shared build. Cargo resolves the entire dependency graph at once, so version conflicts surface at compile time rather than at runtime.

Creating a Custom Tool Crate

A custom tool crate is a Rust library that exposes a structured interface the CLI can call. The pattern is: create the crate, implement your logic, register it as a workspace member, and reference it from the CLI crate.

Step 1: Scaffold the crate

# From the rust/ workspace root
mkdir -p crates/claw-tool-summarizer/src

Create crates/claw-tool-summarizer/Cargo.toml:

[package]
name = "claw-tool-summarizer"
version = "0.1.0"
edition = "2021"

[dependencies]
serde = { version = "1", features = ["derive"] }
serde_json = "1"
tokio = { version = "1", features = ["full"] }
anyhow = "1"

Step 2: Implement the tool

Create crates/claw-tool-summarizer/src/lib.rs:

use anyhow::Result;
use serde::{Deserialize, Serialize};

/// Input schema for the summarizer tool.
#[derive(Debug, Deserialize)]
pub struct SummarizeInput {
    /// The raw text to summarize.
    pub text: String,
    /// Target word count for the summary (optional).
    pub max_words: Option<usize>,
}

/// Output schema for the summarizer tool.
#[derive(Debug, Serialize)]
pub struct SummarizeOutput {
    pub summary: String,
    pub word_count: usize,
    pub truncated: bool,
}

/// Entry point for the summarizer tool.
pub async fn run(input: SummarizeInput) -> Result<SummarizeOutput> {
    let max = input.max_words.unwrap_or(100);

    let words: Vec<&str> = input.text.split_whitespace().collect();
    let truncated = words.len() > max;
    let selected: Vec<&str> = words.into_iter().take(max).collect();
    let summary = selected.join(" ");
    let word_count = summary.split_whitespace().count();

    Ok(SummarizeOutput {
        summary,
        word_count,
        truncated,
    })
}
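To see how the truncation behaves at the edges, the core of run can be mirrored in a small synchronous function. This is only a sketch: truncate_words is a hypothetical helper, not part of the crate above, and it drops async and serde so it compiles on its own with the standard library.

```rust
/// Hypothetical helper (not part of claw-tool-summarizer) that mirrors the
/// word-truncation logic of `run` without async or serde.
/// Returns (summary, word_count, truncated).
pub fn truncate_words(text: &str, max_words: usize) -> (String, usize, bool) {
    // split_whitespace collapses runs of spaces, tabs, and newlines.
    let words: Vec<&str> = text.split_whitespace().collect();
    // The text counts as truncated only if it had more words than allowed.
    let truncated = words.len() > max_words;
    let selected: Vec<&str> = words.into_iter().take(max_words).collect();
    // The kept words are already split, so their count is the word count.
    let word_count = selected.len();
    (selected.join(" "), word_count, truncated)
}
```

Two properties are worth noticing. Whitespace is normalized to single spaces in the output, and a limit of 0 on non-empty input yields an empty summary that is still marked as truncated.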

Step 3: Register in the workspace

Edit rust/Cargo.toml and add your crate to the [workspace] members list:

[workspace]
members = [
    "crates/claw-cli",
    "crates/claw-core",
    "crates/claw-tools",
    "crates/claw-tool-summarizer",   # ← add this line
]

Step 4: Build the full workspace

# From rust/
cargo build --workspace

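Cargo compiles all crates together, so a compile error in any member surfaces immediately. Fix it before continuing. Note that workspace membership only makes Cargo build the crate; for the CLI to call into it, crates/claw-cli/Cargo.toml must also list it as a path dependency, for example claw-tool-summarizer = { path = "../claw-tool-summarizer" }.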

Orchestrating Complex Workflows

With your custom tool compiling, compose it into a multi-step agentic pipeline. Complex workflows are directed sequences of model calls and tool invocations:

sequenceDiagram
    participant Shell as Shell Script
    participant Claw as claw CLI
    participant Tool as Custom Tool
    participant LLM as AI Model

    Shell->>Claw: claw prompt "analyze this"
    Claw->>LLM: Send prompt
    LLM-->>Claw: Response with tool_use request
    Claw->>Tool: Invoke claw-tool-summarizer
    Tool-->>Claw: SummarizeOutput JSON
    Claw->>LLM: Return tool_result
    LLM-->>Claw: Final response
    Claw-->>Shell: Print output

For production, wrap your claw prompt calls in a shell script that pipes data between steps:

#!/usr/bin/env bash
set -euo pipefail

CLAW="./target/debug/claw"
WORK_DIR="./workflow_output"
mkdir -p "$WORK_DIR"

echo "[1/3] Gathering repository context..."
CONTEXT=$("$CLAW" prompt "List the top-level directories in this repository and describe each in one sentence.")
echo "$CONTEXT" > "$WORK_DIR/context.txt"

echo "[2/3] Generating action plan..."
PLAN=$("$CLAW" prompt "Given this project context, write a 5-step refactoring plan:

$CONTEXT")
echo "$PLAN" > "$WORK_DIR/plan.txt"

echo "[3/3] Producing implementation skeleton..."
SKELETON=$("$CLAW" prompt "Based on this plan, output a Rust module skeleton with only struct and fn signatures:

$PLAN")
echo "$SKELETON" > "$WORK_DIR/skeleton.rs"

echo "Done. Outputs written to $WORK_DIR/"

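set -euo pipefail makes the script abort when a command exits non-zero, when an unset variable is referenced, or when any stage of a pipe fails, rather than propagating bad data through the chain. It does not catch a step that exits successfully with empty or malformed output, so check for that separately if it matters. Each step depends on the previous output; treat this dependency as a contract.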
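The same chaining contract can be sketched in Rust, which makes the step dependency explicit and easy to test. Everything here is illustrative: Step, build_prompt, run_pipeline, and claw_runner are hypothetical names, and the only assumption about the CLI is the claw prompt "<text>" invocation already used in the script above. The runner is passed in as a closure so the chaining logic can be exercised without calling a model.

```rust
use std::process::Command;

/// One pipeline step: an instruction plus the file name its output is saved under.
pub struct Step {
    pub instruction: &'static str,
    pub output_file: &'static str,
}

/// Builds a step's prompt by appending the previous step's output,
/// the way the shell script interpolates $CONTEXT and $PLAN.
pub fn build_prompt(instruction: &str, previous: Option<&str>) -> String {
    match previous {
        Some(prev) => format!("{instruction}\n\n{prev}"),
        None => instruction.to_string(),
    }
}

/// Runs the steps in order, feeding each output into the next prompt.
/// Stops at the first runner error and, as an extra check beyond what
/// `set -e` provides, rejects empty output.
/// Returns (output_file, output) pairs for the caller to write to disk.
pub fn run_pipeline<F>(steps: &[Step], mut runner: F) -> Result<Vec<(String, String)>, String>
where
    F: FnMut(&str) -> Result<String, String>,
{
    let mut outputs = Vec::new();
    let mut previous: Option<String> = None;
    for (i, step) in steps.iter().enumerate() {
        let prompt = build_prompt(step.instruction, previous.as_deref());
        let out = runner(prompt.as_str()).map_err(|e| format!("step {} failed: {e}", i + 1))?;
        if out.trim().is_empty() {
            return Err(format!("step {} produced empty output", i + 1));
        }
        outputs.push((step.output_file.to_string(), out.clone()));
        previous = Some(out);
    }
    Ok(outputs)
}

/// A runner that shells out to the claw binary at `claw_path`
/// (for example "./target/debug/claw") and captures stdout.
pub fn claw_runner(claw_path: &str) -> impl FnMut(&str) -> Result<String, String> + '_ {
    move |prompt: &str| {
        let output = Command::new(claw_path)
            .arg("prompt")
            .arg(prompt)
            .output()
            .map_err(|e| e.to_string())?;
        if !output.status.success() {
            return Err(format!("claw exited with {}", output.status));
        }
        String::from_utf8(output.stdout).map_err(|e| e.to_string())
    }
}
```

In real use you would call run_pipeline(&steps, claw_runner("./target/debug/claw")) and write each returned pair to the work directory. In tests, a closure returning canned strings stands in for the runner.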

For Windows PowerShell environments:

$env:ANTHROPIC_API_KEY = "sk-ant-..."
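$claw = ".\target\debug\claw.exe"

# Out-File does not create missing directories, so create the output folder first
New-Item -ItemType Directory -Force -Path ".\workflow_output" | Out-Null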

$context = & $claw prompt "List all Rust crates in this workspace and their purpose."
$context | Out-File -FilePath ".\workflow_output\context.txt"

$plan = & $claw prompt "Using this context, outline a test coverage improvement plan:`n$context"
$plan | Out-File -FilePath ".\workflow_output\plan.txt"

Write-Host "Workflow complete."

Containerizing Your Tool for Reproducible Deployments

The claw-code repository ships a canonical Containerfile for a reason: development environments drift, but container images do not. For tools you share or run in CI, build once and use everywhere.

# Build the image from the repo root
docker build -t claw-code-dev -f Containerfile .

# Run your custom workflow inside the container
docker run --rm -it \
  -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
  -v "$PWD":/workspace \
  claw-code-dev \
  bash -c "cd /workspace/rust && cargo build --workspace && ./target/debug/claw doctor"

The -v "$PWD":/workspace mount gives the container access to your local source tree, so you can iterate on code without rebuilding the image on every change.

Check the sandbox context at any time:

./target/debug/claw sandbox

This reports whether the process is running inside a container — critical when building tools that read or write files, since paths available inside Docker differ from the host. Mismatched path assumptions are one of the most common bugs in containerized agentic pipelines.
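One way to reduce path mismatches is to have a tool resolve every file path against a single workspace root instead of hard-coding host paths. The sketch below shows that idea under stated assumptions: resolve_in_workspace and workspace_root are hypothetical helpers, and the CLAW_WORKSPACE_ROOT variable mentioned afterward is an invented name, not something Claw Code is known to read.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: resolves `relative` against `root`, refusing
/// absolute paths and `..` segments so a tool cannot escape the workspace.
pub fn resolve_in_workspace(root: &Path, relative: &str) -> Option<PathBuf> {
    let mut resolved = root.to_path_buf();
    for component in Path::new(relative).components() {
        match component {
            // Ordinary path segments are appended to the root.
            Component::Normal(part) => resolved.push(part),
            // "." segments are harmless and skipped.
            Component::CurDir => {}
            // Root, drive prefix, or ".." would leave the workspace: reject.
            _ => return None,
        }
    }
    Some(resolved)
}

/// Hypothetical helper: picks the workspace root from an optional override
/// (for example an environment variable value), else uses `fallback`.
pub fn workspace_root(env_value: Option<String>, fallback: &Path) -> PathBuf {
    match env_value {
        Some(value) if !value.is_empty() => PathBuf::from(value),
        _ => fallback.to_path_buf(),
    }
}
```

A tool could then compute its root once, for instance workspace_root(std::env::var("CLAW_WORKSPACE_ROOT").ok(), Path::new(".")), and set that variable to /workspace in the docker run command shown earlier, so the same relative paths work on the host and in the container.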

This pattern parallels how MetaGPT Custom Roles and Actions: Build Your Own Software Team isolates agent roles in separate execution contexts, though Claw Code operates at the Rust binary layer rather than a Python orchestration layer.

Validating with the Parity Harness

Before shipping a custom tool, run it through the parity harness — a deterministic mock-service layer that replays recorded API responses instead of hitting a live model. This gives you fast, cost-free regression tests.

# Run all workspace tests (includes parity harness)
cargo test --workspace

# Run tests specific to your crate
cargo test -p claw-tool-summarizer

Write unit tests directly in your crate’s lib.rs:

#[cfg(test)]
mod tests {
    use super::*;

    #[tokio::test]
    async fn test_summarize_truncation() {
        let long_text = "word ".repeat(200);
        let input = SummarizeInput {
            text: long_text,
            max_words: Some(50),
        };
        let output = run(input).await.expect("summarizer should not error");
        assert!(output.truncated, "output should be marked as truncated");
        assert!(
            output.word_count <= 50,
            "word count should not exceed max_words"
        );
    }

    #[tokio::test]
    async fn test_summarize_no_truncation() {
        let short_text = "This is a short sentence.".to_string();
        let input = SummarizeInput {
            text: short_text,
            max_words: None,
        };
        let output = run(input).await.expect("summarizer should not error");
        assert!(!output.truncated);
    }
}

Consult PARITY.md before writing integration tests — it documents which upstream behaviors are mocked and which require a live API key. Multi-turn conversation tests are especially sensitive: the mock replay must match the exact message sequence the live model would produce.
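To make the exact-sequence requirement concrete, here is a minimal sketch of a replay mock. It is not the parity harness's actual implementation or API; Recorded and ReplayMock are hypothetical types that only illustrate why a multi-turn test fails as soon as a request deviates from the recording.

```rust
use std::collections::VecDeque;

/// One recorded exchange: the request we expect and the canned response.
pub struct Recorded {
    pub request: String,
    pub response: String,
}

/// Hypothetical replay mock (not the harness's real type): returns recorded
/// responses in order and fails as soon as the request sequence deviates.
pub struct ReplayMock {
    remaining: VecDeque<Recorded>,
}

impl ReplayMock {
    pub fn new(recorded: Vec<Recorded>) -> Self {
        Self { remaining: recorded.into() }
    }

    /// Returns the next recorded response only if `request` matches the
    /// next recorded request exactly. A mismatch consumes that entry.
    pub fn send(&mut self, request: &str) -> Result<String, String> {
        match self.remaining.pop_front() {
            None => Err(format!("unexpected extra request: {request}")),
            Some(next) if next.request == request => Ok(next.response),
            Some(next) => Err(format!(
                "sequence mismatch: expected {:?}, got {:?}",
                next.request, request
            )),
        }
    }

    /// True once every recorded exchange has been consumed.
    pub fn is_exhausted(&self) -> bool {
        self.remaining.is_empty()
    }
}
```

A test built on this shape would assert both that every send call succeeds and that is_exhausted() is true at the end, which catches missing turns as well as unexpected ones.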

This testing discipline mirrors the approach used in AutoGen Human-in-the-Loop: Keep Humans in Control of AI Agents, where deterministic checkpoints ensure agent decision points are verifiable rather than opaque.

Frequently Asked Questions

Can I write custom tools in Python instead of Rust?

The claw binary is Rust, so native workspace extensions must be Rust crates. However, you can call external Python scripts from a workflow shell script — use claw prompt to generate the Python and then invoke python3 to execute it. The tool boundary is the subprocess boundary. See Automate Python Scripting with a Custom Claw Code Agent for a worked example of this hybrid approach.

Why does `cargo install claw-code` not work?

The claw-code package on crates.io is a deprecated stub that does not contain the claw binary. The canonical binary — originally named agent-code — lives only in the ultraworkers/claw-code GitHub repository. You must build it from source using cargo build --workspace. Running cargo install claw-code installs an empty placeholder.

How do I pass secrets safely in containerized workflows?

Never bake API keys into the Containerfile or commit them to the repository. Pass them at runtime via -e flags:

docker run --rm \
  -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
  claw-code-dev ./target/debug/claw prompt "hello"

In CI/CD, use your platform’s native secrets management (GitHub Actions Secrets, GitLab CI Variables) and inject them as environment variables at pipeline runtime.

What exactly does `claw doctor` validate?

claw doctor checks that required API key environment variables are set, verifies system dependencies (such as container runtimes when sandbox features are active), and reports the current binary version. It does not make a live API call, so it runs safely offline. Treat it as the canonical first debugging step whenever behavior is unexpected.
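If your custom tool needs its own configuration, the same offline style of check is easy to reproduce. The following is a sketch of the general idea only, not claw doctor's implementation: missing_vars is a hypothetical helper, and the lookup is injected as a closure so the check can be tested without touching the real environment.

```rust
/// Hypothetical helper: returns the names of required variables that are
/// unset or blank. No network access is involved, so it is safe offline.
pub fn missing_vars<F>(required: &[&str], lookup: F) -> Vec<String>
where
    F: Fn(&str) -> Option<String>,
{
    let mut missing = Vec::new();
    for &name in required {
        // A variable counts as set only if it has a non-blank value.
        let is_set = lookup(name).map_or(false, |value| !value.trim().is_empty());
        if !is_set {
            missing.push(name.to_string());
        }
    }
    missing
}
```

In a real tool the lookup would typically be |key| std::env::var(key).ok(), called with a list such as &["ANTHROPIC_API_KEY"], and any returned names would be reported before the tool does any work.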

Can I run the parity harness without Cargo?

No. The parity harness is a Rust test suite embedded in the workspace and must be invoked via cargo test. It is not available as a standalone binary. To reproduce a specific scenario in isolation, copy the relevant mock fixtures from the tests/ directory and run cargo test -p <crate-name> -- <test_name> to target a single test by name.