If you want to automate Python scripting with a custom Claw Code agent, you’ve picked one of the most direct paths to local, LLM-powered task automation available today. Claw Code is a Rust-based agent harness that gives an LLM — by default, Claude — structured access to your local environment. Unlike managed platforms, it runs entirely on your machine, which means no cloud round-trips for file I/O and no vendor lock-in beyond your API key. This tutorial walks you from a clean checkout to a working agent that can read, analyze, and patch Python scripts on demand.
Setting Up Claw Code from Source
Claw Code is not available on crates.io in a functional form. Do not run cargo install claw-code — that installs a deprecated stub. The real tool lives at ultraworkers/claw-code and must be built from source.
Prerequisites:
- Rust toolchain (1.75+ recommended) — install via rustup.rs
- An ANTHROPIC_API_KEY from the Anthropic developer console (not a Claude.ai Pro subscription)
- Git
# Clone and build
git clone https://github.com/ultraworkers/claw-code
cd claw-code/rust
cargo build --workspace
The build compiles the entire workspace. The claw binary lands at ./target/debug/claw (or claw.exe on Windows).
Set your API key before running anything:
# Linux / macOS
export ANTHROPIC_API_KEY="sk-ant-..."
# Windows PowerShell
$env:ANTHROPIC_API_KEY = "sk-ant-..."
Run the health check to confirm everything is wired up:
./target/debug/claw doctor
A passing doctor output confirms the binary can reach the Anthropic API. If you’re working inside a Docker or Podman container, run claw sandbox — the sandbox detection subsystem will report the container environment and adjust the agent’s behavior accordingly.
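Claw's actual detection code isn't reproduced here, but the idea behind container detection is simple enough to sketch. The marker paths below (/.dockerenv, /run/.containerenv, /proc/1/cgroup) are common Docker and Podman conventions, not anything Claw is guaranteed to check:

```python
from pathlib import Path


def in_container() -> bool:
    """Heuristic container check, similar in spirit to what a
    sandbox-detection subsystem might do: look for well-known
    container markers on the filesystem."""
    if Path("/.dockerenv").exists():         # Docker's marker file
        return True
    if Path("/run/.containerenv").exists():  # Podman's marker file
        return True
    try:
        cgroup = Path("/proc/1/cgroup").read_text()
        return "docker" in cgroup or "kubepods" in cgroup
    except OSError:
        return False


print("container" if in_container() else "host")
```

On a bare-metal host all three checks fail and the function returns False; inside a standard Docker container the first check alone is usually enough.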
How the Claw Harness Works
Before writing any automation logic, it helps to understand the agent harness model. Claw is not a library you import — it’s a CLI wrapper that manages the conversation loop between the LLM and your local filesystem.
flowchart TD
A[User: claw prompt] --> B[Harness: parse prompt]
B --> C[LLM: generate plan + tool calls]
C --> D{Tool type}
D -->|read_file| E[Filesystem read]
D -->|write_file| F[Filesystem write]
D -->|run_command| G[Shell exec]
E & F & G --> H[Harness: collect results]
H --> C
C -->|task complete| I[Final response to user]
Each claw prompt invocation starts a new ReAct-style loop: the model reasons, picks a tool action, observes the result, and iterates until the task is done. This is the same pattern described in ReAct: Reasoning and Acting — The Paper Behind Agent Frameworks, the approach that underpins most modern agent frameworks.
The harness handles:
- Tool dispatch — routing LLM tool calls to local capabilities (file reads, shell commands, writes)
- Context management — keeping the conversation history within the model’s context window
- Sandbox awareness — restricting destructive operations when running inside a container
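Tool dispatch is the heart of the harness. Claw's internal implementation isn't shown here, but the pattern from the flowchart can be sketched in a few lines of Python. The tool names mirror the diagram; the real signatures and argument shapes may differ:

```python
import subprocess
from pathlib import Path

# Hypothetical dispatch table mirroring the flowchart above; Claw's
# actual tool names and argument schemas may differ.
TOOLS = {
    "read_file": lambda args: Path(args["path"]).read_text(),
    "write_file": lambda args: Path(args["path"]).write_text(args["content"]),
    "run_command": lambda args: subprocess.run(
        args["command"], shell=True, capture_output=True, text=True
    ).stdout,
}


def dispatch(tool_call: dict) -> str:
    """Route one LLM tool call to a local capability and return the
    observation that gets fed back into the reasoning loop."""
    name, args = tool_call["name"], tool_call["args"]
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    return str(TOOLS[name](args))
```

The harness repeats this dispatch step until the model stops emitting tool calls and produces its final response.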
Designing a Python Automation Prompt
The fastest way to build a “custom agent” in Claw Code is through a structured system prompt — a plain text file that instructs the model to behave like a specialized tool.
Create a file called python_agent_prompt.txt:
You are a Python scripting assistant. Your job is to:
1. Read Python files from the current directory when asked.
2. Identify issues: missing type hints, unused imports, non-idiomatic patterns.
3. Propose and apply fixes inline, writing the corrected file back.
4. Always confirm what you changed and why.
Rules:
- Never delete files without explicit permission.
- Always show a diff-style summary of changes.
- Ask before running any shell command that modifies system state.
Now pass this prompt to claw along with a task:
./target/debug/claw prompt "$(cat python_agent_prompt.txt) --- Task: Read fetch_data.py, find unused imports, and clean them up."
The model will read fetch_data.py, reason about the imports, and write the fixed version back. You’ll see its reasoning steps in the terminal output.
Building a Reusable Wrapper Script
Running a long prompt string every time is tedious. Wrap the pattern in a Python script that acts as your automation interface — ironically, a Python script to drive Python automation.
Create automate.py in the claw-code/ directory:
#!/usr/bin/env python3
"""
automate.py — drive claw prompt for Python file automation tasks.

Usage: python automate.py <file.py> <task>
"""
import subprocess
import sys
from pathlib import Path

SYSTEM_PROMPT = """You are a Python scripting assistant. Your job is to:
1. Read Python files from the current directory when asked.
2. Identify issues: missing type hints, unused imports, non-idiomatic patterns.
3. Propose and apply fixes inline, writing the corrected file back.
4. Always confirm what you changed and why.

Rules:
- Never delete files without explicit permission.
- Always show a diff-style summary of changes.
- Ask before running any shell command that modifies system state.
"""


def run_agent(target_file: str, task: str) -> None:
    claw_bin = Path(__file__).parent / "rust" / "target" / "debug" / "claw"
    if not claw_bin.exists():
        print(f"Error: claw binary not found at {claw_bin}")
        print("Run: cd rust && cargo build --workspace")
        sys.exit(1)

    full_prompt = f"{SYSTEM_PROMPT}\n---\nTarget file: {target_file}\nTask: {task}"
    result = subprocess.run(
        [str(claw_bin), "prompt", full_prompt],
        capture_output=False,  # stream output live
        text=True,
    )
    if result.returncode != 0:
        print(f"\n[automate.py] claw exited with code {result.returncode}")
        sys.exit(result.returncode)


if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("Usage: python automate.py <file.py> \"<task description>\"")
        sys.exit(1)
    run_agent(sys.argv[1], sys.argv[2])
Make it executable and test it against a sample file:
chmod +x automate.py
# Create a messy sample script to clean up
cat > sample.py << 'EOF'
import os
import sys
import json
import re

def fetch(url):
    import requests
    r = requests.get(url)
    return r.json()

def process(data):
    result = []
    for item in data:
        result.append(item['value'])
    return result
EOF
python automate.py sample.py "Remove unused imports and add type hints to all functions"
The agent reads sample.py, identifies that os, sys, json, and re are never used, removes them, adds type annotations to fetch and process, and writes the file back — all through the harness’s tool dispatch loop.
Chaining Tasks: A Multi-Step Automation Workflow
Real automation rarely stops at a single file. You can chain tasks by calling run_agent multiple times or by writing a batch automation script that processes a whole directory.
#!/usr/bin/env python3
"""
batch_automate.py — run the Python agent over every .py file in a directory.
"""
import subprocess
import sys
from pathlib import Path

CLAW_BIN = Path(__file__).parent / "rust" / "target" / "debug" / "claw"

TASKS = [
    "Remove unused imports",
    "Add return type hints to all top-level functions",
    "Replace bare except clauses with specific exception types",
]


def process_file(py_file: Path) -> None:
    print(f"\n=== Processing {py_file.name} ===")
    for task in TASKS:
        print(f"  → Task: {task}")
        prompt = (
            f"You are a Python code quality agent.\n"
            f"File: {py_file}\n"
            f"Task: {task}\n"
            f"Apply the change directly to the file. Show a brief summary."
        )
        subprocess.run([str(CLAW_BIN), "prompt", prompt], check=True)


def main(directory: str) -> None:
    target = Path(directory)
    py_files = sorted(target.glob("**/*.py"))
    if not py_files:
        print(f"No .py files found in {directory}")
        return
    for f in py_files:
        process_file(f)
    print("\n[batch_automate.py] All files processed.")


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python batch_automate.py <directory>")
        sys.exit(1)
    main(sys.argv[1])
Run it against a project folder:
python batch_automate.py ./my_project/
Each file gets three sequential passes: import cleanup, type hint injection, and exception handling improvement. Because each claw prompt call is stateless, the tasks are independent — no state leaks between files. For use cases where agents do need to share state across passes, see Agent Communication and State Management in Multi-Agent Systems.
Production Patterns and Next Steps
Once you’ve confirmed the basic workflow, a few patterns make it production-ready:
Version control before automation. Always run git add -A && git stash before a batch run. If the agent makes an unwanted change, git stash pop restores your working tree in seconds.
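If you want the wrapper itself to enforce this, a pre-flight cleanliness check is a few lines of Python. This is a generic sketch around git status --porcelain, not anything built into claw; the parsing is deliberately minimal:

```python
import subprocess


def working_tree_status() -> str:
    """Return `git status --porcelain` output for the current directory."""
    return subprocess.run(
        ["git", "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    ).stdout


def is_clean(porcelain: str) -> bool:
    """Empty porcelain output means no modified or untracked files,
    so a batch run cannot clobber uncommitted work."""
    return porcelain.strip() == ""
```

Call is_clean(working_tree_status()) at the top of your batch script and refuse to proceed on a dirty tree, forcing a commit or stash first.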
Dry-run mode. Prefix your task with “Do not write the file yet — only show me what you would change.” This turns any task into a preview pass, which is useful for auditing before committing.
Prompt versioning. Store your system prompts in a prompts/ directory under version control. Treat them like code — small wording changes can significantly affect output quality.
Rate limiting. The Anthropic API applies per-minute token limits. For large batch runs, add a time.sleep(2) between subprocess.run calls in batch_automate.py to stay within limits.
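One way to do this without sprinkling sleep calls everywhere is a small throttle object. A sketch, with min_interval as a tunable assumption rather than a value tied to any specific rate limit:

```python
import time


class Throttle:
    """Enforce a minimum gap between successive calls, a blunt but
    effective way to stay under per-minute token limits."""

    def __init__(self, min_interval: float = 2.0) -> None:
        self.min_interval = min_interval
        self._last = 0.0  # monotonic timestamp of the previous call

    def wait(self) -> None:
        """Block until at least min_interval has passed since the
        last call, then record the new timestamp."""
        elapsed = time.monotonic() - self._last
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self._last = time.monotonic()
```

Instantiate one Throttle at module level in batch_automate.py and call throttle.wait() immediately before each subprocess.run in process_file.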
Container isolation. For untrusted codebases, run claw inside Docker. The sandbox detection feature will adapt automatically, and you get file-system isolation for free. Pass your API key as a container secret rather than a plain environment variable.
The claw harness is intentionally minimal — it’s a tool dispatch loop, not a framework. For applications that need persistent memory, message passing between agents, or dynamic role assignment, you’ll want to layer on a multi-agent coordinator. The patterns in Getting Started with AgentScope: Build Your First Multi-Agent App translate directly to workflows where claw handles the file I/O leaf tasks while a coordinator manages the overall plan.
Frequently Asked Questions
Can I use OpenAI models instead of Claude with Claw Code?
Yes. Set OPENAI_API_KEY in your environment instead of ANTHROPIC_API_KEY. The harness supports OpenAI as a backend, though the primary development focus is on the Anthropic integration. Check the repository’s README for the exact flag or config entry required to switch providers.
Why does cargo install claw-code install something broken?
The crate published to crates.io under that name is a deprecated stub from an earlier, unrelated project. The active implementation lives exclusively in the ultraworkers/claw-code GitHub repository and is not published to the registry. Always build from source.
Is it safe to point batch_automate.py at a production codebase?
Not without a safety net. The agent can and will write files directly. Run it on a feature branch or a stash-protected working tree, review the diff with git diff, and only merge if you’re satisfied. The dry-run prompt pattern (asking the model to describe changes without writing) is useful for a first pass.
How do I constrain the agent to a specific directory?
Start claw from within the directory you want it to operate in, and phrase your prompts relative to that root. The harness resolves file paths relative to the working directory of the process, so changing into a subdirectory before calling subprocess.run effectively sandboxes file access. Explicit path validation in your wrapper script adds a second layer of control.
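If you add that second layer, the check is short. This is a generic pathlib sketch (not a claw feature), rejecting any user-supplied path that resolves outside the allowed root, whether via .. segments or symlinks:

```python
from pathlib import Path


def validate_target(path_str: str, root: Path) -> Path:
    """Resolve a user-supplied path and refuse anything that escapes
    the allowed root directory."""
    root = root.resolve()
    target = (root / path_str).resolve()
    if not target.is_relative_to(root):  # Path.is_relative_to: Python 3.9+
        raise ValueError(f"{path_str!r} escapes {root}")
    return target
```

Dropping this into automate.py's run_agent before building the prompt means a stray ../../etc/passwd argument fails fast instead of reaching the agent.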
What happens if the model exceeds the context window mid-task?
The harness will return an error and stop the current loop iteration. For large files, break the task into smaller, focused prompts rather than asking the model to process the entire file at once. Targeting specific functions or sections — “add type hints to all functions in the utils module” rather than “fix the whole project” — keeps each prompt well within context limits.
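A simple way to produce those smaller prompts is to split a module into per-function chunks with the standard library's ast module. A sketch; you would feed each snippet to run_agent or a separate claw prompt call:

```python
import ast


def function_sources(source: str) -> dict[str, str]:
    """Split a module into per-function source snippets so each one
    can be sent to the agent as its own small, focused prompt."""
    tree = ast.parse(source)
    return {
        node.name: ast.get_source_segment(source, node)
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
```

Iterating over function_sources(Path("big_module.py").read_text()) turns one oversized prompt into many small ones, each comfortably within the context window.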