Claude Code is designed as an interactive CLI tool, but it also exposes an HTTP API that turns it into a programmable worker: one step in a larger automated pipeline.
This reference covers how to use Claude Code programmatically: submitting tasks, polling for results, chaining outputs, and integrating with workflow engines.
The HTTP API
Claude Code can run as an HTTP server that accepts task submissions and returns results.
Submit a task
```bash
curl -X POST http://localhost:7847/execute \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Analyze the error logs in /var/log/app.log and summarize the top 3 issues",
    "allowed_tools": ["Read", "Glob", "Grep"],
    "timeout": 120000
  }'
```
Response:
```json
{
  "session_id": "abc-123",
  "status": "running"
}
```
Poll for completion
```bash
curl http://localhost:7847/sessions/abc-123
```

```json
{
  "session_id": "abc-123",
  "status": "completed",
  "response": "Analysis of error logs:\n1. Database connection timeouts...",
  "result": { ... }
}
```
Key parameters
| Parameter | Type | Description |
|---|---|---|
| `prompt` | string | The task to execute |
| `allowed_tools` | array | Whitelist of tools Claude can use |
| `timeout` | integer | Max execution time in ms |
`allowed_tools` is the security boundary. If your pipeline only needs Claude to read and analyze (not write), restrict it to `["Read", "Glob", "Grep"]`. If it needs to modify files, add `Edit` and `Write`. Never give more tools than the task requires.
The Polling Pattern
Claude Code tasks are asynchronous. The standard integration pattern is submit-poll-collect:
```python
import time

import httpx

def run_claude_task(prompt, tools=None, timeout=120000):
    """Submit a task and block until it completes or fails."""
    resp = httpx.post("http://localhost:7847/execute", json={
        "prompt": prompt,
        "allowed_tools": tools or ["Read", "Glob", "Grep"],
        "timeout": timeout,
    })
    resp.raise_for_status()
    session_id = resp.json()["session_id"]

    # Poll slightly past the task timeout (see below) so a server-side
    # timeout surfaces as an explicit status instead of an infinite loop.
    deadline = time.monotonic() + timeout / 1000 + 30
    while time.monotonic() < deadline:
        time.sleep(5)
        status = httpx.get(f"http://localhost:7847/sessions/{session_id}").json()
        if status["status"] in ("completed", "failed"):
            return status
    raise TimeoutError(f"session {session_id} did not finish before poll deadline")
```
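Usage is then a single blocking call; the prompt and tool list here are illustrative:

```python
result = run_claude_task(
    "Summarize the TODO comments under src/",
    tools=["Read", "Glob", "Grep"],
)
print(result["response"])
```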
Timeout considerations
Claude Code sessions have two timeouts:
- Task timeout — how long the session can run (set via the `timeout` parameter)
- Poll timeout — how long your client waits before giving up

Set the task timeout generously enough for the work, and set the poll timeout slightly longer than the task timeout. A task that fails silently due to a timeout is harder to debug than one that returns an explicit timeout error.
Composability
The `response` field in Claude’s output is a plain string. This makes it composable with any downstream processor.
Chain: Claude → LLM extraction
```
Step 1: Claude Code analyzes a codebase
        → output: {"response": "Found 3 critical issues: ..."}

Step 2: Local LLM extracts structured data
        → input: Claude's response as prompt context
        → output: {"issues": [{"severity": "critical", ...}]}
```
The pattern: Claude does the complex, tool-using analysis. A cheaper LLM does the structured extraction. This optimizes cost — Claude’s capability is used where it’s needed (tool use, multi-step reasoning), and commodity extraction is offloaded.
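A minimal sketch of this chain in Python, using `run_claude_task` from the polling section. The local endpoint, model name, and extraction prompt are assumptions standing in for whatever OpenAI-compatible server you run locally:

```python
import json

import httpx

# Step 1: Claude does the complex, tool-using analysis (read-only tools).
analysis = run_claude_task(
    "Analyze the error logs in /var/log/app.log and summarize the top 3 issues"
)["response"]

# Step 2: a cheaper local LLM turns the prose into structured JSON.
# Hypothetical endpoint and model name; substitute your own.
resp = httpx.post("http://localhost:8080/v1/chat/completions", json={
    "model": "local-extractor",
    "messages": [{
        "role": "user",
        "content": "Extract the issues as a JSON array of objects with "
                   f"severity and summary fields:\n\n{analysis}",
    }],
})
issues = json.loads(resp.json()["choices"][0]["message"]["content"])
```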
Chain: HTTP fetch → Claude analysis
```
Step 1: HTTP worker fetches external data
        → output: {"response": "<raw API data>"}

Step 2: Claude Code interprets the data
        → input: raw data as context
        → output: {"response": "Based on the FRED yield curve data..."}
```
The HTTP worker handles the structured fetch. Claude handles the unstructured interpretation. Each worker does what it’s best at.
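The same chain sketched in Python. The FRED URL and parameters are illustrative only (real FRED calls also require an API key); `run_claude_task` is the helper from the polling section:

```python
import httpx

# Step 1: a structured fetch handled by a plain HTTP worker.
raw = httpx.get(
    "https://api.stlouisfed.org/fred/series/observations",
    params={"series_id": "DGS10", "file_type": "json"},
).text

# Step 2: Claude handles the unstructured interpretation. The data is
# inlined in the prompt, so no file tools are actually needed.
result = run_claude_task(
    f"Here is raw FRED yield curve data:\n{raw}\n\n"
    "Summarize what it implies about the current rate environment."
)
print(result["response"])
```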
Integration with Workflow Engines
Claude Code as a worker fits naturally into DAG-based workflow engines. Here’s the integration pattern:
Worker definition
A workflow engine needs to know:
- How to submit work to Claude (HTTP POST)
- How to check completion (HTTP GET polling)
- How to extract the result (the `response` field)
- What to do on failure (`status: "failed"`)
Example: Elixir/BEAM workflow engine
```elixir
defmodule Workers.Claude do
  def execute(args, deps) do
    prompt = build_prompt(args, deps)
    {:ok, session_id} = submit_task(prompt, args)
    poll_until_complete(session_id, args[:timeout] || 120_000)
  end

  defp build_prompt(args, deps) do
    base = args[:prompt]

    # Inject upstream dependencies as context
    context =
      deps
      |> Enum.map(fn {k, v} -> "#{k}: #{v["response"]}" end)
      |> Enum.join("\n")

    if context != "", do: "Context:\n#{context}\n\nTask: #{base}", else: base
  end
end
```
Key design decisions:
- Upstream worker outputs are injected as context into Claude’s prompt
- The `response` key is standardized across all workers (Claude, LLM, HTTP)
- Timeout is configurable per-step
Dependency injection
In a multi-step workflow:
```
step_1 (http: fetch data) ──┐
                            ├──→ step_3 (claude: analyze both)
step_2 (http: fetch data) ──┘
```
Step 3 receives both upstream outputs in its deps map. The prompt template can reference them:
```
Context from market data: {{step_1.response}}
Context from news feed: {{step_2.response}}

Analyze the relationship between these data points.
```
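A minimal sketch of how a worker might render that template, assuming `deps` is a dict of upstream results keyed by step name (the `{{step.field}}` placeholder syntax is this document's convention, not a fixed standard):

```python
import re

def render_prompt(template: str, deps: dict) -> str:
    """Replace {{step.field}} placeholders with values from upstream results."""
    def substitute(match: re.Match) -> str:
        step, field = match.group(1), match.group(2)
        return str(deps[step][field])
    return re.sub(r"\{\{(\w+)\.(\w+)\}\}", substitute, template)

deps = {
    "step_1": {"response": "10y-2y spread narrowed this week"},
    "step_2": {"response": "Fed officials signaled a pause"},
}
prompt = render_prompt(
    "Context from market data: {{step_1.response}}\n"
    "Context from news feed: {{step_2.response}}\n\n"
    "Analyze the relationship between these data points.",
    deps,
)
```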
Sequential Execution Constraint
If your pipeline uses a local LLM alongside Claude Code, and the local LLM is single-threaded, you must execute LLM steps sequentially:
```clojure
;; Wrong: parallel submission will queue and time out
(pmap #(submit-workflow %) [workflow-1 workflow-2 workflow-3])

;; Right: sequential with blocking wait
(doseq [w [workflow-1 workflow-2 workflow-3]]
  (let [id (submit-workflow w)]
    (wait-for-completion id)))
```
Claude Code itself can run concurrently with other workers (HTTP, etc.). The sequential constraint only applies to shared resources like a single-GPU LLM server.
Production Patterns
Perception pipeline
A recurring pipeline that runs on cron:
Every 6 hours:
1. HTTP worker fetches 5 data sources (parallel — all HTTP, no shared resource)
2. For each source, Claude or an LLM extracts insights (sequential — shared LLM)
3. Results posted to staging API for human review
This pattern automates information gathering while keeping humans in the loop for quality control.
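A sketch of one cycle in Python, using the earlier helper. The source URLs and staging endpoint are placeholders; the point is the parallel fetch followed by the sequential extraction loop:

```python
from concurrent.futures import ThreadPoolExecutor

import httpx

SOURCES = [  # placeholder URLs for the data sources
    "https://example.com/feed1",
    "https://example.com/feed2",
]

def run_perception_cycle() -> None:
    # 1. Parallel fetch: all HTTP, no shared resource.
    with ThreadPoolExecutor() as pool:
        raw = list(pool.map(lambda url: httpx.get(url).text, SOURCES))

    # 2. Sequential extraction: the LLM is the shared resource.
    insights = [
        run_claude_task(f"Extract the key insights:\n\n{r}")["response"]
        for r in raw
    ]

    # 3. Post to a staging API (placeholder endpoint) for human review.
    httpx.post("http://localhost:9000/staging/insights", json={"insights": insights})
```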
Code review pipeline
On PR creation:
1. HTTP worker fetches diff from GitHub API
2. Claude Code analyzes diff with access to full repo context
3. Claude's review posted as PR comment
Claude’s advantage over simpler LLMs here: it can use Glob, Grep, and Read to understand the full codebase context around the changed lines.
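A Python sketch of the three steps, assuming a `GITHUB_TOKEN` environment variable and the `run_claude_task` helper from earlier; the GitHub endpoints are the standard REST API:

```python
import os

import httpx

GITHUB = "https://api.github.com"
HEADERS = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}

def review_pr(owner: str, repo: str, number: int) -> None:
    # 1. Fetch the diff; this Accept header makes GitHub return raw diff text.
    diff = httpx.get(
        f"{GITHUB}/repos/{owner}/{repo}/pulls/{number}",
        headers={**HEADERS, "Accept": "application/vnd.github.v3.diff"},
    ).text

    # 2. Read-only analysis: Glob, Grep, and Read supply repo context.
    review = run_claude_task(
        f"Review this diff in the context of the surrounding codebase:\n\n{diff}",
        tools=["Read", "Glob", "Grep"],
    )["response"]

    # 3. Post the review (the issues endpoint accepts PR numbers for comments).
    httpx.post(
        f"{GITHUB}/repos/{owner}/{repo}/issues/{number}/comments",
        headers=HEADERS,
        json={"body": review},
    )
```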
Automated documentation
On feature branch merge:
1. Claude Code reads the changed files
2. Claude updates relevant documentation
3. Changes committed to a docs branch for review
The `allowed_tools` constraint is critical here: give Claude `["Read", "Glob", "Grep", "Edit"]` for the docs directory only, not the entire repo.
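A sketch of the glue around the merge hook, in Python. The branch name, commit message, and `docs/` path are assumptions; note the whitelist stops at `Edit`, so Claude can modify files but not create new ones or run commands:

```python
import subprocess

def update_docs(changed_files: list[str]) -> None:
    # Work on a dedicated docs branch so humans review before merge.
    subprocess.run(["git", "checkout", "-B", "docs-update"], check=True)

    # Read + search + edit only: no Write (new files) and no Bash.
    run_claude_task(
        "These files changed in the last merge: " + ", ".join(changed_files)
        + ". Update the documentation under docs/ to reflect the changes.",
        tools=["Read", "Glob", "Grep", "Edit"],
    )

    # Commit the edits to tracked files for human review.
    subprocess.run(["git", "commit", "-am", "docs: sync with merged changes"], check=True)
```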
Error Handling
| Error | Cause | Resolution |
|---|---|---|
| Timeout | Task too complex or LLM too slow | Increase timeout, simplify prompt |
| Empty response | Claude couldn’t complete the task | Check prompt clarity, check `allowed_tools` |
| Connection refused | Claude API not running | Verify service is up at the expected port |
| Partial result | Session interrupted | Implement retry with idempotent prompts |
For production pipelines, always implement:
- Retry with backoff for transient failures
- Dead letter queue for tasks that fail repeatedly
- Result validation before passing to downstream steps
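A minimal wrapper combining all three, as a sketch; the backoff parameters are arbitrary and `run_claude_task` is the helper from the polling section:

```python
import time

import httpx

def run_with_retry(prompt: str, tools: list | None = None, attempts: int = 3) -> dict:
    """Retry transient failures with exponential backoff; validate before returning."""
    for attempt in range(attempts):
        try:
            result = run_claude_task(prompt, tools=tools)
            # Validate before passing downstream: must be completed and non-empty.
            if result["status"] == "completed" and result.get("response"):
                return result
        except (httpx.TransportError, TimeoutError):
            pass  # transient: connection refused, client-side poll timeout
        time.sleep(2 ** attempt)  # backoff: 1s, 2s, 4s, ...
    # Repeated failure: hand off to a dead letter queue instead of looping.
    raise RuntimeError("task failed repeatedly; route to dead letter queue")
```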
When to Use Claude Code vs. a Standard LLM
| Use Case | Claude Code | Standard LLM |
|---|---|---|
| Needs file system access | Yes | No |
| Needs to run commands | Yes | No |
| Needs multi-step tool use | Yes | Possible but brittle |
| Simple text transformation | Overkill | Yes |
| Structured extraction | Overkill | Yes |
| Bulk processing (1000+ items) | Too slow | Yes |
Claude Code is the right worker when the task requires agency — reading files, searching codebases, running tests, making decisions across multiple steps. For everything else, a direct LLM API call is faster and cheaper.