
Claude Code as a Pipeline Worker — Programmatic Orchestration via HTTP API

Claude Code is designed as an interactive CLI tool. But it exposes an HTTP API that turns it into a programmable worker — one step in a larger automated pipeline.

This reference covers how to use Claude Code programmatically: submitting tasks, polling for results, chaining outputs, and integrating with workflow engines.

The HTTP API

Claude Code can run as an HTTP server that accepts task submissions and returns results.

Submit a task

curl -X POST http://localhost:7847/execute \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Analyze the error logs in /var/log/app.log and summarize the top 3 issues",
    "allowed_tools": ["Read", "Glob", "Grep"],
    "timeout": 120000
  }'

Response:

{
  "session_id": "abc-123",
  "status": "running"
}

Poll for completion

curl http://localhost:7847/sessions/abc-123
{
  "session_id": "abc-123",
  "status": "completed",
  "response": "Analysis of error logs:\n1. Database connection timeouts...",
  "result": { ... }
}

Key parameters

Parameter       Type     Description
--------------  -------  ----------------------------------
prompt          string   The task to execute
allowed_tools   array    Whitelist of tools Claude can use
timeout         integer  Max execution time in ms

allowed_tools is the security boundary. If your pipeline only needs Claude to read and analyze (not write), restrict to ["Read", "Glob", "Grep"]. If it needs to modify files, add ["Edit", "Write"]. Never give more tools than the task requires.
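As an illustration (field names follow the parameters documented above; the prompts and timeouts are purely hypothetical), the same pipeline might use a read-only payload for analysis and a deliberately widened payload only for the steps that must write:

READ_ONLY = ["Read", "Glob", "Grep"]

# Hypothetical payloads for the /execute endpoint described above.
analysis_task = {
    "prompt": "Summarize the top 3 issues in /var/log/app.log",
    "allowed_tools": READ_ONLY,                       # nothing on disk changes
    "timeout": 120000,
}

refactor_task = {
    "prompt": "Apply the fix recommended by the earlier analysis",
    "allowed_tools": READ_ONLY + ["Edit", "Write"],   # write access, granted deliberately
    "timeout": 300000,
}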

The Polling Pattern

Claude Code tasks are asynchronous. The standard integration pattern is submit-poll-collect:

import time

import httpx

def run_claude_task(prompt, tools=None, timeout=120000):
    # Submit the task; the server responds immediately with a session_id.
    resp = httpx.post("http://localhost:7847/execute", json={
        "prompt": prompt,
        "allowed_tools": tools or ["Read", "Glob", "Grep"],
        "timeout": timeout
    })
    resp.raise_for_status()
    session_id = resp.json()["session_id"]

    # Poll every 5 seconds until the session reaches a terminal state.
    while True:
        time.sleep(5)
        status = httpx.get(f"http://localhost:7847/sessions/{session_id}").json()
        if status["status"] in ("completed", "failed"):
            return status

Timeout considerations

Claude Code sessions have two timeouts:

  1. Task timeout — how long the session can run (set via timeout parameter)
  2. Poll timeout — how long your client waits before giving up

Set the task timeout generously enough for the work, and set the poll timeout slightly longer than the task timeout. A task that fails silently because the client gave up is harder to debug than one that returns an explicit timeout error.
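A sketch of this: the polling loop from the previous section can carry an explicit client-side deadline just beyond the task timeout (the 30-second grace period is an arbitrary choice):

import time

import httpx

def run_claude_task_with_deadline(prompt, tools=None, task_timeout_ms=120000):
    # Client-side poll deadline: task timeout plus a 30-second grace period.
    poll_deadline = time.monotonic() + task_timeout_ms / 1000 + 30

    resp = httpx.post("http://localhost:7847/execute", json={
        "prompt": prompt,
        "allowed_tools": tools or ["Read", "Glob", "Grep"],
        "timeout": task_timeout_ms
    })
    session_id = resp.json()["session_id"]

    while time.monotonic() < poll_deadline:
        time.sleep(5)
        status = httpx.get(f"http://localhost:7847/sessions/{session_id}").json()
        if status["status"] in ("completed", "failed"):
            return status

    # Explicit client-side timeout instead of a silent hang.
    raise TimeoutError(f"Session {session_id} did not finish before the poll deadline")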

Composability

The response field in Claude’s output is a plain string. This makes it composable with any downstream processor.

Chain: Claude → LLM extraction

Step 1: Claude Code analyzes a codebase
  → output: {"response": "Found 3 critical issues: ..."}

Step 2: Local LLM extracts structured data
  → input: Claude's response as prompt context
  → output: {"issues": [{"severity": "critical", ...}]}

The pattern: Claude does the complex, tool-using analysis. A cheaper LLM does the structured extraction. This optimizes cost — Claude’s capability is used where it’s needed (tool use, multi-step reasoning), and commodity extraction is offloaded.
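A minimal sketch of the chain, reusing run_claude_task from above and assuming a local OpenAI-compatible extraction server; the endpoint, model name, and output schema are illustrative, not part of any documented API:

import json

import httpx

def analyze_then_extract(log_path):
    # Step 1: Claude does the tool-using analysis with read-only tools.
    analysis = run_claude_task(
        f"Analyze the error logs in {log_path} and summarize the top 3 issues",
        tools=["Read", "Glob", "Grep"],
    )

    # Step 2: a cheaper local LLM turns the prose into structured JSON.
    # Assumes the model reliably returns bare JSON for this system prompt.
    extraction = httpx.post("http://localhost:8080/v1/chat/completions", json={
        "model": "local-extractor",
        "messages": [
            {"role": "system", "content": 'Return only JSON: {"issues": [{"severity": "...", "summary": "..."}]}'},
            {"role": "user", "content": analysis["response"]},
        ],
    }).json()

    return json.loads(extraction["choices"][0]["message"]["content"])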

Chain: HTTP fetch → Claude analysis

Step 1: HTTP worker fetches external data
  → output: {"response": "<raw API data>"}

Step 2: Claude Code interprets the data
  → input: raw data as context
  → output: {"response": "Based on the FRED yield curve data..."}

The HTTP worker handles the structured fetch. Claude handles the unstructured interpretation. Each worker does what it’s best at.
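The same chain as a Python sketch, again reusing run_claude_task; the fetch URL is a placeholder for whatever structured source the pipeline actually pulls:

import httpx

def fetch_then_interpret():
    # Step 1: structured fetch; any HTTP client or dedicated worker will do.
    raw = httpx.get("https://api.example.com/yield-curve").text

    # Step 2: Claude interprets the unstructured payload. The data arrives inline,
    # so the default read-only tool set is more than enough.
    return run_claude_task(
        f"Here is raw yield curve data:\n{raw}\n\n"
        "Summarize what it implies about rate expectations."
    )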

Integration with Workflow Engines

Claude Code as a worker fits naturally into DAG-based workflow engines. Here’s the integration pattern:

Worker definition

A workflow engine needs to know:

  1. How to submit work to Claude (HTTP POST)
  2. How to check completion (HTTP GET polling)
  3. How to extract the result (response field)
  4. What to do on failure (status: "failed")

Example: Elixir/BEAM workflow engine

defmodule Workers.Claude do
  # submit_task/2 wraps POST /execute; poll_until_complete/2 polls GET /sessions/:id
  # until the status is "completed" or "failed".
  def execute(args, deps) do
    prompt = build_prompt(args, deps)
    {:ok, session_id} = submit_task(prompt, args)
    poll_until_complete(session_id, args[:timeout] || 120_000)
  end

  defp build_prompt(args, deps) do
    base = args[:prompt]

    # Inject upstream dependencies as context
    context =
      deps
      |> Enum.map(fn {k, v} -> "#{k}: #{v["response"]}" end)
      |> Enum.join("\n")

    if context != "", do: "Context:\n#{context}\n\nTask: #{base}", else: base
  end
end

Key design decisions:

Dependency injection

In a multi-step workflow:

step_1 (http: fetch data) ──┐
                            ├──→ step_3 (claude: analyze both)
step_2 (http: fetch data) ──┘

Step 3 receives both upstream outputs in its deps map. The prompt template can reference them:

Context from market data: {{step_1.response}}
Context from news feed: {{step_2.response}}

Analyze the relationship between these data points.
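A minimal sketch of that substitution in Python; the {{step.field}} placeholder syntax and the shape of deps mirror the Elixir worker above and are assumptions about your engine's conventions rather than a fixed format:

import re

def render_prompt(template, deps):
    # Replace {{step_name.field}} placeholders with the matching upstream output.
    def lookup(match):
        step, field = match.group(1), match.group(2)
        return str(deps[step][field])

    return re.sub(r"\{\{(\w+)\.(\w+)\}\}", lookup, template)

deps = {
    "step_1": {"response": "<market data summary from step_1>"},
    "step_2": {"response": "<news feed summary from step_2>"},
}

template = (
    "Context from market data: {{step_1.response}}\n"
    "Context from news feed: {{step_2.response}}\n\n"
    "Analyze the relationship between these data points."
)

prompt = render_prompt(template, deps)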

Sequential Execution Constraint

If your pipeline uses a local LLM alongside Claude Code, and the local LLM is single-threaded, you must execute LLM steps sequentially:

;; Wrong: parallel submission will queue and timeout
(pmap #(submit-workflow %) [workflow-1 workflow-2 workflow-3])

;; Right: sequential with blocking wait
(doseq [w [workflow-1 workflow-2 workflow-3]]
  (let [id (submit-workflow w)]
    (wait-for-completion id)))

Claude Code itself can run concurrently with other workers (HTTP, etc.). The sequential constraint only applies to shared resources like a single-GPU LLM server.

Production Patterns

Perception pipeline

A recurring pipeline that runs on cron:

Every 6 hours:
  1. HTTP worker fetches 5 data sources (parallel — all HTTP, no shared resource)
  2. For each source, Claude or LLM extracts insights (sequential — shared LLM)
  3. Results posted to staging API for human review

This pattern automates information gathering while keeping humans in the loop for quality control.
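A sketch of one run, assuming the run_claude_task helper from earlier; the source URLs and staging endpoint are placeholders:

from concurrent.futures import ThreadPoolExecutor

import httpx

SOURCES = ["https://example.com/feed-a", "https://example.com/feed-b"]  # illustrative

def perception_run():
    # 1. Fetch all sources in parallel; plain HTTP, no shared resource.
    with ThreadPoolExecutor() as pool:
        raw_docs = list(pool.map(lambda url: httpx.get(url).text, SOURCES))

    # 2. Extract insights sequentially so the shared LLM is never double-booked.
    insights = [
        run_claude_task(f"Extract the key insights from this document:\n{doc}")
        for doc in raw_docs
    ]

    # 3. Post to a staging API (placeholder endpoint) for human review.
    for insight in insights:
        httpx.post("http://localhost:9000/staging/insights",
                   json={"text": insight["response"]})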

Code review pipeline

On PR creation:
  1. HTTP worker fetches diff from GitHub API
  2. Claude Code analyzes diff with access to full repo context
  3. Claude's review posted as PR comment

Claude’s advantage over simpler LLMs here: it can use Glob, Grep, and Read to understand the full codebase context around the changed lines.
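A sketch of that flow using the public GitHub REST API and run_claude_task; verify the diff media type and endpoints against GitHub's current documentation before relying on them:

import httpx

GITHUB = "https://api.github.com"

def review_pr(owner, repo, pr_number, token):
    headers = {"Authorization": f"Bearer {token}"}

    # 1. Fetch the raw diff for the pull request.
    diff = httpx.get(
        f"{GITHUB}/repos/{owner}/{repo}/pulls/{pr_number}",
        headers={**headers, "Accept": "application/vnd.github.v3.diff"},
    ).text

    # 2. Claude reviews the diff with read-only access to the checked-out repo.
    review = run_claude_task(
        f"Review this diff. Use Glob, Grep, and Read to inspect the surrounding code.\n\n{diff}",
        tools=["Read", "Glob", "Grep"],
        timeout=300000,
    )

    # 3. Post the review back as a PR comment.
    httpx.post(
        f"{GITHUB}/repos/{owner}/{repo}/issues/{pr_number}/comments",
        headers=headers,
        json={"body": review["response"]},
    )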

Automated documentation

On feature branch merge:
  1. Claude Code reads the changed files
  2. Claude updates relevant documentation
  3. Changes committed to a docs branch for review

The allowed_tools constraint is critical here: give Claude ["Read", "Glob", "Grep", "Edit"] for the docs directory only, not the entire repo.
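A sketch of the task submission. The /execute parameters documented above carry no directory scope, so this sketch expresses the docs-only restriction in the prompt; enforce it at the deployment level as well if your setup allows it:

def update_docs(changed_files):
    # Edit is allowed, but Write and command execution are not; the path restriction
    # lives in the prompt because the documented parameters have no directory scope.
    file_list = "\n".join(changed_files)
    return run_claude_task(
        "The following files changed in this merge:\n"
        f"{file_list}\n\n"
        "Update any affected pages under docs/ to match. Only modify files inside docs/.",
        tools=["Read", "Glob", "Grep", "Edit"],
        timeout=300000,
    )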

Error Handling

Error               Cause                              Resolution
------------------  ---------------------------------  ----------------------------------------------
Timeout             Task too complex or LLM too slow   Increase timeout, simplify prompt
Empty response      Claude couldn’t complete the task  Check prompt clarity, check allowed_tools
Connection refused  Claude API not running             Verify the service is up at the expected port
Partial result      Session interrupted                Implement retry with idempotent prompts

For production pipelines, always implement:

  1. Retry with backoff for transient failures (see the sketch after this list)
  2. Dead letter queue for tasks that fail repeatedly
  3. Result validation before passing to downstream steps
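A minimal sketch of retry with backoff plus result validation wrapped around run_claude_task; the backoff schedule and validation rules are illustrative:

import time

import httpx

def run_with_retry(prompt, tools=None, max_attempts=3):
    # Retry with exponential backoff; tasks that exhaust their attempts should
    # land in a dead letter queue rather than vanish.
    result = None
    for attempt in range(1, max_attempts + 1):
        try:
            result = run_claude_task(prompt, tools=tools)
            # Validate before handing the result to downstream steps.
            if result.get("status") == "completed" and result.get("response"):
                return result
        except (httpx.ConnectError, httpx.ReadTimeout):
            result = None  # transient transport failure; retry

        if attempt < max_attempts:
            time.sleep(2 ** attempt)  # 2s, 4s, 8s backoff

    raise RuntimeError(f"Task failed after {max_attempts} attempts: {result!r}")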

When to Use Claude Code vs. a Standard LLM

Use case                       Claude Code  Standard LLM
-----------------------------  -----------  --------------------
Needs file system access       Yes          No
Needs to run commands          Yes          No
Needs multi-step tool use      Yes          Possible but brittle
Simple text transformation     Overkill     Yes
Structured extraction          Overkill     Yes
Bulk processing (1000+ items)  Too slow     Yes

Claude Code is the right worker when the task requires agency — reading files, searching codebases, running tests, making decisions across multiple steps. For everything else, a direct LLM API call is faster and cheaper.

