AI Agents

Orch8 ships three agent-optimized handlers and pre-built sequence templates for common agent patterns. Agents run as ordinary instances — they get the same crash recovery, rate limiting, human input, and output memoization as any other workflow.

llm_call — multi-provider LLM handler

Built-in handler for calling language model APIs. Supports nine providers out of the box (OpenAI, Anthropic, Groq, Mistral, Cohere, Together AI, Fireworks, Perplexity, Ollama). The provider is selected per step via params.provider. API keys live in context.config and are never logged.

{
  "type": "step", "id": "llm_reasoning",
  "handler": "llm_call",
  "params": {
    "provider": "openai",
    "model": "gpt-4o",
    "messages": "{{context.data.messages}}",
    "tools": "{{context.config.available_tools}}",
    "temperature": 0.2
  },
  "timeout": 60000,
  "retry": { "max_attempts": 3, "initial_backoff": 1000, "max_backoff": 15000, "backoff_multiplier": 2.0 }
}

// Supported providers: openai, anthropic, groq, mistral, cohere,
//                      together_ai, fireworks, perplexity, ollama
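The retry block in the example above uses exponential backoff. As a minimal sketch (assuming standard capped exponential backoff, which is how these fields read; the helper below is illustrative, not Orch8's implementation), the delays before each retry resolve like this:

```python
def backoff_schedule(max_attempts, initial_backoff, max_backoff, backoff_multiplier):
    """Delays in ms before each retry, assuming capped exponential backoff.
    The first attempt has no delay, so max_attempts=3 yields two delays."""
    delays = []
    delay = initial_backoff
    for _ in range(max_attempts - 1):
        delays.append(int(min(delay, max_backoff)))
        delay *= backoff_multiplier
    return delays

# Retry config from the llm_call example above: waits 1000 ms, then 2000 ms.
print(backoff_schedule(3, 1000, 15000, 2.0))
```

With more attempts the schedule grows until max_backoff caps it, which is why the example sets a 15-second ceiling against slow LLM provider outages.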

tool_call — dispatch to registered handlers

Routes the LLM's tool-call response to the correct handler. Reads params.tool_name and params.tool_args from the LLM output and dispatches to the named registered handler. Combine with a router block when the tool set is small and explicit, or use tool_call when the agent selects from a dynamic list.

{
  "type": "step", "id": "dispatch_tool",
  "handler": "tool_call",
  "params": {
    "tool_name": "{{steps.llm_reasoning.output.tool_call.name}}",
    "tool_args": "{{steps.llm_reasoning.output.tool_call.arguments}}"
  },
  "timeout": 30000
}
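Conceptually, tool_call is a registry lookup followed by an invocation. The sketch below illustrates that dispatch pattern in Python; the registry, decorator, and handler names are ours for illustration and are not part of Orch8's API:

```python
# Illustrative dispatch: map tool names to handler functions, then invoke
# the named handler with the arguments the LLM produced.
HANDLERS = {}

def register(name):
    """Decorator that adds a handler function to the registry."""
    def wrap(fn):
        HANDLERS[name] = fn
        return fn
    return wrap

@register("search")
def search(query):
    return {"results": [f"hit for {query}"]}

def tool_call(tool_name, tool_args):
    if tool_name not in HANDLERS:
        raise KeyError(f"no registered handler named {tool_name!r}")
    return HANDLERS[tool_name](**tool_args)

print(tool_call("search", {"query": "orchestration"}))
```

An unknown tool name fails fast rather than silently no-oping, which is the behavior you want when an LLM hallucinates a tool.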

human_review — pause for human input

Pauses the agent and waits for a human signal before continuing. Equivalent to wait_for_input but named for agent contexts. The instance moves to Waiting and uses zero scheduler resources until the signal arrives. An optional escalation handler fires if nobody responds within the configured timeout (in milliseconds; the 86400000 in the example below is 24 hours).

{
  "type": "step", "id": "request_human_review",
  "handler": "human_review",
  "params": {
    "reviewer_email": "{{context.config.reviewer_email}}",
    "payload": "{{steps.llm_reasoning.output}}",
    "context_summary": "Agent is ready to execute: {{steps.plan_step.output.plan}}"
  },
  "wait_for_input": {
    "prompt": "Review the proposed plan and approve, reject, or edit.",
    "timeout": 86400000,
    "escalation_handler": "notify_backup_reviewer"
  }
}

// Resume the agent from your app
POST /instances/{id}/signals
{ "signal_type": "custom", "payload": { "decision": "approved" } }
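From application code, resuming the agent means POSTing that JSON body to the signals endpoint. A minimal sketch of building the body (the build_signal helper is ours, and the "decision" key simply mirrors the approval example above; your payload schema is whatever the waiting step expects):

```python
import json

def build_signal(decision, signal_type="custom"):
    """Construct the JSON body for POST /instances/{id}/signals."""
    return json.dumps({"signal_type": signal_type,
                       "payload": {"decision": decision}})

body = build_signal("approved")
# Send `body` with your HTTP client of choice; the engine matches the
# signal to the waiting step and the instance resumes immediately.
```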

ReAct agent template

Pre-built ReAct (Reason + Act) loop template. The agent reasons about the task, picks a tool, executes it, observes the result, and loops until it produces a final answer or hits the iteration cap. Crashed mid-loop? Completed iterations replay from their memoized output — not from scratch.

{
  "name": "react_agent",
  "namespace": "ai",
  "blocks": [
    { "type": "step", "id": "init", "handler": "noop",
      "params": { "messages": [], "iteration": 0, "max_iterations": 10 } },
    {
      "type": "loop", "id": "react_loop",
      "condition": "context.data.agent_done != true",
      "max_iterations": 10,
      "body": [
        { "type": "step", "id": "reason", "handler": "llm_call",
          "params": { "provider": "openai", "model": "gpt-4o",
                      "messages": "{{context.data.messages}}",
                      "tools": "{{context.config.tools}}" },
          "timeout": 60000, "retry": { "max_attempts": 3, "initial_backoff": 1000, "max_backoff": 15000, "backoff_multiplier": 2.0 } },
        { "type": "step", "id": "act", "handler": "tool_call",
          "params": { "tool_name": "{{steps.reason.output.tool_call.name}}",
                      "tool_args": "{{steps.reason.output.tool_call.arguments}}" } }
      ]
    },
    { "type": "step", "id": "finalize", "handler": "noop",
      "params": { "result": "{{context.data.final_answer}}" } }
  ]
}
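The replay-from-memoized-output behavior described above can be sketched as follows. This is an illustration of the concept only, not Orch8 internals: each (block_id, iteration) key caches its result, so replay after a crash returns cached outputs instead of re-executing completed iterations:

```python
# Minimal sketch of output memoization during crash recovery.
memo = {}
executions = []  # tracks which steps actually ran (vs. replayed from cache)

def run_step(block_id, iteration, fn):
    key = (block_id, iteration)
    if key in memo:        # replay path: cached output, no re-execution
        return memo[key]
    result = fn()          # fresh execution
    executions.append(key)
    memo[key] = result
    return result

# First run completes iterations 0 and 1, then "crashes".
for i in range(2):
    run_step("reason", i, lambda i=i: {"thought": f"step {i}"})

# Replay after restart: iterations 0-1 come from the memo; only 2 executes.
for i in range(3):
    run_step("reason", i, lambda i=i: {"thought": f"step {i}"})

assert executions == [("reason", 0), ("reason", 1), ("reason", 2)]
```

This matters for agents in particular because each iteration may involve an expensive (and non-deterministic) LLM call: replaying from the memo keeps the recovered run consistent with what the agent already decided.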

Multi-agent delegation

A coordinator agent decomposes a task into subtasks and spawns specialist agents via sub_sequence. The coordinator waits for all specialists to complete before synthesizing results. Each specialist is an independent instance with its own crash recovery.

// Coordinator: spawns specialists in parallel
{
  "type": "parallel", "id": "spawn_specialists",
  "branches": [
    [{ "type": "sub_sequence", "id": "research_agent",
       "sequence_name": "research_specialist",
       "input": { "query": "{{steps.decompose.output.subtask_1}}" } }],
    [{ "type": "sub_sequence", "id": "code_agent",
       "sequence_name": "code_specialist",
       "input": { "task": "{{steps.decompose.output.subtask_2}}" } }]
  ]
}

// Synthesize after all specialists finish
{ "type": "step", "id": "synthesize", "handler": "llm_call",
  "params": { "provider": "anthropic", "model": "claude-opus-4-6",
              "messages": [{ "role": "user",
                "content": "Synthesize: research={{steps.research_agent.output}}, code={{steps.code_agent.output}}" }] } }
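The coordinator's control flow, stripped of the engine, is fan-out then join. A sketch using Python threads (the specialist functions below are stand-ins for the research_specialist and code_specialist sub-sequences; names are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def research_specialist(query):
    return f"findings on {query}"

def code_specialist(task):
    return f"patch for {task}"

def coordinator(subtask_1, subtask_2):
    # Fan out: run both specialists concurrently, like the parallel block.
    with ThreadPoolExecutor(max_workers=2) as pool:
        research = pool.submit(research_specialist, subtask_1)
        code = pool.submit(code_specialist, subtask_2)
        # Join: .result() blocks until each branch completes, mirroring
        # the coordinator waiting on both sub_sequence branches.
        return {"research": research.result(), "code": code.result()}
```

In Orch8 each branch is a separate instance, so a crashed specialist recovers on its own without restarting its sibling or the coordinator.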

SSE streaming — real-time agent output

Subscribe to live block outputs as they execute. The engine pushes Server-Sent Events whenever a step completes. Connect from any SSE client — browser EventSource, curl, or SDK.

# Subscribe to live step outputs
GET /instances/{id}/stream
Accept: text/event-stream

# Response stream (one event per completed block)
event: block_output
data: {"block_id":"reason","output":{"tool_call":{"name":"search","arguments":{"q":"..."}}}}

event: block_output
data: {"block_id":"act","output":{"results":[...]}}

event: instance_complete
data: {"status":"completed","final_output":{"answer":"..."}}

Useful for streaming partial agent results to a UI as they arrive — no polling required. The connection closes automatically when the instance reaches a terminal state.
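If your client library doesn't speak SSE natively, the wire format above is easy to parse by hand. A minimal sketch (handles only the event: and data: fields shown in the stream above; a production client should also handle id:, retry:, comments, and reconnection):

```python
def parse_sse(stream_text):
    """Parse a text/event-stream body into (event, data) pairs."""
    events, event, data = [], None, []
    for line in stream_text.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and event is not None:
            # Blank line terminates an event.
            events.append((event, "\n".join(data)))
            event, data = None, []
    if event is not None:  # stream ended without a trailing blank line
        events.append((event, "\n".join(data)))
    return events

sample = 'event: block_output\ndata: {"block_id":"reason"}\n\nevent: instance_complete\ndata: {"status":"completed"}\n'
print(parse_sse(sample))
```

Feed each data payload to json.loads to recover the block outputs as they stream in.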