Concurrency & Rate Limits
Orch8 provides built-in concurrency control, idempotency, and per-resource rate limiting — all backed by PostgreSQL for multi-node consistency. When demand exceeds capacity, work is deferred to the next available slot. Nothing is ever dropped or failed due to throttling.
Concurrency Keys#
Concurrency keys limit how many workflow instances with the same key can run simultaneously. This is essential for preventing conflicting operations — for example, ensuring only one campaign runs for a given contact at a time, or that a single order is never processed by two parallel instances.
Set concurrency_key when creating an instance. The engine checks how many active instances share that key and, if the limit is reached, defers the new instance until a slot opens.
Configuration#
POST /instances
{
"sequence_name": "outreach_campaign",
"version": 1,
"tenant_id": "acme",
"concurrency_key": "contact:john@acme.com",
"max_concurrency": 1,
"context": {
"contact_email": "john@acme.com",
"campaign_id": "camp-42"
}
}

| Parameter | Type | Required | Description |
|---|---|---|---|
| concurrency_key | string | Required | Arbitrary string that groups instances. Instances with the same key compete for slots. |
| max_concurrency | integer | Optional | Maximum number of instances with this key that may run at once. Defaults to 1. |
Behavior when limit is exceeded#
When a new instance is created and the concurrency limit for its key is already met, the instance enters a deferred state. It remains in the queue and is automatically promoted to running as soon as an existing instance with the same key completes, fails, or is cancelled. Deferred instances are never dropped.
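The deferral and promotion behavior can be modeled in a few lines. This is a minimal in-memory sketch, not the engine's implementation: the `ConcurrencyGate` class and its method names are hypothetical, first-in-first-out promotion is an assumption (the text only says a deferred instance is promoted when a slot opens), and the real engine keeps this state in PostgreSQL.

```python
from collections import defaultdict, deque


class ConcurrencyGate:
    """Illustrative in-memory model of per-key concurrency slots."""

    def __init__(self):
        self.running = defaultdict(set)     # (tenant_id, key) -> running instance ids
        self.deferred = defaultdict(deque)  # (tenant_id, key) -> waiting instance ids
        self.limits = {}                    # (tenant_id, key) -> max_concurrency

    def create(self, tenant_id, key, instance_id, max_concurrency=1):
        # Keys are qualified by tenant, so tenants never compete with each other.
        slot = (tenant_id, key)
        self.limits[slot] = max_concurrency
        if len(self.running[slot]) < max_concurrency:
            self.running[slot].add(instance_id)
            return "running"
        # Limit reached: queue the instance instead of rejecting it.
        self.deferred[slot].append(instance_id)
        return "deferred"

    def finish(self, tenant_id, key, instance_id):
        """Free a slot (completed, failed, or cancelled) and promote one waiter."""
        slot = (tenant_id, key)
        self.running[slot].discard(instance_id)
        promoted = None
        if self.deferred[slot] and len(self.running[slot]) < self.limits.get(slot, 1):
            # Assumption: the oldest deferred instance is promoted first.
            promoted = self.deferred[slot].popleft()
            self.running[slot].add(promoted)
        return promoted
```

Calling `create` twice with the same tenant and key returns `"running"` and then `"deferred"`; calling `finish` on the first instance returns the id of the second, which now holds the slot.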
// Creating a second instance with the same key while one is running:
POST /instances
{
"sequence_name": "outreach_campaign",
"version": 1,
"tenant_id": "acme",
"concurrency_key": "contact:john@acme.com",
"max_concurrency": 1,
"context": { "campaign_id": "camp-99" }
}
// Response (201):
{
"id": "inst-abc-456",
"status": "deferred",
"concurrency_key": "contact:john@acme.com",
"deferred_reason": "concurrency_limit_reached"
}
Concurrency keys are scoped per tenant: two tenants can use the same key (for example, contact:john@acme.com) without interfering with each other. The engine internally qualifies keys by tenant_id.
Idempotency Keys#
Idempotency keys prevent duplicate instance creation. This is critical when upstream systems may retry requests — for example, a webhook delivery that fires twice due to a network timeout, or a queue consumer that reprocesses a message after a crash.
When an instance is created with an idempotency_key, the engine checks whether an instance with that key already exists within the idempotency window. If it does, the API returns the existing instance instead of creating a duplicate.
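The window check can be pictured as a lookup keyed by the idempotency key. The sketch below is illustrative only: `IdempotencyStore` is a hypothetical name, scoping the key by `tenant_id` and measuring the window from the original creation time are both assumptions, and the real engine stores this state in PostgreSQL rather than a dictionary.

```python
from datetime import datetime, timedelta, timezone


class IdempotencyStore:
    """Illustrative in-memory model of idempotency-key deduplication."""

    def __init__(self):
        # (tenant_id, idempotency_key) -> (instance_id, created_at)
        self._seen = {}

    def create_instance(self, tenant_id, idempotency_key, new_instance_id,
                        window=timedelta(hours=24), now=None):
        now = now or datetime.now(timezone.utc)
        entry = self._seen.get((tenant_id, idempotency_key))
        if entry is not None:
            instance_id, created_at = entry
            if now - created_at < window:
                # Still inside the window: return the existing instance.
                return {"id": instance_id, "deduplicated": True}
        # No live entry: record the new instance under this key.
        self._seen[(tenant_id, idempotency_key)] = (new_instance_id, now)
        return {"id": new_instance_id}
```

A retry that arrives one hour after the first request gets the original instance id back with `"deduplicated": True`; a request with the same key 25 hours later creates a new instance.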
Configuration#
POST /instances
{
"sequence_name": "welcome_onboarding",
"version": 1,
"tenant_id": "acme",
"idempotency_key": "signup:user-12345:2024-01-15",
"idempotency_window": "24h",
"context": {
"user_id": "user-12345",
"signup_source": "landing_page"
}
}

| Parameter | Type | Required | Description |
|---|---|---|---|
| idempotency_key | string | Optional | Unique key for deduplication. If omitted, no deduplication is performed. |
| idempotency_window | string | Optional | Duration string for how long the key is retained. Defaults to "24h". Accepts values like "1h", "12h", "7d". |
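Client code that needs to reason about the window (for example, to decide how long to keep its own retry records) can convert the duration string into a `timedelta`. This sketch assumes only the `h` and `d` suffixes that appear in the examples above; the engine may accept other units, and `parse_window` is a hypothetical helper, not part of Orch8.

```python
import re
from datetime import timedelta

# Suffixes taken from the documented examples ("1h", "12h", "7d").
_UNITS = {"h": "hours", "d": "days"}


def parse_window(value: str = "24h") -> timedelta:
    """Convert a duration string such as "12h" or "7d" into a timedelta."""
    match = re.fullmatch(r"(\d+)([hd])", value)
    if match is None:
        raise ValueError(f"unsupported duration string: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})
```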
Deduplicated response#
When a duplicate request is detected, the API returns the existing instance with an additional "deduplicated": true field. The HTTP status code is still 201, so callers that only check status codes continue to work without changes.
// Second request with the same idempotency_key:
POST /instances
{
"sequence_name": "welcome_onboarding",
"version": 1,
"tenant_id": "acme",
"idempotency_key": "signup:user-12345:2024-01-15",
"context": { "user_id": "user-12345" }
}
// Response (201):
{
"id": "inst-original-789",
"status": "running",
"idempotency_key": "signup:user-12345:2024-01-15",
"deduplicated": true
}
Derive idempotency keys from stable, deterministic identifiers, for example stripe:evt_123abc or signup:user-42:2024-01-15. Avoid random UUIDs generated per request; those defeat the purpose of deduplication.
Rate Limiting#
Rate limiting controls how frequently steps can execute against a shared resource. The engine uses a sliding window counter scoped by tenant and resource key. When a step would exceed the limit, it is automatically deferred to the next available slot — overages are rescheduled, never dropped or failed.
Rate limiting involves two pieces: a rate_limit_key on the step definition, and a rate limit resource configuration that defines the window and maximum count.
Step configuration#
Add rate_limit_key to any step that should be throttled. The key string references a rate limit resource created via the API.
{
"id": "send-email",
"type": "step",
"handler": "http_request",
"params": {
"url": "https://api.sendgrid.com/v3/mail/send",
"method": "POST",
"body": {
"from": "outreach@acme.com",
"to": "{{context.recipient}}",
"subject": "Hello from Acme"
}
},
"rate_limit_key": "mailbox:outreach@acme.com"
}
Rate limit resource#
Rate limit resources are created via the API and define the window size and maximum count for a given key.
POST /rate-limits
{
"tenant_id": "acme",
"resource_key": "mailbox:outreach@acme.com",
"max_count": 30,
"window_seconds": 86400
}

| Parameter | Type | Required | Description |
|---|---|---|---|
| resource_key | string | Required | Unique identifier for the rate-limited resource. Must match the rate_limit_key on steps. |
| max_count | integer | Required | Maximum number of actions allowed within the sliding window. |
| window_seconds | integer | Required | Size of the sliding window in seconds. E.g., 86400 for a daily limit, 3600 for hourly. |
How the Sliding Window Works#
The engine maintains a count of actions that occurred within a rolling time window. Unlike fixed windows that reset at calendar boundaries (which allow bursts at window edges), a sliding window ensures the limit is respected at every point in time.
When a step with a rate_limit_key is ready to execute, the scheduler performs a single atomic operation:
1. Count how many actions for this resource key occurred within the last window_seconds.
2. If the count is below max_count, execute the step and record the action timestamp.
3. If the count equals max_count, calculate when the oldest action in the window will expire. Reschedule the step to that time.
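The three steps above can be sketched as a small in-memory limiter. This is a simplified model under stated assumptions: `SlidingWindowLimiter` and `try_acquire` are hypothetical names, timestamps are plain seconds, and the real engine performs the same check atomically in PostgreSQL rather than in process memory.

```python
from collections import deque


class SlidingWindowLimiter:
    """Illustrative sliding-window counter for a single resource key."""

    def __init__(self, max_count, window_seconds):
        self.max_count = max_count
        self.window_seconds = window_seconds
        self._timestamps = deque()  # recorded action times, oldest first

    def try_acquire(self, now):
        """Return (True, now) if the step may run, else (False, retry_at)."""
        # Step 1: drop actions that have left the window, then count the rest.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        # Step 2: below the limit, so run the step and record the action.
        if len(self._timestamps) < self.max_count:
            self._timestamps.append(now)
            return True, now
        # Step 3: at the limit, so reschedule to when the oldest action expires.
        return False, self._timestamps[0] + self.window_seconds
```

With `max_count=30` and `window_seconds=86400`, a 31st call made 20 hours after the oldest recorded action is refused with a `retry_at` four hours in the future, and a call at that time succeeds.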
This means rate-limited steps are never rejected — they are pushed forward in time to the next available slot. The workflow continues running as soon as the slot opens.
// Example: 30 emails per day (86400 seconds)
{
"resource_key": "mailbox:outreach@acme.com",
"max_count": 30,
"window_seconds": 86400
}
// If 30 emails have been sent and the oldest was 20 hours ago,
// the next slot opens in 4 hours.
// The step is automatically rescheduled to that time.
Multi-Node Consistency#
Rate limit counters are stored in PostgreSQL, not in process memory. Every scheduler node reads and writes the same counters table. This design provides several guarantees:
- No split-brain: two nodes cannot independently exceed a limit because the check-and-increment is a single atomic database transaction.
- Crash resilience: if a scheduler node restarts, counters are not lost. The new process reads current state from the database.
- Horizontal scaling: add or remove scheduler nodes freely. Rate limit accuracy is unaffected by cluster topology changes.
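To illustrate why a single transaction prevents two nodes from both passing the check, here is a sketch of check-and-record against a shared database. It uses Python's built-in `sqlite3` purely as a stand-in so the example runs without a server; the table name `rate_limit_actions`, its columns, and the locking strategy are assumptions, and Orch8's actual PostgreSQL schema and queries are not documented here.

```python
import sqlite3


def open_store(path=":memory:"):
    # isolation_level=None puts the connection in autocommit mode so that
    # BEGIN and COMMIT can be issued explicitly.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS rate_limit_actions ("
        "tenant_id TEXT, resource_key TEXT, occurred_at REAL)"
    )
    return conn


def check_and_record(conn, tenant_id, resource_key, max_count, window_seconds, now):
    """Return None if the action was recorded, else the time the next slot opens."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading the count
    try:
        count, oldest = conn.execute(
            "SELECT COUNT(*), MIN(occurred_at) FROM rate_limit_actions "
            "WHERE tenant_id = ? AND resource_key = ? AND occurred_at > ?",
            (tenant_id, resource_key, now - window_seconds),
        ).fetchone()
        if count < max_count:
            conn.execute(
                "INSERT INTO rate_limit_actions VALUES (?, ?, ?)",
                (tenant_id, resource_key, now),
            )
            retry_at = None
        else:
            retry_at = oldest + window_seconds
        conn.execute("COMMIT")
        return retry_at
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

Because the count and the insert happen under one lock, a second caller cannot read a stale count between them. A PostgreSQL version would need an equivalent serialization point, such as a row lock or an advisory lock on the resource key.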
API Reference#
Create or update a rate limit resource. If a resource with the same resource_key and tenant_id already exists, the window and count are updated in place.
POST /rate-limits
{
"tenant_id": "acme",
"resource_key": "api:openai",
"max_count": 100,
"window_seconds": 60
}
// Response (201):
{
"id": "rl-550e8400-...",
"tenant_id": "acme",
"resource_key": "api:openai",
"max_count": 100,
"window_seconds": 60,
"created_at": "2024-01-15T10:00:00Z"
}
List all rate limit resources for a tenant. Returns an array of rate limit objects.
GET /rate-limits?tenant_id=acme
// Response (200):
[
{
"id": "rl-550e8400-...",
"tenant_id": "acme",
"resource_key": "mailbox:outreach@acme.com",
"max_count": 30,
"window_seconds": 86400,
"created_at": "2024-01-15T10:00:00Z"
},
{
"id": "rl-660f9511-...",
"tenant_id": "acme",
"resource_key": "api:openai",
"max_count": 100,
"window_seconds": 60,
"created_at": "2024-01-16T08:30:00Z"
}
]
Remove a rate limit resource. Steps referencing this key will no longer be throttled; they execute immediately. Existing deferred steps are released on the next scheduler tick.
DELETE /rate-limits/mailbox:outreach@acme.com?tenant_id=acme
// Response: 204 No Content
Common Patterns#
Email warmup#
New email accounts need to gradually increase sending volume to build sender reputation. Combine rate limiting with resource pools to start at 10 emails per day and ramp to 100 over two weeks.
// Rate limit for a new mailbox — start conservative
POST /rate-limits
{
"tenant_id": "acme",
"resource_key": "mailbox:new-sender@acme.com",
"max_count": 10,
"window_seconds": 86400
}
// After 1 week, increase via update
POST /rate-limits
{
"tenant_id": "acme",
"resource_key": "mailbox:new-sender@acme.com",
"max_count": 50,
"window_seconds": 86400
}
// After 2 weeks, full capacity
POST /rate-limits
{
"tenant_id": "acme",
"resource_key": "mailbox:new-sender@acme.com",
"max_count": 100,
"window_seconds": 86400
}
For automated warmup ramps that adjust daily without manual updates, see the Resource Pools warmup feature.
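If the manual updates are driven by a script, the stepped schedule from the example (10, then 50 after one week, then 100 after two weeks) can be encoded as data. This is a sketch: `WARMUP_STEPS` and `warmup_payload` are hypothetical names, and the function only builds the request body for `POST /rate-limits`; sending it is left to whatever HTTP client is in use.

```python
# (days since the mailbox started sending, max_count) pairs from the example above.
WARMUP_STEPS = [(0, 10), (7, 50), (14, 100)]


def warmup_payload(tenant_id, mailbox, days_since_start):
    """Build the POST /rate-limits body for the current warmup stage."""
    max_count = WARMUP_STEPS[0][1]
    for start_day, count in WARMUP_STEPS:
        # Keep the limit of the latest stage that has already begun.
        if days_since_start >= start_day:
            max_count = count
    return {
        "tenant_id": tenant_id,
        "resource_key": f"mailbox:{mailbox}",
        "max_count": max_count,
        "window_seconds": 86400,  # daily window
    }
```

Because creating a rate limit with an existing `resource_key` and `tenant_id` updates it in place, the same payload builder can be run once a day without tracking whether the resource already exists.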
API gateway protection#
Limit outbound calls to an external API to stay within provider quotas. Multiple workflows sharing the same rate_limit_key are throttled together.
// Rate limit: 100 requests per minute to OpenAI
POST /rate-limits
{
"tenant_id": "acme",
"resource_key": "api:openai",
"max_count": 100,
"window_seconds": 60
}
// Any step referencing this key is throttled
{
"id": "generate-summary",
"type": "step",
"handler": "http_request",
"params": {
"url": "https://api.openai.com/v1/chat/completions",
"method": "POST",
"body": { "model": "gpt-4", "messages": "{{context.messages}}" }
},
"rate_limit_key": "api:openai"
}
Per-user throttling#
Limit how many notifications a single user receives per hour. Use a dynamic key that includes the user ID so each user gets an independent counter.
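Since every resolved key needs its own rate limit resource, a provisioning script might generate one request body per user. The sketch below mimics the `{{context.user_id}}` substitution only to show which keys result; `resolve_key` and `per_user_limits` are hypothetical helpers, and the engine's real template resolution may behave differently.

```python
import re

# Matches placeholders of the form {{context.<name>}}.
_PLACEHOLDER = re.compile(r"\{\{\s*context\.(\w+)\s*\}\}")


def resolve_key(template, context):
    """Replace each {{context.<name>}} placeholder with its value from context."""
    return _PLACEHOLDER.sub(lambda m: str(context[m.group(1)]), template)


def per_user_limits(tenant_id, user_ids, max_count=5, window_seconds=3600):
    """Build one POST /rate-limits body per user for the notify:<user> key."""
    return [
        {
            "tenant_id": tenant_id,
            "resource_key": resolve_key("notify:{{context.user_id}}", {"user_id": uid}),
            "max_count": max_count,
            "window_seconds": window_seconds,
        }
        for uid in user_ids
    ]
```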
// Rate limit: 5 notifications per hour per user
POST /rate-limits
{
"tenant_id": "acme",
"resource_key": "notify:user-12345",
"max_count": 5,
"window_seconds": 3600
}
// Step definition — key is per-user
{
"id": "send-notification",
"type": "step",
"handler": "http_request",
"params": {
"url": "https://push.acme.com/send",
"method": "POST",
"body": {
"user_id": "{{context.user_id}}",
"message": "{{context.notification_text}}"
}
},
"rate_limit_key": "notify:{{context.user_id}}"
}
When using dynamic keys (for example, notify:{{context.user_id}}), make sure a corresponding rate limit resource exists for each resolved key. If no resource matches, the step executes without throttling.
Combining concurrency and rate limiting#
Concurrency keys and rate limits solve different problems and can be used together. Concurrency keys prevent parallel execution of conflicting workflows. Rate limits control throughput over time.
// Instance: one campaign per contact at a time (concurrency)
POST /instances
{
"sequence_name": "outreach_campaign",
"version": 1,
"tenant_id": "acme",
"concurrency_key": "contact:john@acme.com",
"max_concurrency": 1,
"context": { "contact_email": "john@acme.com" }
}
// Step within the workflow: throttle email sending (rate limit)
{
"id": "send-email",
"type": "step",
"handler": "http_request",
"params": { "url": "https://api.sendgrid.com/v3/mail/send", "method": "POST" },
"rate_limit_key": "mailbox:outreach@acme.com"
}
In this example, the concurrency key ensures only one campaign runs per contact. The rate limit ensures the shared mailbox does not exceed 30 emails per day across all campaigns, regardless of how many contacts are being reached.