Throughput and rate limits

IsoKron applies two kinds of rate limits to your workspace. Both are designed to keep the platform fast for everyone without constraining typical customer workflows; if your usage pattern exceeds either limit, you can request an adjustment.

1. API request rate limit (token bucket)

Every HTTP request to the API is metered through a per-workspace token bucket that refills at 60 requests per minute (with a small burst allowance). The bucket is held in memory per API process and resets when a process restarts.
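For intuition, the limiter behaves like a classic token bucket. A minimal sketch in Python (illustrative, not our implementation; the burst size of 10 is an assumed value, since only "a small burst allowance" is promised above):

import time

class TokenBucket:
    """Illustrative token bucket: refills at `rate` tokens/sec, capped at `burst`."""
    def __init__(self, rate=60 / 60.0, burst=10):  # 60 req/min; burst=10 is assumed
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, never above the burst cap.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # caller surfaces HTTP 429 with a retry hint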

Requests that exceed the bucket get HTTP 429 with a retry_after_seconds hint:

{
  "error": "rate_limited",
  "scope": "workspace",
  "retry_after_seconds": 60
}
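A client that honors the hint needs only a sleep-and-retry loop. A sketch in Python (the requests library and the bare URL parameter are illustrative choices; retry_after_seconds is the documented field):

import time
import requests  # any HTTP client works; requests is used here for brevity

def get_with_backoff(url, max_retries=3):
    """Retry on HTTP 429, sleeping for the server-provided retry_after_seconds."""
    for _ in range(max_retries):
        resp = requests.get(url)
        if resp.status_code != 429:
            return resp
        time.sleep(resp.json().get("retry_after_seconds", 60))
    return resp  # still rate-limited after max_retries; let the caller decide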

The 60/min bucket is generous for interactive use (dashboard, compilation kick-offs, ledger reads). If your integration reliably sustains more than 60 req/min, that is the signal to contact support for a plan-tier bucket-size adjustment.

2. Per-workspace event throughput (soft limit)

Every customer action that mutates a Component, Ticket, Decision, or other Class-1/Class-2 entity emits an event into your workspace's audit chain. That audit chain has a soft per-workspace ceiling on sustained event emission rate, structured as a two-tier limit because the two retention classes have meaningfully different cost profiles:

Tier 1 — Audit-class events

Entity emissions (the queue-drained event_log that records every Component / Ticket / Decision mutation) are audit-class: the payload lives inline in the audit chain itself, with no separate substrate write. Our internal Phase 2 measurement campaign put the audit-class ceiling at ~1,241 events per second per workspace on workstation substrate, with a flat ~37 ms p99 latency floor (3-rep median; 1% variance).

This is the limit that applies to the dominant typical-customer workflow: the compiler's per-stage state changes, audit-log entries from operator actions, etc.

Tier 2 — Standard / extended retention class events

Events that carry a separate VCR substrate object (cached artifacts, large attachments, etc.) are held to a more conservative documented soft limit of 200 events per second per workspace. The substrate write is the rate-limiting cost: each event writes to one or two cloud-storage substrates (Supabase Storage for primary, Cloudflare R2 for extended retention), and those writes dominate the per-event latency (sketched after this subsection).

This limit is the TB1 conservative pre-launch figure from the Phase 2 measurement campaign; first-customer production validation may revise it upward.
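A simplified view of why the PUT dominates, sketched in Python (illustrative only; the names commit_tier2_event, primary_store, and extended_store are ours for the sketch, not real internals):

def commit_tier2_event(event, chain, primary_store, extended_store=None):
    """Illustrative Tier 2 commit: the substrate PUT(s) sit on the hot path."""
    primary_store.put(event.object_key, event.payload)    # network round-trip; dominates latency
    if extended_store is not None:                        # extended retention adds a second PUT
        extended_store.put(event.object_key, event.payload)
    chain.append(event.metadata)                          # inline chain append is comparatively cheap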

What happens at the soft limit

We don't reject events outright at the soft limit; there is no HTTP 429 from event throughput. Instead, the audit chain commits more slowly under sustained pressure (events queue and drain as the chain advances). If you push past the limit, your dashboard's "last event" timestamp will lag slightly behind the wall clock.

At the hard limit (currently ~2× the soft limit; to be re-measured during Phase 5 first-customer validation), event emissions begin to return HTTP 429 with an events_queued error code. Operators are paged automatically at that point.
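By analogy with the API-limit response above, a hard-limit rejection would look roughly like this (the events_queued code is the documented part; the surrounding payload shape is illustrative):

{
  "error": "events_queued",
  "scope": "workspace"
}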

Tail-latency expectations

Under Tier 1 (audit-class), 99% of audit-chain commits complete in under ~37 ms end-to-end. The latency floor comes from the per-tenant micro-batching window the chain-head uses to amortize the lock-acquisition cost across multiple events.
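The micro-batching idea, sketched in Python (illustrative, not the chain-head's real code; the window length is an assumed value): hold the lock once and commit everything that queued during the window as a single batch.

import queue
import time

def chain_head_loop(events: queue.Queue, chain_lock, commit_batch, window_s=0.025):
    """Illustrative micro-batcher: one lock acquisition is amortized over every
    event that arrived inside the batching window (window length assumed)."""
    while True:
        batch = [events.get()]      # block until at least one event arrives
        time.sleep(window_s)        # batching window: let more events queue up
        while True:
            try:
                batch.append(events.get_nowait())
            except queue.Empty:
                break
        with chain_lock:            # single lock acquisition for the whole batch
            commit_batch(batch)     # one chain-head commit advances all of them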

Under Tier 2 (standard / extended), 99% of substrate writes complete in under ~131 ms end-to-end — wider because the substrate PUT is on the hot path.

How we measure these limits

The two-tier soft limits reflect internal measurement against a local Postgres + MinIO stack (per the Phase 2 measurement campaign and the D25 hybrid routing follow-up); they are local-substrate figures pending first-customer production validation. Once first-customer workloads run on production substrate (Supabase Storage + Cloudflare R2 over a real network), we will re-measure and update this page with the production figures.

Until then, the limits are conservative but real: a workload at the soft limit will keep up cleanly in our local rig; whether the same workload runs faster in production is the validation question.
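For a sense of the measurement shape, a toy version of the load loop (the real Phase 2 harness is more involved; a single-threaded loop like this cannot reach four-digit EPS at ~37 ms latency, so concurrent workers would be needed in practice; emit_event is a placeholder):

import time

def measure_p99(emit_event, target_eps, duration_s=60):
    """Drive roughly target_eps synchronous commits/sec and report the p99 latency."""
    latencies, interval = [], 1.0 / target_eps
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        t0 = time.monotonic()
        emit_event()                                      # one timed end-to-end commit
        elapsed = time.monotonic() - t0
        latencies.append(elapsed)
        time.sleep(max(0.0, interval - elapsed))          # pace to the target rate
    latencies.sort()
    return latencies[int(0.99 * (len(latencies) - 1))]   # p99 of observed commits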

Asking for more headroom

If you expect to sustain > 60 req/min (API rate limit), > 1,241 EPS audit-class, or > 200 EPS standard/extended, email support with the following:

  • Workspace ID + estimated sustained request rate
  • What your fleet is doing at peak (e.g., "fan-out compilation across 50 customer projects in parallel")
  • Whether you need a temporary lift (one-off batch import) or a sustained plan-tier change

Most adjustments are configuration-only (bucket-size bump for the API limit; substrate provisioning for the event limit) and land within one business day.

Last updated when D25 hybrid routing shipped at v1 launch (audit-class events go through micro-batched chain-head locks on the C2 path; standard / extended retention classes stay on the per-event C1 path). Pre-launch, the universal soft limit was ~200 EPS per workspace; the hybrid design lifted the measured audit-class ceiling to ~1,241 EPS on our local rig, with standard / extended retaining the conservative 200 EPS pre-launch limit. Production-substrate validation is pending per Phase 5.
