
Harmony

Paid

Durable Execution

Durable execution for AI agents. One Rust binary with Vinyl (memory) and Echo (coordination) built in. Zero external dependencies.

Crash recovery & exactly-once replay
Provider-aware retry & auto model fallback
ReAct loops, approval gates, parallel agents
Time-travel debugging & prompt versioning

What is Harmony?

Reverb Harmony is an embedded durable execution runtime, often described as 'Temporal for agents'. It ships as a single Rust binary with Vinyl (graph memory) and Echo (memory coordination) built in: durable execution, persistent memory, and multi-agent coordination in one binary with zero external dependencies. Define workflows in YAML or code, and Harmony ensures they complete reliably with exactly-once semantics and checkpoint-based crash recovery.

Why durable execution?

AI agent workflows are long-running, expensive, and involve irreversible side effects. When an LLM call takes 30 seconds and costs $0.10, you don't want to redo it because of a crash. Harmony answers four questions: What is running? Where is it? What happens if it crashes? Can it resume exactly where it left off?

Provider-aware LLM retry

Harmony knows which LLM provider you're calling and retries intelligently for each one. Anthropic sends Retry-After headers, and Harmony respects them exactly. OpenAI has tiered rate limits with x-ratelimit headers, so Harmony reads remaining quota and waits precisely until reset. Google's Gemini doesn't send retry headers at all, so Harmony uses aggressive exponential backoff with a higher multiplier. Groq resets fast, so Harmony uses short 0.5s backoffs. Each provider also gets smart fallback: if Claude Opus is overloaded, Harmony automatically falls back to Sonnet; if GPT-4o hits its limit, it drops to gpt-4o-mini. This saves you money by never retrying blindly and always picking the cheapest viable model when the primary is unavailable.
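A rough sketch of the per-provider policy described above. The types and numbers here are illustrative, not Harmony's actual API: when the provider supplies a wait (Anthropic's Retry-After, or a reset time derived from OpenAI's x-ratelimit headers), it is honored exactly; otherwise the backoff curve depends on the provider.

```rust
use std::time::Duration;

// Illustrative stand-in for Harmony's provider enum.
#[derive(Clone, Copy)]
enum Provider { Anthropic, OpenAi, Gemini, Groq }

/// Pick a wait time for the next retry. `attempt` starts at 1.
/// `retry_after` carries a provider-supplied wait when one exists
/// (Anthropic's Retry-After header, or a reset computed from
/// OpenAI's x-ratelimit headers).
fn backoff(provider: Provider, attempt: u32, retry_after: Option<Duration>) -> Duration {
    // A provider-supplied wait is always respected exactly.
    if let Some(d) = retry_after {
        return d;
    }
    let n = attempt.saturating_sub(1);
    match provider {
        // Groq quotas reset fast: short 0.5s-step backoffs.
        Provider::Groq => Duration::from_millis(500 * attempt as u64),
        // Gemini sends no retry headers: aggressive exponential
        // backoff with a higher multiplier (base 1s, x3 per attempt).
        Provider::Gemini => Duration::from_secs(3u64.pow(n)),
        // Fallback when no header arrived: classic exponential
        // backoff (base 1s, x2 per attempt).
        Provider::Anthropic | Provider::OpenAi => Duration::from_secs(2u64.pow(n)),
    }
}
```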

Embedded, not external

Unlike Temporal or Restate, Harmony is an embedded library, not an external orchestrator. No separate server, no registration step, no dual ports. Add it as a dependency, annotate your workflows, and run your app. One binary, one port, zero infrastructure.

Memory integration

Harmony has dedicated step types for Echo operations (EchoReadStep, EchoWriteStep) with transactional atomicity. Memory writes through Harmony are exactly-once: the journal records intent, Echo executes with an idempotency key, and replay skips completed writes.
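The exactly-once mechanics can be sketched like this. `Journal` and `EchoStore` are stand-ins for Harmony's internals, not real types: the journal records completion under an idempotency key, so a crashed-and-replayed workflow skips writes that already happened.

```rust
use std::collections::{HashMap, HashSet};

// Stand-in for the durable journal: remembers which idempotency
// keys have completed.
struct Journal { completed: HashSet<String> }

// Stand-in for the Echo memory store.
struct EchoStore { data: HashMap<String, String> }

/// A durable EchoWrite step. Returns true if the write executed,
/// false if it was skipped because replay found it already done.
fn echo_write_step(
    journal: &mut Journal,
    store: &mut EchoStore,
    idempotency_key: &str,
    key: &str,
    value: &str,
) -> bool {
    // Replay path: the journal says this write already committed.
    if journal.completed.contains(idempotency_key) {
        return false;
    }
    // First execution: perform the write, then record completion.
    store.data.insert(key.to_string(), value.to_string());
    journal.completed.insert(idempotency_key.to_string());
    true
}
```

Running the same step twice with the same idempotency key performs exactly one write.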

Token budgets and cost tracking

Harmony tracks token usage and API costs per workflow in real time. You can set per-workflow token budgets with hard limits (abort), soft limits (warn and continue), or adaptive limits (reduce context window). Every LLM call records input tokens, output tokens, estimated cost, and which prompt version produced the output. This gives you full visibility into where your money is going and lets you set guardrails before a runaway agent burns through your API credits.
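The three limit modes boil down to a small decision, sketched here with illustrative names (not Harmony's API): before each LLM call, compare tokens used against the budget and act according to the policy.

```rust
// Illustrative budget policy and the action taken when it trips.
#[derive(Debug, PartialEq)]
enum BudgetPolicy { Hard, Soft, Adaptive }

#[derive(Debug, PartialEq)]
enum BudgetAction { Continue, Warn, ShrinkContext, Abort }

/// Decide what to do before the next LLM call.
fn check_budget(policy: &BudgetPolicy, used_tokens: u64, limit: u64) -> BudgetAction {
    if used_tokens < limit {
        return BudgetAction::Continue;
    }
    match policy {
        BudgetPolicy::Hard => BudgetAction::Abort,             // hard limit: stop the workflow
        BudgetPolicy::Soft => BudgetAction::Warn,              // soft limit: warn and continue
        BudgetPolicy::Adaptive => BudgetAction::ShrinkContext, // adaptive: reduce context window
    }
}
```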

YAML and code workflows

Simple flows are defined in YAML, making them readable, versionable, and easy to share. Complex logic uses code with Rust, Python, or TypeScript SDKs (via Mozilla's uniffi for polyglot bindings). Both support the same step types: tool calls, Echo reads/writes, branching, parallel execution, and child workflow spawning. The same workflow definition works locally with SQLite and at scale with PostgreSQL. Change the config, not the code.
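A YAML flow might look like the following. The schema here is hypothetical, written to match the step types named above (echo reads/writes, tool and LLM calls); the real field names may differ.

```yaml
# Hypothetical workflow definition -- illustrative schema only.
workflow: summarize_ticket
steps:
  - id: load_context
    type: echo_read
    key: "ticket:{{ input.ticket_id }}"
  - id: summarize
    type: llm_call
    model: claude-sonnet
    prompt: "Summarize this support ticket: {{ steps.load_context.value }}"
  - id: save_summary
    type: echo_write
    key: "summary:{{ input.ticket_id }}"
    value: "{{ steps.summarize.output }}"
```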

First-class ReAct loop

The Reason-Act-Observe loop is the core pattern behind every autonomous agent. Harmony makes it a durable primitive with ctx.react_loop(). Each iteration (the LLM reasons, picks a tool, observes the result) is individually checkpointed to the journal. If the process crashes on iteration 7 of 12, it resumes at iteration 7 with full state, not from the beginning. The loop also enforces configurable exit conditions: max iterations, token budget exhaustion, confidence thresholds, or explicit agent signals. This turns a fragile unbounded loop into a reliable, auditable execution trace.
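The resume semantics can be modeled in a few lines. This is a stand-in sketch, not `ctx.react_loop()` itself: the loop's state is a `(iteration, trace)` pair that gets checkpointed after every iteration, so recovery re-enters the loop mid-flight instead of at zero.

```rust
/// Stand-in ReAct loop. `resume` is the last checkpoint, if any:
/// the iteration reached so far plus the accumulated trace.
fn react_loop(resume: Option<(u32, Vec<String>)>, max_iters: u32) -> (u32, Vec<String>) {
    // Fresh start, or pick up exactly where the checkpoint left off.
    let (mut iter, mut trace) = resume.unwrap_or((0, Vec::new()));
    while iter < max_iters {
        // Reason -> Act -> Observe, stubbed here as a trace entry.
        trace.push(format!("iteration {iter}"));
        iter += 1;
        // In the real runtime, (iter, trace) is journaled HERE, so a
        // crash on iteration 7 of 12 resumes at 7, not 0.
    }
    (iter, trace)
}
```

Resuming from a checkpoint at iteration 7 with a 12-iteration cap performs only the remaining five iterations.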

Time-travel debugging

Every step, decision, and LLM call in a workflow is journaled. Harmony lets you replay execution to any point in history and inspect the exact state the agent saw at that moment: what was in memory, which prompt was sent, what the model returned, why it branched left instead of right. This is essential for debugging agents in production where 'why did it do that?' is the hardest question to answer. Same journal, same execution, deterministic replay guaranteed.
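Conceptually, time travel is just filtering the journal. A minimal sketch with illustrative types: replaying entries up to a chosen step reconstructs exactly what the agent had seen at that moment.

```rust
// Illustrative journal entry: a step number plus what happened there.
struct JournalEntry { step: u32, event: String }

/// Reconstruct the visible history as of `step` by replaying the
/// journal up to (and including) that point.
fn state_at(journal: &[JournalEntry], step: u32) -> Vec<&str> {
    journal.iter()
        .filter(|e| e.step <= step)
        .map(|e| e.event.as_str())
        .collect()
}
```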

Structured output validation

When an LLM returns malformed JSON, missing fields, or output that doesn't match your expected schema, Harmony automatically retries with a corrective prompt that includes the validation error. You define the expected output shape once, and Harmony handles parsing, validation, and retry in a single durable step. No more manual try/catch loops around every LLM call. If the model keeps failing after max attempts, Harmony falls back to a simpler model that tends to follow instructions more reliably.
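The validate-and-retry shape looks roughly like this sketch (the function names and the integer schema are illustrative): each failure feeds the validation error back into a corrective prompt, and exhausting `max_attempts` returns `None` so the caller can fall back to another model.

```rust
/// Toy validator: the "schema" here is simply an integer.
fn validate(output: &str) -> Result<i64, String> {
    output.trim().parse::<i64>()
        .map_err(|_| format!("expected an integer, got {output:?}"))
}

/// Call the model, validate, and retry with a corrective prompt
/// that includes the validation error. `call_llm` stands in for
/// the real LLM client.
fn call_with_validation(
    mut call_llm: impl FnMut(&str) -> String,
    base_prompt: &str,
    max_attempts: u32,
) -> Option<i64> {
    let mut prompt = base_prompt.to_string();
    for _ in 0..max_attempts {
        let raw = call_llm(&prompt);
        match validate(&raw) {
            Ok(v) => return Some(v),
            // Feed the error back so the model can self-correct.
            Err(e) => prompt = format!(
                "{base_prompt}\nYour last answer was invalid: {e}. Return only an integer."
            ),
        }
    }
    None // caller falls back to a simpler model here
}
```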

Human-in-the-loop gates

Some agent actions are too consequential to run unsupervised: deploying code, sending emails, modifying production data. Harmony supports durable approval gates where a workflow pauses, notifies a human, and resumes only after explicit approval. The pause is crash-safe; even if the server restarts while waiting, the workflow resumes in its paused state. Gates can also have timeouts, auto-approve rules for low-risk actions, and escalation chains.
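One way to picture the gate logic (illustrative types, not Harmony's API): low-risk actions auto-approve, an explicit decision resolves the gate, a timeout escalates, and anything else stays durably paused.

```rust
// Illustrative gate outcomes.
#[derive(Debug, PartialEq)]
enum GateState { Pending, Approved, Rejected, AutoApproved, Escalated }

/// Resolve an approval gate. `approved` is the human's decision,
/// if one has arrived yet.
fn resolve_gate(
    risk_level: u8,
    approved: Option<bool>,
    waited_secs: u64,
    timeout_secs: u64,
) -> GateState {
    // Auto-approve rule for low-risk actions.
    if risk_level <= 2 {
        return GateState::AutoApproved;
    }
    match approved {
        Some(true) => GateState::Approved,
        Some(false) => GateState::Rejected,
        // Timeout: hand off to the escalation chain.
        None if waited_secs >= timeout_secs => GateState::Escalated,
        // Durable pause: this state is persisted, so it survives
        // a server restart.
        None => GateState::Pending,
    }
}
```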

Parallel agent orchestration

Harmony can run multiple LLM calls or tool invocations in parallel within a single workflow, each with its own durable checkpoint. Fan out to three models simultaneously, collect their outputs, run consensus through Echo, and merge the results. If one branch fails, only that branch retries. Child workflows can be spawned for hierarchical agent architectures where a planner agent delegates to specialist agents, each running its own durable workflow.
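The fan-out/merge pattern, sketched with plain threads and stubbed model calls (not Harmony's actual orchestrator): each branch retries independently on failure, and the results are merged at the end.

```rust
use std::thread;

/// Fan out one prompt to several "models" in parallel, retrying each
/// failed branch once on its own, then merge the outputs in order.
/// The models are stubbed as plain function pointers.
fn fan_out(models: Vec<fn(&str) -> Result<String, String>>, prompt: &str) -> Vec<String> {
    let handles: Vec<_> = models.into_iter()
        .map(|model| {
            let p = prompt.to_string();
            thread::spawn(move || {
                // Per-branch retry: only this branch re-runs on failure.
                model(&p)
                    .or_else(|_| model(&p))
                    .unwrap_or_else(|e| format!("failed: {e}"))
            })
        })
        .collect();
    // Merge step: collect every branch's result.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```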

Prompt versioning and A/B testing

Every LLM call in Harmony records which prompt version produced the output. When you update a prompt, Harmony can route a percentage of traffic to the new version while keeping the old one as control. All results are tagged with their prompt version in the journal, so you can compare outputs, costs, and latencies across versions. This turns prompt engineering from guesswork into measurable experimentation.
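Percentage routing is typically a deterministic hash over a stable key. A sketch under that assumption (the hash and version labels are illustrative): hashing the workflow id means a given run always lands in the same bucket, so its results are consistently tagged with one prompt version.

```rust
/// Route a workflow to the new prompt version for `new_version_pct`
/// percent of traffic, deterministically per workflow id.
fn route_version(workflow_id: &str, new_version_pct: u32) -> &'static str {
    // Cheap deterministic hash into a 0..100 bucket.
    let bucket = workflow_id
        .bytes()
        .fold(0u32, |acc, b| acc.wrapping_mul(31).wrapping_add(b as u32))
        % 100;
    if bucket < new_version_pct { "v2" } else { "v1" }
}
```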
