Reverb Harmony is an embedded durable execution runtime, often described as 'Temporal for agents'. It guarantees exactly-once execution of agent workflows with checkpoint-based recovery from crashes and restarts. Define workflows in YAML or code, and Harmony ensures they complete reliably.
AI agent workflows are long-running and expensive, and they involve irreversible side effects. When an LLM call takes 30 seconds and costs $0.10, you don't want to redo it because of a crash. Harmony answers four questions: What is running? Where is it? What happens if it crashes? Can it resume exactly where it left off?
Harmony knows which LLM provider you're calling and tailors its retry strategy to each one. Anthropic sends Retry-After headers, and Harmony respects them exactly. OpenAI exposes tiered rate limits through x-ratelimit headers, so Harmony reads the remaining quota and waits precisely until the reset. Google's Gemini doesn't send retry headers at all, so Harmony uses aggressive exponential backoff with a higher multiplier. Groq's limits reset fast, so Harmony uses short 0.5s backoffs. Each provider also gets smart fallback: if Claude Opus is overloaded, Harmony automatically falls back to Sonnet; if GPT-4o hits its limit, it drops to gpt-4o-mini. This saves money: no blind retries, and a cheaper viable model whenever the primary is unavailable.
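To make the per-provider behavior concrete, here is a minimal sketch of the kind of policy table this implies. The `RetryPolicy` class, its field names, and the model IDs are illustrative assumptions, not Harmony's actual configuration schema:

```python
# Illustrative sketch only -- RetryPolicy, its fields, and the model IDs
# are assumptions for exposition, not Harmony's real configuration schema.
from dataclasses import dataclass

@dataclass
class RetryPolicy:
    respect_retry_after: bool = False      # honor Retry-After headers (Anthropic)
    read_ratelimit_headers: bool = False   # read x-ratelimit-* quota (OpenAI)
    base_delay_s: float = 1.0              # starting backoff delay
    multiplier: float = 2.0                # exponential backoff multiplier

PROVIDER_POLICIES = {
    "anthropic": RetryPolicy(respect_retry_after=True),
    "openai": RetryPolicy(read_ratelimit_headers=True),
    "gemini": RetryPolicy(base_delay_s=2.0, multiplier=3.0),  # no retry headers: back off hard
    "groq": RetryPolicy(base_delay_s=0.5),                    # fast resets: short backoff
}

# Fallback chains: a cheaper sibling model when the primary is unavailable.
FALLBACKS = {
    "claude-opus": "claude-sonnet",
    "gpt-4o": "gpt-4o-mini",
}
```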
Unlike Temporal or Restate, Harmony is an embedded library, not an external orchestrator. No separate server, no registration step, no dual ports. Add it as a dependency, annotate your workflows, and run your app. One binary, one port, zero infrastructure.
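A usage sketch of what embedding might look like in Python. The module name, `App` class, decorator, and helper functions are hypothetical stand-ins, not the published SDK surface:

```python
# Hypothetical sketch: `reverb_harmony`, App, and @app.workflow are invented
# stand-ins to illustrate the embedded model, not the real SDK.
import reverb_harmony as harmony  # just a dependency -- no orchestrator to deploy

app = harmony.App(journal="sqlite:///harmony.db")  # journal lives in your process

@app.workflow  # annotate the workflow; no registration step, no second port
async def triage(ticket_id: str) -> str:
    # fetch_ticket and classify are your own functions; each step is checkpointed,
    # and replay skips steps that already completed.
    ticket = await app.step("fetch", fetch_ticket, ticket_id)
    return await app.step("classify", classify, ticket)

# app.serve()  # runs inside your binary, on your port
```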
Harmony has dedicated step types for Echo operations (EchoReadStep, EchoWriteStep) with transactional atomicity. Memory writes through Harmony are exactly-once: the journal records intent, Echo executes with an idempotency key, and replay skips completed writes.
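The mechanics can be sketched in a few lines of self-contained Python. The journal shape and the `FakeEcho` client below are simplified stand-ins, not Harmony's real internals:

```python
# Self-contained sketch of the exactly-once write path: the journal records
# intent, the write carries an idempotency key, and replay skips completed
# steps. FakeEcho and the journal dict are stand-ins, not Harmony internals.
import uuid

class FakeEcho:
    """Stand-in memory store that dedupes writes on an idempotency key."""
    def __init__(self):
        self.data: dict[str, str] = {}
        self.seen: set[str] = set()

    def write(self, key: str, value: str, idempotency_key: str) -> None:
        if idempotency_key in self.seen:
            return  # duplicate delivery: ignore
        self.data[key] = value
        self.seen.add(idempotency_key)

journal: dict[str, dict] = {}  # step_id -> {"idem": str, "done": bool}

def echo_write_step(echo: FakeEcho, step_id: str, key: str, value: str) -> None:
    # 1. Journal records intent, pinning a stable idempotency key for this step.
    entry = journal.setdefault(step_id, {"idem": str(uuid.uuid4()), "done": False})
    if entry["done"]:
        return                               # 3. replay skips completed writes
    echo.write(key, value, entry["idem"])    # 2. execute with the idempotency key
    entry["done"] = True                     # checkpoint: the write is committed

echo = FakeEcho()
echo_write_step(echo, "step-1", "user.plan", "pro")
echo_write_step(echo, "step-1", "user.plan", "pro")  # crash-and-replay: no double write
```

Even if the process dies between the write and the checkpoint, replay re-sends the same idempotency key and the store ignores the duplicate.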
Harmony tracks token usage and API costs per workflow in real time. You can set per-workflow token budgets with hard limits (abort), soft limits (warn and continue), or adaptive limits (reduce context window). Every LLM call records input tokens, output tokens, estimated cost, and which prompt version produced the output. This gives you full visibility into where your money is going and lets you set guardrails before a runaway agent burns through your API credits.
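A self-contained sketch of the three limit modes. The mode names and the halving heuristic are illustrative assumptions, not Harmony's actual budget API:

```python
# Illustrative sketch of hard / soft / adaptive budgets; the names and the
# context-halving heuristic are assumptions, not Harmony's real API.
from enum import Enum

class BudgetMode(Enum):
    HARD = "hard"          # abort the workflow
    SOFT = "soft"          # warn and continue
    ADAPTIVE = "adaptive"  # reduce the context window

def check_budget(mode: BudgetMode, used_tokens: int, limit: int,
                 context_window: int) -> int:
    """Return the context window for the next call, or raise on a hard limit."""
    if used_tokens < limit:
        return context_window
    if mode is BudgetMode.HARD:
        raise RuntimeError(f"token budget exceeded: {used_tokens}/{limit}")
    if mode is BudgetMode.SOFT:
        print(f"warning: over budget ({used_tokens}/{limit}), continuing")
        return context_window
    # ADAPTIVE: halve the context window to slow spend on subsequent calls
    return max(context_window // 2, 1024)
```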
Simple flows are defined in YAML, making them readable, versionable, and easy to share. Complex logic lives in code, using the Rust, Python, or TypeScript SDKs (polyglot bindings generated with Mozilla's UniFFI). Both support the same step types: tool calls, Echo reads/writes, branching, parallel execution, and child workflow spawning. The same workflow definition works locally with SQLite and at scale with PostgreSQL. Change the config, not the code.
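For flavor, here is what a YAML-defined flow exercising those step types might look like, parsed from Python. The schema, step keys, and config lines are invented for illustration, not Harmony's documented format:

```python
# Invented YAML schema for illustration -- not Harmony's documented format.
import yaml  # pip install pyyaml

flow = yaml.safe_load("""
name: triage-ticket
steps:
  - tool: classify_ticket              # tool call
  - echo_read: customer.history        # Echo read
  - branch:
      if: "{{ classify_ticket.priority == 'high' }}"
      then: [{spawn: escalate}]        # child workflow
      else: [{tool: auto_reply}]
  - parallel:                          # parallel execution
      - tool: notify_owner
      - echo_write: ticket.summary     # Echo write
""")

# Same definition, different journal backend (assumed config key):
#   journal: sqlite:///harmony.db        # local development
#   journal: postgresql://db/harmony     # production scale
print(flow["name"], "->", len(flow["steps"]), "steps")
```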
Join the waitlist for early access to Reverb Cloud and updates on the platform.