Circuit Breaker
The circuit breaker pattern protects the Obsidian headless CLI from cascading failures when the Obsidian API or sync servers are overloaded or unreachable. It uses sony/gobreaker/v2 to detect unhealthy backends and fail fast, while allowing automatic recovery.
Overview
When a remote service degrades, continuing to hammer it with requests wastes resources and delays user feedback. A circuit breaker sits between the caller and the service, tracking recent failures. Once failures exceed a threshold, the breaker "opens" and immediately rejects new requests without touching the network. After a cooldown period, it enters a "half-open" probing state to test if the service has recovered.
Design Philosophy
The breaker exists to protect both the client and the server. Failing fast gives users immediate feedback instead of hanging on every request, and reducing request volume gives an overloaded server time to recover.
This project deploys circuit breakers at two levels:
- HTTP API breaker — a single shared breaker protecting all REST API calls (authentication, vault management, publish).
- WebSocket sync breaker — one breaker per vault protecting WebSocket sync connections.
Architecture
The circuit breaker is layered inside the existing retry mechanism, following the "Polly pattern": retry wraps breaker wraps transport.
Each retry attempt passes through the breaker. When the breaker is open, gobreaker.ErrOpenState is returned immediately and treated as a permanent error by the retrier, so no more retries are attempted until the breaker transitions to half-open.
Retry + Breaker Flow
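A minimal sketch of that flow, assuming cenkalti/backoff/v4 and illustrative names (callWithRetry, doRequest are not from the codebase):

```go
package transport

import (
	"context"
	"errors"

	"github.com/cenkalti/backoff/v4"
	"github.com/sony/gobreaker/v2"
)

// callWithRetry shows the layering: retry (outer) -> breaker (inner) -> transport.
func callWithRetry(ctx context.Context, cb *gobreaker.CircuitBreaker[[]byte], doRequest func() ([]byte, error)) ([]byte, error) {
	var body []byte
	op := func() error {
		b, err := cb.Execute(doRequest) // the breaker decides whether the network is touched
		if errors.Is(err, gobreaker.ErrOpenState) || errors.Is(err, gobreaker.ErrTooManyRequests) {
			return backoff.Permanent(err) // open circuit: stop retrying immediately
		}
		body = b
		return err
	}
	err := backoff.Retry(op, backoff.WithContext(backoff.NewExponentialBackOff(), ctx))
	return body, err
}
```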
HTTP Breaker (Shared)
All REST API endpoints hit the same backend (api.obsidian.md), so a single breaker instance on api.Client protects auth, vault, and publish calls alike. If the API backend is overloaded, there is no point distinguishing between endpoint failures — they all indicate the same underlying problem.
Shared vs Per-Vault
A single HTTP breaker works because all endpoints share the same backend. If one endpoint fails repeatedly, the backend is likely struggling everywhere. Opening the breaker for all calls reduces pressure faster.
WebSocket Breaker (Per-Vault)
Each vault may connect to a different sync host (e.g., sync-1.obsidian.md vs sync-2.obsidian.md). A per-vault breaker on sync.Engine isolates failures: one overloaded sync host does not block connections to others.
Configuration
HTTP API Breaker
| Setting | Value | Rationale |
|---|---|---|
| Name | "obsidian-api" | Identifies the breaker in logs |
| MaxRequests | 3 | Number of probes allowed in half-open state |
| Interval | 30s | Rolling window for failure counting |
| Timeout | 30s | Duration the breaker stays open before probing |
| ReadyToTrip | 5 consecutive failures | Opens after 5 failures in a row |
| IsExcluded | context.Canceled, context.DeadlineExceeded | Client-side cancellation is not a service health indicator |
| IsSuccessful | nil error = success; any error = failure | Includes "overloaded" responses as failures |
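A minimal sketch of these settings with gobreaker/v2. The sketch folds the table's IsExcluded list into the IsSuccessful hook (excluded errors are simply not counted as failures); the project's factory may expose this differently, and the byte-slice payload type is illustrative.

```go
package circuitbreaker

import (
	"context"
	"errors"
	"time"

	"github.com/sony/gobreaker/v2"
)

// NewHTTPBreaker sketches the "obsidian-api" configuration from the table.
func NewHTTPBreaker() *gobreaker.CircuitBreaker[[]byte] {
	return gobreaker.NewCircuitBreaker[[]byte](gobreaker.Settings{
		Name:        "obsidian-api",
		MaxRequests: 3,                // probes allowed while half-open
		Interval:    30 * time.Second, // rolling window for failure counting
		Timeout:     30 * time.Second, // how long the breaker stays open
		ReadyToTrip: func(c gobreaker.Counts) bool {
			return c.ConsecutiveFailures >= 5
		},
		IsSuccessful: func(err error) bool {
			// Client-side cancellation says nothing about server health,
			// so it is not counted as a failure.
			if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
				return true
			}
			return err == nil
		},
	})
}
```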
WebSocket Breaker (Per-Vault)
| Setting | Value | Rationale |
|---|---|---|
| Name | "obsidian-sync-ws-{vaultID}" | Identifies the vault in logs |
| MaxRequests | 1 | Binary state — one probe in half-open is enough |
| Interval | 0 | No rolling window needed; binary open/closed |
| Timeout | 60s | Longer cooldown for WS reconnection cycles |
| ReadyToTrip | 3 consecutive failures | Opens after 3 consecutive connect failures |
| IsExcluded | None | All WS failures count toward the threshold |
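Continuing the sketch above, a per-vault factory for the sync breaker; the empty-struct payload type is illustrative, since only the error matters for connection attempts.

```go
// NewSyncBreaker sketches the per-vault WebSocket configuration.
func NewSyncBreaker(vaultID string) *gobreaker.CircuitBreaker[struct{}] {
	return gobreaker.NewCircuitBreaker[struct{}](gobreaker.Settings{
		Name:        "obsidian-sync-ws-" + vaultID,
		MaxRequests: 1,                // a single probe in half-open
		Interval:    0,                // counts are not reset while closed
		Timeout:     60 * time.Second, // longer cooldown for WS reconnect cycles
		ReadyToTrip: func(c gobreaker.Counts) bool {
			return c.ConsecutiveFailures >= 3
		},
	})
}
```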
State Machine
The circuit breaker follows a three-state cycle:
Closed (Normal Operation)
- All requests pass through to the underlying service.
- Successes and failures are tracked within the rolling window (Interval).
- When ReadyToTrip consecutive failures are detected, the breaker transitions to Open.
Open (Failing Fast)
- All requests are rejected immediately with gobreaker.ErrOpenState.
- The breaker remains open for the Timeout duration (30s for HTTP, 60s for WS).
- No network calls are made — the client fails fast and conserves resources.
- After Timeout expires, the breaker transitions to Half-Open.
Half-Open (Probing)
- A limited number of requests (MaxRequests) are allowed through as probes.
- If a probe succeeds, the breaker transitions back to Closed (service recovered).
- If a probe fails, the breaker returns to Open (service still degraded).
- This prevents a recovered-but-fragile service from being immediately overwhelmed.
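A tiny demonstration of the cycle, continuing the sketch above (the sync breaker trips after 3 consecutive failures):

```go
// demoStateCycle drives a breaker from closed to open and shows fail-fast behavior.
func demoStateCycle() {
	cb := NewSyncBreaker("demo-vault")
	failing := func() (struct{}, error) { return struct{}{}, errors.New("dial failed") }

	for i := 0; i < 3; i++ {
		cb.Execute(failing) // consecutive failures accumulate while closed
	}
	fmt.Println(cb.State()) // "open": the ReadyToTrip threshold was reached

	_, err := cb.Execute(failing)                        // no network attempt is made now
	fmt.Println(errors.Is(err, gobreaker.ErrOpenState)) // true
	// After Timeout (60s here) the breaker moves to half-open on its own.
}
```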
State Change Logging
All state transitions are logged via zerolog at Warn level:
```
circuit breaker obsidian-api state changed from closed to open
circuit breaker obsidian-api state changed from open to half-open
circuit breaker obsidian-api state changed from half-open to closed
```
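These lines are presumably produced by gobreaker's OnStateChange hook; a minimal sketch, assuming zerolog's global logger (github.com/rs/zerolog/log):

```go
// logStateChange is wired in via gobreaker.Settings{OnStateChange: logStateChange}.
func logStateChange(name string, from, to gobreaker.State) {
	log.Warn().Msgf("circuit breaker %s state changed from %s to %s", name, from, to)
}
```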
Retry Integration
The retry and circuit breaker work together via the Polly pattern:
- Retry (outer layer) manages cenkalti/backoff exponential backoff with jitter.
- Circuit Breaker (inner layer) decides whether a request should even attempt the network.
- Transport (HTTP/WS) performs the actual I/O.
Critical Layering Order
The retry layer must wrap the breaker, never the other way around. If the breaker wrapped the retry, it would count an entire retry storm as a single attempt, defeating the purpose of rapid failure detection.
When the breaker is open, it returns gobreaker.ErrOpenState. The retry layer detects this error and wraps it as backoff.Permanent, which stops all retries immediately. This is correct because:
- Retrying while the breaker is open would only produce the same error.
- The breaker will transition to half-open on its own timer.
- Stopping retries gives the user immediate feedback instead of an artificial delay.
Consecutive Failure Counting
Each retry attempt that receives an overloaded 200 response counts as one failure. After 5 consecutive overloaded responses (each a separate retry attempt within the rolling window), the breaker opens; the next attempt hits the open breaker and retries stop immediately.
Package Wiring
src/internal/circuitbreaker/
New package containing:
- Factory functions for creating HTTP and WebSocket breakers with the correct configuration.
- BreakerError type — wraps gobreaker.ErrOpenState with a user-friendly message.
- IsBreakerError(err error) bool — helper to detect when an error originated from an open circuit.
src/internal/api/
The HTTP breaker is attached to the api.Client struct. It wraps:
- postJSON() — all POST requests (auth, vault, publish metadata).
- uploadPublishedFile() — file uploads to publish hosts.
Every outgoing HTTP call passes through the shared breaker before reaching the network.
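A sketch of that wrapping; the postJSON signature and field names here are illustrative, not the real method's shape:

```go
package api

import (
	"bytes"
	"context"
	"io"
	"net/http"

	"github.com/sony/gobreaker/v2"
)

// Client is sketched with only the fields relevant here.
type Client struct {
	httpClient *http.Client
	breaker    *gobreaker.CircuitBreaker[[]byte]
}

// postJSON routes every POST through the shared breaker before touching the network.
func (c *Client) postJSON(ctx context.Context, url string, payload []byte) ([]byte, error) {
	return c.breaker.Execute(func() ([]byte, error) {
		req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(payload))
		if err != nil {
			return nil, err
		}
		req.Header.Set("Content-Type", "application/json")
		resp, err := c.httpClient.Do(req)
		if err != nil {
			return nil, err
		}
		defer resp.Body.Close()
		return io.ReadAll(resp.Body)
	})
}
```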
src/internal/sync/
The WebSocket breaker is attached to the per-vault sync.Engine struct. It wraps:
- ensureConnected() — the high-level connection gate.
- connect() — the init handshake sequence.
- dialWorker() — the low-level WebSocket dial operation.
A per-vault breaker name is constructed at engine creation time: "obsidian-sync-ws-{vaultID}".
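A sketch of the connection gate, with illustrative method shapes and a stubbed handshake:

```go
package sync

import (
	"context"

	"github.com/sony/gobreaker/v2"
)

// Engine is sketched with only the fields relevant here.
type Engine struct {
	vaultID string
	breaker *gobreaker.CircuitBreaker[struct{}]
}

// ensureConnected funnels every connection attempt through the vault's breaker.
func (e *Engine) ensureConnected(ctx context.Context) error {
	_, err := e.breaker.Execute(func() (struct{}, error) {
		return struct{}{}, e.connect(ctx) // handshake + dial happen behind the breaker
	})
	return err
}

// connect stands in for the init handshake and dialWorker sequence.
func (e *Engine) connect(ctx context.Context) error { return nil }
```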
src/internal/cli/
The App struct caches the api.Client (with its breaker) across commands. When a breaker error surfaces to the CLI layer, it is translated into a user-friendly message rather than a raw Go error.
Error Handling
BreakerError Type
The circuitbreaker package defines a BreakerError type that wraps the underlying gobreaker.ErrOpenState with context:
```go
type BreakerError struct {
	Message string
	Err     error // gobreaker.ErrOpenState
}

func (e *BreakerError) Error() string { return e.Message }

// Unwrap exposes the underlying sentinel so errors.Is can see through the wrapper.
func (e *BreakerError) Unwrap() error { return e.Err }
```
IsBreakerError Helper
```go
func IsBreakerError(err error) bool
```
Returns true if the error is or wraps a BreakerError, or if it is an unwrapped gobreaker.ErrOpenState or gobreaker.ErrTooManyRequests sentinel error. Used by the CLI and retry layers to detect open-circuit conditions.
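A plausible implementation sketch, consistent with the behavior described and assuming the stdlib errors package:

```go
func IsBreakerError(err error) bool {
	var be *BreakerError
	return errors.As(err, &be) ||
		errors.Is(err, gobreaker.ErrOpenState) ||
		errors.Is(err, gobreaker.ErrTooManyRequests)
}
```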
User-Facing Messages
The CLI translates breaker errors into actionable messages:
| Breaker | Message |
|---|---|
| HTTP API | Obsidian API is temporarily unavailable (circuit open); retry in ~30s |
| WS Sync | Vault {id} sync is temporarily unavailable (circuit open); retry in ~60s |
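A sketch of the translation at the CLI boundary; the helper name is illustrative, the message follows the table above:

```go
// userMessage converts open-circuit failures into actionable text.
func userMessage(err error) string {
	if circuitbreaker.IsBreakerError(err) {
		return "Obsidian API is temporarily unavailable (circuit open); retry in ~30s"
	}
	return err.Error()
}
```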
User Experience
These messages tell the user immediately that the problem is server-side (not their connection) and provide an expected recovery timeframe. This is much better than a generic "connection failed" error.
Overloaded Server Handling
The Obsidian API can return 200 OK with "overloaded" in the response body when the server is under strain. The circuit breaker treats this as a failure, not a success:
- The api.Client parses the response body after receiving a 200 status.
- If the body contains "overloaded", the caller returns a non-nil error.
- The IsSuccessful function on the breaker sees a non-nil error and counts it as a failure.
- After enough consecutive overloaded responses, the breaker opens and new requests fail fast.
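A sketch of the body check; errOverloaded and the substring match illustrate the mechanism, not the exact parsing logic:

```go
var errOverloaded = errors.New("obsidian api: server overloaded")

// checkOverloaded converts a 200-with-"overloaded" body into a non-nil error,
// which the breaker's IsSuccessful hook then counts as a failure.
func checkOverloaded(body []byte) error {
	if bytes.Contains(body, []byte("overloaded")) {
		return errOverloaded
	}
	return nil
}
```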
Overloaded Responses
A 200 OK with "overloaded" is still a failure from the breaker's perspective. A raw HTTP status check would count it as success, hiding the real server state and allowing the client to continue hammering a struggling backend.
This design is intentional: an overloaded server is already struggling, and reducing request volume gives it time to recover.