Commit ebf2e86
Cosmos: Adds Per Partition Automatic Failover and Circuit Breaker Specs (#3880)
## Description
Introduces the **Per-Partition Automatic Failover (PPAF) & Per-Partition
Circuit Breaker (PPCB)** design spec for `azure_data_cosmos_driver`.
This spec describes partition-level failover mechanisms that complement
the existing account-level failover in the driver's 7-stage operation
pipeline. Instead of marking an entire region unavailable when a single
partition becomes unhealthy, only the affected partition is routed to an
alternate region — preserving local latency for healthy partitions.
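
As a rough illustration of the idea above, the lookup can be pictured as an account-level preferred region plus a small map of per-partition overrides. This is a minimal sketch, not the spec's implementation: `Routing`, `resolve_region`, and the region strings are hypothetical names chosen for the example, and the real `resolve_endpoint()` logic in Stage 2 may differ.

```rust
use std::collections::HashMap;

/// Hypothetical region identifier used only in this sketch.
type Region = &'static str;

/// Illustrative routing table: one account-level preferred region plus
/// overrides keyed by partition key range ID.
struct Routing {
    preferred_region: Region,
    partition_overrides: HashMap<String, Region>,
}

impl Routing {
    /// Returns the override for an unhealthy partition if one exists;
    /// every other partition keeps the account-level preferred region.
    fn resolve_region(&self, pk_range_id: &str) -> Region {
        self.partition_overrides
            .get(pk_range_id)
            .copied()
            .unwrap_or(self.preferred_region)
    }
}
```

With an override registered only for partition key range `"7"`, requests for that range go to the alternate region while all other ranges stay on the preferred one.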
### What's in the spec
- **Two complementary mechanisms**:
  - **PPAF**: per-partition failover for writes on single-master accounts, triggered by 403/3, 503, 429/3092, 410
  - **PPCB**: per-partition circuit breaker for reads (any account) and writes on multi-master accounts, threshold-gated
- **Component design**: `PartitionEndpointState`,
`PartitionFailoverEntry`, `PartitionFailoverConfig`, all managed via
the driver's existing lock-free CAS pattern (unlike the SDK, which uses
`RwLock<HashMap>`)
- **Operation pipeline integration**: How partition-level overrides plug
into `resolve_endpoint()` (Stage 2), `evaluate_transport_result()`
(Stage 5), and `LocationStateStore::apply()` (Stage 6)
- **Background failback loop**: Periodic sweep that expires stale
partition overrides, spawned via `BackgroundTaskManager` (#3945)
- **Status code handling matrix**: Complete mapping of HTTP
status/sub-status codes to emitted `LocationEffect`s
- **Configuration surface**: All thresholds and intervals configurable
via environment variables
- **Test coverage plan**: Pure routing system tests, eligibility tests,
circuit breaker counter tests, integration tests, and end-to-end
operation loop tests
- **Prerequisites**: Missing pieces that must be implemented (partition
key range ID availability, `ResourceType.is_partitioned()`, env var
reading, `sync_account_properties` integration)
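
The split between the two mechanisms listed above can be sketched as a pair of small pure functions. This is an illustrative reading of the bullet list, not the spec's code: `Mechanism`, `OperationKind`, `select_mechanism`, and `is_ppaf_trigger` are hypothetical names, and treating 503 and 410 as triggers regardless of sub-status is an assumption (the description lists them without one). Other eligibility checks, such as whether the feature is enabled, are omitted.

```rust
/// Hypothetical enum naming the two mechanisms from the description.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Mechanism {
    Ppaf,
    Ppcb,
}

/// Hypothetical operation classification used only in this sketch.
#[derive(Clone, Copy)]
enum OperationKind {
    Read,
    Write,
}

/// Writes on single-master accounts go to PPAF; reads on any account and
/// writes on multi-master accounts go to PPCB.
fn select_mechanism(op: OperationKind, multi_master: bool) -> Mechanism {
    match (op, multi_master) {
        (OperationKind::Write, false) => Mechanism::Ppaf,
        _ => Mechanism::Ppcb,
    }
}

/// Status/sub-status pairs listed as PPAF triggers: 403/3, 503, 429/3092, 410.
/// Assumption: 503 and 410 match with any sub-status.
fn is_ppaf_trigger(status: u16, sub_status: Option<u32>) -> bool {
    matches!(
        (status, sub_status),
        (403, Some(3)) | (429, Some(3092)) | (503, _) | (410, _)
    )
}
```

Keeping these decisions as pure functions matches the "pure routing system tests" and "eligibility tests" named in the coverage plan, since they can be exercised without any transport.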
### Key design decisions
| Decision | Rationale |
|----------|-----------|
| Immutable CAS snapshots (not `RwLock<HashMap>`) | Follows the driver's existing lock-free pattern; eliminates reader/writer contention on the hot path |
| Two separate maps (PPAF vs PPCB) | Avoids cross-contamination between single-master write failover and multi-master circuit breaker routing strategies |
| Plain counters (not `AtomicI32`) | The entire `PartitionEndpointState` is swapped atomically via CAS, so interior atomic counters are unnecessary |
| Failback sweeps both maps | Improvement over the SDK, which only sweeps PPCB; trivial with the immutable-snapshot pattern |
| `BackgroundTaskManager` for failback loop | Provides abort-on-drop, panic safety, and graceful shutdown (#3945) |
| Acceptable CAS counter loss under contention | Delays the threshold trigger by at most one failure; a better trade-off than introducing locks |
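
The snapshot-related decisions in the table can be illustrated with pure transition functions that take the current snapshot and return a new one. This is a sketch under stated assumptions: the field names, the `Snapshot` alias, `record_failure`, and `sweep_expired` are hypothetical, and the compare-and-swap that would publish the returned snapshot is deliberately left out. A caller would attempt to swap the new `Arc` in and, if another writer won the race, either retry or drop the update, which is where the "counter loss" row comes from.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::time::{Duration, Instant};

/// Illustrative per-partition state; the field names are assumptions.
#[derive(Clone, Debug, PartialEq)]
struct PartitionEndpointState {
    consecutive_failures: u32,      // plain counter, no atomics
    overridden_at: Option<Instant>, // set once the threshold is reached
}

/// An immutable snapshot keyed by partition key range ID.
type Snapshot = Arc<HashMap<String, PartitionEndpointState>>;

/// Builds a new snapshot with one more failure recorded for `pk_range_id`.
/// The input snapshot is never mutated.
fn record_failure(current: &Snapshot, pk_range_id: &str, threshold: u32, now: Instant) -> Snapshot {
    let mut next = (**current).clone();
    let entry = next
        .entry(pk_range_id.to_string())
        .or_insert(PartitionEndpointState { consecutive_failures: 0, overridden_at: None });
    entry.consecutive_failures += 1;
    if entry.consecutive_failures >= threshold && entry.overridden_at.is_none() {
        entry.overridden_at = Some(now);
    }
    Arc::new(next)
}

/// Builds a new snapshot without overrides older than `max_age`.
/// The same function could be applied to both the PPAF and PPCB maps.
fn sweep_expired(current: &Snapshot, now: Instant, max_age: Duration) -> Snapshot {
    let next: HashMap<String, PartitionEndpointState> = current
        .iter()
        .filter(|(_, state)| match state.overridden_at {
            Some(since) => now.duration_since(since) < max_age,
            None => true,
        })
        .map(|(id, state)| (id.clone(), state.clone()))
        .collect();
    Arc::new(next)
}
```

Because each function only reads the old snapshot, readers on the hot path can keep using the `Arc` they already hold while a writer prepares the replacement.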
### Dependencies
- #3945 — `BackgroundTaskManager` (must merge first; spec references it
for failback loop spawning)
### Files
| File | Action |
|------|--------|
| `azure_data_cosmos_driver/docs/PARTITION_LEVEL_FAILOVER_SPEC.md` | **New** |

1 parent 962558f commit ebf2e86
File tree: 1 file changed (+1532, -0 lines) in `sdk/cosmos/azure_data_cosmos_driver/docs`