Fix preferred_locations stale ordering after hub region change #3874
Closed
tvaron3 wants to merge 6 commits into Azure:release/azure_data_cosmos-previews from
Conversation
…datta/fix_retry_policy
LocationCache::update() was cloning the accumulated preferred_locations and extending it, which permanently baked in the region order from the first account-property fetch. Subsequent gateway refreshes with new region orders (e.g. after a hub region change) were silently ignored.

Add an immutable initial_preferred_locations field that preserves the user-specified preferred regions from construction. On each update() call, rebuild the effective preferred list from this immutable copy so that:
- Customer-specified regions are always honored as the priority prefix
- Gateway-returned regions update their order on every refresh
- No stale ordering is carried forward from prior calls

Add two new tests verifying the fix for both the no-user-preference and with-user-preference scenarios.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…policy' into tvaron3/fix-preferred-locations-mutation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
Pull request overview
Fixes stale preferred_locations ordering in the Cosmos routing LocationCache after hub region changes, ensuring the SDK re-respects the gateway-provided region order on each account refresh (while still honoring any user-provided preferred-region prefix).
Changes:
- Rebuild `locations_info.preferred_locations` on each `LocationCache::update()` from an immutable `initial_preferred_locations` baseline to avoid permanently baking in the first gateway ordering.
- Add unit tests covering both “no user preference” and “user preference prefix preserved” scenarios across consecutive updates with reordered gateway regions.
- Update retry handling for `403/3` (WriteForbidden) to gate PK-range endpoint marking behind PPAF/PPCB eligibility checks; add a changelog entry for the location ordering fix.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| sdk/cosmos/azure_data_cosmos/src/routing/location_cache.rs | Adds initial_preferred_locations, changes update() to rebuild effective ordering each refresh, and adds regression tests. |
| sdk/cosmos/azure_data_cosmos/src/retry_policies/client_retry_policy.rs | Adjusts 403/3 retry path to use per-partition failover / circuit breaker eligibility checks before marking endpoints unavailable. |
| sdk/cosmos/azure_data_cosmos/CHANGELOG.md | Adds an unreleased “Bugs Fixed” entry documenting the hub-region ordering fix. |
Comments suppressed due to low confidence (1)
sdk/cosmos/azure_data_cosmos/src/routing/location_cache.rs:216
`existing` is built once from `effective_preferred_locations` and never updated when you `push` new regions. That means duplicates in `read_locations` (or duplicates already present in `initial_preferred_locations`) can still produce duplicate entries in `preferred_locations`, despite the dedup intent. Consider making the set mutable and inserting each region as it’s appended (and optionally normalizing/deduping the initial list as well) so the preferred list is guaranteed unique.
```rust
// Use HashSet for O(1) lookups instead of O(n) linear search
let existing: HashSet<RegionName> = effective_preferred_locations.iter().cloned().collect();
// Extend with read locations not already in preferred locations - O(n)
for location in &read_locations {
    if !existing.contains(&location.name) {
        effective_preferred_locations.push(location.name.clone());
    }
}
```
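The suggestion in this suppressed comment could look roughly like the sketch below. It is not the SDK's code: `extend_unique` is a hypothetical helper, and plain `String` stands in for `RegionName`. The set is mutated as regions are appended, so duplicates in either input list collapse to their first occurrence.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not part of the SDK): merges the preferred list with
/// the gateway read regions, keeping only the first occurrence of each region.
fn extend_unique(preferred: &[String], read_regions: &[String]) -> Vec<String> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut out: Vec<String> = Vec::new();
    for region in preferred.iter().chain(read_regions.iter()) {
        // `HashSet::insert` returns false when the value was already present,
        // so each region is pushed at most once.
        if seen.insert(region.clone()) {
            out.push(region.clone());
        }
    }
    out
}
```

Because the initial list is routed through the same set, duplicates already present in the user-supplied list are normalized as well, which covers the optional part of the suggestion.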
0481ae9 to dca37df
simorenoh approved these changes Mar 13, 2026
Member
Author
Application region is now mandatory so this change is not relevant currently.
[Internal]
Depends on PR #3861, which should be reviewed and merged first.
Problem
`LocationCache::update()` permanently bakes in the region order from the first account-property fetch. On each call it clones the accumulated `preferred_locations`, extends it with any new read regions via a `HashSet` dedup check, and stores the result back. Because existing regions are already in the set, their order never changes, even when the gateway returns a new region order after a hub region change.

Impact: After a hub region change, the SDK ignores the updated region ordering from the gateway. Traffic continues routing based on the stale order from the initial account fetch rather than reflecting the new hub.
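The accumulate-and-extend behavior can be reproduced in isolation with a small sketch. `StaleCache` is a hypothetical stand-in, not the SDK type, and regions are plain strings rather than `RegionName`.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the pre-fix behavior: the preferred list is
/// carried over between updates and only ever extended.
struct StaleCache {
    preferred: Vec<String>,
}

impl StaleCache {
    fn update(&mut self, read_regions: &[&str]) {
        // Start from the accumulated state of the previous call.
        let mut effective = self.preferred.clone();
        let existing: HashSet<String> = effective.iter().cloned().collect();
        for region in read_regions {
            // Known regions are skipped, so their relative order is frozen.
            if !existing.contains(*region) {
                effective.push((*region).to_string());
            }
        }
        self.preferred = effective;
    }
}
```

Calling `update(&["East US", "West US"])` and then `update(&["West US", "East US"])` on an empty cache leaves `preferred` as East US followed by West US: the second refresh adds nothing new, so the reordering is dropped.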
Fix
Add an immutable `initial_preferred_locations` field to `LocationCache` that preserves the user-specified preferred regions from construction. On each `update()` call, the effective preferred list is rebuilt from this immutable copy rather than from the accumulated state.

Changes
- `location_cache.rs`: Added `initial_preferred_locations` field to `LocationCache`, stored in `new()`, used in `update()` to rebuild effective preferred list from the original user intent on every account refresh.
- `preferred_locations_not_permanently_mutated_no_user_preference`: verifies gateway order updates take effect when no user preference is set.
- `preferred_locations_not_permanently_mutated_with_user_preference`: verifies user prefix is preserved while gateway tail updates.
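A minimal sketch of the rebuild-from-baseline approach follows. It makes some assumptions: `LocationCacheSketch` and its field names are illustrative rather than the SDK's definitions, regions are plain strings, and the mutable-set dedup is an addition for tidiness, not something this PR is confirmed to contain.

```rust
use std::collections::HashSet;

/// Illustrative model only; the real `LocationCache` has more state.
struct LocationCacheSketch {
    /// User-specified regions captured at construction; never mutated.
    initial_preferred: Vec<String>,
    /// Effective list, rebuilt on every update.
    preferred: Vec<String>,
}

impl LocationCacheSketch {
    fn new(user_preferred: &[&str]) -> Self {
        let initial: Vec<String> = user_preferred.iter().map(|r| r.to_string()).collect();
        Self {
            preferred: initial.clone(),
            initial_preferred: initial,
        }
    }

    fn update(&mut self, read_regions: &[&str]) {
        // Rebuild from the immutable baseline, not from the previous result.
        let mut effective = self.initial_preferred.clone();
        let mut seen: HashSet<String> = effective.iter().cloned().collect();
        for region in read_regions {
            // Gateway regions follow the user prefix in the gateway's current order.
            if seen.insert((*region).to_string()) {
                effective.push((*region).to_string());
            }
        }
        self.preferred = effective;
    }
}
```

With no user preference, two consecutive updates with swapped gateway order yield the second order. With a user preference of Central US, that region stays first while the remaining regions follow whatever order the latest refresh reported, which mirrors the two scenarios the new tests describe.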