
Fix preferred_locations stale ordering after hub region change #3874

Closed
tvaron3 wants to merge 6 commits into Azure:release/azure_data_cosmos-previews from tvaron3:tvaron3/fix-preferred-locations-mutation

Conversation

@tvaron3 tvaron3 commented Mar 5, 2026

[Internal]

Depends on PR #3861, which should be reviewed and merged first.

Problem

LocationCache::update() permanently bakes in the region order from the first account-property fetch. On each call it clones the accumulated preferred_locations, extends it with any new read regions via a HashSet dedup check, and stores the result back. Because existing regions are already in the set, their order never changes — even when the gateway returns a new region order after a hub region change.

Impact: After a hub region change, the SDK ignores the updated region ordering from the gateway. Traffic continues routing based on the stale order from the initial account fetch rather than reflecting the new hub.
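A minimal sketch of the accumulation described above, for illustration only. `AccumulatingCache` is a hypothetical stand-in rather than the SDK's `LocationCache`, and plain `String` values stand in for the real region type. It shows why a second refresh with a reordered gateway response leaves the stored order unchanged:

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the pre-fix cache state (not the SDK type).
struct AccumulatingCache {
    preferred_locations: Vec<String>,
}

impl AccumulatingCache {
    /// Mirrors the behavior described in the Problem section: clone the
    /// accumulated list, append only regions not seen before, store it back.
    fn update(&mut self, read_locations: &[&str]) {
        let mut effective = self.preferred_locations.clone();
        let existing: HashSet<String> = effective.iter().cloned().collect();
        for name in read_locations {
            if !existing.contains(*name) {
                effective.push(name.to_string());
            }
        }
        // Regions that are already present keep their old position, so a
        // reordered gateway response has no effect on the stored order.
        self.preferred_locations = effective;
    }
}
```

With this stand-in, calling `update` with `["West US", "East US"]` and then with `["East US", "West US"]` leaves the list as `West US, East US`.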

Fix

Add an immutable initial_preferred_locations field to LocationCache that preserves the user-specified preferred regions from construction. On each update() call, the effective preferred list is rebuilt from this immutable copy rather than from the accumulated state:

  • Customer provided preferred regions: Always honored as the priority prefix; gateway-returned regions that are not in the customer list are appended in the gateway's current order (which updates on each refresh).
  • No customer preference: The effective list is derived entirely from the gateway's response order, correctly reflecting hub changes.
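The two cases above can be sketched as follows. This is an illustration under assumptions, not the code in this PR: `LocationCacheSketch` is a hypothetical type, regions are plain `String` values, and only the `initial_preferred_locations` and `preferred_locations` field names come from the description.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified model of the cache after the fix.
struct LocationCacheSketch {
    /// User-specified regions captured at construction; never mutated.
    initial_preferred_locations: Vec<String>,
    /// Effective list, rebuilt on every update.
    preferred_locations: Vec<String>,
}

impl LocationCacheSketch {
    fn new(user_preferred: &[&str]) -> Self {
        let initial: Vec<String> = user_preferred.iter().map(|s| s.to_string()).collect();
        Self {
            preferred_locations: initial.clone(),
            initial_preferred_locations: initial,
        }
    }

    /// Rebuild from the immutable baseline, then append gateway regions
    /// that the user did not list, in the gateway's current order.
    fn update(&mut self, gateway_read_locations: &[&str]) {
        let mut effective = self.initial_preferred_locations.clone();
        let existing: HashSet<String> = effective.iter().cloned().collect();
        for name in gateway_read_locations {
            if !existing.contains(*name) {
                effective.push(name.to_string());
            }
        }
        self.preferred_locations = effective;
    }
}
```

Because each call starts from the baseline rather than from the previous result, nothing from an earlier gateway response survives into the next one. With no user preference the list simply tracks the gateway order; with a user preference such as `Central US`, that region stays first and the remaining regions follow the latest gateway order.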

Changes

  • location_cache.rs: Added initial_preferred_locations field to LocationCache, stored in new(), used in update() to rebuild effective preferred list from the original user intent on every account refresh.
  • Added two new tests:
    • preferred_locations_not_permanently_mutated_no_user_preference — verifies gateway order updates take effect when no user preference is set
    • preferred_locations_not_permanently_mutated_with_user_preference — verifies user prefix is preserved while gateway tail updates

kundadebdatta and others added 4 commits March 4, 2026 09:28
LocationCache::update() was cloning the accumulated preferred_locations
and extending it, which permanently baked in the region order from the
first account-property fetch. Subsequent gateway refreshes with new
region orders (e.g. after a hub region change) were silently ignored.

Add an immutable initial_preferred_locations field that preserves the
user-specified preferred regions from construction. On each update()
call, rebuild the effective preferred list from this immutable copy
so that:
- Customer-specified regions are always honored as the priority prefix
- Gateway-returned regions update their order on every refresh
- No stale ordering is carried forward from prior calls

Add two new tests verifying the fix for both the no-user-preference
and with-user-preference scenarios.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
tvaron3 and others added 2 commits March 5, 2026 16:05
…policy' into tvaron3/fix-preferred-locations-mutation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@tvaron3 tvaron3 marked this pull request as ready for review March 6, 2026 23:12
@tvaron3 tvaron3 requested a review from a team as a code owner March 6, 2026 23:12
Copilot AI review requested due to automatic review settings March 6, 2026 23:12

Copilot AI left a comment


Pull request overview

Fixes stale preferred_locations ordering in the Cosmos routing LocationCache after hub region changes, ensuring the SDK respects the gateway-provided region order on each account refresh (while still honoring any user-provided preferred-region prefix).

Changes:

  • Rebuild locations_info.preferred_locations on each LocationCache::update() from an immutable initial_preferred_locations baseline to avoid permanently baking in the first gateway ordering.
  • Add unit tests covering both “no user preference” and “user preference prefix preserved” scenarios across consecutive updates with reordered gateway regions.
  • Update retry handling for 403/3 (WriteForbidden) to gate PK-range endpoint marking behind PPAF/PPCB eligibility checks; add a changelog entry for the location ordering fix.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| sdk/cosmos/azure_data_cosmos/src/routing/location_cache.rs | Adds initial_preferred_locations, changes update() to rebuild effective ordering each refresh, and adds regression tests. |
| sdk/cosmos/azure_data_cosmos/src/retry_policies/client_retry_policy.rs | Adjusts 403/3 retry path to use per-partition failover / circuit breaker eligibility checks before marking endpoints unavailable. |
| sdk/cosmos/azure_data_cosmos/CHANGELOG.md | Adds an unreleased “Bugs Fixed” entry documenting the hub-region ordering fix. |
Comments suppressed due to low confidence (1)

sdk/cosmos/azure_data_cosmos/src/routing/location_cache.rs:216

  • existing is built once from effective_preferred_locations and never updated when you push new regions. That means duplicates in read_locations (or duplicates already present in initial_preferred_locations) can still produce duplicate entries in preferred_locations, despite the dedup intent. Consider making the set mutable and inserting each region as it’s appended (and optionally normalizing/deduping the initial list as well) so the preferred list is guaranteed unique.
        // Use HashSet for O(1) lookups instead of O(n) linear search
        let existing: HashSet<RegionName> = effective_preferred_locations.iter().cloned().collect();

        // Extend with read locations not already in preferred locations - O(n)
        for location in &read_locations {
            if !existing.contains(&location.name) {
                effective_preferred_locations.push(location.name.clone());
            }
        }
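
One way the suggestion in this comment could look, as a hedged sketch rather than code from the PR: the set is mutable and every appended region is inserted into it, so duplicates in either input are dropped. `build_unique_preferred` is a hypothetical helper, and `String` stands in for `RegionName`.

```rust
use std::collections::HashSet;

/// Hypothetical helper: merge the initial list and the gateway read
/// locations into one list with no repeated regions, keeping first-seen order.
fn build_unique_preferred(initial: &[String], read_locations: &[String]) -> Vec<String> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut effective = Vec::with_capacity(initial.len() + read_locations.len());

    // Walk the initial list first so it stays the priority prefix.
    for name in initial.iter().chain(read_locations.iter()) {
        // `insert` returns false when the value was already present,
        // so each region is pushed at most once.
        if seen.insert(name.as_str()) {
            effective.push(name.clone());
        }
    }
    effective
}
```

Running the initial list through the same loop also normalizes a duplicated user entry, which covers the optional part of the suggestion.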


@FabianMeiswinkel FabianMeiswinkel left a comment


LGTM - Thx!

@github-project-automation github-project-automation bot moved this from Todo to Approved in CosmosDB Go/Rust Crew Mar 9, 2026

tvaron3 commented Mar 16, 2026

Application region is now mandatory, so this change is not currently relevant.

@tvaron3 tvaron3 closed this Mar 16, 2026
@github-project-automation github-project-automation bot moved this from Approved to Done in CosmosDB Go/Rust Crew Mar 16, 2026

Labels

Cosmos The azure_cosmos crate

Projects

Status: Done


5 participants