
Purview: Parallelize PSPC cold-cache scope refresh#5832

Open
taisirhassan wants to merge 7 commits into microsoft:main from taisirhassan:purview-parallel-scope-cache


taisirhassan (Contributor) commented May 14, 2026

Motivation and Context

Improve Purview PSPC cold-cache behavior by allowing ProcessContent to run without waiting for a foreground ProtectionScopes lookup. This reduces user-visible latency on cache misses while still refreshing the ProtectionScopes cache for future requests.

The implementation aligns Agent Framework's Python and .NET Purview packages with the Purview SDK's parallel scope retrieval behavior.

Description

  • On a ProtectionScopes cache miss, refresh scopes in the background while ProcessContent runs in the foreground.
  • Preserve cache-hit behavior: cached scopes are evaluated locally before deciding whether to process inline, queue offline processing, or send ContentActivities.
  • Preserve SDK parity for inline evaluation: cold misses do not force Prefer: evaluateInline; cached inline scopes set inline evaluation explicitly.
  • Add .NET ScopeRetrievalJob handling so background scope refreshes use the existing background job infrastructure.
  • Send ContentActivities in the background when a cold-miss scope refresh finds no applicable scopes, preserving the audit signal.
  • Cache tenant-level 402 (Payment Required) responses from background scope refreshes so subsequent requests for the tenant short-circuit before repeating Purview work.
  • Invalidate cached scopes when ProcessContent reports modified protection-scope state.
  • Preserve restriction-only policy actions and deduplicate combined policy actions by (action, restriction_action) in Python and .NET.
  • Align .NET multi-location scope matching with Python and the Purview SDK by treating a scope as applicable when any location matches.
  • Update the Python Purview sample to run the same good -> expected block -> good flow across agent middleware, chat middleware, custom cache, and default cache scenarios.
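The cold-miss behavior in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the package's processor: `ColdCacheProcessor`, its injected `get_scopes` and `process_content` callables, and the in-memory `scope_cache` dict are all hypothetical names chosen for the example. It only shows the shape of the idea, which is to start the scope refresh as a background task and return the foreground ProcessContent result without awaiting it.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Hypothetical callable shapes; the real client methods may differ.
ScopeLookup = Callable[[str], Awaitable[dict[str, Any]]]
ContentCall = Callable[[str, str], Awaitable[dict[str, Any]]]


class ColdCacheProcessor:
    """Illustrative only: not the package's actual processor."""

    def __init__(self, get_scopes: ScopeLookup, process_content: ContentCall) -> None:
        self._get_scopes = get_scopes
        self._process_content = process_content
        self.scope_cache: dict[str, dict[str, Any]] = {}
        self.background_tasks: set[asyncio.Task[None]] = set()

    async def _refresh_scopes(self, tenant_id: str) -> None:
        # Runs in the background; the result only benefits later requests.
        self.scope_cache[tenant_id] = await self._get_scopes(tenant_id)

    async def process(self, tenant_id: str, text: str) -> dict[str, Any]:
        if tenant_id not in self.scope_cache:
            # Cold miss: start the refresh but do not await it.
            task = asyncio.create_task(self._refresh_scopes(tenant_id))
            # Keep a strong reference so the task is not garbage collected.
            self.background_tasks.add(task)
            task.add_done_callback(self.background_tasks.discard)
        # A cache hit would evaluate the cached scopes locally first (omitted).
        return await self._process_content(tenant_id, text)
```

Holding the task in a set until it completes mirrors the `processor._background_tasks` collection that the tests in this PR await with `asyncio.gather`.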

Validation

  • Python Purview processor tests: 36 passed
  • .NET Purview unit tests: 56 passed on net10.0
  • .NET Purview unit tests: 56 passed on net472
  • Python sample syntax check: py_compile passed
  • Manual sample run verified the good (cold cache) -> expected block -> good (warm cache) orchestration, including cache miss/hit behavior and Purview prompt blocking with the configured Prompt blocked by policy message.

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Copilot AI review requested due to automatic review settings May 14, 2026 00:23
@moonbox3 added the documentation, python, and .NET labels May 14, 2026
@github-actions bot changed the title from "Add parallel Purview scope cache refresh" to ".NET: Add parallel Purview scope cache refresh" May 14, 2026
@github-actions bot changed the title from ".NET: Add parallel Purview scope cache refresh" to "Python: Add parallel Purview scope cache refresh" May 14, 2026
moonbox3 (Contributor) commented May 14, 2026

Python Test Coverage

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| packages/purview/agent_framework_purview/_processor.py | 184 | 12 | 93% | 178, 258–261, 288, 316–317, 328, 330, 336, 338 |
| TOTAL | 34131 | 3909 | 88% | |

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 6801 | 30 💤 | 0 ❌ | 0 🔥 | 1m 53s ⏱️ |

Copilot AI (Contributor) left a comment

Pull request overview

Adds opt-in parallel Purview protection-scope cache refresh for Python and .NET so cold-cache ProcessContent calls can proceed while scope data is warmed asynchronously.

Changes:

  • Adds parallel protection scope retrieval settings and documentation.
  • Adds background scope refresh paths and ProcessContent scope identifier invalidation.
  • Extends tests for settings, models, cache invalidation, and parallel retrieval behavior.
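As a rough sketch of the invalidation path mentioned above: the helper below drops a cached scope entry when a ProcessContent response reports changed protection-scope state. The function name, the `"modified"` marker value, the plain-dict cache, and the cache key format are assumptions made for illustration; the actual models and cache provider in this PR may look different.

```python
from typing import Any, Optional

MODIFIED = "modified"  # assumed marker for changed protection-scope state
_MISSING = object()  # sentinel so a cached None value still counts as an entry


def invalidate_if_modified(
    scope_cache: dict[str, Any],
    cache_key: str,
    protection_scope_state: Optional[str],
) -> bool:
    """Drop the cached scopes when ProcessContent reports a changed state.

    Returns True when an entry was actually removed.
    """
    if protection_scope_state != MODIFIED:
        return False
    # pop() with a default tolerates a key that was never cached (cold miss).
    return scope_cache.pop(cache_key, _MISSING) is not _MISSING
```

Returning a boolean keeps the helper easy to assert on in tests, for example to distinguish a stale-cache removal from a cold miss where nothing was cached yet.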

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.

| File | Description |
|---|---|
| python/packages/purview/tests/purview/test_settings.py | Covers new Python setting behavior. |
| python/packages/purview/tests/purview/test_purview_models.py | Covers ProcessContent scope identifier deserialization. |
| python/packages/purview/tests/purview/test_processor.py | Adds Python processor tests for invalidation and parallel refresh. |
| python/packages/purview/README.md | Documents Python cache invalidation and parallel retrieval option. |
| python/packages/purview/agent_framework_purview/_settings.py | Adds Python settings key. |
| python/packages/purview/agent_framework_purview/_processor.py | Implements Python parallel refresh and scope-id invalidation. |
| python/packages/purview/agent_framework_purview/_models.py | Adds ProcessContent response scope identifier support. |
| dotnet/tests/Microsoft.Agents.AI.Purview.UnitTests/ScopedContentProcessorTests.cs | Adds .NET processor tests for invalidation and parallel retrieval. |
| dotnet/src/Microsoft.Agents.AI.Purview/ScopedContentProcessor.cs | Implements .NET parallel refresh queuing and foreground invalidation. |
| dotnet/src/Microsoft.Agents.AI.Purview/README.md | Documents .NET setting. |
| dotnet/src/Microsoft.Agents.AI.Purview/PurviewSettings.cs | Adds .NET setting. |
| dotnet/src/Microsoft.Agents.AI.Purview/PurviewClient.cs | Captures ProcessContent ETag as scope identifier. |
| dotnet/src/Microsoft.Agents.AI.Purview/Models/Responses/ProcessContentResponse.cs | Adds non-serialized scope identifier property. |
| dotnet/src/Microsoft.Agents.AI.Purview/Models/Jobs/ScopeRetrievalJob.cs | Adds background scope refresh job model. |
| dotnet/src/Microsoft.Agents.AI.Purview/BackgroundJobRunner.cs | Executes background scope retrieval jobs and caches responses. |
Comments suppressed due to low confidence (1)

python/packages/purview/agent_framework_purview/_processor.py:314

  • Calling _combine_policy_actions with an empty dlp_actions list is not a no-op: the helper only keeps existing actions that have a.action, so a ProcessContent response containing a restriction-only policy action (for example restrictionAction == block) is dropped. The parallel cold path always reaches this line with an empty local action list, which can cause process_messages to miss a service-enforced block.
        pc_resp.policy_actions = self._combine_policy_actions(pc_resp.policy_actions, dlp_actions)
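One way to avoid the dropped restriction-only action is to key the merge on the pair `(action, restriction_action)` rather than filtering on `action` alone. The sketch below is an assumption-laden illustration: `PolicyAction` is a hypothetical stand-in dataclass, and `combine_policy_actions` is a free function rather than the package's `_combine_policy_actions` method, whose real signature is not shown in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PolicyAction:
    """Hypothetical stand-in for the package's policy-action model."""

    action: Optional[str] = None
    restriction_action: Optional[str] = None


def combine_policy_actions(
    existing: Optional[list[PolicyAction]],
    additional: list[PolicyAction],
) -> list[PolicyAction]:
    """Merge two action lists, keeping the first entry for each key."""
    seen: set[tuple[Optional[str], Optional[str]]] = set()
    combined: list[PolicyAction] = []
    for item in [*(existing or []), *additional]:
        # Keying on both fields keeps restriction-only entries (action is None).
        key = (item.action, item.restriction_action)
        if key in seen:
            continue
        seen.add(key)
        combined.append(item)
    return combined
```

With this shape, calling the helper with an empty second list returns the existing actions unchanged apart from duplicate removal, so a service-enforced block with only `restriction_action` set survives the cold path.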

Comment thread python/packages/purview/agent_framework_purview/_processor.py Outdated
Comment thread python/packages/purview/agent_framework_purview/_processor.py Outdated
Comment thread dotnet/src/Microsoft.Agents.AI.Purview/ScopedContentProcessor.cs Outdated
Comment thread dotnet/src/Microsoft.Agents.AI.Purview/BackgroundJobRunner.cs Outdated
@taisirhassan force-pushed the purview-parallel-scope-cache branch from e6b517e to 1ebf7f7 May 14, 2026 00:38
@taisirhassan force-pushed the purview-parallel-scope-cache branch from 1ebf7f7 to ba50cd3 May 14, 2026 01:17
@taisirhassan requested a review from Copilot May 14, 2026 01:19
@taisirhassan changed the title from "Python: Add parallel Purview scope cache refresh" to "Purview: Parallelize PSPC cold-cache scope refresh" May 14, 2026
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

python/packages/purview/tests/purview/test_processor.py:347

  • This test never exercises the cached-scope path: _process_with_scopes sees a cache miss, starts get_protection_scopes only in the background, and calls ProcessContent without the scope_identifier from psResponse. As written it only verifies cold-miss removal for a zero response, so it does not cover the documented case where cached scopes are stale; seed the cache with psResponse or assert the background behavior separately.
        request = process_content_request_factory()
        await processor._process_with_scopes(request)
        await asyncio.gather(*list(processor._background_tasks))

        cached = await cache.get(f"purview:payment_required:{request.tenant_id}")
        assert isinstance(cached, PurviewPaymentRequiredError)

    async def test_map_messages_with_user_id_in_additional_properties(self, mock_client: AsyncMock) -> None:
        """Test user_id extraction from message additional_properties."""
        settings = PurviewSettings(
            app_name="Test App",
            tenant_id="12345678-1234-1234-1234-123456789012",

Comment thread dotnet/src/Microsoft.Agents.AI.Purview/BackgroundJobRunner.cs Outdated
Comment thread python/packages/purview/agent_framework_purview/_processor.py Outdated
Comment thread python/packages/purview/agent_framework_purview/_processor.py
Comment thread dotnet/tests/Microsoft.Agents.AI.Purview.UnitTests/ScopedContentProcessorTests.cs Outdated
Comment thread python/packages/purview/agent_framework_purview/_processor.py
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Comment thread python/packages/purview/agent_framework_purview/_processor.py
Comment thread python/packages/purview/agent_framework_purview/_processor.py Outdated
Comment thread python/packages/purview/agent_framework_purview/_processor.py
Comment thread dotnet/src/Microsoft.Agents.AI.Purview/BackgroundJobRunner.cs Outdated
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (3)

python/samples/05-end-to-end/purview_agent/sample_purview_agent.py:189

  • This also feeds the AZURE_OPENAI_ENDPOINT value into project_endpoint. Since project_endpoint must be a Foundry project endpoint, the chat middleware path will fail or target the wrong service when users follow the sample's environment variable names.
        project_endpoint=endpoint,

python/samples/05-end-to-end/purview_agent/sample_purview_agent.py:236

  • This custom-cache path has the same endpoint mismatch: the sample reads AZURE_OPENAI_ENDPOINT but now passes it as a Foundry project_endpoint. Use the Foundry project endpoint environment variable here as well so the sample remains runnable.
    client = FoundryChatClient(model=deployment, project_endpoint=endpoint, credential=AzureCliCredential())

python/samples/05-end-to-end/purview_agent/sample_purview_agent.py:278

  • This default-cache client is initialized with the value from AZURE_OPENAI_ENDPOINT even though project_endpoint expects a Foundry project endpoint. Keeping the old environment variable here makes this sample path fail for users with an Azure OpenAI endpoint configured.
    client = FoundryChatClient(model=deployment, project_endpoint=endpoint, credential=AzureCliCredential())

Comment thread python/samples/05-end-to-end/purview_agent/sample_purview_agent.py
Comment thread dotnet/src/Microsoft.Agents.AI.Purview/ScopedContentProcessor.cs
Comment thread dotnet/src/Microsoft.Agents.AI.Purview/BackgroundJobRunner.cs Outdated
@taisirhassan requested a review from Copilot May 14, 2026 04:47
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

Comment thread dotnet/src/Microsoft.Agents.AI.Purview/ScopedContentProcessor.cs
Rishabh4275 (Contributor) commented

  1. Restore dedup in _combine_policy_actions (key by (action, restriction_action) or similar) and add a regression test for the duplicate-action case.
  2. Decide and document/align the 402-on-first-call contract between Python and .NET.

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

Comment thread dotnet/src/Microsoft.Agents.AI.Purview/ScopedContentProcessor.cs
Comment thread python/packages/purview/agent_framework_purview/_processor.py
Comment thread python/samples/05-end-to-end/purview_agent/sample_purview_agent.py
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

dotnet/src/Microsoft.Agents.AI.Purview/BackgroundJobRunner.cs:87

  • This payment-required cache write can also abort the background consumer if the cache provider throws a SystemException-derived failure, because the outer runner filter will not catch it. Since 402 caching is intended to be best-effort, this should be caught and logged locally so a cache outage does not stop all subsequent background jobs.
                    await this._cacheProvider.SetAsync(
                        new PaymentRequiredCacheKey(scopeRetrievalJob.Request.TenantId),
                        new PaymentRequiredCacheEntry(ex.Message),
                        CancellationToken.None).ConfigureAwait(false);
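The same best-effort idea can be illustrated on the Python side, where the tests in this PR read the key `purview:payment_required:{tenant_id}`. The sketch below is only an analogue of the pattern, not the .NET fix: `cache_payment_required`, the `cache.set(key, value)` call, and the logger name are assumptions for illustration.

```python
import logging
from typing import Any

logger = logging.getLogger("purview.background")  # assumed logger name


class PaymentRequiredError(Exception):
    """Hypothetical stand-in for a tenant-level 402 failure."""


async def cache_payment_required(cache: Any, tenant_id: str, error: Exception) -> bool:
    """Best-effort write: returns False instead of raising when the cache fails."""
    try:
        # Assumes the cache exposes an async set(key, value) method.
        await cache.set(f"purview:payment_required:{tenant_id}", error)
        return True
    except Exception:
        # A cache outage is logged locally so later background jobs still run.
        logger.warning(
            "Could not cache payment-required state for tenant %s",
            tenant_id,
            exc_info=True,
        )
        return False
```

Catching `Exception` here leaves task cancellation untouched, since `asyncio.CancelledError` derives from `BaseException` on current Python versions.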

Comment thread dotnet/src/Microsoft.Agents.AI.Purview/BackgroundJobRunner.cs
Comment thread python/packages/purview/agent_framework_purview/_processor.py
Comment thread python/samples/05-end-to-end/purview_agent/sample_purview_agent.py
Rishabh4275 (Contributor) commented

Could we also check the cache miss contentActivities call path?

Rishabh4275 (Contributor) commented May 14, 2026

Check the READMEs as well (4, all Purview: one for the sample and one for the actual package, in both Python and .NET).
Also, I want to confirm that we're checking the good path vs. the block path vs. the good path: orchestrate multiple messages one after another.

Comment thread dotnet/src/Microsoft.Agents.AI.Purview/BackgroundJobRunner.cs Outdated
Comment thread dotnet/src/Microsoft.Agents.AI.Purview/README.md Outdated
Comment thread dotnet/src/Microsoft.Agents.AI.Purview/ScopedContentProcessor.cs Outdated
Comment thread python/packages/a2a/agent_framework_a2a/_agent.py
Comment thread python/packages/purview/AGENTS.md Outdated
taisirhassan and others added 7 commits May 18, 2026 12:27
 Deduplicate combined policy actions by action and restriction action so restriction-only actions are preserved
without duplicating identical entries. Cache tenant-level payment-required state from background scope refresh so
subsequent calls short-circuit consistently.
…l and add unit tests for cache write failures
…tyJob when no applicable scopes are found and update related tests
…ct caching optimizations and policy enforcement scenarios

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Labels

documentation (Improvements or additions to documentation), .NET, python, workflows (Related to Workflows in agent-framework)

4 participants