Python: Bambriz context provider #5139
Pull request overview
Adds a new Azure Cosmos DB-backed context provider to the Python agent-framework-azure-cosmos package, enabling retrieval-time context injection and post-run message writeback to Cosmos DB.
Changes:
- Introduces CosmosContextProvider and CosmosContextSearchMode, with full-text, vector, and hybrid query construction and writeback behavior.
- Exposes the new provider from the package's public API and updates package metadata and docs.
- Adds a comprehensive new test suite for the context provider.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| python/packages/azure-cosmos/agent_framework_azure_cosmos/_context_provider.py | New Cosmos DB context provider implementation (retrieval + writeback). |
| python/packages/azure-cosmos/agent_framework_azure_cosmos/__init__.py | Exports the new provider/search mode in the public API. |
| python/packages/azure-cosmos/tests/test_cosmos_context_provider.py | New tests covering retrieval modes, settings resolution, writeback, and lifecycle. |
| python/packages/azure-cosmos/README.md | Documents context provider usage/configuration and updates wording. |
| python/packages/azure-cosmos/AGENTS.md | Updates package overview and usage snippets to include the context provider. |
| python/packages/azure-cosmos/pyproject.toml | Updates package description to include context provider support. |
Fixed some issues exposed during end to end testing
- Streamline _context_provider.py implementation
- Reduce test complexity in test_cosmos_context_provider.py
- Remove deprecated cosmos_context_shared.py sample
- Update AGENTS.md and README.md documentation
- Update uv.lock

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…/agent-framework into bambriz-context-provider
@moonbox3 could we get a review on this PR?
moonbox3 left a comment
Please also have a look at the failing CI/CD checks.
user_id = context.metadata.get("user_id") if context.metadata else None

base_sort_key = time.time_ns()
for index, message in enumerate(writeback):
Should we be doing the writeback one upsert at a time? If message N raises (429, transient network, throttling), messages 0..N-1 are already committed and the exception propagates out of after_run. Next turn, before_run will retrieve half-written exchanges (user without assistant, or vice versa) and quietly poison the RAG context. Neither the mem0 (single add()) nor redis (single load()) sibling has this shape. Wondering if we want a try/except around upsert_item that logs and continues, or batch the documents so it is all-or-nothing per turn.
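The log-and-continue option mentioned above can be sketched as follows. This is a minimal illustration, not the PR's implementation: FlakyContainer and write_back are hypothetical names, and the fake container only mimics the async upsert_item(body=...) call so the example runs without a Cosmos endpoint.

```python
import asyncio
import logging
from typing import Any, Dict, List, Optional

logger = logging.getLogger(__name__)


class FlakyContainer:
    """Hypothetical stand-in for a Cosmos container that fails on one chosen id."""

    def __init__(self, fail_on: Optional[str] = None) -> None:
        self.fail_on = fail_on
        self.items: Dict[str, Dict[str, Any]] = {}

    async def upsert_item(self, body: Dict[str, Any]) -> Dict[str, Any]:
        if body["id"] == self.fail_on:
            raise RuntimeError("simulated 429")
        self.items[body["id"]] = body
        return body


async def write_back(container: Any, documents: List[Dict[str, Any]]) -> List[str]:
    """Upsert each document, logging failures instead of raising.

    Returns the ids that could not be written so the caller can retry them
    or discard the whole turn.
    """
    failed: List[str] = []
    for document in documents:
        try:
            await container.upsert_item(body=document)
        except Exception:
            logger.warning("Writeback failed for %s", document["id"], exc_info=True)
            failed.append(document["id"])
    return failed
```

Note that this keeps after_run from raising but does not by itself prevent half-written exchanges; it only reports which documents are missing. An all-or-nothing write per turn would need something like Cosmos DB's transactional batch, which is limited to a single logical partition.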
if self.message_field_name and self.message_field_name not in fields:
    fields.append(self.message_field_name)
select = ", ".join(f"c.{f}" for f in fields)
base = f"SELECT TOP {self.scan_limit} {select} FROM c"  # noqa: S608  # nosec B608
The # nosec suppresses bandit, but select is built from content_field_names / message_field_name (line 332-335) and vector_field_name (line 346, 351) is also interpolated into the ORDER BY, all constructor strings with no validation. Cosmos NoSQL cannot parameterize field names, so the only defense is constructor-time validation. In a multi-tenant agent host where field names come from request/tenant config, a value like embedding}, @x) OFFSET 0 LIMIT 1; -- injects. Have we thought about a ^[A-Za-z_][A-Za-z0-9_]*$ check at __init__ for every user-supplied field name?
Suggested change:
- base = f"SELECT TOP {self.scan_limit} {select} FROM c"  # noqa: S608  # nosec B608
+ base = f"SELECT TOP {self.scan_limit} {select} FROM c"  # noqa: S608  # nosec B608 - field names validated in __init__
(plus the validation in __init__ for content_field_names, message_field_name, vector_field_name)
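A constructor-time check along the lines proposed above might look like the sketch below. The helper name validate_field_names is hypothetical and not part of the PR. It uses re.fullmatch rather than match with a trailing $, because $ also matches just before a trailing newline.

```python
import re
from typing import Iterable, Tuple

# Same character class as the pattern proposed in the review comment.
_FIELD_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_field_names(names: Iterable[str]) -> Tuple[str, ...]:
    """Return the names as a tuple, raising ValueError on the first unsafe one."""
    validated = []
    for name in names:
        # fullmatch requires the whole string to match, so "name\n" is rejected.
        if not isinstance(name, str) or not _FIELD_NAME.fullmatch(name):
            raise ValueError(f"Invalid Cosmos field name: {name!r}")
        validated.append(name)
    return tuple(validated)
```

For example, validate_field_names(["text", "embedding"]) passes, while the injection string from the comment raises ValueError. One trade-off of this pattern is that it also rejects nested property paths such as a.b, so a provider that needs those would have to validate each path segment separately.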
self.vector_field_name = vector_field_name
self.embedding_function = embedding_function
self.partition_key = partition_key
self.weights = tuple(float(w) for w in weights) if weights is not None else None
Cosmos RRF() requires weights in [0, 1]. No bounds check here means a misconfigured weights=[2.0, 1.0] (which is what test_hybrid_weights_passed_through actually uses) crashes mid-before_run against a real endpoint with an opaque server error inside an agent turn rather than a clear ValueError at construction. Worth validating at init?
Suggested change:
- self.weights = tuple(float(w) for w in weights) if weights is not None else None
+ if weights is not None:
+     for w in weights:
+         if not 0.0 <= float(w) <= 1.0:
+             raise ValueError(f"RRF weight {w!r} must be in [0, 1]")
+     self.weights = tuple(float(w) for w in weights)
+ else:
+     self.weights = None
container = await self._get_container()
properties = await container.read()
paths: list[str] = properties.get("partitionKey", {}).get("paths", [])  # type: ignore[assignment]
What happens on a container with a hash/synthetic partition key, or any response shape where paths is empty? paths[0] raises IndexError and crashes after_run with no useful message. Should we guard with an explicit error?
Suggested change:
- paths: list[str] = properties.get("partitionKey", {}).get("paths", [])  # type: ignore[assignment]
+ paths: list[str] = properties.get("partitionKey", {}).get("paths", [])  # type: ignore[assignment]
+ if not paths:
+     raise ValueError(
+         f"Could not determine partition key path for container '{self.container_name}'."
+     )
+ field = paths[0].lstrip("/")
+ self._partition_key_field = field
    exc_tb: Any,
) -> None:
    try:
        await self.close()
If the async with block already raised and close() also raises, the close error is silently dropped. That can mask real client/socket leak signals from azure-cosmos. Other Azure clients in the repo at least log here. Should this except Exception log with exc_info=True before swallowing?
Suggested change:
- await self.close()
+ try:
+     await self.close()
+ except Exception:
+     if exc_type is None:
+         raise
+     logger.warning("Error closing Cosmos client during exception handling.", exc_info=True)
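To make the behavior of this suggestion concrete, here is a small self-contained sketch. ClosingProvider and run are hypothetical names used only for illustration, and close() is made to fail unconditionally so both branches can be exercised without a real Cosmos client.

```python
import asyncio
import logging
from types import TracebackType
from typing import Optional, Type

logger = logging.getLogger(__name__)


class ClosingProvider:
    """Hypothetical provider whose close() always fails."""

    async def close(self) -> None:
        raise ConnectionError("simulated close failure")

    async def __aenter__(self) -> "ClosingProvider":
        return self

    async def __aexit__(
        self,
        exc_type: Optional[Type[BaseException]],
        exc_value: Optional[BaseException],
        exc_tb: Optional[TracebackType],
    ) -> None:
        try:
            await self.close()
        except Exception:
            if exc_type is None:
                raise  # nothing else is in flight, so surface the close error
            # The body's exception is already propagating: log the close error and keep it.
            logger.warning("Error closing client during exception handling.", exc_info=True)


async def run(fail_body: bool) -> str:
    """Return the name of the exception that escapes the async with block."""
    try:
        async with ClosingProvider():
            if fail_body:
                raise ValueError("body failed")
    except Exception as exc:
        return type(exc).__name__
    return "ok"
```

With this shape, a failure inside the async with body still reaches the caller as the original exception, while the close failure is recorded in the log rather than dropped. When the body succeeds, the close failure itself is raised.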
Motivation and Context
Adds an Azure Cosmos DB context provider to the agent-framework-azure-cosmos package, enabling RAG-style context injection from Cosmos DB before model invocation and automatic write-back of request/response messages after each run.
Description
New: CosmosContextProvider
- Implements the ContextProvider protocol for Azure Cosmos DB
- Search modes via CosmosContextSearchMode: VECTOR, FULL_TEXT, and HYBRID
- Turns user/assistant messages into a retrieval query string
- Write-back of request/response messages after each run (after_run)
- Options: top_k, scan_limit, partition_key, weights (for hybrid RRF), content_field_names
Samples (python/samples/02-agents/context_providers/azure_cosmos/)
- cosmos_context_basics.py: Vector search context provider usage
- cosmos_context_fulltext.py: Full-text retrieval mode
- cosmos_context_hybrid.py: Hybrid retrieval with RRF weights
Tests
- test_cosmos_context_provider.py: Unit tests covering all search modes, message filtering, write-back, and error handling (95% coverage)
Other changes
- Updated AGENTS.md with context provider documentation and usage examples
- Updated README.md with detailed context provider API docs
Contribution Checklist