Python: fix(python/openai): use full-history mode by default to fix tool call streaming with OpenRouter#5111
Conversation
…aming with OpenRouter _get_conversation_id now returns None unless store=True is explicitly set, making server-managed conversation mode opt-in. This ensures compatibility with providers that don't support previous_response_id (e.g. OpenRouter), where sending only a tool result without full conversation context caused an HTTP 400 error. Fixes microsoft#5105
|
@microsoft-github-policy-service agree |
chetantoshniwal
left a comment
There was a problem hiding this comment.
Automated Code Review
Reviewers: 4 | Confidence: 90%
✓ Correctness
This PR fixes a correctness issue where
_get_conversation_idwould return a conversation ID whenstore=None(the default), causing providers that don't supportprevious_response_id(e.g. OpenRouter) to fail. The fix changes the guard fromif store is Falsetoif store is not True, so only an explicitstore=Trueopt-in enables server-managed conversation mode. The logic change is correct and the tests are well-structured. One note: the same_get_conversation_idmethod in_responses_client.py(line 448) still uses the oldif store is Falseguard, but this may be intentional since the Responses API is OpenAI-native and always supportsprevious_response_id.
✓ Security Reliability
This PR fixes a reliability issue (GitHub #5105) where
_get_conversation_idwould return a conversation ID whenstore=None(the default), causing failures with OpenAI-compatible providers (e.g., OpenRouter) that don't supportprevious_response_id. The fix correctly changes the guard fromif store is Falsetoif store is not True, ensuring only explicit opt-in viastore=Trueenables server-managed conversation mode. The code change is minimal and well-tested. No security issues are introduced. Note that the sibling_responses_client.pystill uses the oldif store is Falsecheck, but this is likely intentional since the Responses API is OpenAI-native and always supports conversation IDs.
✓ Test Coverage
The test coverage for this behavioral change is solid. The new test
test_parse_response_with_store_none_returns_nonedirectly validates the regression fix (issue #5105) by asserting that bothstore=Noneandstore=FalsereturnNonefrom_get_conversation_id. The streaming tests (test_streaming_response_created_typeandtest_streaming_response_in_progress_type) were properly updated to verify bothstore=True(returns conversation_id) andstore=None(returnsNone) paths. One minor gap: the non-streaming_get_conversation_idpath lacks a directstore=Truewith-conversation positive test in this diff, though this is likely covered by a pre-existing test. Astore=Falsecase for streaming is untested but is functionally identical tostore=Noneunder the newis not Trueguard. Overall the tests are meaningful — they assert specific return values rather than just exercising code paths.
✓ Design Approach
The change correctly inverts the semantics of
_get_conversation_idfrom opt-out (return None only whenstore=False) to opt-in (return non-None only whenstore=True). Previously,store=None(the default when a caller omits the field) would fall through the guard and return aprevious_response_id-based conversation ID, which breaks providers like OpenRouter that don't support that parameter. The fix properly treatsstore=Nonethe same asstore=False, making server-managed conversation mode strictly opt-in. The change is minimal, addresses the root cause rather than masking it, and the updated tests adequately cover all threestorestates. No blocking design issues found.
Automated review by chetantoshniwal's agents
| assert client._get_conversation_id(mock_response, store=False) is None | ||
|
|
||
|
|
||
| def test_parse_response_uses_response_id_when_no_conversation() -> None: |
There was a problem hiding this comment.
This test covers the store=None and store=False negative cases well, but doesn't verify the positive case (store=True returning conv_456). Adding assert client._get_conversation_id(mock_response, store=True) == 'conv_456' would make this a complete tri-state test and guard against future regressions that might accidentally break the store=True path.
| def test_parse_response_uses_response_id_when_no_conversation() -> None: | |
| assert client._get_conversation_id(mock_response, store=None) is None | |
| assert client._get_conversation_id(mock_response, store=False) is None | |
| assert client._get_conversation_id(mock_response, store=True) == "conv_456" |
| mode, which is compatible with providers that do not support ``previous_response_id`` | ||
| (e.g. OpenRouter). | ||
| """ | ||
| if store is not True: |
There was a problem hiding this comment.
so this is not the way, this client is designed to connect to the Responses API and that API (at least in both OpenAI and Foundry) uses service side storage by default, so only when store is explicitly set to False do we not store anything, the assumption for these clients is that when store is not set, the value is True, hence the check for store=False above. In order to use this as expected with other providers, either create a issue on their end that they adhere to the responses API which means they would have to support the store=True mechanism, or that they raise when store=None, or in your own code, set the options to store=False
There was a problem hiding this comment.
Pull request overview
This PR changes the OpenAI Python chat client to default to full-history mode (no server-managed continuation) to avoid streaming failures after tool calls on providers that don’t support previous_response_id (e.g., OpenRouter), while keeping server-managed mode available via explicit store=True.
Changes:
- Updated
_get_conversation_idto returnNoneunlessstore=Trueis explicitly set, making server-managed continuation opt-in. - Added/updated unit tests to cover
store=None(default) vsstore=Truebehavior for both response parsing and streaming events.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
python/packages/openai/agent_framework_openai/_chat_client.py |
Makes conversation-id/continuation behavior opt-in by requiring store=True to emit a conversation identifier. |
python/packages/openai/tests/openai/test_openai_chat_client.py |
Adds regression coverage ensuring default (store=None) uses full-history mode and streaming only includes conversation_id when store=True. |
|
Hey @eavanvalkenburg — appreciate the thorough review and totally understand the concern about the Responses API design intent. That said, I want to push back a bit on the framing. The agent-framework positions itself as a multi-provider framework, not an OpenAI/Foundry-exclusive client. Defaulting to server-managed conversation mode (store=True assumption) means any provider that doesn't support previous_response_id — OpenRouter, Anthropic, Mistral, etc. — fails silently with an HTTP 400. That's a pretty rough default for a framework that advertises broad provider support. The fix doesn't remove server-managed mode — it just makes it opt-in with store=True, which is one line for OpenAI native users. Meanwhile, everyone else gets a working default out of the box. On your suggestion to have users set store=False explicitly — that works as a workaround, but it puts the burden on every non-OpenAI user to know about an internal implementation detail just to avoid a crash. That feels like the wrong abstraction layer for the fix. Would you be open to merging with the opt-in approach, with updated docs making clear that store=True is required for OpenAI native server-managed mode? Happy to add that if it helps get this across the line. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Summary
Fixes #5105 — streaming crashes after tool execution when using OpenRouter or any provider that doesn't support
previous_response_id.Root cause:
_get_conversation_idreturnedresponse.idby default (whenstorewas not explicitlyFalse). This caused the executor to enter server-managed continuation mode — clearing the full conversation history and sending only the tool result in the next request. Providers like OpenRouter ignoreprevious_response_idand received a barefunction_call_outputat position 0 → HTTP 400.Fix:
_get_conversation_idnow returnsNoneunlessstore=Trueis explicitly set. Server-managed conversation mode is opt-in; the default is full-history mode, which is universally compatible.Users who want the server-managed optimization (OpenAI native only) should set
store=Truein their chat options.Changes
python/packages/openai/agent_framework_openai/_chat_client.py: changedif store is False→if store is not Truein_get_conversation_idpython/packages/openai/tests/openai/test_openai_chat_client.py: updated streaming event tests to cover bothstore=Trueandstore=None; addedtest_parse_response_with_store_none_returns_noneas a regression testTest plan
uv run pytest packages/openai/tests/ -m "not integration"— 270 passed, 0 failuresstore=Truestill enables server-managed conversation mode for OpenAI native