Python: Expose cross-session memory origin for downstream governance (Dai et al., arXiv:2605.06158) #5914

## Summary

When a `ContextProvider` injects messages from a different `session_id` than the requesting one (the legitimate cross-session-memory use case in `_harness/_memory.py`, Mem0 with default scope, and any session-agnostic backend), the framework retains no provenance back to the originating session. Downstream observers — governance middleware, audit pipelines, behavioral baselines — cannot distinguish "fresh content for this session" from "content originally written in a prior session under different tool permissions."

This makes the stateful-agent-backdoor attack class documented in [Dai et al., arXiv:2605.06158](https://arxiv.org/abs/2605.06158) (preprint May 2026) invisible to governance layers that hook into the context provider pipeline. The paper specifically names Microsoft Agent Framework (MAF) in its §6.1 / Table 10 realism survey as satisfying the attack's two preconditions (session-level isolation via `AgentSession`; persistent storage shared across sessions by default).

## Motivation

The Dai et al. attack chains sub-backdoors across sessions via persistent memory: session 1 (no sensitive tools) writes attack state; session 2 (file-system tools available) collects secrets and updates state; session 3 (network tools available) exfiltrates. Each session's individual actions are policy-permitted; the chain crosses session boundaries via persisted state. The paper reports 80–95% attack success rate across four base models under supply-chain SFT delivery.

The paper's primary suggested defense is cross-session behavioral pattern analysis (§6.3). This isn't currently possible against MAF without forking or wrapping every concrete provider, because the framework doesn't surface the cross-session signal at the context layer where governance middleware operates.
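
To make the missing signal concrete, here is a minimal sketch of the kind of cross-session check a governance layer could run if it knew which prior sessions each session's injected content came from. Everything in it is an assumption for illustration: `SessionEvent`, `find_escalation_chains`, and the `"filesystem"` / `"network"` category labels are hypothetical and come from neither MAF nor the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionEvent:
    """Hypothetical per-session record kept by a governance layer."""

    session_id: str
    tool_categories: frozenset[str]
    # Sessions whose persisted content was injected into this session.
    consumed_from: frozenset[str] = frozenset()


def find_escalation_chains(events: list[SessionEvent]) -> list[list[str]]:
    """Return chains where a file-system session feeds a later network session."""
    by_id = {event.session_id: event for event in events}
    chains: list[list[str]] = []
    for event in events:
        if "network" not in event.tool_categories:
            continue
        for origin in sorted(event.consumed_from):
            upstream = by_id.get(origin)
            if upstream is not None and "filesystem" in upstream.tool_categories:
                chains.append([*sorted(upstream.consumed_from), origin, event.session_id])
    return chains


events = [
    SessionEvent("s1", frozenset()),
    SessionEvent("s2", frozenset({"filesystem"}), frozenset({"s1"})),
    SessionEvent("s3", frozenset({"network"}), frozenset({"s2"})),
]
chains = find_escalation_chains(events)
```

The `consumed_from` field is exactly the information the framework does not surface today; without it, each of the three sessions looks independent.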

## Existing architecture — already mostly there

MAF's session model is already well-designed for this:

- `SessionContext.session_id` is first-class
- `HistoryProvider.get_messages(session_id, ...)` / `save_messages(session_id, ...)` are session-aware abstract methods ([`_sessions.py:458-493`](https://github.com/microsoft/agent-framework/blob/main/python/packages/core/agent_framework/_sessions.py#L458-L493))
- `Message.additional_properties["_attribution"]` is already used to attribute injected context to its source provider ([`_sessions.py:233-248`](https://github.com/microsoft/agent-framework/blob/main/python/packages/core/agent_framework/_sessions.py#L233-L248), [`:556-563`](https://github.com/microsoft/agent-framework/blob/main/python/packages/core/agent_framework/_sessions.py#L556-L563))
- The long-term memory system at `_harness/_memory.py` already tracks `MemoryTopicRecord.session_ids: list[str]` per consolidated topic ([`_memory.py:353`](https://github.com/microsoft/agent-framework/blob/main/python/packages/core/agent_framework/_harness/_memory.py#L353), [`:382`](https://github.com/microsoft/agent-framework/blob/main/python/packages/core/agent_framework/_harness/_memory.py#L382)) — the cross-session provenance data is **already captured**, just not propagated to the attribution layer.

The only gap is that the existing `_attribution` payload (keys `source_id` and `source_type`) does not include the originating `session_id`, so context observers cannot tell when a message they received came from a different session than the one they are running in.

## Proposed approach (additive, backward-compatible)

Extend the existing `_attribution` payload with an optional `origin_session_id: str | None` key. Update built-in providers and the harness memory system to populate it when returning content from a different session than the requesting one. Ship a sample `ContextProvider` observer in `samples/` demonstrating how a governance layer can detect cross-session injection.

Sketch:

```python
# _sessions.py: SessionContext.extend_messages
attribution = {"source_id": source_id, "source_type": type(source).__name__}
if origin_session_id is not None:
    attribution["origin_session_id"] = origin_session_id
```
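
A self-contained variant of the same idea, runnable outside the framework. The helper name `build_attribution` and the `"ExampleProvider"` value are hypothetical; the real change would live inside `SessionContext.extend_messages`, whose exact signature is not reproduced here.

```python
from __future__ import annotations

from typing import Any


def build_attribution(
    source_id: str,
    source_type: str,
    origin_session_id: str | None = None,
) -> dict[str, Any]:
    """Build the `_attribution` payload, adding the origin key only when known."""
    attribution: dict[str, Any] = {"source_id": source_id, "source_type": source_type}
    if origin_session_id is not None:
        attribution["origin_session_id"] = origin_session_id
    return attribution


# Existing callers get the same two-key payload as today.
legacy = build_attribution("memory", "ExampleProvider")
# Cross-session providers opt in by passing the originating session.
tagged = build_attribution("memory", "ExampleProvider", origin_session_id="session-1")
```

Omitting the key rather than writing `None` keeps the payload byte-for-byte identical for providers that never opt in.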

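```python
# Downstream observer (sample)
from agent_framework import ContextProvider  # assumed import path; adjust as needed


class CrossSessionObserver(ContextProvider):
    """Flags injected messages whose recorded origin differs from the current session."""

    async def before_run(self, *, agent, session, context, state):
        current = context.session_id
        for source_id, messages in context.context_messages.items():
            for msg in messages:
                # A message with no attribution or no origin is treated as same-session.
                attrib = (msg.additional_properties or {}).get("_attribution", {})
                origin = attrib.get("origin_session_id")
                if origin and origin != current:
                    self._on_cross_session_access(source_id, origin, current, msg)

    def _on_cross_session_access(self, source_id, origin, current, msg):
        # Placeholder hook: subclasses forward this to audit logging or policy checks.
        raise NotImplementedError
```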

No changes to the abstract `ContextProvider` / `HistoryProvider` interfaces. No new public types. The default behavior is unchanged: if a provider doesn't populate `origin_session_id`, observers flag nothing, because a message with no recorded origin is handled the same way as a same-session message, which preserves today's semantics.
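
The detection logic and the unchanged default can both be checked without the framework. The sketch below uses a stand-in message type; `StubMessage`, `StubProvider`, and `cross_session_hits` are hypothetical names, and only the `additional_properties["_attribution"]` layout is taken from this proposal.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class StubMessage:
    """Stand-in for a framework message; holds only what the observer reads."""

    text: str
    additional_properties: dict[str, Any] = field(default_factory=dict)


def cross_session_hits(current_session_id, context_messages):
    """Return (source_id, origin, message) for messages recorded in another session."""
    hits = []
    for source_id, messages in context_messages.items():
        for msg in messages:
            attrib = (msg.additional_properties or {}).get("_attribution", {})
            origin = attrib.get("origin_session_id")
            if origin and origin != current_session_id:
                hits.append((source_id, origin, msg))
    return hits


def _attrib(origin=None):
    # Build a payload in the proposed shape; the origin key is optional.
    payload = {"source_id": "memory", "source_type": "StubProvider"}
    if origin is not None:
        payload["origin_session_id"] = origin
    return {"_attribution": payload}


context_messages = {
    "memory": [
        StubMessage("written in an earlier session", _attrib("session-1")),
        StubMessage("written in this session", _attrib("session-2")),
        StubMessage("provider did not opt in", _attrib()),
    ]
}
hits = cross_session_hits("session-2", context_messages)
```

Only the first message is reported; the untagged message passes through silently, which is the backward-compatible behavior described above.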

## Scope proposed for the accompanying draft PR

1. Extend `SessionContext.extend_messages` to accept an optional `origin_session_id` parameter, threaded into the attribution dict
2. Update the harness memory system (`_harness/_memory.py`) to populate the field when injecting consolidated memories or transcripts from prior sessions
3. Sample observer in `samples/` (Python), README citing Dai et al.
4. Tests in `test_sessions.py` (attribution roundtrip) and `test_harness_memory.py` (origin populated correctly)
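
As a rough idea of what the attribution roundtrip tests in item 4 might assert, here is a sketch against a stand-in context. `StubSessionContext` and its `extend_messages` signature are assumptions made for illustration; the real `SessionContext` may take different arguments, and the final tests would target it directly.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class StubMessage:
    text: str
    additional_properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class StubSessionContext:
    """Hypothetical stand-in for SessionContext; not the framework class."""

    session_id: str
    context_messages: dict[str, list[StubMessage]] = field(default_factory=dict)

    def extend_messages(
        self,
        source_id: str,
        messages: list[StubMessage],
        *,
        origin_session_id: str | None = None,
    ) -> None:
        # Stamp each message with attribution, then store it under its source.
        for msg in messages:
            attribution = {"source_id": source_id, "source_type": "StubProvider"}
            if origin_session_id is not None:
                attribution["origin_session_id"] = origin_session_id
            msg.additional_properties["_attribution"] = attribution
        self.context_messages.setdefault(source_id, []).extend(messages)


def test_origin_roundtrip() -> None:
    ctx = StubSessionContext(session_id="session-2")
    ctx.extend_messages("memory", [StubMessage("recalled")], origin_session_id="session-1")
    attrib = ctx.context_messages["memory"][0].additional_properties["_attribution"]
    assert attrib["origin_session_id"] == "session-1"


def test_origin_absent_by_default() -> None:
    ctx = StubSessionContext(session_id="session-2")
    ctx.extend_messages("history", [StubMessage("hello")])
    attrib = ctx.context_messages["history"][0].additional_properties["_attribution"]
    assert "origin_session_id" not in attrib
```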

Estimated ~25-50 LOC in core, plus the sample and tests. Backward-compatible (additive). No new public API beyond the new attribution key and the optional `extend_messages` parameter that carries it.

The accompanying PR will be marked **draft** explicitly to invite API-shape discussion before merge — per CONTRIBUTING's guidance on not surprising maintainers with new APIs. Happy to revise the shape based on any direction below.

## Open API questions for maintainer input

1. **Attribution key naming.** `origin_session_id` is one option; `source_session_id` mirrors `source_id` more closely. Preference?
2. **Threading mechanism.** Should `extend_messages` take a parameter, or should providers set the field directly on `msg.additional_properties["_attribution"]` after they call `extend_messages`? Parameter is more discoverable; direct-set is simpler and matches how `_split_service_call_messages` reads attribution.
3. **Built-in providers.** Should `InMemoryHistoryProvider` / `FileHistoryProvider` populate the field? They're session-scoped by construction so it should always equal `session_id` when set, but populating consistently helps observers reason about absence (no field = no info; field present and equal = same-session; field present and different = cross-session).
4. **Typed shape.** Whether to extend the attribution dict in-place (current proposal) or introduce a typed `MessageAttribution` dataclass. Latter is cleaner long-term but a bigger surface change.
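
To help weigh questions 3 and 4, here is one possible shape for the typed alternative, including the three-way reading of the field from question 3. This is a sketch under stated assumptions, not a proposed final API: `MessageAttribution` is the name floated above, while `Origin`, `from_dict`, `to_dict`, and `classify` are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Any, Mapping


class Origin(Enum):
    UNKNOWN = "unknown"  # field absent: the provider gave no information
    SAME_SESSION = "same_session"  # field present and equal to the current session
    CROSS_SESSION = "cross_session"  # field present and different


@dataclass(frozen=True)
class MessageAttribution:
    source_id: str
    source_type: str
    origin_session_id: str | None = None

    @classmethod
    def from_dict(cls, raw: Mapping[str, Any]) -> MessageAttribution:
        """Read the existing dict payload; the origin key is optional."""
        return cls(raw["source_id"], raw["source_type"], raw.get("origin_session_id"))

    def to_dict(self) -> dict[str, Any]:
        """Write back the dict form, omitting the origin key when unset."""
        out: dict[str, Any] = {"source_id": self.source_id, "source_type": self.source_type}
        if self.origin_session_id is not None:
            out["origin_session_id"] = self.origin_session_id
        return out

    def classify(self, current_session_id: str) -> Origin:
        if self.origin_session_id is None:
            return Origin.UNKNOWN
        if self.origin_session_id == current_session_id:
            return Origin.SAME_SESSION
        return Origin.CROSS_SESSION


legacy = MessageAttribution.from_dict({"source_id": "history", "source_type": "StubProvider"})
recalled = MessageAttribution("memory", "StubProvider", origin_session_id="session-1")
```

Because `to_dict` round-trips to the current payload, a typed wrapper like this could be layered on later without changing what is stored on the message.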

## Related

- [microsoft/agent-framework#5472](https://github.com/microsoft/agent-framework/issues/5472) — adjacent session-isolation issue; complementary surface.
- AGT-side companion issue: [microsoft/agent-governance-toolkit#2374](https://github.com/microsoft/agent-governance-toolkit/issues/2374) — documents the downstream impact in Microsoft's governance toolkit.

## Disclosure note

The Dai et al. paper is publicly available on arXiv (preprint May 2026) and the authors' Ethics Statement (Appendix J) explicitly notes the work was published openly under a "supply-chain barrier" rationale rather than via coordinated disclosure (contrast: the concurrent LoopTrap paper, arXiv:2605.05846, explicitly disclosed to surveyed framework maintainers pre-submission). This issue is the first MAF-side reference to Dai et al. I could find via `gh search`; if there's existing coordination I missed, happy to redirect.

---

Surfaced during an independent audit conducted by @finnoybu (Ken Tannenbaum, AEGIS Initiative); [MEDIUM, python/packages/core].

