Skip to content

[pull] main from danny-avila:main#79

Merged
pull[bot] merged 8 commits into
innFactory:mainfrom
danny-avila:main
May 8, 2026
Merged

[pull] main from danny-avila:main#79
pull[bot] merged 8 commits into
innFactory:mainfrom
danny-avila:main

Conversation

@pull
Copy link
Copy Markdown

@pull pull Bot commented May 8, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

…v.0) (#148)

* 🧱 refactor: storage_session_id rename + kind discriminator (3.1.80-dev.0)

Combines Phase B (per-file session_id rename) and Phase C (kind
discriminator for designed cross-user-within-tenant caching) in one
cutover. Pre-release simplifies coordination — no wire-compat aliases.

Type changes:
- `CodeEnvFile`: per-file `session_id` → `storage_session_id`. Adds
  `kind: 'skill' | 'agent' | 'user' | 'system'` (required) and
  `version?: number` (used for `kind: 'skill'`). New `CodeEnvKind`
  type alias exported.
- `FileRef`: same `storage_session_id` rename. `kind?` and `version?`
  optional (legacy passthroughs may not carry them).
- Top-level `session_id` on `CodeSessionContext` / `ExecuteResult` /
  artifact / response shapes is unchanged — that's the canonical
  execution-session name and a future engine that uses one session
  concept lands on it directly.

ToolNode propagates `kind` and `version?` through:
- `runTool` injection (`_injected_files`)
- `getCodeSessionContext` (event-driven path)
- `updateCodeSession` and `storeCodeSessionFromResults` (already
  pass-through via spread; new optional fields flow naturally)

Defaults:
- ToolNode and the `/files` fallback tag missing-kind files as
  `kind: 'user'`. Most ad-hoc files are user-private; shared
  resources (`skill`, `agent`) populate their kind upstream in LC.

Tests:
- 787 / 787 tools tests pass; tsc clean.
- 20 / 20 in `ToolNode.session.test.ts` (per-file storage and kind
  pinned at every boundary).

Coordination:
- Companion: LibreChat #12960 (sets `kind` at every write site,
  threads `skill.version` through skill primes), codeapi #1455
  (rewrites `resolveSessionKey` as a kind-switch with
  `authContext.tenantId` prefix). All three deploy in lockstep —
  pre-release, no aliases retained.

* 🧱 refactor: discriminated-union CodeEnvFile + JSDoc id asymmetry

Tightens the per-file ref contract codeapi expects on `_injected_files`:

- `CodeEnvFile` is now a discriminated union on `kind`. `version` is
  statically required for `kind: 'skill'` and statically forbidden for
  `agent` / `user`. The constraint holds at compile time on every
  consumer, not just at codeapi's runtime validator boundary.
- `CODE_ENV_KINDS` const-tuple keeps the runtime list and the TS union
  locked together — adding a new kind to the tuple updates both at
  once, so a partial update won't silently widen one side.
- JSDoc on `id` calls out the per-kind asymmetry: skill UUID for
  `'skill'`, agent id for `'agent'`, informational-only for `'user'`
  (codeapi resolves user identity from the auth context). Future
  readers seeing `RequestFile { id, kind: 'user' }` shouldn't assume
  `id` routes anywhere.

ToolNode collapses two near-identical CodeEnvFile constructors
(the runTool-injection path and the event-driven `getCodeSessionContext`
path) onto a single `toInjectedFileRef` helper. The helper does the
discriminated-union narrowing in one place instead of duplicating the
"if kind===skill set version" logic at every call site, where the
previous shape — `const ref = { ... }; if (file.version != null) ref.version = file.version;` —
no longer typechecks against the union variants. Skill refs missing
`version` fall back to `'user'` so an upstream contract bug surfaces
as a degraded sessionKey rather than a runtime crash; primeSkillFiles
is the only writer and always sets `version`.

Companion: LibreChat #12960 (mirrors the discriminated union on
`CodeEnvRef`, drops `entity_id` everywhere). codeapi #1455 already
validates the same shape on the wire.
…ey) (#154)

Companion to codeapi PR #1455 review fix. The single `id` field on
`CodeEnvFile` / `FileRef` was overloaded for two incompatible
purposes when codeapi's auth layer received it:

1. **SessionKey resolution** — `<tenant>:<kind>:<id>:v:<version>`
   wants the RESOURCE id (skill `_id`, agent id).
2. **Upload-existence lookup** — `upload:<sessionKey><storage_session_id><id>`
   wants the STORAGE id (the file_server's per-file nanoid).

These are different values for shared kinds — a skill file's resource
id is the skill's mongo ObjectId, but the storage id is a 21-char
nanoid the file_server generated at upload time. With them collapsed
onto a single field, codeapi rejected every shared-kind `/exec`
authorization with `session_key_mismatch` because the sessionKey
re-derivation used the storage nanoid where the resource id was
required.

## What changed

- `CodeEnvFile.id` keeps its existing meaning (storage file_id —
  matches worker-side `TFile.id` storage URL contract, no churn on
  the api/ workspace).
- New `CodeEnvFile.resource_id` carries the entity-that-owns-this-
  file's-session identity. Required on the input wire shape;
  optional on `FileRef` (artifact responses don't always carry
  resource provenance).
- `toInjectedFileRef` defaults `resource_id` to `id` when input
  doesn't supply it — informational for `kind: 'user'` and a graceful
  degraded path for unmigrated callers (sessionKey will mismatch
  on shared kinds rather than crash).
- `/files` fallback in `CodeExecutor` / `BashExecutor` /
  `ProgrammaticToolCalling` sets `resource_id` alongside `id`
  for code-output files (informational for `'user'` kind anyway).

## Test plan

- [x] `npx tsc --noEmit` clean
- [x] `npx jest src/tools src/types` — 788 / 788 pass (existing
  ToolNode.session tests updated to assert the new `resource_id`
  field on _injected_files / codeSessionContext)
* fix: preserve resource identity on inherited file echoes (skill 403 regression)

After a successful skill prime + first /exec, a follow-up /exec
rejected with codeapi `session_key_mismatch` 403:

    cached:    legacy:skill:69dcf561...:v:59
    resolved:  legacy:user:local-test-user
    file:      kind=user, resource_id=<storage_id>

The worker `inherited: true` echo carries only `(id, name,
storage_session_id)` — the sandbox doesn't re-attest to the
ownership identity (`kind`, `resource_id`, `version`), which was
signed at upload. `updateCodeSession` was matching on `name` alone
and overwriting the prior entry verbatim from the echo, so the
identity fields fell off the in-memory CodeSessionContext.

On the next turn, `toInjectedFileRef` sees `kind: undefined` →
falls back to `'user'`, sees `resource_id: undefined` → falls back
to `file.id` (the storage uuid). codeapi then derives
`legacy:user:<authContext.userId>` and 403s against the cached
skill sessionKey.

Fix: merge worker echoes onto the prior entry by
`(storage_session_id, id)`. Echo overlays only the fields it
actually owns (`inherited` flag, normalized name, etc.); identity
fields persist.

## Test plan

Two regression tests added to `storeCodeSessionFromResults`:
- worker echo of a `kind: 'skill'` file preserves
  kind/resource_id/version
- genuinely new files (no prior entry) land without identity
  fields, leaving `toInjectedFileRef` to default `kind: 'user'`

22/22 ToolNode.session tests pass; tsc clean.

* refactor: consolidate updateCodeSession to two passes, terse comments

* chore: bump version to 3.1.80-dev.2
* fix: remove broken /files/<session_id> HTTP fallback (silent post-Phase C failure)

The legacy fallback in `CodeExecutor`, `BashExecutor`,
`ProgrammaticToolCalling`, and `BashProgrammaticToolCalling` fired
when `_injected_files` was empty AND `session_id` was set, fetching
`/files/<session_id>?detail=full` from codeapi to recover the
session's file list.

Post-Phase C, codeapi's `sessionAuth` middleware on that route
requires `?kind=...&id=...&version=...` query params (parsed via
`parseUploadSessionKeyInput`). The tool layer doesn't have the
resource identity at this point, so every fetch 400'd and the catch
block silently swallowed it. Net effect: a "resilience" path that
hadn't worked since the rename, masking real upstream bugs because
the throw was caught and the request continued with empty files.

That silent failure almost masked a separate LC chat-attach upload
bug (codeapi `/upload` field-name regression) by emitting noise on
the broken fallback path while the actual file priming chain was
the real culprit. Removing the fallback keeps the failure mode
visible: empty `_injected_files` now logs at debug level instead of
firing a doomed HTTP fetch.

`fetchSessionFiles` stays exported from `ProgrammaticToolCalling`
for back-compat with external consumers, but is no longer called
internally. External callers should expect the same 400 — that's a
codeapi-side concern, not the tool layer's to paper over.

Bumps to 3.1.80-dev.3.

## Test plan

- [x] `npx jest src/tools/__tests__` — 673/696 pass (23 skipped, 0 fail)
- [x] `npx tsc --noEmit` — clean
- [x] `npx eslint` on the four touched files — clean

* refactor: extract renderExecutionFileSummary helper, dedupe across executors

Code/Bash executors and PTC's `formatCompletedResponse` each
inlined the same "split files by `inherited` flag, render each
section, append LLM-facing notes" logic and triplicated 7 string
constants (`otherMessage`, `inheritedFileMessage`, `accessMessage`,
`emptyOutputMessage`, `inheritedFilesHeader`, `generatedFilesHeader`,
`inheritedNote`). Single source of truth now lives in
`CodeExecutor.ts` as `renderExecutionFileSummary` + the constants;
`BashExecutor` and `ProgrammaticToolCalling` import.

Net diff: -142 / +62 (≈80 lines removed). Future LLM-facing copy
changes are a one-site edit instead of three.

`renderFileSection` stays exported for any external consumers who
prefer single-section rendering; internal callers all go through
the new `renderExecutionFileSummary` wrapper.

## Test plan

- [x] `npx jest src/tools/__tests__` — 673/696 pass (23 skipped pre-existing, 0 fail)
- [x] `npx tsc --noEmit` — clean
- [x] `npx eslint` on the four touched files — clean

* refactor: drop post-execution file summary from LLM-facing tool output

The "Generated files: ..." / "Available files (inputs, ...): ..."
block appended to every code-execution tool result was actively
harmful to model behavior:

- The prescriptive copy ("File is already downloaded by the user",
  "Available as an input", etc.) led models to invent paths like
  /mnt/user-data/uploads/ when chasing an "original" they imagined
  from the listing.
- The inherited-vs-generated split forced a categorization the
  model often disagreed with (e.g. a script that mutates an input
  in place produces a file the listing classed as "inherited" but
  the model thought of as "generated").
- For bash specifically — the most natural mode is to `ls
  /mnt/data/` and discover what's actually on disk. The summary
  shadowed that with a stale, opinionated take.

Strip the entire summary from the LLM-facing tool result. The
artifact still carries every file (with `inherited` flag intact)
so the host's session-tracking layer (`storeCodeSessionFromResults`,
`updateCodeSession`, `_injected_files` plumbing) keeps working
unchanged. The model now sees stdout/stderr only — and explores
the filesystem the same way it would in any unfamiliar shell.

Removes:
- `renderExecutionFileSummary` (added in the prior refactor)
- `renderFileSection` (now unused, was exported)
- `imageExtRegex`, `imageMessage` (helpers for `renderFileSection`)
- 6 message constants: `otherMessage`, `inheritedFileMessage`,
  `accessMessage`, `inheritedFilesHeader`, `generatedFilesHeader`,
  `inheritedNote`

Keeps:
- `emptyOutputMessage` (the empty-stdout hint, which IS useful —
  a blank tool result is otherwise indistinguishable from a hung
  call)
- `getCodeBaseURL` (independent concern)
- The static schema description text mentioning `/mnt/data/` —
  that's how the model learns where files live in general; it's
  the per-call summary that was noise

## Test plan

- [x] `npx jest src/tools/__tests__` — 668/691 pass (23 skipped pre-existing, 0 fail)
- [x] `npx tsc --noEmit` — clean
- [x] `npx eslint` — clean
Cross-provider conversations carrying OpenAI Responses-style tool-call
IDs (e.g. \`fc_...|call_...\`) hit a 400 from Anthropic on replay because
the IDs contain \`|\` and can exceed 64 chars, violating Anthropic's
\`^[a-zA-Z0-9_-]+$\` and length constraints.

Adds \`normalizeAnthropicToolCallId\` and applies it at the four sites
that emit IDs to the wire — \`_convertLangChainToolCallToAnthropic\` and
the three \`tool_result.tool_use_id\` constructions in
\`_ensureMessageContents\`. Server tool IDs (\`srvtoolu_\` prefix) are
left untouched.

Non-compliant inputs are sanitized then suffixed with a 10-hex-char
SHA-256 prefix of the *original* ID, so two long IDs that share a
64-char prefix (or two short IDs differing only by an invalid char like
\`|\` vs \`.\`) still produce distinct outputs — Anthropic's "tool_use ids
must be unique" check stays satisfied. The function is pure and
deterministic so paired \`tool_use.id\` and \`tool_result.tool_use_id\`
remain matched without a session map.

Verified live against the Anthropic API: previously-rejected payloads
now succeed, including the colliding-prefix case.
#155)

@langchain/google-common@2.1.29 swapped `require("uuid")` for
`require("@langchain/core/utils/uuid")` (#10776, "remediate uuid@10
vulnerability"). @langchain/core@1.1.42+ is built with Rolldown, which
miscompiles `export { default as v4 } from "./v4.js"` to
`exports.v4 = require_v4` (the namespace) instead of
`exports.v4 = require_v4.default` (the function). Result: any Gemini
function/tool call crashes with `(0, ...uuid.v4) is not a function`
through `partsToToolsRaw → functionCallPartToToolRaw`.

Upstream tracking — both still open as of this commit:
- rolldown/rolldown#9299 (Rolldown CJS emit)
- langchain-ai/langchainjs#10827 (the symptom)
- langchain-ai/langchainjs#10832 (proposed core fix)

Reported in danny-avila/LibreChat#13003.

What changed:
- Pin @langchain/google-{common,gauth,genai,vertexai} from 2.1.30 to
  2.1.28, the last release that uses `require("uuid")` directly. The
  *only* runtime delta in 2.1.29 for these packages is a one-line import
  swap in gemini.ts and media_core.ts; 2.1.30 had no google-common code
  changes (peer-dep version bump only).
- Add overrides for those four packages so transitive consumers can't
  pull 2.1.29+ back in.
- Add `"uuid": "$uuid"` override forcing the entire tree to this
  package's uuid@^11.1.1 — closes the same uuid@10 deprecation concern
  that motivated 2.1.29 upstream, without the broken code path.

Drop these pins/overrides once @langchain/core ships the fix in #10832.
#157)

* 🧠 fix: Include Gemini Reasoning Tokens in Vertex Stream Usage

`@langchain/google-common`'s streaming `_streamResponseChunks` builds
`usage_metadata` inline as `output_tokens = candidatesTokenCount` and
drops `thoughtsTokenCount` entirely (`chat_models.cjs:178-182`). For
thinking models this undercounts billed output by the reasoning tokens
and breaks the documented `total_tokens === input_tokens + output_tokens`
invariant. The non-stream `_generate` path uses `responseToUsageMetadata`
and is correct.

`CustomChatGoogleGenerativeAI` already does the right thing for the
Google API path by overriding `_convertToUsageMetadata`. This commit
mirrors that for Vertex by wrapping the parent's stream generator and
folding `output_token_details.reasoning` back into `output_tokens` —
but only when `total - input > output`, so cases where reasoning is
already a subset of output (OpenAI-style) are no-ops.

Reported in danny-avila/LibreChat#13006.

- `repairStreamUsageMetadata` exported for unit testing.
- 6 unit tests cover: streaming undercount fix, no-op when already
  correct, missing/zero reasoning, undefined usage, double-counting
  guard via the gap check.

* 🩹 fix: replace buggy fallback usage_metadata with tracked good value

Initial fix tried to repair `chunk.message.usage_metadata` in place by
folding in `output_token_details.reasoning`. Live testing showed that
fallback usage_metadata is constructed without `output_token_details`
at all, so the in-place repair couldn't find the reasoning count.

Real flow inside `_streamResponseChunks` (debugged with live Vertex):
- Streaming chunks set `chunk.generationInfo.usage_metadata` with
  `output_tokens = candidatesTokenCount + thoughtsTokenCount` and
  `output_token_details.reasoning` populated.
- Trailing fallback chunk (`output === null`) sets
  `chunk.message.usage_metadata` inline with
  `output_tokens = candidatesTokenCount` and NO output_token_details.
- After AIMessageChunk.concat the final usage_metadata is the buggy
  fallback (chunk 0 contributes 0).

Repair: track the last-seen `generationInfo.usage_metadata`, and when
the fallback chunk's `message.usage_metadata` arrives, replace it
wholesale. Guarded so we only replace when the candidate has a larger
output and the same total — defensive against stale state.

Live verified against gemini-3-flash-preview:
- Without fix: output_tokens=135 (text only), invariant broken
- With fix:    output_tokens=333 (text+reasoning), invariant holds

Updated unit tests for the new helper signature; added a live
integration test asserting the invariant.
@pull pull Bot locked and limited conversation to collaborators May 8, 2026
@pull pull Bot added the ⤵️ pull label May 8, 2026
@pull pull Bot merged commit 604e97c into innFactory:main May 8, 2026
1 check passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant