chat(): duplicate TOOL_CALL_END (no preceding START) for server-executed tools breaks AG-UI verify #519

@AlemTuzlak

Summary

chat() emits a duplicate TOOL_CALL_END event (with no preceding TOOL_CALL_START) for every server-executed tool. The first TOOL_CALL_END comes from the adapter during streaming; the second comes from buildToolResultChunks() after server execution.

This violates the AG-UI streaming contract — @ag-ui/client's verifyEvents middleware (and any spec-strict consumer) treats a TOOL_CALL_END not preceded by a matching TOOL_CALL_START as a protocol error and rejects the stream.
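To make the pairing rule concrete, here is a minimal sketch of the kind of check a strict consumer might run. It is not the actual verifyEvents implementation from @ag-ui/client; the `ToolEvent` type and `findOrphanToolCallEnd` helper are hypothetical names used only for illustration, and the event shape is reduced to the two fields the check needs.

```typescript
// Simplified event shape (assumption): only the fields needed for the pairing check.
type ToolEvent = {
  type: "TOOL_CALL_START" | "TOOL_CALL_ARGS" | "TOOL_CALL_END" | "TOOL_CALL_RESULT";
  toolCallId: string;
};

// Returns the index of the first TOOL_CALL_END that has no open
// TOOL_CALL_START for the same toolCallId, or -1 if every END is paired.
function findOrphanToolCallEnd(events: ToolEvent[]): number {
  const open = new Set<string>(); // tool calls started but not yet ended
  for (let i = 0; i < events.length; i++) {
    const event = events[i];
    if (event.type === "TOOL_CALL_START") {
      open.add(event.toolCallId);
    } else if (event.type === "TOOL_CALL_END") {
      if (!open.has(event.toolCallId)) return i;
      open.delete(event.toolCallId);
    }
  }
  return -1;
}

// The sequence described in this issue: the second END has no matching START.
const observed: ToolEvent[] = [
  { type: "TOOL_CALL_START", toolCallId: "call_1" },
  { type: "TOOL_CALL_ARGS", toolCallId: "call_1" },
  { type: "TOOL_CALL_END", toolCallId: "call_1" },
  { type: "TOOL_CALL_END", toolCallId: "call_1" },
  { type: "TOOL_CALL_RESULT", toolCallId: "call_1" },
];

console.log(findOrphanToolCallEnd(observed)); // 3
```

The first END closes the call opened by the START, so the second END at index 3 is the one a strict verifier would reject.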

Reproducible on @tanstack/ai@0.14.0 (latest). I verified the same code path is on main.

Reproducer

import { chat, toolDefinition } from "@tanstack/ai";
import { openaiText } from "@tanstack/ai-openai";
import { z } from "zod";

const weatherTool = toolDefinition({
  name: "getWeather",
  description: "Get the weather for a city",
  inputSchema: z.object({ city: z.string() }),
}).server(async ({ city }) => ({ city, tempC: 21 }));

const stream = chat({
  adapter: openaiText("gpt-4o"),
  messages: [{ role: "user", content: "What's the weather in Paris?" }],
  tools: [weatherTool],
});

// Count events per toolCallId to see the duplicate END.
const counts = new Map<string, Record<string, number>>();
for await (const chunk of stream) {
  const id = (chunk as any).toolCallId;
  if (!id) continue;
  const c = counts.get(id) ?? {};
  c[chunk.type] = (c[chunk.type] ?? 0) + 1;
  counts.set(id, c);
}
console.log(counts);
// Observed:
// Map {
//   "<id>" => { TOOL_CALL_START: 1, TOOL_CALL_ARGS: N, TOOL_CALL_END: 2, TOOL_CALL_RESULT: 1 }
// }
// Expected: TOOL_CALL_END: 1

The same duplicate END appears for undiscoveredLazyResults (see processToolCalls in chat/index.ts, around the buildToolResultChunks(undiscoveredLazyResults, finishEvt) call).

Root cause

In packages/typescript/ai/src/activities/chat/index.ts, buildToolResultChunks(results, finishEvent, argsMap?) always pushes a TOOL_CALL_END chunk, but only pushes TOOL_CALL_START + TOOL_CALL_ARGS when argsMap is provided:

https://github.com/TanStack/ai/blob/main/packages/typescript/ai/src/activities/chat/index.ts#L1198-L1240

private buildToolResultChunks(
  results: Array<ToolResult>,
  finishEvent: RunFinishedEvent,
  argsMap?: Map<string, string>,
): Array<StreamChunk> {
  const chunks: Array<StreamChunk> = []
  for (const result of results) {
    const content = JSON.stringify(result.result)

    if (argsMap) {
      chunks.push({ type: 'TOOL_CALL_START', /* ... */ })
      chunks.push({ type: 'TOOL_CALL_ARGS',  /* ... */ })
    }

    chunks.push({ type: 'TOOL_CALL_END',    /* ... */ })  // <-- always
    chunks.push({ type: 'TOOL_CALL_RESULT', /* ... */ })
    // ...
  }
}

There are five call sites. Two paths pass argsMap (and are spec-clean), three don't:

| Site | Path | argsMap | Result |
| --- | --- | --- | --- |
| index.ts ~L767 | checkForPendingToolCalls → undiscovered lazy | no | duplicate END |
| index.ts ~L847 / ~L874 | checkForPendingToolCalls → continuation re-execution | yes (added in #372) | OK |
| index.ts ~L924 | processToolCalls → undiscovered lazy | no | duplicate END |
| index.ts ~L1003 | processToolCalls → mixed approval / client + executed results | no | duplicate END |
| index.ts ~L1029 | processToolCalls → normal post-execution | no | duplicate END |

#372 (0.10.2, Emit TOOL_CALL_START and TOOL_CALL_ARGS for pending tool calls during continuation re-executions) fixed two of these by threading argsMap through. The other three were left as-is and still produce the orphan TOOL_CALL_END.
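The effect of the optional argsMap can be reproduced in isolation. The sketch below is a simplified stand-in for buildToolResultChunks, not the library code: `SketchChunk`, `SketchResult`, and `buildToolResultChunksSketch` are hypothetical names, the chunk payloads are reduced to type and toolCallId, and the adapter output is hard-coded.

```typescript
// Simplified chunk and result shapes (assumptions, not the real library types).
type SketchChunk = {
  type: "TOOL_CALL_START" | "TOOL_CALL_ARGS" | "TOOL_CALL_END" | "TOOL_CALL_RESULT";
  toolCallId: string;
};
type SketchResult = { toolCallId: string; result: unknown };

// Mirrors the branching shown in the excerpt: START/ARGS only when
// argsMap is provided, END and RESULT always.
function buildToolResultChunksSketch(
  results: SketchResult[],
  argsMap?: Map<string, string>,
): SketchChunk[] {
  const chunks: SketchChunk[] = [];
  for (const result of results) {
    if (argsMap) {
      chunks.push({ type: "TOOL_CALL_START", toolCallId: result.toolCallId });
      chunks.push({ type: "TOOL_CALL_ARGS", toolCallId: result.toolCallId });
    }
    chunks.push({ type: "TOOL_CALL_END", toolCallId: result.toolCallId });
    chunks.push({ type: "TOOL_CALL_RESULT", toolCallId: result.toolCallId });
  }
  return chunks;
}

// What the adapter already streamed for this tool call (hard-coded here).
const adapterChunks: SketchChunk[] = [
  { type: "TOOL_CALL_START", toolCallId: "call_1" },
  { type: "TOOL_CALL_ARGS", toolCallId: "call_1" },
  { type: "TOOL_CALL_END", toolCallId: "call_1" },
];
const results: SketchResult[] = [
  { toolCallId: "call_1", result: { city: "Paris", tempC: 21 } },
];

// Normal post-execution path: no argsMap, so an unpaired END is appended.
const merged = [...adapterChunks, ...buildToolResultChunksSketch(results)];
const startCount = merged.filter((c) => c.type === "TOOL_CALL_START").length;
const endCount = merged.filter((c) => c.type === "TOOL_CALL_END").length;
console.log({ startCount, endCount }); // { startCount: 1, endCount: 2 }

// Continuation path: argsMap supplied, so the builder's own output is paired.
const withArgs = buildToolResultChunksSketch(
  results,
  new Map([["call_1", '{"city":"Paris"}']]),
).map((c) => c.type);
console.log(withArgs);
```

In the sketch, the unpaired END comes entirely from the builder running without argsMap after the adapter has already closed the call.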

The duplicate is emitted inside iteration 1 of the agent loop (in the executeToolCalls cyclePhase), so agentLoopStrategy: maxIterations(1) does not work around it: shouldContinue() hardcodes if (this.cyclePhase === 'executeToolCalls') return true, and the duplicate is emitted before any loop strategy is consulted.

Why it matters

Any AG-UI-spec-strict consumer rejects the stream. Concretely, CopilotKit's runtime pipes the chat() output through @ag-ui/client's verifyEvents middleware, which throws on TOOL_CALL_END without a matching TOOL_CALL_START. We're currently working around this in @copilotkit/runtime by stopping conversion at the first RUN_FINISHED (CopilotKit#4476), but that's a consumer-side guard and it discards real events from the second iteration of multi-turn agentic runs.

The same problem will surface for any other AG-UI-strict consumer (anything wired to @ag-ui/client's verifier, including AG-UI's own dev tooling).
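For illustration, a consumer-side guard of the kind described above could look roughly like this. It is a sketch and not the CopilotKit code; `createFirstRunGate` is a hypothetical helper, and it assumes, as the description above implies, that the orphan TOOL_CALL_END arrives after the first RUN_FINISHED.

```typescript
// Returns a predicate that accepts chunks up to and including the first
// RUN_FINISHED and rejects everything after it.
function createFirstRunGate(): (chunk: { type: string }) => boolean {
  let finished = false;
  return (chunk) => {
    if (finished) return false;
    if (chunk.type === "RUN_FINISHED") finished = true;
    return true;
  };
}

// Intended use inside a streaming loop (forward is a placeholder):
//   const gate = createFirstRunGate();
//   for await (const chunk of stream) {
//     if (!gate(chunk)) break;
//     forward(chunk);
//   }

// Hard-coded sample of chunk types, with the orphan END after RUN_FINISHED.
const sampleTypes = [
  "TOOL_CALL_START",
  "TOOL_CALL_END",
  "RUN_FINISHED",
  "TOOL_CALL_END",
  "TOOL_CALL_RESULT",
];
const gate = createFirstRunGate();
const kept = sampleTypes.filter((type) => gate({ type }));
console.log(kept); // [ 'TOOL_CALL_START', 'TOOL_CALL_END', 'RUN_FINISHED' ]
```

The sample also shows the cost of this approach: the TOOL_CALL_RESULT after RUN_FINISHED is dropped along with the orphan END.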

Proposed fix

The post-execution TOOL_CALL_END is redundant — the adapter already emitted it during streaming for the same toolCallId. The post-execution phase only needs to emit the new information: TOOL_CALL_RESULT (and the assistant-message bookkeeping that already happens after).

Options, in order of preference:

  1. Drop the TOOL_CALL_END push from buildToolResultChunks. The adapter is already responsible for START / ARGS / END; buildToolResultChunks should only contribute TOOL_CALL_RESULT. This fixes all five call sites with one change and is semantically the most correct.
  2. Keep the current shape but emit a synthetic TOOL_CALL_START whenever TOOL_CALL_END is emitted — i.e. always pass an argsMap (build one from the executed ToolCall[] if the caller didn't). This keeps streams "self-contained" but adds duplicate START/ARGS events in the normal post-execution path, which is its own protocol-shape problem.
  3. Document that chat() output is not AG-UI-spec-conformant and require consumers to filter post-RUN_FINISHED events themselves.

Option 1 is what I'd ship. Happy to open a PR if you agree on the shape.
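As a rough sketch of what option 1 amounts to, the builder would contribute only the result chunk. The names below (`ResultOnlyChunk`, `ExecutedResult`, `buildResultOnlyChunks`) are hypothetical and the shapes are simplified; the real function also takes finishEvent and does further bookkeeping that is omitted here.

```typescript
// Simplified shapes (assumptions, not the real library types).
type ResultOnlyChunk = {
  type: "TOOL_CALL_START" | "TOOL_CALL_ARGS" | "TOOL_CALL_END" | "TOOL_CALL_RESULT";
  toolCallId: string;
  content?: string;
};
type ExecutedResult = { toolCallId: string; result: unknown };

// Option 1: emit only TOOL_CALL_RESULT; START / ARGS / END stay with the adapter.
function buildResultOnlyChunks(results: ExecutedResult[]): ResultOnlyChunk[] {
  return results.map((result) => ({
    type: "TOOL_CALL_RESULT" as const,
    toolCallId: result.toolCallId,
    content: JSON.stringify(result.result),
  }));
}

// Adapter output for one tool call (hard-coded), followed by the builder output.
const streamed: ResultOnlyChunk[] = [
  { type: "TOOL_CALL_START", toolCallId: "call_1" },
  { type: "TOOL_CALL_ARGS", toolCallId: "call_1" },
  { type: "TOOL_CALL_END", toolCallId: "call_1" },
];
const fixedStream = [
  ...streamed,
  ...buildResultOnlyChunks([
    { toolCallId: "call_1", result: { city: "Paris", tempC: 21 } },
  ]),
];

const fixedEndCount = fixedStream.filter((c) => c.type === "TOOL_CALL_END").length;
console.log(fixedEndCount); // 1
console.log(fixedStream[fixedStream.length - 1].content); // {"city":"Paris","tempC":21}
```

The call sites that currently pass argsMap would need a second look under this shape, since they rely on the builder to emit START and ARGS for calls the adapter did not stream in the current run.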

Versions

  • @tanstack/ai: 0.14.0
  • @tanstack/ai-openai: 0.8.2
  • Node: 20.x
  • Adapter: openaiText("gpt-4o")
