
Add LangSmith tracing sample #292

Draft
xumaple wants to merge 11 commits into main from maplexu/langsmith-sample

Conversation

Contributor

@xumaple xumaple commented Apr 17, 2026

Summary

  • Adds langsmith_tracing/ sample demonstrating temporalio.contrib.langsmith.LangSmithPlugin
  • basic/: one-shot LLM workflow (prompt → OpenAI activity → result)
  • chatbot/: conversational loop with save_note/read_note tools, signals, queries, and dynamic LangSmith trace names
  • Includes mocked tests, README, and langsmith-tracing dependency group
  • Declares langchain and langsmith-tracing as conflicting groups in [tool.uv] to avoid version conflicts
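The conflicting-groups declaration mentioned in the last bullet would look roughly like this in `pyproject.toml` (a sketch of uv's conflict syntax; check the repo's actual file for the exact group names):

```toml
[tool.uv]
# The langchain group's pins clash with the langsmith-tracing group,
# so tell uv the two dependency groups may never be resolved together.
conflicts = [
    [
        { group = "langchain" },
        { group = "langsmith-tracing" },
    ],
]
```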

Test plan

  • poe format — passes
  • mypy on langsmith_tracing/ and tests/langsmith_tracing/ — passes
  • pytest tests/langsmith_tracing/ — basic test passes, chatbot tests skip (pending SDK plugin release)
  • Manual test: run worker + starter for both basic and chatbot examples

🤖 Generated with Claude Code

xumaple and others added 11 commits April 17, 2026 16:17
Demonstrates the LangSmithPlugin for automatic LangSmith tracing in
Temporal workflows. Includes two examples:
- basic/: one-shot LLM workflow (prompt → OpenAI → result)
- chatbot/: conversational loop with save_note/read_note tools,
  signals, queries, and dynamic trace names

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add @Traceable with metadata, tags, and run_type variety to workflows
- Add client-side @Traceable on starters for end-to-end trace linking
- Add --temporal-runs flag to workers and starters for add_temporal_runs toggle
- Use run_type="llm" for OpenAI calls, "tool" for save_note, "chain" for orchestration
- Add trace tree diagrams to README showing both add_temporal_runs modes
- Add "Three Layers of Tracing" section to README (wrap_openai, @Traceable, Temporal plugin)
- Skip all tests when temporalio.contrib.langsmith is unavailable (pending SDK release)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Split top-level README into per-sample READMEs (basic/ and chatbot/)
  with detailed trace structure diagrams and screenshot placeholders
- Add replay safety comment on @workflow.run methods explaining why
  @Traceable must wrap an inner function
- Make read_note a proper activity for LangSmith trace visibility
- Change save_note/read_note to use NoteRequest dataclass (single arg)
- Fix misplaced comments in activities (wrap_openai, run_type docs)
- Add links to LangSmith docs and Temporal SDK plugin docs
- Add add_temporal_runs explanation section to top-level README

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Audit findings addressed:
- Remove max_cached_workflows=0 (debugging leftover, not needed)
- Expand replay safety comment to explain why (I/O on replay)
- Tighten type annotations: str|list[dict[str,Any]], dict[str,Any]
- Clarify read_note docstring (passthrough for tracing visibility)
- Clarify activity docstrings (retries handled by Temporal, not OpenAI)
- Make trace tree annotations consistent across basic/chatbot READMEs
- Extract shared test helpers (make_text_response, poll_last_response)
- Add TimeoutError on poll failure instead of silent assertion
- Fix "simplest possible" wording to "very simple"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Changed from explaining internals (I/O on replay) to describing what
the user would see: duplicate or orphaned traces. Also swept all other
comments for outcome-focus — no further changes needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- save_note and read_note are now @Traceable methods on ChatbotWorkflow,
  not activities. Notes live in workflow state (durable via event history).
- Remove NoteRequest dataclass and activity registrations
- Fix wrap_openai comment placement (describes child span, not parent)
- Remove retry policy from basic workflow (unnecessary for simple example)
- Add comment highlighting that non-@workflow.run methods can use @Traceable
- Update chatbot README trace trees to reflect workflow methods vs activities

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove skip markers from tests — CI should fail loudly if deps missing
- Pin temporalio[pydantic,langsmith]>=1.26.0 (plugin is released)
- Rename kwargs to response_args in chatbot activity
- Remove notes CLI command from chatbot starter
- Update chatbot README trace trees for accuracy
- Change input prompt from "You: " to "> "

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- StartWorkflow/RunWorkflow are siblings, not parent-child
- StartActivity/RunActivity are siblings under RunWorkflow
- @Traceable activity spans nest under RunActivity
- Signal/query traces are separate roots (or under client @Traceable)
- Merge client-side and worker-side into unified trees
- add_temporal_runs=False: only @Traceable spans, all under client root

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
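The hierarchy described in the commit message above can be sketched as a trace tree (illustrative only, with `add_temporal_runs=True`; span names taken from the commit's bullets):

```text
client @Traceable (root)
├── StartWorkflow
└── RunWorkflow
    ├── StartActivity
    └── RunActivity
        └── @Traceable activity span

(signal/query traces appear as separate roots, or under the client root)
```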
temporalio>=1.26.0 requires openai-agents>=0.14.0 but the openai-agents
group pins ==0.3.2, causing a resolution failure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Without the group flag, uv resolves the base temporalio dep (1.23.0)
which doesn't have contrib.langsmith.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
"""Call OpenAI Responses API. Retries handled by Temporal, not the OpenAI client."""
# wrap_openai patches the client so each API call (e.g. responses.create)
# creates its own child span with model parameters and token usage.
client = wrap_openai(AsyncOpenAI(max_retries=0))
Contributor

Maybe worth a comment explaining why max_retries is set to 0 (since the Temporal activity handles the retries).


client = await Client.connect(
    **config,
    data_converter=pydantic_data_converter,
Contributor

I wonder if it'd be simpler not to use pydantic for an example like this that doesn't strictly need pydantic? I don't care strongly, I'm open to keeping it if you think it's adding enough value

Comment thread: langsmith_tracing/basic/README.md
Comment on lines +53 to +58
loop = asyncio.new_event_loop()
try:
    loop.run_until_complete(main())
except KeyboardInterrupt:
    interrupt_event.set()
    loop.run_until_complete(loop.shutdown_asyncgens())

Why all this instead of just the default asyncio.run(main()) that other examples use?


I wonder if my earlier comment about combining the worker and workflow in the same file makes handling these interrupts more graceful.

    OpenAIRequest(model="gpt-4o-mini", input=prompt),
    start_to_close_timeout=timedelta(seconds=60),
)
return response.output_text
Contributor

Regarding the prior pydantic comment: you could probably skip the pydantic data converter by having the activity return just `Response.output_text` instead of the whole `Response`, since that's all you're actually using.

Comment on lines +70 to +78
async def send_and_wait(msg: str):
    prev_response = await wf_handle.query(ChatbotWorkflow.last_response)
    await wf_handle.signal(ChatbotWorkflow.user_message, msg)
    for _ in range(60):
        await asyncio.sleep(0.5)
        response = await wf_handle.query(ChatbotWorkflow.last_response)
        if response != prev_response:
            return response
    return None
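Wherever this polling ends up living, it can be factored into a small helper that raises on timeout instead of silently returning None (a stdlib-only sketch; `poll_until_changed` and the stand-in `query` are illustrative names, with `query` standing in for the `wf_handle.query(...)` call):

```python
import asyncio


async def poll_until_changed(query, prev, attempts=60, interval=0.5):
    """Poll `query` until its result differs from `prev`; raise on timeout."""
    for _ in range(attempts):
        await asyncio.sleep(interval)
        current = await query()
        if current != prev:
            return current
    raise TimeoutError("no new chatbot response within the polling window")


async def demo():
    # Stand-in query whose answer changes on the third poll.
    state = {"polls": 0}

    async def query():
        state["polls"] += 1
        return "new" if state["polls"] >= 3 else "old"

    return await poll_until_changed(query, "old", interval=0.01)


print(asyncio.run(demo()))  # new
```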
Contributor

maybe worth encapsulating this logic in the workflow itself to demonstrate that a workflow update func can synchronously wait for the workflow to get to a desired state rather than looping over queries here

Contributor Author

In terms of what we see in Langsmith, if we put this in workflow update, then it no longer shows up as a child of RunWorkflow, but rather each workflow update would be a separate sibling to Start/RunWorkflow in the form of StartWorkflowUpdate. Is that what you're imagining?
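For readers following this thread: an update handler can indeed block until workflow state changes (in Temporal, via `workflow.wait_condition` inside an `@workflow.update` method). The toy sketch below mimics only that blocking shape with stdlib asyncio; `ChatSim` and all its names are illustrative, not the sample's actual code:

```python
import asyncio


class ChatSim:
    """Toy stand-in for a workflow exposing a blocking send-and-wait update."""

    def __init__(self):
        self._last_response = None
        self._cond = asyncio.Condition()

    async def produce_response(self, text):
        # "Worker" side: record a new response and wake any waiting updates.
        async with self._cond:
            self._last_response = text
            self._cond.notify_all()

    async def send_and_wait(self, msg):
        # "Update handler" side: block until a new response arrives.
        # (msg is accepted only to mirror the real handler's signature;
        # Temporal's workflow.wait_condition plays the role of wait_for here.)
        prev = self._last_response
        async with self._cond:
            await self._cond.wait_for(lambda: self._last_response != prev)
            return self._last_response


async def demo():
    chat = ChatSim()
    waiter = asyncio.create_task(chat.send_and_wait("hi"))
    await asyncio.sleep(0)  # let the waiter block before responding
    await chat.produce_response("hello!")
    return await waiter


print(asyncio.run(demo()))  # hello!
```

Whether that trades well against the flatter StartWorkflowUpdate trace layout described above is the open question in this thread.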


from langsmith_tracing.chatbot.activities import OpenAIRequest, call_openai

RETRY = RetryPolicy(initial_interval=timedelta(seconds=2), maximum_attempts=3)
Contributor

nit: this is only used once, may as well be inlined

Comment on lines +82 to +87
return await traceable(
    name=f"Session {now}",
    run_type="chain",
    metadata={"workflow_id": workflow.info().workflow_id},
    tags=["chatbot-session"],
)(self._session)()
Contributor

Maybe worth a short comment that this is an alternative way to mark funcs traceable rather than the nested func def with the @traceable(...) decorator that you mostly used elsewhere

Contributor Author

True. We can define an inline function for dynamic trace names, but calling `traceable(...)` directly is another way.

3 participants