fix(otel): exporter shutdown data loss + thread safety in span processor#165

Open
pmady wants to merge 3 commits into future-agi:main from pmady:fix/exporter-shutdown-and-thread-safety
pmady commented May 1, 2026

What does this PR do?

Fixes the exporter shutdown bug and thread safety issue from #164.

  • GRPCSpanExporter.shutdown() and HTTPSpanExporter.shutdown() now call super().shutdown() after closing the session, so the parent flushes buffered spans
  • SimpleSpanProcessor.shutdown() takes a lock around the snapshot + clear of _active_spans (per @JayaSurya-27's review -- hot path stays lock-free, dict ops are atomic under GIL)
  • swapped print() to logger.warning/logger.error in the shutdown methods since the module logger already exists
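The lock placement in the second bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `SpanProcessorSketch`, `_shutdown_lock`, `on_end`, and `ended` are hypothetical names, and only `_active_spans`, `on_start`, and `shutdown()` come from the description above.

```python
import threading


class SpanProcessorSketch:
    """Hypothetical stand-in for SimpleSpanProcessor (names are assumptions)."""

    def __init__(self):
        self._active_spans = {}
        self._shutdown_lock = threading.Lock()  # assumed attribute name
        self.ended = []  # stands in for "spans handed to the exporter"

    def on_start(self, span_id, span):
        # Hot path: a single dict insert, no lock taken.
        self._active_spans[span_id] = span

    def on_end(self, span_id):
        # Hot path: a single dict pop, no lock taken.
        span = self._active_spans.pop(span_id, None)
        if span is not None:
            self.ended.append(span)

    def shutdown(self):
        # Snapshot and clear happen together under the lock, so two
        # concurrent shutdown() callers cannot interleave these two steps.
        with self._shutdown_lock:
            snapshot = list(self._active_spans.values())
            self._active_spans.clear()
        # End the leftover spans outside the lock.
        for span in snapshot:
            self.ended.append(span)
        return snapshot
```

In this sketch the lock serialises concurrent `shutdown()` callers. Because `on_start` does not acquire it, an insert that lands between the snapshot and the clear would still be cleared without being snapshotted; whether that window matters depends on the real call sites.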

Why?

The missing super().shutdown() drops pending spans on process exit. This affects any k8s pod doing a graceful shutdown.
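A toy model of the failure mode, assuming the parent exporter buffers spans and only flushes them in its own `shutdown()`. Every class here (`BufferingParent`, `FakeSession`, `BrokenExporter`, `FixedExporter`) is a hypothetical stand-in; the real parent class and its flush logic are not shown in this PR.

```python
class BufferingParent:
    """Hypothetical parent exporter: buffers spans, flushes on shutdown."""

    def __init__(self):
        self._buffer = []
        self.flushed = []

    def export(self, span):
        self._buffer.append(span)

    def shutdown(self):
        # Flush whatever is still buffered.
        self.flushed.extend(self._buffer)
        self._buffer.clear()


class FakeSession:
    """Hypothetical stand-in for the HTTP session / gRPC channel."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class BrokenExporter(BufferingParent):
    """Pre-fix shape: closes the session, never calls the parent."""

    def __init__(self):
        super().__init__()
        self._session = FakeSession()

    def shutdown(self):
        self._session.close()  # buffered spans are never flushed


class FixedExporter(BrokenExporter):
    """Post-fix shape: close the session, then delegate to the parent."""

    def shutdown(self):
        try:
            self._session.close()
        finally:
            # Runs even if close() raises, so the parent still gets to flush.
            super(BrokenExporter, self).shutdown()
```

`FixedExporter` skips `BrokenExporter.shutdown` via `super(BrokenExporter, self)` only because the two toy classes share a constructor; in a real exporter a plain `super().shutdown()` would be the call.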

Fixes #164 (findings 2, 4, partially 5)

How was it tested?

  • pytest tests/test_config.py tests/test_init.py tests/test_tracers.py -- 67 passed, no regressions
  • verified via inspect.getsource() that both exporters call super().shutdown()
  • Integration tests pass (or N/A -- no gateway behavior changed)
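The `inspect.getsource()` check confirms the call appears in the source text. A behavioural check is a different technique that could complement it; the sketch below uses `unittest.mock` against hypothetical `ParentExporter` / `ChildExporter` classes, not the real exporters, and assumes the child can be constructed without arguments.

```python
from unittest import mock


class ParentExporter:
    """Hypothetical parent; the real base class is not shown in this PR."""

    def shutdown(self):
        pass


class ChildExporter(ParentExporter):
    """Hypothetical child that delegates to the parent on shutdown."""

    def shutdown(self):
        super().shutdown()


def parent_shutdown_called(child_cls, parent_cls):
    # Temporarily replace the parent's shutdown with a mock, run the
    # child's shutdown, and report whether the mock was invoked.
    with mock.patch.object(parent_cls, "shutdown") as parent_shutdown:
        child_cls().shutdown()
    return parent_shutdown.called
```

A pytest case could then assert `parent_shutdown_called(ChildExporter, ParentExporter)` for each exporter class.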

test_otel.py has a pre-existing import error (SESSION_NAME removed) unrelated to this change.

Checklist

  • Branch is off main
  • Commit messages follow Conventional Commits
  • No TODOs or commented-out code
  • No API keys or secrets in the diff

Notes for reviewers

three commits, easy to review individually. lock placement follows @JayaSurya-27's suggestion from the issue thread.

cc @JayaSurya-27

pmady added 3 commits May 1, 2026 12:24
Both GRPCSpanExporter and HTTPSpanExporter close the session
but skip the parent shutdown, which flushes buffered spans.
This drops pending trace data on process exit.

Fixes part of future-agi#164

Signed-off-by: pmady <pavan4devops@gmail.com>

The snapshot (line 445) and clear (line 458) in
SimpleSpanProcessor.shutdown() can race with on_start
writes from other threads. Lock only around those two
operations per review feedback -- the hot path stays
lock-free since dict insert/pop are atomic under GIL.

Fixes part of future-agi#164

Signed-off-by: pmady <pavan4devops@gmail.com>

The module-level logger is already defined at line 55 but
the shutdown methods use print(). Switch to logger.warning
and logger.error so platform teams can route these through
their logging pipeline.

Partially addresses future-agi#164 finding 5

Signed-off-by: pmady <pavan4devops@gmail.com>
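The print-to-logger swap in the third commit might look roughly like this. It is a sketch under assumptions: `close_session`, the `already_shutdown` flag, and the message strings are hypothetical, and only the use of a module-level `logger` with `logger.warning` / `logger.error` comes from the commit message.

```python
import logging

# Module-level logger, as the commit message describes.
logger = logging.getLogger(__name__)


def close_session(session, already_shutdown=False):
    """Hypothetical helper showing the logging calls in a shutdown path."""
    if already_shutdown:
        # Previously a print(); now routed through the logging pipeline.
        logger.warning("Exporter already shut down, ignoring call")
        return False
    try:
        session.close()
    except Exception as exc:
        # Lazy %-formatting: the message is only built if the record is emitted.
        logger.error("Error closing exporter session: %s", exc)
        return False
    return True
```

With the calls going through `logging`, a deployment can attach its own handlers or adjust levels for this module instead of scraping stdout.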
pmady commented May 1, 2026

this addresses findings 2, 4, and part of 5 from #164. kept the lock scoped to just the shutdown path like you suggested -- lmk if you'd rather see it wider

Development

Successfully merging this pull request may close these issues.

Architectural review: OTel pipeline scaling observations for enterprise deployments
