fix(openfeature): avoid startup worker timeout #18161

Draft
leoromanovsky wants to merge 1 commit into main from leoromanovsky/fix-ffe-startup-otel-config

@leoromanovsky
Contributor

Motivation

The FEATURE_FLAGGING_AND_EXPERIMENTATION system-tests job showed gunicorn workers timing out after 30s during startup, and stderr also reported "OpenTelemetry configuration OTEL_METRIC_EXPORT_INTERVAL is not supported by Datadog". That warning is wrong: ddtrace supports and consumes the OTEL metric reader interval. Separately, the OpenFeature provider's default initialization wait was 30s, exactly matching common web-server worker timeouts, so a provider that waited the full duration could get its worker killed before it had a chance to report not ready.

Changes

  • Treat OTEL_METRIC_EXPORT_INTERVAL and OTEL_METRIC_EXPORT_TIMEOUT as supported OTEL startup configurations.
  • Reduce the default DataDogProvider initialization timeout from 30s to 25s so OpenFeature can mark the provider not ready and recover later instead of letting a common worker timeout kill startup.
  • Add a subprocess regression test for the supported OTEL metric reader configs.
  • Add a release note covering both startup fixes.
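
To illustrate why the wait should sit below the worker timeout, here is a minimal sketch of a bounded wait with late recovery. It is not the ddtrace implementation: `SketchProvider`, `on_config`, and both constants are hypothetical names chosen for this example, and only the 25s and 30s values come from this PR.

```python
import threading

WORKER_TIMEOUT_S = 30.0         # common web-server worker timeout cited in this PR
PROVIDER_INIT_TIMEOUT_S = 25.0  # new default initialization wait described in this PR


class SketchProvider:
    """Hypothetical stand-in for a provider that blocks until remote config arrives."""

    def __init__(self):
        self._config_arrived = threading.Event()

    @property
    def ready(self):
        # Ready as soon as configuration has been seen, whenever that happens.
        return self._config_arrived.is_set()

    def on_config(self):
        # May be called after initialize() has already given up (late recovery).
        self._config_arrived.set()

    def initialize(self, timeout_s=PROVIDER_INIT_TIMEOUT_S):
        # Block for at most timeout_s, then report not ready instead of
        # holding the worker until the web server kills it.
        return self._config_arrived.wait(timeout_s)
```

With a wait of 25s, `initialize` returns with 5s to spare before a 30s worker timeout, and a later `on_config` call still flips `ready` to true.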

Decisions

  • Kept the OpenFeature provider blocking/recovery model intact instead of changing the lifecycle semantics.
  • Chose 25s as a narrow change below common 30s worker timeouts while still giving Remote Config time to arrive.
  • Left the timeout configurable through DD_EXPERIMENTAL_FLAGGING_PROVIDER_INITIALIZATION_TIMEOUT_MS for deployments that need a different startup budget.
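
As a rough sketch of how a millisecond-valued override could be resolved, the helper below reads the environment variable named above and falls back to the 25s default. The function name `init_timeout_seconds` and the fallback on missing, non-numeric, or non-positive values are assumptions made for illustration; ddtrace's actual settings code may behave differently.

```python
import os

ENV_VAR = "DD_EXPERIMENTAL_FLAGGING_PROVIDER_INITIALIZATION_TIMEOUT_MS"
DEFAULT_INIT_TIMEOUT_MS = 25_000  # default described in this PR


def init_timeout_seconds(environ=None):
    """Hypothetical helper: return the initialization timeout in seconds."""
    environ = os.environ if environ is None else environ
    raw = environ.get(ENV_VAR)
    try:
        timeout_ms = int(raw) if raw is not None else DEFAULT_INIT_TIMEOUT_MS
    except ValueError:
        # Assumption: an unparsable value falls back to the default.
        timeout_ms = DEFAULT_INIT_TIMEOUT_MS
    if timeout_ms <= 0:
        # Assumption: a non-positive value also falls back to the default.
        timeout_ms = DEFAULT_INIT_TIMEOUT_MS
    return timeout_ms / 1000.0
```

A deployment whose worker timeout is 15s could, for example, set the variable to `10000` to keep the provider wait safely below it.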

Testing

  • scripts/ddtest riot -v run --pass-env -s 120e7d0 -- tests/opentelemetry/test_config.py::test_otel_metric_reader_configuration_supported
  • scripts/ddtest riot -v run --pass-env -s 18421e5 -- -k test_initialize_timeout_raises
  • scripts/ddtest riot -v run --pass-env -s 18421e5 -- -k test_late_recovery_after_timeout
  • git diff --check

@cit-pr-commenter-54b7da

Codeowners resolved as

ddtrace/internal/settings/_otel_remapper.py                             @DataDog/apm-sdk-capabilities-python
ddtrace/internal/settings/openfeature.py                                @DataDog/feature-flagging-and-experimentation-sdk
releasenotes/notes/fix-ffe-startup-otel-metric-interval-b1c7f93d2c8f4a0e.yaml  @DataDog/apm-python
tests/openfeature/conftest.py                                           @DataDog/feature-flagging-and-experimentation-sdk
tests/opentelemetry/test_config.py                                      @DataDog/apm-sdk-capabilities-python @DataDog/apm-core-python

@datadog-datadog-prod-us1
Contributor

datadog-datadog-prod-us1 Bot commented May 19, 2026

Tests

🎉 All green!

🧪 All tests passed
❄️ No new flaky tests detected

This comment will be updated automatically if new data arrives.
Commit SHA: c9abfbb
