
Add FFE evaluation completion hook #3909

Open
leoromanovsky wants to merge 6 commits into leo.romanovsky/milestone-1-runtime-evaluation from leo.romanovsky/m2-m3-evaluation-completed-base

@leoromanovsky leoromanovsky commented May 22, 2026

Motivation

EVP exposure delivery and OTLP evaluation metrics both need the same internal post-evaluation signal. This PR adds that shared hook layer on top of the runtime evaluation work in #3906.

It does not add exposure transport, caching, batching, OpenFeature SDK hook behavior, or metric export behavior.

Planning/reference doc: https://docs.google.com/document/d/1NvMfTpZWLBlFmEFNjdnlMyeVpy5l7KD8qujGFco6w2w/edit?tab=t.0

Decisions

  • The hook fires after Client::evaluate() produces EvaluationDetails for evaluated success/default/error results.
  • Caller validation failures before evaluation, such as invalid flag keys or invalid default types, do not enter the hook.
  • Hook exceptions are swallowed; telemetry side effects must never alter returned flag values.
  • The hook is internal and dependency-injected through createWithDependencies() only.
  • The hook lives in the Datadog feature-flag client layer so PHP 7 API and PHP 8 OpenFeature bridge paths share the same post-evaluation signal.
  • DefaultEvaluationCompletedHook, CompositeEvaluationCompletedHook, and OpenFeature SDK hook behavior land with their first consumers in "Add FFE exposure emission" (#3910) and "Add FFE evaluation metrics" (#3911).
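
The decisions above can be illustrated with a minimal sketch. It is not the code in this PR: `SketchHook`, `SketchClient`, `ThrowingHook`, the array-shaped details, and the flag lookup are all hypothetical stand-ins chosen to show the ordering (validate, evaluate, call hook, return) and the exception swallowing.

```php
<?php
// Hypothetical sketch; names and signatures are illustrative only.

interface SketchHook
{
    public function onEvaluationCompleted(array $details);
}

final class SketchClient
{
    private $hook;
    private $flags;

    public function __construct(SketchHook $hook, array $flags)
    {
        $this->hook = $hook;
        $this->flags = $flags;
    }

    public function evaluate($flagKey, $default)
    {
        // Caller validation fails before evaluation, so the hook is never reached.
        if (!is_string($flagKey) || $flagKey === '') {
            return $default;
        }

        // Evaluation produces details for matched and default results alike.
        $found = array_key_exists($flagKey, $this->flags);
        $details = [
            'flagKey' => $flagKey,
            'value'   => $found ? $this->flags[$flagKey] : $default,
            'reason'  => $found ? 'TARGETING_MATCH' : 'DEFAULT',
        ];

        try {
            $this->hook->onEvaluationCompleted($details);
        } catch (\Throwable $e) {
            // Swallowed: telemetry side effects must not alter the returned value.
        }

        return $details['value'];
    }
}

// A hook that always fails, to show that the caller still gets its value.
final class ThrowingHook implements SketchHook
{
    public $calls = 0;

    public function onEvaluationCompleted(array $details)
    {
        $this->calls++;
        throw new \RuntimeException('telemetry failure');
    }
}

$hook = new ThrowingHook();
$client = new SketchClient($hook, ['new-checkout' => true]);
$matched = $client->evaluate('new-checkout', false);  // hook called, exception swallowed
$fallback = $client->evaluate('unknown-flag', false); // default result, hook still called
$invalid = $client->evaluate('', false);              // rejected before evaluation, no hook call
```

In this sketch the hook runs twice (matched and default results) and never for the invalid key, and the thrown exception does not reach the caller.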

Where this PR fits in the stack

This is the internal hook layer. Runtime evaluations (#3906) sit beneath it; EVP exposures (#3910) and OTLP metrics (#3911) are independent siblings directly on top of it.

[Figure: pr3909-hook-layer-stack-position]

Where this PR fits in the target system

This PR contributes only the internal DD Client post-evaluation envelope and callback point. Downstream writers, sidecar FFIs, libdatadog sidecar forwarding, and Agent / OTLP backends land in #3910 / #3911 / DataDog/libdatadog#2026.

[Figure: pr3909-hook-layer-system-scope]

Changes

  • Adds an internal EvaluationCompleted envelope with flag key, value/default types, targeting context, result value, reason, variant, error fields, allocation key, and doLog.
  • Adds EvaluationCompletedHook plus a no-op implementation.
  • Invokes the hook from DDTrace\FeatureFlags\Client::evaluate() after evaluation details are produced.
  • Adds unit coverage for the Datadog PHP API path and the PHP 8 OpenFeature provider bridge path.
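
As a rough picture of the envelope and the no-op hook listed above, the sketch below uses the field list from this description. The class names (`SketchEvaluationCompleted`, `SketchNoopHook`, `SketchRecordingHook`), the array-based constructor, the property names, and the default values are assumptions made for illustration; the actual classes in the PR may be shaped differently.

```php
<?php
// Hypothetical sketch: fields follow the PR description, everything else is assumed.

final class SketchEvaluationCompleted
{
    public $flagKey;
    public $valueType;
    public $targetingContext;
    public $value;
    public $reason;
    public $variant;
    public $errorCode;
    public $allocationKey;
    public $doLog;

    public function __construct(array $fields)
    {
        // Assumed defaults for fields the caller does not supply.
        $defaults = [
            'flagKey' => '', 'valueType' => 'boolean', 'targetingContext' => [],
            'value' => null, 'reason' => 'DEFAULT', 'variant' => null,
            'errorCode' => null, 'allocationKey' => null, 'doLog' => false,
        ];
        foreach ($defaults as $name => $default) {
            $this->$name = array_key_exists($name, $fields) ? $fields[$name] : $default;
        }
    }
}

interface SketchEvaluationCompletedHook
{
    public function onEvaluationCompleted(SketchEvaluationCompleted $event);
}

// No-op implementation: the default when no consumer is wired in.
final class SketchNoopHook implements SketchEvaluationCompletedHook
{
    public function onEvaluationCompleted(SketchEvaluationCompleted $event)
    {
        // Intentionally empty.
    }
}

// A test double that records events, similar in spirit to what unit tests might use.
final class SketchRecordingHook implements SketchEvaluationCompletedHook
{
    public $events = [];

    public function onEvaluationCompleted(SketchEvaluationCompleted $event)
    {
        $this->events[] = $event;
    }
}

$event = new SketchEvaluationCompleted([
    'flagKey' => 'new-checkout',
    'value' => true,
    'reason' => 'TARGETING_MATCH',
    'variant' => 'on',
    'allocationKey' => 'alloc-1',
    'doLog' => true,
]);

$noop = new SketchNoopHook();
$noop->onEvaluationCompleted($event); // nothing happens

$recorder = new SketchRecordingHook();
$recorder->onEvaluationCompleted($event);
```

Keeping the envelope a plain value object means later consumers (exposures, metrics) can read the same data without re-running evaluation.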


datadog-datadog-prod-us1-2 Bot commented May 22, 2026


⚠️ Warnings

🚦 8 Pipeline jobs failed

DataDog/apm-reliability/dd-trace-php | check libxml2 version

🔄 Retry job. This looks flaky and may succeed on retry. 0/193 nodes are available: insufficient resources for scheduling due to memory and CPU constraints alongside node taints and pod affinity issues.

DataDog/apm-reliability/dd-trace-php | test_extension_ci: [7.4]

🔄 Retry job. This looks flaky and may succeed on retry. Multiple tests failed due to process timeouts, including dynamic config updates and debugger log probe installation.

DataDog/apm-reliability/dd-trace-php | test_extension_ci: [8.0]

🔄 Retry job. This looks flaky and may succeed on retry. Timeout occurred while executing live debugger span decoration probe and metric probe tests.

View all 8 failed jobs.

ℹ️ Info

No other issues found (see more)

🧪 All tests passed
❄️ No new flaky tests detected

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 60.71% (+0.01%)

Commit SHA: 04adf69


pr-commenter Bot commented May 22, 2026

Benchmarks [ tracer ]

Benchmark execution time: 2026-05-24 04:07:19

Comparing candidate commit 04adf69 in PR branch leo.romanovsky/m2-m3-evaluation-completed-base with baseline commit 555e9a0 in branch leo.romanovsky/milestone-1-runtime-evaluation.

Found 0 performance improvements and 5 performance regressions! Performance is the same for 188 metrics, 1 unstable metrics.

scenario:ContextPropagationBench/benchInject64Bit-opcache

  • 🟥 execution_time [+384.614ns; +759.386ns] or [+3.009%; +5.941%]

scenario:MessagePackSerializationBench/benchMessagePackSerialization

  • 🟥 execution_time [+4.334µs; +6.506µs] or [+4.334%; +6.506%]

scenario:MessagePackSerializationBench/benchMessagePackSerialization-opcache

  • 🟥 execution_time [+2.169µs; +4.191µs] or [+2.172%; +4.196%]

scenario:PHPRedisBench/benchRedisOverhead

  • 🟥 execution_time [+31.110µs; +42.732µs] or [+3.223%; +4.427%]

scenario:SpanBench/benchDatadogAPI

  • 🟥 execution_time [+1.289µs; +2.655µs] or [+2.001%; +4.122%]

The canonical fixture PHPT explicitly enumerates the FFE classes it
requires before instantiating the Datadog FeatureFlags client. The
shared evaluation-completed envelope/hook added on this branch made
Client::createWithDependencies() reference NoopEvaluationCompletedHook,
EvaluationCompletedHook, and EvaluationCompleted, but the test helper
had not been updated, so the packaged/extension PHPT failed with
"Class DDTrace\FeatureFlags\Internal\NoopEvaluationCompletedHook not
found" before any fixture case ran.

Add the three new files to require_feature_flag_api so the PHPT
matches the runtime class graph used by Client::evaluate().

Adds Mermaid sources and rendered PNGs for the hook (this) PR plus a
README documenting the regeneration workflow.

- `docs/php-ffe-stack/stack-pr3909.mmd` + `.png` — 4-PR stack with this
  PR highlighted (M1 done; EVP and metrics as siblings to come).
- `docs/php-ffe-stack/system-pr3909.mmd` + `.png` — target system
  architecture; this PR contributes the EvaluationCompletedHook +
  OpenFeature provider hook surface. All downstream nodes (writers,
  sidecar FFI, sidecar process, backends) marked future.
- `docs/php-ffe-stack/README.md` — npx invocation for regenerating
  PNGs locally; PR-by-PR diagram table; architectural rule note.

The architectural rule encoded in the system diagram (all I/O via the
libdatadog sidecar) is the same rule Bob applied to PR #3910. See
DataDog/libdatadog#2026 for the sidecar-side support.
leoromanovsky added a commit that referenced this pull request May 23, 2026
Adds the M3 evaluation-metrics layer on top of the hook PR (#3909) as a
sibling of the EVP exposures PR (#3910). Records `feature_flag.evaluations`
for both PHP 7 (DD Client hook) and PHP 8 (OpenFeature SDK hook); both
paths share `EvaluationMetricHook::sharedWriter()` for unified
aggregation. OTLP/protobuf payloads are encoded in PHP via the existing
`OtlpMetricEncoder` and delivered to the user-configured OTLP HTTP
metrics intake through the libdatadog sidecar (`ddog_sidecar_send_ffe_metrics`
FFI added in DataDog/libdatadog#2026).
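
A minimal sketch of what a shared writer with bounded series aggregation and drop accounting could look like is given below. `SketchMetricWriter`, its method names, the 1000-series default, and the use of sorted JSON-encoded attributes as the series key are assumptions for illustration; the real `EvaluationMetricWriter` is not shown in this conversation.

```php
<?php
// Hypothetical sketch of a shared, bounded metric aggregator.

final class SketchMetricWriter
{
    private static $shared = null;
    private $maxSeries;
    private $series = [];
    private $dropped = 0;

    public function __construct($maxSeries)
    {
        $this->maxSeries = $maxSeries;
    }

    // One process-wide instance so several hooks aggregate into the same series.
    public static function shared()
    {
        if (self::$shared === null) {
            self::$shared = new self(1000); // assumed bound
        }
        return self::$shared;
    }

    public function record(array $attributes)
    {
        ksort($attributes); // attribute order must not create distinct series
        $key = json_encode($attributes);

        if (isset($this->series[$key])) {
            $this->series[$key]['count']++;
            return;
        }
        if (count($this->series) >= $this->maxSeries) {
            $this->dropped++; // bound reached: count the drop instead of growing
            return;
        }
        $this->series[$key] = ['attributes' => $attributes, 'count' => 1];
    }

    // Returns the aggregated series and resets the buffer (e.g. at shutdown).
    public function flush()
    {
        $out = array_values($this->series);
        $this->series = [];
        return $out;
    }

    public function droppedCount()
    {
        return $this->dropped;
    }
}

$writer = new SketchMetricWriter(2);
$writer->record(['flag' => 'a', 'reason' => 'DEFAULT']);
$writer->record(['flag' => 'a', 'reason' => 'DEFAULT']);
$writer->record(['reason' => 'DEFAULT', 'flag' => 'a']); // same series, different order
$writer->record(['flag' => 'b', 'reason' => 'DEFAULT']); // second series
$writer->record(['flag' => 'c', 'reason' => 'DEFAULT']); // over the bound, dropped
$flushed = $writer->flush();
$sameInstance = SketchMetricWriter::shared() === SketchMetricWriter::shared();
```
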

This branch is force-pushed (user-authorized one-time exception to the
no-force-push rule, 2026-05-23) to restructure history away from being
linearly stacked on the M2 exposures PR (#3910). The PR now stacks
directly on the hook PR (#3909) as a sibling of the EVP PR.

PHP side:

- Add `Internal/Metric/EvaluationMetricWriter` with bounded series
  aggregation, drop accounting, and shutdown flush.
- Add `Internal/Metric/EvaluationMetricHook` (DD Client hook) and
  `OtlpMetricEncoder` (PHP 7-safe protobuf encoding).
- Add `Internal/Metric/SidecarOtlpMetricsTransport` that calls
  `\DDTrace\send_ffe_metrics()` (FFI declared in #3910). Endpoint
  resolution: `OTEL_EXPORTER_OTLP_METRICS_ENDPOINT`, falling back to
  `OTEL_EXPORTER_OTLP_ENDPOINT + /v1/metrics`, default
  `http://localhost:4318/v1/metrics`.
- Add `DDTrace\OpenFeature\EvalMetricsHook` implementing
  `OpenFeature\interfaces\hooks\Hook` (after + error stages), registered
  on `DataDogProvider` via `setHooks()`.
- `DataDogProvider` constructs its internal DD `Client` with
  `DefaultEvaluationCompletedHook::createWithoutMetric()` so the
  OpenFeature path records the metric via the OpenFeature hook (PR 3911
  scope) and NOT via the DD Client hook — preventing double-counting.
  PHP 7 path keeps recording via the DD Client hook.
- Add `Internal/CompositeEvaluationCompletedHook` and
  `Internal/DefaultEvaluationCompletedHook` (metric-only composite).
  This is the merge-conflict point with PR #3910's `[ExposureHook]`
  composite — second merge resolves by combining both hooks.
- Update `Client::create()` to call `DefaultEvaluationCompletedHook::create()`.
- Drop the obsolete `testOtlpTransportBuildsHttpProtobufRequest` PHPUnit
  test (HTTP construction now lives in libdatadog, covered by
  `cargo test -p datadog-sidecar ffe_metrics_flusher`).
- Add `_files_openfeature.php` entry for `EvalMetricsHook.php`.
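
The endpoint resolution order described in the list above can be sketched as follows. The function name `sketchResolveOtlpMetricsEndpoint`, passing the environment as an array, treating empty strings as unset, and trimming a trailing slash before appending `/v1/metrics` are assumptions; only the variable names, the fallback order, and the default URL come from the commit message.

```php
<?php
// Hypothetical sketch of the endpoint fallback chain.

function sketchResolveOtlpMetricsEndpoint(array $env)
{
    // 1. The metrics-specific endpoint wins and is used as-is.
    if (isset($env['OTEL_EXPORTER_OTLP_METRICS_ENDPOINT'])
        && $env['OTEL_EXPORTER_OTLP_METRICS_ENDPOINT'] !== '') {
        return $env['OTEL_EXPORTER_OTLP_METRICS_ENDPOINT'];
    }

    // 2. Otherwise derive it from the generic endpoint plus /v1/metrics.
    if (isset($env['OTEL_EXPORTER_OTLP_ENDPOINT'])
        && $env['OTEL_EXPORTER_OTLP_ENDPOINT'] !== '') {
        // Assumption: avoid a double slash when the base ends with "/".
        return rtrim($env['OTEL_EXPORTER_OTLP_ENDPOINT'], '/') . '/v1/metrics';
    }

    // 3. Default local OTLP HTTP intake.
    return 'http://localhost:4318/v1/metrics';
}

$specific = sketchResolveOtlpMetricsEndpoint([
    'OTEL_EXPORTER_OTLP_METRICS_ENDPOINT' => 'http://collector:4318/custom/metrics',
    'OTEL_EXPORTER_OTLP_ENDPOINT' => 'http://ignored:4318',
]);
$generic = sketchResolveOtlpMetricsEndpoint([
    'OTEL_EXPORTER_OTLP_ENDPOINT' => 'http://collector:4318/',
]);
$default = sketchResolveOtlpMetricsEndpoint([]);
```
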

C/Rust bridge: the `\DDTrace\send_ffe_metrics()` native function, its C
wrapper `ddtrace_sidecar_send_ffe_metrics()`, and the
`ddog_sidecar_send_ffe_metrics` FFI declaration in `components-rs/sidecar.h`
were already added in #3910. This PR's branch picks up those changes
once #3910 merges (or via the same libdatadog submodule pin during
review). For development locally the libdatadog submodule is pinned to
the FFE branch tip (`29762335c`).

Docs:

- Add `docs/php-ffe-stack/{stack,system}-pr3911.{mmd,png}` per the
  4-PR documentation convention.

Validation:

- `php vendor/bin/phpunit --config phpunit.xml tests/api/Unit/FeatureFlags`
  → 40 tests, 160 assertions, OK.
- Mermaid PNGs regenerate via `npx @mermaid-js/mermaid-cli`.

`make test_featureflags`, OpenFeature PHPUnit, and ffe-dogfooding
end-to-end validation will run in CI / are validated separately by
FOLLOW-05 Steps 4–5.
…gh rez

Same three fixes as on the M2 (#3910) and M3 (#3911) sibling branches:

1. Quote the YAML `title:` so the `#PR-number` survives parsing
   (otherwise YAML treats the `#` as a comment and the title renders
   as "PHP FFE 4-PR stack — current =" with the rest missing).
2. `flowchart LR` → `flowchart TD` on the system diagram so the
   PHP-process / host-sidecar / backend lanes stack vertically.
3. Render at 2400×2400 `--scale 3` instead of ~600px default.
leoromanovsky added a commit that referenced this pull request May 23, 2026
Brings the PHP FFE diagram convention to the M1 PR. Each subsequent PR
in the stack (#3909, #3910, #3911) already carried its own stack +
system diagram; #3906 was missing them.

Mirrors the format used by the rest of the stack:
- `stack-pr3906.mmd` — the 4-PR stack with #3906 badged as current
  and the downstream layers shown as "future".
- `system-pr3906.mmd` — the target end-to-end architecture with
  M1's scope (UserCode, OpenFeature Client, DataDogProvider,
  DDTrace FeatureFlags Client, NativeEvaluator, Remote Config client)
  highlighted, and everything from the Hook layer onward dashed.

All conventions match the other branches: quoted YAML titles (to keep
`#PR-number` out of the YAML comment parser), `flowchart TD`
orientation, rendered with `-w 2400 -H 2400 --scale 3 -b white`.
@leoromanovsky leoromanovsky marked this pull request as ready for review May 24, 2026 04:44
@leoromanovsky leoromanovsky requested a review from a team as a code owner May 24, 2026 04:44
@leoromanovsky leoromanovsky requested review from greghuels and sameerank and removed request for a team May 24, 2026 04:44