feat(llmobs): prompt management SDK methods by PROFeNoM · Pull Request #18186 · DataDog/dd-trace-py

PROFeNoM · 2026-05-20T13:09:21Z

Description

Adds prompt management methods to the LLMObs Python SDK, calling the new public API endpoints from MLOB-7523.

Blocked by: https://github.com/DataDog/dd-source/pull/443753 (public CRUD API routes)

New public API

# Write methods (require DD_API_KEY + DD_APP_KEY)
LLMObs.create_prompt(prompt_id, template, *, title, description, user_version, labels)
LLMObs.create_prompt_version(prompt_id, template, *, description, user_version, labels)
LLMObs.update_prompt(prompt_id, *, title, description)
LLMObs.update_prompt_version(prompt_id, version, *, labels, description)
LLMObs.delete_prompt(prompt_id)

# Read methods (require DD_API_KEY only)
LLMObs.list_prompts(*, ml_app)
LLMObs.list_prompt_versions(prompt_id)
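
The key requirements in the comments above can be sketched as follows. This is a minimal illustration, assuming the standard Datadog `DD-API-KEY` and `DD-APPLICATION-KEY` request headers. `build_auth_headers` is a hypothetical helper, not the PR's `_request` implementation, and raising `ValueError` on a missing key is an assumption.

```python
from typing import Dict, Mapping


def build_auth_headers(env: Mapping[str, str], *, write: bool) -> Dict[str, str]:
    """Return auth headers; write calls also need the application key."""
    api_key = env.get("DD_API_KEY")
    if not api_key:
        # Every prompt method needs the API key.
        raise ValueError("DD_API_KEY is required for all prompt methods")
    headers = {"DD-API-KEY": api_key}
    if write:
        app_key = env.get("DD_APP_KEY")
        if not app_key:
            # Only create/update/delete need the application key.
            raise ValueError("DD_APP_KEY is required for write methods")
        headers["DD-APPLICATION-KEY"] = app_key
    return headers
```

With only `DD_API_KEY` set, `build_auth_headers(os.environ, write=False)` succeeds, while `write=True` fails before any request is built.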

Not covered: GET /prompts/{prompt_id}/versions/{version} is intentionally left out. I would prefer to add it once #18127 (hybrid prompt delivery) lands, since that PR changes the get_prompt signature. At that point we can decide whether to add version= as an optional parameter to get_prompt rather than a separate method.

New types (`ddtrace/llmobs/types.py`)

PromptLabel = Literal["development", "production"] - allowed label values
ChatMessage(TypedDict) - template message format (role + content only)
PromptResponse(TypedDict) - returned by create/update/list prompt operations
PromptVersionResponse(TypedDict) - returned by create/update/list version operations
DeletedPromptResponse(TypedDict) - returned by delete
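
For orientation, here is a rough sketch of how the first two types might look. Only `PromptLabel`'s two values and `ChatMessage`'s `role` and `content` keys come from the description above; the `str` field types and the `invalid_labels` helper are assumptions added for illustration, and the response TypedDicts are omitted because their fields are not listed here.

```python
from typing import List, Literal, TypedDict, get_args

# Allowed label values, as stated in the PR description.
PromptLabel = Literal["development", "production"]


class ChatMessage(TypedDict):
    """Template message: role + content only (field types are assumed)."""

    role: str
    content: str


def invalid_labels(labels: List[str]) -> List[str]:
    """Hypothetical helper: return the labels that are not a PromptLabel."""
    allowed = set(get_args(PromptLabel))
    return [label for label in labels if label not in allowed]


template: List[ChatMessage] = [
    {"role": "system", "content": "You are a {{persona}}."},
    {"role": "user", "content": "{{question}}"},
]
```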

Exception hierarchy

PromptAPIError (base)
  PromptAuthError        - 401/403 (bad API/app key)
  PromptValidationError  - 400 (bad input)
  PromptNotFoundError    - 404
  PromptConflictError    - 409 (duplicate prompt_id)
  PromptServerError      - 5xx

Each method documents which exceptions it can raise.
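
The status-to-exception mapping above could be implemented roughly like this. The class names and the `status` attribute match the PR (the E2E script reads `e.status`); the constructor signature and the `error_for_status` helper are assumptions, not the code in `manager.py`.

```python
class PromptAPIError(Exception):
    """Base error; `status` is the HTTP status, or 0 for client-side errors."""

    def __init__(self, message: str, status: int = 0) -> None:
        super().__init__(message)
        self.status = status


class PromptAuthError(PromptAPIError): ...
class PromptValidationError(PromptAPIError): ...
class PromptNotFoundError(PromptAPIError): ...
class PromptConflictError(PromptAPIError): ...
class PromptServerError(PromptAPIError): ...


def error_for_status(status: int, message: str = "") -> PromptAPIError:
    """Hypothetical helper: pick the exception class for an HTTP status."""
    if status in (401, 403):
        cls = PromptAuthError
    elif status == 400:
        cls = PromptValidationError
    elif status == 404:
        cls = PromptNotFoundError
    elif status == 409:
        cls = PromptConflictError
    elif 500 <= status < 600:
        cls = PromptServerError
    else:
        # Anything unmapped falls back to the base class.
        cls = PromptAPIError
    return cls(message or f"HTTP {status}", status)
```

Because every class derives from `PromptAPIError`, callers can catch the base class to handle all API failures, or a subclass to react to one case such as a duplicate `prompt_id`.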

Cache changes (`cache.py`)

WarmCache now uses per-prompt subdirectories: ~/.cache/datadog/llmobs/prompts/{prompt_id}/{label}.json
Added evict_prompt(prompt_id), which calls shutil.rmtree on the prompt directory
The previous flat-file layout had prefix collision bugs (e.g., deleting "greeting" could also evict "greeting-v2" cache entries)
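
To illustrate why the subdirectory layout avoids the collision, here is a small standalone sketch. It mirrors the `{prompt_id}/{label}.json` layout and the `shutil.rmtree` eviction described above, but `cache_path`, the module-level `evict_prompt` function, the temporary root, and `ignore_errors=True` are assumptions for illustration, not the `WarmCache` code.

```python
import shutil
import tempfile
from pathlib import Path


def cache_path(root: Path, prompt_id: str, label: str) -> Path:
    """One directory per prompt, one JSON file per label."""
    return root / prompt_id / f"{label}.json"


def evict_prompt(root: Path, prompt_id: str) -> None:
    """Remove exactly one prompt's directory; siblings are untouched."""
    shutil.rmtree(root / prompt_id, ignore_errors=True)


# Use a throwaway directory instead of ~/.cache for the demonstration.
root = Path(tempfile.mkdtemp())
for pid in ("greeting", "greeting-v2"):
    path = cache_path(root, pid, "production")
    path.parent.mkdir(parents=True)
    path.write_text("{}")

evict_prompt(root, "greeting")
remaining = sorted(p.name for p in root.iterdir())
print(remaining)  # ['greeting-v2']
```

Eviction is keyed on an exact directory name, so no prefix matching is involved.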

Files changed

| File | Change |
| --- | --- |
| `types.py` | `PromptLabel`, `ChatMessage`, 3 response TypedDicts, 5 exception classes |
| `cache.py` | Per-prompt subdirectory layout, `evict_prompt` method |
| `manager.py` | `_request` helper, 7 CRUD methods, API key validation on `get_prompt` |
| `_llmobs.py` | 7 public classmethods delegating to manager |
| `test_prompts.py` | Exception mapping test, cache eviction test |

E2E validation (staging) - 13/13 pass

Tested against datad0g.com using the SDK. Save the script below, install the branch wheel, replace keys with your own staging credentials, and run with python test_sdk.py.

| # | Journey | SDK method | Expected |
| --- | --- | --- | --- |
| 1 | Create prompt | `create_prompt()` | Returns PromptResponse with matching prompt_id |
| 2 | Create duplicate | `create_prompt()` | Raises PromptConflictError (409) |
| 3 | List prompts | `list_prompts()` | Our prompt in returned list |
| 4 | Create version | `create_prompt_version()` | Returns PromptVersionResponse |
| 5 | List versions | `list_prompt_versions()` | >= 2 versions |
| 6 | Update prompt | `update_prompt()` | Returns updated PromptResponse |
| 7 | Update version | `update_prompt_version()` | Returns updated PromptVersionResponse |
| 8 | Get prompt (read path) | `get_prompt()` | source="registry" |
| 9 | Get prompt with label | `get_prompt(label="development")` | Matches labeled version |
| 10 | Validation error | `update_prompt()` with no fields | Raises PromptValidationError (client-side) |
| 11 | Delete prompt | `delete_prompt()` | Returns DeletedPromptResponse |
| 12 | Get deleted prompt | `get_prompt()` | Raises ValueError (no fallback) |
| 13 | Delete non-existent | `delete_prompt()` | Raises PromptNotFoundError (404) |

"""
Setup:
    uv venv && source .venv/bin/activate
    # Install the branch wheel (update build number as needed):
    uv pip install --reinstall \
        --find-links https://dd-trace-py-builds.s3.amazonaws.com/114547844/index.html \
        ddtrace==4.10.0rc1

Run:
    DD_API_KEY="<your-datad0g-api-key>" \
    DD_APP_KEY="<your-datad0g-app-key>" \
    DD_SITE=datad0g.com \
    python test_sdk.py
"""

import os
import sys
import time
import traceback

os.environ.setdefault("DD_API_KEY", "<your-datad0g-api-key>")
os.environ.setdefault("DD_APP_KEY", "<your-datad0g-app-key>")
os.environ.setdefault("DD_SITE", "datad0g.com")

from ddtrace.llmobs import LLMObs
from ddtrace.llmobs.types import (
    PromptAPIError,
    PromptAuthError,
    PromptConflictError,
    PromptNotFoundError,
    PromptServerError,
    PromptValidationError,
)

RUN_ID = f"e2e-{int(time.time())}"
PROMPT_ID = f"sdk-test-{RUN_ID}"

passed = 0
failed = 0


def test(name):
    def decorator(fn):
        global passed, failed
        try:
            fn()
            print(f"[PASS] {name}")
            passed += 1
        except Exception:
            print(f"[FAIL] {name}")
            traceback.print_exc()
            failed += 1
        return fn
    return decorator


print("=== Prompt CRUD E2E - SDK ===")
print(f"Run ID: {RUN_ID}")
print(f"Prompt ID: {PROMPT_ID}")
print()


@test("Create prompt")
def _():
    resp = LLMObs.create_prompt(
        PROMPT_ID,
        [
            {"role": "system", "content": "You are a {{persona}}."},
            {"role": "user", "content": "{{question}}"},
        ],
        title="E2E Test Prompt",
        description="Created by SDK e2e test",
    )
    print(f"  Response: {resp}")
    assert resp["prompt_id"] == PROMPT_ID


@test("Create duplicate raises PromptConflictError")
def _():
    try:
        LLMObs.create_prompt(PROMPT_ID, [{"role": "user", "content": "dup"}])
        assert False, "Should have raised"
    except PromptConflictError as e:
        assert e.status == 409


@test("List prompts")
def _():
    prompts = LLMObs.list_prompts()
    print(f"  Total: {len(prompts)}")
    assert PROMPT_ID in [p["prompt_id"] for p in prompts]


@test("Create prompt version")
def _():
    resp = LLMObs.create_prompt_version(
        PROMPT_ID,
        [
            {"role": "system", "content": "You are a helpful {{persona}}."},
            {"role": "user", "content": "Please answer: {{question}}"},
        ],
        description="v2 - improved",
        user_version="v2",
    )
    print(f"  Response: {resp}")


@test("List prompt versions")
def _():
    versions = LLMObs.list_prompt_versions(PROMPT_ID)
    print(f"  Count: {len(versions)}")
    assert len(versions) >= 2


@test("Update prompt metadata")
def _():
    resp = LLMObs.update_prompt(PROMPT_ID, title="Updated", description="Updated")
    print(f"  Response: {resp}")


@test("Update prompt version")
def _():
    versions = LLMObs.list_prompt_versions(PROMPT_ID)
    ver = str(versions[-1].get("version", 1))
    resp = LLMObs.update_prompt_version(
        PROMPT_ID, ver, description="Updated version", labels=["development"]
    )
    print(f"  Response: {resp}")


@test("Get prompt (read path)")
def _():
    prompt = LLMObs.get_prompt(PROMPT_ID)
    print(f"  id={prompt.id}, version={prompt.version}, source={prompt.source}")
    assert prompt.id == PROMPT_ID
    assert prompt.source == "registry"


@test("Get prompt with label=development")
def _():
    try:
        prompt = LLMObs.get_prompt(PROMPT_ID, label="development")
        print(f"  id={prompt.id}, version={prompt.version}, label={prompt.label}")
    except ValueError as e:
        print(f"  Label not found (expected if not set): {e}")


@test("Update with no fields raises PromptValidationError")
def _():
    try:
        LLMObs.update_prompt(PROMPT_ID)
        assert False
    except PromptValidationError as e:
        assert e.status == 0


@test("Delete prompt")
def _():
    resp = LLMObs.delete_prompt(PROMPT_ID)
    print(f"  Response: {resp}")


@test("Get deleted prompt raises ValueError")
def _():
    LLMObs.clear_prompt_cache(hot=True, warm=True)
    try:
        LLMObs.get_prompt(PROMPT_ID)
        assert False
    except ValueError as e:
        assert "could not be fetched" in str(e)


@test("Delete non-existent raises PromptNotFoundError")
def _():
    try:
        LLMObs.delete_prompt("nonexistent-" + RUN_ID)
        assert False
    except PromptNotFoundError as e:
        assert e.status == 404


print()
print(f"=== Results: {passed} passed, {failed} failed ===")
sys.exit(1 if failed > 0 else 0)
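
Journey 10 asserts `e.status == 0` because the empty update is rejected before any request is sent. A minimal sketch of such a client-side guard is shown below. The exception here is a local stand-in for the SDK class, and `build_update_payload` is a hypothetical helper rather than the PR's code.

```python
from typing import Dict, Optional


class PromptValidationError(Exception):
    """Local stand-in for the SDK exception; only `status` is modelled."""

    def __init__(self, message: str, status: int = 0) -> None:
        super().__init__(message)
        self.status = status


def build_update_payload(
    title: Optional[str] = None, description: Optional[str] = None
) -> Dict[str, str]:
    """Collect the fields to update, refusing an empty update up front."""
    fields = {"title": title, "description": description}
    payload = {key: value for key, value in fields.items() if value is not None}
    if not payload:
        # No HTTP call was made, so there is no HTTP status: status stays 0.
        raise PromptValidationError("at least one of title or description is required")
    return payload
```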

datadog-prod-us1-6 · 2026-05-20T13:10:16Z

Tests

🎉 All green!

🧪 All tests passed
❄️ No new flaky tests detected

This comment will be updated automatically if new data arrives.

🔗 Commit SHA: 17efe7f

cit-pr-commenter-54b7da · 2026-05-20T13:14:59Z

Codeowners resolved as

ddtrace/llmobs/_prompts/manager.py                                      @DataDog/ml-observability

feat(llmobs): add prompt management write methods [MLOB-7524]

7577f2e

Add CRUD operations for LLM Observability prompt registry to the Python SDK.

chore(llmobs): add release note for prompt CRUD methods [MLOB-7524]

d79cc3a

PROFeNoM changed the title ~~feat(llmobs): prompt management write methods [MLOB-7524]~~ feat(llmobs): prompt management write methods May 20, 2026

PROFeNoM added 3 commits May 20, 2026 15:15

fix(llmobs): don't require DD_APP_KEY for read methods [MLOB-7524]

b73a398

list_prompts and list_prompt_versions only need DD_API_KEY since the backend uses ValidReportingAPIUser for read endpoints.

fix(llmobs): validate DD_API_KEY in _request, clarify auth docstrings [MLOB-7524]

b72675d

fix(llmobs): validate DD_API_KEY in get_prompt [MLOB-7524]

f0ec2a3

PROFeNoM changed the title ~~feat(llmobs): prompt management write methods~~ feat(llmobs): prompt management SDK methods [MLOB-7524] May 20, 2026

PROFeNoM added 2 commits May 20, 2026 15:52

fix(llmobs): use Any return on _request with typed call-site annotations [MLOB-7524]

d23e3b5

style(llmobs): format types.py [MLOB-7524]

9faa87a

PROFeNoM changed the title ~~feat(llmobs): prompt management SDK methods [MLOB-7524]~~ feat(llmobs): prompt management SDK methods May 21, 2026

PROFeNoM added 3 commits May 21, 2026 09:22

fix(llmobs): add body kwarg to MockHTTPConnection.request [MLOB-7524]

7dc0e5c

_request() passes body= to conn.request(), mock signature needed to match.
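
As a sketch of what the mock fix amounts to: a test double has to accept the same keyword arguments as `http.client.HTTPConnection.request`, including `body`. The class below is a hypothetical illustration, not the `MockHTTPConnection` in the test suite, and the `/prompts` path is a placeholder.

```python
from typing import Dict, List, Optional, Tuple

Call = Tuple[str, str, Optional[bytes], Dict[str, str]]


class MockHTTPConnection:
    """Hypothetical test double that records requests instead of sending them."""

    def __init__(self) -> None:
        self.calls: List[Call] = []

    def request(
        self,
        method: str,
        url: str,
        body: Optional[bytes] = None,
        headers: Optional[Dict[str, str]] = None,
    ) -> None:
        # Accepting `body` keeps the mock call-compatible with the real connection.
        self.calls.append((method, url, body, dict(headers or {})))


conn = MockHTTPConnection()
conn.request(
    "POST",
    "/prompts",  # placeholder path, not the real route
    body=b'{"prompt_id": "greeting"}',
    headers={"Content-Type": "application/json"},
)
```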

fix(llmobs): use filter[ml_app] query param to match API expectation [MLOB-7524]

4bbd190

Handler expects filter[ml_app], SDK was sending ml_app.
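
The parameter rename can be illustrated with the standard library. `list_prompts_query` is a hypothetical helper, and whether the SDK percent-encodes the brackets the way `urlencode` does is an assumption.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode


def list_prompts_query(ml_app: Optional[str] = None) -> str:
    """Build the query string for listing prompts, filtered by ml_app if given."""
    if ml_app is None:
        return ""
    # The key is the literal string "filter[ml_app]", not a nested structure.
    return "?" + urlencode({"filter[ml_app]": ml_app})


query = list_prompts_query("my-app")
print(query)  # ?filter%5Bml_app%5D=my-app

# Decoding on the server side recovers the bracketed key.
print(parse_qs(query.lstrip("?")))  # {'filter[ml_app]': ['my-app']}
```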

fix(llmobs): remove data envelope unwrapping from list endpoints [MLOB-7524]

17efe7f

Plain JSON API returns bare arrays, not JSONAPI {"data": [...]} wrappers.
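
A rough sketch of what parsing a bare-array response looks like after this change. `parse_list_response` is a hypothetical helper, and raising `TypeError` on an unexpected shape is an assumption about how one might guard the parse, not the SDK's behavior.

```python
import json
from typing import Any, Dict, List


def parse_list_response(raw: bytes) -> List[Dict[str, Any]]:
    """Parse a list endpoint body that is a bare JSON array."""
    parsed = json.loads(raw)
    if not isinstance(parsed, list):
        # A {"data": [...]} envelope would land here instead of being unwrapped.
        raise TypeError(f"expected a JSON array, got {type(parsed).__name__}")
    return parsed


prompts = parse_list_response(b'[{"prompt_id": "greeting"}]')
print([p["prompt_id"] for p in prompts])  # ['greeting']
```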

PROFeNoM force-pushed the alex/MLOB-7524_prompt-crud-api branch from cfd1734 to 17efe7f May 21, 2026 16:45

PROFeNoM wants to merge 10 commits into main from alex/MLOB-7524_prompt-crud-api
