Conversation


@XG-xin XG-xin commented Dec 5, 2025

Motivation

Move reasoning tokens from metadata to metrics in the Node.js OpenAI tests. Some older dd-trace versions need to be skipped via configuration.

Changes

Workflow

  1. ⚠️ Create your PR as a draft ⚠️
  2. Work on your PR until the CI passes
  3. Mark it as ready for review
    • Test logic is modified? -> Get a review from the RFC owner.
    • Framework is modified, or non-obvious usage of it? -> Get a review from the R&P team.

🚀 Once your PR is reviewed and the CI is green, you can merge it!

🛟 #apm-shared-testing 🛟

Reviewer checklist

  • If PR title starts with [<language>], double-check that only <language> is impacted by the change
  • No system-tests internals are modified. Otherwise, I have approval from the R&P team
  • A docker base image is modified?
    • the relevant build-XXX-image label is present
  • A scenario is added (or removed)?


github-actions bot commented Dec 5, 2025

CODEOWNERS have been resolved as:

tests/integration_frameworks/llm/openai/test_openai_llmobs.py           @DataDog/ml-observability

@XG-xin XG-xin marked this pull request as ready for review December 9, 2025 19:20
@XG-xin XG-xin requested a review from a team as a code owner December 9, 2025 19:20
@sabrenner (Contributor) commented:

I think what we can do to best avoid issues with updating the repos is:

  1. Split out the Responses tests into their own test class in test_openai_llmobs.py, i.e.
@features.llm_observability_openai_llm_interactions
@scenarios.integration_frameworks
class TestOpenAiResponses(BaseOpenaiTest):

then, in each of the nodejs.yml and python.yml manifests, we can mark those test classes as irrelevant:

test_openai_llmobs.py:
  TestOpenAiEmbeddingInteractions: *ref_5_80_0
  TestOpenAiLlmInteractions: *ref_5_80_0
  TestOpenAiPromptTracking: missing_feature
  TestOpenAiResponses: irrelevant

and similarly for python.yml. We can update these once the features land in a given release.
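
For illustration, a minimal sketch of what the split-out class could look like in test_openai_llmobs.py. It reuses the decorators and base class shown above; the method name test_responses_llm_span and its body are placeholders, not the real assertions:

@features.llm_observability_openai_llm_interactions
@scenarios.integration_frameworks
class TestOpenAiResponses(BaseOpenaiTest):
    # Responses-API tests live in their own class so the nodejs.yml and
    # python.yml manifests can mark the whole class irrelevant until the
    # feature ships in a tracer release.
    def test_responses_llm_span(self):
        # placeholder: move the existing Responses assertions here unchanged
        ...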

This will resolve the Responses tests. For the other tests, where we assert that reasoning tokens are present but 0, we can do:

assert_llmobs_span_event(
  ...
  metrics=mock.ANY
)

# assert input, output, total, and maybe cached tokens separately
assert llm_span_event["metrics"]["input_tokens"] == ...

to make the tests version-independent and more resilient to future metrics, since we only want to assert specific tokens in those tests. Let me know if this makes/doesn't make sense!
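
To make the suggestion concrete, a hedged sketch of that pattern; the positional argument to assert_llmobs_span_event and the expected_* values are placeholders for whatever the existing tests pass:

from unittest import mock

assert_llmobs_span_event(
    span,              # placeholder for the test's existing positional arguments
    metrics=mock.ANY,  # accept any metrics dict instead of pinning its exact shape
)

# Then assert only the token counts this test cares about, so metrics added in
# newer tracer versions (e.g. reasoning or cached tokens) don't break it.
assert llm_span_event["metrics"]["input_tokens"] == expected_input_tokens
assert llm_span_event["metrics"]["output_tokens"] == expected_output_tokens
assert llm_span_event["metrics"]["total_tokens"] == expected_input_tokens + expected_output_tokens

The key point is that mock.ANY keeps the top-level metrics comparison version-independent, while the explicit per-key asserts still pin the values the test actually verifies.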

