@lrafeei lrafeei commented Jan 3, 2026

This PR adds support and testing for the following:

  • Use of the sentinel span for trace/span mapping
  • MessageTraces and MessageTransactions (PRODUCER/CONSUMER kind)

@lrafeei lrafeei changed the title from "Hybrid agent message queue traces" to "Hybrid agent Message Queue Traces and Transactions" Jan 3, 2026
github-actions bot commented Jan 3, 2026

MegaLinter analysis: Success

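| Descriptor | Linter | Files | Fixed | Errors | Warnings | Elapsed time |
|---|---|---|---|---|---|---|
| ✅ ACTION | actionlint | 7 | | 0 | 0 | 0.92s |
| ✅ MARKDOWN | markdownlint | 7 | 0 | 0 | 0 | 1.39s |
| ✅ PYTHON | ruff | 997 | 26 | 0 | 0 | 1.08s |
| ✅ PYTHON | ruff-format | 997 | 31 | 0 | 0 | 0.39s |
| ✅ YAML | prettier | 15 | 0 | 0 | 0 | 1.61s |
| ✅ YAML | v8r | 15 | | 0 | 0 | 4.81s |
| ✅ YAML | yamllint | 15 | | 0 | 0 | 0.7s |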

See detailed reports in MegaLinter artifacts

MegaLinter is graciously provided by OX Security

@mergify mergify bot added the tests-failing ("Tests failing in CI.") label Jan 3, 2026
codecov-commenter commented Jan 3, 2026

Codecov Report

❌ Patch coverage is 68.29268% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.14%. Comparing base (d71dad1) to head (449a712).
⚠️ Report is 23 commits behind head on develop-hybrid-core-tracing.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| newrelic/api/opentelemetry.py | 68.29% | 10 Missing and 3 partials ⚠️ |
Additional details and impacted files
@@                       Coverage Diff                       @@
##           develop-hybrid-core-tracing    #1619      +/-   ##
===============================================================
+ Coverage                        79.93%   80.14%   +0.21%     
===============================================================
  Files                              213      213              
  Lines                            25168    25239      +71     
  Branches                          4001     4023      +22     
===============================================================
+ Hits                             20117    20229     +112     
+ Misses                            3622     3579      -43     
- Partials                          1429     1431       +2     

@hmstepanek hmstepanek marked this pull request as ready for review January 5, 2026 17:25
@hmstepanek hmstepanek requested a review from a team as a code owner January 5, 2026 17:25
Contributor

@hmstepanek hmstepanek left a comment

In general, the implementation looks good. The main thing on my mind for this PR is how the consumer and producer parts of the agent spec compare with the tests and some of the comments here: they seem to be a bit at odds with each other, but maybe I'm just missing something. I vaguely remember our behavior differing from that of some of the other agent teams with regard to when we create a trace vs. a transaction for message brokers, so that might be where my confusion stems from.

with BackgroundTask(application, name="Foo"):
    with tracer.start_as_current_span(name="Bar", kind=otel_api_trace.SpanKind.INTERNAL):
        try:
            with pytest.raises(Exception):
Contributor

Suggested change
- with pytest.raises(Exception):
+ with pytest.raises(Exception, match=r"Test exception message"):
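
For context on the suggestion above: pytest documents `match` as a regular expression that is checked with `re.search` against the string form of the raised exception. The sketch below is a stdlib-only stand-in, not pytest's implementation; the helper `raised_and_matched` and the two failing functions are hypothetical. It shows why a bare `Exception` check also passes for unrelated failures, while the `match` form does not.

```python
import re


def raised_and_matched(func, exc_type, pattern=None):
    """Hypothetical stand-in for pytest.raises(exc_type, match=pattern).

    Returns True when func raises exc_type and, if a pattern is given,
    re.search finds that pattern in str(exception).
    """
    try:
        func()
    except exc_type as exc:
        return pattern is None or re.search(pattern, str(exc)) is not None
    return False


def expected_failure():
    raise Exception("Test exception message")


def unrelated_failure():
    raise Exception("connection refused")


# Without a pattern, any Exception satisfies the check.
broad_accepts_unrelated = raised_and_matched(unrelated_failure, Exception)

# With a pattern, only the intended failure satisfies it.
narrow_accepts_expected = raised_and_matched(
    expected_failure, Exception, r"Test exception message"
)
narrow_accepts_unrelated = raised_and_matched(
    unrelated_failure, Exception, r"Test exception message"
)
```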

scoped_metrics=[(f"Function/{transaction_name}", 1)],
rollup_metrics=_test_application_rollup_metrics + [(f"Function/{transaction_name}", 1)],
scoped_metrics=[(f"Function/{transaction_name} http send", 2)],
rollup_metrics=_test_application_rollup_metrics + [(f"Function/{transaction_name} http send", 2)],
Contributor

Fixup for megalinter:

Suggested change
- rollup_metrics=_test_application_rollup_metrics + [(f"Function/{transaction_name} http send", 2)],
+ rollup_metrics=[(f"Function/{transaction_name} http send", 2), *_test_application_rollup_metrics],

# While this OTel span exists it will not be explicitly
# translated to a NR trace. This may occur during the
# creation of a Transaction, which will create the root
# span. This may also occur during special cases, such
Contributor

Question about this: the spec says the following:

  • Creating a span with a span kind of SpanKind.CONSUMER will start an OtherTransaction/* transaction
    and create a corresponding segment.
  • If the CONSUMER span occurs within an already existing NR transaction, it will be included in
    the transaction trace.

Wouldn't the above mean that an NR trace should be created on each of multiple calls to the consumer?

Contributor Author

Yep! For the rest of the message queues that is what we do. For Kafka, we ended up treating the consumer as a transaction and the producer as a trace. But yes, I have another PR for the rest of the queues that tweaks this logic.
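
To make the two CONSUMER bullets quoted above concrete, here is a minimal model of them. It is only a sketch: `TransactionSketch` and `start_consumer_span` are hypothetical names, the `OtherTransaction/<span name>` format is a placeholder for the spec's `OtherTransaction/*`, and none of this is the agent's actual code or the Kafka-specific handling mentioned in the reply.

```python
class TransactionSketch:
    """Hypothetical stand-in for an NR transaction; records segment names."""

    def __init__(self, name):
        self.name = name
        self.segments = []


def start_consumer_span(span_name, current_transaction=None):
    """Apply the two CONSUMER rules from the quoted spec text."""
    if current_transaction is None:
        # Rule 1: no transaction exists, so start an OtherTransaction/* one.
        current_transaction = TransactionSketch(f"OtherTransaction/{span_name}")
    # Both rules: the span is recorded as a segment in the transaction trace.
    current_transaction.segments.append(span_name)
    return current_transaction


# The first consumer call starts the transaction; later calls made while it
# is still active add one segment each instead of starting a new transaction.
txn = start_consumer_span("poll")
start_consumer_span("poll", txn)
start_consumer_span("poll", txn)
```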

if not nr_trace or (nr_trace and getattr(nr_trace, "end_time", None)):
# Check to see if New Relic trace ever existed
# or, if it does, that trace has already ended
if getattr(self.nr_trace, "end_time", None):
Contributor

This isn't quite the same behavior as before. Previously, if self.nr_trace was None, it would also return. Do we still want to match the previous behavior?

Suggested change
- if getattr(self.nr_trace, "end_time", None):
+ if not self.nr_trace or getattr(self.nr_trace, "end_time", None):

Contributor Author

Oof, not sure how that piece of the logic got removed--that should have stayed in there...
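
The difference between the two conditions can be checked in isolation. The sketch below uses `SimpleNamespace` objects as stand-ins for New Relic trace objects and two hypothetical helper functions that wrap the conditions from the diff; it assumes a live trace object is truthy and that `end_time` is unset (`None`) until the trace ends.

```python
from types import SimpleNamespace


def skips_with_new_condition(nr_trace):
    """Condition as written in the PR before the review fix."""
    return bool(getattr(nr_trace, "end_time", None))


def skips_with_suggested_condition(nr_trace):
    """Condition from the suggested change; also covers a missing trace."""
    return bool(not nr_trace or getattr(nr_trace, "end_time", None))


missing_trace = None
active_trace = SimpleNamespace(end_time=None)   # stand-in: trace still running
ended_trace = SimpleNamespace(end_time=12.5)    # stand-in: trace already ended

results = {
    label: (skips_with_new_condition(trace), skips_with_suggested_condition(trace))
    for label, trace in [
        ("missing", missing_trace),
        ("active", active_trace),
        ("ended", ended_trace),
    ]
}
```

Only the missing-trace case differs: `getattr(None, "end_time", None)` evaluates to `None`, so the shorter condition does not return early when no trace exists.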

# Message specific attributes
if self.attributes.get("messaging.system"):
destination_name = self.attributes.get("messaging.destination")
self.nr_transaction.destination_name = destination_name
Contributor

Suggested change
self.nr_transaction.destination_name = destination_name

Contributor Author

Oh, I remember why I stored it as a separate variable instead of directly putting it in as a transaction attribute: I also use that name in the Kafka Nodes metric a few lines below.

Contributor

Oh--oops missed that. Ok fine as is then!

# NOTE: Sentinel, MessageTrace, DatastoreTrace, and
# ExternalTrace types do not have a name attribute.
self._name = name
if hasattr(self, "nr_trace") and hasattr(self.nr_trace, "name"):
Contributor

It looks like we are missing coverage for this function.
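
One way the missing coverage could be approached is to exercise the guard with a trace that has a `name` attribute, a trace that lacks one, and no trace at all. The sketch below is a self-contained model, not the agent's code: `SpanSketch` and `update_name` are hypothetical, and the assignment inside the guard is an assumption, since the diff excerpt ends at the condition.

```python
from types import SimpleNamespace


class SpanSketch:
    """Hypothetical stand-in for the span wrapper under review."""

    def __init__(self, nr_trace=None):
        self._name = None
        if nr_trace is not None:
            self.nr_trace = nr_trace

    def update_name(self, name):
        self._name = name
        # Guard quoted in the diff: only some trace types carry a name.
        if hasattr(self, "nr_trace") and hasattr(self.nr_trace, "name"):
            # Assumed body; the excerpt does not show what follows the guard.
            self.nr_trace.name = name


named_trace = SimpleNamespace(name="old")
nameless_trace = SimpleNamespace()  # stand-in for a trace type without a name

spans = [SpanSketch(named_trace), SpanSketch(nameless_trace), SpanSketch()]
for span in spans:
    span.update_name("new")
```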

if not nr_trace or (nr_trace and getattr(nr_trace, "end_time", None)):
# Check to see if New Relic trace ever existed
# or, if it does, that trace has already ended
if getattr(self.nr_trace, "end_time", None):
Contributor

It looks like we are missing coverage in the tests for this case.

send_producer_message()

test()

Contributor

Should we have a test here to validate that neither a transaction nor a trace is created for a producer when the span exists outside a transaction?

  • Creating a span with a span kind of SpanKind.PRODUCER will not start a NR transaction.
  • If the PRODUCER span occurs within an already existing NR transaction, a segment will
    be created and it will be included in the transaction trace.

Contributor Author

That's fair. I'll add a test for the first scenario; for the second, I'll use an existing consumer test to show that this is the case.
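
The two PRODUCER scenarios discussed here can be written down as checks against a toy recorder. This is a sketch under stated assumptions: `AgentSketch` and its methods are hypothetical and only model the two spec bullets quoted above; the real tests would use the agent's own fixtures and validators, which this thread does not show.

```python
class AgentSketch:
    """Hypothetical recorder of what a hybrid agent would create."""

    def __init__(self):
        self.transactions = []
        self.current_transaction = None

    def start_transaction(self, name):
        self.current_transaction = {"name": name, "segments": []}
        self.transactions.append(self.current_transaction)

    def end_transaction(self):
        self.current_transaction = None

    def producer_span(self, name):
        # A PRODUCER span never starts a transaction; it becomes a segment
        # only when a transaction is already active.
        if self.current_transaction is not None:
            self.current_transaction["segments"].append(name)


# Scenario 1: producer span outside any transaction creates nothing.
outside = AgentSketch()
outside.producer_span("send")

# Scenario 2: producer span inside a transaction adds a segment to it.
inside = AgentSketch()
inside.start_transaction("OtherTransaction/consume")
inside.producer_span("send")
inside.end_transaction()
```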

@lrafeei lrafeei requested a review from hmstepanek January 5, 2026 21:02
@lrafeei lrafeei force-pushed the hybrid-agent-message-queue-traces branch from 207bf9c to c30a0e4 Compare January 5, 2026 21:21
@hmstepanek hmstepanek merged commit b3dc905 into develop-hybrid-core-tracing Jan 6, 2026
6 checks passed
@hmstepanek hmstepanek deleted the hybrid-agent-message-queue-traces branch January 6, 2026 18:10
@mergify mergify bot removed the tests-failing ("Tests failing in CI.") label Jan 6, 2026