-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SmallRye Fault Tolerance: add support for OpenTelemetry Metrics #46636
base: main
Are you sure you want to change the base?
Conversation
Hi @brunobat, this is what we discussed earlier today. The failing TCK test is a TCK bug; I submitted a PR (microprofile/microprofile-fault-tolerance#655), but until that is merged and released, I've kept an exclusion. EDIT: I also filed a TCK challenge: microprofile/microprofile-fault-tolerance#656 |
This comment has been minimized.
This comment has been minimized.
Hmm, the failing tests might be relevant. I'll take a look. |
This requires propagating the "metrics enabled at build time" information through the `OpenTelemetrySdkBuildItem`. I took the liberty to also add the tracing and logging build time enablement information, even though I have no use for it at the moment. One TCK test remains excluded, because it has a bug [1][2]. [1] microprofile/microprofile-fault-tolerance#656 [2] microprofile/microprofile-fault-tolerance#655
bff6ca9
to
49fa4b9
Compare
Hi @brunobat, I had to add some "build time enabled" information to the |
Status for workflow
|
Status | Name | Step | Failures | Logs | Raw logs | Build scan |
---|---|---|---|---|---|---|
✔️ | JVM Tests - JDK 17 | Logs | Raw logs | 🔍 | ||
✔️ | JVM Tests - JDK 21 | Logs | Raw logs | 🔍 | ||
✖ | JVM Tests - JDK 17 Windows | Build |
Failures | Logs | Raw logs | 🔍 |
Full information is available in the Build summary check run.
You can consult the Develocity build scans.
Failures
⚙️ JVM Tests - JDK 17 Windows #
- Failing: extensions/vertx-http/deployment
! Skipped: devtools/cli extensions/agroal/deployment extensions/amazon-lambda-http/deployment and 146 more
📦 extensions/vertx-http/deployment
✖ io.quarkus.vertx.http.proxy.TrustedForwarderDnsResolveTest.testTrustedProxyResolved
- History - More details - Source on GitHub
java.io.IOException:
Error while binding on /127.0.0.1:53530
original message : Address already in use: bind
at org.apache.mina.transport.socket.nio.NioDatagramAcceptor.open(NioDatagramAcceptor.java:708)
at org.apache.mina.transport.socket.nio.NioDatagramAcceptor.registerHandles(NioDatagramAcceptor.java:228)
at org.apache.mina.transport.socket.nio.NioDatagramAcceptor.access$300(NioDatagramAcceptor.java:68)
at org.apache.mina.transport.socket.nio.NioDatagramAcceptor$Acceptor.run(NioDatagramAcceptor.java:164)
at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the additional methods in the OpenTelemetrySdkBuildItem
.
I left a comment related with the runtime behaviour.
metricsProviders++; | ||
} | ||
if (openTelemetrySdk.map(OpenTelemetrySdkBuildItem::isMetricsBuildTimeEnabled).orElse(false)) { | ||
builder.addBeanClass("io.smallrye.faulttolerance.metrics.OpenTelemetryProvider"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will only be affected by the build time configuration. We are still going to instrument if the OTel SDK is disabled at runtime...
I suggest you skip the processFaultToleranceAnnotations()
method (just return or skip all) in case OpenTelemetry is used and if disabled at runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot really do that, because I try to do a lot of things at build time and there's no runtime config at that time.
What I could do is read the runtime config and remove the OTEL metrics provider if runtime-disabled. I'm not yet sure what the best way would be -- clearly the RuntimeValue
is a Quarkus-specific concept and so it cannot be in SmallRye Fault Tolerance. On the other hand, SmallRye Fault Tolerance has configuration for disabling the specific metric providers at runtime, so there might be a way to translate the OTEL config to the SRye FT config, somehow. Not yet sure how.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, that build item is only produced at runtime, even if it's published with build time data... Not sure if this will work properly and if we need different build items for OTel build and run times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Build items are never produced at runtime. I mean, at runtime, the concept of build items and build steps doesn't even exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, the config values the build item holds might come from runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the Otel Provider be disabled?
See: https://github.com/smallrye/smallrye-fault-tolerance/blob/8da7dae36f195608291687793ca9f12fd9567d93/implementation/fault-tolerance/src/main/java/io/smallrye/faulttolerance/metrics/OpenTelemetryProvider.java#L27
The current code doesn't support that, but with some changes...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's the configuration I mentioned above. There might be a way, I just didn't have the time to explore it yet.
One TCK test remains excluded, because it has a bug [1].
[1] microprofile/microprofile-fault-tolerance#655