Remove service tag and replace with a similar concept on Scope. #131

tylerbenson · 2018-10-10T07:52:27Z

Background

A while ago, we added service as a tag (#119). This is overly simplistic and has high probability for user confusion.

Problem

The problem is we scoped it as a tag specifically for a narrow use case, but as implemented, it is trivial for users to use it in ways that can fundamentally break common tracer logic.

Proposal

I propose we roll back/remove service as a tag, and move the ability to override the service name to the Scope.

Perhaps something like this (in java parlance):

Span span = tracer.buildSpan("stuff").start();
try (Scope scope = tracer.activate(span)) {
  try (Scope newScope = scope.withServiceName("foo")) {
    // newScope is now active scope and service name is overridden for any new children spans.
  }
  // new child spans without service name overridden here.
}

This also opens the possibility of additional Scope scoped tags or other things.

Consideration

I don't believe any releases have been published with the new service tag included, so at a minimum, we should at least revert #119 and opentracing/opentracing-java#287 before they are released to prevent the need for a longer tail deprecation.

The text was updated successfully, but these errors were encountered:

codefromthecrypt · 2018-10-10T08:21:43Z

Thanks @tylerbenson. I agree the tag convention should be reverted.

I like that this won't accidentally result in a tag "service" which would be hard to grok whether it was an accident of legacy instrumentation or a new thing. It also addresses the first-class nature of service in distributed tracing. Finally, it deals with scoping properly.

One thing I would suggest is keeping this feature on the tracer as opposed to overloading the scope with more methods.

Ex

Span span = tracer.buildSpan("stuff").start();
try (Scope scope = tracer.activateForService(span, "who-I-proxy-for")) {
   // newScope is now active scope and service name is overridden for any new children spans.
}

note the span itself here should be only assigned to one service, so this is not only the children but also the activated one itself. There might be some nuance to how this plays out.

Note: the above allows you to switch services multiple times: nested scopes. If that's a feature it should end up in the tests.

An alternative would be to not allow nesting and do something similar to baggage which is write-only.
For example, adding a serviceName method to the span builder.

Ex

Span span = tracer.buildSpan("stuff").serviceName( "who-I-proxy-for").start();
try (Scope scope = tracer.activate(span)) {
   // newScope is now active scope and service name is overridden for any new children spans.
}

The pro of this is that it allocates the span from the beginning as that service. Also, it can't accidentally end up ambiguously as a "service" tag in old instrumentation. OTOH, this has the same implementation gotchas as baggage (Except the remote propagation part), even if it is only scoping a single property (service). That said, some tracers may already have a .serviceName method on their span which lets this drop in.

food for thought.. regardless, this should be tested in practice against something that proxies, and directly linked.

Test cases should include at least:

dispatch to multiple handlers based on incoming route (and those being different services)
multiple intermediate aka local spans
integration with at least one "real" implementation, to show service graph aggregation isn't hosed easily

yurishkuro · 2018-10-10T14:03:33Z

+1 to have this integrated with span creation rather than scope.

But the description could use better justification. The only one I heard is old instrumentation already using "service" tag for different purposes. I think it's a good enough reason (I wasn't a fan of the tag in the first place), but are there other considerations?

tylerbenson · 2018-10-10T21:37:15Z

@yurishkuro to be clear, you would prefer @adriancole's second suggestion, to see this associated with the span builder and have it propagated more like baggage?

yurishkuro · 2018-10-11T00:19:41Z

I was only referring to the builder part. Auto-propagating sounds like assuming too much, but I can see the rationale for that. Perhaps it also needs to be explicit.

tylerbenson · 2018-10-11T02:23:49Z

I don't think anyone is suggesting for it to propagate across processes, just referring to how baggage is shared between spans.

codefromthecrypt · 2018-10-12T01:15:00Z

In looking at prior art (like finagle) and also offline chat with @CodingFabian also thinking about this..

I think the simplest design is requiring the tracer to inherit parent-child any time it is set. This can't be assumed in tags, not safely IMHO as no other data in tags act like that. I think the span builder is the best way to achieve this in the OT api, possibly named "spanBuilder.serviceName"

For folks who are already using the "service" tag convention (ex @CodingFabian or @jeffbakerexpedia), we can treat it like any other data which is sortof "old school". For example, when tracers upgrade, if they see a tag named "service" they can try to re-forward it to the correct api. This would work at builder time for example, even if be a bit dodgy later.

Aside: to underpin this, a span.serviceName() or similar method would be needed in brave to properly inherit. We were to do this anyway, just waiting for rule of three anyway. For example, we knew camel use the forwarding stuff and currently hack by using different tracers. It needed to be a repeating pattern before escalating to an api. I can manage adding this there.. cc @oscerd

tedsuo · 2018-10-12T20:26:34Z

I like the idea of service being a more concrete concept in the spec.

One question on scope: the service tag was added as an optional way to implement a limited set of behavior - out of band reporting on multiple services. But the "service scope" comes up more broadly in tracing than just this usecase:

the service namespace is something users really care about. Every tracing system has a concept of "service name" and sometimes "service ID" which is configured in tracer startup.
service-level tags (tags that apply to every span emitted from a tracer instance) are useful, both internally as an index and externally as context available to other subsystems.

How do others feel about these broader usecases? If we're going to add serviceName as an explicit concept to a span, I feel like we should consider the implications beyond just out-of-band reporting.

LakerBaker · 2018-11-28T00:24:35Z

I am also +1 on doing this at span creation rather than scope. We have an internal use case at Expedia in which our build system is using Expedia's Haystack tracing system to visualize our CICD pipeline; we currently do this with the tag-based override but will change our code to support the new spec when it changes.

tylerbenson · 2018-11-28T01:30:09Z

@tedsuo What should be our path forward here?

mchandramouli · 2018-12-03T21:30:20Z

I agree with @adriancole to have serviceName as a first class citizen and not as a tag. Whether we call it as serviceName or applicationName to be generic, I prefer an API like what Adrian proposed above

Span span = tracer.buildSpan("stuff").serviceName( "who-I-proxy-for").start();

This forces serviceName to be a concrete member of the data model and not as a tag.

mchandramouli · 2018-12-03T21:32:57Z

On a side note, I would say the same for remoteServiceName instead of or in addition to peer.service tag. Though this need not be mandatory, having it helps with building better service graphs. There is a related discussion in Haystack ExpediaDotCom/haystack#537

yurishkuro · 2018-12-03T22:15:37Z

Service name is insufficient. Jaeger uses Process (instance, resource, workload are the other possible names), which contains a service name and a set of tags that contain additional metadata about the workload (host name, IP, port, zone, environment, code version, etc.).

It is different from remote service (peer.service) because the current workload may not know much metadata about the peer service, and the tag itself is generally not that critical.

yurishkuro · 2018-12-03T22:17:40Z

I think the simplest design is requiring the tracer to inherit parent-child any time it is set. This can't be assumed in tags, not safely IMHO as no other data in tags act like that. I think the span builder is the best way to achieve this in the OT api, possibly named "spanBuilder.serviceName"

+1 to inheritance & builder method (but replace "name" with a more complex object).

tylerbenson · 2018-12-06T00:25:28Z

I submitted a couple PRs (see above) to try to revert this mess I made. I suggest we get these merged and proceed forward with a separate PR to add service name as a first class API.

yurishkuro · 2018-12-06T00:30:46Z

@tylerbenson I would prefer to make changes backwards compatible, i.e. deprecate rather than remove.

tylerbenson · 2018-12-06T16:24:40Z

@yurishkuro I don't think it needs to be deprecated in the java repo since it was added since the last release, so technically not a breaking change there. I would consider changing it to deprecated in the spec though with the assumption that means it probably shouldn't be added to language implementations if not already there.

yurishkuro · 2018-12-06T16:29:29Z

ah, if it never made it to a public release, then sgtm

objectiser mentioned this issue Oct 15, 2018

Support "service" tag jaegertracing/jaeger#936

Closed

bulicekj mentioned this issue Nov 27, 2018

Display serviceName based on service tag, if available ExpediaDotCom/haystack-ui#435

Merged

LakerBaker mentioned this issue Nov 29, 2018

Emit metric for excessive operation names instead of rejecting the span ExpediaDotCom/haystack-collector#62

Closed

This was referenced Dec 5, 2018

Revert PR 287 which added service tag opentracing/opentracing-java#322

Merged

Revert "Add tag for service" #135

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove service tag and replace with a similar concept on Scope. #131

Remove service tag and replace with a similar concept on Scope. #131

tylerbenson commented Oct 10, 2018

codefromthecrypt commented Oct 10, 2018

yurishkuro commented Oct 10, 2018

tylerbenson commented Oct 10, 2018

yurishkuro commented Oct 11, 2018

tylerbenson commented Oct 11, 2018

codefromthecrypt commented Oct 12, 2018

tedsuo commented Oct 12, 2018 •

edited

Loading

LakerBaker commented Nov 28, 2018

tylerbenson commented Nov 28, 2018

mchandramouli commented Dec 3, 2018

mchandramouli commented Dec 3, 2018

yurishkuro commented Dec 3, 2018

yurishkuro commented Dec 3, 2018

tylerbenson commented Dec 6, 2018

yurishkuro commented Dec 6, 2018

tylerbenson commented Dec 6, 2018

yurishkuro commented Dec 6, 2018

Remove service tag and replace with a similar concept on Scope. #131

Remove service tag and replace with a similar concept on Scope. #131

Comments

tylerbenson commented Oct 10, 2018

Background

Problem

Proposal

Consideration

codefromthecrypt commented Oct 10, 2018

yurishkuro commented Oct 10, 2018

tylerbenson commented Oct 10, 2018

yurishkuro commented Oct 11, 2018

tylerbenson commented Oct 11, 2018

codefromthecrypt commented Oct 12, 2018

tedsuo commented Oct 12, 2018 • edited Loading

LakerBaker commented Nov 28, 2018

tylerbenson commented Nov 28, 2018

mchandramouli commented Dec 3, 2018

mchandramouli commented Dec 3, 2018

yurishkuro commented Dec 3, 2018

yurishkuro commented Dec 3, 2018

tylerbenson commented Dec 6, 2018

yurishkuro commented Dec 6, 2018

tylerbenson commented Dec 6, 2018

yurishkuro commented Dec 6, 2018

tedsuo commented Oct 12, 2018 •

edited

Loading