Concurrency: LangfuseTracer is not thread-safe, causing mixed traces in async environments #2140

@LastRemote

[P.S. This is auto-generated by Cursor but I think it is pretty well written]

Description

There is a concurrency issue in the LangfuseTracer implementation in haystack-integrations. The tracer uses a shared instance-level list (self._context) to manage the span stack. In an asynchronous environment, such as a FastAPI application handling multiple requests concurrently, every request pushes to and pops from this one list, and the interleaving of those operations produces a race condition.

As a result, spans from different, independent pipeline runs can be incorrectly nested or intertwined within the same trace in Langfuse. This corrupts the tracing data, making it unreliable for debugging and monitoring.

Affected Component

haystack_integrations.tracing.langfuse.tracer.LangfuseTracer

Steps to Reproduce

  1. Initialize a Haystack pipeline with LangfuseTracer enabled.
  2. Expose this pipeline through an async web framework like FastAPI.
  3. Send multiple requests to the pipeline endpoint concurrently.
  4. Observe the traces in the Langfuse UI. You will notice that spans from different requests are mixed together under a single trace instead of being isolated in their own respective traces.

Root Cause

The LangfuseTracer instance maintains its span stack in an instance variable:

class LangfuseTracer(Tracer):
    def __init__(...):
        # ...
        self._context: List[LangfuseSpan] = []
        # ...

    def trace(...):
        # ...
        self._context.append(span)
        # ...
        try:
            yield span
        finally:
            # ...
            if self._context and self._context[-1] == span:
                self._context.pop()

When a single LangfuseTracer instance is used for the entire application (which is standard practice), this _context list is shared across all concurrent requests, leading to the race condition.
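The interleaving can be reproduced without Haystack or Langfuse. The following is a minimal sketch, not the real tracer: SharedStackTracer and handle_request are hypothetical stand-ins, spans are plain strings, and asyncio.sleep(0) stands in for any awaited I/O inside a pipeline run.

```python
import asyncio
from typing import Dict, List, Optional


class SharedStackTracer:
    """Hypothetical stand-in for a tracer that keeps one span stack per instance."""

    def __init__(self) -> None:
        self._context: List[str] = []

    def current_span(self) -> Optional[str]:
        return self._context[-1] if self._context else None


async def handle_request(
    tracer: SharedStackTracer, name: str, observed: Dict[str, Optional[str]]
) -> None:
    span = f"{name}-root"
    tracer._context.append(span)
    # Yield to the event loop, as any awaited I/O would.
    await asyncio.sleep(0)
    # The other request may have pushed its own span onto the same list by now.
    observed[name] = tracer.current_span()
    tracer._context.remove(span)


async def main() -> Dict[str, Optional[str]]:
    tracer = SharedStackTracer()  # one tracer for the whole application
    observed: Dict[str, Optional[str]] = {}
    await asyncio.gather(
        handle_request(tracer, "A", observed),
        handle_request(tracer, "B", observed),
    )
    return observed


observed = asyncio.run(main())
print(observed)
```

In this sketch, request A sees request B's span as the current span, which is the same kind of cross-request nesting described above.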

Proposed Solution (Recommended)

The standard Python solution for handling context-local state in asynchronous applications is to use contextvars. Each asyncio task runs in its own copy of the current context, so replacing the instance-level list with a ContextVar isolates the span stack for each concurrent execution context (e.g., each incoming request).

The fix involves these changes:

  1. Introduce a ContextVar for the span stack:

    # Add this at the module level. The default is None rather than [],
    # because a default list would be one object shared by every context.
    span_context_var: ContextVar[Optional[List["LangfuseSpan"]]] = ContextVar("span_context", default=None)
  2. Modify LangfuseTracer to use the ContextVar:

    class LangfuseTracer(Tracer):
        def __init__(self, ...):
            # self._context is removed
            # ...

        @contextlib.contextmanager
        def trace(self, ...):
            # ...
            # Store a new list instead of appending in place, so that a list
            # inherited from a parent context is never mutated
            current_context = span_context_var.get() or []
            span_context_var.set([*current_context, span])

            try:
                yield span
            finally:
                # ...
                # Pop from the context-local stack, again without mutating in place
                current_context = span_context_var.get() or []
                if current_context and current_context[-1] == span:
                    span_context_var.set(current_context[:-1])
                # ...

        def current_span(self) -> Optional[Span]:
            # Read from the context-local stack
            span_context = span_context_var.get()
            return span_context[-1] if span_context else None
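One detail is worth checking in any ContextVar-based span stack: a ContextVar isolates the reference it stores, not the object behind it. The sketch below is an illustration under assumptions, not the tracer's code. All names (leaky_stack, safe_stack, push, pop, run_pair) are hypothetical and spans are plain strings. It contrasts a mutable default that is appended to in place with an immutable tuple that is replaced on every push.

```python
import asyncio
from contextvars import ContextVar
from typing import Dict, List, Optional, Tuple

# Pitfall: the default list is a single object returned to every context.
leaky_stack: ContextVar[List[str]] = ContextVar("leaky_stack", default=[])

# Safer: no mutable default, and an immutable tuple replaced on each change.
safe_stack: ContextVar[Optional[Tuple[str, ...]]] = ContextVar("safe_stack", default=None)


def push(span: str) -> None:
    safe_stack.set((safe_stack.get() or ()) + (span,))


def pop(span: str) -> None:
    stack = safe_stack.get() or ()
    if stack and stack[-1] == span:
        safe_stack.set(stack[:-1])


def current_span() -> Optional[str]:
    stack = safe_stack.get()
    return stack[-1] if stack else None


async def leaky_request(name: str, seen: Dict[str, str]) -> None:
    stack = leaky_stack.get()
    stack.append(f"{name}-root")  # mutates the shared default list
    leaky_stack.set(stack)
    await asyncio.sleep(0)  # let the other request run
    seen[name] = leaky_stack.get()[-1]


async def safe_request(name: str, seen: Dict[str, Optional[str]]) -> None:
    push(f"{name}-root")
    await asyncio.sleep(0)  # let the other request run
    seen[name] = current_span()
    pop(f"{name}-root")


async def run_pair(handler) -> dict:
    seen: dict = {}
    # gather() wraps each coroutine in a task with its own copy of the context.
    await asyncio.gather(handler("A", seen), handler("B", seen))
    return seen


leaky_seen = asyncio.run(run_pair(leaky_request))
safe_seen = asyncio.run(run_pair(safe_request))
print("mutable default:", leaky_seen)
print("copy on write:  ", safe_seen)
```

With the mutable default, both tasks append to the same list and request A reads B's span. With the copy-on-write tuple, each task only ever sees its own spans.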

Alternative Solution Considered (Not Recommended)

Another approach considered was to instantiate a new LangfuseTracer for each individual pipeline run (e.g., for each incoming web request).

Drawbacks of this approach:

  • Performance Overhead: The langfuse.Langfuse client is designed to be a long-lived object that batches and sends data efficiently in the background. Creating a new client and tracer for every request is inefficient and introduces unnecessary startup/teardown overhead.
  • Implementation Complexity: It shifts the burden of concurrency management to the application developer. It would require a complex and potentially error-prone mechanism to set and unset Haystack's global tracer for each request safely.
  • Design Concern: This approach is more of a workaround; it treats the symptom rather than fixing the root thread-safety issue within the tracer itself.

Conclusion and Recommendation

While instantiating a tracer per-request would solve the immediate problem, the contextvars solution is strongly recommended. It is more performant, aligns with idiomatic Python concurrency patterns, and correctly encapsulates the thread-safety logic within the LangfuseTracer class where it belongs.

This is a critical fix for anyone using Haystack with Langfuse in a production web server environment.
