Skip to content

Conversation

vblagoje
Copy link
Member

@vblagoje vblagoje commented Sep 9, 2025

Why

This PR migrates the Langfuse integration from Langfuse v2 to v3. Langfuse v3 is built on OpenTelemetry (OTel), which differs from v2's trace creation model. The migration introduces an expected extra root span level due to OTel's requirement that trace attributes must be held by a span, unlike v2's direct trace creation capability.
TLDR - you'll see an extra nested root span.

Proposed Changes

  • Complete migration from Langfuse v2 to v3 API
  • User's will not notice any difference in traces (other than the extra root)

How did you test it?

  • Updated existing unit tests to work with v3 API
  • Adjusted some tests to work with extra root span
  • Manually tested in complex multi-agent itinerary setup. Here is the trace. Note the double top nested root span. Here is how the trace looks like in v2 . Hassieb from Lanfguse told me: "exactly, it is expected that there is one root span under the trace, as OTel does not have a concept of a trace that holds attributes (whereas Langfuse does). So the behavior you are seeing is expected"

Notes for the reviewer

The extra root span level is expected behavior in v3 due to OTel's architecture - it's not a bug but a fundamental difference from v2. Some tests needed to be updated to account for this.

@github-actions github-actions bot added integration:langfuse type:documentation Improvements or additions to documentation labels Sep 9, 2025
@vblagoje vblagoje marked this pull request as ready for review September 10, 2025 09:35
@vblagoje vblagoje requested a review from a team as a code owner September 10, 2025 09:35
@vblagoje vblagoje requested review from julian-risch and removed request for a team September 10, 2025 09:35
@vblagoje
Copy link
Member Author

vblagoje commented Sep 10, 2025

@LastRemote would you please, at your convenience, try out this PR on your async setup we used to previously test #2207 It would be amazing to nail this transition to v3 without regressions and additional cleanups. 🙏

@sjrl
Copy link
Contributor

sjrl commented Sep 11, 2025

@vblagoje just so you are aware I also opened a PR here #2257 that enabled linting on the examples and tests folder. Given that you reworked the langfuse integration in this PR feel free to close mine if it no longer makes sense.

@vblagoje
Copy link
Member Author

@LastRemote did you have a chance to run this branch on your async setup? 🙏

@LastRemote
Copy link
Contributor

@vblagoje Not yet. We had some trouble self-hosting Langfuse v3 so I haven’t started. I am planning to do this at some point this week after my experiment with critic agents.

Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR migrates the Langfuse integration from v2 to v3, which is built on OpenTelemetry (OTel). The migration introduces an expected extra root span level due to OTel's requirement that trace attributes must be held by a span, unlike v2's direct trace creation capability.

Key changes:

  • Updated Langfuse dependency from >=2.9.0, <3.0.0 to >=3.0.0, <4.0.0
  • Refactored span creation to use OTel-based context managers instead of direct trace/span creation
  • Updated tests to accommodate the new API and extra root span behavior

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
pyproject.toml Updates Langfuse dependency to v3
tracer.py Core migration from v2 to v3 API with context manager-based span creation
test_tracer.py Updates unit tests with new mock structure for v3 API
test_tracing.py Updates integration tests to handle extra root span and configurable host
Comments suppressed due to low confidence (1)

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@@ -456,4 +479,4 @@ def get_trace_id(self) -> str:
Return the trace ID.
:return: The trace ID.
"""
return self._tracer.get_trace_id()
return self._tracer.get_current_observation_id()
Copy link
Preview

Copilot AI Sep 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method get_trace_id() should return a trace ID, but it's calling get_current_observation_id() which returns an observation ID. This appears to be incorrect - it should likely call get_current_trace_id() instead.

Suggested change
return self._tracer.get_current_observation_id()
return self._tracer.get_current_trace_id()

Copilot uses AI. Check for mistakes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that there is a name mismatch here that we at least should explain or fix if the intention is to call get_current_trace_id

@@ -119,7 +122,6 @@ def test_tracing_integration(llm_class, env_var, expected_trace, basic_pipeline)
)
@pytest.mark.integration
def test_tracing_with_sub_pipelines():

@component
Copy link
Preview

Copilot AI Sep 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the extra empty line before the @component decorator to maintain consistent spacing.

Copilot uses AI. Check for mistakes.

Copy link
Member

@julian-risch julian-risch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks quite good to me already! Just some redundant code blocks in tests regarding mocking. Biggest question is about calling get_current_observation_id in get_trace_id.

Comment on lines +334 to +336
mock_client.start_as_current_span.return_value = MockContextManager()
mock_client.start_as_current_observation.return_value = MockContextManager()
mock_client.get_current_trace_id.return_value = "mock_trace_id_123"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These three lines seem redundant given that in the line above we call mock_get_client, which sets

    mock_client.start_as_current_span = Mock(return_value=MockContextManager())
    mock_client.start_as_current_observation = Mock(return_value=MockContextManager())
    mock_client.get_current_trace_id = Mock(return_value="mock_trace_id_123")

It occurs a couple of times in the file, for example test_trace_generation and test_trace_generation_invalid_start_time. Just check all the places where we call mock_get_client()

if trace_attrs:
# We need to get the actual span from the context manager
# For now, we'll skip this as the context manager needs to be entered
pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code block doesn't do anything. Could be removed? What does "For now" mean? :)

@@ -456,4 +479,4 @@ def get_trace_id(self) -> str:
Return the trace ID.
:return: The trace ID.
"""
return self._tracer.get_trace_id()
return self._tracer.get_current_observation_id()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that there is a name mismatch here that we at least should explain or fix if the intention is to call get_current_trace_id

@julian-risch
Copy link
Member

My comments from our earlier conversation

Content pages to check if there are any updates needed:

Do you think we could leave a note somewhere in the docs about how to use the API SDK version 2 with langfuse-haystack<=2.3.0 or langfuse-haystack<3.0.0?

Any new features that users benefit from that we can highlight? Improved simplicity?
Key changes that I read about don't explain well enough how the user benefits from the upgrade. If there is no clear benefit, that's also okay but maybe you know of any?

  • Context Management: OpenTelemetry now handles context propagation automatically, reducing the need to manually pass IDs.
  • The name parameter is now required when creating spans and generations.
  • langfuse.create_trace_id()
  • LlamaIndex: There is no longer a Langfuse-specific LlamaIndex integration; use third-party OTEL-based instrumentations. However, we are convinced that a Langfuse-specific integration is the better approach

Shall we ask to get listed here? https://langfuse.com/changelog/2025-06-05-python-sdk-v3-generally-available

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integration:langfuse type:documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update Langfuse integration for Python SDK v3
4 participants