The Feature
As LLM workflows become more agentic and UI-driven, backend traces alone sometimes miss important execution context.
Would there be value in correlating Playwright/browser sessions with Helicone traces?
Interesting possibilities:
- replay failed UI + LLM interactions together
- inspect browser state transitions alongside model/tool spans
- debug flaky multi-step workflows
- identify divergence points between frontend and backend behavior
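One possible way to do the correlation is sketched below: generate a single run ID per browser session, name the Playwright trace file after it, and attach the same ID to every LLM request as Helicone session headers. This is only a sketch under assumptions. The helper names (`new_run_id`, `helicone_headers`), the `/ui/<step>` path layout, the `traces/` directory, and the `Browser-Trace` property name are hypothetical; `Helicone-Session-Id`, `Helicone-Session-Path`, and `Helicone-Property-*` are existing Helicone request headers.

```python
import uuid


def new_run_id() -> str:
    """Generate one ID shared by the browser session and its LLM requests."""
    return str(uuid.uuid4())


def helicone_headers(run_id: str, step: str, browser_trace: str) -> dict[str, str]:
    """Build per-request headers that tag an LLM call with the browser run.

    The path layout and the property name are assumptions for illustration.
    """
    return {
        "Helicone-Session-Id": run_id,            # groups all requests of this run
        "Helicone-Session-Path": f"/ui/{step}",   # hypothetical: one path per UI step
        "Helicone-Property-Browser-Trace": browser_trace,  # hypothetical custom property
    }


run_id = new_run_id()
trace_file = f"traces/{run_id}.zip"  # hypothetical location for the Playwright trace
headers = helicone_headers(run_id, "checkout", trace_file)

# In a Playwright (Python) test, the same run_id would name the recorded trace:
#   context.tracing.start(screenshots=True, snapshots=True)
#   ... drive the UI, sending LLM requests with `headers` attached ...
#   context.tracing.stop(path=trace_file)
```

With something like this, a failed session in Helicone would point at the matching Playwright trace file, which can be opened in the Playwright trace viewer. A native integration could make that link part of the UI rather than a custom property.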
Motivation, pitch
This could become especially useful for:
- autonomous workflows
- agent copilots
- async state mutations
- non-deterministic failures
This feels closer to observability/replay tooling than to traditional testing.
Twitter / LinkedIn details
@rewantrex / https://www.linkedin.com/in/rewant-goenka-9351a81a0/