Tracing for uncovered engine components
- Add trace-level, trace-module, and unify tracing/request-stage-metrics. @sufeng-buaa (PRs: "SGLang Tracing: Add trace-level, trace-module, and unify tracing/request-stage-metrics" #13152; "Refactor: observability code cleanup" #17862)
- Implement tracing for Hierarchical Cache (selected by --trace-module hicache) @stmatengss
- Implement distributed request tracing for pipeline parallelism (PP)
- Implement distributed request tracing for speculative decoding
Tracing for sgl-model-gateway
- Implement low-overhead Router request tracing with aggregation of trace data from the engine. @sufeng-buaa (PR: "[model-gateway][tracing]: implement request tracing using OpenTelemetry with trace context propagation (HTTP)" #13897)
Performance and Availability
- Optimize tracing overhead under large batch_size to ensure TTFT/TPOT is minimally impacted @sufeng-buaa
- Adjust the trace level dynamically (add an HTTP API). @sufeng-buaa (PR: "Refactor: observability code cleanup" #17862)
- Cover exception paths in request handling to prevent unclosed spans and avoid memory leaks. (PR: "Refactor: observability code cleanup" #17862)
- Improve span attributes. @zhanghaotong (PR: "SGLang Tracing: Improve root span attributes" #17008)
- Optimize trace level @zhanghaotong
- Improve necessary events (such as retract). @zhanghaotong (PR: "Refactor: observability code cleanup" #17862)
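
To illustrate the exception-path item in the list above, here is a minimal sketch of a span wrapper that always closes the span, even when request handling raises. It uses only the standard library. The names `request_span`, `open_spans`, and `finished_spans` are hypothetical and do not reflect the actual SGLang implementation.

```python
import time
from contextlib import contextmanager

# Hypothetical registries: spans that are still open, and spans that have ended.
open_spans = {}
finished_spans = []


@contextmanager
def request_span(request_id: str, name: str):
    """Open a span for one request stage and guarantee that it is closed."""
    key = (request_id, name)
    span = {"request_id": request_id, "name": name,
            "start": time.monotonic(), "status": "ok"}
    open_spans[key] = span
    try:
        yield span
    except Exception as exc:
        # Record the failure on the span, then let the caller handle the error.
        span["status"] = "error"
        span["error"] = repr(exc)
        raise
    finally:
        # Runs on success, exception, and early return: no span stays open.
        span["end"] = time.monotonic()
        open_spans.pop(key, None)
        finished_spans.append(span)
```

If `open_spans` only ever shrinks back to empty once requests finish or abort, aborted requests cannot accumulate span objects in memory.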
Data processing
- Export OTLP data to a database, with filtering and enhanced processing. Currently, only a simple script is provided to convert text data into Chrome JSON format (which can be parsed by Perfetto). @sufeng-buaa
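
As a rough illustration of the Chrome JSON target format mentioned above, the sketch below converts span records into Chrome Trace Event Format "complete" events. This is not the script shipped with SGLang. The input schema (`name`, `start_ns`, `end_ns`, optional `pid`, `tid`, `attributes`) is an assumption made for the example.

```python
import json


def spans_to_chrome_trace(spans):
    """Convert span dicts (assumed schema) into a Chrome Trace Event JSON string."""
    events = []
    for span in spans:
        events.append({
            "name": span["name"],
            "ph": "X",                      # "complete" event: carries ts and dur
            "ts": span["start_ns"] / 1000,  # Trace Event timestamps are in microseconds
            "dur": (span["end_ns"] - span["start_ns"]) / 1000,
            "pid": span.get("pid", 0),
            "tid": span.get("tid", 0),
            "args": span.get("attributes", {}),
        })
    return json.dumps({"traceEvents": events})
```

The resulting string can be written to a `.json` file and opened in a trace viewer that understands this format.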
Exploratory work (Draft)
- Refine output information (e.g., top-k tokens). This may introduce performance issues. @zhanghaotong
- Fine-grained tracing for multimodal @zhanghaotong
- Further unify metrics and tracing; currently, metrics are relatively fragmented.
- Fine-grained tracing for SGLang diffusion