AlonKellner-RedHat (Collaborator) commented Oct 29, 2025

Summary

This PR adds over-saturation stopping to the GuideLLM CLI.
It's based on the OSD (Over-Saturation Detection) algorithm we developed and evaluated at Jounce.
Use --stop-over-saturated or --stop-osd to enable.

Details

This PR adds:

  • Over-saturation stopping (--stop-over-saturated)
  • Comprehensive OSD unit tests

Test Plan


  • "I certify that all code in this PR is my own, except as noted below."

Use of AI

  • Includes AI-assisted code completion
  • Includes code generated by an AI application
  • Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes ## WRITTEN BY AI ##)

AlonKellner-RedHat mentioned this pull request Oct 30, 2025
AlonKellner-RedHat force-pushed the feat/over-saturation-stopping branch 4 times, most recently from 85cf65e to f996254 on November 6, 2025 11:11
sjmonson added a commit that referenced this pull request Nov 6, 2025
## Summary

E2E tests that check basic GuideLLM functionality using the vLLM
simulator.

## Details

- [x] Max requests test
- [x] Max duration test
- [ ] Over-saturation stopping test - skipped for now, will be enabled
when #438 lands

## Test Plan

- [x] Local testing
- [x] GitHub action

---

- [x] "I certify that all code in this PR is my own, except as noted
below."

## Use of AI

- [x] Includes AI-assisted code completion
- [ ] Includes code generated by an AI application
- [ ] Includes AI-generated tests (NOTE: AI written tests should have a
docstring that includes `## WRITTEN BY AI ##`)
AlonKellner-RedHat force-pushed the feat/over-saturation-stopping branch 3 times, most recently from 7584c7f to 2c76c4e on November 17, 2025 09:04
Add over-saturation detection and stopping capability to GuideLLM CLI.

- Implement OverSaturationDetector with statistical slope detection
- Add OverSaturationConstraint for scheduler integration
- Add CLI flags --stop-over-saturated and --stop-osd
- Integrate with benchmark entrypoints and main CLI

Signed-off-by: Alon Kellner <[email protected]>
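
For readers unfamiliar with the term, "statistical slope detection" can be pictured with the minimal sketch below. It is not the PR's `OverSaturationDetector`: the function names, the choice of concurrency and time-to-first-token as the tracked metrics, the 1.96 z-value, and the `min_samples` threshold are all assumptions made for illustration. The idea shown is to fit a least-squares line to each metric over time and flag over-saturation only when both slopes stay positive after subtracting a margin of error.

```python
import math
from typing import Sequence, Tuple


def slope_with_margin(
    xs: Sequence[float], ys: Sequence[float], z: float = 1.96
) -> Tuple[float, float]:
    """Return the least-squares slope of ys over xs and a z-scaled margin of error."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Standard error of the slope estimate.
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return slope, z * std_err


def looks_over_saturated(
    times: Sequence[float],
    concurrency: Sequence[float],
    ttft: Sequence[float],
    min_samples: int = 10,  # hypothetical warm-up threshold
) -> bool:
    """Flag over-saturation when both metrics trend upward beyond their margin."""
    if len(times) < min_samples:
        return False
    c_slope, c_margin = slope_with_margin(times, concurrency)
    t_slope, t_margin = slope_with_margin(times, ttft)
    return c_slope - c_margin > 0 and t_slope - t_margin > 0
```

Requiring the slope to clear its own margin of error is what keeps noisy but flat measurements from triggering a stop; a plain "slope > 0" check would fire on almost any jitter.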
Add unit tests for over-saturation detection and constraint functionality.

Signed-off-by: Alon Kellner <[email protected]>
Add comprehensive test suite for over-saturation detection algorithm.

Signed-off-by: Alon Kellner <[email protected]>
Enable end-to-end test for over-saturation stopping functionality.

Signed-off-by: Alon Kellner <[email protected]>
Refactor the single constraints.py file into a package structure where each
constraint type has its own file:

- protocols.py: Protocol definitions (Constraint, ConstraintInitializer, SerializableConstraintInitializer)
- factory.py: ConstraintsInitializerFactory for creating and managing constraints
- base.py: Base classes (PydanticConstraintInitializer, UnserializableConstraintInitializer)
- standard.py: Standard constraints (MaxNumber, MaxDuration, MaxErrors, MaxErrorRate, MaxGlobalErrorRate, RequestsExhausted)
- over_saturation.py: Over-saturation detection constraint implementation

This improves code organization and maintainability while preserving backward
compatibility through the package's __init__.py exports.

Signed-off-by: Alon Kellner <[email protected]>
…Args

The field was referenced in __main__.py but missing from the schema definition,
causing a ValueError when trying to get default values.

Signed-off-by: Alon Kellner <[email protected]>
- Fix first_iteration -> first_token_iteration attribute name
- Add type ignore for OverSaturationConstraint return type
- Fix validated_kwargs type handling for stop_over_saturated parameter

Signed-off-by: Alon Kellner <[email protected]>
When is_flag=True, Click automatically handles boolean values.
Specifying type=bool can cause Pydantic validation errors.

Signed-off-by: Alon Kellner <[email protected]>
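
As an illustration of the flag declaration this commit describes, here is a minimal sketch of a Click boolean flag with two spellings and no explicit `type=bool`. The `benchmark` command and its body are hypothetical stand-ins, not GuideLLM's actual entrypoint; only the two flag names come from this PR.

```python
import click


@click.command()
@click.option(
    "--stop-over-saturated",
    "--stop-osd",
    "stop_over_saturated",  # explicit parameter name shared by both spellings
    is_flag=True,  # Click treats the option as a boolean switch
    default=False,
    help="Stop the benchmark when over-saturation is detected.",
)
def benchmark(stop_over_saturated: bool) -> None:
    """Hypothetical command that only echoes the parsed flag value."""
    click.echo(f"stop_over_saturated={stop_over_saturated}")


if __name__ == "__main__":
    benchmark()
```

With `is_flag=True`, passing either spelling sets the value to `True` and omitting it leaves the default `False`, so no separate type declaration is needed.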
- Fix import paths from advanced_constraints to constraints package
- Fix line length errors (E501) in test files
- Fix type error for request_start None check
- Update test coverage documentation

Signed-off-by: Alon Kellner <[email protected]>
AlonKellner-RedHat force-pushed the feat/over-saturation-stopping branch from ce85b1c to 46bc491 on November 18, 2025 08:15