[Fix] data race in req_to_token pool #17850

cctry · 2026-01-28T03:30:23Z

Motivation

A chunked prefill request frees its slot in req_to_token_pool and is allocated a slot again when the scheduler prepares its next prefill batch.

As a result, if a prefill batch contains multiple requests and req_to_token_pool is at capacity, the write of the matched KV indices for another request can overwrite the chunked request's old slot while the forward stream is still reading it.

Example

Prepare & Launch prefill batch N:     
    req A (first half) --> idx 1  
    
model runner reads idx 1
  
Prepare batch N+1: 
    req A (second half) --> idx 2
    req B --> idx 1

scheduler writes req B's matched indices to idx 1
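The timeline above can be replayed with a small, single-threaded toy. This is only a sketch: `ToyReqToTokenPool`, its free-list order, and the KV index values are hypothetical stand-ins chosen to reproduce the slot numbers in the example, not SGLang's actual implementation. Slot 0 is held by an unrelated request so that the pool is at capacity once A and B are both allocated.

```python
class ToyReqToTokenPool:
    """Hypothetical stand-in for a req-to-token table (not SGLang's class)."""

    def __init__(self, size, max_len):
        self.table = [[-1] * max_len for _ in range(size)]
        self.free_slots = list(range(size))

    def alloc(self):
        # Hand out the slot at the front of the free list.
        return self.free_slots.pop(0)

    def free(self, idx):
        # Return the slot to the back of the free list.
        self.free_slots.append(idx)

    def write(self, idx, kv_indices):
        self.table[idx][: len(kv_indices)] = kv_indices


pool = ToyReqToTokenPool(size=3, max_len=4)
other_idx = pool.alloc()            # unrelated request holds slot 0

# Prepare and launch prefill batch N: req A (first half) --> idx 1
a_first_idx = pool.alloc()
pool.write(a_first_idx, [10, 11])
row_read_by_forward = a_first_idx   # the forward pass will read this row

# Old behavior: the chunked request frees its slot after the chunk.
pool.free(a_first_idx)

# Prepare batch N+1: req A (second half) --> idx 2, req B --> idx 1
a_second_idx = pool.alloc()
b_idx = pool.alloc()
pool.write(a_second_idx, [10, 11, 12, 13])
pool.write(b_idx, [90, 91, 92])     # B's matched indices land in slot 1

# If batch N's forward pass has not finished, it now sees B's indices.
seen_by_forward = pool.table[row_read_by_forward]
print(a_first_idx, a_second_idx, b_idx, seen_by_forward)
```

In the real system the read happens on a separate forward stream, so whether the overwrite is observed depends on timing; the toy only shows that the same row is handed to request B while request A's earlier batch may still depend on it.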

Modifications

alloc(reqs: list[Req]) - Now takes a list of requests, sets req.req_pool_idx directly, and reuses the slot if one is already set. cc @hnyls2002
Separated free() from free_mamba_cache(req, ...) in HybridReqToTokenPool - free_mamba_cache frees only the mamba state, not the request slot. cc @hanming-lu @yizhang2077
release_kv_cache() - Now calls free(req) at the end; handles the early mamba-only free case.
Removed the free() calls in process_prefill_chunk and cache_finished_req.
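A minimal sketch of the slot-reuse behavior described for alloc(reqs) might look like the following. `ToyReq`, `ToySlotPool`, the return value of `alloc`, and the capacity check are assumptions made for illustration; only the ideas of taking a request list, setting `req.req_pool_idx` directly, and reusing an already-set slot come from the list above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyReq:
    """Hypothetical request object; only req_pool_idx mirrors the PR text."""
    rid: str
    req_pool_idx: Optional[int] = None


class ToySlotPool:
    """Hypothetical pool illustrating alloc(reqs) with slot reuse."""

    def __init__(self, size: int):
        self.free_slots = list(range(size))

    def alloc(self, reqs: List[ToyReq]) -> Optional[List[int]]:
        # Only requests without a slot consume capacity.
        needed = sum(1 for r in reqs if r.req_pool_idx is None)
        if needed > len(self.free_slots):
            return None
        for r in reqs:
            if r.req_pool_idx is None:
                r.req_pool_idx = self.free_slots.pop(0)
        return [r.req_pool_idx for r in reqs]

    def free(self, req: ToyReq) -> None:
        # Called once, when the request is finally released.
        self.free_slots.append(req.req_pool_idx)
        req.req_pool_idx = None


slot_pool = ToySlotPool(size=3)
req_other, req_a, req_b = ToyReq("other"), ToyReq("A"), ToyReq("B")

slot_pool.alloc([req_other])               # unrelated request takes slot 0
batch_n = slot_pool.alloc([req_a])         # A (first half) takes slot 1
batch_n1 = slot_pool.alloc([req_a, req_b]) # A keeps slot 1, B takes slot 2
print(batch_n, batch_n1)
```

Because the chunked request never returns its slot between chunks in this sketch, no other request can be handed the row that an in-flight forward pass is still reading.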

Accuracy Tests

Benchmarking and Profiling

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.
Follow the SGLang code style guidance.
Work with maintainers to merge your PR. See the PR Merge Process.

gemini-code-assist · 2026-01-28T03:30:27Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

cctry · 2026-01-28T03:31:22Z

/tag-run-ci-label

Henrry-CHEN · 2026-01-28T12:04:16Z

So if a prefill batch contains two or more requests or request chunks, is the mamba state for these requests incorrect?

cctry · 2026-01-28T20:25:16Z

The mamba state is correct, but full attention can be wrong.

cctry requested review from ByronHsu, ShangmingCai, Ying1123, hanming-lu, hnyls2002, merrymercy, xiezhq-hermann and yizhang2077 as code owners January 28, 2026 03:30

github-actions bot added the run-ci label Jan 28, 2026

init

6ce3974

fix

30b9b41

cctry force-pushed the csy/fix_req_to_pool branch from 60ab814 to 30b9b41 Compare January 28, 2026 21:31

cctry added 3 commits January 28, 2026 15:04

fix dllm

8587abd

fix mamba eagle

f37e98b

fix test

0b9995d

ShangmingCai assigned ByronHsu and hnyls2002 Feb 2, 2026

merrymercy added the high priority label Feb 2, 2026

Merge branch 'main' into csy/fix_req_to_pool

8e4924c

merrymercy merged commit 027f314 into main Feb 2, 2026
194 of 214 checks passed

merrymercy deleted the csy/fix_req_to_pool branch February 2, 2026 22:38


6 participants
