fix: llama_memory_seq_rm(mem, -1, ...) #15200
base: master
Conversation
This change is needed because of `GGML_ASSERT(seq_id >= 0 && (size_t) seq_id < seq_to_stream.size());` in `llama_kv_cache_unified::seq_rm`.
Pull Request Overview
This PR fixes incorrect sequence ID parameters in llama_memory_seq_rm function calls, changing them from -1 to 0 to comply with the function's assertion requirement that sequence IDs must be non-negative.
- Changes the sequence ID parameter from -1 to 0 in two llama_memory_seq_rm calls
- Addresses assertion failures in llama_kv_cache_unified::seq_rm, which requires seq_id >= 0 (see the sketch below)
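A minimal illustrative sketch of why a negative sequence ID trips this assertion. This is not the actual llama.cpp implementation; it only assumes that seq_to_stream is a per-sequence lookup container (here a std::vector) indexed by seq_id:

```cpp
// illustrative sketch only -- simplified stand-in for llama_kv_cache_unified::seq_rm
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <vector>

using llama_seq_id = int32_t;

struct kv_cache_sketch {
    // assumption: one entry per sequence, mapping it to a backing stream
    std::vector<int> seq_to_stream{0};

    void seq_rm(llama_seq_id seq_id, int p0, int p1) {
        // mirrors GGML_ASSERT(seq_id >= 0 && (size_t) seq_id < seq_to_stream.size());
        assert(seq_id >= 0 && (size_t) seq_id < seq_to_stream.size());
        std::printf("removing positions [%d, %d) from sequence %d\n", p0, p1, seq_id);
    }
};

int main() {
    kv_cache_sketch kv;
    kv.seq_rm(0, 10, -1);     // fine: sequence 0 has an entry in seq_to_stream
    // kv.seq_rm(-1, 10, -1); // would fire the assertion this PR works around
    return 0;
}
```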
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tools/main/main.cpp | Updates sequence ID parameter in memory sequence removal call |
| examples/lookahead/lookahead.cpp | Updates sequence ID parameter in memory sequence removal call |
tools/main/main.cpp

```diff
@@ -354,7 +354,7 @@ int main(int argc, char ** argv) {
     }

     // remove any "future" tokens that we might have inherited from the previous session
-    llama_memory_seq_rm(mem, -1, n_matching_session_tokens, -1);
+    llama_memory_seq_rm(mem, 0, n_matching_session_tokens, -1);
```
The fix changes the sequence ID from -1 to 0, but this may alter the intended behavior. Please verify that sequence ID 0 is the correct target sequence for removal, as this could affect which cached tokens are cleared.
Suggested change:

```diff
-llama_memory_seq_rm(mem, 0, n_matching_session_tokens, -1);
+llama_memory_seq_rm(mem, -1, n_matching_session_tokens, -1);
```
examples/lookahead/lookahead.cpp

```diff
@@ -429,7 +429,7 @@ int main(int argc, char ** argv) {

     // KV cache management
     // if no verification token matched, we simply remove all cells from this batch -> no fragmentation
-    llama_memory_seq_rm(mem, -1, n_past, -1);
+    llama_memory_seq_rm(mem, 0, n_past, -1);
```
The fix changes the sequence ID from -1 to 0, but this may alter the intended behavior. Please verify that sequence ID 0 is the correct target sequence for removal, as this could affect which cached tokens are cleared in the lookahead algorithm.
Suggested change:

```diff
-llama_memory_seq_rm(mem, 0, n_past, -1);
+llama_memory_seq_rm(mem, -1, n_past, -1);
```
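If the intent of the original -1 in lookahead.cpp really was "remove this range from every sequence", one alternative is to issue the call once per sequence ID instead of relying on a negative wildcard. This is a sketch only, assuming ctx, mem, and n_past are in scope as in the surrounding code and that llama_n_seq_max(ctx) bounds the sequence IDs in use:

```cpp
// sketch: remove [n_past, end) from every sequence explicitly
// assumption: llama_n_seq_max(ctx) is the upper bound on sequence IDs used here
for (llama_seq_id s = 0; s < (llama_seq_id) llama_n_seq_max(ctx); ++s) {
    llama_memory_seq_rm(mem, s, n_past, -1);
}
```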
* llama : deprecate llama_kv_self_ API ggml-ci
* llama : allow llama_memory_(nullptr) ggml-ci
* memory : add flag for optional data clear in llama_memory_clear ggml-ci
I believe it must be `llama_memory_seq_rm(mem, 0, ...)` because of the assertion quoted above. See: 745aa53#r163727848
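For a full reset rather than a range removal, the commit message quoted above mentions an optional data-clear flag on llama_memory_clear. A hedged usage sketch, assuming the flag is the second parameter (check llama.h for the exact signature):

```cpp
// sketch based on the commit message above; verify the signature in llama.h
llama_memory_clear(mem, true);   // clear metadata and the underlying data buffers
llama_memory_clear(mem, false);  // clear metadata only
```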