common, examples, llama : optimize using reserve if possible #5535
Conversation
These changes may make the linter happy, but many of them are not appropriate in context.
examples/perplexity/perplexity.cpp
Outdated
@@ -1148,6 +1150,7 @@ static void winogrande_score(llama_context * ctx, const gpt_params & params) {
     }

     eval_pairs.clear();
+    eval_pairs.reserve((i1 - i0));
Similar issues to the above loop - best to just not reserve anything here.
common/train.cpp
Outdated
 out_samples_begin.clear();
+size_t end = (out_tokens.size() >= context_length) ? (out_tokens.size() - context_length) : 0;
+out_samples_begin.reserve(end);
 out_samples_begin.push_back(0);
+out_samples_size.reserve(end);
 out_samples_size.push_back(std::min((size_t) context_length, out_tokens.size()));
-size_t end = (out_tokens.size() >= context_length) ? (out_tokens.size() - context_length) : 0;
 for (size_t sample_begin = 1; sample_begin < end; ++sample_begin) {
     out_samples_begin.push_back(sample_begin);
     out_samples_size.push_back(context_length);
@xaedes It looks like we are clearing out_samples_begin but not out_samples_size, is that a bug?
examples/lookup/lookup.cpp
Outdated
@@ -181,6 +181,7 @@ int main(int argc, char ** argv){
         const int startIdx = i + ngram_size;
         const int endIdx = startIdx + n_draft;
         if (endIdx < inp_size) {
+            draft.reserve(endIdx - startIdx);
I don't think draft is necessarily empty here:
-            draft.reserve(endIdx - startIdx);
+            draft.reserve(draft.size() + endIdx - startIdx);
examples/perplexity/perplexity.cpp
Outdated
 eval_pairs.clear();
+eval_pairs.reserve((i1 - i0) * 4);
 for (size_t i = i0; i < i1; ++i) {
     auto & hs_cur = hs_data[i];
     size_t li = hs_cur.common_prefix;
     for (int s = 0; s < 4; ++s) {
+        eval_pairs.reserve((hs_cur.seq_tokens[s].size() - 1) - hs_cur.common_prefix);
         for (size_t j = hs_cur.common_prefix; j < hs_cur.seq_tokens[s].size() - 1; j++) {
             eval_pairs.emplace_back(hs_cur.i_batch + li++, hs_cur.seq_tokens[s][j + 1]);
         }
These reserves of eval_pairs aren't right - reserving in a loop without accounting for the existing capacity is wrong, and the initial reserve is trying to be precise but misses the innermost loop.
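For reference, a minimal sketch of the pattern the comment is pointing toward, with simplified stand-in types (Task, collect_eval_pairs, and the field names are illustrative, not the actual perplexity.cpp identifiers): count the pairs across all loops first, then reserve once before filling.

#include <cstddef>
#include <utility>
#include <vector>

// Simplified stand-in for the per-task data in perplexity.cpp.
struct Task {
    size_t common_prefix;
    std::vector<std::vector<int>> seq_tokens; // candidate endings
    size_t i_batch;
};

// reserve() takes an absolute capacity, so calling it inside the loop
// with only the current chunk's size has no effect once the vector has
// grown past that value. Count everything first, then reserve once.
static void collect_eval_pairs(std::vector<std::pair<size_t, int>> & eval_pairs,
                               const std::vector<Task> & tasks,
                               size_t i0, size_t i1) {
    eval_pairs.clear();

    size_t total = 0;
    for (size_t i = i0; i < i1; ++i) {
        for (const auto & seq : tasks[i].seq_tokens) {
            total += (seq.size() - 1) - tasks[i].common_prefix;
        }
    }
    eval_pairs.reserve(total); // one reserve, covering the innermost loop too

    for (size_t i = i0; i < i1; ++i) {
        const Task & t = tasks[i];
        size_t li = t.common_prefix;
        for (const auto & seq : t.seq_tokens) {
            for (size_t j = t.common_prefix; j + 1 < seq.size(); ++j) {
                eval_pairs.emplace_back(t.i_batch + li++, seq[j + 1]);
            }
        }
    }
}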
examples/perplexity/perplexity.cpp
Outdated
@@ -1519,10 +1524,13 @@ static void multiple_choice_score(llama_context * ctx, const gpt_params & params
     // Compute log-probs in parallel
     // First we collect all tasks
     eval_pairs.clear();
+    eval_pairs.reserve(i1 - i0);
Same issue.
llama.cpp
Outdated
@@ -8065,6 +8067,7 @@ struct llm_tokenizer_bpe {
         int index = 0;
         size_t offset = 0;

+        symbols.reserve(word.size());
Same concern as SPM.
llama.cpp
Outdated
@@ -8138,6 +8141,7 @@ struct llm_tokenizer_bpe {
         const auto token = vocab.token_to_id.find(str);

         if (token == vocab.token_to_id.end()) {
+            output.reserve(str.end() - str.begin());
This isn't quite right - output isn't necessarily empty, we're just appending to it. And reserving in a loop is not going to do the right thing - best to just remove this.
llama.cpp
Outdated
@@ -8309,6 +8313,7 @@ struct llm_tokenizer_bpe {
             }
         }

+        bpe_encoded_words.reserve(bpe_words.size());
There is already a reserve() at the beginning of this function.
llama.cpp
Outdated
@@ -10194,6 +10199,7 @@ static void llama_convert_tensor_internal(
     size_t in_buff_offs = 0;
     size_t out_buff_offs = 0;

+    workers.reserve(nthread);
There is already a reserve() of workers right when it is defined in the caller - reserving it here is pointless.
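For what it's worth, the redundancy is harmless but does nothing: reserve() never reduces capacity, so after the caller's reserve a second request for the same or a smaller amount is a no-op. A minimal illustration (the value 8 is arbitrary):

#include <cassert>
#include <cstddef>
#include <vector>

int main() {
    std::vector<int> workers;
    workers.reserve(8);                         // reserve at the definition site
    const std::size_t cap = workers.capacity(); // at least 8

    workers.reserve(8);                         // redundant: capacity never shrinks
    assert(workers.capacity() == cap);          // unchanged, the call was a no-op
    return 0;
}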
llama.cpp
Outdated
@@ -10697,6 +10703,7 @@ static void llama_model_quantize_internal(const std::string & fname_inp, const s
                         first_row * n_per_row, this_nrow, n_per_row, local_hist.data(), imatrix);
                 }
             };
+            workers.reserve(nthread_use - 1);
There is already a reserve() of workers right when it is defined - reserving it here is pointless.
I accidentally force-pushed to an existing branch with a PR that I forgot to finish.
The usual use of the reserve function is where we know the element count in advance.
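A minimal sketch of that usual case (illustrative code, not taken from the PR): the element count is known before the loop, so a single up-front reserve() avoids repeated reallocation while the vector grows.

#include <cstddef>
#include <vector>

// The element count is known in advance, so a single reserve() before
// the loop prevents the vector from reallocating as it grows.
std::vector<int> squares(std::size_t n) {
    std::vector<int> out;
    out.reserve(n); // absolute capacity, requested once
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<int>(i * i));
    }
    return out;
}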