
common, examples, llama : optimize using reserve if possible #5535

Closed
wants to merge 1 commit

Conversation

@GermanAizek (Contributor) commented on Feb 16, 2024

The usual place to use the reserve function is where we know the element count in advance.

@cebtenzzre (Collaborator) left a comment

These changes may make the linter happy, but many of them are not appropriate in context.
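As background for the notes below: std::vector::reserve() takes an absolute target capacity, not an increment, so it only helps when it is called once, before insertion, with the true final element count. A minimal sketch (illustrative only, not code from this PR; the vector names and counts are invented):

// Illustrative only: shows why a per-iteration or undersized reserve() buys nothing.
#include <vector>

void reserve_sketch() {
    std::vector<int> v;

    // Misleading: after the first pass v.size() already exceeds 8, so reserve(8)
    // is a no-op from then on; reserve() ignores what is already stored.
    for (int chunk = 0; chunk < 4; ++chunk) {
        v.reserve(8);
        for (int i = 0; i < 8; ++i) {
            v.push_back(i);
        }
    }

    // Useful: one reserve() with the true total, before any insertion.
    std::vector<int> w;
    w.reserve(4 * 8);
    for (int chunk = 0; chunk < 4; ++chunk) {
        for (int i = 0; i < 8; ++i) {
            w.push_back(i);
        }
    }
}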

@@ -1148,6 +1150,7 @@ static void winogrande_score(llama_context * ctx, const gpt_params & params) {
}

eval_pairs.clear();
eval_pairs.reserve((i1 - i0));
Collaborator

Similar issues to the above loop - best to just not reserve anything here.

common/train.cpp Outdated
Comment on lines 885 to 893
out_samples_begin.clear();
size_t end = (out_tokens.size() >= context_length) ? (out_tokens.size() - context_length) : 0;
out_samples_begin.reserve(end);
out_samples_begin.push_back(0);
out_samples_size.reserve(end);
out_samples_size.push_back(std::min((size_t) context_length, out_tokens.size()));
size_t end = (out_tokens.size() >= context_length) ? (out_tokens.size() - context_length) : 0;
for (size_t sample_begin = 1; sample_begin < end; ++sample_begin) {
out_samples_begin.push_back(sample_begin);
out_samples_size.push_back(context_length);
Collaborator

@xaedes It looks like we are clearing out_samples_begin but not out_samples_size, is that a bug?

@@ -181,6 +181,7 @@ int main(int argc, char ** argv){
const int startIdx = i + ngram_size;
const int endIdx = startIdx + n_draft;
if (endIdx < inp_size) {
draft.reserve(endIdx - startIdx);
Collaborator

I don't think draft is necessarily empty here:

Suggested change
draft.reserve(endIdx - startIdx);
draft.reserve(draft.size() + endIdx - startIdx);

Comment on lines 878 to 943
eval_pairs.clear();
eval_pairs.reserve((i1 - i0) * 4);
for (size_t i = i0; i < i1; ++i) {
auto & hs_cur = hs_data[i];
size_t li = hs_cur.common_prefix;
for (int s = 0; s < 4; ++s) {
eval_pairs.reserve((hs_cur.seq_tokens[s].size() - 1) - hs_cur.common_prefix);
for (size_t j = hs_cur.common_prefix; j < hs_cur.seq_tokens[s].size() - 1; j++) {
eval_pairs.emplace_back(hs_cur.i_batch + li++, hs_cur.seq_tokens[s][j + 1]);
}
Collaborator

These reserves of eval_pairs aren't right - reserving in a loop without accounting for the existing capacity is wrong, and the initial reserve is trying to be precise but misses the innermost loop.
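For illustration only: if a precise hint were wanted for this snippet at all, the count would have to include the innermost loop, and the reserve would have to happen once, before anything is appended. A sketch against the code shown above (not a proposed change, and it assumes the surrounding types compile as in the snippet):

// Sketch only: count the exact number of pairs first, then reserve once.
size_t n_pairs = 0;
for (size_t i = i0; i < i1; ++i) {
    const auto & hs_cur = hs_data[i];
    for (int s = 0; s < 4; ++s) {
        n_pairs += (hs_cur.seq_tokens[s].size() - 1) - hs_cur.common_prefix;
    }
}
eval_pairs.clear();
eval_pairs.reserve(n_pairs);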

@@ -1519,10 +1524,13 @@ static void multiple_choice_score(llama_context * ctx, const gpt_params & params
// Compute log-probs in parallel
// First we collect all tasks
eval_pairs.clear();
eval_pairs.reserve(i1 - i0);
Collaborator

Same issue.

llama.cpp Outdated
@@ -8065,6 +8067,7 @@ struct llm_tokenizer_bpe {
int index = 0;
size_t offset = 0;

symbols.reserve(word.size());
Collaborator

Same concern as SPM.

llama.cpp Outdated
@@ -8138,6 +8141,7 @@ struct llm_tokenizer_bpe {
const auto token = vocab.token_to_id.find(str);

if (token == vocab.token_to_id.end()) {
output.reserve(str.end() - str.begin());
Collaborator

This isn't quite right - output isn't necessarily empty, we're just appending to it. And reserving in a loop is not going to do the right thing - best to just remove this.
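For reference only (the advice above is simply to drop the call): a reserve() that respects existing contents has to add the current size, and even then a per-iteration hint can backfire. A sketch, with n_new standing in as a hypothetical count of elements about to be appended:

// Hypothetical sketch, not a suggested change; n_new is an invented name.
output.reserve(output.size() + n_new);  // an append-aware hint must include what is already stored
// Calling reserve() on every loop iteration can still defeat the vector's geometric
// capacity growth on typical implementations and cause more reallocations, not fewer,
// so removing the hint, as suggested above, is the safer choice here.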

llama.cpp Outdated
@@ -8309,6 +8313,7 @@ struct llm_tokenizer_bpe {
}
}

bpe_encoded_words.reserve(bpe_words.size());
Collaborator

There is already a reserve() at the beginning of this function.

llama.cpp Outdated
@@ -10194,6 +10199,7 @@ static void llama_convert_tensor_internal(
size_t in_buff_offs = 0;
size_t out_buff_offs = 0;

workers.reserve(nthread);
Collaborator

There is already a reserve() of workers right when it is defined in the caller - reserving it here is pointless.

llama.cpp Outdated
@@ -10697,6 +10703,7 @@ static void llama_model_quantize_internal(const std::string & fname_inp, const s
first_row * n_per_row, this_nrow, n_per_row, local_hist.data(), imatrix);
}
};
workers.reserve(nthread_use - 1);
Collaborator

There is already a reserve() of workers right when it is defined - reserving it here is pointless.

The github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) on Jan 18, 2025.
@GermanAizek (Contributor, Author)

I accidentally force-pushed to an existing branch that had a PR I forgot to finish.

Labels: ggml (changes relating to the ggml tensor library for machine learning)