common, examples, llama : optimize using reserve if possible #5535
Conversation
These changes may make the linter happy, but many of them are not appropriate in context.
examples/perplexity/perplexity.cpp
Outdated
@@ -1148,6 +1150,7 @@ static void winogrande_score(llama_context * ctx, const gpt_params & params) {
     }

     eval_pairs.clear();
+    eval_pairs.reserve((i1 - i0));
Similar issues to the above loop - best to just not reserve anything here.
common/train.cpp
Outdated
 out_samples_begin.clear();
+size_t end = (out_tokens.size() >= context_length) ? (out_tokens.size() - context_length) : 0;
+out_samples_begin.reserve(end);
 out_samples_begin.push_back(0);
+out_samples_size.reserve(end);
 out_samples_size.push_back(std::min((size_t) context_length, out_tokens.size()));
-size_t end = (out_tokens.size() >= context_length) ? (out_tokens.size() - context_length) : 0;
 for (size_t sample_begin = 1; sample_begin < end; ++sample_begin) {
     out_samples_begin.push_back(sample_begin);
     out_samples_size.push_back(context_length);
@xaedes It looks like we are clearing out_samples_begin but not out_samples_size, is that a bug?
examples/lookup/lookup.cpp
Outdated
@@ -181,6 +181,7 @@ int main(int argc, char ** argv){
         const int startIdx = i + ngram_size;
         const int endIdx = startIdx + n_draft;
         if (endIdx < inp_size) {
+            draft.reserve(endIdx - startIdx);
I don't think draft is necessarily empty here:
-            draft.reserve(endIdx - startIdx);
+            draft.reserve(draft.size() + endIdx - startIdx);
examples/perplexity/perplexity.cpp
Outdated
 eval_pairs.clear();
+eval_pairs.reserve((i1 - i0) * 4);
 for (size_t i = i0; i < i1; ++i) {
     auto & hs_cur = hs_data[i];
     size_t li = hs_cur.common_prefix;
     for (int s = 0; s < 4; ++s) {
+        eval_pairs.reserve((hs_cur.seq_tokens[s].size() - 1) - hs_cur.common_prefix);
         for (size_t j = hs_cur.common_prefix; j < hs_cur.seq_tokens[s].size() - 1; j++) {
             eval_pairs.emplace_back(hs_cur.i_batch + li++, hs_cur.seq_tokens[s][j + 1]);
         }
These reserves of eval_pairs aren't right - reserving in a loop without accounting for the existing capacity is wrong, and the initial reserve is trying to be precise but misses the innermost loop.
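For reference, a minimal sketch of the pattern the comment is pointing toward, with simplified stand-in types (Task, collect_eval_pairs, and the field names are illustrative, not the actual perplexity.cpp identifiers): count the pairs across all loops first, then reserve once before filling.

#include <cstddef>
#include <utility>
#include <vector>

// Simplified stand-in for the per-task data in perplexity.cpp.
struct Task {
    size_t common_prefix;
    std::vector<std::vector<int>> seq_tokens; // candidate endings
    size_t i_batch;
};

// reserve() takes an absolute capacity, so calling it inside the loop
// with only the current chunk's size has no effect once the vector has
// grown past that value. Count everything first, then reserve once.
static void collect_eval_pairs(std::vector<std::pair<size_t, int>> & eval_pairs,
                               const std::vector<Task> & tasks,
                               size_t i0, size_t i1) {
    eval_pairs.clear();

    size_t total = 0;
    for (size_t i = i0; i < i1; ++i) {
        for (const auto & seq : tasks[i].seq_tokens) {
            total += (seq.size() - 1) - tasks[i].common_prefix;
        }
    }
    eval_pairs.reserve(total); // one reserve, covering the innermost loop too

    for (size_t i = i0; i < i1; ++i) {
        const Task & t = tasks[i];
        size_t li = t.common_prefix;
        for (const auto & seq : t.seq_tokens) {
            for (size_t j = t.common_prefix; j + 1 < seq.size(); ++j) {
                eval_pairs.emplace_back(t.i_batch + li++, seq[j + 1]);
            }
        }
    }
}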
examples/perplexity/perplexity.cpp
Outdated
@@ -1519,10 +1524,13 @@ static void multiple_choice_score(llama_context * ctx, const gpt_params & params
     // Compute log-probs in parallel
     // First we collect all tasks
     eval_pairs.clear();
+    eval_pairs.reserve(i1 - i0);
Same issue.
llama.cpp
Outdated
@@ -8065,6 +8067,7 @@ struct llm_tokenizer_bpe {
         int index = 0;
         size_t offset = 0;

+        symbols.reserve(word.size());
Same concern as SPM.
llama.cpp
Outdated
@@ -8138,6 +8141,7 @@ struct llm_tokenizer_bpe {
         const auto token = vocab.token_to_id.find(str);

         if (token == vocab.token_to_id.end()) {
+            output.reserve(str.end() - str.begin());
This isn't quite right - output isn't necessarily empty, we're just appending to it. And reserving in a loop is not going to do the right thing - best to just remove this.
llama.cpp
Outdated
@@ -8309,6 +8313,7 @@ struct llm_tokenizer_bpe {
             }
         }

+        bpe_encoded_words.reserve(bpe_words.size());
There is already a reserve() at the beginning of this function.
llama.cpp
Outdated
@@ -10194,6 +10199,7 @@ static void llama_convert_tensor_internal(
     size_t in_buff_offs = 0;
     size_t out_buff_offs = 0;

+    workers.reserve(nthread);
There is already a reserve() of workers right when it is defined in the caller - reserving it here is pointless.
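For what it's worth, the redundancy is harmless but does nothing: reserve() never reduces capacity, so after the caller's reserve a second request for the same or a smaller amount is a no-op. A minimal illustration (the value 8 is arbitrary):

#include <cassert>
#include <cstddef>
#include <vector>

int main() {
    std::vector<int> workers;
    workers.reserve(8);                         // reserve at the definition site
    const std::size_t cap = workers.capacity(); // at least 8

    workers.reserve(8);                         // redundant: capacity never shrinks
    assert(workers.capacity() == cap);          // unchanged, the call was a no-op
    return 0;
}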
llama.cpp
Outdated
@@ -10697,6 +10703,7 @@ static void llama_model_quantize_internal(const std::string & fname_inp, const s
                         first_row * n_per_row, this_nrow, n_per_row, local_hist.data(), imatrix);
                 }
             };
+            workers.reserve(nthread_use - 1);
There is already a reserve() of workers right when it is defined - reserving it here is pointless.
I accidentally force-pushed to an existing branch with a PR that I forgot to finish.
The usual use of the reserve function is where we know the element count in advance.
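A minimal sketch of that usual case (illustrative code, not taken from the PR): the element count is known before the loop, so a single up-front reserve() avoids repeated reallocation while the vector grows.

#include <cstddef>
#include <vector>

// The element count is known in advance, so a single reserve() before
// the loop prevents the vector from reallocating as it grows.
std::vector<int> squares(std::size_t n) {
    std::vector<int> out;
    out.reserve(n); // absolute capacity, requested once
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<int>(i * i));
    }
    return out;
}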