Made Vocabulary's properties be initialized only ONCE on creation#1110
Lyrcaxis wants to merge 5 commits into SciSharp:master from
Conversation
@dpmm99 does this fix (or at least improve) the issue you were talking about in Discord?
Force-pushed from c982047 to 8b2b7cc
Negative.

Since this didn't fix the bug, shall we close this PR?
I think it's still a decent improvement, and potentially a huge one performance-wise depending on the use case (e.g. batching). I can see how this could be viewed as memory duplication, but it's quite minimal (~7 MB, assuming ~128k tokens mapping to strings of ~5 chars on average). Ultimately it's your call. Let me know.
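For reference, here is a back-of-envelope check of that ~7 MB figure. This is a sketch, not measured data: it assumes a 64-bit CLR, roughly 26 bytes of per-string overhead rounded up to 8-byte alignment, and roughly 24 bytes per `Dictionary<int, string>` entry; none of these constants come from the PR itself.

```csharp
using System;

// Estimate the managed memory cost of caching one string per vocab token.
const int tokenCount = 128_000; // ~128k tokens, as mentioned above
const int avgChars = 5;         // average token string length

// 64-bit .NET string: ~26 bytes of header/length overhead + 2 bytes/char,
// with the object size rounded up to an 8-byte boundary (assumption).
long perString = RoundUp(26 + 2L * avgChars, 8);

// Rough per-entry overhead of a Dictionary<int, string> (key, hash code,
// next-index, value reference) -- also an assumption.
const long perEntry = 24;

long totalBytes = tokenCount * (perString + perEntry);
Console.WriteLine($"~{totalBytes / (1024 * 1024)} MB");

static long RoundUp(long n, long align) => (n + align - 1) / align * align;
```

With these assumptions it lands at about 7 MB, which matches the ballpark quoted in the comment above.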
Oh, if it's a large performance improvement as well, let's merge it! I thought it was just a potential bugfix. Do you have any benchmarks for how much of a gain it is?
These are some quick benchmarks from a release build (Ctrl+F5). The tests were done with this code:

```csharp
// Placed in the Vocabulary class. Requires System.Diagnostics and System.Text.
public void RunTest()
{
    const int totalIters = 1_000_000;
    Console.WriteLine($"Starting test with {totalIters} iterations.");

    Console.WriteLine("\nTesting EOS access");
    var sw = Stopwatch.StartNew();
    for (int i = 0; i < totalIters; i++) { var x = EOS; }
    sw.Stop();
    Console.WriteLine($"EOS access - PR's way: {sw.ElapsedMilliseconds}");

    unsafe
    {
        var _vocabNative = llama_model_get_vocab(_model);
        sw.Restart();
        for (int i = 0; i < totalIters; i++) { var x = Normalize(LLamaVocabNative.llama_vocab_eos(_vocabNative)); }
        sw.Stop();
        Console.WriteLine($"EOS access - Current way: {sw.ElapsedMilliseconds}");
    }

    Console.WriteLine("\nTesting Vocab access");
    var decoder = new StreamingTokenDecoder(Encoding.UTF8, _model);
    var llamaToken = (LLamaToken) 42;
    sw.Restart();
    for (int i = 0; i < totalIters; i++) { var x = this.TokenToString[llamaToken]; }
    sw.Stop();
    Console.WriteLine($"Single token to string - PR's way: {sw.ElapsedMilliseconds}");

    sw.Restart();
    for (int i = 0; i < totalIters; i++) { decoder.Add(llamaToken); var x = decoder.Read(); }
    sw.Stop();
    Console.WriteLine($"Single token to string - Current way: {sw.ElapsedMilliseconds}");

    Console.WriteLine("\nTesting IsEOG");
    sw.Restart();
    for (int i = 0; i < totalIters; i++) { var x = EOGTokens.Contains((int) llamaToken); }
    sw.Stop();
    Console.WriteLine($"EOG Test - PR's way: {sw.ElapsedMilliseconds}");

    unsafe
    {
        sw.Restart();
        for (int i = 0; i < totalIters; i++) { var x = LLamaVocabNative.llama_vocab_is_eog(VocabNative, llamaToken); }
        sw.Stop();
    }
    Console.WriteLine($"EOG Test - Current way: {sw.ElapsedMilliseconds}");

    // Test accuracy across the whole vocabulary, not just a single token
    unsafe
    {
        for (int i = 0; i < Count; i++)
        {
            var token = (LLamaToken) i;
            decoder.Add(token);
            Debug.Assert(this.TokenToString[token] == decoder.Read());
            Debug.Assert(LLamaVocabNative.llama_vocab_is_eog(VocabNative, token) == EOGTokens.Contains(i));
            Debug.Assert(LLamaVocabNative.llama_vocab_is_control(VocabNative, token) == ControlTokens.Contains(i));
        }
    }

    Console.WriteLine("\nThe test has ended");
}
```
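For readers skimming the thread, the caching pattern the PR applies can be sketched like this. The names (`CachedVocabulary`, `decodeToken`, `isEog`) are illustrative, not the actual LLamaSharp API: the idea is simply that every token is decoded exactly once at construction time, and later lookups hit a managed dictionary instead of crossing into native code.

```csharp
using System;
using System.Collections.Generic;

// Sketch of "initialize once on creation": pay the native-decode cost
// a single time per token in the constructor, then serve all later
// queries from plain managed collections.
class CachedVocabulary
{
    public IReadOnlyDictionary<int, string> TokenToString { get; }
    public IReadOnlySet<int> EOGTokens { get; }

    public CachedVocabulary(Func<int, string> decodeToken, Func<int, bool> isEog, int count)
    {
        var map = new Dictionary<int, string>(count);
        var eog = new HashSet<int>();
        for (int i = 0; i < count; i++)
        {
            map[i] = decodeToken(i);  // native call happens exactly once per token
            if (isEog(i)) eog.Add(i);
        }
        TokenToString = map;
        EOGTokens = eog;
    }
}
```

After construction, `TokenToString[id]` and `EOGTokens.Contains(id)` are O(1) managed lookups; the native vocab functions are never called again, which is also why this could sidestep the llama.cpp thread-safety concerns discussed above.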
This pull request has been automatically marked as stale due to inactivity. If no further activity occurs, it will be closed in 7 days.

Minor QOL improvement on the Vocabulary class with a slight performance increase. The reason was that I'm worried about llama.cpp internal deadlocks / memory corruption, based on an issue reported by @dpmm99.
I'm not 100% sure this will solve the issue, but it's one less thing to worry about. Maybe we can narrow it down little by little.