feat: add Arinae algorithm by LoricAndre · Pull Request #990 · skim-rs/skim

LoricAndre · 2026-02-23T13:45:54Z

Description

Arinae is designed to become skim's default algorithm in the future.

Technically, it uses Smith-Waterman and a modified Levenshtein distance with affine gaps for scoring, as well as multiple optimizations (the main ones being a loose prefilter and checks for early dismissal of paths that cannot lead to the best match). It also forbids typos on the first char of the query.
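
For readers unfamiliar with affine gaps, here is a minimal sketch of textbook Smith-Waterman local alignment with affine gap costs (Gotoh's formulation). It is not the code from this PR: the function name `affine_local_score` and all scoring constants are assumptions chosen for illustration, and the prefilter, banding and first-character rule described above are left out.

```rust
// Illustrative constants, not Arinae's actual parameters.
const MATCH: i32 = 16;
const MISMATCH: i32 = -8; // substitution, i.e. a "typo"
const GAP_OPEN: i32 = -6; // cost of the first skipped character
const GAP_EXTEND: i32 = -1; // cost of each further skipped character

/// Best local alignment score between `query` and `choice`.
pub fn affine_local_score(query: &str, choice: &str) -> i32 {
    const NEG: i32 = i32::MIN / 2; // "impossible" state, safe from overflow
    let q: Vec<char> = query.chars().collect();
    let c: Vec<char> = choice.chars().collect();
    let (n, m) = (q.len(), c.len());

    // h: best score of an alignment ending at (i, j)
    // e: same, but ending in a run of skipped `choice` characters
    // f: same, but ending in a run of skipped `query` characters
    let mut h = vec![vec![0i32; m + 1]; n + 1];
    let mut e = vec![vec![NEG; m + 1]; n + 1];
    let mut f = vec![vec![NEG; m + 1]; n + 1];
    let mut best = 0;

    for i in 1..=n {
        for j in 1..=m {
            // Opening a gap is expensive, extending one is cheap.
            e[i][j] = (h[i][j - 1] + GAP_OPEN).max(e[i][j - 1] + GAP_EXTEND);
            f[i][j] = (h[i - 1][j] + GAP_OPEN).max(f[i - 1][j] + GAP_EXTEND);
            let sub = if q[i - 1] == c[j - 1] { MATCH } else { MISMATCH };
            let diag = h[i - 1][j - 1] + sub;
            // Clamping at 0 is what makes the alignment local.
            h[i][j] = diag.max(e[i][j]).max(f[i][j]).max(0);
            best = best.max(h[i][j]);
        }
    }
    best
}
```

With these constants, `affine_local_score("abc", "a_bc")` pays one gap opening, while `"a__bc"` only pays one extra extension on top of that, which is the behavior affine gaps are meant to give.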

In practice, it should feel close to FZY's scoring with typos disabled, but handle typos more naturally than Frizbee or other algorithms.

These other algorithms usually work by allowing a set number of typos using 3D matrices for computations, the max-typos value being set based on the length of the query. In practice, that means that `tes` has to match exactly, but `test` allows one typo, so typing a single character can change the filtered items completely. This algorithm instead penalizes typos without blocking them completely.
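
The difference between a hard typo budget and a typo penalty can be sketched as follows. Everything here is hypothetical: the one-typo-per-four-characters rule in `typo_budget` is only an assumption that reproduces the `tes`/`test` example above, and `TYPO_PENALTY` is not a value taken from Arinae.

```rust
/// Illustrative typo penalty; not a value taken from Arinae.
const TYPO_PENALTY: i32 = 10;

/// Hypothetical budget rule: one typo allowed per four query characters.
pub fn typo_budget(query_len: usize) -> usize {
    query_len / 4
}

/// Hard cutoff: a candidate is dropped as soon as it exceeds the budget.
pub fn passes_budget(query_len: usize, typos: usize) -> bool {
    typos <= typo_budget(query_len)
}

/// Soft penalty: the candidate stays in the list, only ranked lower.
pub fn penalized_score(exact_score: i32, typos: usize) -> i32 {
    exact_score - TYPO_PENALTY * typos as i32
}
```

Under the budget rule, a candidate with one typo is rejected for a 3-character query and accepted for a 4-character one, so the result list jumps when the fourth character is typed. Under the penalty rule the same candidate is present in both cases and only its rank changes.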

This algorithm does not aim to revolutionize anything, but it aims at making typo-resistant fuzzy matching feel more like an actual alternative to the current options (namely FZF and FZY), while maintaining per-item performance at least as good as the current algorithms.

Checklist

The title of my PR follows conventional commits
I have updated the documentation (README.md, comments, src/manpage.rs and/or src/options.rs if applicable)
I have added unit tests
I have added integration tests
I have linked all related issues or PRs

Note: codecov runs on the PR on this repo, but feel free to ignore it.

Benches (w/ `bench.sh`: less precise but shows rss usage & compatible with FZF)

Skim V2 (current default)

Frizbee (no typos)

Arinae (no typos)

FZF for comparison

Frizbee (typos)

Arinae (typos) - note the difference in the number of results compared to frizbee

Benches (w/ `criterion`: more precise, hooks directly into the code)

```
cargo bench --bench read_and_match
   Compiling skim v3.5.0 (/home/loric/src/skim)
    Finished `bench` profile [optimized] target(s) in 1m 21s
     Running benches/read_and_match.rs (target/release/deps/read_and_match-7be3557b13dbf6c1)
Gnuplot not found, using plotters backend
Benchmarking default: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 37.9s.
default                 time:   [3.6986 s 3.7315 s 3.7645 s]
                        change: [−2.0247% −0.7908% +0.4336%] (p = 0.25 > 0.05)
                        No change in performance detected.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

Benchmarking query: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 43.9s.
query                   time:   [4.5603 s 4.7987 s 5.0747 s]
                        change: [+1.0711% +7.2752% +14.625%] (p = 0.05 > 0.05)
                        No change in performance detected.
Found 3 outliers among 10 measurements (30.00%)
  1 (10.00%) low severe
  1 (10.00%) low mild
  1 (10.00%) high severe

Benchmarking query_frizbee: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 44.3s.
query_frizbee           time:   [4.6213 s 4.8387 s 5.1942 s]
                        change: [−10.587% +0.9013% +13.415%] (p = 0.90 > 0.05)
                        No change in performance detected.
Found 2 outliers among 10 measurements (20.00%)
  1 (10.00%) high mild
  1 (10.00%) high severe

Benchmarking query_ari: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 49.4s.
query_ari               time:   [4.0759 s 4.4534 s 5.0057 s]
                        change: [−13.815% +1.4402% +30.916%] (p = 0.91 > 0.05)
                        No change in performance detected.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high severe

Benchmarking query_frizbee_typos: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 44.6s.
query_frizbee_typos     time:   [4.2940 s 4.3626 s 4.4337 s]

Benchmarking query_ari_typos: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 43.3s.
query_ari_typos         time:   [4.2142 s 4.4391 s 4.6898 s]
Found 2 outliers among 10 measurements (20.00%)
  1 (10.00%) low severe
  1 (10.00%) high severe
```

codecov · 2026-02-23T14:15:30Z

Codecov Report

❌ Patch coverage is 76.17486% with 218 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/fuzzy_matcher/arinae/algo.rs | 82.15% | 53 Missing and 5 partials ⚠️ |
| src/item.rs | 24.52% | 39 Missing and 1 partial ⚠️ |
| src/fuzzy_matcher/arinae/mod.rs | 75.38% | 31 Missing and 1 partial ⚠️ |
| src/engine/fuzzy.rs | 57.35% | 29 Missing ⚠️ |
| src/fuzzy_matcher/frizbee.rs | 22.22% | 14 Missing ⚠️ |
| src/fuzzy_matcher/mod.rs | 0.00% | 7 Missing ⚠️ |
| src/fuzzy_matcher/arinae/atom.rs | 84.61% | 6 Missing ⚠️ |
| src/fuzzy_matcher/arinae/matrix.rs | 80.64% | 6 Missing ⚠️ |
| src/matcher.rs | 86.66% | 6 Missing ⚠️ |
| src/options.rs | 0.00% | 3 Missing and 2 partials ⚠️ |

... and 5 more

📢 Thoughts on this report? Let us know!

LoricAndre · 2026-02-26T21:30:47Z

@saghen, @ibhagwan, sorry for the ping, but we lack a better comm channel and I think you might be interested in this! Feel free to ignore me if I'm wrong.

saghen · 2026-02-26T22:41:10Z

This looks really neat! I'll definitely read through your implementation soon. One thing though, I noticed that your frizbee usage might be hurting the performance a bit.

https://github.com/skim-rs/skim/pull/990/changes#diff-98065f4e9dfd493f60cd03421c7288370b7765c98685c1dd23270c0eb6195aacL34

You'll want to use the Matcher API rather than SmithWatermanMatcher so that we perform prefiltering, which should lead to quite the speed-up.

https://github.com/skim-rs/skim/pull/990/changes#diff-668cf417d0fb5e51ead9cf7eb691ab2246e71e1c4b6ccc86bfb1eed4ccc66834R34-R57

Your micro benches use the fuzzy_indices API which will be a lot slower than fuzzy_match for frizbee as well. In general, it might be worth trying to pass the whole list into the Matcher::match_list API to see how the performance compares, as that's the intended entrypoint.

Also, you might want to try using the parallel implementation in frizbee, as it scales much better than rayon in my testing

ibhagwan · 2026-02-26T22:48:04Z

Very interesting @LoricAndre, will try!

LoricAndre · 2026-02-26T23:20:47Z

> This looks really neat! I'll definitely read through your implementation soon. One thing though, I noticed that your frizbee usage might be hurting the performance a bit.
>
> https://github.com/skim-rs/skim/pull/990/changes#diff-98065f4e9dfd493f60cd03421c7288370b7765c98685c1dd23270c0eb6195aacL34
>
> You'll want to use the Matcher API rather than SmithWatermanMatcher so that we perform prefiltering, which should lead to quite the speed-up.
>
> https://github.com/skim-rs/skim/pull/990/changes#diff-668cf417d0fb5e51ead9cf7eb691ab2246e71e1c4b6ccc86bfb1eed4ccc66834R34-R57
>
> Your micro benches use the fuzzy_indices API which will be a lot slower than fuzzy_match for frizbee as well. In general, it might be worth trying to pass the whole list into the Matcher::match_list API to see how the performance compares, as that's the intended entrypoint.
>
> Also, you might want to try using the parallel implementation in frizbee, as it scales much better than rayon in my testing

And here I was, thinking I finally optimized it enough to catch up to frizbee's performance 😂
More seriously, I'll take a look at Matcher, and finding something better than rayon is next on the to-do list as it seems it's one of the biggest bottlenecks right now (and part of the reason I sit at 500-600% CPU usage and not close to 2400% as I would expect on my system). I'll take a look at what you did then.

LoricAndre · 2026-02-27T07:59:45Z

I took a look at Frizbee's Matcher API, but it seems slower in my use (since I run it on a per-item basis):

The average performance got up by 45+%, but frizbee's only got up by 35+%

saghen · 2026-02-27T18:06:19Z

I wish I had the time to investigate this more, but I'm slammed right now. I did a quick test by replacing the frizbee code with the code below, and I've included the results below as well. I realize now the main performance issue is instantiating the Matcher on every match_* call, as we have to allocate the score matrix every time. Ideally you'd instantiate it once and re-use it, but that'd be a non-trivial code change it looks like, as the FuzzyMatcher trait would need to have a set_choice(&mut self, choice: &str) function to instantiate the Matcher and then fuzzy_* would need to be &mut self.

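```rust
// Fragment of a criterion benchmark: `c` is the `Criterion` handle and
// `lines` is the list of items to match, both defined elsewhere in the bench.
c.bench_function("micro_frizbee", |b| {
    // Build the matcher once, outside the timed loop, so it is reused.
    let mut matcher = frizbee::Matcher::new("test", &Default::default());
    b.iter(|| {
        let mut count = 0u64;
        for _ in matcher.match_iter(&lines) {
            count += 1;
        }
        count
    });
});
c.bench_function("micro_typos_frizbee", |b| {
    // Same as above, but allowing one typo.
    let config = frizbee::Config {
        max_typos: Some(1),
        ..Default::default()
    };
    let mut matcher = frizbee::Matcher::new("test", &config);
    b.iter(|| {
        let mut count = 0u64;
        for _ in matcher.match_iter(&lines) {
            count += 1;
        }
        count
    });
});
c.bench_function("micro_frizbee_parallel", |b| {
    b.iter(|| {
        let mut count = 0u64;
        // Parallel entrypoint; 16 is the last argument passed by the author.
        for _ in frizbee::match_list_parallel("test", &lines, &Default::default(), 16) {
            count += 1;
        }
        count
    });
});
```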

```
micro_skim_v2           time:   [352.28 ms 352.98 ms 353.74 ms]

micro_frizbee           time:   [73.265 ms 73.400 ms 73.554 ms]
micro_typos_frizbee     time:   [109.55 ms 109.87 ms 110.21 ms]
micro_frizbee_parallel  time:   [6.1193 ms 6.1390 ms 6.1607 ms]

micro_arinae            time:   [230.56 ms 230.88 ms 231.23 ms]
micro_arinae_range      time:   [124.95 ms 125.13 ms 125.33 ms]
micro_arinae_score      time:   [221.41 ms 221.68 ms 222.00 ms]
micro_typos_arinae      time:   [391.23 ms 391.63 ms 392.06 ms]
```
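
The point about allocating the score matrix on every call can be illustrated with a small, self-contained sketch. `ReusableMatcher` and `lcs_len` are hypothetical names, and the scoring is a plain longest-common-subsequence length rather than frizbee's or Arinae's scoring. The only thing it is meant to show is the shape of the API: the scratch buffer is allocated once in `new`, and matching takes `&mut self` because it writes into that buffer.

```rust
/// Hypothetical matcher that owns its scratch buffer.
pub struct ReusableMatcher {
    query: Vec<char>,
    /// One DP row, allocated once and overwritten on every call.
    row: Vec<usize>,
}

impl ReusableMatcher {
    /// The only allocation happens here, once per query.
    pub fn new(query: &str) -> Self {
        let query: Vec<char> = query.chars().collect();
        let row = vec![0; query.len() + 1];
        Self { query, row }
    }

    /// Length of the longest common subsequence of the query and `choice`.
    /// Takes `&mut self` because it reuses the scratch row.
    pub fn lcs_len(&mut self, choice: &str) -> usize {
        self.row.fill(0);
        for ch in choice.chars() {
            // `diag` holds the value of row[j - 1] from the previous pass.
            let mut diag = 0;
            for j in 1..=self.query.len() {
                let above = self.row[j];
                self.row[j] = if self.query[j - 1] == ch {
                    diag + 1
                } else {
                    above.max(self.row[j - 1])
                };
                diag = above;
            }
        }
        self.row[self.query.len()]
    }
}
```

A caller would build one `ReusableMatcher` per query and call `lcs_len` for every item, which is the same pattern as building the frizbee matcher outside `b.iter` in the benchmark above.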

saghen · 2026-02-27T18:11:04Z

Btw you might want to explore incremental matching in your arinae implementation as well! saghen/frizbee#65
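
One common form of incremental matching, which may differ from what the linked frizbee issue proposes, is to reuse the previous result set when the new query only appends characters to the old one. The sketch below assumes plain subsequence matching without typos, where an item that failed the shorter query cannot match the longer one. `IncrementalFilter` and `is_subsequence` are hypothetical names.

```rust
/// True if the characters of `query` appear in `choice` in order.
pub fn is_subsequence(query: &str, choice: &str) -> bool {
    let mut chars = choice.chars();
    query.chars().all(|q| chars.any(|c| c == q))
}

/// Hypothetical filter that remembers which items survived the last query.
pub struct IncrementalFilter<'a> {
    items: &'a [&'a str],
    query: String,
    survivors: Vec<usize>,
}

impl<'a> IncrementalFilter<'a> {
    pub fn new(items: &'a [&'a str]) -> Self {
        Self {
            items,
            query: String::new(),
            survivors: (0..items.len()).collect(),
        }
    }

    /// Returns the indices of the items matching `query`.
    pub fn update(&mut self, query: &str) -> &[usize] {
        // Narrowing is only valid when the new query extends the old one;
        // otherwise start again from the full list.
        if !query.starts_with(self.query.as_str()) {
            self.survivors = (0..self.items.len()).collect();
        }
        let items = self.items;
        self.survivors.retain(|&i| is_subsequence(query, items[i]));
        self.query.clear();
        self.query.push_str(query);
        &self.survivors
    }
}
```

Typing `s` and then `sm` only re-checks the items that matched `s`. With typo tolerance the shortcut is not automatically valid, so this is a sketch of the idea rather than something that could be dropped into Arinae as is.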

LoricAndre · 2026-02-27T18:26:12Z

Yeah recreating the matcher at each iteration is stupid of me. I'll spend some more time looking into this, thanks
And the incremental matching is also somewhere later in the to-do list

LoricAndre · 2026-02-27T18:36:23Z

> Yeah recreating the matcher at each iteration is stupid of me. I'll spend some more time looking into this, thanks
> And the incremental matching is also somewhere later in the to-do list

I checked a bit more, and the current architecture does not allow me to reuse matchers, even using some ThreadLocal magic. I'll look into this further as I add ways to send entire chunks at a time to the FuzzyMatchers later.
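
As a rough illustration of sending whole chunks to a matcher, the sketch below splits the item list into chunks and gives each chunk to one scoped thread, so per-matcher state is built once per chunk instead of once per item. It uses `std::thread::scope` purely for illustration and makes no claim about how it compares to rayon or to frizbee's parallel implementation. `count_matches_chunked` and `contains_subsequence` are hypothetical names, and the matching is a plain subsequence test.

```rust
use std::thread;

/// True if the characters of `query` appear in `choice` in order.
fn contains_subsequence(query: &[char], choice: &str) -> bool {
    let mut chars = choice.chars();
    query.iter().all(|&q| chars.any(|c| c == q))
}

/// Counts the items matching `query`, one chunk of items per thread.
pub fn count_matches_chunked(items: &[&str], query: &str, workers: usize) -> usize {
    let workers = workers.max(1);
    // Ceiling division; at least 1 because `chunks(0)` would panic.
    let chunk_len = ((items.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        // Spawn every worker first, then join them all.
        let handles: Vec<_> = items
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || {
                    // Per-chunk state: built once, reused for each item.
                    let q: Vec<char> = query.chars().collect();
                    chunk
                        .iter()
                        .filter(|choice| contains_subsequence(&q, choice))
                        .count()
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .sum::<usize>()
    })
}
```

The result does not depend on the number of workers; only the amount of per-chunk setup does.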

LoricAndre

Initial review

LoricAndre and others added 13 commits February 22, 2026 00:15

feat: initial work on skim v3

ba10498

wip

wip: SW

7486189

chore: refactor SkimV3 to make it more maintainable

33ed2b4

chore: remove SIMD batch scores

4b0e3e7

fix: fix Skim V3 tests

f4eac16

feat: small optimizations

3785ed4

feat: bigger optimizations

6197b53

chore: generate completions & manpage

4d9043f

chore: remove unused wide dependency

d9e6fa9

Merge branch 'master' into skim-v3

41b797c

chore: update deps

b72df58

Merge branch 'skim-v3' of github.com:skim-rs/skim into skim-v3

6edc4a0

chore: generate completions & manpage

7600031

LoricAndre and others added 16 commits February 23, 2026 18:57

fix: make sure all subsequences pass in non-typos mode

363e8ed

chore: trade some performance against more precision with typos

e935622

feat: gain the performance back using unchecked accesses

abdef81

chore: remove failing tests

b55c61c

feat: use banding across whole upper triangle

80c12f2

chore: remove useless DEAD_COL checks

79cc41a

feat: make sure we match everything frizbee does while enforcing first char

38ba9e8

feat: minor optimizations

af75390

feat: more minor optimizations

026401e

chore: tweak parameters to find a good balance between performance and accuracy

d54e18f

chore: accept snap

3869905

Merge branch 'master' into skim-v3

4bfeb5b

chore: penalize consecutive typos

0775be4

chore: revert consecutive typos penalization as it seems useless in practice

61079ba

wip: optimizations

7373188

feat: multiple optimizations

3e5051d

chore: rename & refactor into multiple files

880448a

LoricAndre changed the title ~~feat: add SkimV3 algorithm~~ feat: add Arinae algorithm Feb 26, 2026

LoricAndre and others added 5 commits February 26, 2026 20:10

Merge branch 'master' into skim-v3

9892c2b

chore: optimizations to the main flow

63e34cb

fix: correct banding in non-typo path

e1a3888

chore: generate completions & manpage

d2b3faa

docs: add algorithms section to the README [skip ci]

2a8d747

LoricAndre marked this pull request as ready for review February 26, 2026 21:30

fix(ari): correctly bound vband low

3f882ac

LoricAndre and others added 3 commits February 27, 2026 11:15

chore(ari): specific pre-separator bonuses

6dae55c

fix(ari): boost consec a bit more to beat start/sep

a2ffb98

chore: generate completions & manpage

654b912

LoricAndre added 5 commits February 28, 2026 00:04

feat: run matcher over chunks

c9c54e0

chore: adjust penalties to keep typos under subsequences

f8875ba

chore: accept snapshot

58f9424

fix: replace greedy ordered prefilter with looser unordered

9046330

chore: finish up rename

ea8cd28

LoricAndre commented Feb 28, 2026

chore: review

11ba019

LoricAndre merged commit c652744 into master Mar 1, 2026
14 checks passed

LoricAndre deleted the skim-v3 branch March 1, 2026 17:50

LoricAndre merged 80 commits into master from skim-v3

3 participants
