Autovectorize everything #17

Merged
merged 3 commits into from
Feb 8, 2025
12 changes: 3 additions & 9 deletions Cargo.toml
@@ -8,19 +8,13 @@ keywords = ["simd", "iterator"]
categories = ["algorithms", "concurrency"]
repository = "https://github.com/LaihoE/SIMD-itertools"


[dependencies]
itertools = "0.13.0"
multiversion = "0.7.4"
multiversion = "0.8.0"

[dev-dependencies]
criterion = "0.5.1"
rand = "0.8.5"


[[bench]]
name = "position"
harness = false
itertools = "0.13.0"

[profile.release]
lto = true
@@ -32,4 +26,4 @@ debug = false
rpath = false
lto = true
debug-assertions = false
codegen-units = 1
codegen-units = 1
46 changes: 22 additions & 24 deletions README.md
@@ -2,45 +2,43 @@

[![crates.io](https://img.shields.io/crates/v/simd-itertools.svg)](https://crates.io/crates/simd-itertools)

Change:
```Rust
arr.iter().contains()
```
To:

### Unmatched flexibility 🤯

```Rust
arr.iter().contains_simd()
let needles = [42, 52, 94];
arr.iter().any_simd(|x| needles.contains(x) || x > 156);
```
- Works by letting LLVM do the vectorization (may change in the future).
- Functions are made easy to paste into Godbolt for inspection.
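
To illustrate what "letting LLVM do the vectorization" can look like, here is a minimal sketch using only the standard library. The name `any_chunked` and the chunk width `LANES` are hypothetical and chosen for illustration; this is not the crate's actual implementation. The idea is to evaluate the predicate over a fixed-size chunk without an early exit, so the inner loop has a shape the optimizer may be able to vectorize.

```rust
/// Hypothetical helper for illustration only; not the crate's code.
/// Returns true if `f` holds for any element of `data`.
fn any_chunked<T, F: Fn(&T) -> bool>(data: &[T], f: F) -> bool {
    // Assumed chunk width; a real implementation would tune this per type.
    const LANES: usize = 32;

    let mut chunks = data.chunks_exact(LANES);
    for chunk in &mut chunks {
        // Non-short-circuiting OR over the whole chunk: there is no early
        // exit inside this loop, which gives the compiler a chance to
        // vectorize it.
        let mut hit = false;
        for x in chunk {
            hit |= f(x);
        }
        // Early exit only happens between chunks.
        if hit {
            return true;
        }
    }
    // Handle the tail (fewer than LANES elements) with the scalar version.
    chunks.remainder().iter().any(|x| f(x))
}
```

Whether a given closure actually vectorizes depends on the closure body, the element type, and the target features, so inspecting the generated assembly (for example in Godbolt) is the only way to be sure.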


Currently the following are implemented:

```find```
```filter```
```position```
```contains```
```eq```
```min/max```
```is_sorted```
```all_equal```
```all```
```any```
```argmin/argmax```

And works for slice iterators of types: ```u8,u16,u32,u64,i8,i16,i32,i64,f32,f64,isize,usize```

### 🔥🚀 Performance gain compared to the standard library 🚀🔥
![Performance gain of compared to std implementation (u32)](benchmark.png)
You can expect similar performance across the functions.
### Tradeoffs
Every piece of software makes tradeoffs. The goal of this library is to provide the *majority* of the performance gained from going scalar -> vectorized, while staying user-friendly. If you are looking to shave off the last few cycles, this might not be what you are looking for.


### ⚠️ Warning ⚠️:
The library makes one extra assumption over the stdlib: The closure may be executed any number of times:

Requires nightly for now 😔:
```Rust
rustup toolchain install nightly
rustup default nightly
// revert back to stable: rustup default stable
arr.iter().simd_position(|x| {
println!("hello world");
*x == 42
})
```

This may print a different number of times than the standard library version would. This shouldn't be an issue under normal use cases, but it is something to keep in mind.
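
The difference in call counts can be made concrete with a small standard-library-only sketch. `position_chunked` and its chunk width are hypothetical stand-ins for a chunk-at-a-time search, not the crate's actual code; the point is only that evaluating the predicate on whole chunks calls the closure more often than `Iterator::position`, which stops at the first match.

```rust
use std::cell::Cell;

/// Hypothetical chunk-at-a-time search, for illustration only.
fn position_chunked<T, F: Fn(&T) -> bool>(data: &[T], f: F) -> Option<usize> {
    // Assumed chunk width.
    const LANES: usize = 8;
    let mut base = 0;
    for chunk in data.chunks(LANES) {
        // Evaluate the predicate on every element of the chunk, no early exit.
        let mut hit = false;
        for x in chunk {
            hit |= f(x);
        }
        if hit {
            // Rescan the chunk to find the exact index; this calls `f` again.
            return chunk.iter().position(|x| f(x)).map(|i| base + i);
        }
        base += chunk.len();
    }
    None
}

/// Returns (closure calls made by std, closure calls made by the chunked version).
fn count_calls() -> (usize, usize) {
    let arr: Vec<u32> = (0..100).collect();

    let std_calls = Cell::new(0usize);
    let std_pos = arr.iter().position(|x| {
        std_calls.set(std_calls.get() + 1);
        *x == 42
    });

    let chunked_calls = Cell::new(0usize);
    let chunked_pos = position_chunked(&arr, |x| {
        chunked_calls.set(chunked_calls.get() + 1);
        *x == 42
    });

    // Both strategies agree on the result; only the call counts differ.
    assert_eq!(std_pos, chunked_pos);
    (std_calls.get(), chunked_calls.get())
}
```

With these inputs the standard library calls the closure 43 times (elements 0 through 42), while the chunked sketch calls it 51 times (six full chunks of 8, plus 3 calls while rescanning the matching chunk). A closure with side effects would therefore observe different behavior.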

To get the best performance make sure you are compiling with ```-C target-cpu=native```
For example:
```
RUSTFLAGS='-C target-cpu=native' cargo run
```
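
To check whether the flag took effect, one option is to compare the features enabled at compile time with what the CPU reports at runtime. The sketch below uses AVX2 on x86_64 purely as an example; the function names are made up for illustration, and other architectures or features would need their own checks.

```rust
/// True if the binary was compiled with AVX2 enabled, which is what
/// `-C target-cpu=native` would do on a machine that supports it.
fn avx2_at_compile_time() -> bool {
    cfg!(target_feature = "avx2")
}

/// True if the CPU running the binary reports AVX2 support.
#[cfg(target_arch = "x86_64")]
fn avx2_at_runtime() -> bool {
    std::is_x86_feature_detected!("avx2")
}

/// On other architectures AVX2 does not exist.
#[cfg(not(target_arch = "x86_64"))]
fn avx2_at_runtime() -> bool {
    false
}
```

If `avx2_at_runtime()` returns true but `avx2_at_compile_time()` returns false, the build is most likely not using the wider vector instructions the CPU offers.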

### Why is this not part of the standard library
It's tricky. Hopefully one day.
54 changes: 0 additions & 54 deletions benches/all_equal.rs

This file was deleted.

86 changes: 0 additions & 86 deletions benches/contains.rs

This file was deleted.

53 changes: 0 additions & 53 deletions benches/eq.rs

This file was deleted.

55 changes: 0 additions & 55 deletions benches/filter.rs

This file was deleted.
