Commit 25217d3

Merge pull request #32 from orxfun/readme-revised: readme revised
2 parents 8e07f39 + 0827784, commit 25217d3

File tree

3 files changed: +36 −21 lines changed


Cargo.toml

Lines changed: 1 addition & 1 deletion

@@ -1,6 +1,6 @@
 [package]
 name = "orx-concurrent-vec"
-version = "3.0.1"
+version = "3.0.2"
 edition = "2021"
 authors = ["orxfun <[email protected]>"]
 description = "A thread-safe, efficient and lock-free vector allowing concurrent grow, read and update operations."

README.md

Lines changed: 8 additions & 20 deletions

@@ -162,31 +162,19 @@ In the benchmark, we fix the number of updater threads to 4 and change the numbe
 
 ### Growth Performance
 
-The following experiments focus only on the concurrent growth or collection of the elements.
+As mentioned, `ConcurrentVec` aims at high performance concurrent growth. Therefore, certain design decisions were taken to enable a safe `extend` method that overcomes the **false sharing** problem.
 
-#### Growth with ***push***
+> [Wikipedia](https://en.wikipedia.org/wiki/False_sharing): *When a system participant attempts to periodically access data that is not being altered by another party, but that data shares a cache block with data that is being altered, the caching protocol may force the first participant to reload the whole cache block despite a lack of logical necessity.*
 
-In the first part, *rayon*'s parallel iterator, and push methods of *AppendOnlyVec*, *boxcar::Vec* and *ConcurrentVec* are used to collect results from multiple threads. Further, different underlying pinned vectors of the *ConcurrentVec* are evaluated. You may find the details of the benchmarks at [benches/collect_with_push.rs](https://github.com/orxfun/orx-concurrent-vec/blob/main/benches/collect_with_push.rs).
+The described problem is easily experienced when multiple writers concurrently push elements to the vector. We can avoid it by letting each writer ***extend the vector by multiple consecutive elements***, making it unlikely that the cache blocks accessed and altered by different threads overlap. Furthermore, growing in batches requires fewer atomic updates and hence reduces the overhead of concurrency.
 
-<img src="https://raw.githubusercontent.com/orxfun/orx-concurrent-vec/main/docs/img/bench_collect_with_push.PNG" alt="https://raw.githubusercontent.com/orxfun/orx-concurrent-vec/main/docs/img/bench_collect_with_push.PNG" />
+The document [ConcurrentGrowthBenchmark.md](https://github.com/orxfun/orx-concurrent-vec/blob/main/docs/ConcurrentGrowthBenchmark.md) focuses specifically on the concurrent growth performance of `ConcurrentVec` and reports the results of the benchmarks. In summary:
 
-We observe that:
-* The default `Doubling` growth strategy leads to efficient concurrent collection of results. Note that this variant does not require any input to construct.
-* On the other hand, `Linear` growth strategy performs significantly better. Note that value of this argument means that each fragment of the underlying `SplitVec` will have a capacity of 2^12 (4096) elements. The underlying reason of improvement is potentially be due to less waste and could be preferred with minor knowledge of the data to be pushed.
-* Finally, `Fixed` growth strategy is the least flexible and requires perfect knowledge about the hard-constrained capacity (will panic if we exceed). Since it does not outperform `Linear`, we do not necessarily prefer `Fixed` even if we have the perfect knowledge.
+* `ConcurrentVec` is, in general, performant in concurrent growth.
+* Using `extend` rather than `push` provides further significant performance improvements.
+* There is no significant difference between extending by batches of 64 elements or batches of 65536 elements. This is helpful since we do not need a well-tuned number; a batch size large enough to avoid overlaps seems to be just fine.
 
-The performance can further be improved by using `extend` method instead of `push`. You may see results in the next subsection and details in the [performance notes](https://docs.rs/orx-concurrent-bag/2.3.0/orx_concurrent_bag/#section-performance-notes) of `ConcurrentBag` which has similar characteristics.
-
-#### Growth with ***extend***
-
-The only difference in this follow up experiment is that we use `extend` rather than `push` with *ConcurrentVec*. You may find the details of the benchmarks at [benches/collect_with_extend.rs](https://github.com/orxfun/orx-concurrent-vec/blob/main/benches/collect_with_extend.rs).
-
-The expectation is that this approach will solve the performance degradation due to false sharing, which turns out to be true:
-* Extending rather than pushing might double the growth performance.
-* There is not a significant difference between extending by batches of 64 elements or batches of 65536 elements. We do not need a well tuned number, a large enough batch size seems to be just fine.
-* Not all scenarios allow to extend in batches; however, the significant performance improvement makes it preferable whenever possible.
-
-<img src="https://raw.githubusercontent.com/orxfun/orx-concurrent-vec/main/docs/img/bench_collect_with_extend.PNG" alt="https://raw.githubusercontent.com/orxfun/orx-concurrent-vec/main/docs/img/bench_collect_with_extend.PNG" />
+Of course, not all scenarios allow extending in batches. However, whenever possible, it is preferable due to the potentially significant performance improvements.
 
 ## Contributing
 

docs/ConcurrentGrowthBenchmark.md

Lines changed: 27 additions & 0 deletions

@@ -0,0 +1,27 @@
+# Concurrent Growth Benchmark
+
+The following experiments focus only on the concurrent growth or collection of elements.
+
+## Growth with ***push***
+
+In the first part, *rayon*'s parallel iterator and the push methods of *AppendOnlyVec*, *boxcar::Vec* and *ConcurrentVec* are used to collect results from multiple threads. Further, different underlying pinned vectors of the *ConcurrentVec* are evaluated. You may find the details of the benchmarks at [benches/collect_with_push.rs](https://github.com/orxfun/orx-concurrent-vec/blob/main/benches/collect_with_push.rs).
+
+<img src="https://raw.githubusercontent.com/orxfun/orx-concurrent-vec/main/docs/img/bench_collect_with_push.PNG" alt="https://raw.githubusercontent.com/orxfun/orx-concurrent-vec/main/docs/img/bench_collect_with_push.PNG" />
+
+We observe that:
+* The default `Doubling` growth strategy leads to efficient concurrent collection of results. Note that this variant does not require any input to construct.
+* On the other hand, the `Linear` growth strategy performs significantly better. The value of its argument determines the fragment size: here, each fragment of the underlying `SplitVec` has a capacity of 2^12 (4096) elements. The improvement is potentially due to less wasted capacity, and this strategy can be preferred with only minor knowledge of the data to be pushed.
+* Finally, the `Fixed` growth strategy is the least flexible and requires perfect knowledge of the hard-constrained capacity (it will panic if we exceed it). Since it does not outperform `Linear`, we do not necessarily prefer `Fixed` even when we have perfect knowledge.
+
+The performance can further be improved by using the `extend` method instead of `push`. You may see the results in the next subsection and details in the [performance notes](https://docs.rs/orx-concurrent-bag/2.3.0/orx_concurrent_bag/#section-performance-notes) of `ConcurrentBag`, which has similar characteristics.
+
+## Growth with ***extend***
+
+The only difference in this follow-up experiment is that we use `extend` rather than `push` with *ConcurrentVec*. You may find the details of the benchmarks at [benches/collect_with_extend.rs](https://github.com/orxfun/orx-concurrent-vec/blob/main/benches/collect_with_extend.rs).
+
+The expectation is that this approach will solve the performance degradation due to false sharing, which turns out to be true:
+* Extending rather than pushing might double the growth performance.
+* There is no significant difference between extending by batches of 64 elements or batches of 65536 elements. We do not need a well-tuned number; a large enough batch size seems to be just fine.
+* Not all scenarios allow extending in batches; however, the significant performance improvement makes it preferable whenever possible.
+
+<img src="https://raw.githubusercontent.com/orxfun/orx-concurrent-vec/main/docs/img/bench_collect_with_extend.PNG" alt="https://raw.githubusercontent.com/orxfun/orx-concurrent-vec/main/docs/img/bench_collect_with_extend.PNG" />
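The `Doubling` and `Linear` strategies compared in the new benchmark document differ in how fragment capacities of the underlying `SplitVec` grow. The following std-only sketch illustrates that difference; the first-fragment capacity of 4 for `Doubling` is an illustrative assumption, while `Linear(12)` fragments of 2^12 = 4096 elements follow the text above.

```rust
// Sketch of fragment capacities under two growth strategies of a
// fragmented (pinned) vector. Fragments are only appended, never moved,
// so previously pushed elements keep their memory locations.

// Doubling: each fragment has twice the capacity of the previous one.
fn doubling_fragment_capacity(fragment_idx: usize, first: usize) -> usize {
    first << fragment_idx
}

// Linear(exponent): every fragment has the same capacity of 2^exponent.
fn linear_fragment_capacity(_fragment_idx: usize, exponent: u32) -> usize {
    1usize << exponent
}

fn main() {
    // Doubling with an assumed first fragment of 4: 4, 8, 16, 32, 64, ...
    let doubling: Vec<usize> = (0..5).map(|i| doubling_fragment_capacity(i, 4)).collect();
    assert_eq!(doubling, vec![4, 8, 16, 32, 64]);

    // Linear(12): every fragment holds 2^12 = 4096 elements, regardless of index.
    assert_eq!(linear_fragment_capacity(0, 12), 4096);
    assert_eq!(linear_fragment_capacity(7, 12), 4096);

    println!("doubling capacities: {doubling:?}");
}
```

Uniform `Linear` fragments waste at most one fragment's worth of capacity, which is consistent with the benchmark's observation that `Linear` potentially improves on `Doubling` due to less waste.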
