Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: key_watermark() API #113

Open
rib opened this issue Mar 13, 2022 · 6 comments
Open

Feature Request: key_watermark() API #113

rib opened this issue Mar 13, 2022 · 6 comments

Comments

@rib
Copy link

rib commented Mar 13, 2022

I'm currently using Slabs as part of a Mesh data structure for vertices, edges, loops + faces (based on the design of Blender's BMesh API) and as part of that I'm looking to have secondary vectors of data that are indexed with keys from the slab.

I'm looking to store ancillary attributes (these may be per-vertex, or per-edge etc) within Vec vectors where I'd really like to avoid any memory overhead per value (meshes can be pretty huge), and the corresponding Slab (e.g. vertex slab for a per-vertex attribute) would be the authority on what keys are valid for indexing into those vectors.

To make this work I'd like to be able to maintain the attribute vector lengths according to the length of the Slab entries vector as a simple way of ensuring that all possible slab keys will be valid for indexing in to these attribute vectors.

To not be confused with querying the length of the slab as the number of current values, 'key_watermark' seems like it could get across what it is that I'd like to query here; like this...

    /// Return the current length of entry storage
    ///
    /// This is always larger than all currently valid keys and less than or
    /// equal to the current capacity.
    ///
    /// This may be useful for maintaining secondary data within vectors that
    /// use the same keys as a primary Slab.
    ///
    /// # Examples
    ///
    /// ```
    /// # use slab::*;
    /// let mut slab = Slab::with_capacity(10);
    ///
    /// for i in 0..4 {
    ///     slab.insert(i);
    /// }
    ///
    /// slab.remove(0);
    /// slab.remove(3);
    /// assert_eq!(slab.key_watermark(), 4);
    /// ```
    pub fn key_watermark(&self) -> usize {
        self.entries.len()
    }

I can look at creating a pull request if something like this sounds reasonable?

@Ralith
Copy link

Ralith commented Mar 13, 2022

Side tables are definitely an important use case. Is capacity insufficient for this?

@Darksonn
Copy link
Contributor

Note that by using the compact method, it is possible to have values with keys greater than that length.

@rib
Copy link
Author

rib commented Mar 13, 2022

Side tables are definitely an important use case. Is capacity insufficient for this?

I haven't double checked the memory growth strategy for Vec but I'd expect some variation of exponential growth which could lead to really large overestimates of the required storage using capacity.

Although capacity can be used for this, any overestimate of the required storage can then cause a significant number of secondary values to need to be initialised.

The entities length may also be an overestimate but to a much lesser extent on average, and since the value is just as readily available (except the current lack of api) then it seems like the ideal choice here.

Considering meshes with 100s of thousands of vertices, edges and faces then this level of redundant work wouldn't be ideal.

@rib
Copy link
Author

rib commented Mar 13, 2022

Note that by using the compact method, it is possible to have values with keys greater than that length.

Although I'd expect the possibility that the length may be an over estimate (which is ok, here and the hope is to just minimize any over estimate that will lead to the redundant initialisation of secondary values), surely even after (or during) a compact then no (valid) key can be greater than this length? Keys index into the entities vector so keys greater than (or equal to) this length would implicitly lead to an out of bounds panic no?

From looking at the implementation of compact it looks like it iteratively pops key/values and swaps them into a hole until the slab length matches the entities length, and I guess out of bound keys should only remain if the user doesn't uphold their side of the remapping?

@Darksonn
Copy link
Contributor

Ah, sorry, you are right, that doesn't happen. I had confused it with the DelayQueue.

rib added a commit to rib/slab that referenced this issue Mar 7, 2023
This can be used to more efficiently manage secondary storage with the
same keys as a primary Slab.

Knowing the largest valid key helps avoid any redundant work associated
with unused keys.

Alternatively, using the `capacity` instead of this watermark may result
in redundantly initializing large numbers of secondary entries -
considering that the backing vector is likely to grow exponentially each
time it fills.

Closes: tokio-rs#113
@LegionMammal978
Copy link

LegionMammal978 commented Apr 4, 2024

I've been looking for this feature for my own project. The context is that I have an unordered collection of elements that I want to name by their index, and I occasionally want to remove some of the elements without having to re-index any of the others. However, element removals are occur within batches of processing, where I scan over each element in the collection, mutate it a bit, and possibly remove it and transfer some of its data into another element.

To continue scanning over the collection while removing some elements, I would ordinarily use Slab::retain(), but that is not usable in this case, since whenever I remove an element, I need to access another element. So the remaining solution is to iterate over every key and do the removals and accesses within the loop, as Slab::retain() does internally. But iterating up to the full capacity() would add a lot of pointless work, trying to access elements that can never exist. So exposing entries.len() would make this loop more viable. (My current workaround is to just store the greatest returned key plus 1 next to the Slab.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants