-
Notifications
You must be signed in to change notification settings - Fork 254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement AWS key value store #2883
base: main
Are you sure you want to change the base?
Conversation
Kia ora @ogghead and thanks for this. We've got work going on in #2895 to implement some additional key-value interfaces, and I think it works better to land that one first, then have this PR include all the AWS stuff. This is partly down to what has the biggest compatibility implications combined with the present release timeline, but also will hopefully provide enough infrastructure to make extending this PR to the new interfaces easy! I hope that's okay with you. In the meantime, I'll try to have a look at your points for thought! |
I'd say this is fine for now. We could call this a DynamoDB KV store, which would leave us flexibility to later to add a S3 KV backend if users had large object use cases - nothing here would preclude that as long as we think of it as an "AWS product X" store rather than an "AWS" store.
I don't think we are bound to offer a "tokens in the runtime config" option if that doesn't make sense or is painful to implement. I'm not sure we can rely on the workload identity idea from that SKIP across all Spin runtime environments, but absolutely open to doing things differently as appropriate.
This is what we do in the SQS trigger. There's no credential configuration, we just load the SDK and let figure out the credentials, whether from ambient EVs or whatever. I'm told that's idiomatic enough, so I'd have no problem with doing the same thing here. I'm sure someone will shout out if they do - but we could presumably retrofit additional configuration methods if need be - the Cosmos one certainly went through a few sets of extensions... |
Sounds good to me! I'm excited to see that work land. It makes sense to hold off on this for now, then rework after that is merged and support all WASI KV interfaces for AWS. Thanks for taking a look! |
Great callout! The config specifies "Dynamo" as the KV store type, so should hopefully be flexible to add other backends. I will keep this in mind when implementing the full WASI KV interface
The runtime config token setup is (from local testing) working -- but the "use the default SDK config loading" is proving challenging with the current interface constraints. Mainly as the default AWS config loader is implemented as an async function.
Agreed, this is the pattern I followed in the Kinesis trigger as well. I would love to have this here too! The challenge is function coloring from AWS config default loader function -- using that async function here appeared to require a chain of refactoring across the general KV traits, but I must admit that my async Rust knowledge hit a wall when trying to reconcile the changes required for that. |
Ah, I misread your comment about using the default SDK config - sorry about that. It seems like you could call Tokio's |
Good callout -- I tried using
as well as
While these do compile, I see crashes immediately on Spin app startup with
Alternatively, when I went down the path of asyncifying everything required to await this function, I hit a wall at store_from_toml_fn where closures are returned. Async closure enhancements in Rust might be needed to make the closures returned there async, but this is where my knowledge of async Rust was lacking. If any Giant Async Brains (or anyone) have ideas on the best path forward on this, much appreciated! |
All right. I think I have a way around this for you. "But," in the words of Deep Thought, "you're not going to like it." So I asked the Giant Async Brains about blocking on Here is what I did, which seems to work (but you may find a less awful way, this was just the first stab that didn't make the compiler mad at me):
let client_fut: std::pin::Pin<Box<dyn std::future::Future<Output = Client> + Send>> = Box::pin(async move {
let config = match auth_options {
KeyValueAwsDynamoAuthOptions::RuntimeConfigValues(config) => /* as current */,
KeyValueAwsDynamoAuthOptions::Environmental => {
aws_config::load_defaults(BehaviorVersion::latest()).await // as before but uncommented
}
};
Client::new(&config)
});
let client_cell = async_once_cell::Lazy::from_future(client_fut);
Ok(Self { client: client_cell, table }) (Some of the naming here is poor, this was throwaway code.)
async fn get(&self, name: &str) -> Result<Arc<dyn Store>, Error> {
Ok(Arc::new(AwsDynamoStore {
_name: name.to_owned(),
client: self.client.get_unpin().await.clone(), // <-- this bit
table: self.table.clone(),
}))
} NOTE: this breaks Let me know if you need more info or want a proper diff. |
Excellent! This is exactly the Galaxy Async Brain thinking I was sorely lacking 😄 I will give this a go tonight, thanks for the tips! |
74e7705
to
0ecf501
Compare
Can confirm this worked like a charm! Pushed the changes to reflect and I will keep an eye on the full WASI KV implementation PR. I am in your debt for your help on this @itowlson :) One does not simply walk into |
I'm delighted to have helped! Thanks once again for your effort, your patience, and your good humour throughout this... ... ...because you will need them when I call that debt in. ominous music and cheesy lightning FX in which the viewer can vaguely make out the looming shape of (also, and at the risk of bathos, please ignore MQTT CI failures - it's a known flake) |
891cdc7
to
917f811
Compare
I have implemented atomic and batch operations -- the logic is tentatively all in place. I will test the operations with a component using WASI-KV and then (barring any uncovered issues) mark this ready for review |
d7b5ec1
to
2e71a72
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! I'm not very qualified to review the Dynamo stuff but @endocrimes has kindly volunteered to look at it. In the meantime just a few comments and questions.
crates/key-value-aws/src/store.rs
Outdated
} | ||
|
||
struct AwsDynamoStore { | ||
_name: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the underscore because this is dead code? (I appreciate this has been through a lot of iteration and maybe it got lost in the process!) If this is needed but never used, it would be good to comment why it's needed; if it's not needed, maybe remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, this appears to be dead -- further investigation shows it may also be dead in the Azure implementation (likely where I utilized my brilliant ctrl+c, ctrl+v technique) -- I went ahead and removed this in both implementations
use spin_factor_key_value::{log_error, Cas, Error, Store, StoreManager, SwapError}; | ||
|
||
pub struct KeyValueAwsDynamo { | ||
table: Arc<String>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wasn't clear to me why these were Arc given that client
isn't. Maybe merits a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is fair -- the reasoning for client
not being Arc is that client already wraps an Arc:
pub struct Client {
handle: Arc<Handle>,
}
It should be low cost to clone the client without another Arc was my initial thought, but definitely open to thoughts! I will add a comment to this effect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it would have made sense to me if KeyValueAwsDynamo was Clone, but I didn't see a Clone implementation. And although Client
is clone, I don't think async_once_cell::Lazy<Client, std:something::Terrifying<...>>
is, which now makes me wonder if this might be a hangover from a previous iteration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are quite correct -- cloning these fields is only done in StoreManager::get
and StoreManager::summary
and the overall object is never cloned. But, I now realize that the get
function itself returns an Arc -- so this may be unnecessary overhead.
In the Store
implementation for AwsDynamoStore
though, creating CAS handles through the new_compare_and_swap
function made it seem ideal to use Arcs/low cost cloned objects for the client and table to allow creating many parallel CAS handles at low cost. That doesn't preclude containing owned data in the KeyValueAwsDynamo and creating an Arc for table in StoreManager::get
though, would that be preferable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the considerations for AwsDynamoStore
are different - it was specifically this KeyValueAwsDynamo
struct that was puzzling me. But yeah if the expectation is that these fields will be repeatedly cloned then it makes sense to Arc
them once here. And with client
, as you say, it's already cheap to await (except for the first time) and clone the result. Thanks for the patient explanation - I'm happy now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is certainly possible that this is premature optimization and caching of the store is done in a table higher in the orchestration of key-value factors -- open to tweaking this! I appreciate the discussion :) It does look like we could get away without making region an Arc as it is never cloned (outside of formatting a string of course), so going to change that to a String in KeyValueAwsDynamo
crates/key-value-aws/src/store.rs
Outdated
} | ||
|
||
async fn exists(&self, key: &str) -> Result<bool, Error> { | ||
Ok(self.get_item(key).await?.is_some()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it fully downloads the value if present. Is that necessary just to check if the key exists? (It's fine if the answer is "yes" - this is me being ignorant about Dynamo. But if existence checks have time and egress cost implications that we might want to capture those in the docs. And now you have me wondering if the same applies to Cosmos...!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good callout -- while I do not know of a specific operation meant to check whether an item with a specific key exists in DynamoDB, we can definitely avoid downloading the entire value by returning only a specific key (in this case, just the PK). I will update with this optimization, but I cannot speak for whether CosmosDB supports something similar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, the Cosmos mention was more a note to self to see what it does and put a note in the docs if it downloaded potentially large data. Definitely not trying to put that on your plate.
crates/key-value-aws/src/store.rs
Outdated
impl Cas for CompareAndSwap { | ||
async fn current(&self) -> Result<Option<Vec<u8>>, Error> { | ||
// TransactGetItems fails if concurrent writes are in progress on an item | ||
let output = self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking aloud -- this can be brought closer to ensuring a unique lock: TransactWrite can return the VAL key while setting a lock key -- only under the condition that the lock key doesn't already exist. Combined with deleting the lock key on swap, this could guarantee that only one process can acquire "the lock" assuming all processes call current
before swap
. However, this does not prevent another process calling swap
without first calling current
and ignoring the lock. But, it may still be worth making this change to get as close as possible to a transaction -- curious to hear thoughts on the best approach for this!
Scratch that, that is only possible for the UpdateItem operation and not the Update inside a TransactWrite. I suspect using UpdateItem will actually work fine here but checking to confirm behavior is idempotent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed it appears to be, so I will make this change which hopefully drastically simplify the calls in CAS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dynamo uses eventually consistent reads by default - which is fine for applications written against Dynamo, but I believe our Spin KV contract is that you at least read-your-writes. Dynamo charging 2x the cost for a Consistent Read makes this a compelling tradeoff regardless, but we may need to document that fact (and potentially make ConsistentReads an opt-in configuration?)
Otherwise this seems reasonably sound to me (modulo questions about CAS) - thanks for the PR!
crates/key-value-aws/src/store.rs
Outdated
.table_name(self.table.as_str()) | ||
.projection_expression(PK); | ||
|
||
if let Some(keys) = last_evaluated_key { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would potentially be slightly easier to read with the SDK's paginator (https://docs.rs/aws-sdk-dynamodb/latest/aws_sdk_dynamodb/operation/scan/builders/struct.ScanFluentBuilder.html#method.into_paginator) - but I'm extremely not a Rustacean 😅 (and this otherwise is doing ~the same thing afaict)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point -- I was unaware of this utility! I will look into replacing custom logic with the paginator.
} | ||
|
||
#[async_trait] | ||
impl Cas for CompareAndSwap { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little curious about the CAS impl here - With the lock
attribute not expiring (afaik?), it seems like there are a few crash/process-killing/bad-usage cases here that could result in an item being forever locked?
Could it not potentially be better served with a Consistent Read for current
, and then a Conditional write for the swap? something like the cli:
aws dynamodb update-item \
--table-name MyTable \
--key '...' \
--update-expression "SET VAL = :newval" \
--condition-expression "VAL = :currval"
That would also make it easier to understand what happens in race cases if the same key is both cas'd and set
/set-many
concurrently.
But if there's a common dynamo pattern I've missed please definitely let me know 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed -- there is definitely a situation where crashes could leave the lock sitting there -- my hope was that this is minimal as:
- All non-atomic writes use
Put
(erasing the lock) - Swapping the value deletes the lock regardless of whether it was acquired
But, if someone calls current
, then experienced hardware failure or a business logic crash, and then tries to rerun the whole component logic, they could find themselves in this bad state where the lock has not been released. I may have been overly hasty to refactor the CAS to hold a unique lock on data rather than using an optimistic lock with a version key. Your solution is elegant -- my only concern with using the VAL itself for conditional update was increased cost of sending it over the wire. I was previously using a version key for this and I think it's possible to use that in an update operation to do this comparison at lower cost: cache the incremented version key during current
and then assert it is the same during swap
.
I'll play around with a few options for optimistic locking on this today
Good point! I was mulling over whether to add a configuration to specify strongly consistent reads, and this comment makes it clear that would be good to have (and potentially default to strongly consistent to maintain consistency of behavior with other implementations?) It should be quick to add, will push that up either this morning or tonight |
Ok! I may have gotten a bit pedantic with the states for Cas but this is hopefully in line with what you were imagining @endocrimes and cleans up. That said, I uncovered some odd behavior while testing and published an "int test Spin app" repo here for reference
It is certainly possible that these issues are caused by my own environment but the AWS one appears to potentially be rooted in higher level orchestration logic around atomic retries here |
crates/key-value-aws/src/store.rs
Outdated
} | ||
|
||
async fn delete(&self, key: &str) -> Result<(), Error> { | ||
if self.exists(key).await? { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to check for key existence? Does client.delete_item
fail if the item doesn't exist? If so, we still have a race condition between the call to exists
and delete_item
where the item might be deleted and the error could occur.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! It appears that the delete operation only fails if a condition is set. I will remove this check as the API does not throw errors when running delete on nonexistent items
crates/key-value-aws/src/store.rs
Outdated
let mut results = Vec::with_capacity(keys.len()); | ||
|
||
if keys.is_empty() { | ||
return Ok(results); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: with_capacity
does allocation which we can avoid in the empty key case by moving the initialization down.
let mut results = Vec::with_capacity(keys.len()); | |
if keys.is_empty() { | |
return Ok(results); | |
} | |
if keys.is_empty() { | |
return Ok(Vec::new()); | |
} | |
let mut results = Vec::with_capacity(keys.len()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair, I may be missing some finer details of Vec memory allocation but the docs seem to imply that Vec::new and Vec::with_capacity(0) should behave the same:
However, the pointer might not actually point to allocated memory. In particular, if you construct a Vec with capacity 0 via Vec::new, vec![], Vec::with_capacity(0), or by calling shrink_to_fit on an empty Vec, it will not allocate memory.
Regardless, I will make this change to ensure no allocations occur in this case!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, this got me thinking -- would it be better to include these empty list checks here in the KV host implementation? That would ensure consistency across all implementations when handling empty lists for batch operations. I am imagining adding this check after fetching the store on 285 (so that permissions can be checked still) -- as well as on other batch operations there. Let me know if that is something that would be desirable, otherwise I can make this change here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a side note, I see potentially unnecessary vec allocations at that level
Signed-off-by: Darwin Boersma <[email protected]>
…ons easier Signed-off-by: Darwin Boersma <[email protected]>
Signed-off-by: Darwin Boersma <[email protected]>
Signed-off-by: Darwin Boersma <[email protected]>
Signed-off-by: Darwin Boersma <[email protected]>
Signed-off-by: Darwin Boersma <[email protected]>
Signed-off-by: Darwin Boersma <[email protected]>
Signed-off-by: Darwin Boersma <[email protected]>
…returned values for getItem calls Signed-off-by: Darwin Boersma <[email protected]>
…istency Signed-off-by: Darwin Boersma <[email protected]>
cc @devigned for the possible issue in the SQLite (or host?) implementation (#2883 (comment)) |
I'm at KubeCon right now, but I will try to give it a look when I have a break. At first glance, it seems like the CAS resource is not registered in the CAS resource table. The test code looked correct to me, so this is likely a bug in the CAS implementation. |
…needed exists check, higher level filtering of empty get_all queries, sqlite handle null value before swap Signed-off-by: Darwin Boersma <[email protected]>
3e9c8e5
to
1cc913c
Compare
I appreciate the quick responses! On lunch so pushed up a few fixes for latest comments. I did some additional testing and the issue on sqlite get_many does appear to be inconsistent and I can reproduce with AWS too -- I have a hunch that caching of deleted values could be involved as adding sleeps in between delete_many and get_many operations allows consistent passes of that test for sqlite. I did end up adding one fallback in the sqlite implementation in |
…d to client Signed-off-by: Darwin Boersma <[email protected]>
ca9edbb
to
8db1174
Compare
Signed-off-by: Darwin Boersma <[email protected]>
Ok, I have solved all the observed issues:
Pushed up fixes as well as validated that my test app passes all situations with AWS -- the sqlite implementation passes all situations barring the last one "Two handles, read nonexistent object, only one writes successfully" -- it might need a custom enum to capture the difference between an unknown CAS object and one that was fetched/is expected to be None -- but I could be misinterpreting the expected CAS behavior on an unknown object, let me know if we should actually treat unknown previous object as expected to not exist in database at all. I appreciate all the great feedback on this PR! |
Hi folks! I am creating this draft PR to solicit feedback on an initial AWS key value store implementation. I appreciate any and all discussions on this PR!
Some points for thought: