Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[indexer-alt] add coin jsonrpc impl #20931

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

[indexer-alt] add coin jsonrpc impl #20931

wants to merge 5 commits into from

Conversation

emmazzz
Copy link
Contributor

@emmazzz emmazzz commented Jan 21, 2025

Description

This PR added indexer alt jsonrpc reader implementation for getCoins and getAllCoins. I ported over raw_query from graphql crate for usage here.

Test plan

Added e2e tests.


Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • Protocol:
  • Nodes (Validators and Full nodes):
  • gRPC:
  • JSON-RPC:
  • GraphQL:
  • CLI:
  • Rust SDK:

Copy link

vercel bot commented Jan 21, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
sui-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Feb 12, 2025 0:52am
2 Skipped Deployments
Name Status Preview Comments Updated (UTC)
multisig-toolkit ⬜️ Ignored (Inspect) Visit Preview Feb 12, 2025 0:52am
sui-kiosk ⬜️ Ignored (Inspect) Visit Preview Feb 12, 2025 0:52am

@emmazzz emmazzz temporarily deployed to sui-typescript-aws-kms-test-env January 21, 2025 08:39 — with GitHub Actions Inactive
@emmazzz emmazzz temporarily deployed to sui-typescript-aws-kms-test-env January 21, 2025 18:27 — with GitHub Actions Inactive
@emmazzz emmazzz marked this pull request as ready for review January 21, 2025 18:28
@emmazzz emmazzz requested a review from lxfind January 21, 2025 18:28
@emmazzz emmazzz temporarily deployed to sui-typescript-aws-kms-test-env January 21, 2025 18:28 — with GitHub Actions Inactive
@emmazzz emmazzz requested review from amnn, gegaowp and wlmyng January 21, 2025 18:28
// let cursor_coin_query = cb::table.select(StoredCoinBalanceBucket::as_select()).filter(cb::object_id.eq(cursor.to_vec()));
let mut cursor_coins: Vec<StoredCoinBalanceBucket> =
conn.results_with_raw_query(cursor_coin_query).await?;
if cursor_coins.len() != 1 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you are trying to achieve here, but there can and will be multiple entries with the same object_id in this table, since it is an append table, we add entries for new mutations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to get the balance bucket of the cursor coin here so that we can use the balance in the filter, because results are sorted by balance. And you are right I should account for the append-only-ness here. I've updated the query in the latest commit to select the row with the latest cp sequence number.


// First load the coins' object ids.
// TODO: optimize this query according to the indexes in coin_balance_buckets table.
let mut query = query!("
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you would have to rewrite the entire function with a full raw string as the query.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in the latest commit.

crates/sui-indexer-alt-jsonrpc/src/reader.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-jsonrpc/src/reader.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-jsonrpc/src/reader.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-jsonrpc/src/reader.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-jsonrpc/src/reader.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-jsonrpc/src/api/coin.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-jsonrpc/src/api/coin.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-jsonrpc/src/reader.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-jsonrpc/src/api/coin.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-jsonrpc/src/api/coin.rs Outdated Show resolved Hide resolved
@emmazzz
Copy link
Contributor Author

emmazzz commented Feb 3, 2025

Revamped and rebased this PR based on the latest changes in configs, pagination, errors and data loaders.
Another noticeable change is with the new cursor format consisting of (object_id, cp_sequence_number, balance_bucket) @gegaowp .

Copy link
Contributor

@amnn amnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking in pretty good shape, thanks @emmazzz. Here's a summary of the comments I left, from most to least important:

  • Query performance -- I think the implementation of getAllCoins may have an unnecessary sort in it that we can avoid, and I'm a little bit worried about the DB's ability to efficiently plan the comparison on bucket, checkpoint, ID (so worth double checking that).
    • There's a tangent here about the size of the cursor and potentially using BCS encoded cursors which we can discuss.
  • Transactional test runner change -- this is a bit of a larger change and needs some more design -- I've proposed something in the inline comments. The issue with the current approach is that the feature being introduced (interpolating object IDs into cursors) bridges two places where we use similar syntax for different things, which is likely to cause confusion.
  • Errors -- I think are mostly there, but I made some suggestions to bring them fully in line (the biggest thing to address here is the conversion function not using unwraps, the second biggest thing is the behaviour when we fail to fetch contents which we need to align on).

crates/sui-indexer-alt-jsonrpc/src/api/coin.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-jsonrpc/src/data/objects.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-jsonrpc/src/lib.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-schema/src/objects.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-jsonrpc/src/api/coin.rs Outdated Show resolved Hide resolved
"result": {
"data": [
{
"coinType": "0x0000000000000000000000000000000000000000000000000000000000000002::sui::SUI",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This output makes me think that we are not using the coin balance buckets index as efficiently as we could, because we are merging together different coin types, but IIRC coin type is the second field in the index after owner (so we should be returning all the results for one coin type in one go).

This also implies that the cursor should include the coin type. The cursor is also already quite large, so this may indicate that we need to introduce a BCS-encoded cursor (similar to what we did for GraphQL), but we can do that as a follow-up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is for getAllCoins, which is different from getCoins in that it will return coins of all types in one go. getCoins on the other hand will only return coins of the same type each time so it shouldn't have any issues. getAllCoins is actually not used by Stashed. I only added this method because, well, it was convenient to. I'm happy to remove it or we can tackle this as a follow up later on after Stashed migration.

@emmazzz emmazzz temporarily deployed to sui-typescript-aws-kms-test-env February 5, 2025 07:09 — with GitHub Actions Inactive
Copy link
Contributor

@amnn amnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recording our conversation from today!

crates/sui-indexer-alt-jsonrpc/src/api/coin.rs Outdated Show resolved Hide resolved
query = query
.filter(candidates!(cp_sequence_number).le(cp_sequence_number as i64))
.filter(
// Since the result is ordered by coin_balance_bucket followed by object_id,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Filter on the first Index Only Scan means that we are performing a scan through the results that we may be able to avoid by utilising the index better (I think the piece we are missing for that is bounding by cp_sequence_number in the right order w.r.t. coin_balance_bucket and object_id -- i.e. in between).

crates/sui-indexer-alt-jsonrpc/src/data/objects.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-jsonrpc/src/data/reader.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-jsonrpc/src/lib.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-schema/src/objects.rs Outdated Show resolved Hide resolved
@@ -1352,7 +1350,28 @@ impl<'a> SuiTestAdapter {

variables.insert(format!("cursor_{idx}"), base64d);
} else {
variables.insert(format!("cursor_{idx}"), Base64::encode(s));
// Handle cursor strings containing @{obj_x_y} pattern
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recording our conversation from the pair programming session: It's okay (beneficial, even) to only interpolate other variables when dealing with interpolation for cursors (i.e. pass in an empty list of cursors), and I think it's a good idea to factor this change out into its own commit, at least, but that commit should come first (because the test adapter is being actively touched by multiple people at the moment, so I would rather not increase the confusion factor there).

crates/sui-indexer-alt-jsonrpc/src/lib.rs Outdated Show resolved Hide resolved
query = query
.filter(candidates!(cp_sequence_number).le(cp_sequence_number as i64))
.filter(
// Since the result is ordered by coin_balance_bucket followed by object_id,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We talked about how we would probably need to invert the direction of object IDs in the index to make this as efficient as possible.

@emmazzz emmazzz temporarily deployed to sui-typescript-aws-kms-test-env February 10, 2025 05:55 — with GitHub Actions Inactive
@emmazzz emmazzz changed the base branch from main to emma/txn-test-cursor February 10, 2025 05:56
})) = page.cursor
{
query = query.filter(sql::<Bool>(&format!(
r#"ROW(candidates.coin_balance_bucket, candidates.cp_sequence_number, candidates.object_id)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to using ROW here and ordering by coin_balance_bucket, cp_sequence_number, object_id all DESC a few lines later as we've discussed @amnn

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than using &format!(...), use diesel's bind construct, so that we don't have to worry about re-implementing escaping/encoding. See below from #21119 for an example:

if let Some(c) = page.cursor {
query = query.filter(
sql::<Bool>(r#"("candidates"."cp_sequence_number", "candidates"."object_id") < ("#)
.bind::<BigInt, _>(c.cp_sequence_number as i64)
.sql(", ")
.bind::<Bytea, _>(c.object_id.clone())
.sql(")"),
);
}

(I only learned about this recently, while working on this PR). It would look something like this for coin_balance_buckets:

if let Some(Cursor(c)) = page.cursor {
    query = query.filter(
        sql::<Bool>(r#"("candidates"."coin_balance_bucket", "candidates"."cp_sequence_number", "candidates"."object_id") < ("#)
            .bind::<BigInt, _>(c.coin_balance_bucket as i64)
            .sql(", ")
            .bind::<BigInt, _>(c.cp_sequence_number as i64)
            .sql(", ")
            .bind::<Bytea, _>(c.object_id.clone())
            .sql(")"),
    );
}

(the ROW keyword is optional).

@emmazzz
Copy link
Contributor Author

emmazzz commented Feb 10, 2025

Updated the PR to address the second round of comments, include the index change on coin_balance_bucket and also rebased on top of the transactional test runner cursor pr #21150

Copy link
Contributor

@amnn amnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there -- main comments:

  • I suspect getAllCoins is still not going to perform well because of its ordering. We need to order by coin type first, then balance bucket, etc
  • The index changes need to be split into their own PR to unblock landing this one (also some suggestions for making the migration idempotent and easy to run proactively).
  • Various patterns/suggestions based on rpc-alt: getOwnedObjects #21119. Main ones are to use SqlLiteral::bind, represent the ObjectID using its DB form in the cursor, and to use context(...) instead of .ok_or_else(|| anyhow!(...)).

crates/sui-indexer-alt-jsonrpc/src/api/coin.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-schema/src/objects.rs Outdated Show resolved Hide resolved
}
}

impl Coins {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it would be nice if we could structure this endpoint like the rest of them (with a response module to hold these functions). The rest of the methods are structured that way to promote sharing (it's not possible to share response functions when they are tied to the type implementing the method), and it would be good to maintain consistency so that people can follow that pattern when they are navigating the codebase.

Copy link
Contributor Author

@emmazzz emmazzz Feb 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about it but decided not to do it that way because:

  • I don't think the code here is long or complex enough to need to be split up into different modules. In my opinion being in different files and modules can hurt readability (personally I don't like jumping between files), unless the code is long (transactions's case) or shareable (objects''s case), in which case the benefits outweigh the drawbacks, which bring me to the second point.
  • I don't see the response function in this case being shared. The only methods that return coin related responses are the methods in this file, getCoins, getAllCoins and getAllBalances. And we certainly don't plan to expand jsonrpc interface so most likely no future methods will be using it.

Or do you think this needs to be done for consistency reasons, i.e. any method that has filters and pagination should follow that structure?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recording our offline conversation on this:

  • Indeed, it's mostly to do with consistency (to make it easier for people to connect the dots going from the implementation of one RpcModule to the next).
  • But it's true that splitting things across multiple files can hurt readability too.

Where we ended up was that we would follow the same patterns, but not split it up across files unless the file got large. In this case, with the absence of getAllCoins, this translates into:

  • inlining get_coins_impl back into get_coins,
  • renaming get_coin_id_page as filter_coins and making it a free function (in the same module),
  • making coin_response a free function (but in the same module).

crates/sui-indexer-alt-jsonrpc/src/api/coin.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-jsonrpc/src/api/coin.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-jsonrpc/src/api/coin.rs Outdated Show resolved Hide resolved
crates/sui-indexer-alt-jsonrpc/Cargo.toml Outdated Show resolved Hide resolved
@emmazzz
Copy link
Contributor Author

emmazzz commented Feb 12, 2025

Hey @amnn thanks for the review! I have addressed most of the comments in the latest commit. Specifically,

  • getAllCoins has been removed. I agree with your assessment and I think we can tackle this one after Stashed milestone.
  • The index change has been split into its own PR [NO MERGE YET][indexer-alt] modify coin balance bucket index to sort object id in descending order #21187
  • I got assume_not_null working but I still need to have a separate struct for cursor for the time being because changing the object id type into Vec<u8> of the json cursor type will make it hard to test, given how we handle json cursor and object id replacement in test adapter. We are going to make it a bcs cursor anyway later when bcs cursor lands. I have added TODOs for it in the code.

I understand there are changes in other PRs that can further improve this one (bcs cursor and your improvements to load_latest) but I think these improvements are minor and don't need to block the landing of this one. What do you think?

@emmazzz emmazzz requested a review from amnn February 13, 2025 23:22
@emmazzz emmazzz force-pushed the emma/txn-test-cursor branch from 03b548d to b9e30f1 Compare February 14, 2025 08:43
Copy link
Contributor

@amnn amnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small nits, but this is good to go! Agreed we can incorporate BCS cursors in a follow-up.

BadType(String, anyhow::Error),
}

// TODO: use bcs cursor once we have it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now available, after a rebase, I believe.

#[derive(Serialize, Deserialize, Debug, Clone)]
struct BalanceCursor {
#[serde(rename = "o")]
object_id: ObjectID,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
object_id: ObjectID,
object_id: Vec<u8>,

}
}

impl Coins {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recording our offline conversation on this:

  • Indeed, it's mostly to do with consistency (to make it easier for people to connect the dots going from the implementation of one RpcModule to the next).
  • But it's true that splitting things across multiple files can hurt readability too.

Where we ended up was that we would follow the same patterns, but not split it up across files unless the file got large. In this case, with the absence of getAllCoins, this translates into:

  • inlining get_coins_impl back into get_coins,
  • renaming get_coin_id_page as filter_coins and making it a free function (in the same module),
  • making coin_response a free function (but in the same module).

// We get all the qualified coin ids first.
let coin_id_page = self.get_coin_id_page(owner, coin_type_tag, page).await?;

let coin_futures = coin_id_page.data.iter().map(|id| self.coin_response(*id));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Is it possible to call into_iter here instead?

Comment on lines +186 to +188
candidates.field(cb::object_id),
candidates.field(cb::cp_sequence_number),
candidates.field(cb::coin_balance_bucket).assume_not_null(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the following work?

Suggested change
candidates.field(cb::object_id),
candidates.field(cb::cp_sequence_number),
candidates.field(cb::coin_balance_bucket).assume_not_null(),
candidates!(object_id),
candidates!(cp_sequence_number),
candidates!(coin_balance_bucket).assume_not_null(),

})) = page.cursor
{
query = query.filter(
sql::<Bool>(r#"(candidates.coin_balance_bucket, candidates.cp_sequence_number, candidates.object_id) < ("#)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can drop the raw string if the string contents does not include quotes:

Suggested change
sql::<Bool>(r#"(candidates.coin_balance_bucket, candidates.cp_sequence_number, candidates.object_id) < ("#)
sql::<Bool>("(candidates.coin_balance_bucket, candidates.cp_sequence_number, candidates.object_id) < (")

Comment on lines +4 to +8
// This test is to verify that deleted coins are not included in the result of suix_getCoins.
// We create two coins, of balances 12 and 34, call the rpc method to see bot of them in the results.
// Then we merge the coins and call the rpc method again to see that only the merged coin with
// balance 46 is in the results.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: comment copypasta

Base automatically changed from emma/txn-test-cursor to main February 14, 2025 23:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants