Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add log_index field to events emitted by transactions #157

Open
zomglings opened this issue Nov 1, 2023 · 7 comments
Open

Add log_index field to events emitted by transactions #157

zomglings opened this issue Nov 1, 2023 · 7 comments

Comments

@zomglings
Copy link

zomglings commented Nov 1, 2023

Currently, there is no canonical way to order events in Starknet logs.

This is challenging when analyzing starknet data - any total ordering you impose as an analyst relies on the assumption that all nodes you could connect to emit events in the same order, and that this order is preserved by all RPC middleware.

It would be better if the log_index was canonically reported by all nodes as part of the starknet_getEvents response.

@Mirko-von-Leipzig
Copy link
Contributor

My only argument against this, is that I think this will be quite costly to implement on the node end. Event filtering and storage is already fairly complex.

@zomglings
Copy link
Author

There was some discussion in Telegram about why this is desirable, so I wanted to summarize the discussion and add some clarification here.

The fact that starknet_getEvents supports pagination (unlike eth_getEvents) means that each node is implicitly making a guarantee about the ordering of events. So one might think:

Is it not enough to just specify that events are always in order? That would have to be the case for pagination in any case

The problem right now is that the guarantee is being made on a per-client basis. The way that one Starknet client implementation implements ordering is not necessarily the way in which another client implements ordering. a log_index would be a guarantee of ordering at the protocol level, not at the level of each node implementation.

It is also important to understand why this is important. A single transaction may fire the same event (with the exact same parameters) multiple times. This means that the event type and parameters do not uniquely specify the event. Any event crawler must have a strategy to recover from failures mid-crawl. One scenario that crawlers must deal with is that they connect to a different RPC node post-crash than the one they were connected to pre-crash -- an RPC node implementing a different Starknet client. Unless there exists a protocol-level guarantee (like log_index) about the ordering of events, the only strategy for a crawler to recover from a crash is to delete the last block worth of events from crawl storage and start there, duplicating some of their work.

This certainly isn't the highest priority issue to solve, and most event crawlers build custom workarounds to this issue (e.g. Starkscan crawlers suffix _<event_index> to the transaction hash when displaying events), but I do believe it is important to address at scale.

To @Mirko-von-Leipzig 's point about how costly it is to implement - if it is very costly to implement, that doesn't mean that Starknet should not specify such information at protocol level. It should just mean that this is a lower priority relative to other development.

@zomglings
Copy link
Author

zomglings commented Nov 4, 2023

The real tradeoff I see here is that implementing log_index or something like it will increase one or both of the following:

  1. the amount of storage per event required to operate a node
  2. the starket_getEvents response time

It might be that such an increase is not acceptable.

@omerfirmak
Copy link

omerfirmak commented Nov 6, 2023

Adding a log_index to FILTERED_EVENT that corresponds to the index of the event within a block is essentially free in Juno.

But the situation might be different for other clients depending on their DB schemas.

@Mirko-von-Leipzig
Copy link
Contributor

Adding a log_index to FILTERED_EVENT that corresponds to the index of the event within a block is essentially free in Juno.

I don't think its enough to know the index within a block -- it would need to be the global index? Otherwise the same problem of ordering remains.

I feel like it should be enough to specify in the method description that the events will be returned in chronological order? i.e. block, transaction, event-in-transaction order?

@omerfirmak
Copy link

omerfirmak commented Nov 6, 2023

I don't think its enough to know the index within a block -- it would need to be the global index? Otherwise the same problem of ordering remains.

FILTERED_EVENT already has the block number IIRC, so block number and index within that block, when used together, should allow ordering events globally. Am I missing something?

@Mirko-von-Leipzig
Copy link
Contributor

I don't think its enough to know the index within a block -- it would need to be the global index? Otherwise the same problem of ordering remains.

FILTERED_EVENT already has the block number IIRC, so block number and index within that block, when used together, should allow ordering events globally. Am I missing something?

Ah no my bad; forgot it already has the block number.

A further consideration though: its pretty much impossible for events to be emitted out-of-order since this would break entirely for latest and pending pagination when updates to those come in during pagination.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants