Pooling Wallet - one possible root cause for syncing issues #12853
Comments
Thank you for your analysis. Before we continue the discussion, I would like to make sure we are on the same page about the use case you are describing. We did not optimize the pool wallet for 70k state transitions; we optimized for many standard transactions, not pool wallet singletons. Do you have a farmer that has won 70k blocks? Or have you changed your pool on chain 70k times? Why exactly do you have so many transactions for the pool wallet? |
The data is from the keys we used to test our pool on testnet, farming for a very long time with a decent number of testnet plots. So, after joining the pool, all further events come from block wins. From a backup db we know that the last known number of pool state transitions is 28370. I actually had time over a few days to review the wallet syncing from a DB perspective and also found a few minor things. |
I see, so it's testnet keys. If there are some relatively simple ways to optimize it, like not re-reading the whole table, we can make these changes. |
I had the feeling that this is an extreme edge case, but I still think this could be avoided. These are the SQL statements executed on every loop: 15:44:45.401262 SELECT height, coin_spend FROM pool_state_transitions WHERE wallet_id=? ORDER BY transition_index
15:44:45.410667 SELECT transition_index, height, coin_spend FROM pool_state_transitions WHERE wallet_id=? ORDER BY transition_index DESC LIMIT 1
15:44:45.412169 SELECT COUNT(*) FROM pool_state_transitions WHERE wallet_id=? AND height=? AND coin_spend=?
15:44:45.412477 INSERT OR IGNORE INTO pool_state_transitions VALUES (?, ?, ?, ?)
15:44:45.412934 SELECT height, coin_spend FROM pool_state_transitions WHERE wallet_id=? ORDER BY transition_index
15:44:45.422234 SELECT height, coin_spend FROM pool_state_transitions WHERE wallet_id=? ORDER BY transition_index
15:44:45.453788 SELECT height, coin_spend FROM pool_state_transitions WHERE wallet_id=? ORDER BY transition_index
15:44:45.549240 SELECT * from coin_record WHERE coin_name=?
15:44:45.549638 SELECT * from coin_record WHERE coin_name=?
15:44:45.549939 INSERT OR FAIL INTO coin_record VALUES(?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
15:44:45.550090 SELECT last_insert_rowid()
15:44:45.550250 SELECT MIN(derivation_index) FROM derivation_paths WHERE used=0 AND hardened=0;
15:44:45.550484 SELECT MAX(derivation_index) FROM derivation_paths WHERE wallet_id=?
15:44:45.550930 UPDATE coin_record SET spent_height=?,spent=? WHERE coin_name=?
15:44:45.551016 SELECT last_insert_rowid()
The pool_state_transitions table is actually selected 4 times, not 3; I missed one in my earlier count. And another thing: the comment at chia-blockchain/chia/pools/pool_wallet.py, lines 263 to 265 in 7c9fb65
But the whole apply process runs in a single transaction, which also blows up the WAL file, and of course, when it is stopped or crashes, it has to start from 0 again. |
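To illustrate the cost difference, here is a minimal sqlite3 sketch. The table schema is inferred from the queries logged above; the single-row replacement query is a suggestion of mine, not code from the project:

```python
import sqlite3

# In-memory stand-in for the wallet DB; schema mirrors the logged queries.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pool_state_transitions("
    "transition_index INTEGER, wallet_id INTEGER, height INTEGER, coin_spend BLOB, "
    "PRIMARY KEY(transition_index, wallet_id))"
)
for i in range(1000):
    conn.execute("INSERT INTO pool_state_transitions VALUES (?, ?, ?, ?)",
                 (i, 1, i, b"spend"))

# Pattern from the log: read every row, even when only the latest is needed.
rows = conn.execute(
    "SELECT transition_index FROM pool_state_transitions "
    "WHERE wallet_id=? ORDER BY transition_index", (1,)
).fetchall()
latest_full_scan = rows[-1][0]

# Cheaper alternative (hypothetical): let SQLite walk the PK to a single row.
latest_single_row = conn.execute(
    "SELECT MAX(transition_index) FROM pool_state_transitions WHERE wallet_id=?",
    (1,),
).fetchone()[0]

assert latest_full_scan == latest_single_row == 999
```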
This applies to pool farming as well as self-farming with plot NFTs. Yes, it is extremely inefficient. Could you provide timings on each of those queries and which ones took a long time? We might be able to fix it by adding a simple cache somewhere. However, I do not want to make extensive and potentially risky changes, since this is not an issue that will affect too many people, while adding a bug would affect many people. |
From what I see, the inefficiency comes mainly (or only) from querying the growing pool_state_transitions table.
Short report: TIME: {amount of seconds to loop 200 times}, NEW_COIN_ITERATIONS: {overall counter}, AVG_OVER_LAST_200_ITERS: {average in sec for the last 200 iterations}
The following metrics show the average and total time spent, measured before and after this for loop: chia-blockchain/chia/pools/pool_wallet.py, line 289 in 7c9fb65
For instance, the first 200 iterations took about 15.3 s overall. Between iteration 3600 and 3800, processing the 200 iterations took 77.7 s. |
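The measurement pattern behind these numbers can be sketched as follows. The function and variable names here are stand-ins for the wallet's real objects; only the per-window timing approach is the point:

```python
import time

def run_with_metrics(apply_state_transition, spends, window=200):
    """Time a sync-style loop, reporting the average seconds per iteration
    over each `window` iterations (cf. AVG_OVER_LAST_200_ITERS above)."""
    averages = []
    window_start = time.monotonic()
    for i, spend in enumerate(spends, start=1):
        apply_state_transition(spend)        # the work being measured
        if i % window == 0:
            elapsed = time.monotonic() - window_start
            averages.append(elapsed / window)
            window_start = time.monotonic()  # reset for the next window
    return averages

# Usage with a dummy no-op transition, 400 iterations -> two windows of 200.
avgs = run_with_metrics(lambda s: None, range(400), window=200)
assert len(avgs) == 2
```

If the per-window average grows as the loop progresses, as reported above (15.3 s for the first window versus 77.7 s around iteration 3600), the per-iteration cost depends on how much has already been processed, which points at the growing table reads.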
I have a test wallet on testnet with many claims and wins, and it takes days to sync. I believe this is likely partially explained by this analysis. Assigning to Almog to take a look. |
The long_sync operation of a big wallet is very inefficient when the wallet has many pool_state_transitions. I've tested sync operations with a testnet wallet with ~80k coins and presumably ~70k pool_state_transitions.
Process Flow
Here is a simplified process flow of how I understood the relevant parts of the sync.
When the long sync begins with the Pooling Wallet sync, this loop is executed: chia-blockchain/chia/wallet/wallet_state_manager.py, line 1087 in 244a9dc
The inefficiency comes from the call of the apply_state_transition function: chia-blockchain/chia/wallet/wallet_state_manager.py, line 1091 in 244a9dc
Within this function we first call another function to add the spend to the db: chia-blockchain/chia/pools/pool_wallet.py, line 284 in 244a9dc
BEGIN add_spend
Before the spend is actually added to the db, the following two queries are executed:
chia-blockchain/chia/wallet/wallet_pool_store.py, lines 47 to 52 in 244a9dc
chia-blockchain/chia/wallet/wallet_pool_store.py, lines 61 to 64 in 244a9dc
END add_spend
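The two pre-insert queries inside add_spend roughly amount to the following simplified sqlite3 sketch. The real implementation lives in wallet_pool_store.py; the details here are reconstructed from the logged SQL, not copied from the project:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pool_state_transitions("
    "transition_index INTEGER, wallet_id INTEGER, height INTEGER, coin_spend BLOB, "
    "PRIMARY KEY(transition_index, wallet_id))"
)

def add_spend(wallet_id, height, coin_spend):
    # Query 1: find the current highest transition_index for this wallet.
    row = conn.execute(
        "SELECT transition_index FROM pool_state_transitions "
        "WHERE wallet_id=? ORDER BY transition_index DESC LIMIT 1",
        (wallet_id,),
    ).fetchone()
    next_index = 0 if row is None else row[0] + 1

    # Query 2: check whether this exact spend is already recorded
    # (an unindexed filter on wallet_id, height and coin_spend).
    count = conn.execute(
        "SELECT COUNT(*) FROM pool_state_transitions "
        "WHERE wallet_id=? AND height=? AND coin_spend=?",
        (wallet_id, height, coin_spend),
    ).fetchone()[0]
    if count == 0:
        conn.execute(
            "INSERT OR IGNORE INTO pool_state_transitions VALUES (?, ?, ?, ?)",
            (next_index, wallet_id, height, coin_spend),
        )

add_spend(1, 100, b"spend-a")
add_spend(1, 100, b"spend-a")   # duplicate: must not add a second row
add_spend(1, 101, b"spend-b")
assert conn.execute(
    "SELECT COUNT(*) FROM pool_state_transitions").fetchone()[0] == 2
```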
Then we enter this loop: chia-blockchain/chia/pools/pool_wallet.py, lines 288 to 297 in 244a9dc
Here we basically select all known spends from the pooling wallet: chia-blockchain/chia/pools/pool_wallet.py, line 290 in 244a9dc
To get all spends, this SQL query is executed: chia-blockchain/chia/wallet/wallet_pool_store.py, lines 97 to 98 in 244a9dc
Then we process the next coin, starting the whole loop again.
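Putting the steps above together, the sync loop behaves roughly like the model below. The function names mimic the real ones, but the data structures are illustrative stand-ins, not the project's code; the point is that re-reading all spends per coin makes the total work quadratic:

```python
db = []          # stand-in for the pool_state_transitions table
rows_read = 0    # counts how many rows the "database" returns in total

def add_spend(spend):
    if spend not in db:                  # the COUNT(*) existence check
        db.append(spend)                 # INSERT OR IGNORE

def get_spends_for_wallet():
    global rows_read
    rows_read += len(db)                 # a full-table read on every call
    return list(db)

def apply_state_transition(spend):
    add_spend(spend)
    for s in get_spends_for_wallet():    # iterates over ALL historic spends
        pass                             # ...to rebuild pool state each time

for i in range(1000):                    # the long-sync loop over new coins
    apply_state_transition(f"spend-{i}")

# n spends cause roughly n*(n+1)/2 row reads: quadratic overall work.
assert rows_read == 1000 * 1001 // 2
```

This matches the timing report above: each window of 200 iterations gets slower, because every iteration re-reads a table that the previous iterations have grown.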
Possible Problems
For every coin, we select the whole pool_state_transitions table 3 times.
To get the transition_index for the wallet, starting here: chia-blockchain/chia/wallet/wallet_pool_store.py, line 47 in 244a9dc
What is the reason for this query? If we only want the highest transition_index for the given wallet, then we could probably optimize the query.
To get the number of transitions for a given wallet, height and spend, starting here: chia-blockchain/chia/wallet/wallet_pool_store.py, line 61 in 244a9dc
I guess this query checks whether the current spend should be added to the database.
The check is done by selecting rows filtered by wallet_id, height and coin_spend. The PK of the table pool_state_transitions is (transition_index, wallet_id), and there are no other indexes.
To get all heights and coin_spends for a given wallet, starting here: chia-blockchain/chia/wallet/wallet_pool_store.py, line 97 in 244a9dc
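Because the only index is the primary key on (transition_index, wallet_id), the duplicate check has to scan the table. A hypothetical extra index on the filtered columns (not present in the project, named here for illustration) would let SQLite answer the check with an index search instead, as this sketch shows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pool_state_transitions("
    "transition_index INTEGER, wallet_id INTEGER, height INTEGER, coin_spend BLOB, "
    "PRIMARY KEY(transition_index, wallet_id))"
)

check = ("SELECT COUNT(*) FROM pool_state_transitions "
         "WHERE wallet_id=? AND height=? AND coin_spend=?")

# Without an extra index, the query plan falls back to a table scan.
plan = conn.execute("EXPLAIN QUERY PLAN " + check, (1, 1, b"x")).fetchall()
assert any("SCAN" in row[3] for row in plan)

# Hypothetical index on exactly the filtered columns.
conn.execute("CREATE INDEX idx_wallet_height_spend "
             "ON pool_state_transitions(wallet_id, height, coin_spend)")

# Now the planner can satisfy the check from the index alone.
plan = conn.execute("EXPLAIN QUERY PLAN " + check, (1, 1, b"x")).fetchall()
assert any("idx_wallet_height_spend" in row[3] for row in plan)
```

This would not remove the per-coin full-table reads in the loop itself, but it would turn the existence check from a scan into an index lookup.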
So, every time we loop through the apply_state_transition function, we most probably add a new spend and then select all old spends plus the new one three times (twice the whole table, once the row count), basically reading the whole table as explained above. This adds up quickly and leads to a lot of redundant, unnecessary(?) work.
I don't really understand the design decisions behind implementing the sync this way with regard to the SQLite DB, but I'm pretty sure this particular problem could be mitigated by, at the very least, not reading the whole growing table again and again.
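One low-risk shape such a mitigation could take: read the table once per sync, then keep the spend list in memory and append to it, instead of re-selecting everything per coin. This is a sketch under assumed names, not the project's code:

```python
class CachedPoolStore:
    """Hypothetical wrapper mirroring wallet_pool_store's read pattern,
    but the growing table is read from the database only once per sync."""

    def __init__(self, initial_spends):
        self._spends = list(initial_spends)   # one full read at startup

    def add_spend(self, height, coin_spend):
        # Replaces the per-coin COUNT(*) duplicate check; the DB INSERT
        # would still happen here, but no re-read is needed afterwards.
        if (height, coin_spend) not in self._spends:
            self._spends.append((height, coin_spend))

    def get_spends_for_wallet(self):
        return self._spends                   # O(1) instead of a table read

store = CachedPoolStore([])
store.add_spend(100, b"a")
store.add_spend(100, b"a")    # duplicate ignored
store.add_spend(101, b"b")
assert store.get_spends_for_wallet() == [(100, b"a"), (101, b"b")]
```

The cache only has to stay consistent with the table within a single sync, which keeps the change small compared with reworking the schema or the transaction layout.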