Conversation

@holiman
Contributor

@holiman holiman commented May 8, 2020

This is part 2 of #20796. It's very much a work in progress, and a bit hacky.

  • Deliver main tries
  • Deliver account tries
  • Testing

Feedback regarding the design is appreciated

@holiman
Contributor Author

holiman commented May 8, 2020

@holiman
Contributor Author

holiman commented May 9, 2020

I'm going to restart the benchmark from block 5M now, due to the problems with master.
I did get some nice charts for account_update around the transition to Byzantium (when we stop doing intermediate roots after every tx -- which is the point where I expect this PR to make a big dent).

Screenshot_2020-05-09 Dual Geth - Grafana

So far, this PR only exports the account trie, not yet storage tries.

@holiman
Contributor Author

holiman commented May 9, 2020

@holiman
Contributor Author

holiman commented May 9, 2020

On the new benchmark run, both finished generating the snapshot after about 2h 15m.

May 09 15:12:36 bench01.ethdevops.io geth INFO [05-09|13:12:35.949] Starting peer-to-peer node instance=Geth/v1.9.14-unstable-89a36171-20200508/linux-amd64/go1.14.2
May 09 15:12:42 bench02.ethdevops.io geth INFO [05-09|13:12:42.508] Starting peer-to-peer node instance=Geth/v1.9.14-unstable-82f9ed49-20200508/linux-amd64/go1.14.2 
...
May 09 17:31:56 bench02.ethdevops.io geth INFO [05-09|15:31:56.621] Generated state snapshot accounts=26380419 slots=41264679 storage=4.22GiB elapsed=2h18m54.750s
May 09 17:32:20 bench01.ethdevops.io geth INFO [05-09|15:32:20.731] Generated state snapshot accounts=26380345 slots=41263951 storage=4.22GiB elapsed=2h19m25.169s 

@holiman
Contributor Author

holiman commented May 10, 2020

Block around 57622XX:

May 10 14:36:15 bench01.ethdevops.io geth INFO [05-10|12:36:15.239] Imported new chain segment blocks=74 txs=9891 mgas=487.228 elapsed=8.022s mgasps=60.733 number=5762219 hash="24988a…16b544" age=1y11mo1w dirty=1022.94MiB 
...
May 10 17:03:07 bench02.ethdevops.io geth INFO [05-10|15:03:07.256] Imported new chain segment blocks=45 txs=7366 mgas=308.569 elapsed=8.024s mgasps=38.454 number=5762249 hash="135387…4100a7" age=1y11mo1w dirty=1023.05MiB 

So after ~24h, bench01 has a 2.5-hour lead, or roughly a 10% improvement in block import times.

Here are account and storage update charts:
Screenshot_2020-05-10 Dual Geth - Grafana -- which looks like ~20ms savings per block, on average.

Start block after snapshot generation was around 5049092.
This gives,

  • bench01: 106ms/block on average,
  • bench02: 118ms/block on average

I have some new modifications which I will push soon, to deliver storage tries in object form too.

@holiman
Contributor Author

holiman commented May 11, 2020

About to deploy a new update, here's the result chart for the first run (account tries delivered as objects, storage tries warmed up via prefetching but not delivered as objects)
Screenshot_2020-05-11 Dual Geth - Grafana

@holiman
Contributor Author

holiman commented May 11, 2020

New benchmark started at block 5M. This is the chart from the point where both of them were done generating the snapshot: https://geth-bench.ethdevops.io/d/Jpk-Be5Wk/dual-geth?orgId=1&var-exp=bench01&var-master=bench02&var-percentile=50&from=1589212697550&to=now

@holiman
Contributor Author

holiman commented May 12, 2020

After ~12 hours, here's a chart showing execution combined with storage_update and account_update.
Screenshot_2020-05-12 Dual Geth - Grafana


@holiman
Contributor Author

holiman commented May 12, 2020

First run

The first run was based on the version which delivers account tries, but not storage tries.

This PR:

May 09 17:32:20 bench01.ethdevops.io geth INFO [05-09|15:32:20.731] Generated state snapshot accounts=26380345 slots=41263951 storage=4.22GiB elapsed=2h19m25.169s 
May 09 17:32:26 bench01.ethdevops.io geth INFO [05-09|15:32:26.386] Imported new chain segment blocks=70 txs=9546 mgas=433.862 elapsed=8.090s mgasps=53.629 number=5049092 hash="3ad8d0…2395aa" age=2y3mo1w dirty=1018.35MiB 
...
May 10 14:36:15 bench01.ethdevops.io geth INFO [05-10|12:36:15.239] Imported new chain segment blocks=74 txs=9891 mgas=487.228 elapsed=8.022s mgasps=60.733 number=5762219 hash="24988a…16b544" age=1y11mo1w dirty=1022.94MiB 

Start block after snapshot generation: 5049092, reached block 5762219 (713127 blocks later) after 21h4m: 106.3 ms/block

master

May 09 17:31:56 bench02.ethdevops.io geth INFO [05-09|15:31:56.621] Generated state snapshot accounts=26380419 slots=41264679 storage=4.22GiB elapsed=2h18m54.750s
May 09 17:32:00 bench02.ethdevops.io geth INFO [05-09|15:32:00.354] Imported new chain segment blocks=58 txs=7810 mgas=378.535 elapsed=8.010s mgasps=47.256 number=5048322 hash="153512…a7fc60" age=2y3mo1w dirty=1022.80MiB 
...
May 10 17:03:07 bench02.ethdevops.io geth INFO [05-10|15:03:07.256] Imported new chain segment blocks=45 txs=7366 mgas=308.569 elapsed=8.024s mgasps=38.454 number=5762249 hash="135387…4100a7" age=1y11mo1w dirty=1023.05MiB 

Start block after snapshot generation: 5048322, reached block 5762249 (713927 blocks later) after 23h31m: 118.6 ms/block

Second run

The second run is this PR with storage-trie-delivery added in

May 11 17:55:40 bench01.ethdevops.io geth INFO [05-11|15:55:40.118] Imported new chain segment blocks=67 txs=8194 mgas=456.993 elapsed=8.053s mgasps=56.743 number=5050584 hash="ab7b29…16feee" age=2y3mo1w dirty=1022.65MiB 
May 12 14:35:25 bench01.ethdevops.io geth INFO [05-12|12:35:25.353] Imported new chain segment blocks=77 txs=10947 mgas=512.181 elapsed=8.065s mgasps=63.507 number=5762236 hash="c660bd…ee51b7" age=1y11mo1w dirty=1023.21MiB 

Start block after snapshot generation: 5050584, reached block 5762236 (711652 blocks later) after 20h40m: 104.5 ms/block

master

 May 11 17:52:50 bench02.ethdevops.io geth INFO [05-11|15:52:49.966] Imported new chain segment blocks=52 txs=6657 mgas=379.182 elapsed=8.020s mgasps=47.275 number=5048225 hash="36e92d…c09c8e" age=2y3mo1w dirty=1022.88MiB 
 May 12 17:22:40 bench02.ethdevops.io geth INFO [05-12|15:22:40.635] Imported new chain segment blocks=63 txs=9208 mgas=412.976 elapsed=8.044s mgasps=51.337 number=5762229 hash="f05b55…f7ec57" age=1y11mo1w dirty=1022.64MiB 

Start block after snapshot generation: 5048225, reached block 5762229 (714004 blocks later) after 23h30m: 118.5 ms/block.

@holiman
Contributor Author

holiman commented May 13, 2020

Interestingly, the removal of the regular prefetcher causes the loading-from-snapshot to take a hit.

So the gains won on account_update and storage_update are somewhat eaten into by the speed loss in snapshot_storage_read and snapshot_account_read:
Screenshot_2020-05-13 Dual Geth - Grafana

@holiman
Contributor Author

holiman commented May 14, 2020

The last couple of days:
Screenshot_2020-05-14 Dual Geth - Grafana

The block heights are 6.95M vs 7.22M (in favour of this PR), both started at 5.05M

@holiman
Contributor Author

holiman commented May 15, 2020

Preparing a third run now, where I've re-enabled the old prefetcher, which executes the next block on the current state. The hope is that this will address the regression on snapshot_account_read and snapshot_storage_read.

@holiman
Contributor Author

holiman commented May 15, 2020

Third run in progress

May 15 14:58:43 bench01.ethdevops.io geth INFO [05-15|12:58:43.545] Generated state snapshot accounts=26358822 slots=41225212 storage=4.22GiB elapsed=2h22m45.514s
May 15 14:58:49 bench01.ethdevops.io geth INFO [05-15|12:58:49.194] Imported new chain segment blocks=56 txs=7633 mgas=374.765 elapsed=8.045s mgasps=46.579 number=5048448 hash="272e82…614245" age=2y3mo2w dirty=1023.13MiB 
May 15 14:56:13 bench02.ethdevops.io geth INFO [05-15|12:56:13.641] Generated state snapshot accounts=26386458 slots=41276090 storage=4.22GiB elapsed=2h20m10.229s
May 15 14:56:13 bench02.ethdevops.io geth INFO [05-15|12:56:13.910] Imported new chain segment blocks=28 txs=4216 mgas=205.895 elapsed=8.056s mgasps=25.557 number=5048614 hash="81cd35…68d0e6" age=2y3mo2w dirty=1022.68MiB 

Charts: https://geth-bench.ethdevops.io/d/Jpk-Be5Wk/dual-geth?orgId=1&var-exp=bench01&var-master=bench02&var-percentile=50&from=1589547665049&to=now

@holiman
Contributor Author

holiman commented May 16, 2020

 May 16 12:44:57 bench01.ethdevops.io geth INFO [05-16|10:44:57.160] Imported new chain segment blocks=62 txs=9586 mgas=418.097 elapsed=8.320s mgasps=50.248 number=5762239 hash="ae28f8…91122e" age=1y11mo2w dirty=1023.24MiB 
| Version | Run | Start time | Start block | End time | End block | Speed |
| --- | --- | --- | --- | --- | --- | --- |
| PR v1 | 1 | 15:32:26 | 5049092 | 12:36:15 | 5762219 | 106.3 ms/block |
| master | 1 | 15:32:00 | 5048322 | 15:03:07 | 5762249 | 118.6 ms/block |
| PR v2 | 2 | 15:55:40 | 5050584 | 12:35:25 | 5762236 | 104.5 ms/block |
| master | 2 | 15:52:49 | 5048225 | 15:22:40 | 5762229 | 118.5 ms/block |
| PR v3 | 3 | 12:58:49 | 5048448 | 10:44:57 | 5762239 | 109.8 ms/block |
| master | 3 | 12:56:13 | 5048614 | | | |

Looks like adding back the original prefetcher introduces a regression, although the snapshot reads are now on par with master:
Screenshot_2020-05-16 Dual Geth - Grafana

I can't really explain it -- my best guess is that we're hitting some IO bottleneck, so although we improve in some parts, there's an overall higher sluggishness in all IO.

(there's no obvious extra egress from experimental on the last run either)

@holiman
Contributor Author

holiman commented May 18, 2020

Comparing run 2 with run 3:

Run 2, blocks 5050253 - 6001565 : https://geth-bench.ethdevops.io/d/Jpk-Be5Wk/dual-geth?orgId=1&from=1589212470448&to=1589312346463

run-2

Run 3, blocks 5056095 - 6000581: https://geth-bench.ethdevops.io/d/Jpk-Be5Wk/dual-geth?orgId=1&from=1589548255204&to=1589652724888

run-3

| op | avg run 2 | avg run 3 | diff |
| --- | --- | --- | --- |
| execution | 37.5 ms | 39.7 ms | +2.2 |
| commit | 15.9 ms | 16.9 ms | +1.0 |
| snapshot storage read | 9.3 ms | 8.2 ms | -1.1 |
| account commit | 8.0 ms | 9.7 ms | +1.7 |
| snapshot account read | 6.4 ms | 5.7 ms | -0.7 |
| storage commit | 6.3 ms | 8.0 ms | +1.7 |
| storage hash | 5.4 ms | 5.8 ms | +0.4 |
| account hash | 3.8 ms | 4.0 ms | +0.2 |
| account update | 3.4 ms | 3.4 ms | - |
| validation | 3.2 ms | 3.3 ms | +0.1 |
| snapshot commit | 2.9 ms | 2.8 ms | -0.1 |
| storage update | 2.5 ms | 2.5 ms | - |
| account read | 573 µs | 0 µs | -0.6 |
| storage read | 454 µs | 0 µs | -0.5 |
| SUM | 105.6 ms | 110.0 ms | +4.4 |

@holiman
Contributor Author

holiman commented May 18, 2020

Looking at the efficiency of the trie prefetcher:
(second run):
Screenshot_2020-05-18 Dual Geth - Grafana
(third run):
Screenshot_2020-05-18 Dual Geth - Grafana(1)

We can see that the second run has 5.18K fetches on average, whereas the third run has 4.94K fetches on average -- which is presumably why the storage/account commit operations saw a regression on the third run.

@holiman holiman force-pushed the improve_updates_2 branch from d62220c to 6385d2d Compare May 19, 2020 09:35
@holiman holiman marked this pull request as ready for review May 19, 2020 09:36
@holiman
Contributor Author

holiman commented May 19, 2020

Ready for review. Should I disable the 'regular' prefetcher for good, or leave that as a separate change later?

@holiman
Contributor Author

holiman commented May 24, 2020

Here are the last 6 hours, with both of them at head.
Screenshot_2020-05-24 Dual Geth - Grafana

I'll swap the machines around for a sanity check now

@holiman
Contributor Author

holiman commented May 24, 2020

Fourth run

bench02, now with this PR

May 24 19:27:43 bench02.ethdevops.io geth INFO [05-24|17:27:43.527] Generated state snapshot accounts=26356296 slots=41222789 storage=4.22GiB elapsed=2h19m22.077s
May 24 19:27:45 bench02.ethdevops.io geth INFO [05-24|17:27:45.797] Imported new chain segment blocks=17 txs=3209 mgas=134.557 elapsed=8.114s mgasps=16.583 number=5048551 hash="7b8e34…5c0703" age=2y3mo3w dirty=1023.79MiB 
...
May 25 17:01:45 bench02.ethdevops.io geth INFO [05-25|15:01:45.639] Imported new chain segment blocks=69 txs=10013 mgas=460.737 elapsed=8.033s mgasps=57.352 number=5762233 hash="434e0a…bcf21d" age=1y11mo3w dirty=1023.34MiB 

and bench01, with master:

May 24 19:22:35 bench01.ethdevops.io geth INFO [05-24|17:22:34.993] Generated state snapshot accounts=26350530 slots=41211811 storage=4.22GiB elapsed=2h14m18.436s
May 24 19:22:38 bench01.ethdevops.io geth INFO [05-24|17:22:38.724] Imported new chain segment blocks=43 txs=6352 mgas=300.193 elapsed=8.204s mgasps=36.587 number=5047272 hash="38ddff…2e7ed7" age=2y3mo3w dirty=1022.12MiB 
...
May 25 19:09:13 bench01.ethdevops.io geth INFO [05-25|17:09:13.180] Imported new chain segment blocks=58 txs=9057 mgas=398.132 elapsed=8.238s mgasps=48.324 number=5762238 hash="fa34b4…435bd0" age=1y11mo3w dirty=1023.08MiB 
| Version | Run | Start time | Start block | End time | End block | Speed |
| --- | --- | --- | --- | --- | --- | --- |
| PR v1 | 1 | 15:32:26 | 5049092 | 12:36:15 | 5762219 | 106.3 ms/block |
| master | 1 | 15:32:00 | 5048322 | 15:03:07 | 5762249 | 118.6 ms/block |
| PR v2 | 2 | 15:55:40 | 5050584 | 12:35:25 | 5762236 | 104.5 ms/block |
| master | 2 | 15:52:49 | 5048225 | 15:22:40 | 5762229 | 118.5 ms/block |
| PR v3 | 3 | 12:58:49 | 5048448 | 10:44:57 | 5762239 | 109.8 ms/block |
| master | 3 | 12:56:13 | 5048614 | | | |
| PR v3 | 4 | 17:27:45 | 5048551 | 15:01:45 | 5762233 | 108.7 ms/block |
| master | 4 | 17:22:38 | 5047272 | 17:09:13 | 5762238 | 119.9 ms/block |

Q.E.D
:shipit:

@holiman holiman force-pushed the improve_updates_2 branch from 6385d2d to 96b7e15 Compare June 9, 2020 10:13
@holiman
Contributor Author

holiman commented Jun 9, 2020

rebased

@holiman holiman force-pushed the improve_updates_2 branch 2 times, most recently from 3bd0b1a to e9edf05 Compare June 9, 2020 13:58
Member

Can we pass the prefetcher pointer to the statedb directly?

Also, I think the triePrefetcher only makes sense if we use snapshots. If snapshots are disabled, then we should disable this too.

Contributor Author

The bc.triePrefetcher will be nil if it's disabled. The rest of the snapshot-stuff is handled via bc, so it's a bit cumbersome to make the connection directly to statedb.

Member

If we load the data from the trie directly (without hitting the snapshot for some reason), is it OK to reload the path again via the triePrefetcher?

I think it's fine.
(a) Reloading will be super cheap if it's in memory.
(b) Usually we can hit all slots in the snapshot.

Contributor Author

I think we would only not use the snapshot if we're in the process of creating it, which is a temporary headache, and it's ok to load it twice in that case, since it would also have been warmed up in the cache.

Another similar thing is if we have two identical storage tries, in two separate contracts. We can only "hand out" the trie to one of them, and the second one will have to load from disk/cache. Otherwise, the first one would modify the trie, and the second one would get the modified version.

Member

Perhaps we can cut down the number of slots to be resolved.

If the original value in storage is empty, then we don't need to resolve the path for those slots.

But I have no idea how many slots of this kind we will have (newly created slots in the contract).

Contributor Author

Why wouldn't we need to resolve the path if the original value is empty? If we're going to write to it, we'll still need to resolve the path.

Member

Any particular reason to move this storage initialization?

Contributor Author

Yes, we might not need to do it at all if there are no pending storage changes.

Contributor Author

That is, if the value == originStorage

Member

Perhaps we can do a similar thing as with the storage trie: whenever we retrieve the resolved trie from the prefetcher, mark it as nil.

Contributor Author

You mean to just release the reference for GC, or do you mean for correctness?

Member

The cached tries should only be accessed once and then thrown away, e.g. the storage trie.
I mean it's the same for the state trie: if we fetch the state trie from the prefetcher, then mark it as nil
(although the current code already does this).
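The "hand out once, then nil" semantics being discussed can be sketched as follows. All names here (`prefetcher`, `take`, the stand-in `Trie` type) are hypothetical illustrations, not the PR's actual code:

```go
package main

import "fmt"

// Trie is a stand-in for the real state trie interface.
type Trie struct{ root string }

// prefetcher is a hypothetical holder of warmed-up tries, keyed by root.
type prefetcher struct {
	tries map[string]*Trie
}

// take hands out a prefetched trie exactly once and marks the slot nil, so a
// second caller with the same root falls back to loading its own copy from
// disk/cache instead of sharing (and mutating) the delivered one.
func (p *prefetcher) take(root string) *Trie {
	t := p.tries[root]
	p.tries[root] = nil
	return t
}

func main() {
	p := &prefetcher{tries: map[string]*Trie{"0xabc": {root: "0xabc"}}}
	fmt.Println(p.take("0xabc") != nil) // first caller gets the warm trie
	fmt.Println(p.take("0xabc") != nil) // second caller must reload its own copy
}
```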

@holiman
Contributor Author

holiman commented Jul 6, 2020

I addressed some of the concerns, I need to think a bit more about the remaining ones + do a rebase.

Contributor Author

@holiman holiman left a comment

Mainly LGTM, I might need some more time with it to fully understand everything new

Contributor Author

acive -> active

Contributor Author

alreacy -> already

Contributor Author

It's quite easy to misread fetches and fetchers when skimming through the code.

Comment on lines +282 to +286
Contributor Author

If, for some reason, we are missing a root, I think the effect will be that the subfetcher fails to start (and logs a warning), and later on, when we try to peek, nothing is listening on the copy chan, and we'll simply block forever.

I don't have a good idea on how to handle it differently, just making a note of it now...

Well, I guess one idea would be to, instead of returning here, have a loop which just listens on the channels, doesn't actually do any work, and always delivers nil if data is requested.
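That fallback idea could look roughly like the sketch below: a subfetcher whose trie failed to open keeps servicing copy requests but always answers nil, so peekers never block forever. The names and types are hypothetical stand-ins (a `*string` instead of the real `Trie` interface), not the PR's code:

```go
package main

import "fmt"

// subfetcher is a hypothetical stand-in for the PR's subfetcher; *string
// stands in for the Trie interface.
type subfetcher struct {
	copy chan chan *string // incoming peek requests
	stop chan struct{}     // termination signal
}

// idleLoop is the fallback for a subfetcher whose root could not be opened:
// it does no work, but keeps answering copy requests with nil so that
// peek() never blocks forever on an unserviced channel.
func (sf *subfetcher) idleLoop() {
	for {
		select {
		case ch := <-sf.copy:
			ch <- nil // trie unavailable: deliver nil rather than blocking
		case <-sf.stop:
			return
		}
	}
}

func (sf *subfetcher) peek() *string {
	ch := make(chan *string)
	sf.copy <- ch
	return <-ch
}

func main() {
	sf := &subfetcher{copy: make(chan chan *string), stop: make(chan struct{})}
	go sf.idleLoop()
	fmt.Println(sf.peek() == nil) // prints: true
	close(sf.stop)
}
```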

Comment on lines +251 to +253
Contributor Author

My previous comment, about being blocked on the peek -- it might be possible to handle that by doing the same check here as is done a few lines below:

```go
		if sf.trie == nil {
			return nil
		}
```

Because if the root could not be opened, then sf.trie will be nil.

Comment on lines +248 to +262
Contributor Author

So maybe this would be more 'safe':

Suggested change -- from:

```go
func (sf *subfetcher) peek() Trie {
	ch := make(chan Trie)
	select {
	case sf.copy <- ch:
		// Subfetcher still alive, return copy from it
		return <-ch
	case <-sf.term:
		// Subfetcher already terminated, return a copy directly
		if sf.trie == nil {
			return nil
		}
		return sf.db.CopyTrie(sf.trie)
	}
}
```

to:

```go
func (sf *subfetcher) peek() Trie {
	if sf.trie == nil {
		return nil
	}
	ch := make(chan Trie)
	select {
	case sf.copy <- ch:
		// Subfetcher still alive, return copy from it
		return <-ch
	case <-sf.term:
		// Subfetcher already terminated, return a copy directly
		return sf.db.CopyTrie(sf.trie)
	}
}
```

Contributor Author

Ah wait, you always close the term channel when exiting the loop, so it won't actually block here. Nevermind the noise then!

@holiman
Contributor Author

holiman commented Jan 20, 2021

LGTM!!

holiman and others added 2 commits January 21, 2021 01:46
Squashed from the following commits:

core/state: lazily init snapshot storage map
core/state: fix flawed meter on storage reads
core/state: make statedb/stateobjects reuse a hasher
core/blockchain, core/state: implement new trie prefetcher
core: make trie prefetcher deliver tries to statedb
core/state: refactor trie_prefetcher, export storage tries
blockchain: re-enable the next-block-prefetcher
state: remove panics in trie prefetcher
core/state/trie_prefetcher: address some review concerns

sq
@karalabe karalabe added this to the 1.10.0 milestone Jan 20, 2021
@karalabe karalabe merged commit ddadc3d into ethereum:master Jan 20, 2021
gzliudan added a commit to gzliudan/XDPoSChain that referenced this pull request Dec 9, 2024
gzliudan added a commit to gzliudan/XDPoSChain that referenced this pull request Jul 11, 2025