Dear Geth Team,
I am currently looking into some optimisations to Geth related to parallelism, state fetching and snapshots as part of my work on the downstream celo-blockchain.
Although it is not a priority, I think there is a chance that some of this work could end up being merged upstream. Even if not, I think it is still a good idea to start a conversation.
I have a number of questions:
- Is it correct to say that, as of now, no actual optimisations exist based on the Berlin-activated `AccessList` in terms of `Trie` (i.e. DB warming) or even snapshot object prefetching? What are the team's plans for such things going forward? More generally, is there some sort of roadmap capturing planned features, or are these only captured ephemerally in discussions/issues?
- Somewhat relatedly, is it correct to say that the `core/state_prefetcher.go` code is outdated, in the sense that one would not want to call `TransitionDB` in advance to warm the state, since doing so (in parallel, what's more) is costly? What is the (future) purpose of this code, since it's currently not invoked anywhere as far as I can tell? Is the future for this code to morph into DB prefetching based on `AccessList`? If so, what would be a good entry point for spawning the prefetching?
- Why is it that the `TriePrefetcher` that lives in the `StateDB` uses one goroutine per state root to walk the `Trie`? This seems inefficient to me: in my mind, one should spawn as many goroutines as there are `trie.TryGet`s to be walked. One would maintain a hashmap of `Trie`s, pop tasks off a global task list and spawn as many goroutines as there are tasks. The global prefetcher loop would then respond to stop commands. This way, one can leverage concurrency for `Trie` prefetching even within a single contract's state `Trie`. Since `disk fetch >> goroutine context switch`, I think this is a reasonable change. As it is relatively isolated, I guess I could submit a PR here to familiarise myself with the contribution process on geth. I guess I'll make an issue first.

  One question is whether it is preferable to maintain copies of `Trie`s per thread (since this is rather lightweight: just a pointer to a global(?) `Database` and the root node). Further, if the intention is to warm the underlying DB, there is not much reason (apart, perhaps, from convenience) to keep the `Trie`s in memory, is there? In essence, one can request the `Trie`s anew from the `Database` from a separate location after deleting the corresponding prefetcher `Trie` object in memory, as long as one maintains a single global `Database` object.
Cheers,
Jon