-
Notifications
You must be signed in to change notification settings - Fork 138
feat: blockstore: GetMany blockstore method #492
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…pstream master branch of go-datastore
|
Thank you for submitting this PR!
Getting other community members to do a review would be great help too on complex PRs (you can ask in the chats/forums). If you are unsure about something, just leave us a comment.
We currently aim to provide initial feedback/triaging within two business days. Please keep an eye on any labelling actions, as these will indicate priorities and status of your contribution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for future me:
We maybe want automatic forwarding using type assertions in the default constructor function too.
blockstore/blockstore.go
Outdated
| bs.rehash.Store(enabled) | ||
| } | ||
|
|
||
| func (bs *getManyBlockStore) Get(ctx context.Context, k cid.Cid) (blocks.Block, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am surprised this function needs to be duplicated, I think you could use embeding here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is possible but would require passing in a ds.Batching to the constructor function in addition to theds.TxnDatastore (or some composite interface of the two) as ds.TxnDatastore can't be used to construct a regular Blockstore for embedding because it doesn't satisfy the ds.Batching interface, even though the Put and Get methods we would like to fall through to don't use the portions of that interface that it doesn't support...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option would be to break the normal Blockstore up into the components that need ds.Batching and the ones that just need ds.Datastore.ds.TxDatastore satisfies ds.Datastore and we only need ds.Datastore to fulfill the Get and Put methods on either blockstore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know what you think the best/cleanest approach is here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would make it take the union of batching and txn datastores so it has access to all that is needed.
You could also implement a stub, a batch is inferior to a transaction while providing the same features:
Batching datastores support deferred, grouped updates to the database.
Batches do NOT have transactional semantics: updates to the underlying datastore are not guaranteed to occur in the same iota of time. Similarly, batched updates will not be flushed to the underlying datastore untilCommithas been called.Txns from aTxnDatastorehave all the capabilities of aBatch, but the reverse is NOT true.
https://pkg.go.dev/github.com/ipfs/go-datastore#Batching
So if you accept a TxnDatastore you can use a simple stub that implements batches using transactions, this would make thee code work in the only place it's used right now.
blockstore/blockstore.go
Outdated
| } | ||
|
|
||
| // TxnBlockstore is a blockstore interface that supports GetMany and PutMany methods using ds.TxnDatastore | ||
| type TxnBlockstore interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think TxnBlockstore is wrong because the blockstore does not provide transactions, GetManyBlockstore was fine mb.
Codecov ReportAttention: Patch coverage is
@@ Coverage Diff @@
## main #492 +/- ##
==========================================
- Coverage 65.78% 65.64% -0.15%
==========================================
Files 207 203 -4
Lines 25156 25385 +229
==========================================
+ Hits 16549 16663 +114
- Misses 7147 7235 +88
- Partials 1460 1487 +27
... and 42 files with indirect coverage changes 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx I'll try to look at this before the freeze on friday but I can't promise anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @i-norden 🙏. IIUC this is related to your work landing a usable alternative to filecoin-project/go-hamt-ipld#103 which is great to see.
Left some suggestions.
| type GetManyBlockstore interface { | ||
| Blockstore | ||
| PutMany(ctx context.Context, blocks []blocks.Block) error | ||
| GetMany(context.Context, []cid.Cid) ([]blocks.Block, []cid.Cid, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two recommendations:
- It seems like there's no need for
[]blocks.Blockand[]cid.CidsinceBlockcontains a.Cid()method https://github.com/ipfs/go-block-format/blob/v0.2.0/blocks.go#L19-L25 - You may want to consider a streaming interface so that you don't have to buffer all the blocks in memory
- The simplest option is to return either
(<-chan blocks.Block, error)or(<-chan BlockOption) depending on how you want errors returned - More complicated, but more featureful, would be if this had a more "session-like" quality to it where you could add/remove requests from the channel so that you don't have to spin up one goroutine per-block get (or per batch of gets). Could be something like this https://github.com/ipfs/go-bitswap/pull/593/files#diff-b751089c21e2437423d20fbee6c9a01a61c2422d5f7630ab8272d7fdeb63b654R17
- The simplest option is to return either
If returning an asynchronous object (e.g. channel or iterator) might be worth taking a look at ipfs/kubo#4592 to make sure you don't run into some common pitfalls. With Go generics now iterators may also make this easier than it used to be.
| if len(cs) == 1 { | ||
| // performance fast-path | ||
| block, err := bs.Get(ctx, cs[0]) | ||
| return []blocks.Block{block}, nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't make sense to not return the CID here given it's in the signature, but also it doesn't seem like []cid.Cid needs to be in the return signature
This PR introduces a
GetManyblockstore method to compliment the existingPutManymethod. This is for use in aGetManygo-ipld-cbor datastore implementation which, in turn, is for use in parallelForEachmethods for both go-amt-ipld and go-hamt-ipld. This PR uses theTxnDatastorekeytransform/namespace wrapper introduced in ipfs/go-datastore#210.GetManygo-ipld-cbor PR that uses this: ipfs/go-ipld-cbor#97TODO:
Replace replace directives if/when the dependency is merged and released