-
Notifications
You must be signed in to change notification settings - Fork 0
Add optional data mounting #149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests.
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 9 files with indirect coverage changes 🚀 New features to boost your workflow:
|
|
Seems like a nice simple way of specifying the data should be mounted. Also it seems quite handy that it only creates symlinks if the file exists in the mounted directory. I remember there was some discussion that in some cases we might want to raise an error if the mount doesn't exist? Maybe this behaviour could be specified as an optional bool in the DataMount constructor, if this behaviour is favourable in some cases? If you guys recommend this, it would be better for the DataMount object to do this, then having to do this check manually in every obi-one endpoint I think. Also do we need some custom logic for the staging functions? At the moment would the staging functions create symlinks when they call download functions internally, and then they'd try to edit the files at those symlinks? Rather we probably want that symlinks are first created through the download functions, but then in the editing step (i.e. for the circuit_config.json), the symlink is deleted and the files to be edited are copied to the specified download locations, and then edited? |
When a DataMount is created it checks if the prefix (/data) passed to it exists, otherwise it raises. Is that what you mean? Or do you mean it should raise if a file does not exist in the mount?
If there is such a case, the staging functions should not work directly on input data (links) but rather separate inputs/outputs. iirc for circuit_config.json specifically I think it is downloaded with client.download_content, in which case the file from the DataMount is used to read the bytes of the file but there is no symlink made. In any case, given that the data/ mount point will be read only, if the staging trying to modify these files, it will raise an error. |
Yes that's what I meant. Seems good if that's always behaviour then
Yes, I didn't mean that the data in the mount would be edited, but rather a file would be created at the path of the symlink (not at the path the symlink points to). But I see now the distinction that the files which are edited during staging use This seems like a good pattern, thanks for clarrifying! |
mgeplf
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation looks fine; my concern is that the word Mount doesn't express the intent of what is going on - or at least it confused me at first.
To me, mounting is the act of attaching a filesystem. What is happening here can happen regardless of where the data originates - it's more of a AssetStore or AssetDepot, or perhaps a cache or an indirection?
I hate to nitpick about this, as it's quite an invasive change. Perhaps better a better description would help to make it more clear?
I couldn't come up with a good enough name that describes the role of that object. I think DataCache feels closer to its true meaning. |
Ok. I'd still be tempted to name it something with I'm willing to do the change. |
|
After some discussion we converged to |
Allow specifying a local asset store when instantiating a Client. The mount location will be used to symlink the files from the assets instead of downloading, if they exist.
Example:
If the asset.full_path exists in
/data/{full_path}then the file is not downloaded via the api but it is instead symlinked. This works for both client.download_content, client.download_file and all methods that use them.