The repositories in this GitHub organization are individual dandisets that provide convenient access to terabytes of neural data stored in the BRAIN Initiative DANDI Archive. Together, the dandisets in this organization compose the dandi/dandisets DataLad superdataset, which also provides the code and tools used to create and update individual dandisets.
DataLad is a free and open source distributed data management system that keeps track of your data, creates structure, ensures reproducibility, supports collaboration, and integrates with widely used data infrastructure.
If you do not have git-annex and DataLad but are using uv, it might be enough to run uv tool install datalad --with git-annex.
See Handbook: installation for alternative ways to install it.
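Before installing anything, it can help to check which of the two tools are already on your PATH. The following is only a minimal sketch: missing_tools is a hypothetical helper name introduced here for illustration and is not part of DataLad or git-annex.

```shell
# Print the name of every given tool that cannot be found on PATH.
# NOTE: `missing_tools` is a hypothetical helper, not a DataLad command.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools git-annex datalad)
if [ -n "$missing" ]; then
    echo "Missing:" $missing
    echo "If you use uv, try: uv tool install datalad --with git-annex"
else
    # Both tools were found; report their versions.
    datalad --version
    git annex version | head -n 1
fi
```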
With DataLad, it is trivial to quickly clone an entire dandiset regardless of its data size, and then get or drop files of interest, e.g.
❯ datalad clone https://github.com/dandisets/000027
install(ok): /tmp/000027 (dataset)
❯ cd 000027
❯ ls
dandiset.yaml sub-RAT123/
❯ datalad get sub-RAT123/sub-*.nwb
get(ok): sub-RAT123/sub-RAT123.nwb (file) [from web...]
❯ # do cool stuff on that data
❯ datalad drop sub-RAT123/sub-*.nwb
drop(ok): sub-RAT123/sub-RAT123.nwb (file)
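The get, process, drop cycle shown above can be wrapped in a small shell function so that the file content is dropped again even when the processing step fails. This is a sketch under assumptions: get_run_drop is a hypothetical helper name, it handles a single path, and it relies only on the datalad get and datalad drop commands used in the session above.

```shell
# get_run_drop PATH COMMAND [ARGS...]
# Fetch PATH, run COMMAND [ARGS...] with PATH appended as the last argument,
# then drop the local content of PATH again.
# NOTE: `get_run_drop` is a hypothetical helper, not part of DataLad.
get_run_drop() {
    path=$1
    shift
    datalad get "$path" || return 1
    "$@" "$path"
    status=$?
    # Free the local copy regardless of whether the command succeeded.
    datalad drop "$path"
    return "$status"
}

# Example (run inside a cloned dandiset):
#   get_run_drop sub-RAT123/sub-RAT123.nwb ls -lL
```

The function returns the exit status of the processing command, so it can be used in scripts that need to detect failures.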
Alternatively, install the superdataset of all dandisets and then install only the dandisets of interest (with or without data):
❯ datalad install https://github.com/dandi/dandisets
install(ok): /tmp/dandisets (dataset)
❯ cd dandisets
❯ datalad -l warning install 00002*
install(ok): /tmp/dandisets/000020 (dataset) [Installed subdataset in order to get /tmp/dandisets/000020]
install(ok): /tmp/dandisets/000021 (dataset) [Installed subdataset in order to get /tmp/dandisets/000021]
...
For more information about DataLad and the opportunities it brings, please visit handbook.datalad.org. You can discover even more DataLad datasets at https://registry.datalad.org and https://datasets.datalad.org.