Hi,
I am interested in running hnswlib in the streaming track of the big-ann-benchmarks repository. The streaming task executes a sequence of insertions, deletions, and queries under a fairly small memory limit (8 GB) and a one-hour time limit, and it is judged on average recall across all query calls. I have a draft streaming entry for HNSW in this big-ann-benchmarks PR (https://github.com/harsha-simhadri/big-ann-benchmarks/pull/328/files#diff-19fac8cd31d9f2cc23185999fa7a665e44aa22652669295b57590fb17c874dd8), and I am seeking feedback on how best to implement deletion and the replacement of deleted points with newly inserted ones. Currently, I replace deleted points whenever either (a) more than 20% of the vertices in the index are deleted, or (b) the insertion would otherwise require increasing the maximum number of points.
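To make the replacement rule concrete, here is a minimal sketch of the decision in plain Python. It is an illustration and not the code from the PR: the function name `should_replace_deleted` and its parameters are hypothetical, and it assumes the 20% is measured against all occupied slots, including those already marked deleted.

```python
def should_replace_deleted(num_deleted, num_occupied, num_new, max_elements,
                           deleted_fraction_threshold=0.2):
    """Decide whether the next insert batch should reuse deleted slots.

    Hypothetical helper mirroring the two conditions described above.
    num_occupied is assumed to count every slot in use, deleted or not.
    """
    # (a) more than the threshold fraction of vertices are marked deleted
    too_many_deleted = (
        num_occupied > 0
        and num_deleted / num_occupied > deleted_fraction_threshold
    )
    # (b) appending the batch would push the index past its capacity
    would_exceed_capacity = num_occupied + num_new > max_elements
    return too_many_deleted or would_exceed_capacity


# 30% of slots deleted: condition (a) fires
print(should_replace_deleted(30, 100, 10, 1000))
# Few deletions, but the batch would not fit: condition (b) fires
print(should_replace_deleted(10, 100, 950, 1000))
```

In the hnswlib Python bindings, the resulting flag would presumably be passed as `replace_deleted` to `add_items`, which requires the index to have been created with `allow_replace_deleted=True` in `init_index`.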
I am looking to use this entry to benchmark HNSW in upcoming academic work, so I would greatly appreciate any feedback if I have not configured it optimally.
Thanks,
Magdalen