Requesting Feedback on Adding HNSWLib to Big-ANN-Benchmarks Streaming Track #612

@magdalendobson

Hi,

I am interested in running hnswlib in the streaming track of the big-ann-benchmarks repository. The streaming task consists of executing a sequence of insertions, deletions, and queries under a fairly small (8 GB) memory limit and a one-hour time limit, and it is judged on average recall across all query calls. I have a draft streaming entry for HNSW in this PR in big-ann-benchmarks (https://github.com/harsha-simhadri/big-ann-benchmarks/pull/328/files#diff-19fac8cd31d9f2cc23185999fa7a665e44aa22652669295b57590fb17c874dd8), and I am seeking feedback on how best to implement deletion and the reuse of deleted slots for newly inserted points. Currently, I let new points replace deleted ones whenever either (a) more than 20% of the vertices in the index are marked deleted, or (b) the insertion would otherwise require increasing the maximum number of points in the index.
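To make the two conditions concrete, here is a minimal sketch of the decision rule in plain Python. It is not the code from the PR, and it does not call hnswlib; `SlotTracker` and its fields are hypothetical names used only for illustration. In the Python bindings, I believe the matching step would be passing `replace_deleted=True` to `add_items` on an index initialized with `allow_replace_deleted=True`, but please correct me if that is not the intended usage.

```python
from dataclasses import dataclass


@dataclass
class SlotTracker:
    """Hypothetical bookkeeping for when to reuse deleted slots.

    occupied counts every slot in use, including slots marked deleted.
    deleted counts slots marked deleted that have not been reused yet.
    """
    max_elements: int
    deleted_fraction_threshold: float = 0.20
    occupied: int = 0
    deleted: int = 0

    def record_delete(self, n: int) -> None:
        # Only live (non-deleted) points can be marked deleted.
        if n > self.occupied - self.deleted:
            raise ValueError("cannot delete more points than are live")
        self.deleted += n

    def should_replace(self, n_new: int) -> bool:
        if self.deleted == 0:
            return False  # nothing to reuse
        # (a) more than 20% of the vertices in the index are deleted
        too_many_deleted = (
            self.deleted > self.deleted_fraction_threshold * self.occupied
        )
        # (b) inserting into fresh slots would exceed the current capacity
        would_overflow = self.occupied + n_new > self.max_elements
        return too_many_deleted or would_overflow

    def record_insert(self, n_new: int) -> bool:
        """Update the counts for an insertion; return True if slots were reused."""
        replace = self.should_replace(n_new)
        reused = min(n_new, self.deleted) if replace else 0
        fresh = n_new - reused
        if self.occupied + fresh > self.max_elements:
            raise OverflowError("index would need to be resized")
        self.deleted -= reused
        self.occupied += fresh
        return replace
```

The return value of `record_insert` is what I would use to choose between a plain insertion and one that replaces deleted points. Note that condition (a) is a strict inequality here, so exactly 20% deleted does not trigger replacement.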

I am looking to use this entry to benchmark HNSW in upcoming academic work, so I would greatly appreciate any feedback if I have not configured it optimally.

Thanks,
Magdalen
