|
9 | 9 |
|
10 | 10 | Portable Roaring bitmaps in C (and C++) with full support for your favorite compiler (GNU GCC, LLVM's clang, Visual Studio, Apple Xcode, Intel oneAPI). Included in the [Awesome C](https://github.com/kozross/awesome-c) list of open source C software.
|
11 | 11 |
|
| 12 | +# Table of Contents |
| 13 | + |
| 14 | +- [Introduction](#introduction) |
| 15 | +- [Objective](#objective) |
| 16 | +- [Requirements](#requirements) |
| 17 | +- [Quick Start](#quick-start) |
| 18 | +- [Packages](#packages) |
| 19 | +- [Using Roaring as a CPM dependency](#using-roaring-as-a-cpm-dependency) |
| 20 | +- [Using as a CMake dependency with FetchContent](#using-as-a-cmake-dependency-with-fetchcontent) |
| 21 | +- [Amalgamating](#amalgamating) |
| 22 | +- [API](#api) |
| 23 | + - [Main API functions](#main-api-functions) |
| 24 | + - [C++ API functions](#c-api-functions) |
| 25 | +- [Dealing with large volumes of data](#dealing-with-large-volumes-of-data) |
| 26 | +- [Running microbenchmarks](#running-microbenchmarks) |
| 27 | +- [Custom memory allocators](#custom-memory-allocators) |
| 28 | +- [Example (C)](#example-c) |
| 29 | +- [Compressed 64-bit Roaring bitmaps (C)](#compressed-64-bit-roaring-bitmaps-c) |
| 30 | +- [Conventional bitsets (C)](#conventional-bitsets-c) |
| 31 | +- [Example (C++)](#example-c-1) |
| 32 | +- [Building with cmake (Linux and macOS, Visual Studio users should see below)](#building-with-cmake-linux-and-macos-visual-studio-users-should-see-below) |
| 33 | +- [Building (Visual Studio under Windows)](#building-visual-studio-under-windows) |
| 34 | + - [Usage (Using conan)](#usage-using-conan) |
| 35 | + - [Usage (Using vcpkg on Windows, Linux and macOS)](#usage-using-vcpkg-on-windows-linux-and-macos) |
| 36 | +- [SIMD-related throttling](#simd-related-throttling) |
| 37 | +- [Thread safety](#thread-safety) |
| 38 | +- [How to best aggregate bitmaps?](#how-to-best-aggregate-bitmaps) |
| 39 | +- [Wrappers for Roaring Bitmaps](#wrappers-for-roaring-bitmaps) |
| 40 | +- [Mailing list/discussion group](#mailing-listdiscussion-group) |
| 41 | +- [Contributing](#contributing) |
| 42 | +- [References about Roaring](#references-about-roaring) |
| 43 | + |
12 | 44 | # Introduction
|
13 | 45 |
|
14 | 46 | Bitsets, also called bitmaps, are commonly used as fast data structures. Unfortunately, they can use too much memory.
|
@@ -253,7 +285,187 @@ We also have a C++ interface:
|
253 | 285 | - [roaring64map.hh](https://github.com/RoaringBitmap/CRoaring/blob/master/cpp/roaring64map.hh).
|
254 | 286 |
|
255 | 287 |
|
256 |
| -# Dealing with large volumes |
| 288 | +# Main API functions |
| 289 | + |
| 290 | +Below is an overview of the main functions provided by CRoaring in C, covering both 32-bit (`roaring.h`) and 64-bit (`roaring64.h`) bitmaps. For more details, see the header files in `include/roaring/` or the Doxygen documentation. |
| 291 | + |
| 292 | +## Creation and Destruction |
| 293 | +- `roaring_bitmap_t *roaring_bitmap_create(void);` |
| 294 | + Create a new empty 32-bit bitmap. |
| 295 | +- `roaring64_bitmap_t *roaring64_bitmap_create(void);` |
| 296 | + Create a new empty 64-bit bitmap. |
| 297 | +- `void roaring_bitmap_free(roaring_bitmap_t *r);` |
| 298 | + Free a 32-bit bitmap. |
| 299 | +- `void roaring64_bitmap_free(roaring64_bitmap_t *r);` |
| 300 | + Free a 64-bit bitmap. |
| 301 | + |
| 302 | +## Adding and Removing Values |
| 303 | +- `void roaring_bitmap_add(roaring_bitmap_t *r, uint32_t x);` |
| 304 | + Add value `x` to a 32-bit bitmap. |
| 305 | +- `void roaring64_bitmap_add(roaring64_bitmap_t *r, uint64_t x);` |
| 306 | + Add value `x` to a 64-bit bitmap. |
| 307 | +- `void roaring_bitmap_remove(roaring_bitmap_t *r, uint32_t x);` |
| 308 | + Remove value `x` from a 32-bit bitmap. |
| 309 | +- `void roaring64_bitmap_remove(roaring64_bitmap_t *r, uint64_t x);` |
| 310 | + Remove value `x` from a 64-bit bitmap. |
| 311 | + |
| 312 | +## Queries and Cardinality |
| 313 | +- `bool roaring_bitmap_contains(const roaring_bitmap_t *r, uint32_t x);` |
| 314 | + Check if `x` is present in a 32-bit bitmap. |
| 315 | +- `bool roaring64_bitmap_contains(const roaring64_bitmap_t *r, uint64_t x);` |
| 316 | + Check if `x` is present in a 64-bit bitmap. |
| 317 | +- `uint64_t roaring_bitmap_get_cardinality(const roaring_bitmap_t *r);` |
| 318 | + Get the number of elements in a 32-bit bitmap. |
| 319 | +- `uint64_t roaring64_bitmap_get_cardinality(const roaring64_bitmap_t *r);` |
| 320 | + Get the number of elements in a 64-bit bitmap. |
| 321 | + |
| 322 | +## Iteration |
| 323 | +- `bool roaring_iterate(const roaring_bitmap_t *r, roaring_iterator iterator, void *param);` |
| 324 | + Iterate over all values in a 32-bit bitmap, calling `iterator` for each value. Iteration stops early if `iterator` returns `false`. |
| 325 | +- `bool roaring64_bitmap_iterate(const roaring64_bitmap_t *r, roaring_iterator64 iterator, void *param);` |
| 326 | + Iterate over all values in a 64-bit bitmap, calling `iterator` for each value. |
| 327 | + |
| 328 | +## Set Operations |
| 329 | +- `roaring_bitmap_t *roaring_bitmap_and(const roaring_bitmap_t *r1, const roaring_bitmap_t *r2);` |
| 330 | + Intersection (AND) of two 32-bit bitmaps. |
| 331 | +- `roaring64_bitmap_t *roaring64_bitmap_and(const roaring64_bitmap_t *r1, const roaring64_bitmap_t *r2);` |
| 332 | + Intersection (AND) of two 64-bit bitmaps. |
| 333 | +- `roaring_bitmap_t *roaring_bitmap_or(const roaring_bitmap_t *r1, const roaring_bitmap_t *r2);` |
| 334 | + Union (OR) of two 32-bit bitmaps. |
| 335 | +- `roaring64_bitmap_t *roaring64_bitmap_or(const roaring64_bitmap_t *r1, const roaring64_bitmap_t *r2);` |
| 336 | + Union (OR) of two 64-bit bitmaps. |
| 337 | +- `roaring_bitmap_t *roaring_bitmap_xor(const roaring_bitmap_t *r1, const roaring_bitmap_t *r2);` |
| 338 | + Symmetric difference (XOR) of two 32-bit bitmaps. |
| 339 | +- `roaring64_bitmap_t *roaring64_bitmap_xor(const roaring64_bitmap_t *r1, const roaring64_bitmap_t *r2);` |
| 340 | + Symmetric difference (XOR) of two 64-bit bitmaps. |
| 341 | +- `roaring_bitmap_t *roaring_bitmap_andnot(const roaring_bitmap_t *r1, const roaring_bitmap_t *r2);` |
| 342 | + Difference (r1 \ r2) for 32-bit bitmaps. |
| 343 | +- `roaring64_bitmap_t *roaring64_bitmap_andnot(const roaring64_bitmap_t *r1, const roaring64_bitmap_t *r2);` |
| 344 | + Difference (r1 \ r2) for 64-bit bitmaps. |
| 345 | + |
| 346 | +## Serialization and Deserialization |
| 347 | +- `size_t roaring_bitmap_portable_size_in_bytes(const roaring_bitmap_t *r);` |
| 348 | + Get the number of bytes required to serialize a 32-bit bitmap. |
| 349 | +- `size_t roaring64_bitmap_portable_size_in_bytes(const roaring64_bitmap_t *r);` |
| 350 | + Get the number of bytes required to serialize a 64-bit bitmap. |
| 351 | +- `size_t roaring_bitmap_portable_serialize(const roaring_bitmap_t *r, char *buf);` |
| 352 | + Serialize a 32-bit bitmap to a buffer (portable format). |
| 353 | +- `size_t roaring64_bitmap_portable_serialize(const roaring64_bitmap_t *r, char *buf);` |
| 354 | + Serialize a 64-bit bitmap to a buffer (portable format). |
| 355 | +- `roaring_bitmap_t *roaring_bitmap_portable_deserialize(const char *buf);` |
| 356 | + Deserialize a 32-bit bitmap from a buffer. |
| 357 | +- `roaring64_bitmap_t *roaring64_bitmap_portable_deserialize(const char *buf);` |
| 358 | + Deserialize a 64-bit bitmap from a buffer. |
| 359 | +- `roaring_bitmap_t *roaring_bitmap_portable_deserialize_safe(const char *buf, size_t maxbytes);` |
| 360 | + Safe deserialization of a 32-bit bitmap (will not read past `maxbytes`). |
| 361 | +- `roaring64_bitmap_t *roaring64_bitmap_portable_deserialize_safe(const char *buf, size_t maxbytes);` |
| 362 | + Safe deserialization of a 64-bit bitmap. |
| 363 | +- `size_t roaring_bitmap_portable_deserialize_size(const char *buf, size_t maxbytes);` |
| 364 | + Get the size of a serialized 32-bit bitmap (returns 0 if invalid). |
| 365 | +- `size_t roaring64_bitmap_portable_deserialize_size(const char *buf, size_t maxbytes);` |
| 366 | + Get the size of a serialized 64-bit bitmap (returns 0 if invalid). |
| 367 | + |
| 368 | +## Validation |
| 369 | +- `bool roaring_bitmap_internal_validate(const roaring_bitmap_t *r, const char **reason);` |
| 370 | + Validate the internal structure of a 32-bit bitmap. Returns `true` if valid, `false` otherwise. If invalid, `reason` points to a string describing the problem. |
| 371 | +- `bool roaring64_bitmap_internal_validate(const roaring64_bitmap_t *r, const char **reason);` |
| 372 | + Validate the internal structure of a 64-bit bitmap. |
| 373 | + |
| 374 | +## Notes |
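| 375 | +- Every bitmap returned by the library, including the results of set operations and deserialization, must be released with the matching function (`roaring_bitmap_free` or `roaring64_bitmap_free`). |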
| 376 | +- The portable serialization format is cross-platform and can be shared between different languages and architectures. |
| 377 | +- Always validate bitmaps deserialized from untrusted sources before using them. |
| 378 | + |
| 379 | + |
| 380 | + |
| 381 | +# C++ API functions |
| 382 | + |
| 383 | +The C++ interface is provided via the `roaring.hh` (32-bit) and `roaring64map.hh` (64-bit) headers. These offer a modern, type-safe, and convenient API for manipulating Roaring bitmaps in C++. |
| 384 | + |
| 385 | +## Main Classes |
| 386 | +- `roaring::Roaring` — 32-bit Roaring bitmap |
| 387 | +- `roaring::Roaring64Map` — 64-bit Roaring bitmap |
| 388 | + |
| 389 | +## Common Methods (32-bit and 64-bit) |
| 390 | +- `Roaring()` / `Roaring64Map()` |
| 391 | + - Construct an empty bitmap. |
| 392 | +- `Roaring(std::initializer_list<uint32_t> values)` |
| 393 | + - Construct from a list of values. |
| 394 | +- `void add(uint32_t x)` / `void add(uint64_t x)` |
| 395 | + - Add a value to the bitmap. |
| 396 | +- `void remove(uint32_t x)` / `void remove(uint64_t x)` |
| 397 | + - Remove a value from the bitmap. |
| 398 | +- `bool contains(uint32_t x) const` / `bool contains(uint64_t x) const` |
| 399 | + - Check if a value is present. |
| 400 | +- `uint64_t cardinality() const` |
| 401 | + - Get the number of elements in the bitmap. |
| 402 | +- `bool isEmpty() const` |
| 403 | + - Check if the bitmap is empty. |
| 404 | +- `void clear()` |
| 405 | + - Remove all elements. |
| 406 | +- `bool runOptimize()` |
| 407 | + - Convert internal containers to run containers where that saves space. Returns `true` if the result has at least one run container. |
| 408 | +- `void setCopyOnWrite(bool enable)` |
| 409 | + - Enable or disable copy-on-write mode for fast/shallow copies. |
| 410 | +- `bool operator==(const Roaring&) const` / `bool operator==(const Roaring64Map&) const` |
| 411 | + - Equality comparison. |
| 412 | +- `void swap(Roaring&)` / `void swap(Roaring64Map&)` |
| 413 | + - Swap contents with another bitmap. |
| 414 | + |
| 415 | +## Set Operations |
| 416 | +- `Roaring operator|(const Roaring&) const` / `Roaring64Map operator|(const Roaring64Map&) const` |
| 417 | + - Union (OR) |
| 418 | +- `Roaring operator&(const Roaring&) const` / `Roaring64Map operator&(const Roaring64Map&) const` |
| 419 | + - Intersection (AND) |
| 420 | +- `Roaring operator^(const Roaring&) const` / `Roaring64Map operator^(const Roaring64Map&) const` |
| 421 | + - Symmetric difference (XOR) |
| 422 | +- `Roaring operator-(const Roaring&) const` / `Roaring64Map operator-(const Roaring64Map&) const` |
| 423 | + - Difference |
| 424 | +- In-place versions: `operator|=`, `operator&=`, `operator^=`, `operator-=` |
| 425 | + |
| 426 | +## Iteration |
| 427 | +- `Roaring::const_iterator` / `Roaring64Map::const_iterator` |
| 428 | + - Standard C++ iterator support: `begin()`, `end()` |
| 429 | +- `void iterate(roaring_iterator iterator, void* ptr) const` (`roaring_iterator64` for `Roaring64Map`) |
| 430 | + - Call a function for each value (C-style callback). |
| 431 | + |
| 432 | +## Serialization and Deserialization |
| 433 | +- `size_t getSizeInBytes(bool portable = true) const` |
| 434 | + - Get the number of bytes needed to serialize the bitmap. |
| 435 | +- `size_t write(char* buf, bool portable = true) const` |
| 436 | + - Serialize the bitmap to a buffer and return the number of bytes written. |
| 437 | +- `static Roaring read(const char* buf, bool portable = true)` |
| 438 | + - Deserialize a bitmap from a trusted buffer. |
| 439 | +- `static Roaring readSafe(const char* buf, size_t maxbytes)` |
| 440 | + - Safe deserialization from the portable format (will not read past `maxbytes`). |
| 441 | + |
| 442 | +## Bulk Operations |
| 443 | +- `void addMany(size_t n, const uint32_t* values)` / `void addMany(size_t n, const uint64_t* values)` |
| 444 | + - Add many values at once. |
| 445 | +- `void toUint32Array(uint32_t* out) const` / `void toUint64Array(uint64_t* out) const` |
| 446 | + - Export all values to an array. |
| 447 | + |
| 448 | +## Example Usage |
| 449 | +```cpp |
| 450 | +#include "roaring.hh" |
| 451 | +using namespace roaring; |
| 452 | + |
| 453 | +Roaring r1; |
| 454 | +r1.add(42); |
| 455 | +if (r1.contains(42)) { |
| 456 | + // ... |
| 457 | +} |
| 458 | +Roaring r2 = Roaring::bitmapOf(3, 1, 2, 3); |
| 459 | +Roaring r3 = r1 | r2; |
| 460 | +for (auto v : r3) { |
| 461 | + // iterate over values |
| 462 | +} |
| 463 | +``` |
| 464 | + |
| 465 | +For 64-bit values, use `#include "roaring64map.hh"` and the `Roaring64Map` class, which has a similar API. |
| 466 | + |
| 467 | + |
| 468 | +# Dealing with large volumes of data |
257 | 469 |
|
258 | 470 | Some users have to deal with large volumes of data. It may be important for these users to be aware of the `addMany` (C++) and `roaring_bitmap_or_many` (C) functions, as it is much faster and more economical to add values in batches when possible. Furthermore, periodically calling the `runOptimize` (C++) or `roaring_bitmap_run_optimize` (C) functions may help.
|
259 | 471 |
|
|