Commit fdf6c64

better doc (#705)
* better doc
* format
* guarding tests on big endian
1 parent f0a88b1 commit fdf6c64

3 files changed: 217 additions and 3 deletions

README.md

Lines changed: 213 additions & 1 deletion
@@ -9,6 +9,38 @@

 Portable Roaring bitmaps in C (and C++) with full support for your favorite compiler (GNU GCC, LLVM's clang, Visual Studio, Apple Xcode, Intel oneAPI). Included in the [Awesome C](https://github.com/kozross/awesome-c) list of open source C software.

+# Table of Contents
+
+- [Introduction](#introduction)
+- [Objective](#objective)
+- [Requirements](#requirements)
+- [Quick Start](#quick-start)
+- [Packages](#packages)
+- [Using Roaring as a CPM dependency](#using-roaring-as-a-cpm-dependency)
+- [Using as a CMake dependency with FetchContent](#using-as-a-cmake-dependency-with-fetchcontent)
+- [Amalgamating](#amalgamating)
+- [API](#api)
+- [Main API functions](#main-api-functions)
+- [C++ API functions](#c-api-functions)
+- [Dealing with large volumes of data](#dealing-with-large-volumes-of-data)
+- [Running microbenchmarks](#running-microbenchmarks)
+- [Custom memory allocators](#custom-memory-allocators)
+- [Example (C)](#example-c)
+- [Compressed 64-bit Roaring bitmaps (C)](#compressed-64-bit-roaring-bitmaps-c)
+- [Conventional bitsets (C)](#conventional-bitsets-c)
+- [Example (C++)](#example-c-1)
+- [Building with cmake (Linux and macOS, Visual Studio users should see below)](#building-with-cmake-linux-and-macos-visual-studio-users-should-see-below)
+- [Building (Visual Studio under Windows)](#building-visual-studio-under-windows)
+- [Usage (Using conan)](#usage-using-conan)
+- [Usage (Using vcpkg on Windows, Linux and macOS)](#usage-using-vcpkg-on-windows-linux-and-macos)
+- [SIMD-related throttling](#simd-related-throttling)
+- [Thread safety](#thread-safety)
+- [How to best aggregate bitmaps?](#how-to-best-aggregate-bitmaps)
+- [Wrappers for Roaring Bitmaps](#wrappers-for-roaring-bitmaps)
+- [Mailing list/discussion group](#mailing-listdiscussion-group)
+- [Contributing](#contributing)
+- [References about Roaring](#references-about-roaring)
+
 # Introduction

 Bitsets, also called bitmaps, are commonly used as fast data structures. Unfortunately, they can use too much memory.
@@ -253,7 +285,187 @@ We also have a C++ interface:
 - [roaring64map.hh](https://github.com/RoaringBitmap/CRoaring/blob/master/cpp/roaring64map.hh).


-# Dealing with large volumes
+# Main API functions
+
+Below is an overview of the main functions provided by CRoaring in C, covering both 32-bit (`roaring.h`) and 64-bit (`roaring64.h`) bitmaps. For more details, see the header files in `include/roaring/` or the Doxygen documentation.
+
+## Creation and Destruction
+- `roaring_bitmap_t *roaring_bitmap_create(void);`
+  Create a new empty 32-bit bitmap.
+- `roaring64_bitmap_t *roaring64_bitmap_create(void);`
+  Create a new empty 64-bit bitmap.
+- `void roaring_bitmap_free(roaring_bitmap_t *r);`
+  Free a 32-bit bitmap.
+- `void roaring64_bitmap_free(roaring64_bitmap_t *r);`
+  Free a 64-bit bitmap.
+
+## Adding and Removing Values
+- `void roaring_bitmap_add(roaring_bitmap_t *r, uint32_t x);`
+  Add value `x` to a 32-bit bitmap.
+- `void roaring64_bitmap_add(roaring64_bitmap_t *r, uint64_t x);`
+  Add value `x` to a 64-bit bitmap.
+- `void roaring_bitmap_remove(roaring_bitmap_t *r, uint32_t x);`
+  Remove value `x` from a 32-bit bitmap.
+- `void roaring64_bitmap_remove(roaring64_bitmap_t *r, uint64_t x);`
+  Remove value `x` from a 64-bit bitmap.
+
+## Queries and Cardinality
+- `bool roaring_bitmap_contains(const roaring_bitmap_t *r, uint32_t x);`
+  Check if `x` is present in a 32-bit bitmap.
+- `bool roaring64_bitmap_contains(const roaring64_bitmap_t *r, uint64_t x);`
+  Check if `x` is present in a 64-bit bitmap.
+- `uint64_t roaring_bitmap_get_cardinality(const roaring_bitmap_t *r);`
+  Get the number of elements in a 32-bit bitmap.
+- `uint64_t roaring64_bitmap_get_cardinality(const roaring64_bitmap_t *r);`
+  Get the number of elements in a 64-bit bitmap.
+
+## Iteration
+- `bool roaring_iterate(const roaring_bitmap_t *r, roaring_iterator iterator, void *param);`
+  Iterate over all values in a 32-bit bitmap, calling `iterator` for each value.
+- `bool roaring64_iterate(const roaring64_bitmap_t *r, roaring_iterator64 iterator, void *param);`
+  Iterate over all values in a 64-bit bitmap.
+
+## Set Operations
+- `roaring_bitmap_t *roaring_bitmap_and(const roaring_bitmap_t *r1, const roaring_bitmap_t *r2);`
+  Intersection (AND) of two 32-bit bitmaps.
+- `roaring64_bitmap_t *roaring64_bitmap_and(const roaring64_bitmap_t *r1, const roaring64_bitmap_t *r2);`
+  Intersection (AND) of two 64-bit bitmaps.
+- `roaring_bitmap_t *roaring_bitmap_or(const roaring_bitmap_t *r1, const roaring_bitmap_t *r2);`
+  Union (OR) of two 32-bit bitmaps.
+- `roaring64_bitmap_t *roaring64_bitmap_or(const roaring64_bitmap_t *r1, const roaring64_bitmap_t *r2);`
+  Union (OR) of two 64-bit bitmaps.
+- `roaring_bitmap_t *roaring_bitmap_xor(const roaring_bitmap_t *r1, const roaring_bitmap_t *r2);`
+  Symmetric difference (XOR) of two 32-bit bitmaps.
+- `roaring64_bitmap_t *roaring64_bitmap_xor(const roaring64_bitmap_t *r1, const roaring64_bitmap_t *r2);`
+  Symmetric difference (XOR) of two 64-bit bitmaps.
+- `roaring_bitmap_t *roaring_bitmap_andnot(const roaring_bitmap_t *r1, const roaring_bitmap_t *r2);`
+  Difference (r1 \ r2) for 32-bit bitmaps.
+- `roaring64_bitmap_t *roaring64_bitmap_andnot(const roaring64_bitmap_t *r1, const roaring64_bitmap_t *r2);`
+  Difference (r1 \ r2) for 64-bit bitmaps.
+
+## Serialization and Deserialization
+- `size_t roaring_bitmap_portable_size_in_bytes(const roaring_bitmap_t *r);`
+  Get the number of bytes required to serialize a 32-bit bitmap.
+- `size_t roaring64_bitmap_portable_size_in_bytes(const roaring64_bitmap_t *r);`
+  Get the number of bytes required to serialize a 64-bit bitmap.
+- `size_t roaring_bitmap_portable_serialize(const roaring_bitmap_t *r, char *buf);`
+  Serialize a 32-bit bitmap to a buffer (portable format).
+- `size_t roaring64_bitmap_portable_serialize(const roaring64_bitmap_t *r, char *buf);`
+  Serialize a 64-bit bitmap to a buffer (portable format).
+- `roaring_bitmap_t *roaring_bitmap_portable_deserialize(const char *buf);`
+  Deserialize a 32-bit bitmap from a buffer.
+- `roaring64_bitmap_t *roaring64_bitmap_portable_deserialize(const char *buf);`
+  Deserialize a 64-bit bitmap from a buffer.
+- `roaring_bitmap_t *roaring_bitmap_portable_deserialize_safe(const char *buf, size_t maxbytes);`
+  Safe deserialization of a 32-bit bitmap (will not read past `maxbytes`).
+- `roaring64_bitmap_t *roaring64_bitmap_portable_deserialize_safe(const char *buf, size_t maxbytes);`
+  Safe deserialization of a 64-bit bitmap.
+- `size_t roaring_bitmap_portable_deserialize_size(const char *buf, size_t maxbytes);`
+  Get the size of a serialized 32-bit bitmap (returns 0 if invalid).
+- `size_t roaring64_bitmap_portable_deserialize_size(const char *buf, size_t maxbytes);`
+  Get the size of a serialized 64-bit bitmap (returns 0 if invalid).
+
+## Validation
+- `bool roaring_bitmap_internal_validate(const roaring_bitmap_t *r, const char **reason);`
+  Validate the internal structure of a 32-bit bitmap. Returns `true` if valid, `false` otherwise. If invalid, `reason` points to a string describing the problem.
+- `bool roaring64_bitmap_internal_validate(const roaring64_bitmap_t *r, const char **reason);`
+  Validate the internal structure of a 64-bit bitmap.
+
+## Notes
+- All memory allocated by the library must be freed using the corresponding `free` function.
+- The portable serialization format is cross-platform and can be shared between different languages and architectures.
+- Always validate bitmaps deserialized from untrusted sources before using them.
+
+
+
+# C++ API functions
+
+The C++ interface is provided via the `roaring.hh` (32-bit) and `roaring64map.hh` (64-bit) headers. These offer a modern, type-safe, and convenient API for manipulating Roaring bitmaps in C++.
+
+## Main Classes
+- `roaring::Roaring` — 32-bit Roaring bitmap
+- `roaring::Roaring64Map` — 64-bit Roaring bitmap
+
+## Common Methods (32-bit and 64-bit)
+- `Roaring()` / `Roaring64Map()`
+  - Construct an empty bitmap.
+- `Roaring(std::initializer_list<uint32_t> values)`
+  - Construct from a list of values.
+- `void add(uint32_t x)` / `void add(uint64_t x)`
+  - Add a value to the bitmap.
+- `void remove(uint32_t x)` / `void remove(uint64_t x)`
+  - Remove a value from the bitmap.
+- `bool contains(uint32_t x) const` / `bool contains(uint64_t x) const`
+  - Check if a value is present.
+- `uint64_t cardinality() const`
+  - Get the number of elements in the bitmap.
+- `bool isEmpty() const`
+  - Check if the bitmap is empty.
+- `void clear()`
+  - Remove all elements.
+- `void runOptimize()`
+  - Convert internal containers to run containers for better compression.
+- `void setCopyOnWrite(bool enable)`
+  - Enable or disable copy-on-write mode for fast/shallow copies.
+- `bool operator==(const Roaring&) const` / `bool operator==(const Roaring64Map&) const`
+  - Equality comparison.
+- `void swap(Roaring&)` / `void swap(Roaring64Map&)`
+  - Swap contents with another bitmap.
+
+## Set Operations
+- `Roaring operator|(const Roaring&) const` / `Roaring64Map operator|(const Roaring64Map&) const`
+  - Union (OR)
+- `Roaring operator&(const Roaring&) const` / `Roaring64Map operator&(const Roaring64Map&) const`
+  - Intersection (AND)
+- `Roaring operator^(const Roaring&) const` / `Roaring64Map operator^(const Roaring64Map&) const`
+  - Symmetric difference (XOR)
+- `Roaring operator-(const Roaring&) const` / `Roaring64Map operator-(const Roaring64Map&) const`
+  - Difference
+- In-place versions: `operator|=`, `operator&=`, `operator^=`, `operator-=`
+
+## Iteration
+- `Roaring::const_iterator` / `Roaring64Map::const_iterator`
+  - Standard C++ iterator support: `begin()`, `end()`
+- `void iterate(function, void* param)`
+  - Call a function for each value (C-style callback).
+
+## Serialization and Deserialization
+- `size_t getSizeInBytes() const`
+  - Get the size in bytes for serialization.
+- `void write(char* buf) const`
+  - Serialize the bitmap to a buffer.
+- `static Roaring read(const char* buf, bool portable = true)`
+  - Deserialize a bitmap from a buffer.
+- `static Roaring readSafe(const char* buf, size_t maxbytes, bool portable = true)`
+  - Safe deserialization (will not read past `maxbytes`).
+
+## Bulk Operations
+- `void addMany(size_t n, const uint32_t* values)` / `void addMany(size_t n, const uint64_t* values)`
+  - Add many values at once.
+- `void toUint32Array(uint32_t* out) const` / `void toUint64Array(uint64_t* out) const`
+  - Export all values to an array.
+
+## Example Usage
+```cpp
+#include "roaring.hh"
+using namespace roaring;
+
+Roaring r1;
+r1.add(42);
+if (r1.contains(42)) {
+    // ...
+}
+Roaring r2 = Roaring::bitmapOf(3, 1, 2, 3);
+Roaring r3 = r1 | r2;
+for (auto v : r3) {
+    // iterate over values
+}
+```
+
+For 64-bit values, use `#include "roaring64map.hh"` and the `Roaring64Map` class, which has a similar API.
+
+
+# Dealing with large volumes of data

 Some users have to deal with large volumes of data. It may be important for these users to be aware of the `addMany` (C++) `roaring_bitmap_or_many` (C) functions as it is much faster and economical to add values in batches when possible. Furthermore, calling periodically the `runOptimize` (C++) or `roaring_bitmap_run_optimize` (C) functions may help.

src/containers/run.c

Lines changed: 1 addition & 1 deletion
@@ -142,7 +142,7 @@ void run_container_offset(const run_container_t *c, container_t **loc,
     pivot = run_container_index_equalorlarger(c, top);
     // pivot is the index of the first run that is >= top or -1 if no such run

-    if(pivot >= 0) {
+    if (pivot >= 0) {
         split = c->runs[pivot].value < top;
         lo_cap = pivot + (split ? 1 : 0);
         hi_cap = c->n_runs - pivot;

tests/toplevel_unit.c

Lines changed: 3 additions & 1 deletion
@@ -218,6 +218,7 @@ DEFINE_TEST(is_really_empty) {
     roaring_bitmap_free(bm);
 }

+#if !CROARING_IS_BIG_ENDIAN
 // https://github.com/Ezibenroc/PyRoaringBitMap/issues/124
 DEFINE_TEST(PyRoaringBitMap124) {
     // adversarial test case
@@ -239,6 +240,7 @@ DEFINE_TEST(PyRoaringBitMap124) {
     const roaring_bitmap_t *r2 = roaring_bitmap_frozen_view(data, length);
     assert_true(r2 == NULL);
 }
+#endif

 DEFINE_TEST(inplaceorwide) {
     uint64_t end = 4294901761;
@@ -4827,7 +4829,6 @@ int main() {
     tellmeall();

     const struct CMUnitTest tests[] = {
-        cmocka_unit_test(PyRoaringBitMap124),
         cmocka_unit_test(fuzz_deserializer),
         cmocka_unit_test(issue660),
         cmocka_unit_test(issue538b),
@@ -4845,6 +4846,7 @@ int main() {
         cmocka_unit_test(issue316),
         cmocka_unit_test(issue288),
 #if !CROARING_IS_BIG_ENDIAN
+        cmocka_unit_test(PyRoaringBitMap124),
         cmocka_unit_test(issue245),
 #endif
         cmocka_unit_test(issue208),
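
The guard pattern used in this test diff, where both the test definition and its registration sit behind the same preprocessor condition, can be sketched in isolation. The macro `EXAMPLE_IS_BIG_ENDIAN` below is hypothetical and is not the library's definition of `CROARING_IS_BIG_ENDIAN`; it relies on the `__BYTE_ORDER__` predefined macro offered by GCC and clang, and falls back to little-endian elsewhere.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Hypothetical macro (NOT the library's): 1 on big-endian targets, else 0.
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define EXAMPLE_IS_BIG_ENDIAN 1
#else
#define EXAMPLE_IS_BIG_ENDIAN 0
#endif

// Runtime cross-check: look at the first byte in memory of a known value.
static int runtime_is_big_endian(void) {
    const uint32_t probe = 0x01020304u;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0x01;
}

// Mirrors the registration table: counts the tests that would be compiled in.
static int count_enabled_tests(void) {
    int n = 1;  // endian-neutral test, always registered
#if !EXAMPLE_IS_BIG_ENDIAN
    n += 1;     // test whose input assumes a little-endian byte layout
#endif
    return n;
}
```

Keeping the definition and the registration under one condition avoids referencing a test function that was never compiled, which is presumably why the commit moves the `PyRoaringBitMap124` entry inside the existing guard.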
