package octez-libs
Install
Dune Dependency
Authors
Maintainers
Sources
sha256=aa2f5bc99cc4ca2217c52a1af2a2cdfd3b383208cb859ca2e79ca0903396ca1d
sha512=d68bb3eb615e3dcccc845fddfc9901c95b3c6dc8e105e39522ce97637b1308a7fa7aa1d271351d5933febd7476b2819e1694f31198f1f0919681f1f9cc97cb3a
doc/octez-libs.stdlib/Tezos_stdlib/Bloomer/index.html
Module Tezos_stdlib.Bloomer
Source
A probabilistic set implementation in a fixed memory buffer, with an optional best-effort cleanup mechanism. The bigger the memory buffer, the less false positive outcomes of mem
.
In a standard bloom filter, element membership is encoded as bits being equal to 1 at indices obtained by hashing said element. In this implementation, elements are associated not to bits but to counters.
The countdown
function decrements the counter associated to an element. Hence, each counter corresponds to the number of calls to the countdown
function before they are removed from the filter, assuming no collision occurs.
To the best of our knowledge, the variant of bloom filters implemented in this module is new. In particular, this implementation does not correspond to counting bloom filters as described eg here: https://en.wikipedia.org/wiki/Counting_Bloom_filter
In order to emphasize the use of counters as a time-based garbage collection mechanism, we call this implementation a generational bloom filter.
create ~hash ~hashes ~index_bits ~countdown_bits
creates an initially empty generational bloom filter. The number of generations is 2^countdown_bits
. The filter is an array of 2^index_bits
countdown cells, each of size countdown_bits
. The resulting filter takes 2^index_bits * countdown_bits
bits in memory. The hash function must return enough bytes to represent hashes
indexes of index_bits
bits.
When a value is add
ed, its hash
is split into hashes
chunks of index_bits
, that are used as indexes in the filter's countdown array. These countdown cells are then set to their maximum value, that is 2^countdown_bits-1
.
The value will remain a mem
ber for as long as all these cells are above zero, which in the most optimistic case (where no collision occur) is until countdown
has been called 2^countdown_bits-1
times. An exception is if clear
is called, in which case it is certain to disappear, as all other values.
Arguments to create
are subject to the following constraints:
0 < index_bits <= 24
0 < countdown_bits <= 24
Force removing an element, which may remove others in case of collisions. Use with care.
Percentage (in the 0;1
interval) of cells which are nonzero.
Histogram of life expectancies (measured in number of countdowns to 0).