package grenier
Module Hll
An implementation of the HyperLogLog probabilistic cardinality estimator.
Type of HyperLogLog counters
Create a new counter with the given error rate. error should satisfy 0.0 < error && error < 1.0. 0.05 is a reasonable default.
Use estimate_memory to measure memory consumption and runtime of this function.
add t k counts item k in t.
k should be "random": it should be the output of some cryptographic hashing algorithm like SHA. It is not treated as an integer. This is key to getting proper results. No patterns should appear in the bits of the different items added.
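One way to obtain such a key from arbitrary data is to hash it and keep 64 bits of the digest. The sketch below is only an illustration under stated assumptions: it assumes keys are 64-bit values (the signature of add is not shown on this page), it uses the standard library's Digest module (MD5) as a stand-in for a stronger hash such as SHA, and key_of_string is a hypothetical helper name, not part of Hll.

```ocaml
(* Hypothetical helper, not part of Hll: derive a 64-bit key from a string.
   Digest.string computes an MD5 digest (16 bytes); the first 8 bytes are
   read as a little-endian int64. Swap in a SHA implementation if available. *)
let key_of_string (s : string) : int64 =
  let digest = Digest.string s in
  Bytes.get_int64_le (Bytes.of_string digest) 0
```

The same input always yields the same key, so duplicates of an item map to the same key and are counted once.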
Runtime is O(1).
Estimate the memory consumed in bytes by a counter with the specified error rate.
This ignores the constant overhead of the OCaml representation, around two words. The counter is a bytes value of length estimate_memory ~error + 1.
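To give a feel for how the error rate drives memory use, here is a rough sketch based on the textbook HyperLogLog bound, where the standard error is about 1.04 / sqrt m for m registers. It assumes one byte per register and a power-of-two register count; register_bits and approx_bytes are hypothetical names, and the values returned by estimate_memory may differ.

```ocaml
(* Smallest p such that 1.04 / sqrt (2^p) <= error.
   Textbook HyperLogLog bound, assumed here; not taken from Hll itself. *)
let register_bits (error : float) : int =
  if not (0.0 < error && error < 1.0) then invalid_arg "register_bits";
  let rec go p =
    if 1.04 /. sqrt (2.0 ** float_of_int p) <= error then p else go (p + 1)
  in
  go 0

(* One byte per register, under the same assumption. *)
let approx_bytes (error : float) : int = 1 lsl register_bits error
```

Under these assumptions an error rate of 0.05 needs 2^9 = 512 registers, since 1.04 / sqrt 512 is about 0.046 while 1.04 / sqrt 256 is 0.065.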
merge ~into:t0 t' has the same effect as adding to t0 every item that was added to t'.
t0 and t' must have been constructed with the same error rate!
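The same-error-rate requirement follows from how HyperLogLog counters are usually merged: register by register, keeping the larger value, which only makes sense when both counters have the same number of registers. The sketch below models registers as one byte each; merge_registers is a hypothetical function for illustration, not the code used by Hll.

```ocaml
(* Hypothetical register-wise merge, not Hll's code: each byte is one
   register, and merging keeps the larger value at every position. *)
let merge_registers ~(into : Bytes.t) (src : Bytes.t) : unit =
  if Bytes.length into <> Bytes.length src then
    invalid_arg "merge_registers: counters differ in size";
  Bytes.iteri
    (fun i c -> if c > Bytes.get into i then Bytes.set into i c)
    src
```

Counters built with different error rates would have different register counts, so there would be no position-by-position correspondence to take the maximum over.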
The following algorithm provides a reasonable hashing function for integers, if you want to feed the HLL with "normal" integers.
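As an illustration of what such an integer hash can look like, the sketch below uses the 64-bit finalizer of MurmurHash3 (fmix64), a well-known bit mixer. This is an assumed stand-in chosen for the example: mix_int64 is a hypothetical name, and the function the module actually provides may use a different algorithm.

```ocaml
(* MurmurHash3's 64-bit finalizer (fmix64), shown as an assumed stand-in;
   mix_int64 is a hypothetical name, not a function of Hll.
   Each step (xor-shift, multiplication by an odd constant) is invertible,
   so distinct inputs always produce distinct outputs. *)
let mix_int64 (x : int64) : int64 =
  let open Int64 in
  let x = logxor x (shift_right_logical x 33) in
  let x = mul x 0xff51afd7ed558ccdL in
  let x = logxor x (shift_right_logical x 33) in
  let x = mul x 0xc4ceb9fe1a85ec53L in
  logxor x (shift_right_logical x 33)
```

Sequential integers such as 1, 2, 3 share most of their bits, so passing them through a mixer of this kind before counting removes the patterns the documentation of add warns about. Note that this mixer maps 0 to 0.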