Source file proof_intf.ml
(*
* Copyright (c) 2013-2022 Thomas Gazagnaire <thomas@gazagnaire.org>
*
* Permission to use, copy, modify, and distribute this software for any
* purpose with or without fee is hereby granted, provided that the above
* copyright notice and this permission notice appear in all copies.
*
* THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
* WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
* MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
* ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
* WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
* ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
* OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
*)

module type S = sig
  (** Proofs are compact representations of trees which can be shared between
peers.
This is expected to be used as follows:
- A first peer runs a function [f] over a tree [t]. While performing this
computation, it records: the hash of [t] (called [before] below), the
hash of [f t] (called [after] below) and a subset of [t] which is needed
to replay [f] without any access to the first peer's storage. Once done,
all this information is packed into a proof [p] of type [t] that is sent
to the second peer.
- The second peer generates an initial tree [t'] from [p] and computes
[f t']. Once done, it compares [t']'s hash and [f t']'s hash to [before]
and [after]. If they match, they know that the result state [f t'] is a
valid context state, without needing access to the full storage
of the first peer. *)

  type contents
  type hash
  type step
  type metadata

  type kinded_hash = [ `Contents of hash * metadata | `Node of hash ]
  [@@deriving irmin]

  type 'a inode = { length : int; proofs : (int * 'a) list } [@@deriving irmin]
  (** The type for (internal) inode proofs.
These proofs encode large directories into a tree-like structure.
Invariants are dependent on the backend.
[length] is the total number of entries in the children of the inode. It's
the size of the "flattened" version of that inode. [length] can be used to
prove the correctness of operations such as [Tree.length] and
[Tree.list ~offset ~length] in an efficient way.
[proofs] contains the children proofs. It is a sparse list of ['a] values.
These values are associated to their index in the list, and the list is
kept sorted in increasing order of indices. ['a] can be a concrete proof
or a hash of that proof.
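
For instance (an illustrative sketch, not produced by any API; [p0] and [p7]
stand for child proofs or their hashes), a directory with 10 entries of which
only the children at indices 0 and 7 are needed to replay the computation
could be proved by the sparse inode:

{[
  { length = 10; proofs = [ (0, p0); (7, p7) ] }
]}
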
{e For [irmin-pack]}: [proofs] have a length of at most [Conf.entries]
entries. For binary trees, this boolean index is a step of the left-right
sequence / decision proof corresponding to the path in that binary tree. *)

  type 'a inode_extender = { length : int; segments : int list; proof : 'a }
  [@@deriving irmin]
  (** The type for inode extenders.
An extender is a compact representation of a sequence of [inode] which
contain only one child. As for inodes, the ['a] parameter can be a
concrete proof or a hash of that proof.
If an inode proof contains singleton children [i_0, ..., i_n] such as:
[{length=l; proofs = \[ (i_0, {proofs = ... { proofs = \[ (i_n, p) \] }})\]}],
then it is compressed into the inode extender
[{length=l; segments = \[i_0;..;i_n\]; proof=p}] sharing the same length
[l] and final proof [p]. *)

  (** The type for compressed and partial Merkle tree proofs.
Tree proofs do not provide any guarantee about the ordering of
computations. For instance, if two effects commute, they won't be
distinguishable by this kind of proof.
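
For instance (an illustrative sketch, written by hand using the constructors
described below; [c], [m] and [h] stand for a contents value, a metadata value
and a hash, and steps are assumed to be strings), a proof revealing one file
and blinding its sibling directory could be:

{[
  Node [ ("README.md", Contents (c, m)); ("src", Blinded_node h) ]
]}
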
[Contents c] proves that the contents [c] exist in the store.
[Blinded_contents h] proves that contents with hash [h] exist in the store.
[Node ls] proves that a "flat" node containing the list of files [ls]
exists in the store. {e For [irmin-pack]}: the length of [ls] is at most
[Conf.stable_hash].
[Blinded_node h] proves that a node with hash [h] exists in the store.
[Inode i] proves that an inode [i] exists in the store.
[Extender e] proves that an inode extender [e] exists in the store. *)
  type tree =
    | Contents of contents * metadata
    | Blinded_contents of hash * metadata
    | Node of (step * tree) list
    | Blinded_node of hash
    | Inode of inode_tree inode
    | Extender of inode_tree inode_extender
  [@@deriving irmin]

  (** The type for inode trees. It is a subset of [tree], limited to nodes.
[Blinded_inode h] proves that an inode with hash [h] exists in the store.
[Inode_values ls] is similar to trees' [Node].
[Inode_tree i] is similar to tree's [Inode].
[Inode_extender e] is similar to trees' [Extender]. *)
  and inode_tree =
    | Blinded_inode of hash
    | Inode_values of (step * tree) list
    | Inode_tree of inode_tree inode
    | Inode_extender of inode_tree inode_extender
  [@@deriving irmin]

  (** Stream proofs represent an explicit traversal of a Merkle tree proof. Every
element (a node, a value, or a shallow pointer) met is first "compressed"
by shallowing its children and then recorded in the proof.
As stream proofs directly encode the recursive construction of the Merkle
root hash, they are slightly simpler to implement: the verifier simply needs
to hash the compressed elements lazily, without any memory or choice.
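
For instance (an illustrative sketch using the [elt] type defined below; steps
and contents are assumed to be strings, [h_src] and [h_main] stand for hashes
and [m] for a metadata value), reading the contents at path [src/main.ml]
could be recorded as the stream:

{[
  List.to_seq
    [ Node [ ("src", `Node h_src) ];
      Node [ ("main.ml", `Contents (h_main, m)) ];
      Contents "let () = ()" ]
]}
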
Moreover, the minimality of stream proofs is trivial to check. Once the
computation has consumed the compressed elements required, it is
sufficient to check that no more compressed elements remain in the proof.
However, as the compressed elements contain all the hashes of their
shallow children, the size of stream proofs is larger (at least double in
size in practice) than that of tree proofs, which only contain the hashes of
intermediate shallow pointers. *)

  (** The type for elements of stream proofs.
[Contents c] is a proof that the next element read in the store is the
contents [c].
[Node n] is a proof that the next element read in the store is the node
[n].
[Inode i] is a proof that the next element read in the store is the inode
[i].
[Inode_extender e] is a proof that the next element read in the store is
the inode extender [e]. *)
  type elt =
    | Contents of contents
    | Node of (step * kinded_hash) list
    | Inode of hash inode
    | Inode_extender of hash inode_extender
  [@@deriving irmin]

  type stream = elt Seq.t [@@deriving irmin]
  (** The type for stream proofs.
The sequence [e_1 ... e_n] proves that the elements [e_1], ..., [e_n] are read in
the store in sequence. *)

  type 'a t [@@deriving irmin]
  (** The type for proofs of kind ['a] (i.e. [tree] or [stream]).
A proof [p] proves that the state advanced from [before p] to [after p].
[state p]'s hash is [before p], and [state p] contains the minimal
information for the computation to reach [after p]. *)

  val v : before:kinded_hash -> after:kinded_hash -> 'a -> 'a t
  (** [v ~before ~after p] proves that the state advanced from [before] to
[after]. [p]'s hash is [before], and [p] contains the minimal information
for the computation to reach [after]. *)

  val before : 'a t -> kinded_hash
  (** [before t] is the state's hash at the beginning of the computation. *)

  val after : 'a t -> kinded_hash
  (** [after t] is the state's hash at the end of the computation. *)

  val state : 'a t -> 'a
  (** [state t] is a subset of the initial state needed to prove that the proven
computation could run without performing any I/O. *)
end
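
(* A sketch (not part of the interface) of how a consumer could use [before],
   [after] and [state] to check a proof [p] received from a peer against a
   replay function [f]. The helpers [tree_of_proof], [hash_of_tree] and [f]
   are assumptions standing for the verifier's reconstruction, hashing and
   replay functions.

   {[
     let t' = tree_of_proof (state p) in
     assert (hash_of_tree t' = before p);
     let t'' = f t' in
     assert (hash_of_tree t'' = after p);
     (* [t''] can now be trusted without access to the producer's storage *)
     t''
   ]}
*)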
(** Environment that tracks side effects during the production/consumption of
    proofs.
{1 The Merkle Proof Construction Algorithm}
This description applies to [Set] proofs and assumes that the large nodes
are represented by the backend as a tree structure (i.e. inodes).
There are 4 distinct phases when working with Irmin's Merkle proofs:
[Produce | Serialise | Deserialise | Consume].
{2 [Produce]}
This phase runs the [f] function provided by the Irmin user. It builds an
[after] tree from a [before] tree that has been set up with an [Env] that
records every backend read into two hash tables.
During the next phase (i.e. [Serialise]) the cleared [before] tree will be
traversed from root to stems only following the paths that are referenced in
[Env].
In practice [Env] doesn't exactly record the reads; it keeps track of all
the [hash -> backend node] and [hash -> backend contents] mappings that are
directly output by the backend stores through [P.Node.find] and
[P.Contents.find]. This is obviously enough to remember the contents, the
nodes and the inode tips, but the inner inodes are not directly referenced
in the hash tables.
The inner inodes are in fact referenced in their inode tip which is itself
referenced in [Env]'s hash tables. Since an inode shares its lazy pointers
with the inodes derived from it, even the inner inodes that are loaded from
the derived tips will be available from the original inode tip.
{2 [Serialise]}
In this phase, the [Env] contains everything necessary for the computation
of a Merkle proof from a cleared [before]. The [Env] now affects
[Node.cached_value] and [Contents.cached_value], allowing for the discovery
of the cached closure.
{2 [Deserialise]}
In this phase the [Env] is filled by recursively destructuring the proof and
filling it before the [Consume] phase.
{2 [Consume]}
In this last phase the [Env] is again made accessible through
[Node.cached_pvalue] and [Contents.cached_pvalue], making it possible for
the user to reference by [hash] everything that was contained in the proof.
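
As a rough sketch (using [Lwt.Syntax]; the helpers [build_before],
[serialise], [deserialise] and the user function [f] are assumptions, not
part of this interface), the [Set] phases map onto [with_set_produce] and
[with_set_consume] roughly as follows:

{[
  (* first peer: Produce, then Serialise *)
  Env.with_set_produce (fun env ~start_serialise ->
      let* before = build_before env in
      let* after = f before in
      start_serialise ();
      serialise env ~before ~after)

  (* second peer: Deserialise, then Consume *)
  Env.with_set_consume (fun env ~stop_deserialise ->
      let before = deserialise env proof in
      stop_deserialise ();
      f before)
]}
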
{1 Nodes and Portable Nodes}
While the [Produce] phase must be connected to the backend to record reads,
the [Consume] phase must be disconnected from the backend.
[Produce] manipulates backend nodes of type [Backend.Node.Val.t] (the ones
enriched with backend keys).
[Consume] is restricted to manipulating nodes of type
[Backend.Node_portable.t].
{1 Hashing of Backend Nodes with Streamed Proofs}
Hashing a backend node or calling [head] on it may trigger IOs in order to
load inner inodes (this is the case in irmin-pack).
In various places, [Env] requires calling [head] or [hash_exn] on nodes.
[Env] must be very careful that these two facts do not lead to chaos during
the recording of the order of IOs.
Two tricks are in place to prevent problems:
- The [Node.of_proof] functions return nodes that don't require IOs to
produce their hash (i.e. they use caching if necessary).
- The [Node.head] function that is called on a node during
[dehydrate_stream_node] is also called just after [rehydrate_stream_node]. *)
module type Env = sig
  type kind = Set | Stream
  type mode = Produce | Serialise | Deserialise | Consume
  type t [@@deriving irmin]
  type hash
  type node
  type pnode
  type contents
  type stream

  val is_empty : t -> bool
  val empty : unit -> t
  val copy : into:t -> t -> unit

  (** {2 Modes} *)

  val set_mode : t -> kind -> mode -> unit

  val with_set_produce :
    (t -> start_serialise:(unit -> unit) -> 'a Lwt.t) -> 'a Lwt.t

  val with_set_consume :
    (t -> stop_deserialise:(unit -> unit) -> 'a Lwt.t) -> 'a Lwt.t

  val with_stream_produce :
    (t -> to_stream:(unit -> stream) -> 'a Lwt.t) -> 'a Lwt.t

  val with_stream_consume :
    stream -> (t -> is_empty:(unit -> bool) -> 'a Lwt.t) -> 'a Lwt.t

  (** {2 Interactions With [Tree]} *)

  val add_contents_from_store : t -> hash -> contents -> unit

  val add_node_from_store : t -> hash -> node -> node
  (** [add_node_from_store] returns a [node] and not [unit] because [Env] may
take the opportunity to wrap the input node in [Node.Val.with_handler]. *)

  val add_contents_from_proof : t -> hash -> contents -> unit
  val add_pnode_from_proof : t -> hash -> pnode -> unit
  val find_contents : t -> hash -> contents option
  val find_node : t -> hash -> node option
  val find_pnode : t -> hash -> pnode option
end

module type Proof = sig
  module type S = S
  module type Env = Env

  exception Bad_proof of { context : string }

  type bad_stream_exn =
    | Stream_too_long of { context : string; reason : string }
    | Stream_too_short of { context : string; reason : string }
    | Proof_mismatch of { context : string; reason : string }

  exception Bad_stream of bad_stream_exn

  val bad_proof_exn : string -> 'a
  val bad_stream_exn : string -> string -> 'a
  val bad_stream_too_long : string -> string -> 'a
  val bad_stream_too_short : string -> string -> 'a

  module Make
      (C : Type.S)
      (H : Hash.S) (P : sig
        type step [@@deriving irmin]
      end)
      (M : Type.S) : sig
    include
      S
        with type contents := C.t
         and type hash := H.t
         and type step := P.step
         and type metadata := M.t
  end

  module Env
      (B : Backend.S)
      (P : S
             with type contents := B.Contents.Val.t
              and type hash := B.Hash.t
              and type step := B.Node.Val.step
              and type metadata := B.Node.Val.metadata) :
    Env
      with type hash := B.Hash.t
       and type contents := B.Contents.Val.t
       and type node := B.Node.Val.t
       and type pnode := B.Node_portable.t
       and type stream := P.stream
end