package repr

Module Repr

Yet-another type combinator library

Repr provides type combinators to define runtime representation for OCaml types and generic operations to manipulate values with a runtime type representation.

The type combinators support all the usual type primitives as well as compact definitions of records and variants. They also allow the definition of run-time representations of recursive types.

Type Combinators

Sourcetype 'a t

The type for runtime representation of values of type 'a.

Sourcetype len = [
  | `Int
  | `Int8
  | `Int16
  | `Int32
  | `Int64
  | `Fixed of int
]

The type of integer used to store buffer, list or array lengths.

`Int uses a (compressed) variable-length encoding to encode integers in a binary format, while `IntX always uses X bits. Overflows are not detected.

Primitives

Sourceval unit : unit t

unit is a representation of the unit type.

Sourceval bool : bool t

bool is a representation of the boolean type.

Sourceval char : char t

char is a representation of the character type.

Sourceval int : int t

int is a representation of integers. Binary serialization uses a varying-width representation.

Sourceval int32 : int32 t

int32 is a representation of the 32-bit integer type.

Sourceval int63 : Optint.Int63.t t

int63 is a representation of the 63-bit integer type supplied by the Optint library.

Sourceval int64 : int64 t

int64 is a representation of the 64-bit integer type.

Sourceval float : float t

float is a representation of the float type.

Sourceval string : string t

string is a representation of the string type.

Sourceval bytes : bytes t

bytes is a representation of the bytes type.

Sourceval string_of : len -> string t

Like string but with a given kind of size.

Sourceval bytes_of : len -> bytes t

Like bytes but with a given kind of size.
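
As a minimal sketch of how a len value is supplied (the names small_list, encode and decode are illustrative and not part of Repr), a list whose length prefix is stored on a single byte can be round-tripped through the binary converters documented further down this page:

```ocaml
(* A list of int32 values whose length is stored with the `Int8 kind. *)
let small_list = Repr.(list ~len:`Int8 int32)

(* Unstage the binary converters once, then reuse them. *)
let encode = Repr.(unstage (to_bin_string small_list))
let decode = Repr.(unstage (of_bin_string small_list))

(* Encode a short list and decode it again. *)
let roundtrip = decode (encode [ 1l; 2l; 3l ])
```

Since overflows are not detected, a small length kind such as `Int8 is presumably only appropriate when the collection is known to stay short.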

Sourceval boxed : 'a t -> 'a t

boxed t is the same as t but with a binary representation which is always boxed (e.g. top-level values won't be unboxed). This forces Unboxed functions to be exactly the same as boxed ones.

Sourceval list : ?len:len -> 'a t -> 'a list t

list t is a representation of lists of values of type t.

Sourceval array : ?len:len -> 'a t -> 'a array t

array t is a representation of arrays of values of type t.

Sourceval option : 'a t -> 'a option t

option t is a representation of values of type t option.

Sourceval pair : 'a t -> 'b t -> ('a * 'b) t

pair x y is a representation of values of type x * y.

Sourceval triple : 'a t -> 'b t -> 'c t -> ('a * 'b * 'c) t

triple x y z is a representation of values of type x * y * z.

Sourceval quad : 'a t -> 'b t -> 'c t -> 'd t -> ('a * 'b * 'c * 'd) t

quad w x y z is a representation of values of type w * x * y * z.

Sourceval result : 'a t -> 'b t -> ('a, 'b) result t

result a b is a representation of values of type (a, b) result.

Sourceval either : 'a t -> 'b t -> ('a, 'b) Either.t t

either a b is a representation of values of type (a, b) Either.t.

Sourceval seq : 'a t -> 'a Seq.t t

seq t is a representation of sequences of values of type t.

Sourceval ref : 'a t -> 'a ref t

ref t is a representation of references to values of type t.

Note: derived deserialisation functions will not preserve reference sharing.

Sourceval lazy_t : 'a t -> 'a Lazy.t t

lazy_t t is a representation of lazy values of type t.

Note: derived deserialisation functions on the resulting type will not be lazy.

Sourceval queue : 'a t -> 'a Queue.t t

queue t is a representation of queues of values of type t.

Sourceval stack : 'a t -> 'a Stack.t t

stack t is a representation of stacks of values of type t.

Sourceval hashtbl : 'k t -> 'v t -> ('k, 'v) Hashtbl.t t

hashtbl k v is a representation of hashtables with keys of type k and values of type v.

Sourceval set : (module Set.S with type elt = 'elt and type t = 'set) -> 'elt t -> 'set t

set (module Set) elt is a representation of sets with elements of type elt. See Of_set for a functorised equivalent of this function.

Sourcemodule Of_set (Set : sig ... end) : sig ... end

Functor for building representatives of sets from the standard library.

Sourcemodule Of_map (Map : sig ... end) : sig ... end

Functor for building representatives of maps from the standard library.

Sourcetype empty = |

An uninhabited type, defined as a variant with no constructors.

Sourceval empty : empty t

empty is a representation of the empty type.

Records

Sourcetype ('a, 'b, 'c) open_record

The type for representing open records of type 'a with a constructor of type 'b. 'c represents the remaining fields to be described using the (|+) operator. An open record initially satisfies 'c = 'b and can be sealed once 'c = 'a.

Sourceval record : string -> 'b -> ('a, 'b, 'b) open_record

record n f is an incomplete representation of the record called n of type 'a with constructor f. To complete the representation, add fields with (|+) and then seal the record with sealr.

The name n is used for non-binary encoding/decoding and for pretty printing.

Sourcetype ('a, 'b) field

The type for fields holding values of type 'b and belonging to a record of type 'a.

Sourceval field : string -> 'a t -> ('b -> 'a) -> ('b, 'a) field

field n t g is the representation of the field called n of type t with getter g. Raises Invalid_argument if n is not valid UTF-8.

The name n is used for non-binary encoding/decoding and for pretty printing. It must not be used by any other field in the record.

For instance:

  type manuscript = { title : string option }

  let manuscript = field "title" (option string) (fun t -> t.title)

Sourceval (|+) : ('a, 'b, 'c -> 'd) open_record -> ('a, 'c) field -> ('a, 'b, 'd) open_record

r |+ f is the open record r augmented with the field f.

Sourceval sealr : ('a, 'b, 'a) open_record -> 'a t

sealr r seals the open record r. Raises Invalid_argument if two or more fields share the same name.

Putting it all together:

  type menu = { restaurant : string; items : (string * int32) list }

  let t =
    record "t" (fun restaurant items -> { restaurant; items })
    |+ field "restaurant" string (fun t -> t.restaurant)
    |+ field "items" (list (pair string int32)) (fun t -> t.items)
    |> sealr
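
Once sealed, the representation can be passed to the generic operations documented later on this page. The following is a sketch only: the names menu_t, lunch and equal_menu, as well as the sample data, are made up for illustration.

```ocaml
type menu = { restaurant : string; items : (string * int32) list }

let menu_t =
  let open Repr in
  record "menu" (fun restaurant items -> { restaurant; items })
  |+ field "restaurant" string (fun t -> t.restaurant)
  |+ field "items" (list (pair string int32)) (fun t -> t.items)
  |> sealr

(* Hypothetical sample value. *)
let lunch = { restaurant = "Chez Repr"; items = [ ("soup", 5l); ("tart", 7l) ] }

(* Derived structural equality, specialised once. *)
let equal_menu = Repr.(unstage (equal menu_t))

(* Encode to JSON and decode again. *)
let json = Repr.to_json_string menu_t lunch
let decoded = Repr.of_json_string menu_t json
```

Here decoded is expected to be Ok of a value that equal_menu considers equal to lunch.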

Variants

Sourcetype ('a, 'b, 'c) open_variant

The type for representing open variants of type 'a with pattern matching of type 'b. 'c represents the remaining constructors to be described using the (|~) operator. An open variant initially satisfies 'c = 'b and can be sealed once 'c = 'a.

Sourceval variant : string -> 'b -> ('a, 'b, 'b) open_variant

variant n p is an incomplete representation of the variant type called n of type 'a using p to deconstruct values. To complete the representation, add cases with (|~) and then seal the variant with sealv.

The name n is used for non-binary encoding/decoding and for pretty printing.

Sourcetype ('a, 'b) case

The type for representing variant cases of type 'a with patterns of type 'b.

Sourcetype 'a case_p

The type for representing patterns for a variant of type 'a.

Sourceval case0 : string -> 'a -> ('a, 'a case_p) case

case0 n v is a representation of a variant constructor v with no arguments and name n. Raises Invalid_argument if n is not valid UTF-8.

The name n is used for non-binary encoding/decoding and for pretty printing. It must not be used by any other case0 in the variant.

For instance:

  type t = Foo

  let foo = case0 "Foo" Foo

Sourceval case1 : string -> 'b t -> ('b -> 'a) -> ('a, 'b -> 'a case_p) case

case1 n t c is a representation of a variant constructor c with an argument of type t and name n. Raises Invalid_argument if n is not valid UTF-8.

The name n is used for non-binary encoding/decoding and for pretty printing. It must not be used by any other case1 in the variant.

For instance:

  type t = Foo of string

  let foo = case1 "Foo" string (fun s -> Foo s)

Sourceval (|~) : ('a, 'b, 'c -> 'd) open_variant -> ('a, 'c) case -> ('a, 'b, 'd) open_variant

v |~ c is the open variant v augmented with the case c.

Sourceval sealv : ('a, 'b, 'a -> 'a case_p) open_variant -> 'a t

sealv v seals the open variant v. Raises Invalid_argument if two or more cases of the same arity share the same name.

Putting it all together:

  type t = Foo | Bar of string

  let t =
    variant "t" (fun foo bar -> function Foo -> foo | Bar s -> bar s)
    |~ case0 "Foo" Foo
    |~ case1 "Bar" string (fun x -> Bar x)
    |> sealv

Sourceval enum : string -> (string * 'a) list -> 'a t

enum n cs is a representation of the variant type called n with singleton cases cs. e.g.

  type t = Foo | Bar | Toto

  let t = enum "t" [ ("Foo", Foo); ("Bar", Bar); ("Toto", Toto) ]

The name n and the case names are used for non-binary encoding/decoding and for pretty printing. Raises Invalid_argument if two or more cases share the same name.

Recursive definitions

Repr allows a limited description of recursive records and variants.

TODO: describe the limitations, e.g. only regular recursion and no use of the generics inside the mu* functions and the usual caveats with recursive values (such as infinite loops on most of the generics which don't check sharing).

Sourceval mu : ('a t -> 'a t) -> 'a t

mu f is the representation r such that r = f r.

For instance:

  type x = { x : x option }

  let x =
    mu (fun x ->
        record "x" (fun x -> { x })
        |+ field "x" (option x) (fun x -> x.x)
        |> sealr)

Sourceval mu2 : ('a t -> 'b t -> 'a t * 'b t) -> 'a t * 'b t

mu2 f is the pair of representations r and s such that r, s = f r s.

For instance:

  type r = { foo : int; bar : string list; z : z option }
  and z = { x : int; r : r list }

  (* Build the representation of [r] knowing [z]'s. *)
  let mkr z =
    record "r" (fun foo bar z -> { foo; bar; z })
    |+ field "foo" int (fun t -> t.foo)
    |+ field "bar" (list string) (fun t -> t.bar)
    |+ field "z" (option z) (fun t -> t.z)
    |> sealr

  (* And the representation of [z] knowing [r]'s. *)
  let mkz r =
    record "z" (fun x r -> { x; r })
    |+ field "x" int (fun t -> t.x)
    |+ field "r" (list r) (fun t -> t.r)
    |> sealr

  (* Tie the loop. *)
  let r, z = mu2 (fun r z -> (mkr z, mkz r))
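
The same technique applies to recursive variants. The sketch below assumes a small binary tree type (tree, tree_t, equal_tree and sample are illustrative names, not part of Repr) and only calls a generic operation outside the function passed to mu, in line with the limitation mentioned above:

```ocaml
type tree = Leaf | Node of tree * int * tree

let tree_t =
  let open Repr in
  mu (fun tree ->
      variant "tree" (fun leaf node -> function
        | Leaf -> leaf
        | Node (l, v, r) -> node (l, v, r))
      |~ case0 "Leaf" Leaf
      |~ case1 "Node" (triple tree int tree) (fun (l, v, r) -> Node (l, v, r))
      |> sealv)

(* The generic operation is derived outside of [mu]. *)
let equal_tree = Repr.(unstage (equal tree_t))

(* A finite (non-cyclic) sample value. *)
let sample = Node (Node (Leaf, 1, Leaf), 2, Leaf)
let same = equal_tree sample sample
let different = equal_tree sample Leaf
```

The sample value is finite on purpose: cyclic values may send generics that do not check sharing into an infinite loop.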

Staging

Sourcetype +'a staged

The type for staged operations.

Sourceval stage : 'a -> 'a staged

stage x stages x, where x would typically be a function that is expensive to construct.

Sourceval unstage : 'a staged -> 'a

unstage x unstages x.

Both stage and unstage are implemented with the identity function.

As the generic operations tend to be used repeatedly with the same left-most parameters, this type trick encourages the user to specialise them only once for performance reasons.

For instance:

  let t = Repr.(pair int bool)
  let compare = Repr.(unstage (compare t))

  let sorted_list =
    List.init 42_000 (fun _ -> (Random.int 100_000, Random.bool ()))
    |> List.sort compare

Generic Operations

Given a value of type 'a t, it is possible to define generic operations on values of type 'a such as pretty-printing, parsing and unparsing.

Sourcetype 'a equal = 'a -> 'a -> bool
Sourceval equal : 'a t -> 'a equal staged

equal t is the equality function between values of type t.

Sourcetype 'a compare = 'a -> 'a -> int
Sourceval compare : 'a t -> 'a compare staged

compare t compares values of type t.

Sourcetype 'a pp = Format.formatter -> 'a -> unit

The type for pretty-printers.

Sourcetype 'a of_string = string -> ('a, [ `Msg of string ]) result

The type for parsers.

Sourceval pp : 'a t -> 'a pp

pp t is the pretty-printer for values of type t.

Sourceval pp_dump : 'a t -> 'a pp

pp_dump t is the dump pretty-printer for values of type t.

This pretty-printer outputs an encoding which is as close as possible to native OCaml syntax, so that the result can easily be copy-pasted into an OCaml REPL to inspect the value further.

Sourceval pp_ty : 'a t pp

The pretty printer for generics of type t.

Sourceval to_string : 'a t -> 'a -> string

to_string t is Fmt.to_to_string (pp t).

Sourceval of_string : 'a t -> 'a of_string

of_string t parses values of type t.
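
A small sketch of the printing and parsing functions working together (pair_t and the other binding names are illustrative). It relies only on the relationship between to_string and pp stated above, and makes no assumption about the exact textual format:

```ocaml
let pair_t = Repr.(pair int bool)

(* to_string is documented as Fmt.to_to_string (pp t), so printing with
   Format.asprintf and %a should produce the same text. *)
let via_to_string = Repr.to_string pair_t (42, true)
let via_pp = Format.asprintf "%a" (Repr.pp pair_t) (42, true)

(* of_string returns a result instead of raising on malformed input. *)
let parsed = Repr.of_string Repr.int "42"

(* Parsing the printed form should give back the original value. *)
let reparsed = Repr.of_string pair_t via_to_string
```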

Sourceval random : 'a t -> (unit -> 'a) staged

random t is a random value generator for values of type t. For bounded types, values are sampled uniformly; for unbounded ones (lists, strings etc.), the length is first chosen according to a geometric distribution.

Derived generators use the global PRNG state provided by Stdlib.Random.get_state.

NOTE: this generator may fail to terminate when sampling a recursive type.

Sourceval random_state : 'a t -> (Random.State.t -> 'a) staged

random_state is a variant of random that takes an explicit PRNG state to use for random generation.
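
For reproducible sampling, an explicit state can be threaded through the generator. This is a sketch under the assumption that generation depends only on the supplied state; the seed 2024 and the binding names are arbitrary:

```ocaml
(* Specialise the generator once for pairs of an int and a bool. *)
let gen = Repr.(unstage (random_state (pair int bool)))

(* Two PRNG states built from the same arbitrary seed. *)
let state_a = Random.State.make [| 2024 |]
let state_b = Random.State.make [| 2024 |]

(* Sampling from identically seeded states should give the same value. *)
let a = gen state_a
let b = gen state_b
```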

Sourcetype 'a ty = 'a t
Sourcemodule Attribute : sig ... end

Attributes provide a mechanism for attaching metadata to type representations.

JSON converters

Sourcemodule Json : sig ... end

Overlay on top of Jsonm to work with rewindable streams.

Sourcetype 'a encode_json = Jsonm.encoder -> 'a -> unit

The type for JSON encoders.

Sourcetype 'a decode_json = Json.decoder -> ('a, [ `Msg of string ]) result

The type for JSON decoders.

Sourceval pp_json : ?minify:bool -> 'a t -> 'a Fmt.t

Similar to pp_dump but pretty-prints the JSON representation instead of the OCaml one. See encode_json for details about the encoding.

For instance:

  type t = { foo : int option; bar : string list }

  let t =
    record "r" (fun foo bar -> { foo; bar })
    |+ field "foo" (option int) (fun t -> t.foo)
    |+ field "bar" (list string) (fun t -> t.bar)
    |> sealr

  let s = Fmt.str "%a\n" (pp t) { foo = None; bar = [ "foo" ] }

  (* s is "{ foo = None; bar = [\"foo\"]; }" *)

  let j = Fmt.str "%a\n" (pp_json t) { foo = None; bar = [ "foo" ] }

  (* j is "{ \"bar\":[\"foo\"] }" *)

NOTE: this will automatically convert JSON fragments to valid JSON objects by adding an enclosing array if necessary.

Sourceval encode_json : 'a t -> Jsonm.encoder -> 'a -> unit

encode_json t e encodes t into the jsonm encoder e. The encoding is a relatively straightforward translation of the OCaml structure into JSON. The main highlights are:

  • The unit value () is translated into the empty object {}.
  • OCaml ints are translated into JSON floats.
  • OCaml strings are translated into JSON strings. You must then ensure that the OCaml strings contain only valid UTF-8 characters.
  • OCaml options are translated differently depending on context: record fields with a value of None are removed from the JSON object; record fields with a value of Some x are automatically unboxed into x; and outside of records, None is translated into null and Some x into {"some": x'} with x' the JSON encoding of x.
  • Variant cases built using case0 are represented as strings.
  • Variant cases built using case1 are represented as a record with one field; the field name is the name of the variant.

NOTE: this can be used to encode JSON fragments. It's the responsibility of the caller to ensure that the encoded JSON fragment fits properly into a well-formed JSON object.
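
To illustrate the treatment of optional record fields, the following sketch encodes the same record type with and without its optional field and decodes both results. It uses the string-based wrappers to_json_string and of_json_string rather than a Jsonm encoder, and the binding names are illustrative:

```ocaml
type t = { foo : int option; bar : string list }

let t =
  let open Repr in
  record "r" (fun foo bar -> { foo; bar })
  |+ field "foo" (option int) (fun t -> t.foo)
  |+ field "bar" (list string) (fun t -> t.bar)
  |> sealr

(* According to the rules above, "foo" is dropped when it is None and
   unboxed when it is Some. *)
let without_foo = Repr.to_json_string t { foo = None; bar = [ "a" ] }
let with_foo = Repr.to_json_string t { foo = Some 1; bar = [ "a" ] }

(* Both encodings should decode back to the original values. *)
let back_none = Repr.of_json_string t without_foo
let back_some = Repr.of_json_string t with_foo
```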

Sourceval decode_json : 'a t -> Jsonm.decoder -> ('a, [ `Msg of string ]) result

decode_json t e decodes values of type t from the jsonm decoder e.

Sourceval decode_json_lexemes : 'a t -> Jsonm.lexeme list -> ('a, [ `Msg of string ]) result

decode_json_lexemes is similar to decode_json but uses an already decoded list of JSON lexemes instead of a decoder.

Sourceval to_json_string : ?minify:bool -> 'a t -> 'a -> string

to_json_string is encode_json with a string encoder.

Sourceval of_json_string : 'a t -> string -> ('a, [ `Msg of string ]) result

of_json_string is decode_json with a string decoder.

Binary Converters

Sourcetype 'a encode_bin = 'a -> (string -> unit) -> unit

The type for binary encoders.

Sourcetype 'a decode_bin = string -> int ref -> 'a

The type for binary decoders.

Sourcetype -'a size_of

The type for size function related to binary encoder/decoders.

Sourcetype 'a short_hash := ?seed:int -> 'a -> int
Sourceval short_hash : 'a t -> 'a short_hash staged

short_hash t x is a short hash of x of type t.

Sourceval pre_hash : 'a t -> 'a encode_bin staged

pre_hash t x is the string representation of x, of type t, which will be used to compute the digest of the value. By default it is to_bin_string t x, but it can be overridden by the like and map operators.

Sourceval encode_bin : 'a t -> 'a encode_bin staged

encode_bin t is the binary encoder for values of type t.

Sourceval decode_bin : 'a t -> 'a decode_bin staged

decode_bin t is the binary decoder for values of type t.

Sourceval to_bin_string : 'a t -> ('a -> string) staged

to_bin_string t x uses encode_bin to convert x, of type t, to a string.

NOTE: When t is string or bytes, the original buffer x is not prefixed by its size as encode_bin would do. If t is string, the result is x (without copy).

Sourceval of_bin_string : 'a t -> (string -> ('a, [ `Msg of string ]) result) staged

of_bin_string t s is v such that s = to_bin_string t v.

NOTE: When t is string, the result is s (without copy).
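
A short sketch of the string-based binary converters (entry, to_bin, of_bin and raw are illustrative names). Both converters are staged, so they are unstaged once and then applied:

```ocaml
let entry = Repr.(pair int string)

(* Specialise the converters once for the [entry] representation. *)
let to_bin = Repr.(unstage (to_bin_string entry))
let of_bin = Repr.(unstage (of_bin_string entry))

(* Encode a value and decode it again. *)
let decoded = of_bin (to_bin (7, "seven"))

(* Per the notes above, a top-level string is returned as-is, without a
   size prefix. *)
let raw = Repr.(unstage (to_bin_string string)) "hello"
```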

Sourceval size_of : 'a t -> ('a -> int option) staged

size_of t x is either the size of encode_bin t x or the binary encoding of x, if the backend is not able to pre-compute serialisation lengths.

Sourcemodule Size : sig ... end
Sourcemodule Unboxed : sig ... end

Unboxed operations assume that the value being serialized completely fills the underlying buffer. When that is the case, it is not necessary to prefix the value's binary representation with its size, as it is exactly the buffer's size.

Abstract types

Sourceval abstract : pp:'a pp -> of_string:'a of_string -> json:('a encode_json * 'a decode_json) -> bin:('a encode_bin * 'a decode_bin * 'a size_of) -> ?unboxed_bin:('a encode_bin * 'a decode_bin * 'a size_of) -> equal:'a equal -> compare:'a compare -> short_hash:'a short_hash -> pre_hash:'a encode_bin -> unit -> 'a t

The representation of an abstract type, with an internal structure that is opaque to Repr, that supports the generic operations above.

Overriding specific operations

For a given type representation, each generic operation can be implemented in one of the following ways:

Sourcetype 'a impl =
  | Structural  (* The automatic implementation derived from the type structure. *)
  | Custom of 'a  (* A hand-written implementation. *)
  | Undefined  (* An unimplemented operation that raises Unsupported_operation when invoked. *)

Sourceexception Unsupported_operation of string
Sourceval partially_abstract : pp:'a pp impl -> of_string:'a of_string impl -> json:('a encode_json * 'a decode_json) impl -> bin:('a encode_bin * 'a decode_bin * 'a size_of) impl -> unboxed_bin:('a encode_bin * 'a decode_bin * 'a size_of) impl -> equal:'a equal impl -> compare:'a compare impl -> short_hash:'a short_hash impl -> pre_hash:'a encode_bin impl -> 'a t -> 'a t

partially_abstract t is a partially-abstract type with internal representation t. The named arguments specify the implementation of each of the generic operations on this type.

Sourceval like : ?pp:'a pp -> ?of_string:'a of_string -> ?json:('a encode_json * 'a decode_json) -> ?bin:('a encode_bin * 'a decode_bin * 'a size_of) -> ?unboxed_bin:('a encode_bin * 'a decode_bin * 'a size_of) -> ?equal:'a equal -> ?compare:'a compare -> ?short_hash:'a short_hash -> ?pre_hash:'a encode_bin -> 'a t -> 'a t

A wrapper around partially_abstract with each operation defaulting to Structural and admitting a Custom override.

Note: if ~compare is passed and ~equal is not then the default equality function (fun x y -> compare x y = 0) will be used.
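
As a sketch of overriding a single operation, the representation below compares strings while ignoring ASCII case. The helper ci_compare and the other binding names are hypothetical; only ~compare is supplied, so equality is expected to follow from it as described in the note above:

```ocaml
(* Compare strings while ignoring ASCII case. *)
let ci_compare a b =
  String.compare (String.lowercase_ascii a) (String.lowercase_ascii b)

(* Only ~compare is overridden; per the note above, equal is then
   derived from it. The remaining operations stay structural. *)
let ci_string = Repr.(like ~compare:ci_compare string)

let compare_ci = Repr.(unstage (compare ci_string))
let equal_ci = Repr.(unstage (equal ci_string))
```

With these definitions, compare_ci "Repr" "REPR" evaluates to 0 and equal_ci "Repr" "repr" to true.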

Sourceval map : ?pp:'a pp -> ?of_string:'a of_string -> ?json:('a encode_json * 'a decode_json) -> ?bin:('a encode_bin * 'a decode_bin * 'a size_of) -> ?unboxed_bin:('a encode_bin * 'a decode_bin * 'a size_of) -> ?equal:'a equal -> ?compare:'a compare -> ?short_hash:'a short_hash -> ?pre_hash:'a encode_bin -> 'b t -> ('b -> 'a) -> ('a -> 'b) -> 'a t

This combinator allows defining a representative of one type in terms of another by supplying coercions between them. For a representative of Stdlib.Map, see Of_map.
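
For example, a single-constructor wrapper type can reuse the representation of its payload. This is a minimal sketch; the type user_id and the binding names are made up for illustration:

```ocaml
type user_id = User_id of int

(* Reuse the representation of int through a pair of coercions:
   first int -> user_id, then user_id -> int. *)
let user_id_t = Repr.(map int (fun n -> User_id n) (fun (User_id n) -> n))

(* The generic operations are derived through the coercions. *)
let equal_id = Repr.(unstage (equal user_id_t))
let to_bin = Repr.(unstage (to_bin_string user_id_t))
let of_bin = Repr.(unstage (of_bin_string user_id_t))

let same = equal_id (User_id 7) (User_id 7)
let decoded = of_bin (to_bin (User_id 7))
```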

Sourcemodule type S = sig ... end
Sourcemodule type DSL = sig ... end

Miscellaneous modules

Sourcemodule Binary : sig ... end

This module provides functions for interacting with Repr's binary serialisation format directly (without first constructing a representation of the type being encoded). These can be useful for performance-critical applications, where the runtime overhead of the dynamic specialisation is too large, or when the actual codec being used is too complex to be expressed via a type representation.

Sourcemodule Staging : sig ... end

This module is intended to be globally opened.

Sourcemodule Witness : sig ... end