package multipart_form


Multipart-form.

The MIME type multipart/form-data is used to express values submitted through a <form>. This module helps the user to extract these values from an input.

module Field_name : sig ... end
module Content_type : sig ... end
module Content_encoding : sig ... end
module Content_disposition : sig ... end
module Field : sig ... end
module Header : sig ... end

Decoder.

type 'id emitters = Header.t -> (string option -> unit) * 'id

Type of emitters.

An emitters function takes the header of a part and returns a pair: a pusher, which is called with the successive chunks of the part's contents in order to save them, and a unique ID, which can later be used to retrieve those contents.
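The shape of an emitters function can be sketched with the standard library alone. The snippet below is an illustration, not the library's code: `header` is a stand-in for Header.t (an assumption made so that the sketch is self-contained), and the names `tbl`, `counter` and `contents_of` are hypothetical. It also assumes that the pusher receives None once a part has no more contents.

```ocaml
(* Stand-in for Header.t: an assumption made to keep the sketch self-contained. *)
type header = string

(* Same shape as the emitters type above, using the stand-in header. *)
type 'id emitters = header -> (string option -> unit) * 'id

(* Table from part IDs to the buffers that accumulate their contents. *)
let tbl : (int, Buffer.t) Hashtbl.t = Hashtbl.create 16

(* Source of fresh integer IDs: 0, 1, 2, ... *)
let counter = ref (-1)

let emitters : int emitters =
 fun _header ->
  incr counter;
  let id = !counter in
  let buf = Buffer.create 0x100 in
  Hashtbl.add tbl id buf;
  let push = function
    | None -> () (* assumed to mean "no more contents": nothing to store *)
    | Some chunk -> Buffer.add_string buf chunk
  in
  (push, id)

(* Retrieve what has been pushed so far for a given ID. *)
let contents_of id = Buffer.contents (Hashtbl.find tbl id)
```

Here the ID is a plain integer, but since the type is polymorphic in 'id, any value that lets the caller find the saved contents again would do, for example a file name.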

type 'a elt = {
  header : Header.t;
  body : 'a;
}

Type of a simple element.

An element is a part in the sense of the multipart/form-data format. A part can contain multiple parts. It always has a Header.t.

type 'a t =
  | Leaf of 'a elt
  | Multipart of 'a t option list elt

Type of multipart/form-data contents.

  • a Leaf is a content with a simple header.
  • a Multipart is a list of possibly empty (option) sub-elements; indeed, we can have a multipart inside a multipart.

val map : ('a -> 'b) -> 'a t -> 'b t
val flatten : 'a t -> 'a elt list
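The signatures of map and flatten suggest a natural reading: map transforms the body of every leaf while keeping the tree shape, and flatten collects the leaf elements into a list. The sketch below implements that reading on a local mirror of the types above, with `header` as a stand-in for Header.t (an assumption). It is not the library's implementation; in particular, the order in which the library returns leaves is not stated here.

```ocaml
(* Local mirror of the types above; header is a stand-in for Header.t. *)
type header = string
type 'a elt = { header : header; body : 'a }
type 'a t = Leaf of 'a elt | Multipart of 'a t option list elt

(* Apply f to the body of every leaf, keeping headers and tree shape. *)
let rec map f = function
  | Leaf { header; body } -> Leaf { header; body = f body }
  | Multipart { header; body } ->
      Multipart { header; body = List.map (Option.map (map f)) body }

(* Collect every leaf element in document order, skipping empty slots. *)
let rec flatten = function
  | Leaf elt -> [ elt ]
  | Multipart { body; _ } ->
      List.concat_map (function None -> [] | Some t -> flatten t) body
```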

Streaming API.

val parse : emitters:'id emitters -> Content_type.t -> [ `String of string | `Eof ] -> [ `Continue | `Done of 'id t | `Fail of string ]

parse ~emitters content_type returns a function that can be called repeatedly to feed it successive chunks of a multipart/form-data input stream. It then allows streaming the output (the contents of the parts) through the emitters callback.

For each part, the parser calls emitters to be able to save contents and get a reference of it. Each part then corresponds to a Leaf in the multipart document returned in the `Done case, using the corresponding reference.

As a simple example, one can use parse to generate a unique ID for each part and associate it with a Buffer.t. The table tbl maintains the mapping between the part IDs that can be found in the return value and the contents of the parts.

let gen = let v = ref (-1) in fun () -> incr v ; !v in
let tbl = Hashtbl.create 0x10 in
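(* emitters receives the header of each part; it is ignored here. *)
let emitters _header =
  let idx = gen () in
  let buf = Buffer.create 0x100 in
  (* Record the buffer so that contents can be looked up by ID later. *)
  Hashtbl.add tbl idx buf ;
  (function None -> ()
          | Some str -> Buffer.add_string buf str), idx in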
let step = parse ~emitters content_type in
let get_next_input_chunk () = ... in
let rec loop () =
  match step (get_next_input_chunk ()) with
  | `Continue -> loop ()
  | `Done tree -> Ok tree
  | `Fail msg -> Error msg
in
loop ()

As illustrated by the example above, the use of parse is somewhat intricate. This is because parse handles the general case of parsing streamed input and producing streamed output, and does not depend on any concurrency library. Simpler functions of_{stream,string}_to_{list,tree} can be found below for when streaming is not needed. When using Lwt, the Multipart_form_lwt module provides a more convenient API, both in the streaming and non-streaming case.
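The input side of the streaming loop only needs a function that returns `String chunks and then `Eof. A minimal sketch of such a chunk source, cutting an in-memory string into fixed-size pieces, is given below. The name `chunks_of_string` and the `chunk_size` parameter are hypothetical; a real source would more likely read from a socket or a file.

```ocaml
(* Build a chunk source of the shape that parse takes as input:
   successive `String chunks, then `Eof on every later call. *)
let chunks_of_string ?(chunk_size = 4096) (input : string) :
    unit -> [ `String of string | `Eof ] =
  if chunk_size <= 0 then invalid_arg "chunks_of_string: chunk_size";
  let pos = ref 0 in
  fun () ->
    let remaining = String.length input - !pos in
    if remaining <= 0 then `Eof
    else begin
      let len = min chunk_size remaining in
      let chunk = String.sub input !pos len in
      pos := !pos + len;
      `String chunk
    end
```

A value such as `chunks_of_string body` could play the role of get_next_input_chunk in a loop like the one above.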

val parser : emitters:'id emitters -> Content_type.t -> 'id t Angstrom.t

parser ~emitters content_type gives access to the underlying Angstrom parser used internally by the parse function. This is useful when one needs control over the parsing buffer used by Angstrom.

Non-streaming API.

The functions below offer a simpler API for the case where streaming the output is not needed. This means that the entire contents of the multipart data will be stored in memory: they should not be used when dealing with possibly large data.

type 'a stream = unit -> 'a option
val of_stream_to_list : string stream -> Content_type.t -> (int t * (int * string) list, [> `Msg of string ]) Stdlib.result

of_stream_to_list stream content_type returns, if it succeeds, a pair made of a multipart document t and an association list of contents. The document t references parts using unique IDs (integers), and the association list maps each of these IDs to the contents of the corresponding part, stored as a string.
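The string stream argument is an ordinary thunk of type unit -> string option that returns None once the input is exhausted. As a sketch, such a stream can be built from a list of chunks with the standard library alone; `stream_of_list` and `string_of_stream` are hypothetical helper names, not functions of this module.

```ocaml
(* Same definition as the stream type above. *)
type 'a stream = unit -> 'a option

(* Turn a list into a stream that yields its items once, in order. *)
let stream_of_list (items : 'a list) : 'a stream =
  let rest = ref items in
  fun () ->
    match !rest with
    | [] -> None
    | x :: tl ->
        rest := tl;
        Some x

(* Drain a string stream into a single string, e.g. for inspection. *)
let string_of_stream (s : string stream) : string =
  let buf = Buffer.create 64 in
  let rec go () =
    match s () with
    | None -> Buffer.contents buf
    | Some chunk ->
        Buffer.add_string buf chunk;
        go ()
  in
  go ()
```

Note that such a stream is stateful: once an item has been consumed, it is not produced again.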

val of_string_to_list : string -> Content_type.t -> (int t * (int * string) list, [> `Msg of string ]) Stdlib.result

Similar to of_stream_to_list, but takes the input as a string.

val of_stream_to_tree : string stream -> Content_type.t -> (string t, [> `Msg of string ]) Stdlib.result

of_stream_to_tree stream content_type returns, if it succeeds, a value t representing the multipart document, where the contents of the parts are stored as strings. It is equivalent to of_stream_to_list where references have been replaced with their associated contents.
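The equivalence stated above can be illustrated by replacing each ID of an int t with its contents from the association list. The sketch below works on a local mirror of the types, with `header` as a stand-in for Header.t (an assumption), and redefines a map function with the signature given earlier so that it is self-contained; `resolve` is a hypothetical name.

```ocaml
(* Local mirror of the types above; header is a stand-in for Header.t. *)
type header = string
type 'a elt = { header : header; body : 'a }
type 'a t = Leaf of 'a elt | Multipart of 'a t option list elt

(* Apply f to the body of every leaf, keeping headers and tree shape. *)
let rec map f = function
  | Leaf { header; body } -> Leaf { header; body = f body }
  | Multipart { header; body } ->
      Multipart { header; body = List.map (Option.map (map f)) body }

(* Replace each part ID with the contents associated with it.
   Raises Not_found if an ID is missing from the association list. *)
let resolve (tree : int t) (contents : (int * string) list) : string t =
  map (fun id -> List.assoc id contents) tree
```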

val of_string_to_tree : string -> Content_type.t -> (string t, [> `Msg of string ]) Stdlib.result

Similar to of_stream_to_tree, but takes the input as a string.

Encoder.

type part
val part : ?header:Header.t -> ?disposition:Content_disposition.t -> ?encoding:Content_encoding.t -> (string * int * int) stream -> part

part ?header ?disposition ?encoding stream makes a new part from a body stream and the given fields. stream will be mapped according to encoding.
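The body of a part is a (string * int * int) stream. The sketch below assumes that each item is a (buffer, offset, length) triple designating a slice of the string, which is not stated on this page, and builds a one-item stream covering a whole string. `body_of_string` is a hypothetical helper name.

```ocaml
(* Same definition as the stream type of this module. *)
type 'a stream = unit -> 'a option

(* Assumption: an item (str, off, len) designates the slice of str
   that starts at off and spans len bytes. *)
let body_of_string (s : string) : (string * int * int) stream =
  let consumed = ref false in
  fun () ->
    if !consumed then None
    else begin
      consumed := true;
      Some (s, 0, String.length s)
    end
```

Under that assumption, the resulting stream has the type that part takes as its last argument.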

type multipart
val multipart : rng:(?g:'g -> int -> string) -> ?g:'g -> ?header:Header.t -> ?boundary:string -> part list -> multipart

multipart ~rng ?g ?header ?boundary parts makes a new multipart from a list of parts, fields and an optional boundary. If boundary is not specified, rng is used to make a random boundary; no check is made that this boundary does not appear inside parts.
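Since the boundary is not checked against the contents of the parts, a caller holding the payloads in memory may want to check it before encoding. The sketch below is one conservative way to do so, using a naive substring scan from the standard library; `contains` and `boundary_is_safe` are hypothetical helpers, not part of this module.

```ocaml
(* True when needle occurs somewhere in haystack (naive scan). *)
let contains ~needle haystack =
  let n = String.length needle and h = String.length haystack in
  let rec at i =
    if i + n > h then false
    else if String.sub haystack i n = needle then true
    else at (i + 1)
  in
  at 0

(* A boundary is considered safe here when it is non-empty and
   does not occur in any payload. *)
let boundary_is_safe ~boundary payloads =
  boundary <> "" && not (List.exists (contains ~needle:boundary) payloads)
```

The check is stricter than needed, since it rejects any occurrence of the boundary string rather than only a full delimiter line, but it errs on the safe side.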

val to_stream : multipart -> Header.t * (string * int * int) stream

to_stream ms generates an HTTP header and a stream.
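The stream returned by to_stream can be consumed item by item, for instance to write it to a socket. As a sketch, the function below drains such a stream into a single string, assuming that each item is a (buffer, offset, length) triple; that reading is an assumption, and `drain` is a hypothetical name. Collecting everything in memory defeats the purpose of streaming, so this is mainly useful for tests and small payloads.

```ocaml
(* Same definition as the stream type of this module. *)
type 'a stream = unit -> 'a option

(* Assumption: each item (str, off, len) designates the slice of str
   that starts at off and spans len bytes. *)
let drain (s : (string * int * int) stream) : string =
  let out = Buffer.create 0x100 in
  let rec go () =
    match s () with
    | None -> Buffer.contents out
    | Some (str, off, len) ->
        Buffer.add_substring out str off len;
        go ()
  in
  go ()
```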
