package base

Module String.Utf32be

UTF-32 big-endian encoding. See Utf interface.

type t = private string
val t_sexp_grammar : t Sexplib0.Sexp_grammar.t

t_of_sexp and of_string will raise if the input is invalid in this encoding. See sanitize below to construct a valid t from arbitrary input.

include Identifiable.S with type t := t
val hash_fold_t : Hash.state -> t -> Hash.state
val hash : t -> Hash.hash_value
include Sexplib0.Sexpable.S with type t := t
val t_of_sexp : Sexplib0.Sexp.t -> t
val sexp_of_t : t -> Sexplib0.Sexp.t
include Stringable.S with type t := t
val of_string : string -> t
val to_string : t -> string
include Comparable.S with type t := t
include Comparisons.S with type t := t
include Comparisons.Infix with type t := t
val (>=) : t -> t -> bool
val (<=) : t -> t -> bool
val (=) : t -> t -> bool
val (>) : t -> t -> bool
val (<) : t -> t -> bool
val (<>) : t -> t -> bool
val equal : t -> t -> bool
val compare : t -> t -> int

compare t1 t2 returns 0 if t1 is equal to t2, a negative integer if t1 is less than t2, and a positive integer if t1 is greater than t2.

val min : t -> t -> t
val max : t -> t -> t
val ascending : t -> t -> int

ascending is identical to compare. descending x y = ascending y x. These are intended to be mnemonic when used like List.sort ~compare:ascending and List.sort ~compare:descending, since they cause the list to be sorted in ascending or descending order, respectively.
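The relationship between ascending and descending can be sketched with the standard library alone. This is an illustration on int, not on t, and it uses the stdlib's unlabeled List.sort rather than Base's labeled one.

```ocaml
(* Stdlib-only illustration; the comparison type is int, chosen for brevity. *)
let ascending : int -> int -> int = compare

(* Swapping the arguments reverses the order. *)
let descending x y = ascending y x

let sorted_up = List.sort ascending [ 3; 1; 2 ]
let sorted_down = List.sort descending [ 3; 1; 2 ]
```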

val descending : t -> t -> int
val between : t -> low:t -> high:t -> bool

between t ~low ~high means low <= t <= high

val clamp_exn : t -> min:t -> max:t -> t

clamp_exn t ~min ~max returns t', the closest value to t such that between t' ~low:min ~high:max is true.

Raises if not (min <= max).
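A minimal model of between and clamp_exn, written against polymorphic comparison in the standard library. The choice of Invalid_argument is an assumption of this sketch; the text above only says that the function raises.

```ocaml
(* Stdlib-only model of the documented semantics, not Base's implementation. *)
let between t ~low ~high = low <= t && t <= high

let clamp_exn t ~min ~max =
  if min > max then invalid_arg "clamp_exn: min > max"
  else if t < min then min
  else if t > max then max
  else t

let clamped_high = clamp_exn 5 ~min:0 ~max:3
let clamped_low = clamp_exn (-1) ~min:0 ~max:3
let unchanged = clamp_exn 2 ~min:0 ~max:3
```

In each case the result satisfies between ~low:min ~high:max.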

val clamp : t -> min:t -> max:t -> t Or_error.t
include Comparator.S with type t := t
type comparator_witness
include Pretty_printer.S with type t := t
val pp : Formatter.t -> t -> unit
val hashable : t Hashable.t

Interpret t as a container of Unicode scalar values, rather than of ASCII characters. Indexes, length, etc. are with respect to Uchar.t.

include Indexed_container.S0_with_creators with type t := t and type elt = Uchar.t
include Container.S0_with_creators with type t := t with type elt = Uchar.t
type elt = Uchar.t
val of_list : elt list -> t
val of_array : elt array -> t
val append : t -> t -> t

E.g., append (of_list [a; b]) (of_list [c; d; e]) is of_list [a; b; c; d; e]

val concat : t list -> t

Concatenates a nested container. The elements of the inner containers are concatenated together in order to give the result.

val map : t -> f:(elt -> elt) -> t

map (of_list [a1; ...; an]) ~f applies f to a1, a2, ..., an, in order, and builds a result equivalent to of_list [f a1; ...; f an].

val filter : t -> f:(elt -> bool) -> t

filter t ~f returns all the elements of t that satisfy the predicate f.

val filter_map : t -> f:(elt -> elt option) -> t

filter_map t ~f applies f to every x in t. The result contains every y for which f x returns Some y.

val concat_map : t -> f:(elt -> t) -> t

concat_map t ~f is equivalent to concat (map t ~f).

val partition_tf : t -> f:(elt -> bool) -> t * t

partition_tf t ~f returns a pair t1, t2, where t1 is all elements of t that satisfy f, and t2 is all elements of t that do not satisfy f. The "tf" suffix is mnemonic to remind readers that the result is (trues, falses).

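val partition_map : t -> f:(elt -> (elt, elt) Base__.Either0.t) -> t * t

partition_map t ~f partitions t according to f: elements for which f returns First go into the first component of the result, and elements for which f returns Second go into the second.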

include Container.S0 with type t := t with type elt := elt
val mem : t -> elt -> bool

Checks whether the provided element is there, using equality on elts.

val is_empty : t -> bool
val iter : t -> f:(elt -> unit) -> unit

iter must allow exceptions raised in f to escape, terminating the iteration cleanly. The same holds for all functions below taking an f.

val fold : t -> init:'acc -> f:('acc -> elt -> 'acc) -> 'acc

fold t ~init ~f returns f (... f (f (f init e1) e2) e3 ...) en, where e1..en are the elements of t.

val fold_result : t -> init:'acc -> f:('acc -> elt -> ('acc, 'e) Result.t) -> ('acc, 'e) Result.t

fold_result t ~init ~f is a short-circuiting version of fold that runs in the Result monad. If f returns an Error _, that value is returned without any additional invocations of f.
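The short-circuiting behavior can be modeled over a plain list of Stdlib Uchar.t values. The names fold_result (redefined here for lists), digit_value, parse_decimal, and uchars are hypothetical helpers for this sketch; the calls counter exists only to make the early exit observable.

```ocaml
(* Stdlib-only model over lists, not Base's implementation. *)
let fold_result xs ~init ~f =
  let rec go acc = function
    | [] -> Ok acc
    | x :: rest ->
      (match f acc x with
       | Ok acc' -> go acc' rest
       | Error _ as e -> e (* stop: f is not called on [rest] *))
  in
  go init xs

(* Value of an ASCII decimal digit, if [u] is one. *)
let digit_value u =
  let c = Uchar.to_int u in
  if c >= 0x30 && c <= 0x39 then Some (c - 0x30) else None

let calls = ref 0

(* Parses a decimal number, failing with the first non-digit scalar value. *)
let parse_decimal us =
  fold_result us ~init:0 ~f:(fun acc u ->
    incr calls;
    match digit_value u with
    | Some d -> Ok ((acc * 10) + d)
    | None -> Error u)

let uchars s = List.map Uchar.of_char (List.of_seq (String.to_seq s))
let good = parse_decimal (uchars "42")
let () = calls := 0
let bad = parse_decimal (uchars "4x2")
let calls_on_bad = !calls
```

Here good is Ok 42, bad is Error carrying the scalar value for 'x', and f runs only twice on the failing input.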

val fold_until : t -> init:'acc -> f:('acc -> elt -> ('acc, 'final) Container.Continue_or_stop.t) -> finish:('acc -> 'final) -> 'final

fold_until t ~init ~f ~finish is a short-circuiting version of fold. If f returns Stop _ the computation ceases and results in that value. If f returns Continue _, the fold will proceed. If f never returns Stop _, the final result is computed by finish.

Example:

  type maybe_negative =
    | Found_negative of int
    | All_nonnegative of { sum : int }

  (** [first_neg_or_sum list] returns the first negative number in [list], if any,
      otherwise returns the sum of the list. *)
  let first_neg_or_sum =
    List.fold_until ~init:0
      ~f:(fun sum x ->
        if x < 0
        then Stop (Found_negative x)
        else Continue (sum + x))
      ~finish:(fun sum -> All_nonnegative { sum })
  ;;

  let x = first_neg_or_sum [1; 2; 3; 4; 5]
  val x : maybe_negative = All_nonnegative {sum = 15}

  let y = first_neg_or_sum [1; 2; -3; 4; 5]
  val y : maybe_negative = Found_negative (-3)

val exists : t -> f:(elt -> bool) -> bool

Returns true if and only if there exists an element for which the provided function evaluates to true. This is a short-circuiting operation.

val for_all : t -> f:(elt -> bool) -> bool

Returns true if and only if the provided function evaluates to true for all elements. This is a short-circuiting operation.

val count : t -> f:(elt -> bool) -> int

Returns the number of elements for which the provided function evaluates to true.

val sum : (module Container.Summable with type t = 'sum) -> t -> f:(elt -> 'sum) -> 'sum

Returns the sum of f i for all i in the container.

val find : t -> f:(elt -> bool) -> elt option

Returns as an option the first element for which f evaluates to true.

val find_map : t -> f:(elt -> 'a option) -> 'a option

Returns the first evaluation of f that returns Some, and returns None if there is no such element.

val to_list : t -> elt list
val to_array : t -> elt array
val min_elt : t -> compare:(elt -> elt -> int) -> elt option

Returns a min (resp. max) element from the collection using the provided compare function. In case of a tie, the first element encountered while traversing the collection is returned. The implementation uses fold so it has the same complexity as fold. Returns None iff the collection is empty.
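The tie-breaking rule follows naturally from a fold that replaces the current best only on a strict improvement. A sketch over lists, using pairs so that ties are visible; by_key and items are illustrative names, not part of this interface.

```ocaml
(* Stdlib-only model over lists, not Base's implementation. *)
let min_elt xs ~compare =
  List.fold_left
    (fun best x ->
      match best with
      | None -> Some x
      | Some b -> if compare x b < 0 then Some x else best)
    None xs

let max_elt xs ~compare =
  List.fold_left
    (fun best x ->
      match best with
      | None -> Some x
      | Some b -> if compare x b > 0 then Some x else best)
    None xs

(* Compare on the first component only, so the second component tags ties. *)
let by_key (a, _) (b, _) = Int.compare a b
let items = [ (1, "a"); (0, "b"); (0, "c"); (1, "d") ]
let smallest = min_elt items ~compare:by_key
let largest = max_elt items ~compare:by_key
let nothing = min_elt ([] : (int * string) list) ~compare:by_key
```

Both (0, "b") and (0, "c") are minimal, and the earlier one wins; likewise (1, "a") wins over (1, "d").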

val max_elt : t -> compare:(elt -> elt -> int) -> elt option

These are all like their equivalents in Container except that an index starting at 0 is added as the first argument to f.

val foldi : t -> init:_ -> f:(int -> _ -> elt -> _) -> _
val iteri : t -> f:(int -> elt -> unit) -> unit
val existsi : t -> f:(int -> elt -> bool) -> bool
val for_alli : t -> f:(int -> elt -> bool) -> bool
val counti : t -> f:(int -> elt -> bool) -> int
val findi : t -> f:(int -> elt -> bool) -> (int * elt) option
val find_mapi : t -> f:(int -> elt -> 'a option) -> 'a option
val init : int -> f:(int -> elt) -> t

init n ~f is equivalent to of_list [f 0; f 1; ...; f (n-1)]. It raises an exception if n < 0.
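To make the byte layout concrete, here is a stdlib-only sketch of what an init for this encoding might produce: each scalar value becomes four bytes, most significant first. init_utf32be is a hypothetical name, it returns a plain string rather than a t, and raising Invalid_argument is an assumption.

```ocaml
(* Stdlib-only sketch of UTF-32BE construction, not Base's implementation. *)
let init_utf32be n ~f =
  if n < 0 then invalid_arg "init_utf32be: negative length";
  let b = Bytes.create (4 * n) in
  for i = 0 to n - 1 do
    (* Write the i-th scalar value as a big-endian 32-bit word. *)
    Bytes.set_int32_be b (4 * i) (Int32.of_int (Uchar.to_int (f i)))
  done;
  Bytes.to_string b

let abc = init_utf32be 3 ~f:(fun i -> Uchar.of_int (Char.code 'a' + i))
```

The three scalar values occupy twelve bytes: three zero bytes followed by the ASCII byte for each letter.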

val mapi : t -> f:(int -> elt -> elt) -> t

mapi is like map. Additionally, it passes in the index of each element as the first argument to the mapped function.

val filteri : t -> f:(int -> elt -> bool) -> t
val filter_mapi : t -> f:(int -> elt -> elt option) -> t

filter_mapi is like filter_map. Additionally, it passes in the index of each element as the first argument to the mapped function.

val concat_mapi : t -> f:(int -> elt -> t) -> t

concat_mapi t ~f is like concat_map. Additionally, it passes the index as an argument.

val to_sequence : t -> Uchar.t Sequence.t

Produces a sequence of Unicode scalar values.

val is_valid : string -> bool

Reports whether a string is valid in this encoding.
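For UTF-32BE, validity plausibly amounts to two conditions: the byte length is a multiple of four, and every big-endian 32-bit word is a Unicode scalar value (not a surrogate, not above U+10FFFF). The sketch below checks exactly that with the standard library; is_valid_utf32be is a hypothetical name, and Base's actual check may differ in details. It assumes a 64-bit OCaml.

```ocaml
(* Stdlib-only model of a UTF-32BE validity check, not Base's implementation. *)
let is_valid_utf32be (s : string) : bool =
  let n = String.length s in
  let rec go i =
    i >= n
    || (Uchar.is_valid (Int32.to_int (String.get_int32_be s i)) && go (i + 4))
  in
  (* The length test runs first, so [go] never reads a partial word. *)
  n mod 4 = 0 && go 0

let ok = is_valid_utf32be "\x00\x00\x00A"
let wrong_length = is_valid_utf32be "A"
let surrogate = is_valid_utf32be "\x00\x00\xD8\x00"
let too_large = is_valid_utf32be "\x00\x11\x00\x00"
```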

val sanitize : string -> t

Create a t from a string by replacing any byte sequences that are invalid in this encoding with Uchar.replacement_char. This can be used to decode strings that may be encoded incorrectly.
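One plausible sanitizing policy is sketched below with the standard library: each 32-bit word that is not a scalar value becomes U+FFFD, and a trailing partial word becomes a single U+FFFD. How Base treats partial words is not stated on this page, so that part is an assumption, as is the name sanitize_utf32be. The sketch assumes a 64-bit OCaml and returns a plain string.

```ocaml
(* Stdlib-only model of sanitizing, not Base's implementation. *)
let sanitize_utf32be (s : string) : string =
  let n = String.length s in
  let out = Buffer.create (n + 4) in
  let add u = Buffer.add_int32_be out (Int32.of_int (Uchar.to_int u)) in
  let rec go i =
    if i + 4 <= n then begin
      let w = Int32.to_int (String.get_int32_be s i) in
      (* Keep valid scalar values; replace everything else with U+FFFD. *)
      add (if Uchar.is_valid w then Uchar.of_int w else Uchar.rep);
      go (i + 4)
    end
    else if i < n then add Uchar.rep (* assumed policy for 1 to 3 leftover bytes *)
  in
  go 0;
  Buffer.contents out

let kept = sanitize_utf32be "\x00\x00\x00A"
let fixed_surrogate = sanitize_utf32be "\x00\x00\xD8\x00"
let fixed_tail = sanitize_utf32be "\x00\x00\x00A\x00"
```

Valid input passes through unchanged, and the output length is always a multiple of four.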

val get : t -> byte_pos:int -> Uchar.t

Decodes the Unicode scalar value at the given byte index in this encoding. Raises if byte_pos does not refer to the start of a Unicode scalar value.
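In a fixed-width encoding such as UTF-32, scalar values start at byte positions that are multiples of four. A stdlib-only sketch of a byte-indexed lookup under that reading; get_utf32be is a hypothetical name, the input is assumed to be valid UTF-32BE, and the Invalid_argument exception is an assumption.

```ocaml
(* Stdlib-only model of byte-indexed decoding, not Base's implementation. *)
let get_utf32be (s : string) ~byte_pos : Uchar.t =
  if byte_pos < 0 || byte_pos mod 4 <> 0 || byte_pos + 4 > String.length s
  then invalid_arg "get_utf32be: byte_pos is not the start of a scalar value";
  Uchar.of_int (Int32.to_int (String.get_int32_be s byte_pos))

(* "h" followed by U+03B1 (Greek small letter alpha). *)
let s = "\x00\x00\x00h\x00\x00\x03\xB1"
let first = get_utf32be s ~byte_pos:0
let second = get_utf32be s ~byte_pos:4
```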

val of_string_unchecked : string -> t

Creates a t without sanitizing or validating the string. Other functions in this interface may raise or produce unpredictable results if the string is invalid in this encoding.

val split : t -> on:Uchar.t -> t list

Similar to String.split, but splits on a Uchar.t in t. If you want to split on a char, first convert it with Uchar.of_char, but note that the actual byte(s) on which t is split may not be the same as the char byte depending on both char and the encoding of t. For example, splitting on 'α' in UTF-8 or on '\n' in UTF-16 is actually splitting on a 2-byte sequence.
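The difference between splitting on a scalar value and splitting on a byte is easy to see in UTF-32BE, where '\n' is the four-byte word 00 00 00 0A and the byte 0A can also occur inside other scalar values such as U+010A. The sketch below walks the string word by word; split_utf32be is a hypothetical name and the input is assumed to be valid UTF-32BE.

```ocaml
(* Stdlib-only model of splitting on a scalar value, not Base's implementation. *)
let split_utf32be (s : string) ~(on : Uchar.t) : string list =
  let n = String.length s in
  let target = Int32.of_int (Uchar.to_int on) in
  (* [start] is where the current piece begins; [i] is the word being examined. *)
  let rec go start i acc =
    if i >= n then List.rev (String.sub s start (n - start) :: acc)
    else if Int32.equal (String.get_int32_be s i) target
    then go (i + 4) (i + 4) (String.sub s start (i - start) :: acc)
    else go start (i + 4) acc
  in
  go 0 0 []

(* 'a', '\n', then U+010A, whose last byte is also 0x0A. *)
let text = "\x00\x00\x00a" ^ "\x00\x00\x00\n" ^ "\x00\x00\x01\x0A"
let pieces = split_utf32be text ~on:(Uchar.of_char '\n')
let naive_pieces = String.split_on_char '\n' text
```

Splitting on the scalar value yields two well-formed pieces, while the naive byte split cuts through U+010A and produces three fragments.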

val codec_name : string

The name of this encoding scheme; e.g., "UTF-8".

val length_in_uchars : t -> int

Counts the number of Unicode scalar values in t.

This function is not a good proxy for display width, as some scalar values have display widths > 1. Many native applications such as terminal emulators use wcwidth (see man 3 wcwidth) to compute the display width of a scalar value. See the uucp library's Uucp.Break.tty_width_hint for an implementation of wcwidth's logic. However, this is merely best-effort, as display widths will vary based on the font and underlying text shaping engine (see docs on tty_width_hint for details).

For applications that support grapheme clusters (many terminal emulators do not), t should first be split into grapheme clusters, and then the display width of each cluster needs to be computed (which is the maximum display width of the scalar values in the cluster).
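A small illustration of why the scalar count is not a display width. In valid UTF-32BE every scalar value occupies four bytes, so the count is presumably the byte length divided by four; length_in_uchars_utf32be is a hypothetical stdlib-only stand-in operating on plain strings.

```ocaml
(* Stdlib-only model; assumes the input is valid UTF-32BE. *)
let length_in_uchars_utf32be (s : string) : int = String.length s / 4

(* U+00E9, a precomposed "e with acute". *)
let precomposed = "\x00\x00\x00\xE9"

(* U+0065 followed by U+0301 (combining acute accent). *)
let decomposed = "\x00\x00\x00e\x00\x00\x03\x01"

let n_precomposed = length_in_uchars_utf32be precomposed
let n_decomposed = length_in_uchars_utf32be decomposed
```

Both strings typically render as a single glyph, yet they contain one and two scalar values respectively.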

There are some active efforts to improve the current state of affairs:

  • https://github.com/wez/wezterm/issues/4320
  • https://www.unicode.org/L2/L2023/23194-text-terminal-wg-report.pdf
val length : t -> int

length could be misinterpreted as counting bytes. We direct users to other, clearer options.

  • alert length_in_uchars: Use length_in_uchars to count Unicode scalar values or String.length to count bytes.