package base

Module String.Escaping

Operations for escaping and unescaping strings, with parameterized escape and escapeworthy characters. Escaping/unescaping using this module is more efficient than using Pcre. Benchmark code can be found in core/benchmarks/string_escaping.ml.

val escape_gen_exn : escapeworthy_map:(char * char) list -> escape_char:char -> (string -> string) Staged.t

escape_gen_exn ~escapeworthy_map ~escape_char returns a function that will escape a string s as follows: if (c1,c2) is in escapeworthy_map, then every occurrence of c1 is replaced by escape_char followed by c2.

Raises an exception if escapeworthy_map is not one-to-one. If escape_char is not in escapeworthy_map, then it will be escaped to itself.
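As a minimal sketch, assuming the Base library is installed (the alias E and all value names are introduced here for illustration only), the following escapes newline and tab with a backslash, and uses the Or_error-returning escape_gen to show a map that is not one-to-one being rejected:

```ocaml
(* Sketch only: assumes the Base library is available. *)
module E = Base.String.Escaping

(* Map newline to 'n' and tab to 't', with backslash as the escape character. *)
let escape =
  Base.Staged.unstage
    (E.escape_gen_exn
       ~escapeworthy_map:[ ('\n', 'n'); ('\t', 't') ]
       ~escape_char:'\\')

(* The backslash is not in the map, so it is escaped to itself. *)
let escaped = escape "a\tb\nc\\d"
let () = assert (escaped = "a\\tb\\nc\\\\d")

(* 'a' and 'b' both map to 'x', so this map is not one-to-one. *)
let rejected =
  match
    E.escape_gen ~escapeworthy_map:[ ('a', 'x'); ('b', 'x') ] ~escape_char:'_'
  with
  | Error _ -> true
  | Ok _ -> false

let () = assert rejected
```

Unstaging once and reusing the resulting function avoids rebuilding the escape table for every string.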

val escape_gen : escapeworthy_map:(char * char) list -> escape_char:char -> (string -> string) Or_error.t
val escape : escapeworthy:char list -> escape_char:char -> (string -> string) Staged.t

escape ~escapeworthy ~escape_char is equivalent to

  escape_gen_exn ~escapeworthy_map:(List.zip_exn escapeworthy escapeworthy)
    ~escape_char

Duplicates and escape_char are removed from escapeworthy first, so no exception will be raised.
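A short sketch of escape, assuming the Base library is installed (E and the value names are illustrative only). The escapeworthy list deliberately contains a duplicate and the escape character itself to show that neither causes an exception:

```ocaml
(* Sketch only: assumes the Base library is available. *)
module E = Base.String.Escaping

(* ',' appears twice and '_' is the escape character; both are tolerated. *)
let escape =
  Base.Staged.unstage
    (E.escape ~escapeworthy:[ ','; ','; '_' ] ~escape_char:'_')

(* Each escapeworthy character is prefixed with the escape character. *)
let escaped = escape "foo,bar_baz"
let () = assert (escaped = "foo_,bar__baz")
```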

val unescape_gen_exn : escapeworthy_map:(char * char) list -> escape_char:char -> (string -> string) Staged.t

unescape_gen_exn is the inverse operation of escape_gen_exn. That is,

  let escape = Staged.unstage (escape_gen_exn ~escapeworthy_map ~escape_char) in
  let unescape = Staged.unstage (unescape_gen_exn ~escapeworthy_map ~escape_char) in
  assert (s = unescape (escape s))

always succeeds when escapeworthy_map does not cause an exception.

val unescape_gen : escapeworthy_map:(char * char) list -> escape_char:char -> (string -> string) Or_error.t
val unescape : escape_char:char -> (string -> string) Staged.t

unescape ~escape_char is defined as unescape_gen_exn ~escapeworthy_map:[] ~escape_char
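The following sketch, assuming the Base library is installed (E and the value names are illustrative), exercises the round trip with a non-trivial map and then uses unescape, where each escaped character is simply replaced by itself:

```ocaml
(* Sketch only: assumes the Base library is available. *)
module E = Base.String.Escaping

let escapeworthy_map = [ ('\n', 'n') ]

let escape =
  Base.Staged.unstage (E.escape_gen_exn ~escapeworthy_map ~escape_char:'\\')

let unescape_gen =
  Base.Staged.unstage (E.unescape_gen_exn ~escapeworthy_map ~escape_char:'\\')

(* Escaping and then unescaping with the same map restores the input. *)
let original = "line1\nline2\\end"
let round_trip = unescape_gen (escape original)
let () = assert (round_trip = original)

(* With an empty map, the character after each escape character is kept as is. *)
let unescape = Base.Staged.unstage (E.unescape ~escape_char:'_')
let plain = unescape "foo_,bar__baz"
let () = assert (plain = "foo,bar_baz")
```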

val is_char_escaping : string -> escape_char:char -> int -> bool

Any char in an escaped string is either escaping, escaped, or literal. For example, in the escaped string "0_a0__0" with '_' as escape_char, positions 1 and 4 are escaping, positions 2 and 5 are escaped, and the rest are literal.

is_char_escaping s ~escape_char pos returns true if the char at pos is escaping, false otherwise.

val is_char_escaped : string -> escape_char:char -> int -> bool

is_char_escaped s ~escape_char pos returns true if the char at pos is escaped, false otherwise.

val is_char_literal : string -> escape_char:char -> int -> bool

is_char_literal s ~escape_char pos returns true if the char at pos is neither escaped nor escaping.
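The three predicates partition the positions of a string. As a sketch, assuming the Base library is installed, the hypothetical helper classify (along with the kind type and the alias E, none of which are part of the module) labels every position of the "0_a0__0" example:

```ocaml
(* Sketch only: assumes the Base library is available. *)
module E = Base.String.Escaping

(* Illustrative type; not part of String.Escaping. *)
type kind = Escaping | Escaped | Literal

(* Hypothetical helper: classify the character at [pos]. *)
let classify s ~escape_char pos =
  if E.is_char_escaping s ~escape_char pos then Escaping
  else if E.is_char_escaped s ~escape_char pos then Escaped
  else Literal

(* Positions 0 through 6 of "0_a0__0" with '_' as the escape character. *)
let kinds = List.init 7 (fun pos -> classify "0_a0__0" ~escape_char:'_' pos)

let () =
  assert (
    kinds = [ Literal; Escaping; Escaped; Literal; Escaping; Escaped; Literal ])
```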

val index : string -> escape_char:char -> char -> int option

index s ~escape_char char finds the first literal (not escaped) instance of char in s starting from 0.

val index_exn : string -> escape_char:char -> char -> int
val rindex : string -> escape_char:char -> char -> int option

rindex s ~escape_char char finds the first literal (not escaped) instance of char in s starting from the end of s and proceeding towards 0.

val rindex_exn : string -> escape_char:char -> char -> int
val index_from : string -> escape_char:char -> int -> char -> int option

index_from s ~escape_char pos char finds the first literal (not escaped) instance of char in s starting from pos and proceeding towards the end of s.

val index_from_exn : string -> escape_char:char -> int -> char -> int
val rindex_from : string -> escape_char:char -> int -> char -> int option

rindex_from s ~escape_char pos char finds the first literal (not escaped) instance of char in s starting from pos and proceeding towards 0.
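A sketch of the four option-returning search functions on one string, assuming the Base library is installed (E and the value names are illustrative). The commas at positions 2 and 7 are escaped, so only those at positions 4 and 9 can be found:

```ocaml
(* Sketch only: assumes the Base library is available. *)
module E = Base.String.Escaping

(* Index:  0 1 2 3 4 5 6 7 8 9 10
   Char:   a _ , b , c _ , d , e   *)
let s = "a_,b,c_,d,e"

let first = E.index s ~escape_char:'_' ','
let last = E.rindex s ~escape_char:'_' ','
let after_5 = E.index_from s ~escape_char:'_' 5 ','
let before_8 = E.rindex_from s ~escape_char:'_' 8 ','
let before_3 = E.rindex_from s ~escape_char:'_' 3 ','

let () = assert (first = Some 4)
let () = assert (last = Some 9)
let () = assert (after_5 = Some 9) (* position 7 is escaped, so it is skipped *)
let () = assert (before_8 = Some 4)
let () = assert (before_3 = None) (* only the escaped comma at 2 lies before 3 *)
```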

val rindex_from_exn : string -> escape_char:char -> int -> char -> int
val split : string -> on:char -> escape_char:char -> string list

split s ~escape_char ~on returns a list of substrings of s that are separated by literal versions of on. Consecutive on characters will cause multiple empty strings in the result. Splitting the empty string returns a list of the empty string, not the empty list.

E.g., split ~escape_char:'_' ~on:',' "foo,bar_,baz" = ["foo"; "bar_,baz"].

val split_on_chars : string -> on:char list -> escape_char:char -> string list

split_on_chars s ~on ~escape_char returns a list of all substrings of s that are separated by one of the literal chars from on. Separators are not grouped, so a run of adjacent separators in the source string produces multiple empty strings in the result.

E.g., split_on_chars ~escape_char:'_' ~on:[',';'|'] "foo_|bar,baz|0" = ["foo_|bar"; "baz"; "0"].
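The splitting rules can be checked with a short sketch, assuming the Base library is installed (E and the value names are illustrative). Note that the escape sequences are left in the resulting substrings; splitting does not unescape:

```ocaml
(* Sketch only: assumes the Base library is available. *)
module E = Base.String.Escaping

(* Two adjacent literal commas yield an empty string; the escaped comma
   does not split. *)
let fields = E.split "a,,b_,c" ~on:',' ~escape_char:'_'
let () = assert (fields = [ "a"; ""; "b_,c" ])

(* Splitting the empty string gives a one-element list. *)
let empty = E.split "" ~on:',' ~escape_char:'_'
let () = assert (empty = [ "" ])

(* Several separators at once; the escaped '|' is kept inside its field. *)
let multi = E.split_on_chars "foo_|bar,baz|0" ~on:[ ','; '|' ] ~escape_char:'_'
let () = assert (multi = [ "foo_|bar"; "baz"; "0" ])
```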

val lsplit2 : string -> on:char -> escape_char:char -> (string * string) option

lsplit2 s ~on ~escape_char splits s into a pair on the first literal instance of on (meaning the first unescaped instance) starting from the left.

val lsplit2_exn : string -> on:char -> escape_char:char -> string * string
val rsplit2 : string -> on:char -> escape_char:char -> (string * string) option

rsplit2 s ~on ~escape_char splits s into a pair on the first literal instance of on (meaning the first unescaped instance) starting from the right.
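A sketch contrasting lsplit2 and rsplit2, assuming the Base library is installed (E and the value names are illustrative). The '=' characters at positions 2 and 9 are escaped, so the splits happen at positions 4 and 6 respectively:

```ocaml
(* Sketch only: assumes the Base library is available. *)
module E = Base.String.Escaping

(* Index:  0 1 2 3 4 5 6 7 8 9 10
   Char:   a _ = b = c = d _ = e   *)
let s = "a_=b=c=d_=e"

let left = E.lsplit2 s ~on:'=' ~escape_char:'_'
let right = E.rsplit2 s ~on:'=' ~escape_char:'_'

let () = assert (left = Some ("a_=b", "c=d_=e"))
let () = assert (right = Some ("a_=b=c", "d_=e"))

(* With no literal '=' in the string, the result is None. *)
let none = E.lsplit2 "a_=b" ~on:'=' ~escape_char:'_'
let () = assert (none = None)
```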

val rsplit2_exn : string -> on:char -> escape_char:char -> string * string
val lstrip_literal : ?drop:(char -> bool) -> t -> escape_char:char -> t

These are the same as lstrip, rstrip, and strip for generic strings, except that they only drop literal characters -- they do not drop characters that are escaping or escaped. This makes sense if you're trying to get rid of junk whitespace (for example), because escaped whitespace seems more likely to be deliberate and not junk.

val rstrip_literal : ?drop:(char -> bool) -> t -> escape_char:char -> t
val strip_literal : ?drop:(char -> bool) -> t -> escape_char:char -> t
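
As a sketch of the literal-only stripping behaviour, assuming the Base library is installed and that ?drop is left at its whitespace default (E and the value names are illustrative), the space at position 5 below is escaped and therefore survives:

```ocaml
(* Sketch only: assumes the Base library is available. *)
module E = Base.String.Escaping

(* Index:  0   1 2 3 4 5   6
   Char:  ' '  f o o _ ' ' ' '
   The space at 5 follows the escape character, so it is escaped. *)
let s = " foo_  "

let l = E.lstrip_literal s ~escape_char:'_'
let r = E.rstrip_literal s ~escape_char:'_'
let both = E.strip_literal s ~escape_char:'_'

let () = assert (l = "foo_  ")
let () = assert (r = " foo_ ")
let () = assert (both = "foo_ ")
```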