# Basic Data Types And Pattern Matching

## Introduction

This tutorial introduces basic data types in OCaml. Its goal is to teach skills on how to handle data from predefined, variant, and record types. It also includes the important concept of pattern matching on those types.

In OCaml, there are no type checks at runtime, and values don't change type unless explicitly converted. This is what being statically- and strongly-typed means. This allows convenient and safe processing of structured data. In this tutorial, data in OCaml is represented using variants and products, which correspond to algebraic data types. At this level, a nominal type-checking algorithm is used. Historically, this is OCaml's first type system, as it comes from the ML programming language, OCaml's ancestor. Although OCaml has other type systems, this document focuses on data typed using this algorithm.

This tutorial will cover atomic types, such as integers and Booleans, before moving on to predefined compound types, like strings and lists. It ends with a section about user-defined types, namely variants and records.

Before proceeding, it's necessary to have completed the Get Started series of tutorials as well as Functions and Values. As in previous tutorials, expressions after `#` and ending with `;;` are for the toplevel, like UTop.

## Predefined Types

### Integers, Floats, Booleans, and Characters

#### Integers

The `int` type is the default and basic integer type in OCaml. When you enter a whole number, OCaml recognises it as an integer, as shown in this example:

``````# 42;;
- : int = 42
``````

The `int` type represents platform-dependent signed integers. This means `int` does not always have same the number of bits. It depends on underlying platform characteristics, such as processor architecture or operating system. Operations on `int` values are provided by the `Stdlib` and the `Int` modules.

Usually, `int` has 31 bits in 32-bit architectures and 63 in 64-bit architectures, because one bit is reserved for OCaml's runtime operation. The standard library also provides `Int32` and `Int64` modules, which support platform-independent operations on 32- and 64-bit signed integers. These modules are not detailed in this tutorial.

There are no dedicated types for unsigned integers in OCaml. Bitwise operations on `int` treat the sign bit the same as other bits. Binary operators use standard symbols. The signed remainder operator is written `mod`. Integers in OCaml have no predefined power operator.

#### Floats and Type Conversions

Float numbers have type `float`.

OCaml does not perform any implicit type conversion between values. Therefore, arithmetic expressions can't mix integers and floats. Parameters are either all `int` or all `float`. Arithmetic operators on floats are not the same, and they are written with a dot suffix: `+.`, `-.`, `*.`, `/.`.

``````# let pi = 3.14159;;
val pi : float = 3.14159

# let tau = 2.0 *. pi;;
val tau : float = 6.28318

# let tau = 2 *. pi;;
Error: This expression has type int but an expression was expected of type
float

# let tau = 2 * pi;;
Error: This expression has type float but an expression was expected of type
int
``````

Operations on `float` are provided by the `Stdlib` and the `Float` modules.

#### Booleans

Boolean values are represented by the type `bool`.

``````# true;;
- : bool = true

# false;;
- : bool = false

# false < true;;
- : bool = true
``````

Operations on `bool` are provided by the `Stdlib` and the `Bool` modules. The conjunction “and” is written `&&` and disjunction “or” is written `||`. Both are short-circuited, meaning that they don't evaluate the argument on the right if the left one's value is sufficient to decide the whole expression's value.

#### Characters

Values of type `char` correspond to the 256 symbols of the Latin-1 set. Character literals are surrounded by single quotes, as shown below:

``````# 'a';;
- : char = 'a'
``````

Operations on `char` values are provided by the `Stdlib` and the `Char` modules.

The module `Uchar` provides support for Unicode characters.

### Strings & Byte Sequences

#### Strings

Strings are finite and fixed-sized sequences of `char` values. Strings are immutable, meaning it is impossible to change a character's value inside a string. The string concatenation operator symbol is `^`.

``````# "hello" ^ " " ^ "world!";;
- : string = "hello world!"
``````

``````# "buenos dias".;;
- : char : 'o'
``````

Operations on `string` values are provided by the `Stdlib` and the `String` modules.

#### Byte Sequences

Byte sequences are finite and fixed-sized. Each individual byte is represented by a `char` value. Byte sequences are mutable, meaning they can't be extended or shortened, but each component byte may be updated. Essentially, a byte sequence (type `bytes`) is a mutable string that can't be printed. There is no way to write `bytes` literally, so they must be produced by a function.

``````# String.to_bytes "hello";;
- : bytes = Bytes.of_string "hello"
``````

Operations on `bytes` values are provided by the `Stdlib` and the `Bytes` modules. Only the function `Bytes.get` allows direct access to the characters contained in a byte sequence. There is no direct access operator on byte sequences.

### Arrays & Lists

#### Arrays

Arrays are finite and fixed-sized sequences of values of the same type. Here are a couple of examples:

``````# [| 0; 1; 2; 3; 4; 5 |];;
- : int array = [|0; 1; 2; 3; 4; 5|]

# [| 'x'; 'y'; 'z' |];;
- : char array = [|'x'; 'y'; 'z'|]

# [| "foo"; "bar"; "baz" |];;
- : string array = [|"foo"; "bar"; "baz"|]
``````

Arrays may contain values of any type. Here arrays are `int array`, `char array`, and `string array`, but any type of data can be used in an array. Usually, `array` is said to be a polymorphic type. Strictly speaking, it is a type operator, and it accepts a type as a parameter (here `int`, `char`, and `string`) to form another type (those inferred here). This is the empty array.

``````# [||];;
- : 'a array = [||]
``````

Remember, `'a` ("alpha") is a type parameter that will be replaced by another type.

Like `string` and `bytes`, arrays support direct access, but the syntax is not the same.

``````# [| 'x'; 'y'; 'z' |].(2);;
- : char = 'z'
``````

Arrays are mutable, meaning they can't be extended or shortened, but each element may be updated.

``````# let letter = [| 'v'; 'x'; 'y'; 'z' |];;
val letter : char array = [|'v'; 'x'; 'y'; 'z'|]

# letter.(2) <- 'A';;
- : unit = ()

# letter;;
- : char array = [|'v'; 'x'; 'A'; 'z'|]
``````

The left-arrow `<-` is the array update operator. Above, it means the cell at index 2 is set to value `'A'`. It is the same as writing `Array.set letter 2 'A'`. Array update is a side effect, and the unit value is returned.

Operations on arrays are provided by the `Array` module. There is a dedicated tutorial on Arrays.

#### Lists

As literals, lists are very much like arrays. Here are the same previous examples turned into lists.

``````# [ 0; 1; 2; 3; 4; 5 ];;
- : int list = [0; 1; 2; 3; 4; 5]

# [ 'x'; 'y'; 'z' ];;
- : char list = ['x'; 'y'; 'z']

# [ "foo"; "bar"; "baz" ];;
- : string list = ["foo"; "bar"; "baz"]
``````

Like arrays, lists are finite sequences of values of the same type. They are polymorphic, too. However, lists are extensible, immutable, and don't support direct access to all the values they contain. Lists play a central role in functional programming, so they have a dedicated tutorial.

Operations on lists are provided by the `List` module. The `List.append` function concatenates two lists. It can be used as an operator with the symbol `@`.

There are symbols of special importance with respect to lists:

• The empty list is written `[]`, has type `'a list'`, and is pronounced “nil.”
• The list constructor operator, written `::` and pronounced “cons,” is used to add a value at the head of a list.

Together, they are the basic means to build a list and access the data it stores. For instance, here is how lists are built by successively applying the cons (`::`) operator:

``````# 3 :: [];;
- : int list = 

# 2 :: 3 :: [];;
- : int list = [2; 3]

# 1 :: 2 :: 3 :: [];;
- : int list = [1; 2; 3]
``````

Pattern matching provides the basic means to access data stored inside a list.

``````# match [1; 2; 3] with
| x :: u -> x
| [] -> raise Exit;;
- : int = 1

# match [1; 2; 3] with
| x :: y :: u -> y
| x :: u -> x
| [] -> raise Exit;;
- : int = 2
``````

In the above expressions, `[1; 2; 3]` is the value that is matched over. Each expression between the `|` and `->` symbols is a pattern. They are expressions of type `list`, only formed using `[]`, `::`, and binding names that represent various shapes a list may have. The pattern `[]` means “if the list is empty.” The pattern `x :: u` means “if the list contains data, let `x` be the first element of the list and `u` be the rest of the list.” Expressions at the right of the `->` symbol are the results returned in each corresponding case.

### Options & Results

#### Options

The `option` type is also a polymorphic type. Option values can store any kind of data or represent the absence of any such data. Option values can only be constructed in two different ways: either `None`, when no data is available, or `Some`, otherwise.

``````# None;;
- : 'a option = None

# Some 42;;
- : int option = Some 42

# Some "hello";;
- : string option = Some "hello"
``````

Here is an example of pattern matching on an option value:

``````# match Some 42 with None -> raise Exit | Some x -> x;;
- : int = 42
``````

Operations on options are provided by the `Option` module. Options are discussed in the Error Handling guide.

#### Results

The `result` type can be used to express that a function's outcome can be either success or failure. There are only two ways to build a result value: either using `Ok` or `Error` with the intended meaning. Both constructors can hold any kind of data. The `result` type is polymorphic, but it has two type parameters: one for `Ok` values and another for `Error` values.

``````# Ok 42;;
- : (int, 'a) result = Ok 42

# Error "Sorry";;
- : ('a, string) result = Error "Sorry"
``````

Operations on results are provided by the `Result` module. Results are discussed in the Error Handling guide.

### Tuples

Here is a tuple containing two values, also known as a pair.

``````# (3, 'K');;
- : int * char = (3, 'K')
``````

That pair contains the integer `3` and the character `'K'`; its type is `int * char`. The `*` symbol stands for product type.

This generalises to tuples with 3 or more elements. For instance, `(6.28, true, "hello")` has type `float * bool * string`. The types `int * char` and `float * bool * string` are called product types. The `*` symbol is used to denote types bundled together in products.

The predefined function `fst` returns the first element of a pair, while `snd` returns the second element of a pair.

``````# fst (3, 'a');;
- : int = 3

# snd (3, 'a');;
- : char = 'a'
``````

In the standard library, both are defined using pattern matching. Here is how a function extracts the third element of the product of four types:

``````# let f x = match x with (a, b, c, d) -> c;;
val f : 'a * 'b * 'c * 'd -> 'c = <fun>
``````

Note that types `int * char * bool`, `int * (char * bool)`, and `(int * char) * bool` are not the same. The values `(42, 'a', true)`, `(42, ('a', true))`, and `((42, 'a'), true)` are not equal. In mathematical language, the product type operator `*` is not associative.

### Functions

The type of functions from type `a` to type `b` is written `a -> b`. Here are a few examples:

``````# fun x -> x * x;;
- : int -> int = <fun>

# (fun x -> x * x) 9;;
- : int = 81
``````

The first expression is an anonymous function of type `int -> int`. The type is inferred from the expression `x * x`, which must be of type `int` since `*` is an operator that returns an `int`. The `<fun>` printed in place of the value is a token, meaning functions don't have a value to be displayed. This is because, if they have been compiled, their code is no longer available.

The second expression is function application. The parameter `9` is applied, and the result `81` is returned.

``````# fun x -> x;;
- : 'a -> 'a = <fun>

# (fun x -> x) 42;;
- : int = 42

# (fun x -> x) "This is really disco!";;
- : string = "This is really disco!"
``````

The first expression is another anonymous function. It is the identity function, it can be applied to anything, and it returns its argument unchanged. This means that its parameter can be of any type, and its result has the same type. The same code can be applied to data of different types. This is called polymorphism.

Remember, the `'a` is a type parameter, so values of any type can be passed to the function and their type replaces the type parameter. The identity function has the same input and output type, whatever it may be.

The two following expressions show that the identity function can apply to different type parameters:

``````# let f = fun x -> x * x;;
val f : int -> int = <fun>

# f 9;;
- : int = 81
``````

Defining a function is the same as naming a value, as illustrated in the first expression:

``````# let g x = x * x;;
val g : int -> int = <fun>

# g 9;;
- : int = 81
``````

Executable OCaml code consists primarily of functions, so it's beneficial to make them as concise and clear as possible. The function `g` is defined here using a shorter, more common, and maybe more intuitive syntax.

In OCaml, functions may terminate without returning the expected type value by throwing an exception (of type `exn`), which does not appear in its type. There is no way to know if a function may raise an exception without inspecting its code.

``````# raise;;
- : exn -> 'a' = <fun>
``````

Exceptions are discussed in the Error Handling guide.

Functions may have several parameters.

``````# fun a b -> a ^ " " ^ b;;
- : string -> string -> string = <fun>

# let mean a b = (a + b) / 2;;
val mean : int -> int -> int = <fun>
``````

Like the product type symbol `*`, the function type symbol `->` is not associative. The following two types are not the same:

• `(int -> int) -> int` : This function takes a function of type `int -> int` as a parameter and returns an `int` as the result.
• `int -> (int -> int)` : This function takes an `int` as a parameter and returns a function of type `int -> int` as the result.

### Unit

Uniquely, the type `unit` has only one value. It is written `()` and pronounced “unit.”

The `unit` type has several uses. Mainly, it serves as a token when a function does not need to be passed data or doesn't have any data to return once it has completed its computation. This happens when functions have side effects such as OS-level I/O. Functions need to be applied to something for their computation to be triggered, and they also must return something. When nothing meaningful can be passed or returned, `()` should be used.

``````# read_line;;
- : unit -> string = <fun>

# print_endline;;
- : string -> unit = <fun>
``````

The function `read_line` reads an end-of-line terminated sequence of characters from standard input and returns it as a string. Reading input begins when `()` is passed.

`````` # read_line ();;
foo bar
- : string = "foo bar"

# print_endline;;
- : string -> unit = <fun>
``````

Note: Replace `foo bar` with your own text and press `Return`.

The function `print_endline` prints the string followed by a line ending on standard output. Return of the unit value means the output request has been queued by the operating system.

## User-Defined Types

User-defined types are always introduced using the `type … = …` statement. The keyword `type` must written in lowercase. The first ellipsis stands for type name and must not begin with a capital. The second ellipsis stands for the type definition. Three cases are possible:

1. Variant
2. Record
3. Aliases

These three kinds of type definitions are covered in the next three sections.

### Variants

Variants are also called tagged unions. They relate to the concept of disjoint union.

#### Enumerated Data Types

The simplest form of a variant type corresponds to an enumerated type. It is defined by an explicit list of named values. Defined values are called constructors and must be capitalised.

For example, here is how variant data types could be defined to represent Dungeons & Dragons character classes and alignment.

``````# type character_class =
| Barbarian
| Bard
| Cleric
| Druid
| Fighter
| Monk
| Ranger
| Rogue
| Sorcerer
| Wizard;;
type character_class =
Barbarian
| Bard
| Cleric
| Druid
| Fighter
| Monk
| Ranger
| Rogue
| Sorcerer
| Wizard

# type rectitude = Evil | R_Neutral | Good;;
type rectitude = Evil | R_Neutral | Good

# type firmness = Chaotic | F_Neutral | Lawful;;
type firmness = Chaotic | F_Neutral | Lawful
``````

These kinds of variant types can also be used to represent weekdays, cardinal directions, or any other fixed-sized set of values that can be given names. An ordering is defined on values following the definition order (e.g., `Druid < Ranger`).

Pattern matching can be performed on the types defined above:

``````# let rectitude_to_french = function
| Evil -> "Mauvais"
| R_Neutral -> "Neutre"
| Good -> "Bon";;
val rectitude_to_french : rectitude -> string = <fun>
``````

Note that:

• `unit` is a variant with a unique constructor, which does not carry data: `()`.
• `bool` is also a variant with two constructors that doesn't carry data: `true` and `false`.

#### Constructors With Data

It is possible to wrap data in constructors. The following type has several constructors with data (e.g., `Hash of string`) and some without (e.g., `Head`). It represents the different means to refer to a Git revision.

``````# type commit =
| Hash of string
| Tag of string
| Branch of string
type commit =
Hash of string
| Tag of string
| Branch of string

``````

Here is how to convert a `commit` to a `string` using pattern matching:

``````# let commit_to_string = function
| Hash sha -> sha
| Tag name -> name
| Branch name -> name
val commit_to_string : commit -> string = <fun>
``````

Above, the `function …` construct is used instead of the `match … with …` construct used previously:

``````let commit_to_string' x = match x with
| Hash sha -> sha
| Tag name -> name
| Branch name -> name
val commit_to_string' : commit -> string = <fun>
``````

We need to pass an inspected expression to the `match … with …` construct. The `function …` is a special form of an anonymous function that takes a parameter and forwards it to a `match … with …` construct, as shown above.

Warning: Wrapping product types with parentheses turns them into a single parameter.

``````# type t =
| C1 of int * bool
| C2 of (int * bool);;

# let p = (4, false);;
val p : int * bool = (4, false)

# C1 p;;
Error: The constructor C1 expects 2 argument(s),
but is applied here to 1 argument(s)

# C2 p;;
- : C2 = C2 (4, false)
``````

The constructor `C1` has two parameters of type `int` and `bool`, whilst the constructor `C2` has a single parameter of type `int * bool`.

#### Recursive Variants

A variant definition referring to itself is recursive. A constructor may wrap data from the type being defined.

This is the case for the following definition, which can be used to store JSON values.

``````# type json =
| Null
| Bool of bool
| Int of int
| Float of float
| String of string
| Array of json list
| Object of (string * json) list;;
type json =
Null
| Bool of bool
| Int of int
| Float of float
| String of string
| Array of json list
| Object of (string * json) list
``````

Both constructors `Array` and `Object` contain values of type `json`.

Functions defined using pattern matching on recursive variants are often recursive too. This function checks if a name is present in a whole JSON tree:

``````# let rec has_field name = function
| Array u ->
List.fold_left (fun b obj -> b || has_field name obj) false u
| Object u ->
List.fold_left
(fun b (key, obj) -> b || key = name || has_field name obj) false u
| _ -> false;;
val has_field : string -> json -> bool = <fun>
``````

Here, the last pattern uses the symbol `_`, which catches everything. It returns `false` on all data that is neither `Array` nor `Object`.

### Polymorphic Data Types

#### Revisiting Predefined Types

The predefined type `option` is a variant type with two constructors: `Some` and `None`. It can contain values of any type, such as `Some 42` or `Some "hola"`. In that sense, `option` is polymorphic. Here is how it is defined in the standard library:

``````# #show option;;
type 'a option = None | Some of 'a
``````

The predefined type `list` is polymorphic in the same sense. It is a variant with two constructors and can hold data of any type. Here is how it is defined in the standard library:

``````# #show list;;
type 'a list = [] | (::) of 'a * 'a list
``````

The only magic here is turning constructors into symbols, which we don't cover in this tutorial. The types `bool` and `unit` also are regular variants, with the same magic:

``````# #show unit;;
type unit = ()

# #show bool;;
type bool = false | true
``````

Implicitly, product types also behave as variant types. For instance, pairs can be seen as inhabitants of this type:

``````# type ('a, 'b) pair = Pair of 'a * 'b;;
type ('a, 'b) pair = Pair of 'a * 'b
``````

`(int, bool) pair` would be written `int * bool`, and `Pair (42, true)` would be written `(42, true)`. From the developer's perspective, everything happens as if such a type were declared for every possible product shape. This is what allows pattern matching on products.

Even integers and floats can be seen as enumerated-like variant types, with many constructors and funky syntactic sugar, which allows pattern matching on those types.

In the end, the only type construction that does not reduce to a variant is the function arrow type. Pattern matching allows the inspection of values of any type, except functions.

#### User-Defined Polymorphic

Here is an example of a variant type that combines constructors with data, constructors without data, polymorphism, and recursion:

``````# type 'a tree =
| Leaf
| Node of 'a * 'a tree * 'a tree;;
type 'a tree = Leaf | Node of 'a * 'a tree * 'a tree
``````

It can be used to represent arbitrarily labelled binary trees. Assuming such a tree would be labelled with integers, here is a possible way to compute the sum of its integers, using recursion and pattern matching.

``````# let rec sum = function
| Leaf -> 0
| Node (x, lft, rht) -> x + sum lft + sum rht;;
val sum : int tree -> int = <fun>
``````

Here is how the map function can be defined in this type:

``````# let rec map f = function
| Leaf -> Leaf
| Node (x, lft, rht) -> Node (f x, map f lft, map f rht);;
val map : ('a -> 'b) -> 'a tree -> 'b tree = <fun>
``````

In the OCaml community, as well as in the larger functional programming community, the word polymorphism is used loosely. It is applied to things working in a similar fashion with various types. In this broad sense, several features of OCaml are polymorphic. Each uses a particular form of polymorphism and has a name. In summary, OCaml has several forms of polymorphism. In most cases, the distinction between those concepts is blurred, but it is sometimes necessary to distinguish them.

Here are the terms applicable to data types:

1. `'a list`, `'a option`, and `'a tree` are very often said to be polymorphic types. Formally, `bool list` or `int option` are the types, whilst `list` and `option` are type operators that take a type parameter and result in a type. This is a form of parametric polymorphism. `'a list` and `'a option` denote type families, which are all the types created by applying type parameters to the operators.
2. OCaml has something called Polymorphic Variants. Although the types `option`, `list`, and `tree` are variants and polymorphic, they aren't polymorphic variants. They are type-parametrised variants. We stick to this usage and say the variants in this section are polymorphic. OCaml polymorphic variants are covered in another tutorial.

### Records

Records are like tuples in that several values are bundled together. In a tuple, elements are identified by their position in the corresponding product type. They are either first, second, third, or at some other position. In a record, each element has a name and a value. This name-value pair is known as a field. That's why record types must be declared before being used.

For instance, here is the definition of a record type meant to partially represent a Dungeons & Dragons character class. Please note that the following code is dependent upon the definitions earlier in this tutorial. Ensure you have entered the definitions in the Enumerated Data Types section.

``````# type character = {
name : string;
level : int;
race : string;
class_type : character_class;
alignment : firmness * rectitude;
armor_class : int;
};;
type character = {
name : string;
level : int;
race : string;
class_type : character_class;
alignment : firmness * rectitude;
armor_class : int;
}
``````

Values of type `character` carry the same data as inhabitants of this product: `string * int * string * character_class * character_alignment * int`.

Access the fields by using the dot notation, as shown:

``````# let ghorghor_bey = {
name = "Ghôrghôr Bey";
level = 17;
race = "half-ogre";
class_type = Fighter;
alignment = (Chaotic, R_Neutral);
armor_class = -8;
};;
val ghorghor_bey : character =
{name = "Ghôrghôr Bey"; level = 17; race = "half-ogre";
class_type = Fighter; alignment = (Chaotic, R_Neutral); armor_class = -8}

# ghorghor_bey.alignment;;
- : firmness * rectitude = (Chaotic, R_Neutral)

# ghorghor_bey.class_type;;
- : character_class = Fighter

# ghorghor_bey.level;;
- : int = 17
``````

Note that records behave like single constructor variants. That allows pattern matching on them.

``````# match ghorghor_bey with { level; _ } -> level;;
- : int = 17
``````

### Type Aliases

Just like values, any type can be given a name.

``````# type latitude_longitude = float * float;;
type latitude_longitude = float * float
``````

This is mostly useful as a means of documentation or to shorten long type expressions.

## A Complete Example: Mathematical Expressions

This example shows how to represent simple mathematical expressions like `n * (x + y)` and multiply them out symbolically to get `n * x + n * y`:

``````# type expr =
| Plus of expr * expr        (* a + b *)
| Minus of expr * expr       (* a - b *)
| Times of expr * expr       (* a * b *)
| Divide of expr * expr      (* a / b *)
| Var of string              (* "x", "y", etc. *);;
type expr =
Plus of expr * expr
| Minus of expr * expr
| Times of expr * expr
| Divide of expr * expr
| Var of string
``````

The expression `n * (x + y)` would be written:

``````# let e = Times (Var "n", Plus (Var "x", Var "y"));;
val e : expr = Times (Var "n", Plus (Var "x", Var "y"))
``````

Here is a function that prints out `Times (Var "n", Plus (Var "x", Var "y"))` as something more like `n * (x + y)`:

``````# let rec to_string = function
| Plus (e1, e2) -> "(" ^ to_string e1 ^ " + " ^ to_string e2 ^ ")"
| Minus (e1, e2) -> "(" ^ to_string e1 ^ " - " ^ to_string e2 ^ ")"
| Times (e1, e2) -> "(" ^ to_string e1 ^ " * " ^ to_string e2 ^ ")"
| Divide (e1, e2) -> "(" ^ to_string e1 ^ " / " ^ to_string e2 ^ ")"
| Var v -> v;;
val to_string : expr -> string = <fun>
``````

We can write a function to multiply out expressions of the form `n * (x + y)` or `(x + y) * n`, and for this we will use a nested pattern:

``````# let rec distrib = function
| Times (e1, Plus (e2, e3)) ->
Plus (Times (distrib e1, distrib e2),
Times (distrib e1, distrib e3))
| Times (Plus (e1, e2), e3) ->
Plus (Times (distrib e1, distrib e3),
Times (distrib e2, distrib e3))
| Plus (e1, e2) -> Plus (distrib e1, distrib e2)
| Minus (e1, e2) -> Minus (distrib e1, distrib e2)
| Times (e1, e2) -> Times (distrib e1, distrib e2)
| Divide (e1, e2) -> Divide (distrib e1, distrib e2)
| Var v -> Var v;;
val distrib : expr -> expr = <fun>
``````

This is how it can be used:

``````# e |> distrib |> to_string |> print_endline;;
((n * x) + (n * y))
- : unit = ()
``````

The first two patterns hold the key to how the `distrib` function works. The first pattern is `Times (e1, Plus (e2, e3))`, which matches expressions of the form `e1 * (e2 + e3)`. The right-hand side of this first pattern is equivalent to `(e1 * e2) + (e1 * e3)`. The second pattern does the same thing, except for expressions of the form `(e1 + e2) * e3`.

The remaining patterns don't change the expressions' form, but they call the `distrib` function recursively on their subexpressions. This ensures that all its subexpressions get multiplied out too. (If you only wanted to multiply out the very top level of an expression, you could replace all the remaining patterns with a simple `e -> e` rule.)

The reverse operation, i.e., factorising out common subexpressions, can be implemented in a similar fashion. The following version only works for the top-level expression.

``````# let top_factorise = function
| Plus (Times (e1, e2), Times (e3, e4)) when e1 = e3 ->
Times (e1, Plus (e2, e4))
| Plus (Times (e1, e2), Times (e3, e4)) when e2 = e4 ->
Times (Plus (e1, e3), e4)
| e -> e;;
val top_factorise : expr -> expr = <fun>
# top_factorise (Plus (Times (Var "n", Var "x"),
Times (Var "n", Var "y")));;
- : expr = Times (Var "n", Plus (Var "x", Var "y"))
``````

The factorise function above introduces another feature: guards to each pattern. The conditional follows the `when`, and it means that the return code is executed only if the pattern matches and the condition in the `when` clause is satisfied.

## Conclusion

This tutorial has provided a comprehensive overview of OCaml's basic data types and their usage. We have explored the built-in types, such as integers, floats, characters, lists, tuples, and strings, and the user-defined types: records and variants. Records and tuples are mechanisms for grouping heterogeneous data into cohesive units. Variants are a mechanism for exposing heterogeneous data as coherent alternatives.

Going further, you can explore several advanced topics related to OCaml's data types to deepen your understanding and enhance your programming skills.

As of writing this, these topics will be covered in forthcoming tutorials. Documentation on them is available in the OCaml Language Manual and in the books on OCaml.

## Help Improve Our Documentation

All OCaml docs are open source. See something that's wrong or unclear? Submit a pull request.

Contribute