Programming languages matter. They affect the reliability, security, and efficiency of the code you write, as well as how easy it is to read, refactor, and extend. The languages you know can also change how you think, influencing the way you design software even when you’re not using them.
OCaml mixes power and pragmatism in a way that makes it ideal for building complex software systems. What makes OCaml special is that it occupies a sweet spot in programming language design. It provides a combination of efficiency, expressiveness, and practicality that is matched by no other language. That is in large part because OCaml is an elegant combination of language features that have been developed over the last 40 years. These include:
- Generational garbage collection for automatic memory management.
- Static type-checking to increase performance and reduce the number of runtime errors, as found in Java and C#.
- Parametric polymorphism, which enables the construction of abstractions that work across different data types, similar to generics in Java and C# and templates in C++.
- Good support for immutable programming, i.e., programming without making destructive updates to data structures. This is present in traditional functional languages like Scheme, and it's also found in distributed, big-data frameworks like Hadoop.
- Type inference, so you don’t need to annotate every function parameter, return type, and variable. Instead, types are inferred based on how a value is used. Available in a limited form in C# with implicitly-typed local variables and in C++11 with its auto keyword.
- Algebraic data types and pattern matching to define and manipulate complex data structures, also available in Scala and F#.
There is something transformative about having all these features together and able to interact in a single language. Despite their importance, these ideas have made only limited inroads into mainstream languages, and when they do arrive there, like first-class functions in C# or parametric polymorphism in Java, it’s typically in a limited and awkward form. The only languages that completely embody these ideas are statically-typed, functional programming languages like OCaml, F#, Haskell, Scala, Rust, and Standard ML.
Among this worthy set of languages, OCaml stands apart because it manages to provide a great deal of power while remaining highly pragmatic. The compiler has a straightforward compilation strategy that produces performant code without requiring heavy optimisation and without the complexities of dynamic just-in-time (JIT) compilation. This, along with OCaml’s strict evaluation model, makes runtime behavior easy to predict. The garbage collector is incremental, (letting you avoid large GC-related pauses) and precise, meaning it will collect all unreferenced data (unlike many reference-counting collectors). Plus, the runtime is simple and highly portable.
All of this makes OCaml a great choice for programmers who want to step up to a better programming language, and at the same time get practical work done.
A Brief History
OCaml was written in 1996 by Xavier Leroy, Jérôme Vouillon, Damien Doligez, and Didier Rémy at INRIA in France. It was inspired by a long line of research into ML, starting in the 1960s, and continues to have deep links to the academic community.
ML was originally the meta language of the LCF (Logic for Computable Functions) proof assistant, released by Robin Milner in 1972 (at Stanford, and later at Cambridge). ML was turned into a compiler in order to make it easier to use LCF on different machines, and it was gradually turned into a full-fledged system of its own by the 1980s.
The first implementation of Caml appeared in 1987. Ascánder Suárez created it as part of the Formel project at INRIA, headed by Gérard Huet. Later, Pierre Weis and Michel Mauny continued developing it. In 1990, Xavier Leroy and Damien Doligez built a new implementation called Caml Light that they based on a bytecode interpreter with a fast, sequential garbage collector. Over the next few years, useful libraries appeared, such as Michel Mauny’s syntax manipulation tools, and this helped promote the use of Caml in education and research teams.
Xavier Leroy continued extending Caml Light with new features, which resulted in the 1995 release of Caml Special Light. This improved the executable efficiency significantly by adding a fast native code compiler that made Caml’s performance competitive with mainstream languages such as C++. A module system inspired by Standard ML also provided powerful facilities for abstraction and made larger-scale programs easier to construct.
The modern OCaml emerged in 1996, when Didier Rémy and Jérôme Vouillon implemented a powerful and elegant object system. This object system was notable for supporting many common object-oriented idioms in a statically type-safe way, whereas the same idioms required runtime checks in languages such as C++ or Java. In 2000, Jacques Garrigue extended OCaml with several new features such as polymorphic methods and variants, as well as labeled and optional arguments.
The last two decades have seen OCaml attract a significant user base, and language improvements have been steadily added to support the growing commercial and academic codebases. By 2012, the OCaml 4.0 release had added Generalised Algebraic Data Types (GADTs) and first-class modules to increase the flexibility of the language. Since then, OCaml has had a steady, yearly release cadence, and OCaml 5.0 with Multicore support did release in December 2022. There is also fast, native-code support for the latest CPU architectures, such as x86_64, ARM, RISC-V, and PowerPC, making OCaml a good choice for systems where resource usage, predictability, and performance all matter.
- Separate compilation of standalone applications: Portable bytecode compilers make it possible to create stand-alone applications out of OCaml programs. For example, OCaml code can interact with C code via a foreign function interface when necessary, which has allowed it to be used commercially to prove the absence of runtime errors in safety-critical software for the Airbus A340 family.
- A sophisticated module system: You can think of OCaml as being divided into two parts: a core language, concerned with values and types, and a module language that's concerned with modules and module signatures. OCaml contains an incredibly powerful module system. For instance, it allows you to parameterise one module over another. This makes it possible to concisely and safely build up layers of abstraction in large pieces of software.
- Object-oriented programming: OCaml allows for writing programs in an object-oriented style. In keeping with the language’s philosophy, the object-oriented layer obeys the "strong typing" paradigm, which makes it impossible to send a message to an object that cannot answer it. Such safety does not come at the cost of expressiveness, and thanks to features such as multiple inheritance and parametric classes, some of the most complex design patterns can be expressed in a natural manner.
There are several different methods available for debugging OCaml programs. The interactive REPL offers an
elementary, yet fast and simple method to test functions. For more complex cases, the interactive system
provides a cheap way of "tracing" computations. Finally, OCaml has an extremely powerful tool for
following a program's execution called
ocamldebug, it is a symbolic (i.e. source level) replay debugger. It enables the user to stop a program at any time to scrutinise the value of variables and the stack of calling functions. It even allows going back into the past to resume execution at a particular point of interest.
- Efficient compiler, efficient compiled code: OCaml offers two batch compilers: a bytecode compiler and a native code compiler. Bytecode compilers quickly generate small, portable executables. The native code compiler is slower, but it produces more efficient machine code, whose performance meets the highest standards of modern compilers.
(This page is adapted from the "Why OCaml" section of Real World OCaml)