OCaml Outreachy Internships
This is a record of all past OCaml Community Outreachy Internship Projects.
Winter 2023
Develop a Geometric Creative Coding Library for OCaml
OCaml is an industrial-strength functional programming language that's been around for nearly three decades.
While functional programming itself is not new, it has not dominated mainstream programming languages. Recently, more mainstream languages have been adopting concepts from functional programming. Now more than ever is a good time to have various types of learning material pertaining to functional programming.
Creative coding is a type of computer programming that focuses on generating artistic, expressive, and creative outputs using software and digital tools. It has applications in areas such as game development. Above all, it is a great pedagogical tool because it gives learners visual output.
Joy is a tiny creative coding library in Python. Joy builds heavily on functional programming concepts with very little reference to Python syntax.
This project aims to implement a geometric creative coding library in OCaml. It is heavily inspired by Joy. When done, it will serve as a means to do geometric creative coding in OCaml.
Implement a Dark Mode for OCaml.org
OCaml is a powerful, statically-typed programming language known for its efficiency and expressiveness. OCaml.org serves as the central hub for the OCaml community, providing resources, documentation, and news. In today's digital age, users expect a more personalised and comfortable web experience. One such expectation is the availability of a dark mode, which has become a popular feature on websites and applications. This project outlines the plan to implement a dark mode for OCaml.org, enhancing user experience and modernising the platform. As OCaml continues gaining traction in various industries, it is essential to modernise its online presence to meet users' expectations worldwide.
The styles and colors for light mode already exist, so implementing a dark mode will involve adding contrasting colors and styles according to the Figma design. The work will also take accessibility standards into account and add a button that toggles between light and dark mode.
Improve the GUI Experience for OCaml Users
Inspired by Rust's "Are we GUI yet?", we want the same work done for the OCaml GUI libraries. Similar work has been done for the OCaml web libraries: "Is OCaml web yet?" (see the pull request). This work would make it possible to tackle "Are we game yet?" in the future.
The survey must take into account the targeted platforms of these libraries, dependencies, (in)compatibilities, features, last updates, etc. A list is available in OCamlVerse but is not complete or detailed enough. Interns having previous knowledge of GUI libraries available in other languages can also compare them to the equivalent OCaml libraries.
This work must result in a guide on OCaml.org, similar to the "Is OCaml web yet?" page.
Summer 2023
Persistent Storage in MirageOS Unikernels
Every operating system, even a unikernel, needs a way to persist data across reboots. Persistent storage is therefore a feature worth including in MirageOS. Developing it involves building libraries for partitioning disks, filesystems for these partitions, and a simple, intuitive, and programmatic way to interact with these storage devices from a user's point of view. This project pushes this vision one step further by building a library for GPT partitioning.
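To give a flavour of what a GPT library has to deal with, here is a minimal sketch that checks whether a raw disk image carries a GPT header. It relies only on the published layout (the header sits in the second sector and starts with the signature "EFI PART", with the header size stored little-endian at offset 12). The 512-byte sector size and the function names are assumptions made for illustration; this is not the API of the library built during the internship.

```ocaml
(* Assumption: 512-byte logical sectors. Real disks may use 4096. *)
let sector_size = 512

(* The GPT header starts with this 8-byte signature. *)
let gpt_signature = "EFI PART"

(* [has_gpt_header disk] is true when the second sector (LBA 1) of the raw
   image [disk] begins with the GPT signature. *)
let has_gpt_header disk =
  let off = sector_size in
  let len = String.length gpt_signature in
  Bytes.length disk >= off + len
  && Bytes.sub_string disk off len = gpt_signature

(* [header_size disk] reads the 32-bit little-endian header size stored at
   offset 12 of the header, if a header is present. *)
let header_size disk =
  if has_gpt_header disk && Bytes.length disk >= sector_size + 16 then
    Some (Int32.to_int (Bytes.get_int32_le disk (sector_size + 12)))
  else None
```

A complete implementation would also verify the header CRC32 and parse the partition entry array; both are left out of this sketch.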
Improving Error Reporting in Existing PPXLIB-Based PPXs
In the past, when ppxlib encountered an exception in a transformation, it stopped the rewriting process, so the rewriters after it were never run. In addition, when several rewriters failed, their errors could not be reported together: only the first rewriter to raise was reported, and compilation stopped there. Now, a raising rewriter no longer stops the remaining rewriters from running, so multiple errors can be reported, both in the context-free phase and in all the other phases.
MIDI Over Ethernet With MirageOS
MIDI, which stands for Musical Instrument Digital Interface, is a widely used protocol in the world of music and audio technology. MirageOS is a library operating system, written in OCaml, that specialises in creating lightweight, secure, and efficient unikernels. Unikernels are highly specialised, single-purpose virtual machine images designed for specific applications. The project focussed on implementing the rtpMIDI protocol to serialise and deserialise MIDI messages over Ethernet, and on use cases such as a publisher-subscriber, server-client model for MIDI messages.
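As an illustration of what serialising and deserialising a MIDI message means, the sketch below encodes and decodes two basic channel messages (note on and note off) in their standard 3-byte form: a status byte carrying the message kind and channel, followed by two 7-bit data bytes. The type and function names are hypothetical, and the rtpMIDI framing that wraps such messages for transport is not shown.

```ocaml
(* Hypothetical type: only two channel-voice messages are modelled here. *)
type message =
  | Note_on of { channel : int; note : int; velocity : int }
  | Note_off of { channel : int; note : int; velocity : int }

(* Serialise a message to its 3-byte wire form: a status byte (message kind
   in the high nibble, channel in the low nibble), then two data bytes. *)
let encode msg =
  let status, channel, note, velocity =
    match msg with
    | Note_on { channel; note; velocity } -> (0x90, channel, note, velocity)
    | Note_off { channel; note; velocity } -> (0x80, channel, note, velocity)
  in
  let b = Bytes.create 3 in
  Bytes.set_uint8 b 0 (status lor (channel land 0x0F));
  Bytes.set_uint8 b 1 (note land 0x7F);
  Bytes.set_uint8 b 2 (velocity land 0x7F);
  b

(* Parse 3 bytes back into a message. Returns None for anything that is not
   a well-formed note on or note off message. *)
let decode b =
  if Bytes.length b <> 3 then None
  else
    let status = Bytes.get_uint8 b 0 in
    let channel = status land 0x0F in
    let note = Bytes.get_uint8 b 1 in
    let velocity = Bytes.get_uint8 b 2 in
    if note > 0x7F || velocity > 0x7F then None
    else
      match status land 0xF0 with
      | 0x90 -> Some (Note_on { channel; note; velocity })
      | 0x80 -> Some (Note_off { channel; note; velocity })
      | _ -> None
```

For example, a note on for middle C (note 60) at full velocity on the first channel encodes to the bytes 0x90 0x3C 0x7F, and decoding those bytes yields the original message.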
Winter 2022
Implement a Non-Blocking, Streaming Codec for TopoJSON
TopoJSON is an extension to GeoJSON that encodes topology. This allows redundant data to be removed and file sizes to be greatly reduced, which is often very desirable, especially when working with data in the browser. In a previous Outreachy internship, a new OCaml library for TopoJSON was implemented. This project will build on it by adding more functionality to the library and providing a non-blocking, streaming codec version similar to the geojsone library.
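One of the ways TopoJSON reduces file size is quantisation: the positions along an arc are stored as integer deltas from the previous position, and a transform (a scale and a translation) maps them back to real coordinates. The sketch below decodes such an arc. The type and function names are illustrative assumptions, not the interface of the OCaml TopoJSON library.

```ocaml
(* Hypothetical record mirroring the scale and translate pairs of a
   TopoJSON transform. *)
type transform = { scale : float * float; translate : float * float }

(* [decode_arc tr deltas] turns a delta-encoded, quantised arc into absolute
   coordinates: a running sum rebuilds each integer position, which is then
   scaled and translated. *)
let decode_arc { scale = (sx, sy); translate = (tx, ty) } deltas =
  let _, points =
    List.fold_left
      (fun ((x, y), acc) (dx, dy) ->
        let x = x + dx and y = y + dy in
        let point = (float_of_int x *. sx +. tx, float_of_int y *. sy +. ty) in
        ((x, y), point :: acc))
      ((0, 0), [])
      deltas
  in
  List.rev points
```

With a scale of (0.5, 2.0) and a translation of (10.0, 20.0), the deltas (4, 1), (2, -1), (-6, 0) decode to the points (12.0, 22.0), (13.0, 20.0), (10.0, 20.0).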
Summer 2022
Expand OCaml 5.0 Parallel Benchmark Suite
OCaml 5.0 will be live soon! It ships with the support for shared-memory parallelism and concurrency that OCaml has missed all these years. This will be accompanied by a robust set of Multicore libraries useful for parallel programming. The Multicore compiler and libraries are under active development and will continue to evolve as the OCaml ecosystem moves towards Multicore. To assess the impact of new features in the OCaml compiler and Multicore libraries, we have a set of sequential and parallel benchmarks in our benchmark suite. While the sequential benchmarks contain many real-world applications, a wider set of parallel benchmarks would be useful. This project entails gathering the parallel benchmarks available in various places, such as https://github.com/ckoparkar/ocaml-benchmarks, and making them available in the benchmark suite.
Extend OCaml's GeoJSON Library to Support TopoJSON
TopoJSON is an extension to GeoJSON to encode topology. This allows for redundant data to be removed and file sizes to be greatly reduced. This is often very desirable especially when working with data in the browser. This project looks to extend ocaml-geojson to support TopoJSON.
Winter 2021
Build a Monitoring Dashboard for OCaml.org
We currently have no visibility into the performance of the server serving v3.ocaml.org: which pages are most visited, whether errors happen, etc. To offer some visibility, we can implement a basic monitoring dashboard that would provide metrics such as memory, CPU, and open file descriptors; statistics (check GDPR compliance first!) such as requested URIs, user agents, and language; and logs. This project consists of two main parts: a frontend and a backend. The backend consists of building a high-level library to collect data and compute statistics on it. The frontend will use this library to display graphs of the metrics, statistics, and other data we want to collect.
Improve the OCaml Meta-Programming Ecosystem
It's common for programming languages to provide some way to meta-program in order to preprocess code before reaching the last compilation step, for example, in the form of macros or templates. The OCaml compiler doesn't provide a full built-in macro system, but the OCaml parser does provide syntax for preprocessing purposes: attributes and extension points. We, the OCaml community, also have an official framework called ppxlib for writing preprocessors, called PPXs, based on that syntax and integrating them into the compilation process.
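For readers who have not met this syntax, the following short sketch shows what attributes look like in plain OCaml, using two attributes that the compiler itself understands. Extension points need a PPX to expand them, so that part is shown only in a comment, and the "getenv" rewriter it mentions is a hypothetical example.

```ocaml
(* Attributes attach metadata to a piece of syntax without changing its
   meaning. The compiler itself understands a few built-in ones. *)

(* [@inline] is a built-in hint asking the compiler to inline this function. *)
let[@inline] double x = 2 * x

(* [@@unboxed] is a built-in attribute on a type declaration: the single
   constructor is represented without an extra allocation. *)
type id = Id of int [@@unboxed]

(* Extension points such as [%name payload] are placeholders that a PPX must
   replace before type checking; the compiler rejects any that are left over.
   With a hypothetical rewriter registered under the name "getenv", one
   could write:

     let home = [%getenv "HOME"]

   and the PPX would expand the node into ordinary OCaml code. *)
```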
However, it's on the OCaml community to write and provide important PPXs to the OCaml developers. We've noticed that having the most important PPXs under the official PPX GitHub organisation, next to ppxlib, is helpful. Developers can easily find them; developers can trust them; and they're well-written and hygienic, so developers can use them as how-to guides for writing other PPXs. In this project, you'll write one or more of those official standard PPXs.
Support `.eml` Files in OCaml's VSCode Extension
Dream, the OCaml web framework, uses .eml files to embed HTML in OCaml files. At the moment, opening these files in VSCode with the official OCaml VSCode extension provides no syntax highlighting or diagnostics, because .eml files are not supported. The goal of the project is to add support for the syntax in the extension itself as a first step and, eventually, to add support for the language in the OCaml Language Server (LSP) as a second step.
Summer 2021
Create opam Package Search
Opam is the source-based package manager for OCaml code. This project consists of writing a new web client for rendering output from the opam package database. There is a JSON endpoint on opam.ocaml.org that provides metadata about packages. We can extend this JSON metadata to include all the opam packages (not just the top 10) and use that to power a search frontend for the website. This may include presenting the data as a GraphQL endpoint, with the frontend querying that endpoint using GraphQL.
Improve the OCaml.org Website
OCaml.org is the main website for OCaml, a functional, typed, high-level programming language. This project revolves around improving the website on multiple different fronts including: layout, accessibility, and content.
Improve the OCaml.org Website
OCaml.org is the main website for OCaml, a functional, typed, high-level programming language. This project revolves around improving the website on multiple different fronts including: layout, accessibility, and content.
Summer 2020
Reducing Global Mutable State in the OCaml Compiler Codebase
Structured Output Format for the OCaml Compiler Messages
The compiler's output messages are written for humans and are difficult for a machine to read, so it is time-consuming for tools to find the warnings, errors, etc., and their origin. By producing a structured output for compiler messages, other tools can more easily interoperate with them and provide tooling on top of the messages.
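As a rough idea of what a structured message could look like, the sketch below models a diagnostic as a record and renders it as a JSON object that another tool could parse. The record fields, the JSON keys, and the function names are all assumptions made for illustration; they do not describe the format the compiler actually emits.

```ocaml
(* Hypothetical shape of one compiler message. *)
type severity = Err | Warn

type diagnostic = {
  severity : severity;
  file : string;
  line : int;
  column : int;
  message : string;
}

(* Render a string as a JSON string literal, escaping quotes, backslashes,
   and control characters. *)
let json_string s =
  let b = Buffer.create (String.length s + 2) in
  Buffer.add_char b '"';
  String.iter
    (function
      | '"' -> Buffer.add_string b "\\\""
      | '\\' -> Buffer.add_string b "\\\\"
      | '\n' -> Buffer.add_string b "\\n"
      | c when Char.code c < 0x20 ->
          Buffer.add_string b (Printf.sprintf "\\u%04x" (Char.code c))
      | c -> Buffer.add_char b c)
    s;
  Buffer.add_char b '"';
  Buffer.contents b

(* Render a diagnostic as a single-line JSON object. *)
let to_json d =
  let severity = match d.severity with Err -> "error" | Warn -> "warning" in
  Printf.sprintf
    "{\"severity\":%s,\"file\":%s,\"line\":%d,\"column\":%d,\"message\":%s}"
    (json_string severity) (json_string d.file) d.line d.column
    (json_string d.message)
```

An editor plugin could then read one such object per message and jump straight to the file, line, and column, instead of pattern-matching on free-form text.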
Summer 2019
Test the OCaml Compiler With Code Coverage Tools
This project aims to improve the compiler testing process using code coverage tools. The core OCaml system has a large test suite, and it would be very useful to see which parts of the system are well tested and which are not. Developers would then be able to see where new tests are needed, and in the process of improving coverage, it may be possible to find and fix undiscovered bugs. This could help make OCaml and its libraries more reliable.
Test the OCaml Compiler With Random Tests and a Reference Interpreter
The aim of this project is to extend an existing test-case generator for the OCaml compiler, using a reference interpreter (existing or newly developed), in order to find bugs in the compiler and fix as many of them as possible.
Summer 2016
Winter 2015
Summer 2014
MirageOS Contributions and Improvements
MirageOS Cloud API Support
MirageOS (see http://xenproject.org/developers/teams/mirage-os.html, http://www.openmirage.org/) is a type-safe unikernel written in OCaml which generates highly specialised "appliance" VMs that run directly on Xen without requiring an intervening kernel. A MirageOS application typically runs via several communicating kernel instances on the cloud. Today these instances are difficult to manage; we would like to explore strategies for managing these distributed computations using common public cloud APIs such as those exposed by Amazon EC2 and Rackspace. First we need to create pure OCaml API bindings for (e.g.) EC2 and Rackspace (purity is needed to ensure portability). These API bindings can then be used to provide operating-system-level abstractions to the unikernels. For example, a traditional VM might hotplug a vCPU; while a MirageOS application would request a "VM create" using the cloud API and "connect" the new instance to the existing network. We should be able to spin up 1000s of "CPUs" by using such APIs in a cluster environment. As well as helping Xen/Mirage, the public cloud API bindings will be very useful to other people in other contexts -- a nice side-effect.