package cachet-solo5
Authors
Maintainers
Sources
sha256=7cf3d609523592516ee5570c106756168d9dca264412a0ef4085d9864c53cbad
sha512=6f05a5fb19324df71ff9c3067a7c17a7a248d431417551169e4ca5aa8b177b6f902bfec73f0ee907443fe01dd9153c6b3ec97fbf0f325d1bcfcb28f7a2501adf
Description
A small library that provides a simple cache system for page-by-page read access on a block device.
Published: 13 Jan 2025
README
Cachet, a simple cache system for mmap
Cachet is a small library that provides a simple cache system for page-by-page read access on a block device. The cache system requires a map function, which can correspond to Unix.map_file.
Here's a simple example using Unix.map_file:
let shared = true
let empty = Bigarray.Array1.create Bigarray.char Bigarray.c_layout 0

let map fd ~pos len =
  let stat = Unix.fstat fd in
  let len = Int.min len (stat.Unix.st_size - pos) in
  if pos < stat.Unix.st_size
  then
    let barr =
      Unix.map_file fd ~pos:(Int64.of_int pos)
        Bigarray.char Bigarray.c_layout shared [| len |] in
    Bigarray.array1_of_genarray barr
  else empty

external getpagesize : unit -> int = "unix_getpagesize" [@@noalloc]

let () =
  let fd = Unix.openfile "disk.img" Unix.[ O_RDONLY ] 0o644 in
  let finally () = Unix.close fd in
  Fun.protect ~finally @@ fun () ->
  let cache = Cachet.make ~pagesize:(getpagesize ()) ~map fd in
  let seq = Cachet.get_seq cache 0 in
  ...
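The map function above only has to return the bytes available at a given position, clamped to the end of the device. As a minimal sketch of that contract, here is a map with the same shape backed by an in-memory bigarray instead of a file descriptor. The names image_of_string and the in-memory "image" are hypothetical and only serve the illustration; whether such a map suits a given use of Cachet is an assumption to check against the Cachet documentation.

```ocaml
type bigstring =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

let empty : bigstring =
  Bigarray.Array1.create Bigarray.char Bigarray.c_layout 0

(* Hypothetical helper: copy a string into a fresh bigarray "image". *)
let image_of_string (s : string) : bigstring =
  let ba =
    Bigarray.Array1.create Bigarray.char Bigarray.c_layout (String.length s) in
  String.iteri (fun i c -> ba.{i} <- c) s;
  ba

(* Same shape as the file-backed [map]: return at most [len] bytes starting
   at [pos], fewer near the end of the image, and [empty] past the end. *)
let map (image : bigstring) ~pos len : bigstring =
  let size = Bigarray.Array1.dim image in
  if pos < 0 || pos >= size then empty
  else Bigarray.Array1.sub image pos (Int.max 0 (Int.min len (size - pos)))
```

Note that Bigarray.Array1.sub returns a view that shares memory with the image rather than a copy, which is close in spirit to what a shared memory mapping gives you.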
Cachet and schedulers
Cachet is designed to treat the map function as atomic, that is, as a unit of work that is indivisible and guaranteed to execute as a single, coherent, uninterrupted operation. Therefore, the load function (used to load a page) cannot be made more cooperative (giving other tasks the opportunity to run) than it already is.
Using Cachet with a scheduler requires addressing two issues:
- enabling cooperation after a page has been loaded
- loading the page in parallel, so that other tasks can run while it is being read
For the first point, with Lwt or Async, it is essentially a matter of adding Lwt.pause or Async.yield after calling Cachet.load (or the user-friendly functions):
open Lwt.Syntax (* provides let* *)

let () = Lwt_main.run begin
  let page = Cachet.load cache 0xdead in
  let* () = Lwt.pause () in
  ... end
For the second point, only OCaml 5 and effects can address the issue: the map function performs an effect that notifies the scheduler to read the page in parallel.
(* see [man 3 pread] *)
let map fd ~pos len = Effect.perform (Scheduler.Pread (fd, pos, len))

let () = Scheduler.run begin fun () ->
  let fd = Unix.openfile "disk.img" Unix.[ O_RDONLY ] 0o644 in
  let finally () = Unix.close fd in
  Fun.protect ~finally @@ fun () ->
  let cache = Cachet.make ~pagesize:(getpagesize ()) ~map fd in
  let page = Cachet.load cache 0xdead in
  ...
end
Note that this is only effective if the page is actually read in parallel. If it is not, adding a cooperation point, as you would with Lwt/Async, is enough. Reading a page remains atomic, so allowing other tasks to run during the read implies that the read itself must be done in parallel (via a Thread or a Domain).
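The Scheduler module used in the effect-based example is not defined in this README. As a rough sketch of the plumbing only, the following toy handler (OCaml 5, standard library only) declares a hypothetical Pread effect and a run function that services it. It reads from an in-memory bytes value and answers synchronously, so it does not provide any parallelism; a real scheduler would hand the read to a thread or a domain and resume the continuation once the data is available.

```ocaml
(* Requires OCaml 5: Effect is part of the standard library. *)

(* Hypothetical effect: ask for [len] bytes at [pos] of [src]. *)
type _ Effect.t += Pread : bytes * int * int -> bytes Effect.t

(* Same shape as [map], over [bytes] to keep the sketch self-contained. *)
let map src ~pos len = Effect.perform (Pread (src, pos, len))

(* Toy "scheduler": handles [Pread] immediately and resumes the task.
   This is NOT parallel; it only shows where a scheduler would step in. *)
let run (fn : unit -> 'a) : 'a =
  let open Effect.Deep in
  try_with fn ()
    { effc = (fun (type b) (eff : b Effect.t) ->
        match eff with
        | Pread (src, pos, len) ->
          Some (fun (k : (b, _) continuation) ->
            let size = Bytes.length src in
            let data =
              if pos < 0 || pos >= size then Bytes.empty
              else Bytes.sub src pos (Int.max 0 (Int.min len (size - pos))) in
            continue k data)
        | _ -> None) }
```

For example, run (fun () -> map (Bytes.of_string "abcdefghij") ~pos:8 4) yields the two remaining bytes, since the read is clamped to the end of the source.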
Finally, the Cachet documentation specifies how many pages need to be read to obtain the requested value. It is therefore up to the user to decide where cooperation points should be placed, and whether it makes sense to use, for example, get_string or to use load interspersed with cooperation points.
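To get a feel for how many pages a read can span, the arithmetic can be sketched as follows. This is general page arithmetic, not Cachet's implementation, and pages_touched is a hypothetical helper; the Cachet documentation remains the reference for what each function actually loads.

```ocaml
(* Number of pages overlapped by a read of [len] bytes starting at byte
   offset [off], for pages of [pagesize] bytes. Hypothetical helper. *)
let pages_touched ~pagesize ~off ~len =
  if pagesize <= 0 || off < 0 || len < 0 then invalid_arg "pages_touched";
  if len = 0 then 0
  else
    let first = off / pagesize in              (* page holding the first byte *)
    let last = (off + len - 1) / pagesize in   (* page holding the last byte *)
    last - first + 1
```

A 2-byte read that straddles a page boundary touches two pages, so a user who wants a cooperation point between page loads may prefer explicit load calls over a single higher-level read in that case.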
Dependencies (4)
- mirage-solo5 >= "0.7.0"
- cachet = version
- dune >= "3.5.0"
- ocaml >= "4.14.0"
Dev Dependencies (1)
- alcotest with-test & >= "1.8.0"
Used by
None
Conflicts
None