package SZXX
val sexp_of_location : location -> Sexplib0.Sexp.t
type 'a cell_parser = {
string : location -> Base.string -> 'a;
formula : location -> formula:Base.string -> Base.string -> 'a;
error : location -> formula:Base.string -> Base.string -> 'a;
boolean : location -> Base.string -> 'a; (* "1" for true *)
number : location -> Base.string -> 'a; (* May contain a decimal part *)
date : location -> Base.string -> 'a; (* ISO-8601 format *)
null : 'a;
}
A cell parser converts from XLSX types to your own data type (usually a variant). Use SZXX.Xlsx.string_cell_parser or SZXX.Xlsx.yojson_cell_parser to get started quickly, then make your own.
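As an illustration of "making your own", here is a minimal sketch of a cell parser that targets a custom variant. The `cell` type and the name `my_cell_parser` are hypothetical and not part of SZXX. The sketch also assumes that the unlabelled string passed to `formula` and `error` is the cell's displayed value, which this section does not state explicitly.

```ocaml
(* Hypothetical target type for this sketch; not part of SZXX. *)
type cell =
  | Text of string
  | Number of float
  | Bool of bool
  | Date of string (* kept as the ISO-8601 string *)
  | Formula of { formula : string; value : string }
  | Cell_error of string
  | Empty

let my_cell_parser : cell SZXX.Xlsx.cell_parser =
  {
    SZXX.Xlsx.string = (fun _location s -> Text s);
    (* Assumption: the unlabelled argument is the cell's displayed value. *)
    formula = (fun _location ~formula value -> Formula { formula; value });
    error = (fun _location ~formula:_ value -> Cell_error value);
    (* Booleans arrive as "1" for true. *)
    boolean = (fun _location s -> Bool (String.equal s "1"));
    (* Numbers arrive as strings and may contain a decimal part;
       fall back to Text if the string does not parse as a float. *)
    number =
      (fun _location s ->
        match float_of_string_opt s with
        | Some f -> Number f
        | None -> Text s);
    date = (fun _location s -> Date s);
    null = Empty;
  }
```

Every callback ignores the location here; a real parser could use it to report where a malformed value was found.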
val sexp_of_row : ('a -> Sexplib0.Sexp.t) -> 'a row -> Sexplib0.Sexp.t
val string_cell_parser : Base.string cell_parser
Convenience cell_parser to convert from XLSX types to String
val yojson_cell_parser :
[> `Bool of Base.bool
| `Float of Base.float
| `String of Base.string
| `Null ]
cell_parser
Convenience cell_parser to convert from XLSX types to JSON
XLSX dates are stored as floats. Convert from a float to a Ptime.date
XLSX datetimes are stored as floats. Convert from a float to a Ptime.t
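To make the float representation concrete, the following sketch converts an XLSX serial number to Unix seconds. It is not the library's implementation. It assumes the common 1900 date system, in which serial 25569 corresponds to 1970-01-01, and the names `xlsx_epoch_offset_days` and `unix_seconds_of_xlsx_serial` are hypothetical.

```ocaml
(* Assumption: 1900 date system, where serial 25569 is 1970-01-01. *)
let xlsx_epoch_offset_days = 25569.

(* The integer part of a serial counts days and the fractional part
   is the time of day, so one unit is 86,400 seconds. *)
let unix_seconds_of_xlsx_serial (serial : float) : float =
  (serial -. xlsx_epoch_offset_days) *. 86_400.
```

Workbooks that use the 1904 date system would need a different offset.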
Convert from a column reference such as "D7" or "AA2" to a 0-based column index
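The conversion can be sketched from scratch as a base-26 reading of the leading letters. This is an illustration of the idea, not the library's code, and the name `column_index_of_ref` is hypothetical.

```ocaml
(* Read the leading uppercase letters as a bijective base-26 number
   (A = 1 ... Z = 26), then subtract 1 to get a 0-based index.
   Returns -1 if the reference has no leading uppercase letter. *)
let column_index_of_ref (cell_ref : string) : int =
  let len = String.length cell_ref in
  let rec go i acc =
    if i < len then
      match cell_ref.[i] with
      | 'A' .. 'Z' as c ->
        go (i + 1) ((acc * 26) + Char.code c - Char.code 'A' + 1)
      | _ -> acc (* stop at the first non-letter, i.e. the row number *)
    else acc
  in
  go 0 0 - 1
```

With this sketch, "D7" maps to 3 and "AA2" maps to 26.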
val stream_rows_double_pass :
?filter_sheets:(sheet_id:Base.int -> raw_size:Base.int64 -> Base.bool) ->
sw:Eio.Std.Switch.t ->
_ Eio.File.ro ->
'a cell_parser ->
'a row Base.Sequence.t
Stream parsed rows from an XLSX file. This function is GUARANTEED to run in constant memory, without buffering.
SZXX.Xlsx.stream_rows_double_pass ?filter_sheets ~sw file cell_parser
filter_sheets: Default: all sheets. Sheet IDs start at 1. Note: sheet IDs do not necessarily match the order of the sheets in Excel.
sw: A regular Eio.Switch.t
file: A file opened with Eio.Path.open_in or Eio.Path.with_open_in. If your XLSX document is not a file (e.g. an HTTP transfer), then use SZXX.Xlsx.stream_rows_single_pass.
cell_parser: A cell parser converts from XLSX types to your own data type (usually a variant). Use SZXX.Xlsx.string_cell_parser or SZXX.Xlsx.yojson_cell_parser to get started quickly, then make your own.
SZXX will wait for you to consume rows from the Sequence before extracting more.
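A minimal usage sketch, assuming the program runs under `Eio_main.run` and that a workbook exists at the hypothetical path `example.xlsx`. It keeps only the sheet whose ID is 1 and prints each row as an S-expression using `sexp_of_row` from this module.

```ocaml
let () =
  Eio_main.run @@ fun env ->
  Eio.Switch.run @@ fun sw ->
  (* "example.xlsx" is a placeholder path for this sketch. *)
  let path = Eio.Path.(Eio.Stdenv.fs env / "example.xlsx") in
  Eio.Path.with_open_in path @@ fun file ->
  SZXX.Xlsx.stream_rows_double_pass
    ~filter_sheets:(fun ~sheet_id ~raw_size:_ -> sheet_id = 1)
    ~sw file SZXX.Xlsx.string_cell_parser
  (* Rows are extracted lazily, as the Sequence is consumed. *)
  |> Base.Sequence.iter ~f:(fun row ->
       SZXX.Xlsx.sexp_of_row Base.sexp_of_string row
       |> Sexplib0.Sexp.to_string
       |> print_endline)
```

The Sequence is consumed inside the scope of both the switch and the open file, so the file is still available while rows are being extracted.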
val stream_rows_single_pass :
?max_buffering:Base.int ->
?filter:(Xml.DOM.element row -> Base.bool) ->
?filter_sheets:(sheet_id:Base.int -> raw_size:Base.int64 -> Base.bool) ->
sw:Eio.Std.Switch.t ->
feed:Feed.t ->
'a cell_parser ->
'a row Base.Sequence.t
Stream parsed rows from an XLSX document. This function will only buffer rows encountered before the SST (see README.md). Consider using SZXX.Xlsx.stream_rows_double_pass if your XLSX is stored as a file.
SZXX.Xlsx.stream_rows_single_pass ?max_buffering ?filter ?filter_sheets ~sw ~feed cell_parser
max_buffering: Default: unlimited. Sets a limit on the number of rows that may be buffered. Raises an exception if it runs out of buffer space before reaching the SST.
filter: Use this filter to drop uninteresting rows and reduce the number of rows that must be buffered. If necessary, use SZXX.Xlsx.Expert.parse_row_without_sst to access cell-level data. This function is called on every row of every sheet (unless ?filter_sheets excludes some sheets from extraction).
filter_sheets: Default: all sheets. Sheet IDs start at 1. Note: sheet IDs do not necessarily match the order of the sheets in Excel.
sw: A regular Eio.Switch.t
feed: A producer of raw input data. Create a feed by using the SZXX.Feed module.
cell_parser: A cell parser converts from XLSX types to your own data type (usually a variant). Use SZXX.Xlsx.string_cell_parser or SZXX.Xlsx.yojson_cell_parser to get started quickly, then make your own.
As much as possible, SZXX will wait for you to consume rows from the Sequence before extracting more.
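A hedged sketch of a single-pass call. The constructors of SZXX.Feed are not shown in this section, so the hypothetical function `print_first_sheet` takes an already-built `feed` as an argument. The buffering limit of 10,000 rows is an arbitrary value chosen for illustration.

```ocaml
(* [feed] is built elsewhere with the SZXX.Feed module. *)
let print_first_sheet ~sw ~feed =
  SZXX.Xlsx.stream_rows_single_pass
    (* Arbitrary cap: raise rather than buffer more than 10,000 rows
       while waiting for the SST. *)
    ~max_buffering:10_000
    ~filter_sheets:(fun ~sheet_id ~raw_size:_ -> sheet_id = 1)
    ~sw ~feed SZXX.Xlsx.string_cell_parser
  |> Base.Sequence.iter ~f:(fun row ->
       SZXX.Xlsx.sexp_of_row Base.sexp_of_string row
       |> Sexplib0.Sexp.to_string
       |> print_endline)
```

Restricting extraction to one sheet also reduces the number of rows that may need to be buffered before the SST is reached.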
module Expert : sig ... end