package owl-base
Module Owl_dataframe
Type definition
Packing & unpacking element
val pack_bool : bool -> elt
Pack the boolean value to ``elt`` type.
val pack_int : int -> elt
Pack the int value to ``elt`` type.
val pack_float : float -> elt
Pack the float value to ``elt`` type.
val pack_string : string -> elt
Pack the string value to ``elt`` type.
val unpack_bool : elt -> bool
Unpack ``elt`` type to boolean value.
val unpack_int : elt -> int
Unpack ``elt`` type to int value.
val unpack_float : elt -> float
Unpack ``elt`` type to float value.
val unpack_string : elt -> string
Unpack ``elt`` type to string value.
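The following is a minimal sketch of a pack/unpack round trip for single elements. It assumes the owl-base library is installed and linked (for example via `(libraries owl-base)` in a dune file); the values themselves are made up for illustration.

```ocaml
(* Assumes owl-base is linked; sample values are hypothetical. *)
let packed_int = Owl_dataframe.pack_int 42
let packed_str = Owl_dataframe.pack_string "owl"

(* Unpacking with the matching function recovers the original value. *)
let n = Owl_dataframe.unpack_int packed_int
let s = Owl_dataframe.unpack_string packed_str

let () =
  (* elt_to_str gives a printable form of any element. *)
  print_endline (Owl_dataframe.elt_to_str packed_int);
  print_endline (Owl_dataframe.elt_to_str packed_str)
```

Each `unpack_*` function is meant to be used on an element packed with the matching `pack_*` function.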
Packing & unpacking series
val pack_bool_series : bool array -> series
Pack boolean array to ``series`` type.
val pack_int_series : int array -> series
Pack int array to ``series`` type.
val pack_float_series : float array -> series
Pack float array to ``series`` type.
val pack_string_series : string array -> series
Pack string array to ``series`` type.
val unpack_bool_series : series -> bool array
Unpack ``series`` type to boolean array.
val unpack_int_series : series -> int array
Unpack ``series`` type to int array.
val unpack_float_series : series -> float array
Unpack ``series`` type to float array.
val unpack_string_series : series -> string array
Unpack ``series`` type to string array.
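As a small illustration of the series functions, the sketch below packs two arrays into ``series`` values and unpacks them again. It assumes owl-base is installed and linked; the data is hypothetical.

```ocaml
(* Assumes owl-base is linked; sample data is hypothetical. *)
let ages = Owl_dataframe.pack_int_series [| 20; 25; 30; 35 |]
let salaries = Owl_dataframe.pack_float_series [| 2200.; 2100.; 2500.; 2800. |]

(* Unpack with the function matching the element type of the series. *)
let ages_back = Owl_dataframe.unpack_int_series ages
let salaries_back = Owl_dataframe.unpack_float_series salaries

let () =
  Array.iter (fun a -> Printf.printf "%d " a) ages_back;
  print_newline ()
```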
Obtain properties
val row_num : t -> int
``row_num x`` returns the number of rows in ``x``.
val col_num : t -> int
``col_num x`` returns the number of columns in ``x``.
val shape : t -> int * int
``shape x`` returns the shape of ``x``, i.e. ``(number of rows, number of columns)``.
val numel : t -> int
``numel x`` returns the number of elements in ``x``.
val types : t -> string array
``types x`` returns the string representation of column types.
val get_heads : t -> string array
``get_heads x`` returns the column names of ``x``.
val set_heads : t -> string array -> unit
``set_heads x head_names`` sets ``head_names`` as the column names of ``x``.
val id_to_head : t -> int -> string
``id_to_head x i`` converts column index ``i`` to its corresponding head name.
val head_to_id : t -> string -> int
``head_to_id x head_name`` converts head name ``head_name`` to its corresponding column index.
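The property functions can be tried on a small dataframe, as in the sketch below. It assumes owl-base is installed and linked, and it builds the dataframe with ``make ~data head_names`` as described under Core operations; the column names and values are hypothetical.

```ocaml
(* Assumes owl-base is linked; names and values are hypothetical. *)
let name = Owl_dataframe.pack_string_series [| "Alice"; "Bob"; "Carol"; "David" |]
let age = Owl_dataframe.pack_int_series [| 20; 25; 30; 35 |]
let salary = Owl_dataframe.pack_float_series [| 2200.; 2100.; 2500.; 2800. |]

(* One series per column; head names line up with the series one-to-one. *)
let frame =
  Owl_dataframe.make ~data:[| name; age; salary |] [| "name"; "age"; "salary" |]

let () =
  let rows, cols = Owl_dataframe.shape frame in
  Printf.printf "rows=%d cols=%d\n" rows cols;
  Array.iter print_endline (Owl_dataframe.get_heads frame);
  (* Convert between head names and column indices in both directions. *)
  Printf.printf "age is column %d\n" (Owl_dataframe.head_to_id frame "age");
  Printf.printf "column 2 is %s\n" (Owl_dataframe.id_to_head frame 2)
```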
Basic get and set functions
``set x i j v`` sets the value of element at ``(i,j)`` to ``v``.
``get_by_name x i head_name`` is similar to ``get`` but uses column name.
``set_by_name x i head_name v`` is similar to ``set`` but uses column name.
``get_rows x a`` returns the rows of ``x`` specified in ``a``.
``get_cols x a`` returns the columns of ``x`` specified in ``a``.
``get_col_by_name`` is similar to ``get_col`` but uses column name.
``get_cols_by_name`` is similar to ``get_cols`` but uses column names.
``get_slice s x`` returns a slice of ``x`` defined by ``s``. For more details, please refer to the documentation of ``Owl_dense_ndarray_generic``.
``get_slice_by_name`` is similar to ``get_slice`` but uses column name.
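A short sketch of name-based access follows. It assumes owl-base is installed and linked, that ``get_by_name`` returns an ``elt``, that ``set_by_name`` takes the new value as its last argument, and that ``get_col_by_name`` returns a ``series``; the signatures are not shown on this page, so treat them as assumptions. The data is hypothetical.

```ocaml
(* Assumes owl-base is linked; names and values are hypothetical. *)
let frame =
  Owl_dataframe.make
    ~data:
      [| Owl_dataframe.pack_string_series [| "Alice"; "Bob"; "Carol"; "David" |]
       ; Owl_dataframe.pack_int_series [| 20; 25; 30; 35 |]
       ; Owl_dataframe.pack_float_series [| 2200.; 2100.; 2500.; 2800. |]
      |]
    [| "name"; "age"; "salary" |]

(* Read the element in row 1 of the "salary" column. *)
let before = Owl_dataframe.(unpack_float (get_by_name frame 1 "salary"))

(* Overwrite the same element in place. *)
let () = Owl_dataframe.(set_by_name frame 1 "salary" (pack_float 2300.))
let after = Owl_dataframe.(unpack_float (get_by_name frame 1 "salary"))

(* Fetch a whole column by name and unpack it to a plain array. *)
let ages = Owl_dataframe.(unpack_int_series (get_col_by_name frame "age"))

let () = Printf.printf "salary of row 1: %g -> %g\n" before after
```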
Core operations
``make ~data head_names`` creates a dataframe with an array of series data and corresponding column names. If data is not passed in, the function will return an empty dataframe.
val reset : t -> unit
``reset x`` resets the dataframe ``x`` by emptying all of its series.
``unique x`` removes the duplicates from the dataset and only returns the unique ones.
``sort ~inc x head`` sorts the entries in the dataframe ``x`` according to the specified column by head name ``head``. By default, ``inc`` equals ``true``, indicating increasing order.
val min_i : t -> string -> int
``min_i x head`` returns the row index of the minimum value in the column specified by the ``head`` name.
val max_i : t -> string -> int
``max_i x head`` returns the row index of the maximum value in the column specified by the ``head`` name.
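For instance, ``min_i`` and ``max_i`` can locate the rows holding the extreme values of a column. The sketch below assumes owl-base is installed and linked and uses hypothetical data.

```ocaml
(* Assumes owl-base is linked; names and values are hypothetical. *)
let frame =
  Owl_dataframe.make
    ~data:
      [| Owl_dataframe.pack_string_series [| "Alice"; "Bob"; "Carol"; "David" |]
       ; Owl_dataframe.pack_float_series [| 2200.; 2100.; 2500.; 2800. |]
      |]
    [| "name"; "salary" |]

(* Row indices of the smallest and largest salary. *)
let lowest = Owl_dataframe.min_i frame "salary"
let highest = Owl_dataframe.max_i frame "salary"

let () = Printf.printf "lowest at row %d, highest at row %d\n" lowest highest
```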
``append_col x col`` appends a column to the dataframe ``x``.
``insert_row x i row`` inserts one ``row`` at position ``i`` into dataframe ``x``.
``insert_col x j col_head s`` inserts series ``s`` with column head ``col_head`` at position ``j`` into dataframe ``x``.
val remove_row : t -> int -> unit
``remove_row x i`` removes the ``ith`` row of ``x``. Negative index is accepted.
val remove_col : t -> int -> unit
``remove_col x i`` removes the ``ith`` column of ``x``. Negative index is accepted.
``concat_horizontal x y`` merges two dataframes ``x`` and ``y``. Note that ``x`` and ``y`` must have the same number of rows, and each column name should be unique.
``concat_vertical x y`` concatenates two dataframes by appending ``y`` to ``x``. The two dataframes ``x`` and ``y`` must have the same number of columns and the same column names.
Iteration functions
``iteri_row f x`` iterates the rows of ``x`` and applies ``f``.
``iter_row`` is similar to ``iteri_row`` without passing in row indices.
``mapi_row f x`` transforms the current dataframe ``x`` into a new dataframe by applying function ``f`` to each row. Note that the value returned by ``f`` must be consistent with ``x`` with respect to its length and type; otherwise a runtime error will occur.
``map_row`` is similar to ``mapi_row`` but without passing in row indices.
``filteri_row f x`` creates a new dataframe from ``x`` that keeps only the rows which satisfy the condition ``f``.
``filter_row`` is similar to ``filteri_row`` without passing in row indices.
``filter_mapi_row f x`` creates a new dataframe from ``x`` by applying ``f`` to each row. If ``f`` returns ``None`` then the row is excluded from the returned dataframe; if ``f`` returns ``Some row`` then the row is included.
``filter_map_row`` is similar to ``filter_mapi_row`` without passing in row indices.
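Row filtering can be sketched as follows. The example assumes owl-base is installed and linked, and that the function passed to ``filter_row`` receives each row as an ``elt array`` and returns a ``bool``; that signature is not shown on this page, so it is an assumption. The data is hypothetical.

```ocaml
(* Assumes owl-base is linked; names and values are hypothetical. *)
let frame =
  Owl_dataframe.make
    ~data:
      [| Owl_dataframe.pack_string_series [| "Alice"; "Bob"; "Carol"; "David" |]
       ; Owl_dataframe.pack_int_series [| 20; 25; 30; 35 |]
      |]
    [| "name"; "age" |]

(* Keep only the rows whose "age" element (column 1) is below 30. *)
let young =
  Owl_dataframe.filter_row
    (fun row -> Owl_dataframe.unpack_int row.(1) < 30)
    frame

let () =
  Printf.printf "kept %d of %d rows\n"
    (Owl_dataframe.row_num young)
    (Owl_dataframe.row_num frame)
```

The original dataframe is left unchanged; the filtered rows are returned as a new dataframe.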
Extended indexing operators
Extended indexing operator associated with ``get_by_name`` function.
Extended indexing operator associated with ``set_by_name`` function.
Extended indexing operator associated with ``filter_row`` function.
Extended indexing operator associated with ``filter_map_row`` function. Given a dataframe ``x``, ``f`` is used for filtering and ``g`` is used for transforming. In other words, ``x.?(f) <- g`` means that if ``f row`` is ``true`` then ``g row`` is included in the returned dataframe.
Extended indexing operator associated with ``get_slice_by_name`` function.
IO & helper functions
val of_csv :
?sep:char ->
?head:string array ->
?types:string array ->
string ->
t
``of_csv ~sep ~head ~types fname`` creates a dataframe by reading the data in a CSV file with the name ``fname``. Currently, the function supports four data types: ``b`` for boolean; ``i`` for int; ``f`` for float; ``s`` for string.
Note that if the ``types`` parameter is omitted, all elements are parsed as strings by default.
Parameters:
* ``sep``: delimiter; the default is tab.
* ``head``: column names; if not passed in, the first line of the CSV file is used.
* ``types``: data type of each column; must be consistent with ``head``.
val to_csv : ?sep:char -> t -> string -> unit
``to_csv ~sep x fname`` writes the dataframe ``x`` to a CSV file named ``fname``. The delimiter is specified by ``sep``.
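A possible CSV round trip is sketched below. It assumes owl-base is installed and linked; the temporary file, the comma delimiter, and the data are all choices made for this illustration, not defaults of the library.

```ocaml
(* Assumes owl-base is linked; names, values and file are hypothetical. *)
let frame =
  Owl_dataframe.make
    ~data:
      [| Owl_dataframe.pack_string_series [| "Alice"; "Bob" |]
       ; Owl_dataframe.pack_int_series [| 20; 25 |]
       ; Owl_dataframe.pack_float_series [| 2200.; 2100. |]
      |]
    [| "name"; "age"; "salary" |]

(* Write to a temporary file, using a comma as the delimiter. *)
let path = Filename.temp_file "owl_dataframe" ".csv"
let () = Owl_dataframe.to_csv ~sep:',' frame path

(* Read it back, giving one type code per column: s, i, f. *)
let loaded = Owl_dataframe.of_csv ~sep:',' ~types:[| "s"; "i"; "f" |] path

let () =
  Printf.printf "columns read back: %d\n" (Owl_dataframe.col_num loaded)
```

Passing ``sep`` explicitly on both sides keeps the writer and the reader in agreement about the delimiter.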
``print x`` pretty prints a dataframe on the terminal.
val elt_to_str : elt -> string
``elt_to_str x`` converts element ``x`` to its string representation.