package biocaml

Module Biocaml_unix

module Accu : sig ... end

A data structure (based on Hashtbl) to accumulate values.

module Bam : sig ... end

Read and write BAM format.

module Bamstats : sig ... end
module Bar : sig ... end

Affymetrix's BAR files. Their Tiling Analysis Software (TAS) produces BAR files in binary format but this module supports only the text format generated by selecting the "Export probe analysis as TXT" option.

module Bed : sig ... end

BED data files.

module Bgzf : sig ... end

I/O on Blocked GNU Zip format (BGZF) files.

module Bin_pred : sig ... end

Performance measurement of binary classifiers.

module Biocaml_result : sig ... end

Extension of Core's Result. Internal use only.

module Bpmap : sig ... end

Affymetrix's BPMAP files. Only text format supported. Binary BPMAP files must first be converted to text using Affymetrix's probe exporter tool.

module Cel : sig ... end

Affymetrix's CEL files. Only text format supported. Binary files must first be converted using Affymetrix's conversion tool. This tool does not change the file extension, so be sure your file really is in text format.

module Chr : sig ... end

Chromosome names. A chromosome name, as defined by this module, consists of two parts. An optional prefix "chr" (case-insensitive), followed by a suffix identifying the chromosome. The possible suffixes (case-insensitive) are:

module Entrez : sig ... end

Entrez Utilities API

module Fasta : sig ... end

FASTA files. The FASTA family of file formats has different incompatible descriptions (1, 2, 3, 4, etc.). Roughly FASTA files are in the format:

module Fastq : sig ... end

FASTQ files. The FASTQ file format is a repeated sequence of 4 lines:
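
The four-line record structure can be illustrated with a minimal sketch. It assumes the commonly described layout (a name line starting with `@`, the sequence, a separator line starting with `+`, and a quality string of the same length as the sequence) and single-line sequences. The type `record` and the function `parse_lines` are hypothetical names for illustration and are not this module's API.

```ocaml
(* Hypothetical record type; not Biocaml's Fastq item type. *)
type record = { name : string; sequence : string; qualities : string }

(* Consume lines four at a time. Assumes each record occupies exactly
   four lines, i.e. sequences are not wrapped across lines. *)
let rec parse_lines = function
  | [] -> Ok []
  | n :: s :: c :: q :: rest ->
    if String.length n = 0 || n.[0] <> '@' then
      Error ("bad name line: " ^ n)
    else if String.length c = 0 || c.[0] <> '+' then
      Error ("bad separator line: " ^ c)
    else if String.length s <> String.length q then
      Error "sequence and quality lengths differ"
    else begin
      match parse_lines rest with
      | Error _ as e -> e
      | Ok records ->
        (* Drop the leading '@' from the name line. *)
        let name = String.sub n 1 (String.length n - 1) in
        Ok ({ name; sequence = s; qualities = q } :: records)
    end
  | _ -> Error "truncated record: fewer than 4 lines remain"
```

The sketch is not tail-recursive and reads from an in-memory list of lines, which keeps it short; a streaming parser would process records incrementally instead.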

module File_mapper : sig ... end
module Future : sig ... end
module Future_unix : sig ... end
module GenomeMap : sig ... end

Data structures to represent sets of (possibly annotated) genomic regions

module Gff : sig ... end

GFF files.

module Histogram : sig ... end

Histograms with polymorphic bin types.

module Interval_tree : sig ... end

Interval tree (data structure)

module Iset : sig ... end

DIET : Discrete Interval Encoding Trees

module Jaspar : sig ... end

Access to Jaspar database

module Line : sig ... end
module Lines : sig ... end

Manipulate the lines of a file.

module Math : sig ... end

Numeric mathematics.

module Msg : sig ... end

Consistent printing of errors, warnings, and bugs. An error is a user mistake that prevents continuing program execution, a warning is a milder problem that the program continues to execute through, and a bug is a mistake in the software.

module MzData : sig ... end
module Phred_score : sig ... end

PHRED quality scores.

module Pos : sig ... end

File positions. A position within a file is defined by:

module Psl : sig ... end
module Pwm : sig ... end

Position-weight matrix

module RSet : sig ... end

Efficient integer sets for cases where many elements are expected to form large contiguous sequences of integers.
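
To illustrate why contiguous runs make such sets compact, here is a minimal sketch that stores a set as a sorted list of disjoint inclusive runs `(lo, hi)`. The representation and the names `runs_of_list` and `mem` are assumptions made for this example and do not describe this module's internals or API.

```ocaml
(* Compress a list of integers into maximal runs (lo, hi), both bounds
   inclusive. Duplicates are removed and order does not matter. *)
let runs_of_list xs =
  let sorted = List.sort_uniq compare xs in
  let rec go acc = function
    | [] -> List.rev acc
    | x :: rest ->
      (match acc with
       | (lo, hi) :: tl when x = hi + 1 ->
         (* x extends the most recent run. *)
         go ((lo, x) :: tl) rest
       | _ ->
         (* x starts a new run. *)
         go ((x, x) :: acc) rest)
  in
  go [] sorted

(* Membership test: linear in the number of runs, not of elements. *)
let mem runs x = List.exists (fun (lo, hi) -> lo <= x && x <= hi) runs
```

For example, `runs_of_list [1; 2; 3; 7; 8; 10]` yields three runs, so storage grows with the number of gaps rather than the number of elements.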

module Range : sig ... end

Ranges of contiguous integers (integer intervals). A range is a contiguous sequence of integers from a lower bound to an upper bound. For example, [2, 10] is the set of integers from 2 through 10, inclusive of 2 and 10.
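
The closed-interval definition above can be sketched as follows. The record type and the functions `make`, `size`, `mem`, and `to_list` are hypothetical and only mirror the description; whether this module rejects a lower bound greater than the upper bound is an assumption of the sketch.

```ocaml
(* Hypothetical record for illustration; not Biocaml's actual Range.t. *)
type range = { lo : int; hi : int }

(* Build a range. Rejecting lo > hi is an assumption of this sketch. *)
let make lo hi =
  if lo > hi then
    Error (Printf.sprintf "lower bound %d exceeds upper bound %d" lo hi)
  else Ok { lo; hi }

(* Number of integers in the closed interval [lo, hi]. *)
let size r = r.hi - r.lo + 1

(* Membership test, inclusive of both bounds. *)
let mem r x = r.lo <= x && x <= r.hi

(* Enumerate the members in increasing order. *)
let to_list r = List.init (size r) (fun i -> r.lo + i)
```

With these definitions, the range [2, 10] from the text has size 9 and contains both 2 and 10.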

module Roman_num : sig ... end

Roman numerals. Values greater than or equal to 1 are valid roman numerals.
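
As a rough illustration of the validity rule (integers of at least 1), the sketch below renders an integer in standard subtractive notation. The function `to_roman` is a hypothetical helper written for this example, not this module's API.

```ocaml
(* Value/symbol pairs in decreasing order, including subtractive forms. *)
let symbols =
  [ (1000, "M"); (900, "CM"); (500, "D"); (400, "CD"); (100, "C");
    (90, "XC"); (50, "L"); (40, "XL"); (10, "X"); (9, "IX"); (5, "V");
    (4, "IV"); (1, "I") ]

(* Hypothetical helper: Some numeral for n >= 1, None otherwise. *)
let to_roman n =
  if n < 1 then None
  else begin
    let buf = Buffer.create 16 in
    (* Greedily emit the largest symbol that still fits. *)
    let rec go n = function
      | [] -> ()
      | (v, s) :: rest as all ->
        if n >= v then begin
          Buffer.add_string buf s;
          go (n - v) all
        end
        else go n rest
    in
    go n symbols;
    Some (Buffer.contents buf)
  end
```

Under this sketch `to_roman 1994` is `Some "MCMXCIV"` and `to_roman 0` is `None`. Values above 3999 simply repeat `M`, which is one possible convention among several.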

module Sam : sig ... end

SAM files. Documentation here assumes familiarity with the SAM specification.

module Sbml : sig ... end

SBML file parser. Currently only level 2 version 4 is supported.

module Seq : sig ... end

Nucleic acid sequences. A nucleic acid code is any of A, C, G, T, U, R, Y, K, M, S, W, B, D, H, V, N, or X. See IUB/IUPAC standards for further information. Gaps are not supported. Internal representation uses uppercase, but constructors are case-insensitive. By convention the first nucleic acid in a sequence is numbered 1.
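
The conventions above (a fixed code alphabet, case-insensitive construction, uppercase storage, 1-based numbering) can be sketched with plain strings. The functions `of_string` and `nth` are hypothetical names for this example; the module's actual types and signatures may differ.

```ocaml
(* The codes listed in the description above. *)
let valid_codes = "ACGTURYKMSWBDHVNX"

(* Hypothetical constructor: accepts any case, stores uppercase, and
   reports the first invalid character with its 1-based position. *)
let of_string s =
  let s = String.uppercase_ascii s in
  let rec check i =
    if i >= String.length s then Ok s
    else if String.contains valid_codes s.[i] then check (i + 1)
    else Error (i + 1, s.[i])
  in
  check 0

(* 1-based access, following the numbering convention in the text. *)
let nth s i =
  if i < 1 || i > String.length s then None else Some s.[i - 1]
```

Because gaps are not supported, an input such as `"AC-G"` is rejected here at position 3.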

module Seq_range : sig ... end

Range on a sequence, where the sequence is represented by an identifier.

module Sgr : sig ... end

Sequence Graph (SGR) files.

module Solexa_score : sig ... end

Solexa quality scores.

module Strand : sig ... end

Strand names. There are various conventions for referring to the two strands of DNA. This module provides an of_string function that parses the various conventions into a canonical representation, which we define to be '-' or '+'.
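
The idea of normalizing several conventions into '-' or '+' might look like the sketch below. The set of accepted spellings is an assumption chosen for illustration; the spellings this module's `of_string` actually accepts, and its return type, are not stated on this page.

```ocaml
(* Sketch only: the accepted spellings below are assumed, not taken
   from Biocaml. The canonical form is the character '+' or '-'. *)
let of_string s : (char, string) result =
  match String.lowercase_ascii (String.trim s) with
  | "+" | "f" | "fwd" | "forward" -> Ok '+'
  | "-" | "r" | "rev" | "reverse" -> Ok '-'
  | _ -> Error ("unrecognized strand: " ^ s)
```

Returning a `result` rather than raising keeps malformed input visible to the caller, which is a design choice of this sketch.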

module Table : sig ... end

Generic “tables” (like CSV, TSV, Bed …).

module Tfxm : sig ... end

Buffered transforms. A buffered transform represents a method for converting a stream of inputs to a stream of outputs. However, inputs can also be buffered, i.e. you can feed inputs to the transform and pull out outputs later. There is no requirement that 1 input produces exactly 1 output. It is common that multiple input values are needed to construct a single output, and vice versa.
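
The feed-then-pull behavior described above can be sketched with a small record of closures. The type `transform`, its fields `feed` and `next`, and the example `group` are hypothetical names for this illustration and should not be read as this module's interface.

```ocaml
(* Hypothetical interface: feed inputs now, pull outputs later. *)
type ('a, 'b) transform = {
  feed : 'a -> unit;
  next : unit -> 'b option; (* None means no output is ready yet *)
}

(* Example transform: groups every [n] inputs into one output list,
   showing that several inputs may be needed for a single output. *)
let group n : ('a, 'a list) transform =
  if n < 1 then invalid_arg "group: n must be at least 1";
  let pending = Queue.create () in
  let feed x = Queue.add x pending in
  let next () =
    if Queue.length pending < n then None
    else Some (List.init n (fun _ -> Queue.take pending))
  in
  { feed; next }
```

After feeding a single value to `group 2`, `next ()` returns `None`; once a second value has been fed, it returns the pair as a list, and any further inputs stay buffered for the next output.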

module Track : sig ... end

Track files in UCSC Genome Browser format. The following documentation assumes knowledge of concepts explained on the UCSC Genome Browser's website. Basically, a track file is one of several types of data (WIG, GFF, etc.), possibly preceded by comments, browser lines, and a track line. This module allows only a single data track within a file, although the UCSC specifies that multiple tracks may be provided together.

module Transcripts : sig ... end

Transcripts are integer intervals containing a list of exons. Exons are themselves defined as a list of integer intervals.
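
One way to picture this structure is the sketch below. The type names, the choice of inclusive `(lo, hi)` bounds, and the helpers `well_formed` and `exonic_length` are assumptions made for illustration rather than this module's definitions.

```ocaml
(* Inclusive integer interval (lo, hi); inclusiveness is assumed. *)
type interval = int * int

(* Hypothetical transcript: an overall span plus its exons. *)
type transcript = {
  span : interval;
  exons : interval list;
}

(* Every exon is a valid interval lying inside the transcript span. *)
let well_formed t =
  let lo_t, hi_t = t.span in
  List.for_all (fun (lo, hi) -> lo_t <= lo && lo <= hi && hi <= hi_t) t.exons

(* Total number of positions covered by exons, assuming no overlaps. *)
let exonic_length t =
  List.fold_left (fun acc (lo, hi) -> acc + (hi - lo + 1)) 0 t.exons
```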

module Vcf : sig ... end

Parsing of VCF files.

module Wig : sig ... end

WIG data.

module Zip : sig ... end

Streaming interface to the Zlib library.
