package biocaml

The OCaml Bioinformatics Library

Sources

biocaml-0.11.2.tbz
sha256=fae219e66db06f81f3fd7d9e44717ccf2d6d85701adb12004ab4ae6d3359dd2d
sha512=f6abd60dac2e02777be81ce3b5acdc0db23b3fa06731f5b2d0b32e6ecc9305fe64f407bbd95a3a9488b14d0a7ac7c41c73a7e18c329a8f18febfc8fd50eccbc6


Module Biocaml_unix.Pwm

Position-weight matrix

This module can be used to create position-weight matrices (PWMs) that describe a DNA motif. Such a matrix can then be used to scan a DNA sequence for positions whose alignment score exceeds a given threshold.

type count_matrix = int array array

Type to represent gap-free alignments. The first dimension is the sequence position, the second is the alphabet. Only the DNA alphabet (A, C, G, T) is supported, so rows should be of length exactly four.
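As an illustration of this layout, here is a minimal sketch that tallies a count matrix from a set of equal-length, gap-free sequences. The helper names `index_of_base` and `count_matrix_of_sequences` are hypothetical and not part of this module, and the column order A, C, G, T is assumed from the description above.

```ocaml
(* Hypothetical helper: column index of a nucleotide, assuming the
   column order A, C, G, T. *)
let index_of_base = function
  | 'A' | 'a' -> 0
  | 'C' | 'c' -> 1
  | 'G' | 'g' -> 2
  | 'T' | 't' -> 3
  | c -> invalid_arg (Printf.sprintf "unsupported base %c" c)

(* Hypothetical helper: build an [int array array] with one row per
   alignment position and one column per base. *)
let count_matrix_of_sequences (seqs : string list) : int array array =
  match seqs with
  | [] -> [||]
  | first :: _ ->
    let n = String.length first in
    let counts = Array.make_matrix n 4 0 in
    List.iter
      (fun s ->
         if String.length s <> n then
           invalid_arg "sequences differ in length";
         String.iteri
           (fun i c ->
              let j = index_of_base c in
              counts.(i).(j) <- counts.(i).(j) + 1)
           s)
      seqs;
    counts
```

Each row of the result has exactly four entries, which matches the shape expected of a count_matrix.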

type background = private float array

Probability distribution over an alphabet

val flat_background : unit -> background

Uniform distribution over A, C, G, T

val background_of_sequence : string -> float -> background

background_of_sequence seq pc estimates the base frequencies in seq, using pc as a pseudo-count. A typical value for pc is 0.1.
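The idea of a pseudo-count can be sketched in plain OCaml as follows. This is not the library's implementation: `estimate_background` is a hypothetical name, the pseudo-count is assumed to be added once per base, and characters other than A, C, G, T are assumed to be ignored.

```ocaml
(* Sketch: start every base count at [pc], add one per occurrence,
   then normalise so that the four frequencies sum to 1.
   Assumption: characters outside A, C, G, T are skipped. *)
let estimate_background (seq : string) (pc : float) : float array =
  let counts = Array.make 4 pc in
  String.iter
    (fun c ->
       match Char.uppercase_ascii c with
       | 'A' -> counts.(0) <- counts.(0) +. 1.
       | 'C' -> counts.(1) <- counts.(1) +. 1.
       | 'G' -> counts.(2) <- counts.(2) +. 1.
       | 'T' -> counts.(3) <- counts.(3) +. 1.
       | _ -> ())
    seq;
  let total = Array.fold_left ( +. ) 0. counts in
  Array.map (fun c -> c /. total) counts
```

With a positive pc, a base that never occurs in seq still receives a small non-zero frequency, which avoids dividing by zero or taking the logarithm of zero later on.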

type t = private float array array

Representation of a PWM

Builds a PWM from a count_matrix and a background
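The page does not state the scoring formula, so the following is only a sketch of one common convention: log-odds scores of the observed frequencies against the background. The name `log_odds_matrix` and the `pseudo` parameter are hypothetical, and the library may compute its scores differently.

```ocaml
(* Sketch of a log-odds weight matrix (an assumed convention, not
   necessarily what this module computes).
   For each position, the smoothed frequency of base j is
     (count_j + pseudo * bg_j) / (total + pseudo)
   and the score is log (frequency / bg_j). *)
let log_odds_matrix ?(pseudo = 1.0) (counts : int array array)
    (bg : float array) : float array array =
  Array.map
    (fun row ->
       let total = float_of_int (Array.fold_left ( + ) 0 row) +. pseudo in
       Array.mapi
         (fun j c ->
            let freq = (float_of_int c +. pseudo *. bg.(j)) /. total in
            log (freq /. bg.(j)))
         row)
    counts
```

Under this convention a base that is over-represented relative to the background gets a positive score, and an under-represented one gets a negative score.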

val tandem : ?orientation:[ `direct | `inverted | `everted ] -> spacer:int -> count_matrix -> count_matrix -> background -> t

tandem ~orientation ~spacer cm1 cm2 bg builds a PWM for a composite motif: it builds mat1, the PWM from cm1 under background bg, and mat2, the PWM from cm2 under bg, then concatenates mat1 and mat2 with spacer non-scoring columns in between.
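The concatenation step can be sketched on plain float matrices. The function `concat_with_spacer` is hypothetical, it ignores the orientation argument entirely, and it assumes that a non-scoring column is represented by a row of zeros, which the page does not confirm.

```ocaml
(* Sketch: join two weight matrices with [spacer] positions in between.
   Assumption: a non-scoring position is a row of four zeros, so it
   contributes nothing to a summed alignment score. *)
let concat_with_spacer ~(spacer : int) (mat1 : float array array)
    (mat2 : float array array) : float array array =
  (* Array.init allocates a fresh row for every spacer position. *)
  let gap = Array.init spacer (fun _ -> Array.make 4 0.) in
  Array.concat [ mat1; gap; mat2 ]
```

The resulting matrix has as many positions as mat1, the spacer, and mat2 combined.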

val reverse_complement : t -> t

Reverse complement of a PWM
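For intuition, here is a sketch of the operation on a plain matrix with four columns per position. The name `reverse_complement_matrix` is hypothetical, and the column order A, C, G, T is an assumption; the library's private representation may carry extra information that this sketch does not handle.

```ocaml
(* Sketch: reverse the order of positions, and within each position
   swap A with T and C with G. With columns ordered A, C, G, T this
   swap amounts to reading the row backwards. *)
let reverse_complement_matrix (m : float array array) : float array array =
  let n = Array.length m in
  Array.init n (fun i ->
      let row = m.(n - 1 - i) in
      [| row.(3); row.(2); row.(1); row.(0) |])
```

Applying the function twice returns a matrix equal to the original, as expected of a reverse complement.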

val scan : t -> string -> float -> (int * float) list

scan mat seq tol returns the list of positions in seq, each paired with its score, at which the alignment score of mat is greater than tol.
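The scanning idea can be sketched on a plain float matrix as a sliding window. The names `base_index`, `window_score`, and `scan_sketch` are hypothetical. Several details are assumptions rather than documented behaviour: the score of a window is the sum of per-position scores, the comparison with tol is strict, unknown characters contribute zero, and hits are returned in increasing position order.

```ocaml
(* Hypothetical helper: column of a base, assuming order A, C, G, T. *)
let base_index c =
  match Char.uppercase_ascii c with
  | 'A' -> Some 0
  | 'C' -> Some 1
  | 'G' -> Some 2
  | 'T' -> Some 3
  | _ -> None

(* Sum of the matrix scores for the window of [seq] starting at [pos].
   Assumption: characters outside A, C, G, T contribute 0. *)
let window_score (mat : float array array) (seq : string) (pos : int) =
  let score = ref 0. in
  Array.iteri
    (fun i row ->
       match base_index seq.[pos + i] with
       | Some j -> score := !score +. row.(j)
       | None -> ())
    mat;
  !score

(* Slide the matrix along [seq] and keep windows scoring above [tol].
   Walking from the last window down to 0 and consing yields a list in
   increasing position order. *)
let scan_sketch (mat : float array array) (seq : string) (tol : float) =
  let m = Array.length mat and n = String.length seq in
  let hits = ref [] in
  for pos = n - m downto 0 do
    let s = window_score mat seq pos in
    if s > tol then hits := (pos, s) :: !hits
  done;
  !hits
```

When seq is shorter than the matrix, the loop body never runs and the sketch returns an empty list.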

val fast_scan : t -> string -> float -> (int * float) list

Identical to scan, but implemented directly in C.

val best_hit : t -> string -> int * float

best_hit mat seq returns the position and score of the best alignment found in seq for the motif mat. Raises Invalid_argument if seq is shorter than mat.
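A minimal sketch of the same idea on a plain float matrix follows. The name `best_hit_sketch` is hypothetical. The sketch assumes that a window's score is the sum of per-position scores, that columns are ordered A, C, G, T, and that ties are resolved in favour of the leftmost position; none of these details is confirmed by this page.

```ocaml
(* Sketch: return the position and score of the highest-scoring window.
   Raises Invalid_argument when [seq] is shorter than the matrix. *)
let best_hit_sketch (mat : float array array) (seq : string) : int * float =
  let m = Array.length mat and n = String.length seq in
  if n < m then invalid_arg "best_hit_sketch: sequence shorter than matrix";
  let index c =
    match Char.uppercase_ascii c with
    | 'A' -> 0
    | 'C' -> 1
    | 'G' -> 2
    | 'T' -> 3
    | _ -> invalid_arg "best_hit_sketch: unsupported base"
  in
  (* Score of the window starting at [pos]. *)
  let score pos =
    let s = ref 0. in
    Array.iteri (fun i row -> s := !s +. row.(index seq.[pos + i])) mat;
    !s
  in
  let best = ref (0, score 0) in
  for pos = 1 to n - m do
    let s = score pos in
    (* Strict comparison keeps the leftmost window on ties. *)
    if s > snd !best then best := (pos, s)
  done;
  !best
```

Unlike a thresholded scan, this always returns exactly one hit, which is why a sequence that cannot hold a single window has to be rejected up front.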
