package biocaml
Module Biocaml_unix.Pwm
Position-weight matrix
This module can be used to create position-weight matrices (PWM) that describe a DNA motif. Such a matrix can then be used to search a DNA sequence for matches whose alignment score is above a given threshold.
Type to represent gap-free alignments. The first dimension is the sequence position, the second dimension is the alphabet. Only the DNA alphabet (A, C, G, T) is supported, so rows should be of length exactly four.
Probability distribution over an alphabet
Uniform distribution over A, C, G, T
background_of_sequence seq pc estimates the base frequency in seq using pc as pseudo-counts. Typical value for pc is 0.1.
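To illustrate what estimating base frequencies with pseudo-counts can look like, here is a minimal, self-contained sketch. It is not Biocaml's implementation: the name background_sketch is hypothetical, and the choices to add pc to every base count and to ignore characters other than A, C, G, T are assumptions.

```ocaml
(* Hypothetical helper, not part of Biocaml: estimate the frequencies of
   A, C, G, T in [seq], starting every count at the pseudo-count [pc]. *)
let background_sketch (seq : string) (pc : float) : float array =
  let counts = Array.make 4 pc in
  String.iter
    (fun c ->
      match Char.uppercase_ascii c with
      | 'A' -> counts.(0) <- counts.(0) +. 1.
      | 'C' -> counts.(1) <- counts.(1) +. 1.
      | 'G' -> counts.(2) <- counts.(2) +. 1.
      | 'T' -> counts.(3) <- counts.(3) +. 1.
      | _ -> () (* assumption: any other character is ignored *))
    seq;
  (* Normalise so that the four frequencies sum to one. *)
  let total = Array.fold_left ( +. ) 0. counts in
  Array.map (fun x -> x /. total) counts
```

With a positive pc, a base that never occurs in seq still gets a small non-zero frequency, which keeps later logarithms finite.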
Representation of a PWM
Builds a PWM from a count_matrix and a background
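A PWM is commonly obtained by turning each column of counts into log-odds scores against the background. The sketch below shows that idea under stated assumptions: log_odds_sketch is a hypothetical name, the count matrix is assumed to be an int array array with one row of four counts per position, and the exact formula used by Biocaml may differ.

```ocaml
(* Hypothetical sketch, not Biocaml's code: convert counts to log-odds
   scores relative to the background distribution [bg] (assumed to sum to 1). *)
let log_odds_sketch (counts : int array array) (bg : float array)
    : float array array =
  Array.map
    (fun row ->
      let n = float_of_int (Array.fold_left ( + ) 0 row) in
      Array.mapi
        (fun i c ->
          (* Assumption: bg.(i) doubles as a pseudo-count, so a zero count
             never produces log 0. *)
          let p = (float_of_int c +. bg.(i)) /. (n +. 1.) in
          log (p /. bg.(i)))
        row)
    counts
```

A base that is over-represented at a position relative to the background gets a positive score, and an under-represented base gets a negative one.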
val tandem :
?orientation:[ `direct | `inverted | `everted ] ->
spacer:int ->
count_matrix ->
count_matrix ->
background ->
t
tandem orientation spacer cm1 cm2 bg builds a PWM by constructing a composite motif: it builds mat1, the PWM from cm1 under background bg (resp. mat2 from cm2 under bg), then concatenates mat1 and mat2 with spacer non-scoring columns in between.
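The concatenation step can be pictured as follows. This is only a sketch: tandem_sketch is a hypothetical name, matrices are assumed to be float array array values with four scores per position, a non-scoring column is assumed to be a row of zeros, and the orientation argument is left out entirely.

```ocaml
(* Hypothetical sketch, not Biocaml's code: join two score matrices with
   [spacer] all-zero positions in between. The orientation handling of the
   real function is deliberately not modelled here. *)
let tandem_sketch ~(spacer : int) (mat1 : float array array)
    (mat2 : float array array) : float array array =
  (* Array.init builds a fresh row per spacer position (no shared rows). *)
  let gap = Array.init spacer (fun _ -> Array.make 4 0.) in
  Array.concat [ mat1; gap; mat2 ]
```

Because the spacer positions score zero for every base, the total score of a match depends only on the bases aligned with the two half-motifs.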
scan mat seq tol returns the list of positions in seq (with their corresponding scores) at which the alignment score of mat is greater than tol.
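A naive version of such a scan slides the matrix along the sequence and sums one score per position. The code below is a sketch under assumptions, not the library's implementation: scan_sketch and index_of_base are hypothetical names, the matrix is assumed to be a float array array with columns ordered A, C, G, T, and unknown characters are assumed to contribute nothing to the score.

```ocaml
(* Hypothetical helper: column index of a base, or None if unrecognised. *)
let index_of_base = function
  | 'A' | 'a' -> Some 0
  | 'C' | 'c' -> Some 1
  | 'G' | 'g' -> Some 2
  | 'T' | 't' -> Some 3
  | _ -> None

(* Hypothetical sketch, not Biocaml's code: return (position, score) for
   every window of [seq] whose score is strictly greater than [tol]. *)
let scan_sketch (mat : float array array) (seq : string) (tol : float)
    : (int * float) list =
  let m = Array.length mat and n = String.length seq in
  let hits = ref [] in
  (* Walk from right to left so the resulting list is in ascending order. *)
  for pos = n - m downto 0 do
    let score = ref 0. in
    for j = 0 to m - 1 do
      match index_of_base seq.[pos + j] with
      | Some i -> score := !score +. mat.(j).(i)
      | None -> () (* assumption: unknown bases add nothing *)
    done;
    if !score > tol then hits := (pos, !score) :: !hits
  done;
  !hits
```

This runs in time proportional to the sequence length times the motif length, and it returns an empty list when the sequence is shorter than the matrix.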
Identical to scan, but implemented directly in C.