Research interests | Publications | Software | Short CV | Personal | Documents
MUSA
These implementations of MUSA are now available:
Version | Comments | Download |
---|---|---|
0.5.6 | | [linux bin] |
MUSA does not require the specification of any parameters in order to search for motifs. There are, however, certain parameters that can be specified in order to focus the search:
Parameter | Description |
---|---|
λ | Defines the length of λ-mers that are used to build the motifs. No motif will be found whose length is shorter than λ. By default λ = 4. By specifying a larger value you will focus the search on larger motifs. |
ε | Defines the tolerance for the distance between λ-mers. In the case of complex motifs, it allows for some variation in the distance between components in occurrences of the same motif. By default ε = 0. |
sieve | Defines the proportion of sequences in which the motif must occur in order to be identified. By default, sieve = 30%. |
mindist | Defines the minimum distance between two λ-mers in complex motifs. You can use this option to force the search for complex motifs. By default mindist = 1, i.e., no restriction is used. |
maxdist | Defines the maximum distance between two λ-mers in complex motifs. You can use this option to focus the search on limited portions of the sequence. By default maxdist = 500. |
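To make the λ and sieve parameters concrete, here is a minimal Python sketch of what they mean for a single word. It is not MUSA's algorithm: the helper names are hypothetical, the three sequences are made-up illustrative input, and treating the sieve as an inclusive "at least this fraction" threshold is an assumption.

```python
def lambda_mers(sequence, lam=4):
    """Return the set of distinct substrings of length `lam` (the lambda-mers)."""
    return {sequence[i:i + lam] for i in range(len(sequence) - lam + 1)}


def passes_sieve(word, sequences, sieve=0.30):
    """True if `word` occurs in at least a `sieve` fraction of the sequences.

    Assumption: the threshold is inclusive; MUSA's exact rule may differ.
    """
    hits = sum(1 for s in sequences if word in s)
    return hits / len(sequences) >= sieve


# Illustrative sequences only, not a real dataset.
sequences = ["ATGCGTGGCATAT", "ATGCCTGTCATAT", "ATGCGTATCATAT"]

print(sorted(lambda_mers("ATGCGT")))            # the three 4-mers of ATGCGT
print(passes_sieve("ATGC", sequences))          # found in 3 of 3 sequences
print(passes_sieve("GTGG", sequences, 0.5))     # found in 1 of 3 sequences
```

With the default sieve of 30%, a word found in one of three sequences still qualifies; raising the sieve to 50% excludes it.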
MUSA is available as a command-line program and is easy to use. Run it with the default options, or specify any of the options listed in the usage message below. The input sequences must be in FASTA format: give either a filename or "-" to read the data from the standard input.
    Usage: musa [OPTION...] FASTA|-
    MUSA -- Inference of Complex Motifs

      -b, --bothstrands          Search both strands
      -e, --epsilon=EPSILON      Epsilon parameter (distance tolerance)
      -l, --lambda=LAMBDA        Lambda parameter (size of lambda-mers)
      -m, --mindist=mindist      Minimum distance between lambda-mers
      -M, --maxdist=maxdist      Maximum distance between lambda-mers
      -o, --output=FILE          Output to FILE instead of standard output
      -q, --quiet                Behave quietly
      -s, --sieve=SIEVE          Percent of minimum number of sequences
      -?, --help                 Give this help list
          --usage                Give a short usage message
      -V, --version              Print program version

    Mandatory or optional arguments to long options are also mandatory or optional
    for any corresponding short options.

    Report bugs to <ndm+musa_bugs@algos.inesc-id.pt>.
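For scripted runs, the command line can be assembled from the long options listed in the usage message. The following Python sketch is only an illustration: the function names and the file name sequences.fa are hypothetical, it assumes a musa binary is on the PATH, and it passes option values through unchanged (the usage message describes the sieve as a percent, so a value such as 30 is assumed rather than 0.3).

```python
import subprocess


def musa_command(fasta="-", lam=None, epsilon=None, sieve=None,
                 mindist=None, maxdist=None, both_strands=False, output=None):
    """Build an argument list using only options shown in the usage message."""
    cmd = ["musa"]
    if both_strands:
        cmd.append("--bothstrands")
    options = (
        ("--lambda", lam),
        ("--epsilon", epsilon),
        ("--sieve", sieve),
        ("--mindist", mindist),
        ("--maxdist", maxdist),
        ("--output", output),
    )
    for flag, value in options:
        if value is not None:          # omitted options keep MUSA's defaults
            cmd.append(f"{flag}={value}")
    cmd.append(fasta)                  # a FASTA filename, or "-" for standard input
    return cmd


def run_musa(fasta, **options):
    """Run musa on a FASTA file and return what it writes to standard output."""
    result = subprocess.run(musa_command(fasta, **options),
                            capture_output=True, text=True, check=True)
    return result.stdout


print(musa_command("sequences.fa", lam=6, epsilon=1, both_strands=True))
```

Building the argument list separately from running it makes the command easy to inspect or log before anything is executed.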
Version 0.5

    # MUSA/0.5 Output
    # Sequences: 3 Motifs: 6
    # Motif             Quorum   P-value
    ATGCGT <2> CATAT    2 of 3   6.304220e-07
    ATGC <4> CATAT      3 of 3   7.824886e-07
    ATGC <3> TCATAT     2 of 3   1.633340e-05
    ATGCCTGTCATAT       1 of 3   1.723221e-05
    ATGCGTGGCATAT       1 of 3   8.292759e-05
    ATGCGTATCATAT       1 of 3   1.929857e-04
Above we can see an example of the output given by version 0.5 of MUSA. The first column corresponds to the motif found. If it is a complex motif, the distance between each component is given between <>. The second column corresponds to the quorum of the motif, i.e., the number of sequences of the dataset where the motif can be found. The last column corresponds to the statistical significance score computed for each motif.
If the user specifies a value of ε greater than zero, then the distance between each component will be shown as a pair of distances (minimum,maximum).
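The three columns described above can be read back into structured records with a short script. This Python sketch is an assumption-laden illustration, not part of MUSA: it assumes whitespace-separated columns and lines starting with # as comments, as in the version 0.5 example, and it handles only single-integer distances. Output produced with ε greater than zero is not covered, because the exact syntax of the (minimum,maximum) pair is not shown here. The names MotifRecord and parse_musa_output are hypothetical.

```python
import re
from typing import List, NamedTuple


class MotifRecord(NamedTuple):
    components: List[str]   # motif components, left to right
    gaps: List[int]         # distances given between <> (empty for simple motifs)
    quorum: int             # number of sequences containing the motif
    total: int              # number of sequences in the dataset
    p_value: float          # statistical significance score


# motif text, then "<quorum> of <total>", then the score
LINE_RE = re.compile(r"^(?P<motif>.+?)\s+(?P<q>\d+)\s+of\s+(?P<n>\d+)\s+(?P<p>\S+)$")


def parse_musa_output(text):
    """Parse MUSA/0.5-style output into a list of MotifRecord tuples."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                                  # skip blank and header lines
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognised line: {line!r}")
        tokens = match.group("motif").split()
        components = [t for t in tokens if not t.startswith("<")]
        gaps = [int(t[1:-1]) for t in tokens if t.startswith("<")]
        records.append(MotifRecord(components, gaps, int(match.group("q")),
                                   int(match.group("n")), float(match.group("p"))))
    return records


sample = """\
# MUSA/0.5 Output
# Sequences: 3 Motifs: 6
# Motif Quorum P-value
ATGCGT <2> CATAT 2 of 3 6.304220e-07
ATGC <4> CATAT 3 of 3 7.824886e-07
ATGCCTGTCATAT 1 of 3 1.723221e-05
"""

for record in parse_musa_output(sample):
    print(record)
```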
If the user performs the search for motifs in both strands, the output will generally show each motif and its reverse-complemented version. This may not happen, however, since the motif reconstruction procedure is not guaranteed to produce coherent reconstructions of biclusters composed of reverse-complemented pairs of configurations.
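When inspecting both-strand results, it can help to compute the reverse complement of a reported motif and check whether it also appears in the output. The sketch below is a hypothetical helper, not something MUSA provides. It assumes DNA sequences over A, C, G and T, and that a complex motif is given as a list of components plus the list of distances between them.

```python
# Translation table mapping each base to its complement (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def reverse_complement_motif(components, gaps):
    """Reverse complement of a complex motif.

    Components are reverse-complemented and their order is reversed;
    the distances between them are reversed as well.
    """
    rc_components = [reverse_complement(c) for c in reversed(components)]
    return rc_components, list(reversed(gaps))


print(reverse_complement("ATGC"))
print(reverse_complement_motif(["ATGCGT", "CATAT"], [2]))
```

As noted above, the reverse-complemented motif is not guaranteed to be present in the output, so a missing partner does not by itself indicate an error.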
Mendes ND, Casimiro AC, Santos PM, Sá-Correia I, Oliveira AL, Freitas AT. MUSA: a parameter free algorithm for the identification of biologically significant motifs. Bioinformatics. 2006 Dec 15; 22(24): 2996-3002.

Or simply copy the BibTeX entry:
    @Article{Mendes:2006:Bioinformatics:17068086,
      author  = "Mendes, N D and Casimiro, A C and Santos, P M and S{\'a}-Correia, I and Oliveira, A L and Freitas, A T",
      title   = {MUSA: a parameter free algorithm for the identification of biologically significant motifs},
      journal = "Bioinformatics",
      year    = "2006",
      volume  = "22",
      number  = "24",
      pages   = "2996-3002",
      month   = "Dec",
      pmid    = "17068086",
      url     = "http://www.hubmed.org/display.cgi?uids=17068086",
      doi     = "10.1093/bioinformatics/btl537"
    }