AlignACE and CompareACE
AlignACE (Aligns Nucleic Acid Conserved Elements) is a program that finds sequence elements conserved in a set of DNA sequences. It uses a Gibbs sampling strategy similar to the one described by A. F. Neuwald, J. Liu and C. E. Lawrence in "Gibbs motif sampling: Detection of bacterial outer membrane protein repeats". An iterative masking procedure allows multiple distinct motifs to be found within a single data set.
Developed by genomics researchers at Harvard Medical School, AlignACE employs an algorithm that scans non-coding nucleic acid sequences at high resolution for motifs that occur with non-random frequency. This algorithm is built into a multi-level sequence analysis program that highlights gene-specific regulatory elements for further analysis.
AlignACE offers both efficiency and convenience. Its high signal-to-noise ratio reduces false positives in the program output, while iterative masking uncovers multiple distinct sequence motifs within a single data set. In gene expression studies, AlignACE can highlight potential gene regulatory elements and sets of co-regulated genes.
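The iterative masking idea can be illustrated with a small Python sketch. It is not AlignACE's implementation: the motif finder below is a toy stand-in (the most frequent k-mer) for the Gibbs sampler, and the function names, the use of "N" as the masking character, and the parameters max_motifs and min_count are all assumptions made for illustration. Input sequences are assumed to be uppercase and free of "N".

```python
from collections import Counter


def most_common_kmer(seqs, k):
    """Toy stand-in for a motif finder (NOT Gibbs sampling).

    Returns (kmer, count) for the most frequent k-mer that contains no
    masked base, or None if no unmasked k-mer is left.
    """
    counts = Counter(
        s[i:i + k]
        for s in seqs
        for i in range(len(s) - k + 1)
        if "N" not in s[i:i + k]
    )
    if not counts:
        return None
    best = max(counts.values())
    # Break ties alphabetically so the result is deterministic.
    kmer = min(km for km, c in counts.items() if c == best)
    return kmer, best


def mask(seqs, kmer):
    """Replace every occurrence of kmer with N's (assumed masking character)."""
    return [s.replace(kmer, "N" * len(kmer)) for s in seqs]


def find_motifs(seqs, k, max_motifs=3, min_count=2):
    """Find a motif, mask its sites, and search again on the masked data."""
    motifs = []
    work = list(seqs)
    for _ in range(max_motifs):
        hit = most_common_kmer(work, k)
        if hit is None or hit[1] < min_count:
            break
        motifs.append(hit[0])
        work = mask(work, hit[0])
    return motifs
```

Because the sites of each reported motif are masked before the next search, later rounds can only report elements that do not overlap earlier ones, which is how a single data set can yield several distinct motifs.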
CompareACE performs a pairwise comparison between two motifs and returns a value between -1.0 and 1.0 for the best possible alignment, with a perfect match scoring 1.0. This value is the Pearson correlation coefficient between the base frequencies of the positions in the aligned portions of the motifs. To prevent spurious matches, the aligned portion must include at least the six most informative positions in each motif.
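A minimal Python sketch of this kind of score follows. It is an illustration under stated assumptions, not CompareACE's code: each motif is represented as a list of columns holding the frequencies of A, C, G and T, the alignment is ungapped, and the "six most informative positions" rule is simplified to a minimum overlap of six columns. The names pearson, best_alignment_score and min_overlap are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for a constant vector; treated as 0 in this sketch
    return cov / math.sqrt(var_x * var_y)


def best_alignment_score(m1, m2, min_overlap=6):
    """Best Pearson score over all ungapped offsets of m2 against m1.

    m1, m2: lists of columns, each column a list of four base frequencies.
    Simplification (assumption): require min_overlap aligned columns instead
    of the six most informative positions of each motif.
    Returns None if either motif is shorter than min_overlap.
    """
    if min(len(m1), len(m2)) < min_overlap:
        return None
    best = None
    # offset = index in m1 at which column 0 of m2 is placed
    for offset in range(min_overlap - len(m2), len(m1) - min_overlap + 1):
        start = max(0, offset)
        end = min(len(m1), offset + len(m2))
        x = [f for col in m1[start:end] for f in col]
        y = [f for col in m2[start - offset:end - offset] for f in col]
        score = pearson(x, y)
        if best is None or score > best:
            best = score
    return best
```

Under these assumptions, comparing a motif with itself, or with a longer motif that contains it, gives a score of 1.0.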
Installed on the Opteron cluster.
Usage, version 4.0
- The input file must be in FASTA format.
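Before submitting a job, it can help to check that the input really is FASTA. The Python sketch below is one way to do that; it is a generic reader, not part of AlignACE, and the function name parse_fasta is hypothetical. It assumes the usual layout of a header line starting with ">" followed by one or more sequence lines.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples.

    Raises ValueError if sequence data appears before the first '>' header.
    """
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data before the first '>' header line")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

A file that parses without error and yields at least one record with a non-empty sequence is a reasonable candidate for input.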
Executables
AlignACE and CompareACE are in
/biomed/src/alignace/alignace2004/.
- Running AlignACE with no options returns a list of all possible options.
- CompareACE can be used in these ways:
- Usage 1a:
CompareACE ace_file1 mot1 ace_file2 mot2
This compares motif number mot1 of ace_file1 with motif number mot2 of ace_file2.
- Usage 1b:
CompareACE ace_file1 ace_file2
This performs all pairwise comparisons between the motifs in the specified files (assumes all filenames have some non-digit characters).
- Usage 2:
CompareACE -all file ace_col mot_col (-c [cutoff])
This performs all pairwise comparisons between the motifs listed in the specified file and returns those scoring better than the given cutoff (by default, all comparisons are returned). The file should be tab-delimited, with the name of the AlignACE file in column ace_col and the motif number in column mot_col, specifying one motif per line.
- Usage 3:
CompareACE -form file1 file2
Each file should contain a set of aligned sites, optionally with a line of asterisks underneath to select the active columns. Comment lines (starting with #) are allowed.
- Examples
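As a hedged illustration of the usage forms above, the Python sketch below builds the corresponding CompareACE command lines and the tab-delimited list file for the -all form. The helper names and the example file names are hypothetical. The source does not say whether ace_col and mot_col are counted from 0 or from 1, so the column numbers are left to the caller.

```python
# Directory given in the Executables section above.
BIN_DIR = "/biomed/src/alignace/alignace2004"
COMPAREACE = BIN_DIR + "/CompareACE"


def cmd_one_pair(ace_file1, mot1, ace_file2, mot2):
    """Usage 1a: compare one motif from each of two AlignACE output files."""
    return [COMPAREACE, ace_file1, str(mot1), ace_file2, str(mot2)]


def cmd_all_pairs(ace_file1, ace_file2):
    """Usage 1b: all pairwise comparisons between the motifs in two files."""
    return [COMPAREACE, ace_file1, ace_file2]


def cmd_from_list(list_file, ace_col, mot_col, cutoff=None):
    """Usage 2: compare all motifs named in a tab-delimited list file.

    The column numbering convention is not stated in the source; pass
    whatever the program expects.
    """
    cmd = [COMPAREACE, "-all", list_file, str(ace_col), str(mot_col)]
    if cutoff is not None:
        cmd += ["-c", str(cutoff)]
    return cmd


def list_file_text(motifs):
    """Tab-delimited text, one motif per line: AlignACE file name, motif number."""
    return "".join(f"{name}\t{num}\n" for name, num in motifs)
```

Each returned list can be passed to subprocess.run(cmd, check=True) on a machine where the executables are installed, for example cmd_one_pair("run1.ace", 1, "run2.ace", 3) with hypothetical file names.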
Reference
Roth, F.P., Hughes, J.D., Estep, P.W. and Church, G.M., 1998, "Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation", Nat. Biotechnol. 16(10): 939-945. (See also "Comment", pp. 907-908.)
See also:
- AlignACE web site
- Other sequence analysis software installed at PSC.