EMBOSS

The EMBOSS (European Molecular Biology Open Software Suite) package contains over 200 programs for sequence analysis. EMBOSS is designed for classical sequence analysis, where the number of sequences is below 100 000. As a result, most of the tools are not well suited to raw NGS datasets, which contain millions of sequences (reads). Examples of application areas of EMBOSS tools are listed below.

Sequence alignment
Phylogeny
Hidden Markov models
Rapid database searching with sequence patterns
Protein motif identification, including domain analysis
EST analysis
Nucleotide sequence pattern analysis, for example, to identify CpG islands
Simple and species-specific repeat identification
Codon usage analysis for small genomes
Rapid identification of sequence patterns in large-scale sequence sets
Presentation tools for publication
RNA secondary structure prediction

License

Free to use and open source under GNU GPLv2.

Available

Puhti: 6.5.7
Chipster provides a graphical interface to many EMBOSS tools.

Usage

EMBOSS programs are available on Puhti as part of the biokit module. To use them, load the biokit module by running the command:

module load biokit

The biokit module sets up a collection of commonly used bioinformatics tools, including EMBOSS. Note, however, that other bioinformatics tools on Puhti have their own setup commands.
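To confirm that the module made EMBOSS available in your session, a quick check along these lines may help. This is a minimal sketch, not an official procedure: it assumes a POSIX-like shell, and the variable name emboss_status is only illustrative.

```shell
# Load biokit only if the module system is present (as it is on Puhti).
if command -v module >/dev/null 2>&1; then
    module load biokit
fi

# Check whether an EMBOSS program (here wossname) is now on the PATH.
# emboss_status is a hypothetical variable name used for illustration.
if command -v wossname >/dev/null 2>&1; then
    emboss_status="available"
else
    emboss_status="missing"
fi

echo "EMBOSS is $emboss_status"
```

If the check reports "missing", the module was probably not loaded in the current shell session.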

After loading biokit, you can start any of the EMBOSS programs by typing its name. For example:

wossname

The wossname command is a help tool that lists the available EMBOSS commands. You can also use it to search for EMBOSS tools by keyword.
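A keyword search with wossname, followed by a look at one program's options, could be sketched as follows. The keyword "alignment" and the program needle are examples chosen for illustration and do not come from the text above. The sketch assumes that biokit has already been loaded.

```shell
# Example search term; replace it with a keyword of your own.
keyword="alignment"

if command -v wossname >/dev/null 2>&1; then
    # List the EMBOSS programs whose descriptions match the keyword.
    wossname -search "$keyword"

    # Print the command-line options of one program.
    # needle is used here purely as an example.
    needle -help
else
    echo "EMBOSS not found on PATH; run 'module load biokit' first"
fi
```

The -help qualifier is a general EMBOSS option, so the same pattern should work for the other programs that wossname lists.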
