The EMBOSS (European Molecular Biology Open Software Suite) package contains over 200 programs for sequence analysis. EMBOSS is designed for classical sequence analysis, where the number of sequences is less than 100 000. Because of that, most of the tools are not efficient for raw NGS datasets, which contain millions of sequences (reads). Examples of application areas of EMBOSS tools are given below.

  • Sequence alignment
  • Phylogeny
  • Hidden Markov models
  • Rapid database searching with sequence patterns
  • Protein motif identification, including domain analysis
  • EST analysis
  • Nucleotide sequence pattern analysis, for example to identify CpG islands
  • Simple and species-specific repeat identification
  • Codon usage analysis for small genomes
  • Rapid identification of sequence patterns in large-scale sequence sets
  • Presentation tools for publication
  • RNA secondary structure prediction


Free to use and open source under GNU GPLv2.


Version on CSC's Servers

  • Puhti: 6.5.7
  • Chipster provides a graphical interface to many EMBOSS tools.


To make the EMBOSS programs available on the Puhti super-cluster, give the command:

module load biokit

The biokit module sets up a collection of commonly used bioinformatics tools, including EMBOSS. Note, however, that some bioinformatics tools on Puhti have separate setup commands.

After loading biokit, you can start any of the EMBOSS programs by typing its name. For example:
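
As a minimal sketch of launching a program by name, the commands below run seqret, the EMBOSS program that reads and rewrites sequences, on a small FASTA file. The file names and the sequence are hypothetical, chosen only for illustration, and the sketch assumes that module load biokit has already been run in the same session. The -auto qualifier is a general EMBOSS option that turns off interactive prompting.

```shell
# Assumption: 'module load biokit' has already been run in this session.

# Write a tiny FASTA file (hypothetical sequence, for illustration only).
cat > example.fasta <<'EOF'
>example_seq hypothetical test sequence
ATGGCGTACGCTTAGCGTACGATCGATCGGCTA
EOF

if command -v seqret >/dev/null 2>&1; then
    # Start the program by typing its name; -sequence names the input,
    # -outseq names the output, and -auto suppresses interactive prompts.
    seqret -sequence example.fasta -outseq example_copy.fasta -auto
else
    echo "seqret not found: load the biokit module first" >&2
fi
```

If a program is started with no qualifiers at all, EMBOSS asks for the required inputs interactively instead.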

The wossname command is a help tool that you can use to see which EMBOSS commands are available. You can also use it to search for EMBOSS tools by keyword.
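
A keyword search with wossname could look like the sketch below. The keyword alignment is an arbitrary example, and the guard around the call is there only so that the snippet fails gracefully when EMBOSS is not on the path. Run without arguments, wossname prompts for a search text, and an empty answer lists all programs.

```shell
# Assumption: 'module load biokit' has already been run in this session.
keyword="alignment"   # arbitrary example keyword

if command -v wossname >/dev/null 2>&1; then
    # List EMBOSS programs whose short description matches the keyword.
    wossname -search "$keyword"
else
    echo "wossname not found: load the biokit module first" >&2
fi
```

Once a program of interest has been found, adding the general -help qualifier to its name (for example needle -help) prints its command-line options.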


Last update: October 10, 2022