MetaPhlAn is a computational tool for profiling the composition of microbial communities from metagenomic sequencing data. MetaPhlAn relies on unique clade-specific marker genes identified from ~17,000 reference genomes.


Free to use and open source under the MIT License.


  • MetaPhlAn 3.0 is available in Puhti


In Puhti, MetaPhlAn is installed as part of the biopythontools module, which is compatible with gcc 9.1.0. To activate it, run the commands:

module load biokit
module load biopythontools

MetaPhlAn can automatically retrieve the MetaPhlAn database and create the Bowtie2 indexes it needs on-the-fly when the command is executed. By default, MetaPhlAn saves these index files to the MetaPhlAn installation directory, but in Puhti this is not possible. Because of that, users should use the --bowtie2db option to define a directory in which the database and index files are stored.

For example, in the case of project_2001234, the user could first create a directory for the databases:

cd /scratch/project_2001234
mkdir metaphlan_databases

A test input dataset for testing MetaPhlAn can be downloaded from the MetaPhlAn GitHub site:


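In the MetaPhlAn command, --bowtie2db is used to define the database directory. The relative path metaphlan_databases used here assumes that the command is run in the directory where the database directory was created (/scratch/project_2001234 in this case). In this example, the job is executed as an interactive batch job.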

sinteractive -m 4G -c 4
module load biokit
module load biopythontools
metaphlan --bowtie2db metaphlan_databases SRS014476-Supragingival_plaque.fasta.gz --input_type fasta > SRS014476-Supragingival_plaque_profile.txt
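
Once the profile file has been written, a quick way to check the result is to list only the species-level rows. The following is a minimal sketch, not part of the MetaPhlAn tool itself. It assumes that the profile is a tab-separated text file whose header lines start with #, with the clade name in column 1 and the relative abundance in column 3; check the header of your own file to confirm this, as the column layout may differ between MetaPhlAn versions. The function name species_rows is hypothetical and used here only for illustration.

```shell
# species_rows FILE
# Print clade name and relative abundance for species-level rows.
# Assumptions (verify against the header of your own profile file):
#   - header/comment lines start with '#'
#   - columns are tab-separated
#   - column 1 = clade name, column 3 = relative abundance
species_rows() {
    awk -F'\t' '
        /^#/         { next }           # skip header and comment lines
        $1 !~ /s__/  { next }           # keep only rows that reach species level
        $1 ~ /t__/   { next }           # drop strain-level rows, if present
        { print $1 "\t" $3 }            # clade name and relative abundance
    ' "$1"
}

# Usage, with the output file name from the example above:
#   species_rows SRS014476-Supragingival_plaque_profile.txt
```

Filtering on the s__ prefix relies on MetaPhlAn's convention of prefixing each taxonomic level in the clade name (k__ for kingdom down to s__ for species), so higher-level summary rows are left out of the listing.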

