QIIME
QIIME (Quantitative Insights Into Microbial Ecology) is a package for comparison and analysis of microbial communities, primarily based on high-throughput amplicon sequencing data (such as SSU rRNA) generated on a variety of platforms, but also supporting analysis of other types of data (such as shotgun metagenomic data). QIIME takes users from their raw sequencing output through initial analyses such as OTU picking, taxonomic assignment, and construction of phylogenetic trees from representative sequences of OTUs, and through downstream statistical analysis, visualization, and production of publication-quality graphics.
In 2017, a completely rewritten version, QIIME2, was released. Development of the original QIIME has stopped, and QIIME2 is strongly recommended for most uses.
License
Free to use and open source under the BSD 3-Clause License.
Available
- QIIME1: Puhti: 1.9.1
- QIIME2: Puhti: 2022.8, 2023.2, 2023.5, 2023.9-amplicon, 2023.9-shotgun, 2024.2-amplicon, 2024.2-shotgun, 2024.10-amplicon, 2024.10-metagenome, 2024.10-pathogenome
Usage
To load QIIME1 module on Puhti:
To use QIIME2, check available versions with:
Load desired version with e.g.:
After that you can start QIIME2 with command:
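The QIIME2 steps above can be sketched as follows. This is a minimal sketch, not the official instructions: it assumes the module is named qiime2 (as in the batch script later on this page), that Puhti's module system supports the Lmod command module spider, and that the 2024.10-amplicon version from the list above is the one wanted. The guard only keeps the snippet harmless on machines without the module command.

```shell
# Assumption: the module is named "qiime2", as in the batch script on this page.
QIIME2_MODULE="qiime2/2024.10-amplicon"   # any version listed under "Available"

if type module >/dev/null 2>&1; then
    module spider qiime2            # list the installed QIIME2 versions
    module load "$QIIME2_MODULE"    # load the chosen version
    qiime --help                    # start QIIME2 and print the available commands
else
    echo "The module command is not available here; run these steps on Puhti."
fi
```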
Distributions
The latest versions of QIIME2 come in different distributions: amplicon, metagenome, pathogenome and tiny. The distributions differ in which plugins they include. You can compare them on the QIIME2 home page.
CSC provides installations for the amplicon, metagenome and pathogenome distributions.
Additional plugins
CSC only maintains the basic distributions of QIIME2. If you need plugins not included in the basic distributions, you will need to install your own QIIME2 using the Tykky tool.
First select the distribution (amplicon/metagenome/pathogenome/tiny) that best meets your needs.
Download the corresponding environment file.
For example, the environment file for the 2024.10 amplicon distribution is named qiime2-amplicon-2024.10-py310-linux-conda.yml.
Check the installation instructions for the plugins you want to use.
If the additional plugins can be installed with Conda, you can simply add them to the end of the environment file.
If the plugins need additional installation steps, you can copy those steps into a text file and apply them with the conda-containerize update command, as described in the Tykky documentation.
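For the Conda-installable case, adding a plugin to the end of the environment file could look like the sketch below. Everything here is illustrative: example-env.yml is a small stand-in for the real environment file, q2-example-plugin is a hypothetical package name, and the sketch assumes that the dependencies list is the last section of the file.

```shell
# Work in a scratch directory so that no real file is modified.
WORKDIR="$(mktemp -d)"
ENV_FILE="$WORKDIR/example-env.yml"   # stand-in for the downloaded environment file

# Minimal stand-in content; a real QIIME2 environment file is much longer.
cat > "$ENV_FILE" <<'EOF'
channels:
  - conda-forge
  - bioconda
dependencies:
  - python=3.10
EOF

# Append a hypothetical Conda-installable plugin as the last dependency.
# The indentation must match that of the existing list entries.
echo "  - q2-example-plugin" >> "$ENV_FILE"

# Show the end of the file to confirm the new entry.
tail -n 2 "$ENV_FILE"
```

With a real environment file, check how the existing dependency entries are indented before appending, since a mismatched entry makes the YAML invalid.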
Installation:
module purge
module load tykky
mkdir qiime
conda-containerize new --mamba --prefix qiime qiime2-amplicon-2024.10-py310-linux-conda.yml
If necessary, run:
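As a hedged sketch of the update step for plugins that need extra installation commands: the file name extras.txt is hypothetical, the prefix qiime is the directory created above, and the --post-install option is used as described in the Tykky documentation. The last command adds the installation's bin directory to PATH so that the containerized qiime command is found.

```shell
# Hypothetical file holding the plugin's extra installation commands.
POST_INSTALL_FILE="extras.txt"
# Installation directory created with "mkdir qiime" above.
PREFIX="qiime"

if command -v conda-containerize >/dev/null 2>&1; then
    # Apply the extra steps to the existing Tykky installation.
    conda-containerize update "$PREFIX" --post-install "$POST_INSTALL_FILE"
    # Make the containerized qiime command available in this shell.
    export PATH="$PWD/$PREFIX/bin:$PATH"
else
    echo "conda-containerize not found; run 'module load tykky' on Puhti first."
fi
```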
Running
Note that many QIIME tasks involve heavy computing. Thus, these tasks should be executed as batch jobs.
QIIME jobs can be very disk intensive, especially in their handling of temporary files, so it is best to reserve fast local disk for them.
For interactive batch jobs, see the sinteractive documentation.
For normal batch jobs, you must reserve an NVMe disk area, which will then be used as the $TMPDIR area.
For example, to reserve 100 GB of local disk space:
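A minimal sketch: the --gres=nvme:100 directive is the one used in the batch script on this page, and the remaining lines are an optional check, added here for illustration, that prints where temporary files will be written. The variable name TMP_LOCATION is an assumption of this sketch.

```shell
#!/bin/bash
#SBATCH --gres=nvme:100   # reserve 100 GB of node-local NVMe disk

# Inside the job, report where temporary files will be written.
# Falls back to /tmp when TMPDIR is unset (for example outside a batch job).
TMP_LOCATION="${TMPDIR:-/tmp}"
echo "Temporary files go to: $TMP_LOCATION"

# Show how much space is available at that location.
df -h "$TMP_LOCATION"
```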
For example, the batch job script below runs the denoising step of the QIIME moving pictures tutorial as a batch job using eight cores.
#!/bin/bash
#SBATCH --job-name=qiime_denoise
#SBATCH --account=<project>
#SBATCH --time=01:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=16G
#SBATCH --partition=small
#SBATCH --gres=nvme:100
# Set up QIIME2
module load qiime2/2023.9-amplicon
# Run the task. Do not launch it with srun, as srun resets TMPDIR.
qiime dada2 denoise-single \
--i-demultiplexed-seqs demux.qza \
--p-trim-left 0 \
--p-trunc-len 120 \
--o-representative-sequences rep-seqs-dada2.qza \
--o-table table-dada2.qza \
--o-denoising-stats stats-dada2.qza \
--p-n-threads $SLURM_CPUS_PER_TASK
The maximum running time is set to 1 hour (--time=01:00:00). As QIIME2 uses thread-based parallelization, the job requests one task (--ntasks=1) with all cores on the same node (--nodes=1). This one task uses eight cores as parallel threads (--cpus-per-task=8), which can use up to 16 GB of memory in total (--mem=16G). Note that the number of cores must also be defined in the actual qiime command. This is done with the QIIME option --p-n-threads. In this case we use the $SLURM_CPUS_PER_TASK variable, which contains the --cpus-per-task value. We could just as well use --p-n-threads 8, but then we would have to remember to change the value whenever the number of reserved CPUs changes.
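One optional refinement, sketched here under the assumption that the script may also be tested outside a batch job: SLURM_CPUS_PER_TASK is unset in that case, so a default of one thread can be supplied. The helper name qiime_threads is hypothetical.

```shell
# Print the number of cores reserved by Slurm, or 1 when the variable is unset
# (for example when testing the script outside a batch job).
qiime_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

echo "Threads passed to --p-n-threads: $(qiime_threads)"
```

In the batch script, the option would then read --p-n-threads "$(qiime_threads)".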
The job is submitted to the batch job system with the sbatch command. For example, if the batch job file is named qiime_job.sh, the submission command is:
sbatch qiime_job.sh
Note
The use of tab-qiime to enable command completion for QIIME is known to cause problems on Puhti and should be avoided.