Trinity is used for de novo reconstruction of transcriptomes from RNA-seq data. Trinity combines three independent software modules, Inchworm, Chrysalis, and Butterfly, which are applied sequentially to process large volumes of RNA-seq reads. Trinity partitions the sequence data into many individual de Bruijn graphs, each representing the transcriptional complexity at a given gene or locus, and then processes each graph independently to extract full-length splicing isoforms and to tease apart transcripts derived from paralogous genes.
The Trinity module at CSC also includes the TransDecoder and Trinotate tools for analyzing the results of a Trinity run.
Free to use and open source under the [Broad Institute License](https://github.com/genome-vendor/trinity/blob/master/LICENSE).
Version on CSC's Servers
Puhti: 2.13.2, 2.11.0, 2.8.5
In Puhti, Trinity is set up with the commands:
module load biokit
module load trinity/2.13.2
Trinity should be used interactively in a compute node or, preferably, through the batch job system. Below is an example batch job file for Trinity.
#!/bin/bash
#SBATCH --job-name=trinity
#SBATCH --output=output_%j.txt
#SBATCH --error=errors_%j.txt
#SBATCH --time=48:00:00
#SBATCH --ntasks=1
#SBATCH --nodes=1
#SBATCH --cpus-per-task=6
#SBATCH --mem=24000
#SBATCH --account=project_1234567
#

module load trinity/2.13.2
Trinity --seqType fq --max_memory 22G --left reads.left.fq --right \
reads.right.fq --SS_lib_type RF --CPU $SLURM_CPUS_PER_TASK \
--output trinity_run_out --grid_exec sbatch_commandlist
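The script above reserves 6 computing cores on one node (--cpus-per-task=6) for at most 48 hours (--time=48:00:00). About 4 GB of memory is reserved for each core, so the total memory reservation is 6 * 4 GB = 24 GB (--mem=24000).
In Puhti, you must use the batch job option --account= to define the project to be used. Replace project_1234567, used in the example, with your own project. You can check your projects with the csc-projects command.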
In the actual Trinity command, the number of computing cores to be used (--CPU) is set using the environment variable $SLURM_CPUS_PER_TASK. This variable contains the value set with the --cpus-per-task SLURM option.
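The relation between the SLURM reservation and the Trinity options can be sketched as a small shell helper. This is only an illustration: the function name trinity_resource_flags is hypothetical, the 2 GB headroom simply mirrors the example script (24000 MB reserved, 22G given to Trinity), and SLURM_MEM_PER_NODE is assumed to hold the --mem value in megabytes inside a batch job.

```shell
# Sketch: derive Trinity's --CPU and --max_memory values from the SLURM reservation.
# Assumption: about 2 GB of the reservation is left as headroom, as in the example script.
trinity_resource_flags() {
    # $1: number of cores, defaults to SLURM_CPUS_PER_TASK (or 1 outside a batch job)
    # $2: reserved memory in MB, defaults to SLURM_MEM_PER_NODE (or 4000)
    trf_cpus="${1:-${SLURM_CPUS_PER_TASK:-1}}"
    trf_mem_mb="${2:-${SLURM_MEM_PER_NODE:-4000}}"
    trf_max_gb=$(( trf_mem_mb / 1000 - 2 ))
    echo "--CPU ${trf_cpus} --max_memory ${trf_max_gb}G"
}

# Values from the example batch job: 6 cores and 24000 MB
trinity_resource_flags 6 24000   # prints: --CPU 6 --max_memory 22G
```

Called without arguments inside a batch job, the helper falls back to the values SLURM exports, so the Trinity options stay in step with the #SBATCH lines.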
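In Puhti you can also use distributed computing to speed up the Trinity job. The example above does this with the definition --grid_exec sbatch_commandlist, which lets Trinity pass lists of commands to the sbatch_commandlist tool so that they can be run as separate batch jobs.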
When the batch job file is ready, it can be submitted to the batch queue system with the sbatch command.
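As a minimal sketch of the submission step, assuming the batch job file was saved under the hypothetical name trinity_job.sh:

```shell
job_file="trinity_job.sh"   # hypothetical name for the batch job file

if command -v sbatch >/dev/null 2>&1; then
    sbatch "$job_file"        # submit the job; sbatch prints the job ID
    squeue -u "$USER"         # list your queued and running jobs
else
    # Outside a SLURM environment there is nothing to submit to
    echo "sbatch not found: run this on a Puhti login node"
fi
```

The guard around sbatch is only there so that the snippet does not fail on a machine without SLURM; on Puhti the two commands can be typed directly.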
Please check the Trinity site for hints on estimating the required resources.
You can analyse the results of your Trinity job with autoTrinotate. It needs two files produced by a successful Trinity assembly:
1. FASTA-formatted nucleotide sequence file containing the final contigs created by Trinity (Trinity.fasta)
2. Gene-to-transcript map for the input FASTA file (Trinity.fasta.gene_to_trans_map)
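Before starting a long annotation run, it can be worth checking that both files exist and are not empty. The helper below is a sketch of such a check: the function name check_trinotate_inputs is hypothetical, and the file paths are passed as arguments because the location of the Trinity output depends on your run.

```shell
# Sketch: verify that the two autoTrinotate input files exist and are non-empty.
check_trinotate_inputs() {
    if [ "$#" -ne 2 ]; then
        echo "usage: check_trinotate_inputs <contigs.fasta> <gene_to_trans_map>" >&2
        return 2
    fi
    cti_status=0
    for cti_file in "$@"; do
        # -s is true only for an existing file with size greater than zero
        if [ ! -s "$cti_file" ]; then
            echo "missing or empty: $cti_file" >&2
            cti_status=1
        fi
    done
    return "$cti_status"
}

# Usage, with the file names given in the list above:
# check_trinotate_inputs Trinity.fasta Trinity.fasta.gene_to_trans_map
```

The function returns 0 when both files are usable, 1 when at least one is missing or empty, and 2 when it is called with the wrong number of arguments.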
You can launch autoTrinotate with command:
An autoTrinotate analysis can require substantial resources, so you should run the command in an interactive session started with sinteractive, or as a batch job.
AutoTrinotate produces an SQLite database file that can be further analyzed with command:
Last edited Wed Dec 1 2021