TBtools
[For version 0.665]
CJ (ccj0410@gmail.com)
South China Agricultural University
Overview
Rapid development of high-throughput sequencing (HTS) techniques has led biology into the “big-data” era. Data analysis with bioinformatics software or pipelines that rely on programming and a command-line environment is challenging and time-consuming for most wet-lab biologists, who would save time with tools that offer a user-friendly interface.
Thus, we present TBtools (a Toolkit for Biologists integrating various biological data handling tools), a stand-alone software package with a user-friendly interface. It has powerful data handling engines for both bulk sequence processing and interactive data visualization. It includes a large collection of functions that ease simple and routine but laborious work on biological data, such as bulk sequence extraction, gene set enrichment analysis, Venn diagram preparation, heatmap illustration, and comparative sequence visualization.
A Glance at TBtools’ Functions
Download and Installation
TBtools is platform-independent software that runs on any operating system with Java Runtime Environment 1.6 or newer. It is freely available to non-commercial users at
https://github.com/CJ-Chen/TBtools/releases
For users on any operating system (Windows, Mac, Linux, etc.):
- Download TBtools-crossplatform_XXX.rar.
- Unpack the rar file and obtain an executable jar file.
- Optional. To use the BLAST wrapper functions, users must install the BLAST package and add its bin directory to the PATH environment variable.
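Whether the BLAST bin directory is visible through the environment variables can be checked with a short script such as the sketch below. It assumes the BLAST+ executable names blastn, blastp and makeblastdb; which executables TBtools itself calls is not stated in this manual.

```python
import shutil


def find_blast_programs(names=("blastn", "blastp", "makeblastdb")):
    """Map each program name to its full path, or to None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_blast_programs().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If any program is reported as not found, the bin directory has probably not been added to PATH, or the terminal needs to be reopened for the change to take effect.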
For users under Windows, a better choice is:
- Download the TBtools_windows-32-bits-XXX.rar or TBtools_windows-64-bits-XXX.rar file.
- Unpack the rar file and obtain an exe file.
- Double-click the exe file, click “Next” and wait for the installation to finish.
Getting started with TBtools
For users who start TBtools from a jar file, there are two ways:
- Double-click the jar file; if that does not work, try the next way.
- Open the terminal (CMD or Powershell under Windows, Shell/Bash under Mac or Linux); type
java -Xmx2G -jar PathtoTBtools.jar
A command example under Windows
Windows users who have installed TBtools from an exe file can double-click the TBtools icon, and the main panel of TBtools will pop up.
In the main panel of TBtools, there is a main menubar and several buttons:
- Click “Version” to check whether the current TBtools is the latest version.
- Click “Citation” to see how to cite TBtools.
Usage of Key Functions
Obtain the Demo Data from ftp://118.24.17.128/TBtools%20-%20Demo%20Data/, using “uftp” as the account name and “12345678” as the password.
Bulk Sequence Extraction
Go to it:
Main menubar -> Sequence Toolkits -> Fasta Tools -> Amazing Fasta Extractor
Input:
- A target sequence file in Fasta format (ref https://en.wikipedia.org/wiki/FASTA), e.g.
>Unigene1 high expressing gene
ACGATCAGCTCAGCGACGATCGACTAGCTACGATCAGCTAGCTACGATCGACTAGCTAGCTACGA
ACGATCAGCTCAGCGACGATCGACTAGCTACGATCAGCTA
>Unigene2 low expressing gene
ACTCAGCTCAGCGACGATCAGCTCAGCGACGATCGACTAGCTACGACGACTAGCTACGA……
….
- A set of gene identifiers or regions, e.g.
##### Lines prefixed with # will be ignored
##### Examples for One Gene ###########
Unigene_1
Unigene_2
### ChrID StartPos EndPos
Chr_1 100000 102000
### GeneID ChrID StartPos EndPos #########
FinalGeneID Chr_1 100000 120000
Output:
Complete sequences or regions of sequences specified by users
Detailed Usage:
- Drag a target sequence file into the text-field or set it by clicking the “…” button
- Click “initialize” button to build a FA-index (if the index has already been built, TBtools will skip it)
- Set a path of an output file
- Set a set of IDs or sequence regions
- Click “Start”
- Optional. If users select “Just Show Dialog”, the extracted sequences are shown directly in a dialog. In this case, an output file does not need to be set.
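To make the three accepted request formats concrete, the following minimal Python sketch mimics the extraction outside TBtools. It is not TBtools' implementation: it loads the whole Fasta file into memory rather than using an FA-index, it assumes that region coordinates are 1-based and inclusive, and the "SeqID:Start-End" name given to unnamed regions is a hypothetical choice made for this example.

```python
def read_fasta(lines):
    """Parse Fasta text lines into {sequence ID: sequence}.

    The ID is the first word after '>'; the rest of the header is ignored.
    """
    seqs, current = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            seqs[current] = []
        elif current is not None:
            seqs[current].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in seqs.items()}


def extract(seqs, request_lines):
    """Yield (name, sequence) for each request line.

    Supported whitespace-separated line shapes:
      ID                        -> whole sequence
      SeqID Start End           -> region
      NewName SeqID Start End   -> region, renamed
    Lines starting with '#' are ignored.
    ASSUMPTION: coordinates are 1-based and inclusive.
    """
    for line in request_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) == 1:
            yield fields[0], seqs[fields[0]]
        elif len(fields) == 3:
            seq_id, start, end = fields[0], int(fields[1]), int(fields[2])
            # Hypothetical naming scheme for unnamed regions.
            yield f"{seq_id}:{start}-{end}", seqs[seq_id][start - 1:end]
        elif len(fields) == 4:
            name, seq_id = fields[0], fields[1]
            start, end = int(fields[2]), int(fields[3])
            yield name, seqs[seq_id][start - 1:end]
        else:
            raise ValueError(f"Unrecognised request line: {line!r}")


if __name__ == "__main__":
    fasta = [">Chr_1 demo", "ACGTACGTAC", "GGGTTT", ">Unigene_1", "ATGC"]
    requests = ["# comment", "Unigene_1", "Chr_1 3 6", "MyGene Chr_1 11 13"]
    for name, seq in extract(read_fasta(fasta), requests):
        print(f">{name}\n{seq}")
```

Note that an identifier in the request list must match the first word of a Fasta header exactly; a request for an ID that is absent from the sequence file raises a KeyError in this sketch.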
Sequence Extraction from Genome According to Gene Structure Annotation File (.gff3/gtf)
Go to it:
Main menubar -> Sequence Toolkits -> Gff3/GTF Manipulator -> Gtf/Gff3 Sequences Extractor
Input:
- A target sequence set of genome in Fasta format
- A corresponding gene structure annotation file in gff3/gtf format (ref
https://en.wikipedia.org/wiki/General_feature_format)
Output:
A file storing sequences of a user-specified feature (CDS, exon, mRNA, gene, UTR, promoter, etc.)
Detailed Usage:
- Set a gff3/gtf file
- Click the “initialize” button and TBtools will provide available features for users to select
- Select a target feature, e.g. “CDS”
- Select an ID to group sequence segments of specific features, e.g. “Parent”
- Set a genome sequence file
- Set an output file
- Optional. To extract sequences upstream or downstream of the selected feature, e.g. 2000 bp upstream of the CDS (often referred to as “promoter regions”), enter “2000” in the corresponding text-field.
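The role of the grouping ID is easiest to see in code. The sketch below is a simplified illustration, not TBtools' implementation: it collects all segments of one feature type from GFF3 lines, groups them by an attribute such as “Parent”, joins them in genomic order, and reverse-complements the result for features on the minus strand. It assumes GFF3-style attributes (key=value pairs separated by semicolons), a single value per grouping attribute, and a genome already loaded as a dictionary; GTF attributes and flanking regions are not handled.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def parse_attributes(column9):
    """Turn a GFF3 attribute column ('ID=x;Parent=y') into a dict."""
    return dict(item.split("=", 1) for item in column9.strip().split(";") if "=" in item)


def extract_feature(gff_lines, genome, feature="CDS", group_by="Parent"):
    """Return {group ID: joined sequence} for one feature type."""
    groups = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != feature:
            continue
        attrs = parse_attributes(cols[8])
        if group_by in attrs:
            # GFF3 coordinates are 1-based and inclusive.
            groups[attrs[group_by]].append((cols[0], int(cols[3]), int(cols[4]), cols[6]))
    result = {}
    for group_id, segments in groups.items():
        segments.sort(key=lambda seg: seg[1])  # ascending genomic order
        joined = "".join(genome[chrom][start - 1:end] for chrom, start, end, _ in segments)
        strand = segments[0][3]
        result[group_id] = reverse_complement(joined) if strand == "-" else joined
    return result


if __name__ == "__main__":
    genome = {"Chr_1": "AAATGCCCGGGTTTAAACCC"}
    gff = [
        "##gff-version 3",
        "Chr_1\t.\tCDS\t3\t5\t.\t+\t0\tID=cds1;Parent=mRNA1",
        "Chr_1\t.\tCDS\t9\t11\t.\t+\t0\tID=cds2;Parent=mRNA1",
        "Chr_1\t.\tCDS\t12\t14\t.\t-\t0\tID=cds3;Parent=mRNA2",
        "Chr_1\t.\tCDS\t18\t20\t.\t-\t0\tID=cds4;Parent=mRNA2",
    ]
    for group_id, seq in extract_feature(gff, genome).items():
        print(f">{group_id}\n{seq}")
```

Grouping by “Parent” joins the CDS segments of one transcript into a single coding sequence, whereas grouping by “ID” would generally keep each segment separate.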
BLAST Wrapper and visualization
Go to it:
Main menubar -> Blast -> Blast Waper -> Several Sequences To a Big File [Commonly Used]
Input:
- A set of query sequences. Fasta format is required only when there is more than one sequence.
- A target sequence file, e.g. transcriptome, genome.
Output:
BLAST result in a user-specified format (xml, table, pairwise)
Detailed Usage:
- Paste a set of query sequences or drag a sequence file and drop it into the text-area
- Set a target sequence file in fasta format
- Set the path of an output file to store the BLAST result. Clicking the “Temp” button generates an intermediate file, which TBtools deletes automatically on exit.
- Click “Start”
- When the BLAST process has finished, users can click the “Visualize” button to invoke the TBtools BLAST result visualization functions (only valid for the XML output format).
- Optional. Detailed parameters can be changed in the top-right panel and under “Other Options”. In most cases the defaults work well.
Example of visualization of BLAST results
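For readers who want to reproduce a comparable search on the command line, the sketch below builds the two BLAST+ commands that correspond to the steps above: formatting the target file as a database, then searching it. The exact commands and parameters that TBtools issues internally are not documented here, so this is an assumed equivalent rather than a copy. In BLAST+, -outfmt 5 gives XML, 6 gives a tabular result and 0 gives the pairwise view. The file names are placeholders.

```python
import subprocess


def blast_commands(query, target, out, program="blastn", dbtype="nucl", outfmt=5):
    """Return two argument lists: build the database, then run the search."""
    make_db = ["makeblastdb", "-in", target, "-dbtype", dbtype]
    search = [program, "-query", query, "-db", target,
              "-outfmt", str(outfmt), "-out", out]
    return make_db, search


def run_blast(query, target, out, **options):
    """Run both steps; raises CalledProcessError if either command fails."""
    for command in blast_commands(query, target, out, **options):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Placeholder file names; nothing is executed here, the commands are only printed.
    for command in blast_commands("query.fa", "genome.fa", "result.xml"):
        print(" ".join(command))
```

For protein sequences, program="blastp" and dbtype="prot" would be passed instead.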
Gene Ontology Enrichment Analysis
Go to it:
Main menubar -> GO and KEGG -> GO Enrichment
Input:
- A go-basic.obo file downloaded from http://purl.obolibrary.org/obo/go/go-basic.obo
- A GO annotation background, formatted as below. The first column contains gene identifiers. If multiple genes are annotated to the same GO term, commas can be used to separate them, e.g. “Unigene1,Unigene2 GO:0005509”. The second column contains GO numbers. If one gene is annotated to several GO terms, commas can be used to separate them, e.g. “Unigene1 GO:0008483,GO:0030170”.
- A gene set of interest for enrichment analysis.
Output:
Eight files will be generated; an example is shown below. The GO annotation background is parsed according to the Gene Ontology information stored in the go-basic.obo file. GO enrichment analysis is conducted for three categories: Biological Process, Cellular Component, and Molecular Function. Users new to enrichment analysis are advised to use the .final.xls result directly. Files suffixed with “*.sorted.padjust” can be used for further analyses.
Detailed Usage:
- Set the latest go-basic.obo file
- Set a GO annotation background
- Set a gene set of interest
- Set an output directory. Users can also set a prefix for naming the output files.
- Optional. Clicking “Download go-basis.obo …” makes TBtools download the latest go-basic.obo file.
- Optional. GO-slim could be used in enrichment analysis.
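The statistical test that TBtools applies is not described in this section. As an illustration of what an enrichment analysis on this input computes, the sketch below parses the two-column annotation background, runs a one-sided hypergeometric test per GO term, and adjusts the p-values with the Benjamini-Hochberg procedure. Both the test and the correction are assumptions chosen because they are common for this task, and the sketch does not propagate annotations to parent terms using go-basic.obo.

```python
from collections import defaultdict
from math import comb


def read_background(lines):
    """Parse 'genes GO-terms' lines; both columns may be comma-separated lists."""
    term_to_genes = defaultdict(set)
    for line in lines:
        if not line.strip():
            continue
        genes, terms = line.split()[:2]
        for term in terms.split(","):
            term_to_genes[term].update(genes.split(","))
    return term_to_genes


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return total / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted, running_min = [0.0] * m, 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def enrich(term_to_genes, gene_set):
    """Return (term, hits, p-value, adjusted p-value) rows sorted by p-value."""
    background = set().union(*term_to_genes.values())
    selected = set(gene_set) & background
    rows = []
    for term, genes in term_to_genes.items():
        hits = len(genes & selected)
        if hits:
            p = hypergeom_upper_tail(hits, len(background), len(genes), len(selected))
            rows.append((term, hits, p))
    adjusted = benjamini_hochberg([p for _, _, p in rows])
    return sorted(((t, h, p, q) for (t, h, p), q in zip(rows, adjusted)),
                  key=lambda row: row[2])


if __name__ == "__main__":
    background = read_background([
        "Unigene1,Unigene2\tGO:0005509",
        "Unigene3\tGO:0005509,GO:0008483",
        "Unigene4\tGO:0008483",
        "Unigene5\tGO:0030170",
    ])
    for row in enrich(background, ["Unigene1", "Unigene2"]):
        print(row)
```

In this sketch only genes that appear in the annotation background count towards the totals, so genes of interest without any GO annotation are silently ignored.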
KEGG Pathway Enrichment Analysis
Go to it:
Main menubar -> GO and KEGG -> KEGG Enrichment
Input:
- A file storing KEGG pathway ontology information. Users can prepare it using the “Make One Backend File From Web” button.
- A KEGG annotation background, formatted as below. The first column contains gene identifiers. If multiple genes are annotated to the same KO number, comma could be used to…