Transit - Annotation Files to use for Pathway Enrichment
- associations file: ORF id -> pathway id
- pathways file: pathway id -> name/description
- FET, GSEA, ONT: the enrichment methods each pathway set can be used with (selected with the -M flag, e.g. "-M FET", in the transit pathway_enrichment command)
- FET = Fisher's Exact Test; GSEA = Gene Set Enrichment Analysis (Subramanian 2005, PMID 16199517); ONT = Ontologizer (Bauer 2008, PMID 18511468)
- Sanger roles: from Cole et al. (1998, PMID 9634230); unique to Mtb
- COG gene and pathway counts exclude categories R ("General function prediction only") and S ("Function unknown")
- the GenBank accession number should be used to obtain the genome sequence for mapping reads with TPP
- the corresponding prot_table (see the genomes page) should be used for running resampling, so that the genes have the conventional ORF ids needed for pathway analysis
- For counts of genes with COG pathway assignments in the tables below, we exclude the following large but meaningless pathways:
- R - General function prediction only
- S - Function unknown
- For counts of genes with Sanger pathway assignments in the tables below, we exclude the following large but meaningless pathways:
- V - Conserved hypotheticals
- VI - Unknowns
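As a rough illustration of how the gene and pathway counts with exclusions could be reproduced, here is a minimal Python sketch. It assumes the associations file is tab-separated with one "ORF id, pathway id" pair per line; that layout, the function name count_assignments, and the sample ORF-to-COG assignments are all hypothetical and are not taken from the actual Transit files.

```python
import csv
import io

# Hypothetical sample standing in for an associations file
# (assumed layout: ORF id <TAB> pathway id, one association per line).
# The assignments below are made up for illustration only.
SAMPLE_ASSOCIATIONS = """\
Rv0001\tL
Rv0002\tL
Rv0003\tR
Rv0004\tS
Rv0005\tL
Rv0005\tR
Rv0006\tJ
"""

# COG categories excluded from the counts:
# R = General function prediction only, S = Function unknown
EXCLUDED_COG = {"R", "S"}


def count_assignments(handle, excluded):
    """Return (number of genes, number of pathways) after dropping excluded pathways."""
    genes, pathways = set(), set()
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2 or row[0].startswith("#"):
            continue  # skip blank, short, or comment lines
        orf, pathway = row[0].strip(), row[1].strip()
        if pathway in excluded:
            continue  # association to an excluded pathway does not count
        genes.add(orf)
        pathways.add(pathway)
    return len(genes), len(pathways)


n_genes, n_pathways = count_assignments(io.StringIO(SAMPLE_ASSOCIATIONS), EXCLUDED_COG)
print(n_genes, n_pathways)
```

A gene assigned only to an excluded pathway drops out of the gene count, while a gene with at least one other assignment is still counted. To apply the same idea to a real associations file, pass an open file handle instead of the in-memory sample, after checking that the file actually follows the assumed two-column layout.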
Mycobacterium tuberculosis H37Rv (NC_000962.3) (total ORFs=4019)
Mycobacterium smegmatis mc2 155 (NC_008596.1) (total ORFs=6719)
Mycobacterium abscessus ATCC 19977 (NC_010397.1) (total ORFs=4920)
Mycobacterium avium 104 (NC_008595.1) (total ORFs=5120)
(there are no COG annotations for avium)
Note: if you would like to request pathway files for other
organisms, please email me at ioerger@cs.tamu.edu.