CoReMicrob: Computational Resources for Microbiome

Major Databases for Microbiome Data

This page lists major databases that provide raw data from microbiome studies, with links to their websites.

Major Databases for Metagenome Data Retrieval

Database Name | Description | PubMed ID
NCBI Sequence Read Archive (SRA) | Stores raw sequencing data and alignment information | 34850094
EMBL-EBI European Nucleotide Archive (ENA) | Includes raw sequence data, sequence information | 39558171
DDBJ Sequence Read Archive (DRA) | Stores raw sequencing data and alignment information | 36420889
China National GeneBank DataBase | Built for biological big data sharing and application services | 32952115

Major Curated Databases for Metagenome Studies

Database Name | Description | PubMed ID
MicrobiomeDB | Discovery tool empowering researchers to leverage metadata to interrogate microbiome datasets. | 29106667
GMrepo | Curated database of consistently annotated human gut metagenomes | 34788838
QIITA | Provides database and compute resources to the global community | 30275573
Indian Microbiome Database | Comprehensive database documenting microbiome research in India. | NA
MGnify | Facilitates assembly, analysis, and archiving of microbiome-derived nucleic acid sequences. | 36477304
MicrobiomeHD | Standardized database of human gut microbiome studies in health and disease | 29209090
IBDMDB | An integrated resource for analyzing the gut microbial ecosystem in the context of IBD | 31142855
Human Gut Microbiome Atlas | Presents global shotgun metagenomics data on the normal human gut microbiome from 20 countries, with species abundance, gene richness, and enterotypes | NA
Human Microbiome Compendium | Includes over 160,000 samples of publicly available 16S rRNA amplicon sequencing data, all processed using the same pipeline and reference database | 37873416
NIH Human Microbiome Project | Contains studies of dynamic changes in the microbiome and host under three conditions: pregnancy and preterm birth; inflammatory bowel diseases; and prediabetes. | 31142853
H3Africa | Expanding the human gut microbiome atlas of Africa. | 39880958
curatedMetagenomicData | The curatedMetagenomicData package provides standardized, curated human microbiome data for novel analyses. It includes gene families, marker abundance, marker presence, pathway abundance, pathway coverage, and relative abundance for samples collected from different body sites. | 29088129
microbiomap | 168,000 public 16S gut microbiome samples processed and integrated at microbiomap.org. | 39848248

Major Reference Databases used for Metagenome Studies

Database Name | Description | PubMed ID
NCBI RefSeq | Comprehensive, curated collection of reference sequences, including genomes for taxonomy. | 39526381
MARMICRODB | Database for taxonomic classification of (marine) metagenomes. | zenodo
MGnify Database | Database for microbiome-derived sequences with taxonomic annotations and analysis tools. | 36477304
IMG | Integrated genome and metagenome database. | 22194640
IMG/PR | A database of plasmids from genomes and metagenomes with rich annotations and metadata. | 37930866
IMG/VR | An expanded database of uncultivated virus genomes. | 36399502
Kraken2 Database | Database tailored for use with Kraken2, optimized for fast and accurate metagenomic assignments. | 31779668
NMPFamsDB | A database of novel protein families from microbial metagenomes and metatranscriptomes. | 37811892
SILVA | Ribosomal RNA database for taxonomy and phylogenetics, widely used for microbial studies. | 23193283
RDP (Ribosomal Database Project) | High-quality ribosomal RNA sequences for taxonomy and functional annotations. | 24288368
Greengenes | 16S rRNA database for microbial taxonomy and phylogeny (limited updates). | 16820507
GTDB | A standardized bacterial and archaeal taxonomy based on phylogenomics. | 34520557
UniProtKB | Protein sequence and functional information database, supports taxonomy-based analysis. | 36408920
EggNOG | Functional and phylogenomic annotation of genes, including taxonomic classifications. | 36399505
KEGG | Comprehensive resource for biological pathways and genomes, with taxonomy support. | 36300620
PROKKA Database | Specialized annotation database for bacterial genomes, used in taxonomy assignments. | 24642063
MarFERReT | An open-source, version-controlled reference library of marine microbial eukaryote functional genes. | 38129449
MetaCyc | Database of metabolic pathways and enzymes. | 31586394
CAZy | The carbohydrate-active enzyme database. | 34850161
MEROPS | A database for peptidases and their inhibitors, providing a comprehensive resource for the study of proteolytic enzymes. | 29145643
SEED | A framework for genome annotation, providing curated subsystems and facilitating comparative genomics and metabolic reconstruction. | 24293654
TIGRFAMs | A collection of protein families featuring curated multiple sequence alignments, Hidden Markov Models (HMMs), and associated annotation, useful for protein function prediction. | 23197656
COG (Clusters of Orthologous Groups) | A database of orthologous gene families, facilitating functional annotation of proteins and genome evolution studies. | 39494517
CARD (Comprehensive Antibiotic Resistance Database) | Curated database for AMR genes and associated resistance mechanisms. | 36263822
MEGARes | Provides resistance gene annotations tailored for metagenomic data. | 31722416
VFDB (Virulence Factor Database) | Comprehensive database of virulence factors in bacterial pathogens. | 15608208
MvirDB | Broad repository for virulence and AMR genes, including toxins and pathogenicity islands. | 17090593
PATRIC | Integrative resource for AMR, virulence factors, and pathogen genomes. | 21896772
T3DB (Toxin and Toxin Target Database) | Contains detailed information about toxins, toxicants, and their effects on genes and proteins. | 19897546
ISfinder | A database of bacterial insertion sequences, often linked to resistance gene mobilization. | 16381877
CyanoBase | The genome database for cyanobacteria. | 27899668

Metagenome Assembled Genomes and Microbiome Collections from other Organisms

Database Name | Description | PubMed ID
ELGG | A compendium of 32,277 metagenome-assembled genomes and over 80 million genes from the early-life human gut microbiome | 36050292
UHGG | A unified catalog of 204,938 reference genomes from the human gut microbiome | 32690973
VMGC | A multi-kingdom collection of 33,804 reference genomes for the human vaginal microbiome | 38907008
IMGG | A high-quality genome compendium of the human gut microbiome of Inner Mongolians | 36604505
GMMC | The multi-kingdom microbiome of the goat gastrointestinal tract | 37779211
GKGMC | Ecological niches and assembly dynamics of diverse microbial consortia in the gastrointestine of goat kids | 38365259
RUG | Compendium of 4,941 rumen metagenome-assembled genomes for rumen microbiome biology and enzyme discovery | 31375809
African Boran MAGs | 1200 high-quality metagenome-assembled genomes from the rumen of African cattle and their relevance in the context of sub-optimal feeding | 32883364
Japan RUGs | Identification of 146 Metagenome-assembled Genomes from the Rumen Microbiome of Cattle in Japan | 36273894
iMGMC | An Integrated Metagenome Catalog Reveals New Insights into the Murine Gut Microbiome | 32130896
Murine MAGs | Metagenome-Assembled Genomes from Murine Fecal Microbiomes Dominated by Uncharacterized Bacteria | 36779794
CMMG | Comprehensive mouse microbiota genome catalog reveals major difference to its human counterpart | 35259160
Chicken Gut MAGs | Metagenome-assembled genomes and gene catalog from the chicken gut microbiome aid in deciphering antibiotic resistomes | 34795385
GIT | Duck gut metagenome reveals the microbiome signatures linked to intestinal regional, temporal development, and rearing condition | 39135685
Horse MAGs | Expanded catalogue of metagenome-assembled genomes reveals resistome characteristics and athletic performance-associated microbes in horse | 36631912
PIG MAGs | A reference gene catalogue of the pig gut microbiome | 27643971
HRGM2 | A human gut metagenome-assembled genome catalogue spanning 41 countries supports genome-scale metabolic models | 41345261
MAGIC | A reference gene catalogue of the pig gut microbiome | 39591974
PIGC | Expanded catalog of microbial genes and metagenome-assembled genomes from the pig gut microbiome | 33597514
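
The last column of each table above holds a PubMed ID, or a placeholder such as NA or zenodo where no PubMed record is listed. As a minimal sketch, a numeric ID can be turned into a link to its PubMed record as shown below. The helper name `pubmed_url` and the sample dictionary are illustrative assumptions and are not part of this page; the sketch only builds the URL and does not contact PubMed.

```python
from typing import Optional

# Base address of PubMed record pages; a record lives at <base>/<PMID>/.
PUBMED_BASE = "https://pubmed.ncbi.nlm.nih.gov"


def pubmed_url(identifier: str) -> Optional[str]:
    """Return the PubMed URL for a numeric PMID.

    Hypothetical helper: returns None for non-numeric entries such as
    'NA' or 'zenodo', which appear in the tables where no PMID is given.
    """
    identifier = identifier.strip()
    if identifier.isdigit():
        return f"{PUBMED_BASE}/{identifier}/"
    return None


# A few (database name, PubMed ID column) pairs copied from the tables.
sample_entries = {
    "NCBI Sequence Read Archive (SRA)": "34850094",
    "Indian Microbiome Database": "NA",
    "MARMICRODB": "zenodo",
}

for name, pmid in sample_entries.items():
    link = pubmed_url(pmid)
    print(f"{name}: {link if link else 'no PubMed record listed'}")
```

Entries without a numeric ID are reported as having no PubMed record rather than guessed at, so the output stays faithful to what the tables actually list.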