Detailed description page of SalivaDB

This page displays the result of the user's query in tabular form.

SAL_21853 details

Primary information
SALID	SAL_21853
Biomarker name	Streptococcus mutans
Biomarker Type	NA
Sampling Method	NA
Collection Method	Samples of human saliva collected
Analysis Method	Pyrosequencing
Collection Site	Whole Saliva
Disease Category	Healthy
Disease/Condition	Healthy
Disease Subtype	NA
Fold Change/Concentration	NA
Up/Downregulated	NA
Exosomal	NA
Organism	Homo sapiens
PMID	22962346
Year of Publication	2012
Biomarker ID	1309
Biomarker Category	Microbe
Sequence	NZ_JAFEVV000000000.1
Title of study	Comparing clustering and pre-processing in taxonomy analysis
Abstract of study	MOTIVATION: Massively parallel sequencing allows for rapid sequencing of large numbers of sequences in just a single run. Thus, 16S ribosomal RNA (rRNA) amplicon sequencing of complex microbial communities has become possible. The sequenced 16S rRNA fragments (reads) are clustered into operational taxonomic units and taxonomic categories are assigned. Recent reports suggest that data pre-processing should be performed before clustering. We assessed combinations of data pre-processing steps and clustering algorithms on cluster accuracy for oral microbial sequence data. RESULTS: The number of clusters varied up to two orders of magnitude depending on pre-processing. Pre-processing using both denoising and chimera checking resulted in a number of clusters that was closest to the number of species in the mock dataset (25 versus 15). Based on run time, purity and normalized mutual information, we could not identify a single best clustering algorithm. The differences in clustering accuracy among the algorithms after the same pre-processing were minor compared with the differences in accuracy among different pre-processing steps.
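
The abstract names purity and normalized mutual information (NMI) as accuracy measures but does not define them. The sketch below shows one common way to compute both for a clustering of reads against known species labels. It is not taken from the study: the function names, the toy labels, and the choice to normalize NMI by the arithmetic mean of the two entropies are assumptions made for illustration, and the study may have used a different normalization.

```python
import math
from collections import Counter


def purity(true_labels, cluster_ids):
    """Fraction of items that carry the majority true label of their cluster."""
    by_cluster = {}
    for label, cluster in zip(true_labels, cluster_ids):
        by_cluster.setdefault(cluster, Counter())[label] += 1
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(true_labels)


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((k / n) * math.log(k / n) for k in Counter(labels).values())


def nmi(true_labels, cluster_ids):
    """Mutual information divided by the mean of the two entropies (assumed normalization)."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, cluster_ids))
    label_counts = Counter(true_labels)
    cluster_counts = Counter(cluster_ids)
    mutual_info = sum(
        (k / n) * math.log(k * n / (label_counts[t] * cluster_counts[c]))
        for (t, c), k in joint.items()
    )
    mean_entropy = (entropy(true_labels) + entropy(cluster_ids)) / 2
    # Both assignments are constant when the mean entropy is zero; treat as identical.
    return mutual_info / mean_entropy if mean_entropy > 0 else 1.0


# Hypothetical toy data: six reads from three species, grouped into three clusters.
species = ["A", "A", "B", "B", "C", "C"]
clusters = [0, 0, 1, 1, 1, 2]
print(purity(species, clusters))  # 5 of 6 reads match their cluster's majority species
print(nmi(species, clusters))
```

Purity rewards clusterings that split species into many small clusters, whereas NMI penalizes such over-splitting, which is one reason the two measures are often reported together.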