GlycoPP is a webserver for predicting potential N- and O-glycosites in prokaryotic protein sequences. An N-glycosite is an Asn residue, and an O-glycosite is a Ser or Thr residue, to which a glycan is attached covalently and enzymatically at the amide or hydroxyl group, respectively.
GlycoPP Version 1.0 is the first open-access, web-based, and highly accurate glycosylation prediction software made available for the analysis of prokaryotic protein sequences. The GlycoPP prediction programmes are trained on the largest available dataset of its kind: 107 N-glycosites and 116 O-glycosites extracted from 59 experimentally characterized prokaryotic glycoproteins, as obtained from the first release of ProGlycProt (June 2011). This dataset includes validated N-glycosites from the phyla Crenarchaeota and Euryarchaeota (domain Archaea) and Proteobacteria (domain Bacteria), and validated O-glycosites from the phyla Actinobacteria, Bacteroidetes, Firmicutes and Proteobacteria (domain Bacteria).
The webserver provides prediction results for N- or O-glycosites using any one of the following four user-selectable, SVM (Support Vector Machine) based prediction approaches:
1. Binary Profile of Patterns (BPP)
2. Composition Profile of Patterns (CPP)
3. PSSM Profile of Patterns (PPP)
4. Hybrid approaches:
   BPP+ASA for N-glycosite prediction
   PPP+ASA for O-glycosite prediction
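As a rough illustration of the first two feature types, the sketch below one-hot encodes a sequence window around a candidate residue (in the spirit of BPP) and computes its amino acid composition (in the spirit of CPP). The window size (`flank`), the padding symbol `X`, and all function names are assumptions made for this example; the exact feature definitions used by GlycoPP may differ.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "X"  # assumed padding symbol for windows that run past a terminus


def extract_window(seq, center, flank=3):
    """Return the residues around 0-based index `center`.

    The window has length 2 * flank + 1 and is padded with PAD
    where it extends beyond either end of the sequence.
    """
    padded = PAD * flank + seq + PAD * flank
    return padded[center:center + 2 * flank + 1]


def binary_profile(window):
    """One-hot encode a window: 20 values per position.

    Padding (or any non-standard symbol) becomes an all-zero block.
    """
    vec = []
    for res in window:
        vec.extend(1 if res == aa else 0 for aa in AMINO_ACIDS)
    return vec


def composition_profile(window):
    """Fraction of each standard amino acid among the non-padding residues."""
    residues = [r for r in window if r in AMINO_ACIDS]
    n = len(residues)
    return [residues.count(aa) / n if n else 0.0 for aa in AMINO_ACIDS]


if __name__ == "__main__":
    seq = "MKNSTA"
    window = extract_window(seq, center=2, flank=2)  # window around Asn at index 2
    print(window)
    print(len(binary_profile(window)), sum(binary_profile(window)))
```

Vectors of this kind would then be passed to an SVM classifier. A PSSM-based profile would replace the one-hot values with position-specific scores from an alignment tool, which is not shown here.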
In view of the current understanding that glycosylation in bacteria occurs on folded proteins, the two hybrid approaches, BPP+ASA and PPP+ASA, use predicted surface accessibility (ASA) in addition to the BPP or PPP features described above to predict glycosites in the input protein sequences. These programmes accept one or more amino acid sequences in FASTA format as input. Results are returned in tabular format, with potential glycosites shown in green along with the corresponding SVM score.
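To make the input side concrete, the following sketch parses FASTA text and lists the residues that could in principle be glycosites: Asn for N-glycosylation and Ser/Thr for O-glycosylation. It does not reproduce the server's scoring. The function names `parse_fasta` and `candidate_sites` are hypothetical and exist only for this example.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping header line -> sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def candidate_sites(seq, kind):
    """Return (1-based position, residue) pairs for candidate glycosites.

    kind="N" selects Asn (N); kind="O" selects Ser (S) and Thr (T).
    """
    targets = {"N": "N", "O": "ST"}[kind]
    return [(i + 1, res) for i, res in enumerate(seq) if res in targets]


if __name__ == "__main__":
    fasta = ">example_protein\nMKNS\nTAN\n"
    for name, seq in parse_fasta(fasta).items():
        print(name, candidate_sites(seq, "N"), candidate_sites(seq, "O"))
```

Every listed residue is only a candidate. Which of them is reported as a potential glycosite depends on the SVM score assigned by the selected prediction approach.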
1. Users are encouraged to supplement the prediction results with complementary evidence, such as the presence of signal peptides or transmembrane domains, the sub-cellular localization of the protein, the presence of certain oligosaccharyltransferases (OSTs) or glycosyltransferases (GTs) in the genome of the organism (which indicates the likely type and mode of glycosylation), known glycosylation in a close homologue, and available experimental data on linkage types, attached sugars, etc. This helps in interpreting the results and in assessing their biological significance.
2. The current models are trained on a limited set of all available prokaryotic glycosites and may not be useful for proteins or organisms that differ substantially from those used to train the tool.
Please cite the following paper if you use this server:
Chauhan JS, Bhat AH, Raghava GPS, Rao A (2012) GlycoPP: A Webserver for Prediction of N- and O-Glycosites in Prokaryotic Protein Sequences. PLoS ONE.