






TumorHPD Algorithm


Data set used
  1. The main data set contains 651 peptides taken from the TumorHoPe database (positive set) and an equal number of peptides taken from the Swiss-Prot database (negative set). This data set was used for calculating amino acid composition and dipeptide composition.
  2. The N5 terminal data set contains the first five residues, and the C5 terminal data set contains the last five residues, of each peptide in the main data set.
  3. The N10 terminal data set contains the first ten residues, and the C10 terminal data set contains the last ten residues, of each peptide in the main data set.
  4. The data set of short peptides (between 4 and 10 residues long) consists of 469 peptides from the main data set.
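
The terminal data sets described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the server: the function names are hypothetical, the length range is assumed to be inclusive, and the optional padding of peptides shorter than the window with the dummy residue "X" is an assumption, since the page does not say how such peptides were handled.

```python
def terminal_windows(peptide, n, pad=None):
    """Return the (N-terminal, C-terminal) windows of n residues each."""
    if len(peptide) < n:
        if pad is None:
            raise ValueError(f"peptide shorter than {n} residues: {peptide!r}")
        # Assumed policy (not stated in the source): fill the missing
        # positions with a single-character pad symbol such as "X".
        filler = pad * (n - len(peptide))
        return peptide + filler, filler + peptide
    return peptide[:n], peptide[-n:]


def short_peptides(peptides, min_len=4, max_len=10):
    """Keep peptides whose length lies in [min_len, max_len] (inclusive, assumed)."""
    return [p for p in peptides if min_len <= len(p) <= max_len]
```

For example, `terminal_windows("ACDEFGHIKL", 5)` gives the N5 window `"ACDEF"` and the C5 window `"GHIKL"`.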


Prediction approach:

We used the following approaches to develop the SVM models:

Amino acid composition: The composition profile of a pattern is the percentage frequency of each amino acid in a fixed-length sequence pattern. The percentage frequencies of all 20 natural amino acids in the pattern are used as a 20-dimensional input vector for the SVM.
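
A minimal sketch of this feature in Python follows. The function name, the alphabetical one-letter ordering of the residues, and the choice to ignore non-standard characters in the denominator are assumptions made for illustration.

```python
from collections import Counter

# The 20 natural amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return 20 percentage frequencies, one per natural amino acid."""
    counts = Counter(sequence.upper())
    # Only standard residues contribute to the total (assumed convention).
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [100.0 * counts[aa] / total for aa in AMINO_ACIDS]
```

The returned list always has 20 entries that sum to 100, so peptides of different lengths map to vectors of the same size.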

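Dipeptide composition: In this approach, each sequence is represented by a fixed-length vector of 400 (20 x 20) dipeptide frequencies. It captures global information (overall composition) as well as local information (the order of adjacent residues) of the sequence.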
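
The 400-dimensional vector can be sketched as below. This assumes overlapping adjacent pairs, percentage scaling, and that pairs containing a non-standard character are skipped; none of these details are stated on this page, and the function name is hypothetical.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered residue pairs, e.g. "AA", "AC", ..., "YY".
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return 400 percentage frequencies of overlapping adjacent pairs."""
    seq = sequence.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Skip pairs that contain a non-standard character (assumed convention).
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    if not valid:
        raise ValueError("sequence contains no standard dipeptides")
    counts = Counter(valid)
    return [100.0 * counts[d] / len(valid) for d in DIPEPTIDES]
```

For the sequence "ACAC" the pairs are AC, CA and AC, so the entry for "AC" is two thirds of 100 and the entry for "CA" is one third.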

Binary profile: In this approach, fixed-length sequence patterns were converted into binary form. Each residue of a pattern was represented by a vector of dimension 21 (e.g. Ala by 1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0; Cys by 0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0), covering the 20 amino acids and one dummy amino acid "X".
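
A small sketch of this encoding is shown below. Only the positions of Ala (first) and Cys (second) come from the example above; the rest of the ordering (alphabetical one-letter codes with "X" last), the mapping of unknown characters to "X", and the function name are assumptions.

```python
# 20 amino acids plus the dummy residue "X"; Ala is index 0 and Cys is index 1.
ALPHABET = "ACDEFGHIKLMNPQRSTVWYX"
INDEX = {aa: i for i, aa in enumerate(ALPHABET)}


def binary_profile(pattern):
    """Concatenate one 21-dimensional one-hot vector per residue."""
    vector = []
    for residue in pattern.upper():
        one_hot = [0] * len(ALPHABET)
        # Characters outside the alphabet are treated as "X" (assumption).
        one_hot[INDEX.get(residue, INDEX["X"])] = 1
        vector.extend(one_hot)
    return vector
```

Under this scheme a pattern of n residues yields a vector of 21 x n values, for example 210 values for a ten-residue pattern.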

Models used on the web server: We developed SVM-based prediction models on different input features. The following results were obtained with the amino acid composition and binary profile features that we used for model generation.


Method                                   Threshold  TP   FP   TN   FN   Sensitivity (%)  Specificity (%)  Accuracy (%)  MCC   ROC
Amino Acid Composition                   0          531  108  545  120  81.57            83.46            82.52         0.65  0.90
Binary (NTCT5)                           0.1        486  83   568  163  74.88            87.25            81.08         0.63  0.88
Binary (NTCT10)                          -0.1       204  31   222  49   80.63            87.75            84.19         0.69  0.91
Binary (NTCT5, up to 10 residues long)   0.1        343  44   425  126  73.13            90.62            81.88         0.65  0.88
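
The sensitivity, specificity, accuracy and MCC values reported above follow from the TP, FP, TN and FN counts through the standard definitions. The sketch below uses a hypothetical function name and is checked against the amino acid composition row, read as TP = 531, FP = 108, TN = 545, FN = 120.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Compute standard metrics from confusion-matrix counts."""
    sensitivity = 100.0 * tp / (tp + fn)   # percentage of positives recovered
    specificity = 100.0 * tn / (tn + fp)   # percentage of negatives recovered
    accuracy = 100.0 * (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


metrics = confusion_metrics(tp=531, fp=108, tn=545, fn=120)
```

Rounded to two decimals, `metrics` reproduces the reported 81.57, 83.46, 82.52 and 0.65 for that row.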






Department of Computational Biology Indraprastha Institute of Information Technology