IL4pred: A platform to design IL4 inducing peptide

IL4pred

In Silico Platform for Designing and Discovering Interleukin-4 inducing peptides


Algorithm of IL4pred






The prediction server for IL4 inducing epitopes has been designed to be user-friendly. This page describes the algorithms and procedures used in the different modules.

Dataset

Our data are divided into three datasets:

1. Main dataset: This includes 985 sequences as positive training data and 744 negative sequences from IEDB.

2. Alternative dataset: This comprises the same 985 positive sequences as the main dataset and 985 negative sequences from Swiss-Prot.

3. Length-restricted dataset: Most peptides in the main dataset were 8-15 amino acids long, which is the most widely accepted length range for MHC class II binding. We therefore removed peptides shorter than 8 or longer than 15 amino acids, creating a third dataset with 904 positive sequences and 740 negative sequences.
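The length restriction described for the third dataset can be sketched in a few lines of Python. This is only an illustration: the function name and the example sequences are hypothetical and are not taken from the IL4pred datasets or code.

```python
def restrict_by_length(peptides, min_len=8, max_len=15):
    """Keep only peptides whose length lies in [min_len, max_len] (inclusive)."""
    return [p for p in peptides if min_len <= len(p) <= max_len]


# Hypothetical example sequences with lengths 7, 8, 15 and 16.
examples = ["ACDEFGH", "ACDEFGHI", "ACDEFGHIKLMNPQR", "ACDEFGHIKLMNPQRS"]
kept = restrict_by_length(examples)
print(kept)  # only the 8-mer and the 15-mer remain
```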

Input features from MERCI

Pic-1


Support Vector Machine based methods

In the present study, the SVM classifier from the freely available SVM_light package was used. This package is powerful as well as user-friendly, and it allows us to adjust the parameters and to choose among kernel functions such as linear, polynomial, RBF and sigmoid.
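As a rough illustration of how feature vectors are handed to SVM_light, the sketch below writes one example in the package's sparse text format, "<target> <index>:<value> ...", with 1-based feature indices in increasing order. The helper name and the sample values are assumptions made for this example; the kernel codes listed are those accepted by the `-t` option of SVM_light's `svm_learn`, and the page does not state which kernel or parameters IL4pred finally used.

```python
# Kernel codes accepted by the -t option of SVM_light's svm_learn.
KERNEL_TYPES = {"linear": 0, "polynomial": 1, "rbf": 2, "sigmoid": 3}


def to_svmlight_line(label, features):
    """Format one example as '<target> <index>:<value> ...'.

    Indices are 1-based and increasing; zero-valued features are omitted,
    which the sparse format permits.
    """
    target = "+1" if label > 0 else "-1"
    parts = [f"{i}:{v:.6g}" for i, v in enumerate(features, start=1) if v != 0]
    return " ".join([target] + parts)


# Hypothetical three-feature positive example.
line = to_svmlight_line(1, [0.25, 0.0, 0.5])
print(line)
```

A file of such lines, one per peptide, can then be passed to `svm_learn` together with the chosen `-t` value.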

Evaluation of Performance

Five-fold cross-validation has been used. The data are split into five sets; four sets are used for training and the remaining one is used for testing, and the process is repeated five times so that each set is used once for testing. The performance of the different SVM modules has been evaluated by calculating accuracy and the Matthews correlation coefficient (MCC).
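The evaluation scheme can be sketched as follows. This is a minimal illustration using the standard definitions of accuracy and MCC from confusion-matrix counts (TP, TN, FP, FN); the function names, the shuffling seed and the way folds are formed are assumptions, not details of the IL4pred implementation.

```python
import math
import random


def five_fold_splits(n_samples, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds roughly equal sets.
    folds = [indices[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


splits = list(five_fold_splits(10))
```

Each sample appears in exactly one test set, so the five test sets together cover the whole dataset.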

Input features for SVM

In this study we have used various features as SVM input for the prediction of IL4 inducing peptides.

1. Amino Acid Composition: Amino acid composition is the fraction of each amino acid present in a peptide. It gives a vector of 20 values, one for each amino acid, which is used as SVM input.

2. Dipeptide Composition: Dipeptide composition is the fraction of each dipeptide, such as AA, AC, AD and so on. It captures the composition as well as the local order of the residues present in the peptide. It gives a vector of 20x20 (400) values.

3. Amino Acid Propensity: Amino acid propensity can be defined as the dipeptide composition multiplied by its frequency of occurrence in the positive and negative datasets. Here the vector size remains 400.

4. Physico-chemical Properties: Physico-chemical properties of each amino acid, such as hydrophobicity, hydrophilicity, charge, pI etc., have been used as input features for the prediction. We obtained the physico-chemical property values of each amino acid from the AAindex web server and used them to calculate the physico-chemical properties of each peptide with Perl programs.
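The first two feature types above can be sketched in Python as follows. This is an illustrative sketch, not the server's own code (the page mentions Perl programs for the physico-chemical features): the function names, the alphabetical ordering of the 20 standard amino acids and the use of overlapping dipeptides are assumptions made for the example.

```python
from itertools import product

# The 20 standard amino acids in alphabetical one-letter order (assumed ordering).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(peptide):
    """Return 20 fractions, one per amino acid."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    n = len(peptide)
    return [peptide.count(aa) / n for aa in AMINO_ACIDS]


def dipeptide_composition(peptide):
    """Return 400 fractions, one per dipeptide, counted over overlapping pairs."""
    pairs = [peptide[i:i + 2] for i in range(len(peptide) - 1)]
    if not pairs:
        return [0.0] * len(DIPEPTIDES)
    total = len(pairs)
    return [pairs.count(dp) / total for dp in DIPEPTIDES]


# Hypothetical short sequences used only to show the output shape.
aac = amino_acid_composition("AACD")
dpc = dipeptide_composition("AAAC")
print(len(aac), len(dpc))
```

For a peptide of length L there are L-1 overlapping dipeptides, so the 400 fractions sum to 1 just as the 20 amino acid fractions do.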