FADPred: A web server for the prediction of FAD-interacting residues

Algorithm of FADPred

Data Sets
The FAD-interacting data used here were taken from the Protein Data Bank (PDB) and comprise 198 protein chains. For the interacting (positive) dataset, we identified the FAD-interacting residues using Ligand Protein Contact (LPC) and took 8 residues on either side of each interacting residue, giving a pattern of 17 residues centred on that residue. For the negative dataset, an equal number of patterns of the same length, centred on residues that do not interact with FAD, were selected at random.
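The pattern construction described above can be sketched as follows. This is a minimal illustration, not the code used by FADPred: the function names, the padding character "X" for positions beyond the chain termini, and the per-chain sampling of negatives are all assumptions.

```python
import random

def extract_windows(sequence, flank=8, pad="X"):
    """Return one window of length 2*flank+1 centred on each residue.

    Chain ends are padded with a dummy residue (assumed to be "X") so that
    terminal residues also get a full-length window.
    """
    padded = pad * flank + sequence + pad * flank
    size = 2 * flank + 1
    return [padded[i:i + size] for i in range(len(sequence))]

def build_balanced_set(sequence, interacting_positions, flank=8, seed=0):
    """Return (positive, negative) windows with equal counts where possible.

    interacting_positions holds 0-based indices of FAD-interacting residues
    (hypothetical input, e.g. derived from LPC output).
    """
    windows = extract_windows(sequence, flank)
    positives = [windows[i] for i in sorted(interacting_positions)]
    candidates = [i for i in range(len(sequence)) if i not in interacting_positions]
    rng = random.Random(seed)
    chosen = rng.sample(candidates, min(len(positives), len(candidates)))
    negatives = [windows[i] for i in chosen]
    return positives, negatives

positives, negatives = build_balanced_set("MKTAYIAKQR", {2, 5})
print(positives)
print(negatives)
```

With the default flank of 8, every window has 17 residues and the residue of interest sits at index 8 of the window.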

Binary patterns and amino acid composition were used for the input.

Binary Patterns
Each amino acid was represented as a binary string of length 21, in which one position unique to that amino acid is set to "1" and the remaining 20 positions are set to "0". For example, alanine (A) can be represented as follows:

A = 100000000000000000000
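A short sketch of this encoding is given below. The residue ordering in ALPHABET and the use of "X" as the 21st (dummy) symbol are assumptions; the only ordering fixed by the text is that A occupies the first position.

```python
# 20 standard amino acids plus "X" as the dummy residue (assumed ordering).
ALPHABET = "ACDEFGHIKLMNPQRSTVWYX"
INDEX = {aa: i for i, aa in enumerate(ALPHABET)}

def encode_residue(residue):
    """Return a 21-element 0/1 list with a single 1 at the residue's position."""
    vector = [0] * len(ALPHABET)
    # Unknown symbols are mapped to the dummy residue.
    vector[INDEX.get(residue, INDEX["X"])] = 1
    return vector

def encode_window(window):
    """Concatenate the per-residue vectors of a window into one feature vector."""
    return [bit for residue in window for bit in encode_residue(residue)]

print("".join(str(bit) for bit in encode_residue("A")))
print(len(encode_window("XXXXXXXXMKTAYIAKQ")))
```

Under these assumptions, a 17-residue pattern yields a feature vector of 17 × 21 = 357 values.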

Evolutionary information (PSSM)
Evolutionary information was obtained from the position-specific scoring matrix (PSSM) generated during a PSI-BLAST search against the non-redundant (nr) database of protein sequences. The evolutionary information for each residue is encapsulated in a vector of 21 dimensions, so the PSSM of a protein with N residues has size 21 × N. Twenty dimensions correspond to the standard amino acids and one to a dummy amino acid. Each value was normalized to the range 0 to 1.
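The text does not state which formula was used to bring the PSSM scores into the range 0 to 1. The sketch below uses the logistic function, a common choice for PSSM scores; it is shown only as an assumption, and the function name is hypothetical.

```python
import math

def normalize_pssm(rows):
    """Map every raw PSSM score to (0, 1) with the logistic function.

    rows is a list of per-residue score lists (one row per residue).
    The logistic transform is an assumed choice, not confirmed by the source.
    """
    return [[1.0 / (1.0 + math.exp(-score)) for score in row] for row in rows]

raw = [[0, -3, 5], [2, -1, 0]]
for row in normalize_pssm(raw):
    print([round(value, 3) for value in row])
```

A raw score of 0 maps to 0.5, strongly negative scores approach 0, and strongly positive scores approach 1.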

The SVM was implemented using the freely downloadable software package SVM_light, written by Joachims (Joachims 1999). The software enables the user to define a number of parameters and to select from a choice of built-in kernel functions, including a radial basis function (RBF) and a polynomial kernel.
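SVM_light reads training examples as text lines of the form "target index:value index:value ...", with feature indices starting at 1 and listed in increasing order. The helper below is a hypothetical sketch of how a feature vector could be written in that format; it is not taken from the FADPred code.

```python
def to_svmlight_line(label, features):
    """Format one example as an SVM_light input line.

    label is +1 (interacting) or -1 (non-interacting); zero-valued
    features are omitted because the format is sparse.
    """
    parts = [f"{label:+d}"]
    parts.extend(f"{index}:{value:g}"
                 for index, value in enumerate(features, start=1)
                 if value != 0)
    return " ".join(parts)

print(to_svmlight_line(1, [1, 0, 0, 0.5]))
print(to_svmlight_line(-1, [0, 1, 0, 0]))
```

A file of such lines is passed to svm_learn, where the -t option selects the kernel (1 for polynomial, 2 for RBF); the parameter values used for FADPred are not given in the text.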

Evaluation module
The performance of the modules constructed in this study was evaluated using 5-fold cross-validation. The relevant dataset was partitioned randomly into five equally sized sets. Training and testing were carried out five times, each time using one distinct set for testing and the remaining four sets for training. The performance of the methods was computed using the following formulas:
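The partitioning scheme can be sketched as below. The function name and the fixed random seed are assumptions added for illustration; when the dataset size is not a multiple of five, the sets differ in size by at most one.

```python
import random

def five_fold_splits(n_items, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k near-equal sets.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

for train, test in five_fold_splits(23):
    print(len(train), len(test))
```

Each example appears in exactly one test set, so every pattern is predicted once by a model that was not trained on it.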

Sensitivity = TP / (TP + FN) × 100

Specificity = TN / (TN + FP) × 100

Accuracy = (TP + TN) / (TP + FP + TN + FN) × 100

Here TP and TN are the correctly predicted FAD-interacting and non-interacting residues, respectively. FP are non-interacting residues wrongly predicted as FAD-interacting, and FN are FAD-interacting residues wrongly predicted as non-interacting.
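As a worked example of the three measures, the sketch below computes them from the four counts. The function name and the counts are illustrative only, and accuracy is scaled by 100 here so that all three values are on the same percentage scale.

```python
def evaluate(tp, tn, fp, fn):
    """Return sensitivity, specificity and accuracy as percentages."""
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy

# Illustrative counts, not results from FADPred.
print(evaluate(tp=80, tn=70, fp=30, fn=20))  # (80.0, 70.0, 75.0)
```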

Department of Computational Biology: Indraprastha Institute of Information Technology (IMTECH), New Delhi, India