AntiBP2 Algorithm

Data Sets
The antibacterial peptide data used here were taken from the APD database, which contains 999 antibacterial peptides. For the N-terminal, C-terminal and N+C-terminal methods, duplicate peptides and peptides with fewer than 15 amino acid residues were removed. For the amino acid composition method, sequence lengths range up to 99 amino acids. For the negative dataset, an equal number of non-antibacterial peptides of the same lengths were randomly selected.
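
The filtering step described above can be sketched as follows. This is a minimal illustration, not the original pipeline: the function name filter_peptides, the upper-casing of sequences and the first-seen ordering are assumptions made for the example.

```python
def filter_peptides(sequences, min_length=15):
    """Drop duplicate peptides and peptides shorter than min_length.

    Hypothetical helper: keeps the first occurrence of each sequence
    and preserves input order.
    """
    seen = set()
    kept = []
    for seq in sequences:
        seq = seq.strip().upper()  # normalisation is an assumption
        if len(seq) < min_length or seq in seen:
            continue
        seen.add(seq)
        kept.append(seq)
    return kept
```

The threshold of 15 residues comes from the text; everything else is illustrative.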

Binary patterns and amino acid composition were used as input features.

Binary Patterns
Each amino acid was represented as a binary string of length 20, with a unique position set to "1" and the other 19 positions set to "0". For example, alanine (A) can be represented as follows:

A = 10000000000000000000
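
A short sketch of this encoding is given below. The alphabetical one-letter ordering in AMINO_ACIDS is an assumption (the text only fixes A at the first position), and the function names are hypothetical.

```python
# Assumed residue ordering; only the position of "A" is given in the text.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def encode_residue(residue):
    """Return the 20-character binary string for one amino acid."""
    bits = ["0"] * len(AMINO_ACIDS)
    bits[AMINO_ACIDS.index(residue)] = "1"  # raises ValueError for unknown letters
    return "".join(bits)

def encode_peptide(peptide):
    """Concatenate the per-residue patterns into one flat 0/1 vector."""
    vector = []
    for residue in peptide:
        vector.extend(int(bit) for bit in encode_residue(residue))
    return vector
```

A peptide of n residues therefore yields a vector of 20 × n values, exactly n of which are 1.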

Amino Acid Composition
Amino acid composition is the fraction of each amino acid in a protein.
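
As an illustration, the composition can be computed as a 20-element vector of fractions. The residue ordering and the function name amino_acid_composition are assumptions for this sketch.

```python
# Assumed residue ordering for the output vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def amino_acid_composition(sequence):
    """Return the fraction of each of the 20 amino acids in the sequence."""
    length = len(sequence)
    if length == 0:
        raise ValueError("sequence must not be empty")
    return [sequence.count(aa) / length for aa in AMINO_ACIDS]
```

For the sequence "AACD", for example, the entry for A is 2/4 and the entries for C and D are 1/4 each. Unlike the binary pattern, the vector length is always 20, regardless of peptide length.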

The SVM was implemented using the freely downloadable software package SVM_light, written by Joachims (Joachims 1999). The software enables the user to define a number of parameters and to select from a choice of built-in kernel functions, including a radial basis function (RBF) kernel and a polynomial kernel.
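
SVM_light reads training examples as text lines of the form "target index:value index:value ...", with feature indices starting at 1. The sketch below shows one way to write a feature vector in that format. The function name to_svmlight_line is hypothetical, and omitting zero-valued features is a choice made here for compactness.

```python
def to_svmlight_line(label, features):
    """Format one example as an SVM_light input line.

    label: positive for antibacterial, otherwise non-antibacterial.
    features: sequence of numeric feature values.
    """
    target = "+1" if label > 0 else "-1"
    # Feature indices are 1-based; zero-valued features are skipped.
    pairs = [f"{i}:{v:g}" for i, v in enumerate(features, start=1) if v != 0]
    return " ".join([target] + pairs)
```

A file of such lines can then be passed to svm_learn, where the -t option selects the kernel (1 for polynomial, 2 for RBF) and -g and -c set the RBF gamma and the trade-off parameter C. The parameter values used for AntiBP2 are not stated here.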

Evaluation module
The performance of the modules constructed in this study was evaluated using a 5-fold cross-validation technique. In 5-fold cross-validation, the relevant dataset was partitioned randomly into five equally sized sets. Training and testing were carried out five times, each time using one distinct set for testing and the remaining four sets for training. The performance of the methods was computed using the following formulas:

Sensitivity = TP / (TP + FN) × 100

Specificity = TN / (TN + FP) × 100

Accuracy = (TP + TN) / (TP + FP + TN + FN) × 100

Where TP and TN are the numbers of correctly predicted antibacterial and non-antibacterial peptides, respectively. FP is the number of non-antibacterial peptides wrongly predicted as antibacterial, and FN is the number of antibacterial peptides wrongly predicted as non-antibacterial.
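
The evaluation procedure can be sketched in a few lines. This is an illustration under stated assumptions, not the original evaluation code: the function names, the fixed seed and the interleaved fold assignment are hypothetical, and accuracy is scaled to a percentage here to match sensitivity and specificity.

```python
import random

def five_fold_splits(n_samples, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random partition; seed is illustrative
    # Fold sizes differ by at most one when n_samples is not divisible by n_folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != k for idx in fold]
        yield train, test

def evaluate(tp, tn, fp, fn):
    """Return (sensitivity, specificity, accuracy) as percentages."""
    sensitivity = tp / (tp + fn) * 100
    specificity = tn / (tn + fp) * 100
    accuracy = (tp + tn) / (tp + fp + tn + fn) * 100  # percent scaling assumed
    return sensitivity, specificity, accuracy
```

Each sample appears in exactly one test set across the five folds. The confusion-matrix counts from each fold can be passed to evaluate individually or summed over folds first; which of the two was done in the original study is not stated.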