Method used for prediction
The AlphaPred server uses two feed-forward back-propagation networks, each with a single hidden layer. Both networks use a window eleven residues wide and have 10 units in the hidden layer. The target output is 1 or 0 (turn or non-turn).
The AlphaPred server uses the following two networks:
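The forward pass of such a network can be sketched as follows. This is a minimal illustration, not the server's SNNS implementation: the sigmoid activation, the random weight initialisation, the 0.5 decision threshold and all function names are assumptions made for the example. Only the window width (11) and the number of hidden units (10) come from the description above.

```python
import math
import random

WINDOW = 11        # residues per input window (from the text)
HIDDEN_UNITS = 10  # units in the single hidden layer (from the text)


def sigmoid(x):
    """Numerically stable logistic function (assumed activation)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def init_network(n_inputs, n_hidden=HIDDEN_UNITS, seed=0):
    """Create small random weights; a trained net would load learned values."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                for _ in range(n_hidden)]
    b_hidden = [0.0] * n_hidden
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_out = 0.0
    return w_hidden, b_hidden, w_out, b_out


def forward(network, inputs):
    """Propagate one window through the hidden layer to the single output."""
    w_hidden, b_hidden, w_out, b_out = network
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def classify(network, inputs, threshold=0.5):
    """Map the output to 1 (turn) or 0 (non-turn); threshold is an assumption."""
    return 1 if forward(network, inputs) >= threshold else 0
```

For a window of 11 residues with 21 values per residue, the network would be created with `init_network(WINDOW * 21)`, giving 231 input units.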
Neural Network SNNS: First level-Sequence-to-structure net
The input to the first network is the position-specific scoring matrix obtained from PSI-BLAST (Altschul et al., 1997). PSIPRED uses PSI-BLAST to detect distant homologues of a query sequence and generates a position-specific scoring matrix as part of the prediction process; here we use these intermediate PSI-BLAST-generated matrices as direct input to the first network. The matrix has 21 x M real elements, where M is the length of the target sequence.
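One way to turn the 21 x M matrix into per-residue network inputs is a sliding window, sketched below. The matrix is stored here as M rows of 21 values (the transpose of the layout described above). How the server treats windows that run past the sequence ends is not stated; this sketch assumes zero padding with the 21st value set to 1 as an off-sequence flag. The function names are hypothetical.

```python
WINDOW = 11   # window width in residues (from the text)
N_COLS = 21   # values per residue in the scoring matrix (from the text)


def window_inputs(pssm, center, window=WINDOW):
    """Flatten the rows around `center` into one input vector.

    pssm: list of M rows, each a list of 21 floats.
    Positions outside the sequence use a padding row (assumption):
    20 zeros plus a final 1.0 marking "beyond the terminus".
    """
    half = window // 2
    padding_row = [0.0] * (N_COLS - 1) + [1.0]
    features = []
    for pos in range(center - half, center + half + 1):
        row = pssm[pos] if 0 <= pos < len(pssm) else padding_row
        features.extend(row)
    return features


def all_windows(pssm):
    """Build one input vector of 11 * 21 = 231 values per residue."""
    return [window_inputs(pssm, i) for i in range(len(pssm))]
```

Each residue then yields one 231-element vector, so a sequence of length M produces M training or prediction examples.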
Neural Network SNNS: Second level-Structure-to-structure net
An important feature of the predictions generated by the first network is that they are uncorrelated: the network makes a prediction for each residue in isolation, without reference to neighboring predictions. This correlation can be taken into account by using a second, structure-to-structure network. The input to this second filtering network is the prediction obtained from the first net together with the secondary structure predicted by PSIPRED (Jones, 1999).
Prediction from a multiple alignment of protein sequences rather than a single sequence has long been recognized as a way to improve prediction accuracy (Cuff and Barton, 1999). During evolution, residues with similar physico-chemical properties are conserved if they are important to the fold or function of the protein. The availability of large families of homologous sequences revolutionised secondary structure prediction. Traditional methods, when applied to a family of proteins rather than a single sequence, proved much more accurate at identifying core secondary structure elements.
The same approach is used here for the prediction of alpha turns: it combines a neural network with multiple alignment information. The net is trained on the position-specific scoring matrices generated by PSI-BLAST (run as part of PSIPRED).
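The input to the second network could be assembled as in the sketch below. The exact encoding is not given in the text, so the following is an assumption for illustration only: each residue contributes four values (the first-net output plus a one-hot code for the PSIPRED state H, E or C), and windows that run off the sequence are zero-padded. The function names are hypothetical.

```python
WINDOW = 11     # window width in residues (from the text)
STATES = "HEC"  # helix, strand, coil as reported by PSIPRED


def encode_residue(first_net_score, ss_state):
    """Four values per residue: first-net output + one-hot state (assumed)."""
    one_hot = [1.0 if ss_state == s else 0.0 for s in STATES]
    return [first_net_score] + one_hot


def second_level_inputs(first_net_scores, ss_string, center, window=WINDOW):
    """Build the 11 * 4 = 44 input values for the residue at `center`."""
    if len(first_net_scores) != len(ss_string):
        raise ValueError("scores and secondary structure differ in length")
    half = window // 2
    features = []
    for pos in range(center - half, center + half + 1):
        if 0 <= pos < len(ss_string):
            features.extend(encode_residue(first_net_scores[pos],
                                           ss_string[pos]))
        else:
            features.extend([0.0] * 4)  # zero padding past the termini
    return features
```

Because each window now contains the neighbouring first-level predictions, the second network can smooth isolated or implausible turn assignments.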
In PSI-BLAST (Position-Specific Iterated BLAST) (Altschul et al., 1997), the sequences extracted from a BLAST search are aligned and a statistical profile is derived from the multiple alignment. The profile is then used as the query for the next search, and this loop is iterated a number of times that is controlled by the user.
The PSIPRED method is used for secondary structure prediction. It uses PSI-BLAST to detect distant homologues of a query sequence and generates a position-specific scoring matrix as part of the prediction process (Jones, 1999); training is done on these intermediate PSI-BLAST-generated matrices, used as direct input to the neural network. The matrix has 21 x M elements, where M is the length of the target sequence, and each element represents the likelihood of that particular residue substitution at that position in the template. It is a sensitive scoring system, which involves the probabilities with which amino acids occur at various positions.
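The idea of deriving a profile from a multiple alignment can be illustrated with a much-simplified sketch. This is not PSI-BLAST's actual procedure, which uses sequence weighting and data-dependent pseudocounts; the uniform background frequency, the fixed pseudocount and the function name are assumptions made for the example.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def log_odds_profile(alignment, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per alignment column.

    alignment: list of equal-length aligned sequences; '-' marks a gap
    and is ignored when counting.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    profile = []
    for col in range(length):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for seq in alignment:
            if seq[col] in counts:
                counts[seq[col]] += 1.0
        total = sum(counts.values())
        profile.append({aa: math.log2(counts[aa] / total / background)
                        for aa in AMINO_ACIDS})
    return profile
```

Residues that are conserved in a column receive positive scores and rarely observed ones negative scores, which is the kind of position-specific information the iterated search reuses as its next query.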
The following figure shows the network architecture used in AlphaPred: