About SARpred
SARpred is a neural-network-based method that uses two feed-forward back-propagation networks, each with a single hidden layer, to predict the real value of surface accessibility. Both networks use a window seventeen residues wide and have 10 units in the hidden layer. Surface accessibility is encoded as a real value normalized to the range 0-1. The target output is a single value between 0 and 1 corresponding to the normalized accessibility of the central residue in the input pattern.
The SARpred server uses the following two networks:
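The forward pass of such a network can be sketched as follows. This is a minimal illustration, not the SARpred implementation: the sigmoid activations, the weight initialisation range, the bias handling, and the names `init_network` and `forward` are all assumptions, and back-propagation training is omitted.

```python
import math
import random

WINDOW = 17   # residues per input window (from the text)
HIDDEN = 10   # units in the single hidden layer (from the text)


def sigmoid(x):
    """Logistic function; keeps every activation in the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_network(n_inputs, n_hidden=HIDDEN, seed=0):
    """Create small random weights (assumed range). The last weight in each row is the bias."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden + 1)]
    return w_hidden, w_out


def forward(network, inputs):
    """Return one value in (0, 1): the predicted accessibility of the central residue."""
    w_hidden, w_out = network
    # zip() stops at the shorter sequence, so the trailing bias weight is added separately.
    hidden = [sigmoid(row[-1] + sum(w * x for w, x in zip(row, inputs)))
              for row in w_hidden]
    return sigmoid(w_out[-1] + sum(w * h for w, h in zip(w_out, hidden)))
```

With `features_per_residue` input values per window position (a hypothetical parameter), the network would be built as `init_network(WINDOW * features_per_residue)` and called once per residue of the query sequence.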
First level: Sequence-to-structure network
The input to the first network is the set of position-specific scoring matrices generated by PSI-BLAST. PSIPRED uses PSI-BLAST to detect distant homologues of a query sequence and generates a position-specific scoring matrix as part of the prediction process; here, these intermediate PSI-BLAST matrices are used as direct input to the first network. The matrix has 21 x M real elements, where M is the length of the target sequence, and each element represents the frequency of occurrence of one of the 21 amino acids at one position in the alignment.
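As an illustration of how a 17-residue window could be cut from the 21 x M matrix, here is a minimal sketch. It assumes the matrix is stored as M rows of 21 values (one row per sequence position) and that window positions beyond either terminus are filled with zeros; the padding scheme and the name `window_inputs` are assumptions, not details given by the source.

```python
WINDOW = 17      # window width in residues (from the text)
N_COLUMNS = 21   # values per sequence position (from the text)


def window_inputs(profile, center, window=WINDOW):
    """Flatten the rows of `profile` around `center` into one input vector.

    `profile` is a list of M rows with N_COLUMNS values each.
    Positions outside the sequence are zero-padded (an assumption).
    """
    half = window // 2
    pad = [0.0] * N_COLUMNS
    features = []
    for pos in range(center - half, center + half + 1):
        row = profile[pos] if 0 <= pos < len(profile) else pad
        features.extend(row)
    return features
```

Under these assumptions each residue yields 17 x 21 = 357 input values, and the window is slid along the sequence so that every residue is the central one exactly once.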
Second level: Structure-to-structure network
The input to the second, filtering network is the prediction obtained from the first sequence-to-structure network together with the secondary structure predicted by PSIPRED (Jones, 1999). Four units encode each residue: one unit codes for the prediction output (in the range 0-1) from the first-level network, and the remaining three units code for the three secondary structure states (helix, strand, and coil).
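The four-unit encoding can be sketched as below. The one-hot scheme for the three states, the order of the units, and the function names are assumptions made for illustration; the letters H, E, and C follow the usual PSIPRED output for helix, strand, and coil.

```python
SS_STATES = ("H", "E", "C")  # helix, strand, coil


def encode_residue(first_level_value, ss_state):
    """Return four units: the first-level output plus a one-hot secondary structure state."""
    one_hot = [1.0 if ss_state == state else 0.0 for state in SS_STATES]
    return [first_level_value] + one_hot


def encode_sequence(first_level_values, ss_string):
    """Encode every residue of a sequence; both inputs must have the same length."""
    if len(first_level_values) != len(ss_string):
        raise ValueError("one secondary structure state is needed per residue")
    return [encode_residue(v, s) for v, s in zip(first_level_values, ss_string)]
```

For example, `encode_residue(0.42, "E")` gives `[0.42, 0.0, 1.0, 0.0]`. If the second network also uses a 17-residue window, as stated for both networks, its input layer would receive 17 x 4 = 68 values.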
Multiple Alignment
It is well established that predicting from a multiple alignment of protein sequences, rather than from a single sequence, is one way to improve prediction accuracy (Cuff and Barton, 1999). During evolution, residues with similar physico-chemical properties are conserved if they are important to the fold or function of the protein. The availability of large families of homologous sequences revolutionised secondary structure prediction: traditional methods proved much more accurate at identifying core secondary structure elements when applied to a family of proteins rather than to a single sequence. Kaur and Raghava (2002) have also developed a neural-network-based method for the prediction of tight turns that uses the evolutionary information in multiple alignment profiles and achieved outstanding performance. To date, no known method has used multiple sequence information for real-value prediction of surface accessibility. The only existing method, RVPnet (Ahmad et al., 2003), predicts the real value of surface accessibility from a single sequence.
Therefore, the present study attempts to incorporate the evolutionary information in multiple sequence alignment profiles, together with PSIPRED-predicted secondary structure, into the prediction of the real value of surface accessibility (Table 2).
PSI-BLAST
In PSI-BLAST (Position-Specific Iterative BLAST) (Altschul et al., 1997), the sequences retrieved by a BLAST search are aligned and a statistical profile is derived from the multiple alignment. The profile is then used as the query for the next search, and this loop is repeated for a number of iterations controlled by the user.
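To make the idea of a profile concrete, the sketch below derives plain per-column amino acid frequencies from a toy alignment. It is a simplified stand-in, not PSI-BLAST's actual procedure: sequence weighting, pseudocounts, and log-odds scoring are left out, the search loop itself is not implemented, and the name `frequency_profile` is hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def frequency_profile(alignment):
    """Return, for each alignment column, the frequency of each amino acid.

    `alignment` is a list of equal-length strings; gaps ('-') and any other
    non-standard characters are ignored when counting.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment if seq[col] in AMINO_ACIDS)
        total = sum(counts.values())
        profile.append({aa: (counts[aa] / total if total else 0.0)
                        for aa in AMINO_ACIDS})
    return profile
```

For the toy alignment `["ACD", "ACE", "GCD", "A-D"]`, the first column has a frequency of 0.75 for A and 0.25 for G. In an iterative search, a profile of this kind would replace the single query sequence in the next round.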
PSIPRED
The PSIPRED method has been used for secondary structure prediction. It uses PSI-BLAST to detect distant homologues of a query sequence and generates a position-specific scoring matrix as part of the prediction process (Jones, 1999); training is done on these intermediate PSI-BLAST matrices, which serve as direct input to the neural network. The matrix has 21 x M elements, where M is the length of the target sequence, and each element represents the likelihood of that particular residue substitution at that position in the template. It is a sensitive scoring system, which involves the probabilities with which amino acids occur at various positions.