Welcome to the Dataset Page of PPRInt 2.0



PPRInt2 provides a standard dataset of RNA-interacting proteins obtained from HybridNAP and ProNA2020. The dataset was generated using standard techniques and contains 545 non-redundant (30%, CD-HIT) RNA-interacting protein sequences for training and 161 non-redundant RNA-interacting protein sequences for validation. The training dataset consists of 18559 RNA-interacting and 171879 non-interacting patterns. The validation dataset consists of 6966 RNA-interacting and 44349 non-interacting residues. The pattern size is 17. To help users work with the dataset effectively, we provide the RNA-interacting protein sequences. Users can download the dataset by clicking on the provided link.
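To illustrate what a pattern size of 17 means, here is a minimal Python sketch that turns one annotated sequence into fixed-length patterns, one per residue. It rests on assumptions not stated on this page: that each pattern is centred on the residue it labels, and that the sequence ends are padded with the dummy character 'X'. The function name `make_patterns` is hypothetical and is not part of PPRInt2.

```python
def make_patterns(sequence, labels, window=17, pad="X"):
    """Return one (pattern, label) pair per residue.

    Assumption: the pattern is centred on the labelled residue and the
    sequence ends are padded with `pad` so every pattern has `window` letters.
    """
    if len(sequence) != len(labels):
        raise ValueError("sequence and labels must have the same length")
    if window % 2 == 0:
        raise ValueError("window must be odd so that a central residue exists")
    half = window // 2
    padded = pad * half + sequence + pad * half
    # Slice i of the padded string is the window centred on residue i.
    return [(padded[i:i + window], labels[i]) for i in range(len(sequence))]


# Toy example with made-up data (not taken from the dataset).
patterns = make_patterns("MKVLAAGIVRRS", "--++----++--")
interacting = sum(1 for _, label in patterns if label == "+")
print(len(patterns), "patterns,", interacting, "interacting")
```

Under these assumptions the number of patterns equals the number of residues, so the same counting works for both patterns and residues.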


Dataset | Description | Files

Training

The dataset contains 545 RNA-interacting protein sequences. Interacting residues are marked with a '+' sign and non-interacting residues with a '-' sign.

Validation

The dataset contains 161 RNA-interacting protein sequences. Interacting residues are marked with a '+' sign and non-interacting residues with a '-' sign.
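The '+' / '-' annotation described above could be read with a short Python sketch like the one below. The exact file layout is not documented on this page, so the sketch assumes a hypothetical three-line record: a '>' header line, the amino-acid sequence, and a label string of the same length. The function names `parse_records` and `count_labels` are illustrative only. Adjust the parser if the downloaded files are laid out differently.

```python
def parse_records(text):
    """Parse records of the ASSUMED form: '>' header, sequence, '+'/'-' labels."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) % 3 != 0:
        raise ValueError("expected three non-empty lines per record")
    records = []
    for i in range(0, len(lines), 3):
        header, sequence, labels = lines[i:i + 3]
        if not header.startswith(">"):
            raise ValueError(f"record header must start with '>': {header!r}")
        if len(sequence) != len(labels):
            raise ValueError(f"length mismatch in record {header!r}")
        if set(labels) - {"+", "-"}:
            raise ValueError(f"unexpected label character in {header!r}")
        records.append((header[1:], sequence, labels))
    return records


def count_labels(records):
    """Return (interacting, non_interacting) residue counts over all records."""
    interacting = sum(labels.count("+") for _, _, labels in records)
    non_interacting = sum(labels.count("-") for _, _, labels in records)
    return interacting, non_interacting


# Toy input with made-up proteins (not taken from the dataset).
example = ">p1\nMKV\n-+-\n>p2\nAC\n++\n"
records = parse_records(example)
print(count_labels(records))
```

Counting labels this way gives a quick check of a downloaded file against the residue totals quoted on this page.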