We collected and compiled the following datasets from the literature. These datasets have previously been used to develop prediction methods. The list is not complete and collection is ongoing; please help us by submitting or suggesting new datasets for this collection.
Types of Datasets | Sub Types | PDB chains in dataset | References (PubMed ID) | Original site | Download Data |
Regular Secondary structure |
helix, strand, coil | 513,406 1155 24761 60,21 987,988 83 | 25883141 29501620 25380779 28968641 23298369 22095872 | Jpred4 PATSIM POLYPROLINE MemBrain SSCon K2D3 | CB513 , CB406 PATSIM PolyprOnline MemBrain SSCon K2D3(Table S1) |
Strand | 1452 | 24064422 | bcov | BetaSheet1452 |
Irregular Secondary structure |
Beta-Turns | 1296 547 426 426,6376,20142 | 20673368 15822097 12592033 25728793 | DEBT COUDES BTEVAL BetaTPred3.0 | PDB1296 FA547 GR426 BT426,BT6376,BT20142 |
Gamma-Turns | 490 749 320 | 17936305 12662358 10878855 | sciencedirect wiley.com JBIOSCI | PA490 KG749 KG320 |
Alpha-Turns | 490 193 | 16894602 8652792 | Alpha Turn V1.0 wiley.com | JX490 VP193 |
B-Hairpin | 534 | 12177429 | BhairPred | BhairPred |
Beta barrel | 1881712 | 22843985 | TMBB-DB | TMBB-DB |
PDB derived Datasets with non-redundant sequence and structural quality criteria |
PDB_SELECT | 5130 | 19783827 | PDB_SELECT | PDB_SELECT_25 |
PDBFINDER | 168357 | 9021272 | PDBFINDER | PDBFINDER |
PISCES | 3023 | 12912846 | PISCES | PISCES |
AbDb | 1476 complete non-redundant, 57 non-redundant light chains, 177 non-redundant heavy chains | 29718130 | AbDb | AbDb |
DNA/RNA interacting residues |
DNA | 206 500 62 25 782 488,82 | 21069866 19767616 16845003 19594868 28381244 28132027 | Proteins 3dfootprint BindN BindN DisBind DRNAPred | DBP206 3D-footprint PDNA-62 PRINR25 DisBind DRNAPred |
RNA | 147 205 109 782 488,82 | 17483510 20483814 16790841 28381244 28132027 | PRIDB PRNA RNABindR DisBind DRNAPred | RB147 PRNA MT109 DisBind DRNAPred |
DNA/RNA interacting proteins |
DNA | 146 1153 92 146 | 18042272 | DNAbinder | Main Dataset Alternate Dataset Realistic Dataset Independent Dataset |
RNA | 377 766,326 26306,24228,1678,2662 2241,369 1678 | 20677174 26607710 29495575 22192482 22192482 | RNApred PRIdictor RPiRLS RPISeq RPIntDB | RNA Binding Data non-RNA binding data PRIdictor data RPiRLS data RPISeq data RPIntDB data |
Nucleotide interacting residues |
ATP | 168 429 168,227 227,17,1372 | 20021687 29361215 23288787 22130595 | ATPint ATPbind TargetATPsite NsitePred | ATPint168 ATPbind TargetATPsite NsitePred_ATP |
ADP | 321,25,1372 | 22130595 | NsitePred | NsitePred_ADP |
AMP | 140,18,1372 | 22130595 | NsitePred | NsitePred_AMP |
GTP | 44 56,6,1372 | 20525281 22130595 | GTPBinder NsitePred | GTPbinder44 NsitePred_GTP |
GDP | 105,9,1372 | 22130595 | NsitePred | NsitePred_GDP |
NAD | 195 | 20353553 | NADbinder | NADbinder195 |
FAD | 198 | 20122222 | FADPred | FADPred198 |
Metals and Ions Interacting Residues |
IonCom dataset | 142 (Zn), 110 (Cu), 227 (Fe2+), 103 (Fe3+), 379 (Mn), 179 (Ca), 103 (Mg), 53 (K), 78 (Na), 62 (CO3), 22 (NO2), 303 (SO4), 339 (PO4) | 27378301 | IonCom | IonCom_dataset |
Bacterial protein interaction |
Functional interaction | 1941 229 79 | 19798435 22102573 22053087 | Bacteriome DBETH MimoDB | Functional interactions dataset DBETH MimoDB |
TAP interaction | 918 | 19798435 | Bacteriome | Hu et al. TAP interaction dataset |
Functional & TAP interaction | 2283 | 19798435 | Bacteriome | Combined interactions dataset |
Experimental interaction | 2291 | 19798435 | Bacteriome | Extended interaction dataset |
Protein crystallization |
Propensity Dataset1 | 3958 | 18285371 | SSPF Crystallisation Propensity Predictors (Main server non-functional) | ParCrys Datasets |
Propensity Dataset2 | 144, 500 | 19755114 | MetaPPCP (Main server non-functional) | MetaPPCP Datasets |
Protein crystallization, purification and production propensity dataset 1 | 3587, 3585 | 21685077 | PPCpred | PPCpred Dataset |
Protein crystallization, purification and production propensity dataset 2 | 5383, 23348 | 25148528 | PredPPCrys (Main server non-functional) | PredPPCrys Dataset |
Protein crystallization, purification and production propensity dataset 3 | 5383, 23348, 11946 | 26906024 | Crysalis | Crysalis Dataset |
Protein crystallization and propensity dataset | 1197, 2378 | 24019868 | SCMCRYS (Main server non-functional) | SCMCRYS Dataset |
Helix packing |
Helix packing | 610 | wiki.c2b2 | Helix packing patterns | Helix packing dataset |
Membrane proteins |
Homologous Membrane Proteins | 36 | 16648166 | HOMEP | homep datasets |
Transmembrane Proteins | 247 | 15111065 | Phobius | Phobius |
Membrane Proteins | 3249 trainset,4333 testset,7695 non-membrane proteins | 22386149 | ProClusEnsem | ProClusEnsem |
Dihedral angles |
Dihedral angles dataset | 513,80,175,179,212,1989,1988 | 20025785 | DISSPred | disspred datasets |
Protein backbone dihedral angles | 1267,1267,85,40,5046 | 29745828 | RaptorX-Angle | RaptorX-Angle datasets |
Dihedral angles from chemical shifts and/or homology | 141,31,15 | 16845087 | PREDITOR | PREDITOR datasets |
Protein Backbone Torsion Angle | 500,460,1029 | 18923703 | ANGLOR | ANGLOR datasets |
Protein Torsion Angle | 1552,11 | 28923002 | DNTor | DNTor datasets |
Surface accessibility |
Surface accessibility dataset | 215 | 11170200 | Protein surface accessibility | Manesh-215 |
AcconPred | 5729,945 | 26339631 | Solvent Accessibility and Contact Number Simultaneously | AcconPred |
Rotamer Libraries |
Dunbrack Rotamer Libraries | 850 | 12163064 | Dunbrack Rotamer Libraries | Dunbrack Rotamer Libraries |
Tuffery et al.'s rotamer libraries | 2926 | 12557186 | Backbone independent Backbone Dependent | Tuffery's Backbone independent Tuffery's Backbone dependent |
Penultimate Rotamer Library | 240 | 10861930 | Penultimate Rotamer Library | Penultimate Rotamer Library |
Kirys et al.'s Rotamer Library | 233 | 22544766 | Kirys Rotamer Library | Kirys Rotamer Library (Only Table S2&S3) |