| Download Datasets |
SAMbinder provides a gold-standard dataset of SAM-interacting protein chains derived from the PDB. The dataset was generated using standard techniques and contains 145 non-redundant SAM-binding protein chains (CD-HIT, 40% cutoff). It is divided into two parts: i) the training dataset consists of 118 non-redundant protein chains (1798 SAM-interacting residues and 33314 non-interacting residues); ii) the independent/validation dataset comprises 27 non-redundant PDB chains (390 SAM-interacting residues and 6715 non-interacting residues).
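The residue counts above imply a strong class imbalance, which matters when choosing evaluation metrics. The following minimal sketch only redoes that arithmetic; the dictionary layout and the function name `positive_fraction` are illustrative and not part of SAMbinder.

```python
# Residue counts as stated in the dataset description above.
counts = {
    "training": {"interacting": 1798, "non_interacting": 33314},
    "validation": {"interacting": 390, "non_interacting": 6715},
}


def positive_fraction(interacting, non_interacting):
    """Return the fraction of residues labelled as SAM interacting."""
    return interacting / (interacting + non_interacting)


# Hypothetical helper output: one fraction per dataset part.
fractions = {name: positive_fraction(**c) for name, c in counts.items()}

for name, frac in fractions.items():
    print(f"{name}: {frac:.2%} of residues are SAM interacting")
```

In both parts only about 5% of residues are SAM interacting, so plain accuracy would be a misleading measure for any model trained on this data.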
To help users work with the dataset effectively, we provide not only the SAM-binding chains but also binary/PSSM patterns. Users can download three types of dataset: i) Type 1 contains the protein chains, ii) Type 2 contains patterns of length 17 amino acids for each protein, and iii) Type 3 contains the PSSM profile for each protein.
Dataset Type1: Protein chains along with the interaction information.
Dataset_Type1 | Description | Files |
Main | Dataset contains 118 SAM-interacting protein chains. Interacting residues are marked with a '+' sign and non-interacting residues with a '-' sign. | |
Validation | Dataset contains 27 SAM-interacting protein chains. Interacting residues are marked with a '+' sign and non-interacting residues with a '-' sign. | |
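As a rough illustration of how the '+'/'-' annotation can be used, the sketch below pairs a sequence with a label string and lists the interacting residues. The exact file layout of the Type1 downloads is not described here, so the assumption that each chain comes with one label character per residue, as well as the function name `interacting_positions` and the toy sequence, are hypothetical.

```python
def interacting_positions(sequence, labels):
    """Return (1-based position, residue) pairs for residues marked '+'.

    Assumes one label character per residue: '+' for SAM interacting,
    '-' for non-interacting. This layout is an assumption, not the
    documented file format.
    """
    if len(sequence) != len(labels):
        raise ValueError("sequence and labels must have the same length")
    unexpected = set(labels) - {"+", "-"}
    if unexpected:
        raise ValueError(f"unexpected label characters: {sorted(unexpected)}")
    return [
        (i + 1, residue)
        for i, (residue, label) in enumerate(zip(sequence, labels))
        if label == "+"
    ]


# Toy example, not taken from the dataset.
toy_sequence = "MKTAYIAKQR"
toy_labels = "--++----+-"
toy_hits = interacting_positions(toy_sequence, toy_labels)
print(toy_hits)
```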
Dataset Type2: This dataset type consists of patterns generated from PDB chains.
Dataset_Type2 | Description | Files |
Main | Dataset contains patterns of window length 17 generated from 118 SAM-interacting protein chains. Separate positive and negative patterns are provided for each PDB chain. | |
Validation | Dataset contains patterns of window length 17 generated from 27 SAM-interacting protein chains. Separate positive and negative patterns are provided for each PDB chain. | |
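Window patterns of this kind are commonly produced with a sliding window centred on each residue. The sketch below shows one way to do that under stated assumptions: the label of the central residue decides whether a pattern is positive or negative, and chain termini are padded with a dummy 'X' residue. How SAMbinder treats terminal residues is not described here, so both choices and the name `make_patterns` are illustrative.

```python
def make_patterns(sequence, labels, window=17, pad="X"):
    """Split a chain into positive and negative fixed-length patterns.

    Each residue yields one pattern centred on it. The pattern is positive
    if the central residue is labelled '+', negative otherwise. Termini
    are padded with `pad` (an assumption made for this sketch).
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so that it has a central residue")
    if len(sequence) != len(labels):
        raise ValueError("sequence and labels must have the same length")
    half = window // 2
    padded = pad * half + sequence + pad * half
    positives, negatives = [], []
    for i, label in enumerate(labels):
        pattern = padded[i:i + window]  # residue i sits at index `half`
        if label == "+":
            positives.append(pattern)
        else:
            negatives.append(pattern)
    return positives, negatives


# Toy example, not taken from the dataset.
pos, neg = make_patterns("MKTAYIAKQR", "--++----+-")
print(len(pos), "positive and", len(neg), "negative patterns")
```

An odd window length such as 17 gives eight neighbours on each side of the central residue, which is why the sketch rejects even window sizes.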
Dataset Type3: This dataset type consists of the PSSM profile of each pattern generated from PDB chains.
Dataset_Type3 | Description | Files |
Main | Dataset contains the PSSM profile of each pattern of window length 17 generated from 118 SAM-interacting protein chains. Separate positive and negative profiles are provided for each PDB chain. | |
Validation | Dataset contains the PSSM profile of each pattern of window length 17 generated from 27 SAM-interacting protein chains. Separate positive and negative profiles are provided for each PDB chain. | |
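A PSSM profile for a window can be thought of as the rows of the chain's PSSM that fall inside that window. The sketch below slices a per-chain matrix (one row per residue, 20 columns for the standard amino acids) into 17-row windows and flattens each into a feature vector. Zero-padding at the termini, the absence of any score normalisation, and the names `pssm_windows` and `flatten` are assumptions for illustration; the actual layout of the Type3 files may differ.

```python
def pssm_windows(pssm, window=17):
    """Return one window of PSSM rows per residue, centred on that residue.

    `pssm` is a list of rows, one per residue. Positions beyond the chain
    termini are filled with all-zero rows (an assumption of this sketch).
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so that it has a central residue")
    half = window // 2
    n_cols = len(pssm[0]) if pssm else 0
    zero_row = [0] * n_cols
    padded = [zero_row] * half + list(pssm) + [zero_row] * half
    return [padded[i:i + window] for i in range(len(pssm))]


def flatten(window_rows):
    """Concatenate the rows of one window into a single feature vector."""
    return [value for row in window_rows for value in row]


# Toy PSSM: 5 residues x 20 columns, values chosen only to be recognisable.
toy_pssm = [[r] * 20 for r in range(1, 6)]
toy_windows = pssm_windows(toy_pssm)
toy_vector = flatten(toy_windows[0])
print(len(toy_windows), "windows,", len(toy_vector), "features each")
```

With a window of 17 residues and 20 columns per residue, each flattened profile has 17 x 20 = 340 values.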