Download Datasets

SAMbinder provides a gold-standard dataset of SAM-interacting protein chains derived from the PDB. The dataset was generated using standard techniques and contains 145 non-redundant SAM-binding protein chains (CD-HIT, 40% cutoff). It is divided into two parts: i) the training dataset, consisting of 118 non-redundant protein chains (1798 SAM-interacting residues and 33314 non-interacting residues), and ii) the independent/validation dataset, comprising 27 non-redundant PDB chains (390 SAM-interacting residues and 6715 non-interacting residues).
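The residue counts above imply a strong class imbalance, which matters when training or evaluating a residue-level predictor. The short Python sketch below only re-derives totals and proportions from the numbers quoted in this section; the variable and function names are illustrative and not part of SAMbinder.

```python
# Residue counts as quoted in the dataset description.
train_pos, train_neg = 1798, 33314   # training dataset (118 chains)
valid_pos, valid_neg = 390, 6715     # independent/validation dataset (27 chains)


def positive_fraction(pos, neg):
    """Fraction of residues that interact with SAM."""
    return pos / (pos + neg)


total_chains = 118 + 27
print("total chains:", total_chains)
print("training residues:", train_pos + train_neg)
print("validation residues:", valid_pos + valid_neg)
print(f"training positive fraction: {positive_fraction(train_pos, train_neg):.2%}")
print(f"validation positive fraction: {positive_fraction(valid_pos, valid_neg):.2%}")
```

Roughly one residue in twenty is SAM-interacting in both parts, so accuracy alone is a weak performance measure on this data.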

To help users work with the dataset effectively, we provide not only the SAM-binding chains but also binary/PSSM patterns. Three types of dataset can be downloaded: i) Type 1 contains the protein chains, ii) Type 2 contains patterns of length 17 amino acids for each protein, and iii) Type 3 contains the PSSM profile for each protein.

Dataset Type1: Protein chains along with the interaction information.

Dataset_Type1

Description

Files

Main

The dataset contains 118 SAM-interacting protein chains. Interacting residues are marked with a '+' sign and non-interacting residues with a '-' sign.

Validation

The dataset contains 27 SAM-interacting protein chains. Interacting residues are marked with a '+' sign and non-interacting residues with a '-' sign.
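As an illustration of the '+' / '-' annotation, the Python sketch below extracts the interacting residues from one chain. It assumes that each chain is available as a sequence string paired with a label string of equal length; the exact layout of the downloadable files is not described here, so this pairing, the function name, and the toy sequence are all hypothetical.

```python
def interacting_positions(sequence, labels):
    """Return (1-based position, residue) for every residue labelled '+'."""
    if len(sequence) != len(labels):
        raise ValueError("sequence and label string must have the same length")
    unexpected = set(labels) - {"+", "-"}
    if unexpected:
        raise ValueError(f"unexpected label characters: {sorted(unexpected)}")
    return [
        (position, residue)
        for position, (residue, label) in enumerate(zip(sequence, labels), start=1)
        if label == "+"
    ]


# Toy example, not taken from the dataset.
toy_sequence = "MKTAYIAKQR"
toy_labels = "--++-----+"
print(interacting_positions(toy_sequence, toy_labels))
# [(3, 'T'), (4, 'A'), (10, 'R')]
```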

Dataset Type2: This dataset type consists of patterns generated from PDB chains.

Dataset_Type2

Description

Files

Main

The dataset contains patterns of window length 17 generated from 118 SAM-interacting protein chains. Separate positive and negative patterns are provided for each PDB chain.

Validation

The dataset contains patterns of window length 17 generated from 27 SAM-interacting protein chains. Separate positive and negative patterns are provided for each PDB chain.
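Window patterns of this kind are commonly produced by sliding a fixed-length window along the chain. The Python sketch below shows one way to do this under three assumptions that are not stated in this section: each residue yields one pattern centred on it, the pattern takes its label from the central residue, and the chain termini are padded with a dummy residue 'X'. The function name and the toy input are hypothetical.

```python
def window_patterns(sequence, labels, window=17, pad="X"):
    """Split a labelled chain into positive and negative window patterns.

    Each pattern is centred on one residue and inherits that residue's label
    (assumed convention). Termini are padded with `pad` (assumed convention).
    """
    if window % 2 == 0:
        raise ValueError("window length must be odd so that it has a centre")
    if len(sequence) != len(labels):
        raise ValueError("sequence and label string must have the same length")

    half = window // 2
    padded = pad * half + sequence + pad * half

    positive, negative = [], []
    for index, label in enumerate(labels):
        pattern = padded[index:index + window]  # centred on sequence[index]
        if label == "+":
            positive.append(pattern)
        else:
            negative.append(pattern)
    return positive, negative


# Toy example, not taken from the dataset.
positive, negative = window_patterns("MKTAYIAKQR", "--++-----+")
print(len(positive), len(negative))
print(positive[0])
```

With this scheme a chain of N residues gives exactly N patterns, and the central residue of every pattern sits at index 8 of the 17-character string.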

Dataset Type3: This dataset type consists of the PSSM profile of each pattern generated from PDB chains.

Dataset_Type3

Description

Files

Main

The dataset contains the PSSM profile of each pattern of window length 17 generated from 118 SAM-interacting protein chains. Separate positive and negative profiles are provided for each PDB chain.

Validation

The dataset contains the PSSM profile of each pattern of window length 17 generated from 27 SAM-interacting protein chains. Separate positive and negative profiles are provided for each PDB chain.
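To make the idea of a per-pattern PSSM profile concrete, the Python sketch below cuts a 17-row window out of a per-residue PSSM. It assumes the PSSM is held as a list of rows (one row of 20 scores per residue) and that positions beyond the chain termini are filled with zero rows; neither the storage format nor the padding convention is specified in this section, and the function name is hypothetical.

```python
def pssm_window(pssm, center, window=17):
    """Return the window of PSSM rows centred on residue index `center` (0-based).

    Rows that fall outside the chain are replaced with zero rows
    (assumed padding convention).
    """
    if window % 2 == 0:
        raise ValueError("window length must be odd so that it has a centre")
    if not 0 <= center < len(pssm):
        raise IndexError("center is outside the chain")

    half = window // 2
    n_columns = len(pssm[0])
    rows = []
    for index in range(center - half, center + half + 1):
        if 0 <= index < len(pssm):
            rows.append(list(pssm[index]))
        else:
            rows.append([0] * n_columns)
    return rows


# Toy PSSM for a 5-residue chain: row i is filled with the value i + 1.
toy_pssm = [[value] * 20 for value in range(1, 6)]
profile = pssm_window(toy_pssm, center=0)
flat = [score for row in profile for score in row]
print(len(profile), len(flat))
```

Flattening a 17 x 20 window gives a vector of 340 values per pattern.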