We obtained neurotoxin proteins from the SWISS-PROT database (http://us.expasy.org/srs.sbin/cgi-bin/wgetx/) using the search term “Neurotoxins”, and non-toxin protein sequences using the BUTNOT option in the comment field (function BUTNOT toxin). We examined the source and function of these proteins and obtained 932 proteins for further processing: eubacteria (13), Cnidaria (31), Mollusca (111), Arthropoda (spider, 165; scorpion, 314) and Chordata (295); see Table 1. We used the PROSET software to create a dataset of non-redundant proteins in which no two proteins have more than 90% sequence identity. The final dataset consists of 605 non-redundant neurotoxins. For source classification, the dataset consists of 323 Arthropoda (scorpion and spider), 13 eubacteria (Clostridium), 152 Chordata (snake), 23 Cnidaria (sea anemone) and 94 Mollusca (cone) neurotoxins. For function classification, the dataset consists of 332 neurotoxins that block ion channels, 89 that block acetylcholine receptors, 8 that inhibit acetylcholine release via metalloproteolytic activity, 21 that inhibit acetylcholine release via phospholipase A2 activity and 10 that facilitate acetylcholine release. For classification of ion channel inhibitors, the dataset consists of 81 calcium ion channel inhibitors, 8 chloride ion channel inhibitors, 91 potassium ion channel inhibitors and 150 sodium ion channel inhibitors.
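The redundancy-removal step can be illustrated with a minimal Python sketch. It is not the PROSET algorithm, which is not described here; it assumes a simple greedy filter, and it approximates sequence identity with the standard-library difflib matcher rather than a proper pairwise alignment. The function names identity and remove_redundancy, and the toy sequences in the usage note, are hypothetical.

```python
from difflib import SequenceMatcher


def identity(a: str, b: str) -> float:
    """Crude identity proxy (assumption, not an alignment score):
    residues in common matching blocks / length of the shorter sequence."""
    if not a or not b:
        return 0.0
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / min(len(a), len(b))


def remove_redundancy(seqs: dict[str, str], threshold: float = 0.90) -> dict[str, str]:
    """Greedy filter: visit sequences from longest to shortest and keep one
    only if its identity to every already-kept sequence is <= threshold."""
    kept: dict[str, str] = {}
    for name, seq in sorted(seqs.items(), key=lambda kv: -len(kv[1])):
        if all(identity(seq, other) <= threshold for other in kept.values()):
            kept[name] = seq
    return kept
```

For example, given the toy input {"a": "ACDEFGHIKLMNPQRSTVWY", "b": "ACDEFGHIKLMNPQRSTVWA", "c": "WWWWWWWWWWYYYYYYYYYY"}, sequence "b" shares 19 of 20 residues with "a" and is discarded, while "a" and "c" are retained. A real pipeline would replace identity with an alignment-based measure.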
Table 1. Distribution of neurotoxins showing source and functions obtained from SWISS-PROT
IAR1 = inhibits acetylcholine release by metalloproteolytic activity; IAR2 = inhibits acetylcholine release by phospholipase A2 activity; FAR = facilitates acetylcholine release; BIC = blocks ion channels; BAR = blocks acetylcholine receptors; L = long; S = short; K = kappa; W = weak; OTH = others, including myotoxic, anticoagulant, hemorrhagic, hypotensive and bactericidal activity and excitatory symptoms.
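The composition of the non-redundant dataset given in the text can be held in a small lookup structure for later use, for instance when computing class proportions. The sketch below is illustrative only: the dictionary names and the helper class_fractions are hypothetical, and the counts are copied from the text.

```python
# Class sizes of the non-redundant dataset, as stated in the text.
SOURCE_COUNTS = {
    "Arthropoda": 323,   # scorpion and spider
    "Eubacteria": 13,    # Clostridium
    "Chordata": 152,     # snake
    "Cnidaria": 23,      # sea anemone
    "Mollusca": 94,      # cone
}

FUNCTION_COUNTS = {
    "BIC": 332,   # blocks ion channels
    "BAR": 89,    # blocks acetylcholine receptors
    "IAR1": 8,    # inhibits release, metalloproteolytic activity
    "IAR2": 21,   # inhibits release, phospholipase A2 activity
    "FAR": 10,    # facilitates acetylcholine release
}

ION_CHANNEL_COUNTS = {
    "calcium": 81,
    "chloride": 8,
    "potassium": 91,
    "sodium": 150,
}


def class_fractions(counts: dict[str, int]) -> dict[str, float]:
    """Return each class size as a fraction of the total of `counts`."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}
```

Calling class_fractions(SOURCE_COUNTS) gives the share of each source group among the 605 non-redundant neurotoxins, which makes the class imbalance explicit (for example, eubacteria contribute 13 of 605 sequences).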
The dataset for classification of neurotoxins and non-toxins
The dataset for source classification
The dataset for function classification
The dataset for sub-classification of ion channel inhibitors