ProPrInt: Help page
The following are the detailed results of Support Vector Machine optimization.
These results will help you select the optimum "threshold" for the prediction.
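The column headings used in the result tables are standard binary-classification measures. This page does not state their formulas, so the following is only a minimal sketch that assumes the usual confusion-matrix definitions. The function name `classification_metrics` and the example counts are hypothetical and are not taken from ProPrInt.

```python
import math


def _ratio(numerator, denominator):
    """Divide safely; return NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(tp, fp, tn, fn):
    """Compute the table columns from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives (assumed definitions).
    """
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": _ratio(tp, tp + fn),   # fraction of positives recovered
        "specificity": _ratio(tn, tn + fp),   # fraction of negatives recovered
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "ppv": _ratio(tp, tp + fp),           # positive predictive value
        "npv": _ratio(tn, tn + fn),           # negative predictive value
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),  # Matthews correlation
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Raising the threshold generally trades sensitivity for specificity, which is the pattern visible in each table.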
[A] Escherichia coli
(i) Amino Acid Composition
| Threshold | Sensitivity | Specificity | Accuracy | PPV | NPV | MCC |
| 1.000 | 0.565 | 1.000 | 0.968 | 1.000 | 0.967 | 0.739 |
| 0.800 | 0.709 | 1.000 | 0.979 | 1.000 | 0.978 | 0.833 |
| 0.600 | 0.811 | 1.000 | 0.986 | 1.000 | 0.985 | 0.894 |
| 0.400 | 0.887 | 1.000 | 0.992 | 1.000 | 0.991 | 0.938 |
| 0.200 | 0.930 | 1.000 | 0.995 | 1.000 | 0.995 | 0.962 |
| 0.000 | 0.957 | 1.000 | 0.997 | 0.999 | 0.997 | 0.976 |
| -0.200 | 0.979 | 1.000 | 0.998 | 0.999 | 0.998 | 0.988 |
| -0.400 | 0.990 | 1.000 | 0.999 | 0.997 | 0.999 | 0.993 |
| -0.600 | 0.994 | 0.999 | 0.999 | 0.994 | 1.000 | 0.994 |
| -0.800 | 0.996 | 0.993 | 0.994 | 0.921 | 1.000 | 0.955 |
| -1.000 | 1.000 | 0.803 | 0.817 | 0.284 | 1.000 | 0.477 |
(ii) Dipeptide Composition
| Threshold | Sensitivity | Specificity | Accuracy | PPV | NPV | MCC |
| 1.000 | 0.790 | 1.000 | 0.985 | 1.000 | 0.984 | 0.882 |
| 0.800 | 0.884 | 1.000 | 0.992 | 1.000 | 0.991 | 0.936 |
| 0.600 | 0.932 | 1.000 | 0.995 | 1.000 | 0.995 | 0.963 |
| 0.400 | 0.962 | 1.000 | 0.997 | 1.000 | 0.997 | 0.979 |
| 0.200 | 0.981 | 1.000 | 0.999 | 1.000 | 0.998 | 0.989 |
| 0.000 | 0.992 | 1.000 | 0.999 | 1.000 | 0.999 | 0.996 |
| -0.200 | 0.994 | 1.000 | 1.000 | 1.000 | 1.000 | 0.997 |
| -0.400 | 0.996 | 1.000 | 1.000 | 1.000 | 1.000 | 0.998 |
| -0.600 | 0.997 | 1.000 | 1.000 | 1.000 | 1.000 | 0.999 |
| -0.800 | 0.997 | 1.000 | 1.000 | 1.000 | 1.000 | 0.999 |
| -1.000 | 0.997 | 0.864 | 0.874 | 0.365 | 1.000 | 0.561 |
(iii) Biochemical classes tripeptide Composition
| Threshold | Sensitivity | Specificity | Accuracy | PPV | NPV | MCC |
| 1.000 | 0.564 | 1.000 | 0.968 | 1.000 | 0.967 | 0.738 |
| 0.800 | 0.721 | 1.000 | 0.980 | 1.000 | 0.979 | 0.840 |
| 0.600 | 0.839 | 1.000 | 0.988 | 1.000 | 0.988 | 0.910 |
| 0.400 | 0.914 | 1.000 | 0.994 | 1.000 | 0.993 | 0.953 |
| 0.200 | 0.946 | 1.000 | 0.996 | 1.000 | 0.996 | 0.971 |
| 0.000 | 0.972 | 1.000 | 0.998 | 1.000 | 0.998 | 0.985 |
| -0.200 | 0.985 | 1.000 | 0.999 | 1.000 | 0.999 | 0.992 |
| -0.400 | 0.993 | 1.000 | 0.999 | 1.000 | 0.999 | 0.996 |
| -0.600 | 0.996 | 1.000 | 1.000 | 1.000 | 1.000 | 0.998 |
| -0.800 | 0.999 | 0.999 | 0.999 | 0.990 | 1.000 | 0.994 |
| -1.000 | 1.000 | 0.680 | 0.703 | 0.196 | 1.000 | 0.365 |
[B] Saccharomyces cerevisiae
(i) Amino Acid Composition
| Threshold | Sensitivity | Specificity | Accuracy | PPV | NPV | MCC |
| 1.000 | 0.263 | 0.971 | 0.617 | 0.900 | 0.568 | 0.331 |
| 0.800 | 0.351 | 0.950 | 0.650 | 0.875 | 0.594 | 0.375 |
| 0.600 | 0.442 | 0.912 | 0.677 | 0.833 | 0.620 | 0.400 |
| 0.400 | 0.527 | 0.855 | 0.691 | 0.784 | 0.644 | 0.405 |
| 0.200 | 0.609 | 0.786 | 0.698 | 0.740 | 0.668 | 0.402 |
| 0.000 | 0.691 | 0.702 | 0.697 | 0.699 | 0.694 | 0.393 |
| -0.200 | 0.762 | 0.600 | 0.681 | 0.656 | 0.716 | 0.367 |
| -0.400 | 0.824 | 0.486 | 0.655 | 0.616 | 0.734 | 0.329 |
| -0.600 | 0.878 | 0.371 | 0.624 | 0.582 | 0.752 | 0.288 |
| -0.800 | 0.920 | 0.264 | 0.592 | 0.556 | 0.768 | 0.245 |
| -1.000 | 0.952 | 0.171 | 0.561 | 0.534 | 0.780 | 0.196 |
(ii) Dipeptide Composition
| Threshold | Sensitivity | Specificity | Accuracy | PPV | NPV | MCC |
| 1.000 | 0.282 | 0.985 | 0.633 | 0.949 | 0.578 | 0.375 |
| 0.800 | 0.388 | 0.972 | 0.680 | 0.932 | 0.614 | 0.444 |
| 0.600 | 0.489 | 0.948 | 0.718 | 0.903 | 0.650 | 0.491 |
| 0.400 | 0.578 | 0.916 | 0.747 | 0.874 | 0.685 | 0.525 |
| 0.200 | 0.660 | 0.860 | 0.760 | 0.825 | 0.717 | 0.531 |
| 0.000 | 0.728 | 0.786 | 0.757 | 0.773 | 0.743 | 0.515 |
| -0.200 | 0.791 | 0.682 | 0.736 | 0.713 | 0.765 | 0.476 |
| -0.400 | 0.850 | 0.550 | 0.700 | 0.654 | 0.785 | 0.419 |
| -0.600 | 0.900 | 0.404 | 0.652 | 0.602 | 0.802 | 0.351 |
| -0.800 | 0.941 | 0.256 | 0.599 | 0.559 | 0.813 | 0.271 |
| -1.000 | 0.969 | 0.135 | 0.552 | 0.528 | 0.813 | 0.188 |
(iii) Biochemical classes tripeptide Composition
| Threshold | Sensitivity | Specificity | Accuracy | PPV | NPV | MCC |
| 1.000 | 0.287 | 0.983 | 0.635 | 0.943 | 0.579 | 0.375 |
| 0.800 | 0.385 | 0.968 | 0.676 | 0.923 | 0.611 | 0.434 |
| 0.600 | 0.485 | 0.941 | 0.713 | 0.892 | 0.646 | 0.479 |
| 0.400 | 0.575 | 0.908 | 0.742 | 0.862 | 0.681 | 0.512 |
| 0.200 | 0.656 | 0.852 | 0.754 | 0.816 | 0.713 | 0.519 |
| 0.000 | 0.728 | 0.781 | 0.755 | 0.769 | 0.742 | 0.510 |
| -0.200 | 0.789 | 0.676 | 0.733 | 0.709 | 0.763 | 0.469 |
| -0.400 | 0.846 | 0.550 | 0.698 | 0.653 | 0.782 | 0.416 |
| -0.600 | 0.897 | 0.406 | 0.652 | 0.602 | 0.798 | 0.348 |
| -0.800 | 0.936 | 0.262 | 0.599 | 0.559 | 0.803 | 0.268 |
| -1.000 | 0.962 | 0.144 | 0.553 | 0.529 | 0.792 | 0.185 |
[C] Helicobacter pylori
(i) Amino Acid Composition
| Threshold | Sensitivity | Specificity | Accuracy | PPV | NPV | MCC |
| 1.000 | 0.324 | 0.986 | 0.655 | 0.957 | 0.593 | 0.413 |
| 0.800 | 0.452 | 0.976 | 0.714 | 0.950 | 0.640 | 0.502 |
| 0.600 | 0.559 | 0.954 | 0.757 | 0.924 | 0.684 | 0.558 |
| 0.400 | 0.660 | 0.910 | 0.785 | 0.880 | 0.728 | 0.589 |
| 0.200 | 0.753 | 0.868 | 0.811 | 0.851 | 0.779 | 0.626 |
| 0.000 | 0.825 | 0.800 | 0.813 | 0.805 | 0.821 | 0.626 |
| -0.200 | 0.884 | 0.711 | 0.798 | 0.754 | 0.860 | 0.604 |
| -0.400 | 0.932 | 0.600 | 0.766 | 0.700 | 0.898 | 0.564 |
| -0.600 | 0.958 | 0.476 | 0.717 | 0.646 | 0.919 | 0.496 |
| -0.800 | 0.979 | 0.307 | 0.643 | 0.585 | 0.935 | 0.385 |
| -1.000 | 0.991 | 0.178 | 0.584 | 0.547 | 0.952 | 0.290 |
(ii) Dipeptide Composition
| Threshold | Sensitivity | Specificity | Accuracy | PPV | NPV | MCC |
| 1.000 | 0.300 | 0.995 | 0.647 | 0.982 | 0.587 | 0.409 |
| 0.800 | 0.486 | 0.988 | 0.737 | 0.975 | 0.658 | 0.547 |
| 0.600 | 0.630 | 0.971 | 0.800 | 0.956 | 0.724 | 0.639 |
| 0.400 | 0.747 | 0.942 | 0.844 | 0.928 | 0.788 | 0.702 |
| 0.200 | 0.824 | 0.910 | 0.867 | 0.902 | 0.838 | 0.737 |
| 0.000 | 0.885 | 0.852 | 0.868 | 0.857 | 0.881 | 0.737 |
| -0.200 | 0.929 | 0.771 | 0.850 | 0.802 | 0.915 | 0.708 |
| -0.400 | 0.965 | 0.661 | 0.813 | 0.740 | 0.950 | 0.657 |
| -0.600 | 0.983 | 0.508 | 0.745 | 0.666 | 0.967 | 0.557 |
| -0.800 | 0.993 | 0.291 | 0.642 | 0.583 | 0.977 | 0.399 |
| -1.000 | 0.997 | 0.121 | 0.559 | 0.531 | 0.973 | 0.244 |
(iii) Biochemical classes tripeptide Composition
| Threshold | Sensitivity | Specificity | Accuracy | PPV | NPV | MCC |
| 1.000 | 0.285 | 0.995 | 0.640 | 0.983 | 0.582 | 0.398 |
| 0.800 | 0.461 | 0.986 | 0.724 | 0.971 | 0.647 | 0.526 |
| 0.600 | 0.630 | 0.970 | 0.800 | 0.954 | 0.724 | 0.637 |
| 0.400 | 0.743 | 0.944 | 0.843 | 0.930 | 0.786 | 0.701 |
| 0.200 | 0.818 | 0.904 | 0.861 | 0.895 | 0.833 | 0.725 |
| 0.000 | 0.877 | 0.846 | 0.862 | 0.851 | 0.873 | 0.724 |
| -0.200 | 0.922 | 0.765 | 0.843 | 0.797 | 0.907 | 0.695 |
| -0.400 | 0.955 | 0.645 | 0.800 | 0.729 | 0.934 | 0.631 |
| -0.600 | 0.980 | 0.490 | 0.735 | 0.658 | 0.961 | 0.540 |
| -0.800 | 0.988 | 0.291 | 0.639 | 0.582 | 0.959 | 0.388 |
| -1.000 | 0.997 | 0.123 | 0.560 | 0.532 | 0.978 | 0.248 |
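As one way to read the tables above, a threshold can be picked by scanning a table for the row that maximizes a chosen measure. The sketch below assumes MCC is the criterion, which is a common choice for a balanced trade-off but is not prescribed by this page. It uses the threshold and MCC columns copied from the Saccharomyces cerevisiae dipeptide composition table. The function name `best_threshold` is hypothetical.

```python
# (threshold, MCC) pairs from the S. cerevisiae dipeptide composition table.
rows = [
    (1.0, 0.375), (0.8, 0.444), (0.6, 0.491), (0.4, 0.525),
    (0.2, 0.531), (0.0, 0.515), (-0.2, 0.476), (-0.4, 0.419),
    (-0.6, 0.351), (-0.8, 0.271), (-1.0, 0.188),
]


def best_threshold(rows):
    """Return the (threshold, score) pair with the highest score.

    If several rows tie, the first one in the list is returned.
    """
    return max(rows, key=lambda row: row[1])


threshold, mcc = best_threshold(rows)
print(threshold, mcc)  # prints: 0.2 0.531
```

If false positives are more costly than missed interactions, a higher threshold with better specificity or PPV may be preferable to the MCC maximum.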
The following are the methods of feature extraction from an amino acid sequence.
(i) Amino Acid Composition
This is simply the frequency of each of the 20 natural amino acids in the protein sequence, expressed as a fraction of the sequence length (multiply by 100 for a percentage). A single protein sequence is therefore represented by a vector of 20 features.
Ci = fi / L
where Ci is the composition of the ith amino acid, fi is the number of occurrences of the ith amino acid in the sequence, and L is the total number of amino acid residues in the sequence.
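The formula Ci = fi / L can be sketched in a few lines. This is a minimal illustration, not ProPrInt's actual code: the function name `amino_acid_composition`, the alphabetical ordering of the 20 one-letter codes, and the example sequence are all assumptions made for the example.

```python
from collections import Counter

# The 20 natural amino acids as one-letter codes (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list where element i is Ci = fi / L."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)   # fi: occurrences of each residue
    length = len(seq)       # L: total number of residues
    return [counts[aa] / length for aa in AMINO_ACIDS]


# Hypothetical five-residue sequence, for illustration only.
vector = amino_acid_composition("MKKLA")
print(len(vector))                         # 20 features
print(vector[AMINO_ACIDS.index("K")])      # K occurs 2 times out of 5
```

Here L counts every character in the sequence, so non-standard residue letters would lower the fractions without getting a feature of their own. Whether that matches the server's handling is not stated on this page.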
(ii) Dipeptide Composition