Determining the subcellular localization of a protein is considered
among the most reliable and important ways to elucidate its function.
Over the last few years, numerous computational methods have been developed
to predict the subcellular locations of proteins; however, they are based on
different computational techniques, input features and datasets.
These include PSORTB, NNPSL, TargetP, LOCSVMPSI,
SignalP, ESLpred, CELLO, PSLpred, SubLoc and HSLPred.
The expansion of raw protein sequence databases in the post-genomic era and the availability of freshly annotated sequences for the major localizations motivated us to introduce "ESLpred2", a new and improved version of our previously developed eukaryotic subcellular localization prediction method, which was trained on a roughly 10-year-old and highly redundant dataset (referred to as the RH2427 dataset). The new version additionally includes a recently generated, highly non-redundant, kingdom-specific dataset (the one used for developing the BaCelLo method). Furthermore, a systematic approach has been taken to improve prediction quality using PSSM profiles generated by PSI-BLAST, together with compositional attributes and similarity-search-based information.
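To make the notion of compositional attributes concrete, the following minimal Python sketch computes two commonly used sequence features: amino acid composition (20 values) and dipeptide composition (400 values). It is an illustration only. The function names, the choice of these two features, and the handling of non-standard residues are assumptions, not a description of the features or code actually used in the method.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order that defines the feature layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid (20 values).

    Assumption: non-standard residues (e.g. X, B, Z) are ignored.
    """
    residues = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    total = len(residues)
    return [residues.count(aa) / total for aa in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Return the fraction of each ordered residue pair (400 values).

    Assumption: a pair is counted only if both residues are standard.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    seq = sequence.upper()
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard dipeptides")
    return [counts[p] / total for p in pairs]


if __name__ == "__main__":
    example = "MKTAYIAKQR"  # hypothetical toy sequence, not from any dataset
    features = amino_acid_composition(example) + dipeptide_composition(example)
    print(len(features))
```

Fixed-length vectors of this kind can be concatenated with other per-protein descriptors, such as values summarized from a PSSM profile, before being passed to a classifier.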
The present method achieves a high success rate for subcellular localization prediction, with good overall and average accuracy, and hence complements other existing subcellular localization prediction methods.