Models

AntiCP 3.0 uses two prediction models:

  • Finetuned-ESM2 + BLAST
  • Finetuned-ESM2
Both models are based on a fine-tuned ESM2-t33 deep learning architecture, trained on experimentally validated anticancer proteins. The hybrid model (Finetuned-ESM2 + BLAST) additionally uses BLAST-based alignment to identify similarity with known anticancer proteins, which further improves predictive accuracy.

Predict

This module enables users to predict anticancer proteins with ease. Users can input sequences in one of the following ways:

  • Type or paste single/multiple peptides in FASTA format directly into the query box.
  • Upload a file containing sequences in FASTA format.
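Before submitting, it can help to check that the input really is well-formed FASTA. The following is a minimal sketch of such a pre-check, not the server's own parser: the function names `parse_fasta` and `non_standard_residues` are hypothetical, the example peptides are made up, and whether AntiCP 3.0 accepts non-standard residues is not stated here, so the check only reports them.

```python
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            # Sequence lines may be wrapped; collect and join them later.
            chunks.append(line.upper())
        else:
            raise ValueError("sequence data found before the first '>' header")
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def non_standard_residues(sequence):
    """Return the sorted characters that are not one of the 20 standard amino acids."""
    return sorted(set(sequence) - STANDARD_RESIDUES)


# Illustrative input only; these peptides are not taken from the AntiCP 3.0 dataset.
example = """>pep1
GLFDIVKKVV
GALGSL
>pep2
KWKLFKKIXK
"""

records = parse_fasta(example)
for name, seq in records:
    print(name, seq, non_standard_residues(seq))
```
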
After providing the sequences, users can choose a prediction model:
  1. Finetuned ESM2 + BLAST (default option).
  2. Finetuned ESM2
Users can set a cut-off threshold for classification based on their specific requirements. The module also allows users to select various protein properties to be displayed, including:
  • Hydrophobicity
  • Steric hindrance
  • Molecular weight
  • Net charge
  • And more
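To give a feel for what such properties measure, here is a rough sketch of two of them. The exact formulas and scales used by AntiCP 3.0 are not given in this manual, so both functions are assumptions: net charge is approximated by counting K/R as +1 and D/E as -1 (ignoring histidine and the termini), and hydrophobicity is illustrated with the mean Kyte-Doolittle hydropathy, which may differ from the scale the server uses.

```python
# Kyte-Doolittle hydropathy values (used here for illustration only).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def net_charge(sequence):
    """Approximate net charge: +1 per K/R, -1 per D/E (His and termini ignored)."""
    positive = sum(sequence.count(r) for r in "KR")
    negative = sum(sequence.count(r) for r in "DE")
    return positive - negative


def mean_hydropathy(sequence):
    """Average Kyte-Doolittle hydropathy; raises KeyError on non-standard residues."""
    if not sequence:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[r] for r in sequence) / len(sequence)


peptide = "KWKLFKKIEKD"  # made-up example peptide
print("net charge:", net_charge(peptide))
print("mean hydropathy:", round(mean_hydropathy(peptide), 3))
```
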
For better job tracking, users can provide a job title and their email address (optional). Once all inputs and preferences are set, click Submit to process your prediction job.

Prediction Result

After the prediction, the results are displayed in a tabular format (see figure below). The information displayed depends on the selected prediction model:
  • Hybrid Model: The table includes the following columns:
    • ML_Score
    • BLAST_Score
    • Hybrid_Score
    • Prediction
    • All selected properties (e.g., hydrophobicity, molecular weight, etc.).
  • Finetuned ESM2 Model: The table includes:
    • ML_Score
    • Prediction
    • All selected properties (e.g., hydrophobicity, molecular weight, etc.).
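The column layout above can be sketched as a pair of small helpers that build one result row per sequence. This is an illustration under stated assumptions, not the server's code: the `ID` column, the `>=` comparison, and the choice to threshold the hybrid model on `Hybrid_Score` are assumptions, and the scores are passed in rather than computed because the manual does not say how `Hybrid_Score` is derived.

```python
def label(score, threshold):
    """Map a score to a class label using the user-chosen cut-off (assumed '>=')."""
    return "Anticancer" if score >= threshold else "Non-Anticancer"


def esm2_row(seq_id, ml_score, threshold, properties=None):
    """Row for the Finetuned ESM2 model: ML_Score, Prediction, then properties."""
    row = {"ID": seq_id, "ML_Score": ml_score, "Prediction": label(ml_score, threshold)}
    row.update(properties or {})
    return row


def hybrid_row(seq_id, ml_score, blast_score, hybrid_score, threshold, properties=None):
    """Row for the hybrid model; the label is assumed to follow Hybrid_Score."""
    row = {
        "ID": seq_id,
        "ML_Score": ml_score,
        "BLAST_Score": blast_score,
        "Hybrid_Score": hybrid_score,
        "Prediction": label(hybrid_score, threshold),
    }
    row.update(properties or {})
    return row


# Made-up scores, shown only to illustrate the two table layouts.
print(esm2_row("pep1", 0.82, threshold=0.5, properties={"Net charge": 3}))
print(hybrid_row("pep1", 0.82, 0.5, 1.32, threshold=0.5))
```
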

BLAST

This module enables users to search a query protein sequence against the database of known anticancer and non-anticancer proteins using a similarity-based search method, i.e. BLAST. The module predicts the query sequence as:

  • Anticancer: If a match (hit) is found in the database.
  • Non-Anticancer: If no match (hit) is found in the database.
Users can input multiple protein sequences at once in the text area or upload them in FASTA format.
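The hit/no-hit rule above can be sketched against BLAST's standard 12-column tabular output (`-outfmt 6`), where column 1 is the query ID and column 11 is the E-value. This is a sketch, not the module's implementation: the E-value cut-off of 1e-3, the function names, and the sample hit lines (including the subject IDs) are all assumptions made for illustration.

```python
def queries_with_hits(blast_tabular, max_evalue=1e-3):
    """Return the query IDs with at least one hit at or below max_evalue."""
    hits = set()
    for line in blast_tabular.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id, evalue = fields[0], float(fields[10])
        if evalue <= max_evalue:
            hits.add(query_id)
    return hits


def blast_predictions(query_ids, blast_tabular, max_evalue=1e-3):
    """Label each query 'Anticancer' if it has a hit, otherwise 'Non-Anticancer'."""
    hits = queries_with_hits(blast_tabular, max_evalue)
    return {q: ("Anticancer" if q in hits else "Non-Anticancer") for q in query_ids}


# Made-up tabular lines: q1 has a strong hit, q2 only a weak one, q3 has none.
sample = "\n".join([
    "\t".join(["q1", "acp_0042", "100.0", "25", "0", "0", "1", "25", "1", "25", "2e-10", "52.0"]),
    "\t".join(["q2", "acp_0107", "40.0", "10", "6", "0", "3", "12", "8", "17", "4.5", "14.2"]),
])

print(blast_predictions(["q1", "q2", "q3"], sample))
```
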

BLAST Result

Based on the hit, each submitted sequence is given a prediction. Users can also view the detailed BLAST alignment results.

Standalone

Installation and Usage

  1. Navigate to the Download tab.
  2. Download the standalone version: anticp3.zip.
  3. Note: If you do not have Anaconda or Miniconda installed on your system, you can install Miniconda by following the Miniconda Installation Guide.
  4. Open your terminal and navigate to the directory where the file was downloaded.
  5. Unzip the standalone file using the following command:
    unzip anticp3.zip
  6. Change to the anticp3_standalone directory:
    cd anticp3_standalone
  7. Set up the conda environment by running:
    conda env create -f environment.yml
  8. Activate the environment:
    conda activate anticp3
  9. Run the Python script with the -h flag to display the available options:
    python3 anticp3.py -h

Your environment is now set up, and you can begin using the Anticp3 standalone on your system.

Pip Package

Install the official AntiCP 3.0 Python package directly from PyPI for seamless integration into your workflows.

Installation Command:

pip install anticp3

Once installed, you can use the prediction module directly in your Python scripts or terminal.

Visit our PyPI Page for full documentation and updates.

HuggingFace Model

Access the fine-tuned AntiCP 3.0 model on Hugging Face for quick inference and experimentation.

Model Page:

Hugging Face Repository (raghavagps-group/anticp3)

Example Code to Load the Model:

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("raghavagps-group/anticp3")
tokenizer = AutoTokenizer.from_pretrained("raghavagps-group/anticp3")

sequence = "MANCVVGYIGERCQYRDLKWWELRGGGGSGGGGSAPAFSVSPASGLSDGQSVSVSVSGAAAGETYYIAQCAPVGGQDACNPATATSFTTDASGAASFSFVVRKSYTGSTPEGTPVGSVDCATAACNLGAGNSGLDLGHVALTFGGGGGSGGGGSDHYNCVSSGGQCLYSACPIFTKIQGTCYRGKAKCCKLEHHHHHH"

# Tokenize and run inference
inputs = tokenizer(sequence, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits
    probs = torch.nn.functional.softmax(logits, dim=-1)
    prediction = torch.argmax(probs, dim=1).item()

labels = {0: "Non-Anticancer", 1: "Anticancer"}
print("Prediction:", labels[prediction])

This allows you to use AntiCP 3.0 inside any deep learning pipeline easily!
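Taking the argmax over two classes is equivalent to a 0.5 cut-off on the anticancer probability. If a stricter or looser cut-off is wanted, the logits can be converted to a probability and compared against it directly. The sketch below uses only the standard library so it runs without the model; the function names and the example logits are made up, and class index 1 is taken to mean "Anticancer", as in the labels above.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_cutoff(logits, cutoff=0.5):
    """Return (label, anticancer probability) using a custom cut-off on class 1."""
    p_acp = softmax(logits)[1]
    label = "Anticancer" if p_acp >= cutoff else "Non-Anticancer"
    return label, p_acp


# Hypothetical logits for one sequence, e.g. model(**inputs).logits[0].tolist()
example_logits = [0.2, 1.1]
print(predict_with_cutoff(example_logits))              # default 0.5 cut-off
print(predict_with_cutoff(example_logits, cutoff=0.8))  # stricter cut-off
```
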