Everything you need to know about using HAIRpred2 — from input preparation to interpreting results.
HAIRpred2 is a structure-based computational tool that predicts antibody-interacting residues (B-cell epitopes) in human antigen structures. It uses Relative Solvent Accessibility (RSA) computed from DSSP, combined with physicochemical properties of amino acids, in a 15-residue sliding window framework with a pre-trained Random Forest model.
Unlike sequence-based tools, HAIRpred2 uses the 3D structure of your antigen to accurately identify surface-exposed residues that form epitopes — the sites where antibodies bind.
HAIRpred2 is trained exclusively on human Ag-Ab complexes. For non-human antigens, predictions may be less accurate.
Upload your antigen structure in PDB format (.pdb extension). The file must contain ATOM records.
PDB files can be obtained from the RCSB Protein Data Bank. AlphaFold-predicted structures are also accepted.
Specify the chain ID of the antigen. Chain IDs are case-sensitive single letters (e.g. A). Check the SEQRES or ATOM records in your PDB file to find available chain IDs.
Enter a single chain as A, or multiple chains as A,B (comma-separated, no spaces). If you are unsure of the chain IDs, open the PDB file in a text editor and look for lines starting with ATOM: the chain ID is in column 22 (counting from 1, which is index 21 when counting from 0).
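The chain IDs can also be listed programmatically. The following is a minimal sketch, not part of HAIRpred2: `list_chain_ids` is a hypothetical helper name, and it assumes only the fixed-width PDB layout, in which the chain ID of an ATOM record sits at column 22 counting from 1 (string index 21 in Python).

```python
def list_chain_ids(pdb_text):
    """Return chain IDs in order of first appearance in ATOM records."""
    chains = []
    for line in pdb_text.splitlines():
        # Column 22 (1-based) of the fixed-width record is index 21 in Python.
        if line.startswith("ATOM") and len(line) > 21:
            chain = line[21]
            if chain not in chains:
                chains.append(chain)
    return chains


# Three illustrative ATOM records (coordinates are placeholders).
example = "\n".join([
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38",
    "ATOM      3  N   GLY B   1      10.000  11.000  12.000  1.00 20.00",
])
print(list_chain_ids(example))
```

Because chain IDs are case-sensitive, the helper returns them exactly as they appear in the file.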
Residues with predicted probability ≥ threshold are labeled Interacting. Default is 0.5.
| Threshold | Effect | Use case |
|---|---|---|
| 0.7 – 0.9 | Fewer predictions, high confidence | When you want minimal false positives |
| 0.5 | Balanced (recommended default) | General epitope mapping |
| 0.3 – 0.4 | More predictions, high coverage | When sensitivity is more important |
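To see how the threshold changes the result, the labeling rule (probability ≥ threshold means Interacting) can be applied to a few probabilities by hand. This is an illustrative sketch only; `label_residues` is a hypothetical helper, and the three probabilities are taken from the sample CSV rows shown in this guide.

```python
def label_residues(probabilities, threshold=0.5):
    """Label each probability using the rule: probability >= threshold."""
    return [
        "Interacting" if p >= threshold else "Non-interacting"
        for p in probabilities
    ]


probs = {"T22": 0.3812, "N23": 0.6247, "S24": 0.7103}

hits_by_threshold = {}
for threshold in (0.3, 0.5, 0.7):
    labels = label_residues(probs.values(), threshold)
    hits = [res for res, lab in zip(probs, labels) if lab == "Interacting"]
    hits_by_threshold[threshold] = hits
    print(threshold, hits)
```

Lowering the threshold to 0.3 labels all three residues as Interacting, while raising it to 0.7 keeps only S24.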
Filters out buried residues with RSA below the set minimum. Residues with RSA < 0.05 are deeply buried inside the protein core and are very unlikely to form antibody contacts. Recommended setting: 0.05.
Every prediction generates 5 output files, all sharing the same job prefix:
| File | Contents |
|---|---|
| .csv | Per-residue table: Residue (e.g. N23), RSA, Probability (0–1), Prediction label |
| _summary.txt | Total/interacting/non-interacting counts, average probability, top 10 residues |
| _bfactor.pdb | PDB file with B-factor column replaced by probability × 100 |
| .pml | PyMOL script that loads the structure, colors residues, and adds labels |
| _patches.txt | Spatially clustered interacting residues forming epitope patches |
The main output file. Each row is one residue:
Residue,RSA,Probability,Prediction
T22,0.1823,0.3812,Non-interacting
N23,0.6541,0.6247,Interacting
S24,0.7102,0.7103,Interacting
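The CSV can be post-processed with any standard tool. As a minimal sketch using only Python's built-in `csv` module, the snippet below parses the rows above and ranks the interacting residues by probability. `read_predictions` is a hypothetical helper name, and the column names are those of the header line shown above.

```python
import csv
import io


def read_predictions(handle):
    """Parse a HAIRpred2-style CSV into a list of typed dictionaries."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "residue": row["Residue"],
            "rsa": float(row["RSA"]),
            "probability": float(row["Probability"]),
            "prediction": row["Prediction"],
        })
    return rows


# In practice, pass an open file handle; StringIO stands in for the file here.
sample = io.StringIO(
    "Residue,RSA,Probability,Prediction\n"
    "T22,0.1823,0.3812,Non-interacting\n"
    "N23,0.6541,0.6247,Interacting\n"
    "S24,0.7102,0.7103,Interacting\n"
)
rows = read_predictions(sample)

# Keep interacting residues and sort by descending probability.
interacting = sorted(
    (r for r in rows if r["prediction"] == "Interacting"),
    key=lambda r: r["probability"],
    reverse=True,
)
for r in interacting:
    print(r["residue"], r["probability"])
```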
Groups spatially adjacent interacting residues (Cα distance < 10Å) into clusters. Each cluster represents a likely epitope patch — the region an antibody would physically contact. Patches with higher average probability are stronger candidates.
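The idea behind patch formation can be sketched as single-linkage clustering on Cα coordinates. This is an assumption made for illustration: the exact clustering procedure used by HAIRpred2 is not described here. The function name, the coordinates, and the probabilities below are all hypothetical.

```python
import math


def cluster_patches(ca_coords, cutoff=10.0):
    """Group residues whose C-alpha atoms are linked by distances < cutoff.

    Single-linkage: two residues share a patch if a chain of neighbours,
    each closer than the cutoff, connects them.
    """
    names = list(ca_coords)
    unvisited = set(names)
    patches = []
    for start in names:
        if start not in unvisited:
            continue
        unvisited.discard(start)
        patch, stack = [start], [start]
        while stack:
            current = stack.pop()
            for other in names:
                if other in unvisited and math.dist(
                    ca_coords[current], ca_coords[other]
                ) < cutoff:
                    unvisited.discard(other)
                    patch.append(other)
                    stack.append(other)
        patches.append(patch)
    return patches


# Hypothetical C-alpha coordinates (in angstroms) and probabilities.
coords = {
    "N23": (0.0, 0.0, 0.0), "S24": (3.8, 0.0, 0.0), "K25": (12.0, 0.0, 0.0),
    "D80": (40.0, 0.0, 0.0), "E81": (43.8, 0.0, 0.0),
}
probs = {"N23": 0.62, "S24": 0.71, "K25": 0.55, "D80": 0.81, "E81": 0.77}

patches = cluster_patches(coords)
# Rank patches by average probability, strongest candidate first.
ranked = sorted(
    patches, key=lambda p: sum(probs[r] for r in p) / len(p), reverse=True
)
print(ranked)
```

Here K25 is 12 Å from N23 but joins the same patch through S24, which lies within 10 Å of both.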
Open PyMOL and run:
@result.pml
This will load the structure, color the residues, and add labels such as N23 (0.72).
To toggle labels in PyMOL:
hide labels # turn off all labels
show labels # turn them back on
set label_size, 10 # make labels smaller if needed
# PyMOL
load result_bfactor.pdb
spectrum b, blue_white_red
# ChimeraX
open result_bfactor.pdb
color bfactor
Colors the structure from blue (low probability) through white to red (high probability), giving a continuous probability map across the entire surface.
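For readers who want to reproduce this kind of file from the CSV, the sketch below shows one way to write probability × 100 into the B-factor field of ATOM records (columns 61–66 of the fixed-width PDB format). It is not HAIRpred2's own code. `set_bfactor_from_probability` is a hypothetical helper, residues without a score are assumed to get 0, and insertion codes are ignored.

```python
def set_bfactor_from_probability(pdb_text, probabilities):
    """Overwrite the B-factor field of ATOM records with probability * 100.

    probabilities maps (chain_id, residue_number) to a value in [0, 1].
    """
    out = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and len(line) >= 66:
            # Chain ID: column 22; residue number: columns 23-26 (1-based).
            key = (line[21], int(line[22:26]))
            score = probabilities.get(key, 0.0) * 100.0
            # B-factor: columns 61-66 (1-based), i.e. slice 60:66.
            line = line[:60] + f"{score:6.2f}" + line[66:]
        out.append(line)
    return "\n".join(out)


# Build one ATOM record with explicit field widths (placeholder coordinates).
atom_line = (
    "ATOM  {:>5} {:<4} {:>3} {}{:>4}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}"
).format(1, " CA", "ASN", "A", 23, 27.340, 24.430, 2.614, 1.00, 9.67)

updated = set_bfactor_from_probability(atom_line, {("A", 23): 0.6247})
print(updated)
```

The rest of each record is left unchanged, so the output can be colored by B-factor in the same way as any other PDB file.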
HAIRpred2 encodes 7 features per residue across a 15-residue sliding window (the central residue plus 7 residues on each side), giving 7 × 15 = 105 features for each predicted residue.
| Feature | Source | Description |
|---|---|---|
| RSA | DSSP | Relative Solvent Accessibility from 3D structure (0 = buried, 1 = fully exposed) |
| pI | AA property | Isoelectric point of the amino acid |
| pKa1 | AA property | First acid dissociation constant |
| pKa2 | AA property | Second acid dissociation constant |
| Hydrophobicity | AA property | Hydrophobicity index |
| Steric | AA property | Steric parameter (side chain bulkiness) |
| EIIP | AA property | Electron-ion interaction pseudopotential |
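The window encoding can be illustrated with a short sketch. It assumes each residue already has its 7 feature values, and it pads the chain termini with zeros. The padding is an assumption made here for illustration, since the way HAIRpred2 handles windows that extend past the ends of the chain is not stated. `window_features` is a hypothetical name.

```python
def window_features(per_residue, window=15):
    """Flatten the features of each residue and its neighbours into one vector.

    per_residue is a list with one feature list per residue.
    Returns one vector of length window * n_features per residue.
    """
    half = window // 2
    n_features = len(per_residue[0])
    pad = [0.0] * n_features  # assumption: zero padding beyond the termini
    padded = [pad] * half + list(per_residue) + [pad] * half
    vectors = []
    for i in range(len(per_residue)):
        # padded[i + half] is residue i, so this slice is centred on it.
        flat = [v for residue in padded[i:i + window] for v in residue]
        vectors.append(flat)
    return vectors


# 20 residues with 7 placeholder feature values each.
features = [[float(i)] * 7 for i in range(20)]
vectors = window_features(features)
print(len(vectors), len(vectors[0]))
```

Each vector has 105 values, and the 7 values of the central residue occupy positions 49 to 55 (7 neighbours × 7 features precede them).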
HAIRpred2 accepts standard PDB format with ATOM records. Files downloaded from RCSB PDB work directly without any modification.
Yes, predicted structures such as AlphaFold models can be used. HAIRpred2 accepts any PDB file with valid ATOM records. However, accuracy may be slightly lower for predicted structures than for experimentally resolved ones, since RSA values depend on the quality of the 3D model.
A prediction typically takes 30–120 seconds depending on protein size. Large proteins (>500 residues) may take up to 3 minutes.
Check that: (1) your PDB file is valid and has ATOM records, (2) the chain ID you entered exists in the file, (3) the file is under 10 MB. If the issue persists, contact us with your job ID.
An epitope patch is a group of spatially adjacent interacting residues (Cα distance < 10 Å). Most antibodies bind to a cluster of surface residues rather than a single isolated residue. Epitope patches identify these clusters and are more biologically meaningful than individual residue scores.
HAIRpred (v1) was sequence-based — it predicted epitopes from amino acid sequence alone. HAIRpred2 uses 3D structural information (RSA from the actual PDB structure) which significantly improves accuracy, especially for conformational epitopes that sequence-based tools cannot detect.
Yes, HAIRpred2 can be run locally. Download the Python standalone package from the Download page. It runs without internet access and supports multiple chains, RSA filtering, and custom thresholds via the command line.