HAIRpred2 — Help & Documentation

Overview

HAIRpred2 is a structure-based computational tool that predicts antibody-interacting residues (B-cell epitopes) in human antigen structures. It uses Relative Solvent Accessibility (RSA) computed from DSSP, combined with physicochemical properties of amino acids, in a 15-residue sliding window framework with a pre-trained Random Forest model.

Unlike sequence-based tools, HAIRpred2 uses the 3D structure of your antigen to identify surface-exposed residues that are likely to form epitopes, the sites where antibodies bind.

HAIRpred2 is trained exclusively on human antigen-antibody (Ag-Ab) complexes, so predictions for non-human antigens may be less accurate.

Input Requirements

PDB File

Upload your antigen structure in PDB format (.pdb extension). Requirements:

Must contain standard ATOM records
Must be a 3D resolved structure with coordinates
Resolution better than 3.0 Å is recommended
File size under 10 MB

PDB files can be obtained from RCSB Protein Data Bank. AlphaFold predicted structures are also accepted.

Chain ID

Specify the chain ID of the antigen. Chain IDs are case-sensitive single characters, usually letters (e.g. A). Check the SEQRES or ATOM records in your PDB file to find the available chain IDs.

Single chain: enter A
Multiple chains: enter A,B (comma-separated, no spaces)

If you are unsure of the chain IDs, open the PDB file in a text editor and look for lines starting with ATOM. The chain ID is the character in column 22 (counting from 1, which is index 21 when counting from 0).
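The same check can be scripted. The sketch below is illustrative only: the function name `chain_ids` and the sample ATOM lines are made up for this example and are not part of HAIRpred2. It relies on the fixed-width PDB layout, in which the chain ID sits in column 22 of each ATOM record.

```python
def chain_ids(pdb_lines):
    """Return the distinct chain IDs found in ATOM records, in order of first appearance."""
    seen = []
    for line in pdb_lines:
        # Column 22 of the fixed-width record is index 21 in Python.
        if line.startswith("ATOM") and len(line) > 21:
            chain = line[21]
            if chain not in seen:
                seen.append(chain)
    return seen


# Illustrative ATOM records (not taken from a real structure).
example = [
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38           C",
    "ATOM      3  N   GLY B   1      10.000  11.000  12.000  1.00 20.00           N",
]
found = chain_ids(example)
print(found)  # ['A', 'B']
```

To scan a real file, pass an open file handle instead of the list, since iterating over a handle yields one line at a time.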

Parameters

Probability Threshold

Residues with predicted probability ≥ threshold are labeled Interacting. Default is 0.5.

Threshold	Effect	Use case
`0.7 – 0.9`	Fewer predictions, high confidence	When you want minimal false positives
`0.5`	Balanced (recommended default)	General epitope mapping
`0.3 – 0.4`	More predictions, high coverage	When sensitivity is more important
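As a minimal sketch of how the threshold changes the result, the snippet below applies the "probability ≥ threshold" rule to three example probabilities. The function name `label` and the dictionary layout are assumptions made for illustration, not HAIRpred2's own code.

```python
def label(probability, threshold=0.5):
    """Apply the 'probability >= threshold' rule described above."""
    return "Interacting" if probability >= threshold else "Non-interacting"


# Example probabilities keyed by residue.
probs = {"T22": 0.3812, "N23": 0.6247, "S24": 0.7103}

hits_by_threshold = {}
for t in (0.3, 0.5, 0.7):
    hits_by_threshold[t] = [r for r, p in probs.items() if label(p, t) == "Interacting"]
    print(t, hits_by_threshold[t])
```

Lowering the threshold to 0.3 labels all three residues Interacting, while raising it to 0.7 keeps only S24.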

RSA Filter

Filters out buried residues with RSA below the set minimum. Residues with RSA < 0.05 are deeply buried inside the protein core and are unlikely to contact an antibody. Recommended setting: 0.05.
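The filter can be pictured as a simple cut on RSA. This sketch assumes that filtered residues are dropped from the list; how HAIRpred2 reports them internally is not specified here. The function name `apply_rsa_filter` and the buried residue L21 are hypothetical.

```python
def apply_rsa_filter(residues, min_rsa=0.05):
    """Keep only residues whose RSA is at least min_rsa."""
    return [r for r in residues if r["rsa"] >= min_rsa]


residues = [
    {"residue": "L21", "rsa": 0.0120},  # hypothetical buried residue
    {"residue": "T22", "rsa": 0.1823},
    {"residue": "N23", "rsa": 0.6541},
]
kept = apply_rsa_filter(residues)
print([r["residue"] for r in kept])  # ['T22', 'N23']
```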

Output Files

Every prediction generates 5 output files, all sharing the same job prefix:

File	Contents
`.csv`	Per-residue table: Residue (e.g. N23), RSA, Probability (0–1), Prediction label
`_summary.txt`	Total/interacting/non-interacting counts, average probability, top 10 residues
`_bfactor.pdb`	PDB file with B-factor column replaced by probability ×100
`.pml`	PyMOL script — loads structure, colors residues, adds labels
`_patches.txt`	Spatially clustered interacting residues forming epitope patches
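To make the `_bfactor.pdb` entry concrete: in the fixed-width PDB format the B-factor occupies columns 61-66 of each ATOM record. The sketch below shows one way to overwrite that field with probability × 100. The helper name `set_bfactor` and the sample line are illustrative assumptions, and the exact number formatting HAIRpred2 uses may differ.

```python
def set_bfactor(line, probability):
    """Return an ATOM/HETATM line with columns 61-66 set to probability * 100."""
    if not line.startswith(("ATOM", "HETATM")):
        return line
    # Pad to 80 columns so the slice positions are always valid.
    padded = line.rstrip("\n").ljust(80)
    # Columns 61-66 correspond to the slice [60:66]; the field is 6 wide, 2 decimals.
    return padded[:60] + f"{probability * 100:6.2f}" + padded[66:]


# Illustrative ATOM record with an original B-factor of 10.38.
atom = "ATOM      2  CA  ASN A  23      26.266  25.413   2.842  1.00 10.38           C"
new_line = set_bfactor(atom, 0.6247)
print(new_line[60:66])  # ' 62.47'
```

The function returns the line without a trailing newline, so a caller writing a file would add one back.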

Prediction CSV

The main output file. Each row is one residue:

Residue,RSA,Probability,Prediction
T22,0.1823,0.3812,Non-interacting
N23,0.6541,0.6247,Interacting
S24,0.7102,0.7103,Interacting
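A short sketch of reading this file with Python's standard `csv` module follows. It uses the three example rows above as in-memory text; the function name `read_predictions` and the lower-case dictionary keys are choices made for this example.

```python
import csv
import io

csv_text = """Residue,RSA,Probability,Prediction
T22,0.1823,0.3812,Non-interacting
N23,0.6541,0.6247,Interacting
S24,0.7102,0.7103,Interacting
"""


def read_predictions(handle):
    """Parse the prediction CSV into a list of dicts with numeric RSA and probability."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "residue": row["Residue"],
            "rsa": float(row["RSA"]),
            "probability": float(row["Probability"]),
            "prediction": row["Prediction"],
        })
    return rows


rows = read_predictions(io.StringIO(csv_text))

# Interacting residues, highest probability first.
interacting = sorted(
    (r for r in rows if r["prediction"] == "Interacting"),
    key=lambda r: r["probability"],
    reverse=True,
)
for r in interacting:
    print(r["residue"], r["probability"])
```

For a real job, replace `io.StringIO(csv_text)` with a handle opened via `open(path, newline="")`.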

Epitope Patches

Groups spatially adjacent interacting residues (Cα distance < 10 Å) into clusters. Each cluster represents a likely epitope patch, that is, the region an antibody would physically contact. Patches with higher average probability are stronger candidates.
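One plausible way to form such clusters is to link any two interacting residues whose Cα atoms are closer than the cutoff and then take connected components. The sketch below does that; it is an assumption about the grouping rule, not a description of HAIRpred2's actual clustering code, and the coordinates are invented. It needs Python 3.8 or later for `math.dist`.

```python
import math


def cluster_patches(ca_coords, cutoff=10.0):
    """Group residues into connected components.

    Two residues are linked when their C-alpha atoms are closer than `cutoff` (in Å).
    """
    names = list(ca_coords)
    unvisited = set(names)
    patches = []
    for start in names:
        if start not in unvisited:
            continue
        unvisited.remove(start)
        patch, stack = [], [start]
        while stack:
            current = stack.pop()
            patch.append(current)
            # Collect neighbours first, then update the set, to avoid mutating it mid-iteration.
            near = [n for n in unvisited
                    if math.dist(ca_coords[current], ca_coords[n]) < cutoff]
            for n in near:
                unvisited.remove(n)
                stack.append(n)
        patches.append(sorted(patch))
    return patches


# Invented C-alpha coordinates for four interacting residues.
coords = {
    "N23": (0.0, 0.0, 0.0),
    "S24": (3.8, 0.0, 0.0),
    "K25": (12.0, 0.0, 0.0),  # 12 Å from N23 but 8.2 Å from S24, so linked through S24
    "D90": (40.0, 0.0, 0.0),  # far from everything, forms its own patch
}
patches = cluster_patches(coords)
print(patches)  # [['K25', 'N23', 'S24'], ['D90']]
```

With this linking rule a patch can span more than 10 Å end to end, because residues only need to be within the cutoff of at least one other member.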

Visualization

PyMOL Script (.pml)

Open PyMOL and run:

@result.pml

This will:

Load the B-factor PDB automatically
Color interacting residues red with sticks and semi-transparent surface
Color non-interacting residues blue
Add labels on each interacting Cα atom showing the residue and its probability, e.g. N23 (0.62)

To toggle labels in PyMOL:

hide labels     # turn off all labels
show labels     # turn them back on
set label_size, 10   # make labels smaller if needed

B-Factor Coloring (PyMOL or ChimeraX)

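# PyMOL (fix the color scale to 0-100, because B-factors hold probability ×100)
load result_bfactor.pdb
spectrum b, blue_white_red, minimum=0, maximum=100

# ChimeraX (same fixed 0-100 scale)
open result_bfactor.pdb
color bfactor palette blue:white:red range 0,100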

Colors the structure from blue (low probability) through white to red (high probability), giving a continuous probability map across the entire surface.

Feature Description

HAIRpred2 encodes 7 features per residue position and applies a 15-residue sliding window (the central residue plus 7 residues on each side), giving 7 × 15 = 105 features per central residue.

Feature	Source	Description
`RSA`	DSSP	Relative Solvent Accessibility from 3D structure (0 = buried, 1 = fully exposed)
`pI`	AA property	Isoelectric point of the amino acid
`pKa1`	AA property	First acid dissociation constant
`pKa2`	AA property	Second acid dissociation constant
`Hydrophobicity`	AA property	Hydrophobicity index
`Steric`	AA property	Steric parameter (side chain bulkiness)
`EIIP`	AA property	Electron-ion interaction pseudopotential
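The window arithmetic can be sketched as follows. The per-residue feature values are placeholders, the function name `window_features` is hypothetical, and padding positions beyond the chain termini with zeros is an assumption; HAIRpred2 may handle the termini differently.

```python
def window_features(per_residue, index, half_window=7):
    """Concatenate the feature vectors for positions index-7 .. index+7."""
    n_features = len(per_residue[0])
    padding = [0.0] * n_features  # assumption: zero padding past the termini
    vector = []
    for pos in range(index - half_window, index + half_window + 1):
        if 0 <= pos < len(per_residue):
            vector.extend(per_residue[pos])
        else:
            vector.extend(padding)
    return vector


# 20 residues, each with 7 placeholder features (residue i carries the value i + 1).
protein = [[float(i + 1)] * 7 for i in range(20)]

vec = window_features(protein, 0)
print(len(vec))  # 105
```

For the first residue, the 7 window positions before the N-terminus are padded, so the first 49 values are zeros and the central residue's own features start at offset 49.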

FAQ

What PDB format is accepted?

Standard PDB format with ATOM records. Files downloaded from RCSB PDB work directly without any modification.

Can I use AlphaFold predicted structures?

Yes. HAIRpred2 accepts any PDB file with valid ATOM records. However, accuracy may be slightly lower for predicted structures compared to experimentally resolved ones, since RSA values depend on the quality of the 3D model.

How long does prediction take?

Typically 30–120 seconds depending on protein size. Large proteins (>500 residues) may take up to 3 minutes.

My job failed — what should I do?

Check that: (1) your PDB file is valid and has ATOM records, (2) the chain ID you entered exists in the file, (3) the file is under 10 MB. If the issue persists, contact us with your job ID.

What is an epitope patch?

A group of spatially adjacent interacting residues (Cα distance < 10 Å). Most antibodies bind to a cluster of surface residues rather than a single isolated residue. Epitope patches identify these clusters and are more biologically meaningful than individual residue scores.

How is HAIRpred2 different from HAIRpred?

HAIRpred (v1) was sequence-based: it predicted epitopes from the amino acid sequence alone. HAIRpred2 adds 3D structural information (RSA computed from the actual PDB structure), which significantly improves accuracy, especially for conformational epitopes that sequence-based tools often miss.

Is a standalone version available?

Yes. Download the Python standalone package from the Download page. It runs locally without internet access and supports multiple chains, RSA filtering, and custom thresholds via command line.
