Machine learning approaches for prediction of linear B-cell epitopes on proteins

J Mol Recognit. 2006 May-Jun;19(3):200-8. doi: 10.1002/jmr.771.

Abstract

The identification and characterization of antigenic determinants on proteins has received considerable attention, using both experimental and computational methods. Computational routines have mostly relied on structural and physicochemical parameters to predict the antigenic propensity of protein sites. However, the performance of these routines has been low compared with experimental alternatives. Here we describe the construction of machine-learning-based classifiers to improve the prediction of linear B-cell epitopes on proteins. Our approach combines several parameters previously associated with antigenicity and includes novel parameters based on amino acid frequencies and amino acid neighborhood propensities. We used machine learning algorithms to derive antigenicity classification functions that assign an antigenic propensity to each amino acid of a given protein sequence. We compared the prediction quality of the novel classifiers with established routines for epitope scoring, and tested prediction accuracy on experimental data available for HIV proteins. The major finding is that machine learning classifiers clearly outperform the reference classification systems on the HIV epitope validation set.
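
The idea of assigning a propensity to each amino acid from its sequence context can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the window size, the exact parameters, or the learning algorithms, so the function names, the default `half_width` of 3, and the form of the `scale` dictionary are all assumptions made for illustration. The sketch computes, for every residue, the amino acid frequencies in a window centered on it, and separately averages a caller-supplied per-residue scale over the same window.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature vectors line up.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def window_at(sequence, i, half_width):
    """Return the residues within half_width positions of index i (clipped at the ends)."""
    start = max(0, i - half_width)
    end = min(len(sequence), i + half_width + 1)
    return sequence[start:end]


def composition_features(sequence, half_width=3):
    """Return one 20-dimensional amino acid frequency vector per residue.

    The window size is an assumption for illustration only.
    """
    features = []
    for i in range(len(sequence)):
        window = window_at(sequence, i, half_width)
        counts = Counter(window)
        features.append([counts[aa] / len(window) for aa in AMINO_ACIDS])
    return features


def windowed_scale_score(sequence, scale, half_width=3):
    """Average a per-residue scale over each window.

    `scale` is a hypothetical dict mapping residue letters to numeric values
    (for example a hydrophilicity scale); residues missing from it score 0.0.
    """
    scores = []
    for i in range(len(sequence)):
        window = window_at(sequence, i, half_width)
        scores.append(sum(scale.get(aa, 0.0) for aa in window) / len(window))
    return scores
```

Under these assumptions, the per-residue vectors from `composition_features` and the scores from `windowed_scale_score` could be concatenated and passed to any supervised classifier trained on residues labeled as epitope or non-epitope. Which classifier to use is left open here, since the abstract does not name one.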

MeSH terms

  • Algorithms
  • Artificial Intelligence*
  • Computational Biology / methods
  • Cross Reactions
  • Epitopes, B-Lymphocyte / chemistry
  • Epitopes, B-Lymphocyte / classification
  • Epitopes, B-Lymphocyte / immunology*
  • HIV / metabolism
  • Pattern Recognition, Automated / methods
  • Protein Structure, Tertiary
  • Proteins / chemistry
  • Proteins / classification
  • Proteins / immunology*
  • Reproducibility of Results
  • Viral Proteins / chemistry
  • Viral Proteins / classification
  • Viral Proteins / immunology

Substances

  • Epitopes, B-Lymphocyte
  • Proteins
  • Viral Proteins