Chemometric tools for classification and elucidation of protein secondary structure from infrared and circular dichroism spectroscopic measurements

Bibliographic Details
Title: Chemometric tools for classification and elucidation of protein secondary structure from infrared and circular dichroism spectroscopic measurements
Authors: Navea, Susana; Tauler, Romá; Goormaghtigh, Erik; de Juan, Anna
Source: Proteins: Structure, Function, and Bioinformatics. 63:527-541
Publisher Information: Wiley, 2006.
Publication Year: 2006
Subject Terms: Protein Structure, Secondary; Protein Conformation; Protein classification; Protein -- classification; Databases, Protein; Crystallography, X-Ray; Circular Dichroism; Circular Dichroism -- methods; Spectrophotometry, Infrared; Infrared -- methods; Infrared spectroscopy; Partial Least Squares (PLS); Cluster Analysis; Chemometrics; Chemistry; 0301 basic medicine; 0303 health sciences; 03 medical and health sciences
Description: Protein classification and characterization often rely on the information contained in the protein secondary structure. Protein class assignment is usually based on X‐ray diffraction measurements, which need the protein in a crystallized form, or on NMR spectra, to obtain the structure of a protein in solution. Simple spectroscopic techniques, such as circular dichroism (CD) and infrared (IR) spectroscopies, are also known to be related to protein secondary structure, but they have seldom been used for protein classification. To see the potential of CD, IR, and combined CD/IR measurements for protein classification, unsupervised pattern recognition methods, Principal Component Analysis (PCA) and cluster analysis, are proposed first to check for natural grouping tendencies of proteins according to their measured spectra. Partial Least Squares Discriminant Analysis (PLS‐DA), a supervised pattern recognition method, is used afterwards to test the possibility to model explicitly each protein class and to test these models in class assignment of unknown proteins. Determination of the protein secondary structure, understood as the prediction of the abundance of the different secondary structure motifs in the biomolecule, was carried out with the local regression method interval Partial Least Squares (iPLS). CD, IR, and CD/IR measurements were correlated to the fraction of the motif to be predicted, determined from X‐ray measurements. iPLS builds models extracting the spectral information most correlated to a specific secondary motif and avoids the use of irrelevant spectral regions. Spectral intervals chosen by iPLS models provide structural information which can be used to confirm previous biochemical assignments or identify new motif‐related spectral features. The predictive ability of the models built with the selected spectral regions has a quality similar to previous classical approaches. Proteins 2006. © 2006 Wiley‐Liss, Inc.
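The interval selection idea behind iPLS, as summarized in the Description, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simplified one-latent-variable PLS1 model, equal-width spectral intervals, and leave-one-out cross-validation, and it runs on synthetic data invented for the example (no values come from the paper). The function names `pls1_fit`, `loo_rmse`, `ipls`, and `make_synthetic` are hypothetical.

```python
import math
import random


def pls1_fit(X, y):
    """Fit a one-latent-variable PLS1 model and return a predict function."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: channel-space direction with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w)) or 1.0
    w = [v / norm for v in w]
    # Scores of the training samples and the regression coefficient on them.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    tt = sum(v * v for v in t) or 1.0
    q = sum(t[i] * yc[i] for i in range(n)) / tt

    def predict(x):
        score = sum((x[j] - x_mean[j]) * w[j] for j in range(p))
        return y_mean + q * score

    return predict


def loo_rmse(X, y):
    """Leave-one-out cross-validated RMSE of the PLS1 model."""
    sq = 0.0
    for i in range(len(X)):
        predict = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        sq += (predict(X[i]) - y[i]) ** 2
    return math.sqrt(sq / len(X))


def ipls(X, y, n_intervals):
    """Fit one local model per spectral interval; return (start, stop, rmse)."""
    p = len(X[0])
    width = p // n_intervals
    results = []
    for k in range(n_intervals):
        start = k * width
        stop = p if k == n_intervals - 1 else start + width
        X_sub = [row[start:stop] for row in X]
        results.append((start, stop, loo_rmse(X_sub, y)))
    return results


def make_synthetic(n_samples=20, n_channels=30, seed=0):
    """Synthetic 'spectra' (assumption): only channels 10-19 carry signal."""
    rng = random.Random(seed)
    y = [rng.uniform(0.0, 1.0) for _ in range(n_samples)]  # motif fraction
    X = []
    for frac in y:
        row = [rng.gauss(0.0, 0.05) for _ in range(n_channels)]
        for j in range(10, 20):
            row[j] += frac * math.exp(-((j - 14.5) ** 2) / 8.0)
        X.append(row)
    return X, y


X, y = make_synthetic()
results = ipls(X, y, n_intervals=3)
best = min(results, key=lambda r: r[2])
for start, stop, rmse in results:
    print(f"channels {start}-{stop - 1}: LOO RMSE = {rmse:.3f}")
print("best interval:", best[:2])
```

In this sketch the interval with the lowest cross-validated error is taken as the most informative spectral region. A fuller treatment would also tune the number of latent variables per interval and compare each local model against a full-spectrum model.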
Document Type: Article
File Description: 2 full-text file(s): application/pdf; application/pdf
Language: English
ISSN: 1097-0134, 0887-3585
DOI: 10.1002/prot.20890
Access URL: https://pubmed.ncbi.nlm.nih.gov/16456850
https://europepmc.org/article/MED/16456850
https://core.ac.uk/display/8872589
https://onlinelibrary.wiley.com/doi/10.1002/prot.20890/abstract
https://www.onlinelibrary.wiley.com/doi/abs/10.1002/prot.20890
http://www.ncbi.nlm.nih.gov/pubmed/16456850
Rights: Wiley Online Library User Agreement
Accession Number: edsair.doi.dedup.....ff50356e560d9de891cd19c1e317b0d7
Database: OpenAIRE