Process Biochemistry, Vol.41, No.3, 552-556, 2006
Discrimination of thermophilic and mesophilic proteins via pattern recognition methods
Four pattern recognition methods, namely principal component analysis (PCA), stepwise regression (SR), partial least-square regression (PLSR), and a backpropagation neural network, were used to discriminate between thermophilic and mesophilic proteins, yielding four classification models. Except for PCA, the prediction accuracy of the methods was encouraging. The average fitting accuracy of the four methods was 92%, 96%, 95%, and 98%, respectively, and the average prediction reliability was 60%, 67.5%, 72.5%, and 72.5%, respectively. The best prediction reliability was 75% for thermophilic proteins and 85% for mesophilic proteins. (c) 2005 Elsevier Ltd. All rights reserved.
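The abstract reports accuracy per protein class and as an average, but does not say how the average was formed. As a minimal sketch, assuming the average is the unweighted mean of the two per-class rates, the calculation could look like the following. The function names (`per_class_accuracy`, `average_accuracy`) and the toy labels are hypothetical and are not taken from the paper.

```python
from statistics import mean


def per_class_accuracy(y_true, y_pred):
    """Map each true label to the fraction of its samples predicted correctly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    totals = {}
    correct = {}
    for t, p in zip(y_true, y_pred):
        totals[t] = totals.get(t, 0) + 1
        if t == p:
            correct[t] = correct.get(t, 0) + 1
    return {label: correct.get(label, 0) / n for label, n in totals.items()}


def average_accuracy(y_true, y_pred):
    """Unweighted mean of the per-class rates (an assumption, see lead-in)."""
    return mean(list(per_class_accuracy(y_true, y_pred).values()))


# Toy labels invented for illustration only; these are NOT data from the paper.
y_true = ["thermo"] * 4 + ["meso"] * 4
y_pred = ["thermo", "thermo", "thermo", "meso",
          "meso", "meso", "thermo", "thermo"]

rates = per_class_accuracy(y_true, y_pred)
avg = average_accuracy(y_true, y_pred)
print(rates, avg)
```

Applied to the training set, the same calculation would give a fitting accuracy; applied to held-out proteins, it would give a prediction figure. An unweighted mean treats both classes equally even when their test sets differ in size, which is one reason the choice of averaging is worth stating explicitly.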
Keywords: protein thermostability; sequence-characteristic relationship; principal component analysis; stepwise regression; partial least-square regression; backpropagation neural network