Chemical Engineering and Materials Research Information Center
Biochemical and Biophysical Research Communications, Vol.327, No.3, 845-847, 2005
Using GO-PseAA predictor to identify membrane proteins and their types
Cell membranes are crucial to the life of a cell. Although the basic structure of biological membranes is provided by the lipid bilayer, most of their specific functions are carried out by membrane proteins. Knowing the type of a membrane protein often offers important clues to the function of an uncharacterized protein. Therefore, predicting the type of a membrane protein from its primary sequence, or even just identifying whether an uncharacterized protein is a membrane protein at all, is an important and challenging problem in bioinformatics and proteomics. To deal with these problems, the GO-PseAA predictor is introduced, which operates in a hybridization space formed by combining gene ontology and pseudo amino acid composition. To test the prediction quality, a dataset was constructed that contains 6476 non-membrane proteins and 5122 membrane proteins classified into five different types (Online Supplementary Materials A). To avoid redundancy and bias, none of the proteins included has greater than or equal to 40% sequence identity to any other. The overall success rate in the jackknife cross-validation test was 94.76% for distinguishing non-membrane proteins from membrane proteins, and 95.84% for identifying the five membrane protein types. These high success rates suggest that the GO-PseAA predictor captures the core features of the statistical samples concerned and may become an automated high-throughput tool in molecular and cell biology. (C) 2004 Elsevier Inc. All rights reserved.
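The jackknife (leave-one-out) test behind the reported success rates can be illustrated with a minimal sketch: each protein is removed from the dataset in turn, its class is predicted from the remaining proteins, and the success rate is the fraction predicted correctly. The nearest-neighbor classifier, the two-dimensional feature vectors, and the function names below are assumptions made for illustration only; they are not the GO-PseAA predictor or its feature space.

```python
from math import dist

def nearest_neighbor_label(query, samples):
    """Return the label of the sample closest to `query` (Euclidean distance).

    Stand-in classifier, assumed for illustration; not the GO-PseAA predictor.
    """
    _, label = min(samples, key=lambda s: dist(query, s[0]))
    return label

def jackknife_success_rate(samples):
    """Leave each sample out in turn, predict it from the rest,
    and return the fraction of samples predicted correctly."""
    correct = 0
    for i, (features, true_label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]  # dataset without sample i
        if nearest_neighbor_label(features, rest) == true_label:
            correct += 1
    return correct / len(samples)

# Hypothetical toy feature vectors; real inputs would be GO/PseAA descriptors.
toy = [
    ((0.0, 0.1), "membrane"),
    ((0.1, 0.0), "membrane"),
    ((0.2, 0.2), "membrane"),
    ((1.0, 0.9), "non-membrane"),
    ((0.9, 1.0), "non-membrane"),
    ((0.3, 0.3), "non-membrane"),  # deliberately placed near the other class
]

rate = jackknife_success_rate(toy)
print(f"jackknife success rate: {rate:.2%}")
```

On this toy data, four of the six samples are predicted correctly when left out, so the sketch reports a success rate of 66.67%. Because every sample is tested exactly once against a model that never saw it, the result for a given dataset is unique, unlike a random train/test split.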