Biochemical and Biophysical Research Communications, Vol.354, No.2, 498-504, 2007
Construction of mathematical model for high-level expression of foreign genes in pPIC9 vector and its verification
In this report, we introduce a mathematical model for high-level expression of foreign genes in the pPIC9 vector. First, we collected 40 heterologous genes expressed in the pPIC9 vector and classified them into a high-level expression group (expression level > 100 mg/L, 12 genes) and a low-level expression group (expression level < 100 mg/L, 28 genes). The Naive Bayes method was then used to construct the model, with the RNA secondary structure profile of the 3'-end of the foreign genes as features. The classification accuracy under leave-one-out cross-validation was 100%. Finally, five additional genes collected from the literature were used to test the model, and four of them were correctly predicted. In addition, the model was verified by expressing the human neutrophil gelatinase-associated lipocalin (NGAL) gene, which reached an expression level of more than 100 mg/L. We therefore propose that the model can be used to predict the expression level of heterologous genes before experiments and to optimize experimental designs for high-level expression. Furthermore, we have developed a web server for the evaluation and design of high-level expression of foreign genes, which is accessible at http://ppic9.med.stu.edu.cn/ppic9. (c) 2007 Elsevier Inc. All rights reserved.
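
The workflow described in the abstract (a Naive Bayes classifier evaluated by leave-one-out cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation. The Gaussian likelihood, the two-feature layout, and the function names `fit_gaussian_nb`, `predict`, and `leave_one_out_accuracy` are assumptions made for illustration, and the numbers in `X` are synthetic placeholders, not data from the paper.

```python
import math


def fit_gaussian_nb(X, y):
    """Estimate a prior, feature means, and feature variances per class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small floor on the variance avoids division by zero.
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, 1e-6)
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log-posterior for sample x."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(X)):
        model = fit_gaussian_nb(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += predict(model, X[i]) == y[i]
    return correct / len(X)


# Synthetic placeholder features (hypothetical, NOT from the paper): each row
# stands in for a numeric summary of a gene's 3'-end RNA structure profile.
X = [[-2.1, -1.0], [-1.8, -1.2], [-2.4, -0.9],
     [-6.0, -4.1], [-5.5, -3.8], [-6.3, -4.4]]
y = ["high", "high", "high", "low", "low", "low"]

accuracy = leave_one_out_accuracy(X, y)
print(f"Leave-one-out accuracy: {accuracy:.0%}")
```

With 40 genes, the same loop would train 40 models, each on 39 genes, and report the fraction of held-out genes classified correctly. The independent test in the abstract (four of five genes correct) corresponds to an accuracy of 4/5 = 80% on unseen data.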