Biotechnology Progress, Vol.18, No.6, 1366-1376, 2002
Identification of critical batch operating parameters in fed-batch recombinant E. coli fermentations using decision tree analysis
To develop a useful fermentation process model, it is first necessary to identify which batch operating parameters are critical in determining the process outcome. To identify critical processing inputs in large databases, we have explored the use of Decision Tree Analysis with the decision metrics of Gain (i.e., Shannon Entropy changes), Gain Ratio, and a multiple hypergeometric distribution. The usefulness of this approach lies in its ability to treat "categorical" variables, which are typical of archived fermentation databases, as well as "continuous" variables. In this work, we demonstrate the use of Decision Tree Analysis for the problem of optimizing recombinant green fluorescent protein production in E. coli. A database of 85 fermentations was generated to examine the effect of 15 process input parameters on final biomass yield, maximum recombinant protein concentration, and productivity. The use of Decision Tree Analysis led to a considerable reduction in the fermentation database through the identification of the significant as well as insignificant inputs. However, different decision metrics selected different inputs and different numbers of inputs to classify the data for each output.
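As a rough illustration of two of the decision metrics named above, the following sketch computes Gain (the reduction in Shannon entropy of an output class after splitting on a categorical input) and Gain Ratio (Gain divided by the entropy of the split itself). The function names, the input names ("induction", "medium"), and the four-run data set are hypothetical and invented for illustration only; they are not taken from the 85-fermentation database, and the multiple hypergeometric metric is not shown.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(items)
    counts = Counter(items)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def gain_and_gain_ratio(values, labels):
    """Return (Gain, Gain Ratio) for splitting `labels` on a categorical input.

    values: category of the input parameter for each run
    labels: output class for each run (e.g. "high" or "low" yield)
    """
    n = len(labels)
    # Group the output classes by the category of the input.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information: entropy of the input's own category distribution.
    split_info = entropy(values)
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio


# Hypothetical toy data: four runs, two categorical inputs, one output class.
yield_class = ["high", "high", "low", "low"]
induction = ["early", "early", "late", "late"]  # separates the classes perfectly
medium = ["A", "B", "A", "B"]                   # carries no information

for name, column in [("induction", induction), ("medium", medium)]:
    g, r = gain_and_gain_ratio(column, yield_class)
    print(f"{name}: gain={g:.3f}, gain ratio={r:.3f}")
```

In this toy case an input that perfectly separates the output classes receives the maximum Gain, while an input that is independent of the output receives zero. Gain Ratio penalizes inputs with many categories, which is one reason different metrics can select different inputs.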