Journal of Physical Chemistry A, Vol.121, No.3, 699-707, 2017
Prediction of pK(a) Values for Druglike Molecules Using Semiempirical Quantum Chemical Methods
Rapid yet accurate pK(a) prediction for druglike molecules is a key challenge in computational chemistry. This study uses PM6-DH+/COSMO, PM6/COSMO, PM7/COSMO, PM3/COSMO, AM1/COSMO, PM3/SMD, AM1/SMD, and DFTB3/SMD to predict the pK(a) values of 53 amine groups in 48 druglike compounds. The approach uses an isodesmic reaction where the pK(a) value is computed relative to a chemically related reference compound for which the pK(a) value has been measured experimentally or estimated using a standard empirical approach. The AM1- and PM3-based methods perform best, with RMSE values of 1.4-1.6 pH units that have uncertainties of +/- 0.2-0.3 pH units, which makes them statistically equivalent. However, for all but PM3/SMD and AM1/SMD the RMSEs are dominated by a single outlier, cefadroxil, caused by proton transfer in the zwitterionic protonation state. If this outlier is removed, the RMSE values for PM3/COSMO and AM1/COSMO drop to 1.0 +/- 0.2 and 1.1 +/- 0.3, whereas PM3/SMD and AM1/SMD remain at 1.5 +/- 0.3 and 1.6 +/- 0.3/0.4 pH units, making the COSMO-based predictions statistically better than the SMD-based predictions. For pK(a) calculations where a zwitterionic state is not involved or proton transfer in a zwitterionic state is not observed, PM3/COSMO or AM1/COSMO is the best pK(a) prediction method; otherwise PM3/SMD or AM1/SMD should be used. Thus, fast and relatively accurate pK(a) prediction for 100-1000s of druglike amines is feasible with the current setup and relatively modest computational resources.
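
The isodesmic scheme described in the abstract can be illustrated with a short sketch. It assumes the proton-exchange reaction BH+ + B_ref -> B + B_refH+ and the standard thermodynamic relation pK(a) = pK(a)_ref + dG / (RT ln 10). The function name, argument names, default temperature of 298.15 K, and the choice of kcal/mol as the energy unit are assumptions made for illustration and are not taken from the paper.

```python
import math

# Gas constant in kcal/(mol K); the energy unit is an assumption of this sketch.
R_KCAL = 1.98720425864083e-3


def isodesmic_pka(g_base, g_acid, g_ref_base, g_ref_acid, pka_ref,
                  temperature=298.15):
    """Estimate a pKa relative to a reference compound.

    All free energies are in kcal/mol and are assumed to come from the
    same method and solvation model (for example PM3/COSMO):
      g_base      free energy of the deprotonated target, B
      g_acid      free energy of the protonated target, BH+
      g_ref_base  free energy of the deprotonated reference, B_ref
      g_ref_acid  free energy of the protonated reference, B_refH+
      pka_ref     known pKa of the reference compound
    """
    # Reaction free energy for BH+ + B_ref -> B + B_refH+
    delta_g = (g_base + g_ref_acid) - (g_acid + g_ref_base)
    # Convert the free energy difference into pKa units and shift the reference
    return pka_ref + delta_g / (R_KCAL * temperature * math.log(10))


if __name__ == "__main__":
    # Purely illustrative numbers, not data from the study.
    print(isodesmic_pka(g_base=-50.0, g_acid=-120.0,
                        g_ref_base=-48.0, g_ref_acid=-119.5,
                        pka_ref=10.6))
```

Because only the difference between target and reference enters the result, systematic errors shared by both compounds largely cancel, which is the usual motivation for choosing a chemically related reference.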