Combustion and Flame, Vol.159, No.7, 2388-2397, 2012
Accelerating multi-dimensional combustion simulations using GPU and hybrid explicit/implicit ODE integration
Simulating multi-dimensional combustion with detailed kinetics often requires solving a large number of ordinary differential equation (ODE) problems at each global time step. In many cases, the ODE integrations account for the bulk of the total wall-clock time for the simulation. This paper introduces CHEMEQ2-GPU, a new explicit stiff ODE solver (based on the existing CHEMEQ2 solver) that exploits the parallel architecture of the modern graphics processing unit (GPU) to accelerate ODE integration in multi-dimensional combustion simulations. We also demonstrate efficient application of the CPU and GPU as co-processors, for further speedup. We describe a hybrid explicit/implicit ODE solver approach that combines the strengths of both solver types running simultaneously on the GPU and CPU, respectively. A dynamic load balancing scheme was used to assign the kinetics ODE integrations over all grid points to either the CPU-based implicit solver DVODE (which is the more efficient solver for highly stiff grid points) or CHEMEQ2-GPU (more efficient for moderately stiff or non-stiff grid points). We demonstrate CHEMEQ2-GPU and the hybrid approach in 3-D simulations of homogeneous charge compression ignition (HCCI) engines. The test cases applied two different n-heptane reaction mechanisms (a large detailed model and a small skeletal model) and three different mesh sizes. Engine simulations were performed using KIVA-CHEMKIN. CHEMEQ2 was about 2-3 times faster than DVODE, with similar prediction accuracy. The CHEMEQ2-GPU speedup relative to CHEMEQ2 increased linearly with the number of grid points for the range of meshes tested in this work. Assuming ideal linear scaling of simulation time with number of processors, the speed of CHEMEQ2-GPU on the Tesla C2050 GPU was equivalent to CHEMEQ2 running on approximately 13 parallel 2.8 GHz CPU processors for the finest mesh, and the hybrid solver approach was equivalent to CHEMEQ2 on approximately 15 such CPU processors.
In summary, CHEMEQ2-GPU provided the additional computing power of 14 parallel CPU processors (for the finest mesh tested), and the hybrid solver approach demonstrated a method to efficiently apply these additional co-processors alongside existing CPU cores for combustion simulations. CHEMEQ2-GPU scales favorably with the number of grid points and is available on request from the authors. This work presents opportunities for further development, particularly in CPU/GPU load balancing algorithms. (C) 2012 The Combustion Institute. Published by Elsevier Inc. All rights reserved.
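The abstract does not give the details of the dynamic load balancing scheme, so the following Python sketch only illustrates the general idea under stated assumptions: each grid point carries some stiffness estimate, the stiffest points go to the CPU implicit solver, the rest go to the GPU explicit solver, and the split is adjusted after each global time step from the measured wall-clock times. The class `Balancer`, its fields, and the proportional update rule are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Balancer:
    """Hypothetical CPU/GPU work splitter (not the authors' scheme)."""

    cpu_fraction: float = 0.1   # share of grid points sent to the CPU solver
    gain: float = 0.5           # assumed damping factor for each adjustment
    min_fraction: float = 0.01  # keeps the multiplicative update from sticking at 0

    def partition(self, stiffness):
        """Return (cpu_cells, gpu_cells) index lists, stiffest cells to the CPU."""
        order = sorted(range(len(stiffness)),
                       key=lambda i: stiffness[i], reverse=True)
        n_cpu = round(self.cpu_fraction * len(order))
        return order[:n_cpu], order[n_cpu:]

    def update(self, cpu_time, gpu_time):
        """Shift work away from whichever processor took longer last step."""
        total = cpu_time + gpu_time
        if total <= 0.0:
            return
        # Positive imbalance means the CPU was the bottleneck.
        imbalance = (cpu_time - gpu_time) / total
        self.cpu_fraction *= 1.0 - self.gain * imbalance
        self.cpu_fraction = min(1.0, max(self.min_fraction, self.cpu_fraction))
```

In use, `partition` would be called once per global time step with the current stiffness estimates, the two index lists would be handed to the implicit and explicit solvers, and `update` would be called with the two measured times so that both processors tend to finish together on the next step.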