Glider soaring via reinforcement learning in the field

Reddy G; Wong-Ng J; Celani A; Sejnowski TJ; Vergassola M

Nature, Vol.562, No.7726, 236-+, 2018

DOI: 10.1038/s41586-018-0533-0

Soaring birds often rely on ascending thermal plumes (thermals) in the atmosphere as they search for prey or migrate across large distances (1-4). The landscape of convective currents is rugged and shifts on timescales of a few minutes as thermals constantly form, disintegrate or are transported away by the wind (5,6). How soaring birds find and navigate thermals within this complex landscape is unknown. Reinforcement learning (7) provides an appropriate framework in which to identify an effective navigational strategy as a sequence of decisions made in response to environmental cues. Here we use reinforcement learning to train a glider in the field to navigate atmospheric thermals autonomously. We equipped a glider of two-metre wingspan with a flight controller that precisely controlled the bank angle and pitch, modulating these at intervals with the aim of gaining as much lift as possible. A navigational strategy was determined solely from the glider's pooled experiences, collected over several days in the field. The strategy relies on on-board methods to accurately estimate the local vertical wind accelerations and the roll-wise torques on the glider, which serve as navigational cues. We establish the validity of our learned flight policy through field experiments, numerical simulations and estimates of the noise in measurements caused by atmospheric turbulence. Our results highlight the role of vertical wind accelerations and roll-wise torques as effective mechanosensory cues for soaring birds and provide a navigational strategy that is directly applicable to the development of autonomous soaring vehicles.
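
The abstract frames soaring as a sequence of decisions made in response to two cues (vertical wind acceleration and roll-wise torque), with bank angle as the controlled quantity. As a rough illustration of that framing only, the sketch below uses generic tabular Q-learning in Python. It is not the authors' algorithm: the bank-angle grid, the sensor thresholds, the three-way action set, the learning parameters, and all function names are assumptions introduced here, and the reward is left to the caller (one plausible choice would be a measured climb rate).

```python
import random
from collections import defaultdict

# Hypothetical discretization, assumed for illustration (not taken from the abstract).
BANK_ANGLES = (-30, -15, 0, 15, 30)  # candidate bank angles in degrees
ACTIONS = (-1, 0, 1)                 # step bank angle down, hold, or step up


def sign_bin(x, threshold):
    """Map a noisy sensor estimate to -1, 0, or +1 using a dead band."""
    if x > threshold:
        return 1
    if x < -threshold:
        return -1
    return 0


def make_state(vertical_accel, roll_torque, bank_index,
               accel_thr=0.1, torque_thr=0.1):
    """Build a discrete state from the two cues and the current bank index."""
    return (sign_bin(vertical_accel, accel_thr),
            sign_bin(roll_torque, torque_thr),
            bank_index)


def apply_action(bank_index, action):
    """Move one step along BANK_ANGLES, clamped to the valid index range."""
    return min(max(bank_index + action, 0), len(BANK_ANGLES) - 1)


def choose_action(q, state, epsilon, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.98):
    """One standard Q-learning update on the table q (a dict of floats)."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error


# Q-table with unseen (state, action) pairs defaulting to 0.0.
q = defaultdict(float)
```

In use, each control interval would call `make_state` on the latest cue estimates, pick an action with `choose_action`, apply it with `apply_action`, and then call `q_update` once the reward for that interval is known. Pooling transitions from many flights into a single table mirrors, loosely, the idea of learning from pooled experience.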