Seeing the Forest AND the Trees: Drone Self-Navigation at Speed in Complex Environments

Quadrotor navigates through a forest. All images courtesy University of Zurich.

Harnessing onboard sensing and computing to maneuver autonomously at high speed in obstacle-laden environments is the next major hurdle for unmanned systems. The Robotics and Perception Group at the University of Zurich, collaborating with Intel in Germany and the U.S., has published a paper in Science Robotics describing a simulation methodology that enabled an autonomous drone to maneuver aggressively through complex environments at up to 40 km/h (25 mph).

The paper shows how the team trained a small quadrotor in flight simulation to find its own way through obstacle courses. The approach maps noisy sensory observations directly to collision-free trajectories in a receding-horizon fashion, which reduces processing latency and increases robustness to noisy and incomplete perception. A convolutional network trained exclusively in simulation performs the sensorimotor mapping.
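To make "receding-horizon" concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the function `propose_trajectories` is a hypothetical, hand-written stand-in for the trained network, and the speed, control period, obstacle layout and candidate set are all assumed values chosen for illustration. The loop plans over a short horizon, executes only the first waypoint, and then plans again from the new position.

```python
import math

HORIZON_S = 1.0   # planning horizon; the article quotes one second
DT = 0.1          # control period in seconds (assumed value)
SPEED = 10.0      # forward speed in m/s (assumed value)


def propose_trajectories(state, n_steps):
    """Hypothetical stand-in for the learned policy.

    Returns candidate trajectories, each a list of (x, y) waypoints,
    that differ only in their lateral speed.
    """
    x, y = state
    candidates = []
    for lateral_speed in (-4.0, -2.0, 0.0, 2.0, 4.0):
        traj = [(x + SPEED * DT * k, y + lateral_speed * DT * k)
                for k in range(1, n_steps + 1)]
        candidates.append(traj)
    return candidates


def collides(points, obstacles, radius):
    """True if any point lies closer than `radius` to any obstacle centre."""
    return any(math.hypot(px - ox, py - oy) < radius
               for px, py in points for ox, oy in obstacles)


def pick_trajectory(candidates, obstacles, radius):
    """Prefer collision-free candidates; among those, end closest to y = 0."""
    free = [t for t in candidates if not collides(t, obstacles, radius)]
    pool = free or candidates  # fall back to all candidates if none is free
    return min(pool, key=lambda t: abs(t[-1][1]))


def fly(start, obstacles, radius=1.0, n_cycles=50):
    """Receding-horizon loop: plan, execute the first waypoint, replan."""
    state = start
    path = [state]
    n_steps = int(round(HORIZON_S / DT))
    for _ in range(n_cycles):
        candidates = propose_trajectories(state, n_steps)
        best = pick_trajectory(candidates, obstacles, radius)
        state = best[0]  # only the first waypoint is executed
        path.append(state)
    return path


if __name__ == "__main__":
    obstacles = [(20.0, 0.0)]  # one tree trunk on the straight-line route
    path = fly((0.0, 0.0), obstacles)
    print("collision-free:", not collides(path, obstacles, 1.0))
```

Because the plan is refreshed every cycle, a poor prediction affects only one short step before it is replaced, which is one reason this style of planning tolerates noisy perception.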

Experimental platform. The main computational unit is an NVIDIA Jetson TX2, whose GPU is used for neural network inference and CPU for the control stack. Sensing is performed by an Intel RealSense T265 for state estimation and an Intel RealSense D435 for depth estimation.

“While humans require years to train, the artificial intelligence (AI), leveraging high-performance simulators, can reach comparable navigation abilities much faster, basically overnight,” said lead author Antonio Loquercio, a post-doctoral student at UZH.

In their abstract, the authors state: “We propose an end-to-end approach that can autonomously fly quadrotors through complex natural and human-made environments at high speeds, with purely onboard sensing and computation. The key principle is to directly map noisy sensory observations to collision-free trajectories in a receding-horizon fashion. This direct mapping drastically reduces processing latency and increases robustness to noisy and incomplete perception. The sensorimotor mapping is performed by a convolutional network that is trained exclusively in simulation via privileged learning: imitating an expert with access to privileged information. By simulating realistic sensor noise, our approach achieves zero-shot transfer from simulation to challenging real-world environments that were never experienced during training: dense forests, snow-covered terrain, derailed trains, and collapsed buildings.”
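As a rough illustration of what "simulating realistic sensor noise" can mean for a depth camera, the Python sketch below corrupts a clean simulated depth image before it would be shown to a learner. The noise model is an assumption made for this example, not the one used in the paper: Gaussian error that grows with the square of distance (a common property of stereo depth), random pixel dropout, and a maximum range. The function name `add_depth_noise` and all parameter values are hypothetical.

```python
import random


def add_depth_noise(depth, sigma_coeff=0.01, dropout_prob=0.05,
                    max_range=10.0, rng=None):
    """Return a noisy copy of a depth image (2D list of metres).

    Assumed model: pixels beyond max_range or randomly dropped become 0.0
    (treated as invalid); the rest receive Gaussian noise whose standard
    deviation is sigma_coeff * depth**2.
    """
    rng = rng or random.Random()
    noisy = []
    for row in depth:
        out = []
        for d in row:
            if d > max_range or rng.random() < dropout_prob:
                out.append(0.0)  # invalid measurement
                continue
            sigma = sigma_coeff * d * d
            out.append(max(0.0, d + rng.gauss(0.0, sigma)))
        noisy.append(out)
    return noisy


if __name__ == "__main__":
    clean = [[1.0, 2.0, 4.0], [8.0, 12.0, 3.0]]
    for row in add_depth_noise(clean, rng=random.Random(0)):
        print(["%.2f" % v for v in row])
```

Training on images degraded this way is one plausible route to a policy that does not depend on the perfectly clean depth maps a simulator would otherwise provide.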

They conclude: “Analyzing their performance indicates that humans have a very rich and detailed understanding of their surroundings and are capable of planning and executing plans that span far in the future (our approach plans only one second into the future). Both are capabilities that today’s autonomous systems still lack. We see our work as a stepping stone towards faster autonomous flight that is enabled by directly predicting collision-free trajectories from high-dimensional (noisy) sensory input.”