TUHH Open Research
A Soft Actor-Critic Based Reinforcement Learning Approach for Motion Planning of UAVs Using Depth Images

Publication Type
Conference Paper
Date Issued
2024-11-15
Language
English
Author(s)
Mishra, Aadi Nath
Papakonstantinou, Stephanos (Lufttransportsysteme M-28)
Gollnick, Volker (Lufttransportsysteme M-28)
TORE-URI
https://hdl.handle.net/11420/52598
Citation
43rd AIAA DATC/IEEE Digital Avionics Systems Conference, DASC 2024
Publisher DOI
10.1109/DASC62030.2024.10748743
Scopus ID
2-s2.0-85211235859
Publisher
IEEE
ISBN
9798350349610
Abstract
The automated flight of multicopters in unknown environments is one of the most important current problems. In view of geopolitical events in which drones are used for both military and civil purposes, the demand for and development of lean path-planning algorithms is of great importance. Traditional path-planning solutions consist of multiple independent components that are vulnerable to noise, which leads to control latency and increases the potential for various failure modes within the pipeline. Deep Reinforcement Learning (DRL) can offer a general solution to this challenge: reactive path planning and obstacle avoidance in unknown environments for small, medium, and large-scale multicopter vehicles.

Reinforcement Learning (RL) algorithms are considered an efficient black-box methodology that can serve as learned path planners. In particular, the Soft Actor-Critic (SAC) algorithm shows promising results. SAC is known for its stability and efficiency, which it achieves by using a stochastic policy and maximizing an entropy-augmented reward objective. These properties allow the algorithm to make exploratory decisions while maintaining learning stability. In addition, SAC benefits from high data efficiency, which is particularly advantageous when only limited resources are available for training. The efficiency of this methodology depends largely on the quality and quantity of the underlying data, since these directly influence the performance and accuracy of the models and algorithms derived from them.

Our development uses a customized OpenAI Gym wrapper for the drone simulator AirSim. The developed algorithm exploits the advantages of SAC to create a stable and efficient learning environment that is particularly suitable for the complex and dynamic scenarios encountered in drone flight. The integration of the Gym wrapper enables a seamless interface between the RL algorithm and the simulator, highlighting the challenges of indirect and direct collisions.
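For context, the maximum-entropy objective that SAC optimizes can be written as follows. This is the standard formulation from Haarnoja et al. (2018), not a formula taken from this record:

J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi} \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]

where r(s_t, a_t) is the reward, \rho_\pi the state-action distribution induced by the policy \pi, \mathcal{H} the policy entropy, and \alpha a temperature parameter that trades off reward maximization against exploration. Maximizing the entropy term is what drives the exploratory yet stable behavior described in the abstract.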
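The abstract mentions a customized Gym wrapper around AirSim but gives no implementation details. The following is a minimal illustrative sketch of what such a wrapper might look like, assuming the public airsim Python client and the classic gym API. The class name, image resolution, action scaling, and reward shaping are hypothetical placeholders, not the authors' implementation:

import numpy as np
import gym
from gym import spaces
import airsim

class AirSimDepthEnv(gym.Env):
    """Minimal Gym wrapper around AirSim for depth-image-based motion planning.

    Illustrative sketch only: the paper's actual observation/action spaces,
    reward shaping, and episode logic are not specified in the abstract.
    """

    def __init__(self, max_depth=50.0, step_length=1.0):
        super().__init__()
        self.client = airsim.MultirotorClient()
        self.client.confirmConnection()
        self.max_depth = max_depth      # clip distance for depth normalization
        self.step_length = step_length  # scales actions to velocities in m/s
        # Observation: single-channel depth image, normalized to [0, 1].
        self.observation_space = spaces.Box(0.0, 1.0, shape=(84, 84, 1),
                                            dtype=np.float32)
        # Action: continuous velocity command (vx, vy, vz); SAC requires
        # a continuous action space like this.
        self.action_space = spaces.Box(-1.0, 1.0, shape=(3,), dtype=np.float32)

    def _get_depth(self):
        request = airsim.ImageRequest("0", airsim.ImageType.DepthPerspective,
                                      pixels_as_float=True, compress=False)
        response = self.client.simGetImages([request])[0]
        depth = airsim.list_to_2d_float_array(response.image_data_float,
                                              response.width, response.height)
        depth = np.clip(depth, 0.0, self.max_depth) / self.max_depth
        # Crude strided downsample; assumes the camera resolution is at
        # least 84x84 (a real implementation would interpolate instead).
        h, w = depth.shape
        depth = depth[::max(1, h // 84), ::max(1, w // 84)][:84, :84]
        return depth.reshape(84, 84, 1).astype(np.float32)

    def reset(self):
        self.client.reset()
        self.client.enableApiControl(True)
        self.client.armDisarm(True)
        self.client.takeoffAsync().join()
        return self._get_depth()

    def step(self, action):
        vx, vy, vz = [self.step_length * float(a) for a in action]
        self.client.moveByVelocityAsync(vx, vy, vz, duration=1.0).join()
        obs = self._get_depth()
        collided = self.client.simGetCollisionInfo().has_collided
        # Placeholder reward: large penalty on collision, small living bonus.
        reward = -100.0 if collided else 0.1
        return obs, reward, collided, {}

An environment of this shape can be trained directly with off-the-shelf SAC implementations (e.g., Stable-Baselines3), since the continuous Box action space is exactly what SAC expects.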
Subjects
Deep Reinforcement Learning | Motion Planning | Obstacle Avoidance | Path Planning | Soft Actor-Critic | UAV
MLE@TUHH
DDC Class
629.13: Aviation Engineering