TUHH Open Research
A Soft Actor-Critic Based Reinforcement Learning Approach for Motion Planning of UAVs Using Depth Images

Publication Type
Conference Paper
Date Issued
2024-11-15
Language
English
Author(s)
Mishra, Aadi Nath
Papakonstantinou, Stephanos (Lufttransportsysteme M-28)
Gollnick, Volker (Lufttransportsysteme M-28)
TORE-URI
https://hdl.handle.net/11420/52598
Citation
43rd AIAA DATC/IEEE Digital Avionics Systems Conference, DASC 2024
Publisher DOI
10.1109/DASC62030.2024.10748743
Scopus ID
2-s2.0-85211235859
Publisher
IEEE
ISBN
9798350349610
Abstract
The automated flight of multicopters in unknown environments is one of the most important current problems. In view of geopolitical events in which drones are used for both military and civil purposes, the demand for and development of lean path-planning algorithms is of great importance. Traditional path-planning solutions consist of multiple independent components that are vulnerable to noise, which leads to control latency and increases the potential for various failure modes within the pipeline. Deep Reinforcement Learning (DRL) can offer a general solution to this challenge: reactive path planning and obstacle avoidance in unknown environments for small, medium, and large-scale multicopter vehicles.

Reinforcement Learning (RL) algorithms are considered an efficient black-box methodology that can serve as learned path planners. In particular, the Soft Actor-Critic (SAC) algorithm shows promising results. SAC is known for its stability and efficiency, which it achieves by using a stochastic policy and maximizing an entropy-augmented reward objective. These properties allow the algorithm to make exploratory decisions while maintaining learning stability. In addition, SAC benefits from high data efficiency, which is particularly advantageous when only limited resources are available for training. The efficiency of this methodology depends largely on the quality and quantity of the underlying data, since these directly influence the performance and accuracy of the models and algorithms derived from them.

Our development uses a customized OpenAI Gym wrapper for the drone simulator AirSim. The developed algorithm exploits the advantages of SAC to create a stable and efficient learning environment that is particularly suitable for the complex and dynamic scenarios encountered in drone flight. The integration of the Gym wrapper enables a seamless interface between the RL algorithm and the simulator, highlighting the challenges of indirect and direct collisions.
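For context, the maximum-entropy objective that SAC optimizes can be written as follows. This is the standard formulation from Haarnoja et al. (2018), not a formula taken from this record:

J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi} \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]

where r(s_t, a_t) is the reward, \rho_\pi the state-action distribution induced by the policy \pi, \mathcal{H} the policy entropy, and \alpha a temperature parameter that trades off reward maximization against exploration. Maximizing the entropy term is what drives the exploratory yet stable behavior described in the abstract.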
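The abstract mentions a customized Gym wrapper around AirSim but gives no implementation details. The following is a minimal illustrative sketch of what such a wrapper might look like, assuming the public airsim Python client and the classic gym API. The class name, image resolution, action scaling, and reward shaping are hypothetical placeholders, not the authors' implementation:

import numpy as np
import gym
from gym import spaces
import airsim

class AirSimDepthEnv(gym.Env):
    """Minimal Gym wrapper around AirSim for depth-image-based motion planning.

    Illustrative sketch only: the paper's actual observation/action spaces,
    reward shaping, and episode logic are not specified in the abstract.
    """

    def __init__(self, max_depth=50.0, step_length=1.0):
        super().__init__()
        self.client = airsim.MultirotorClient()
        self.client.confirmConnection()
        self.max_depth = max_depth      # clip distance for depth normalization
        self.step_length = step_length  # scales actions to velocities in m/s
        # Observation: single-channel depth image, normalized to [0, 1].
        self.observation_space = spaces.Box(0.0, 1.0, shape=(84, 84, 1),
                                            dtype=np.float32)
        # Action: continuous velocity command (vx, vy, vz); SAC requires
        # a continuous action space like this.
        self.action_space = spaces.Box(-1.0, 1.0, shape=(3,), dtype=np.float32)

    def _get_depth(self):
        request = airsim.ImageRequest("0", airsim.ImageType.DepthPerspective,
                                      pixels_as_float=True, compress=False)
        response = self.client.simGetImages([request])[0]
        depth = airsim.list_to_2d_float_array(response.image_data_float,
                                              response.width, response.height)
        depth = np.clip(depth, 0.0, self.max_depth) / self.max_depth
        # Crude strided downsample; assumes the camera resolution is at
        # least 84x84 (a real implementation would interpolate instead).
        h, w = depth.shape
        depth = depth[::max(1, h // 84), ::max(1, w // 84)][:84, :84]
        return depth.reshape(84, 84, 1).astype(np.float32)

    def reset(self):
        self.client.reset()
        self.client.enableApiControl(True)
        self.client.armDisarm(True)
        self.client.takeoffAsync().join()
        return self._get_depth()

    def step(self, action):
        vx, vy, vz = [self.step_length * float(a) for a in action]
        self.client.moveByVelocityAsync(vx, vy, vz, duration=1.0).join()
        obs = self._get_depth()
        collided = self.client.simGetCollisionInfo().has_collided
        # Placeholder reward: large penalty on collision, small living bonus.
        reward = -100.0 if collided else 0.1
        return obs, reward, collided, {}

An environment of this shape can be trained directly with off-the-shelf SAC implementations (e.g., Stable-Baselines3), since the continuous Box action space is exactly what SAC expects.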
Subjects
Deep Reinforcement Learning | Motion Planning | Obstacle Avoidance | Path Planning | Soft Actor-Critic | UAV
MLE@TUHH
DDC Class
629.13: Aviation Engineering