Authors: Fuger, Konrad; Kwame Ofori; Timm-Giel, Andreas
Dates: 2026-01-12; 2025-10
Conference: 27th International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems, MSWiM 2025
Handle: https://hdl.handle.net/11420/60787

Abstract: Advances in mechanical capabilities and mass manufacturing of Unmanned Aerial Vehicles (UAVs) are driving their application in various fields, from precision agriculture to infrastructure monitoring and on-demand parcel delivery. Especially in urban areas, a large number of UAVs is projected to occupy the airspace. To facilitate the safe and reliable operation of large-scale urban UAV deployments, an Unmanned Aerial Traffic Management (UTM) system is required. Such a system needs to be aware of all movements within the airspace to control and monitor urban UAV operations. One way to realize this is to establish an ad-hoc network that UAVs use for network-wide dissemination of their positions. Recently, Rate Decay Flooding (RDF) has been proposed as a tailor-made protocol to realize such a system. Although RDF has been shown to support UTM applications efficiently in larger networks than ordinarily possible, much of its success relies on the proper selection of protocol parameters. In this work, we propose a reinforcement-learning framework that automatically adapts the configuration of RDF to its perceived environment. We utilize deep contextual bandits as a lightweight but effective method to capture the non-linear relationship between the perceived environment and the achieved performance. We name this extension Dynamic Rate Decay Flooding (dynRDF). In a simulation study, we show that this solution is effective in finding optimal configurations for RDF for varying network sizes. To achieve this, only 2.7 % of all possible configurations had to be explored.
When dynRDF also takes the local UAV density into account, it achieves a performance gain of more than 12 % in a relevant composite metric that captures both the timely dissemination of position updates to nearby UAVs and reliable network-wide dissemination.

Language: en
Keywords: UAV; UTM; Flooding; Reinforcement Learning; Deep Contextual Bandits
Subject classification: Computer Science, Information and General Works::003: Systems Theory::003.5: Communication and Control; Computer Science, Information and General Works::004: Computer Sciences
Title: dynRDF: Using deep contextual bandits to optimize position flooding in urban UAV networks
Type: Conference Paper
DOI: 10.1109/mswim67937.2025.11308755