New methods for training neural networks using approaches from stochastic control and uncertainty quantification

2025-09-25
https://hdl.handle.net/11420/57601

As part of the research unit "Active Learning for Systems and Control (ALeSCo) - Data Informativity, Uncertainty, and Guarantees", this project develops new methods for the analysis and training of neural networks, using methods for quantifying stochastic uncertainties (uncertainty quantification) and concepts for the analysis and control of stochastic systems. The project thus shifts perspective: from applying machine learning methods to problems in systems and control, to developing and using systems and control concepts for the analysis and design of machine learning methods. The project builds on the structural relationship between neural ordinary differential equations (Neural ODEs, NODEs) and residual networks (ResNets). We use polynomial chaos expansions (PCE) of L2 random variables, a concept going back to Norbert Wiener, to develop novel methods for studying the generalization properties of neural networks. Specifically, subsets of a neural network's input data space are described by PCEs and propagated through the network in order to characterize the statistical generalization behavior of the predicted labels. In a second step, we use the uncertainty propagation methods directly in the formulation of training problems. To this end, we investigate how requirements on generalization properties can be accounted for directly during training, in the form of performance criteria or constraints. For this purpose, the training of neural networks (NODEs and ResNets) is formalized in system-theoretic terms as an optimal control problem.
One goal is to investigate the pruning of trained deep networks without loss of performance by means of suitable system-theoretic dissipativity concepts. In a third step, active learning methods are developed on the basis of the neural network architectures extended with L2 and PCE descriptions. To this end, we investigate how new data points can be selected in a targeted way by quantifying the uncertainty of the label predictions. Furthermore, we extend system-theoretic concepts of data informativity to neural networks. We investigate to what extent concepts from stochastic control theory promote the design of neural networks that are as shallow as possible. The developed methods are evaluated on established machine learning benchmarks and on new control-specific test problems developed within the research unit ALeSCo.

This project is part of the research unit "Active Learning for Systems and Control (ALeSCo) - Data Informativity, Uncertainty, and Guarantees". Data-driven and learning-based approaches for modelling dynamic systems and designing control laws have gained prominence in recent years. Due to their universal approximation properties, neural networks in different variants and architectures are among the most frequently considered learning methods in systems and control. In contrast to this trend, this project does not ask what machine learning can do for control. Rather, we explore how systems and control methods, in particular concepts from uncertainty quantification and stochastic control, can be beneficial in the design and analysis of training formulations for neural networks. Specifically, the project considers Neural Ordinary Differential Equations (NODEs) and their explicit discretizations, which take the form of Residual Networks (ResNets).
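The relationship between NODEs and ResNets mentioned above can be illustrated with a minimal sketch. It assumes a tanh vector field and weights that are constant over depth; both are illustrative choices and are not taken from the project. The function names `vector_field`, `resnet_forward` and `integrate` are hypothetical. Each explicit Euler step x_{k+1} = x_k + h f(x_k, theta_k) of the NODE dx/dt = f(x, theta) has the form of one residual block.

```python
import numpy as np

def vector_field(x, weight, bias):
    """Hypothetical NODE right-hand side f(x, theta) = tanh(W x + b)."""
    return np.tanh(weight @ x + bias)

def resnet_forward(x0, weights, biases, step):
    """Apply one residual block per layer: x_{k+1} = x_k + step * f(x_k, theta_k).

    This is the explicit Euler discretization of dx/dt = f(x, theta).
    """
    x = np.asarray(x0, dtype=float)
    for weight, bias in zip(weights, biases):
        x = x + step * vector_field(x, weight, bias)
    return x

# Illustrative parameters (assumptions): two states, weights shared across layers.
rng = np.random.default_rng(0)
W = 0.5 * rng.standard_normal((2, 2))
b = 0.1 * rng.standard_normal(2)
x0 = np.array([1.0, -0.5])
horizon = 1.0  # integration time of the underlying NODE

def integrate(num_layers):
    """ResNet with num_layers blocks, covering the same time horizon."""
    step = horizon / num_layers
    return resnet_forward(x0, [W] * num_layers, [b] * num_layers, step)

coarse = integrate(10)         # shallow ResNet, large step
fine = integrate(100)          # deeper ResNet, smaller step
reference = integrate(10_000)  # stands in for the continuous-time NODE flow

print(np.linalg.norm(coarse - reference), np.linalg.norm(fine - reference))
```

In this toy setting, deeper networks with proportionally smaller steps approach the continuous-time flow, which is the sense in which a ResNet can be read as a discretized NODE.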
We formulate and analyze the propagation of data through NODEs and ResNets in the framework of Polynomial Chaos Expansions (PCE) of L2 random variables, a concept proposed by Norbert Wiener. These random variables are used to model the input data of neural networks, which enables the forward propagation of entire sets of input data. Using the PCE description of the propagated data in the output layer, we explore the analysis of generalization properties. We extend the uncertainty propagation towards the training of NODEs and ResNets by combining the PCE-based L2 framework with optimal control formulations of the training problem. We explore how generalization properties of neural networks can be directly considered in the training problems and how system-theoretic dissipativity notions of optimal control problems allow for performance-preserving pruning of trained networks. Moreover, we devise active learning strategies based on the conceived L2 training problems. Specifically, we investigate how the propagated input data enables uncertainty quantification for label predictions. To this end, we investigate novel data informativity notions tailored to neural networks. Finally, we explore how stochastic control concepts, i.e., feedback policies, can be leveraged to design neural networks with quantifiable generalization properties. The investigated methods are evaluated on benchmark problems stemming from the machine learning literature and on benchmarks specific to systems and control developed in the research unit ALeSCo.

Neural ODE training via stochastic control and uncertainty quantification
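The PCE-based propagation of input data described in the abstract can be sketched for a scalar toy case. The example below is an illustration under stated assumptions, not the project's method: it uses a non-intrusive projection with Gauss-Hermite quadrature, a Gaussian input, and a hypothetical one-dimensional residual network (`scalar_resnet`) with made-up parameters. The output is expanded in probabilists' Hermite polynomials He_k, and its mean and variance are read off the coefficients.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

def scalar_resnet(x, num_layers=5, step=0.2, weight=0.8, bias=0.1):
    """Toy residual network x_{k+1} = x_k + step * tanh(weight * x_k + bias)."""
    for _ in range(num_layers):
        x = x + step * np.tanh(weight * x + bias)
    return x

def pce_coefficients(func, order, num_nodes=40):
    """Project func(xi), xi ~ N(0, 1), onto Hermite polynomials He_0..He_order.

    c_k = E[func(xi) * He_k(xi)] / k!, with the expectation approximated
    by Gauss-Hermite quadrature (weight function exp(-x^2 / 2)).
    """
    nodes, weights = hermegauss(num_nodes)
    weights = weights / math.sqrt(2.0 * math.pi)  # normalize to the N(0, 1) density
    values = func(nodes)
    coeffs = np.empty(order + 1)
    for k in range(order + 1):
        basis_k = hermeval(nodes, [0.0] * k + [1.0])  # He_k evaluated at the nodes
        coeffs[k] = np.sum(weights * values * basis_k) / math.factorial(k)
    return coeffs

# Uncertain input x0 = mean + std * xi (illustrative values, an assumption).
input_mean, input_std = 0.5, 0.3
coeffs = pce_coefficients(
    lambda xi: scalar_resnet(input_mean + input_std * xi), order=6
)

# Moments follow from orthogonality: E[He_j * He_k] = k! if j == k, else 0.
pce_mean = coeffs[0]
pce_variance = sum(coeffs[k] ** 2 * math.factorial(k) for k in range(1, len(coeffs)))
print(pce_mean, pce_variance)
```

The coefficients give a compact description of the output distribution for a whole set of inputs at once, which is the kind of information the abstract proposes to use for generalization analysis and uncertainty-aware training.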