On dissipativity of cross-entropy loss in training ResNets — A turnpike towards architecture search

Citation Link: https://doi.org/10.15480/882.16637
Publication Type
Journal Article
Date Issued
2026-04-01
Language
English
Author(s)
Püttschneider, Jens  
Faulwasser, Timm  
Regelungstechnik E-14  
TORE-DOI
10.15480/882.16637
TORE-URI
https://hdl.handle.net/11420/61335
Journal
Automatica  
Volume
186
Article Number
112767
Citation
Automatica 186: 112767 (2026)
Publisher DOI
10.1016/j.automatica.2025.112767
Scopus ID
2-s2.0-105028616836
Publisher
Elsevier
Abstract
The training of ResNets and neural ODEs can be formulated and analyzed from the perspective of optimal control. This paper proposes a dissipative formulation of the training of ResNets and neural ODEs for classification problems. Specifically, we consider a variant of the cross-entropy (label smoothing) both as a loss function and as a regularization term in the stage cost. Based on this dissipative formulation, we prove that the training optimal control problems (OCPs) for ResNets and neural ODEs alike exhibit the turnpike phenomenon. We illustrate this finding with numerical results for the two-spirals and MNIST datasets. Crucially, our training formulation ensures that the transformation of the data from input to output is achieved in the first layers. In the subsequent layers, which constitute the turnpike, the data remains at an equilibrium state, so these layers do not contribute to the learned transformation. In principle, these layers can be pruned after training, resulting in a network with only the necessary number of layers and thus simplifying hyperparameter tuning.
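The two ingredients named in the abstract, a label-smoothed cross-entropy and the pruning of layers in which the data no longer moves, can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: the loss is the standard label-smoothing form (one-hot target mixed with a uniform distribution, weight `eps`), the network is a plain residual recursion x_{k+1} = x_k + h * f_k(x_k), and a layer is treated as part of the turnpike when its state update falls below a hypothetical tolerance `tol`. The exact loss variant, stage cost, and pruning rule used by the authors are not given in the abstract.

```python
import math

def smoothed_targets(label, num_classes, eps=0.1):
    """Standard label smoothing (assumed form): one-hot mixed with uniform."""
    return [(1.0 - eps) * (1.0 if k == label else 0.0) + eps / num_classes
            for k in range(num_classes)]

def log_softmax(logits):
    """Numerically stable log-softmax via the log-sum-exp trick."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

def smoothed_cross_entropy(logits, label, eps=0.1):
    """Cross-entropy between the smoothed targets and softmax(logits)."""
    targets = smoothed_targets(label, len(logits), eps)
    return -sum(t * lp for t, lp in zip(targets, log_softmax(logits)))

def forward_states(x0, layers, h=1.0):
    """Residual forward pass x_{k+1} = x_k + h * f_k(x_k); returns all states."""
    states = [list(x0)]
    for f in layers:
        x = states[-1]
        states.append([xi + h * fi for xi, fi in zip(x, f(x))])
    return states

def layers_to_keep(states, tol=1e-6):
    """Index of the last layer whose update exceeds tol (hypothetical criterion).

    Layers after this index leave the state essentially unchanged and are
    candidates for pruning.
    """
    keep = 0
    for k in range(len(states) - 1):
        step = math.sqrt(sum((a - b) ** 2
                             for a, b in zip(states[k + 1], states[k])))
        if step > tol:
            keep = k + 1
    return keep

# Toy example: two layers move the state, two layers are stationary.
toy_layers = [
    lambda x: [1.0, 0.0],
    lambda x: [0.0, 2.0],
    lambda x: [0.0, 0.0],
    lambda x: [0.0, 0.0],
]
toy_states = forward_states([0.0, 0.0], toy_layers)
toy_keep = layers_to_keep(toy_states)
```

In this toy setting `toy_keep` counts the leading layers that still change the state; the remaining layers act as the identity and could be dropped without altering the output. With `eps > 0` the smoothed loss is bounded below by the entropy of the smoothed target, so it is minimized at finite logits, which is one common motivation for label smoothing.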
Subjects
Deep learning
Dissipativity
Huber loss
Label smoothing
Manifold turnpike
Neural networks
Optimal control
DDC Class
006: Special computer methods
515: Analysis
004: Computer Sciences
Funding(s)
Projekt DEAL  
License
https://creativecommons.org/licenses/by/4.0/
Publication version
publishedVersion
Name
1-s2.0-S000510982500665X-main.pdf
Type
Main Article
Size
2.6 MB
Format
Adobe PDF