TUHH Open Research
Geometric learning of latent parameters with Helmholtz Machines

Citation Link: https://doi.org/10.15480/882.14213
Publication Type
Doctoral Thesis
Date Issued
2025
Language
English
Author(s)
Várady, Csongor-Huba 
Advisor
Ay, Nihat  
Referee
Zemke, Jens
Title Granting Institution
Technische Universität Hamburg
Place of Title Granting Institution
Hamburg
Examination Date
2024-11-28
Institute
Data Science Foundations E-21  
TORE-DOI
10.15480/882.14213
TORE-URI
https://tore.tuhh.de/handle/11420/52903
Citation
Technische Universität Hamburg (2025)
In this thesis, we use concepts from Information Geometry (IG), such as Natural Gradient Descent (NG), to improve the training of a Helmholtz Machine (HM) through the design and implementation of a novel algorithm called the Natural Reweighted Wake-Sleep (NRWS).
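
As a hedged illustration of the central idea (not the thesis's own implementation), natural gradient descent preconditions the ordinary gradient with the inverse Fisher Information Matrix, theta <- theta - lr * F^{-1} grad. A minimal NumPy sketch with hypothetical names `theta`, `grad`, `fim`:

```python
import numpy as np

def natural_gradient_step(theta, grad, fim, lr=0.1, damping=1e-4):
    """One natural gradient update: theta <- theta - lr * F^{-1} grad.

    theta, grad: parameter vector and its Euclidean gradient.
    fim: estimate of the Fisher Information Matrix (square, PSD).
    damping regularizes the inversion (a common practical choice,
    not necessarily the one used in the thesis).
    """
    precond = np.linalg.solve(fim + damping * np.eye(fim.shape[0]), grad)
    return theta - lr * precond
```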

First, we prove that for any Directed Acyclic Graph (DAG), the associated Fisher Information Matrix (FIM), which describes the geometry of the statistical manifold, has a fine-grained block-diagonal structure that is efficient to invert. Exploiting the fact that the HM is composed of two DAG networks, we adapt its training algorithm into the NRWS, which implements NG.
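
The practical payoff of the block-diagonal structure is that the FIM can be inverted block by block instead of as one large dense matrix. A minimal sketch under the assumption that the blocks are available as a list of small square arrays (hypothetical helpers, not the thesis code):

```python
import numpy as np

def invert_block_diagonal(blocks, damping=1e-4):
    """Invert a block-diagonal FIM one block at a time.

    blocks: list of small square arrays, one per diagonal block.
    Cost is the sum of O(b_i^3) per block rather than O((sum b_i)^3)
    for a dense inverse of the full matrix.
    """
    return [np.linalg.inv(b + damping * np.eye(b.shape[0])) for b in blocks]

def apply_inverse(blocks_inv, grad_blocks):
    """Apply the inverted blocks to the matching slices of the gradient."""
    return [bi @ g for bi, g in zip(blocks_inv, grad_blocks)]
```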

The NRWS not only reaches a lower minimum of the optimization loss than other training methods, such as the Reweighted Wake-Sleep (RWS) and the Bidirectional Helmholtz Machine, but also converges faster in both epochs and wall-clock time. In particular, we show that the NRWS achieves state-of-the-art performance on standard benchmark datasets (MNIST, FashionMNIST, and the Toronto Face Dataset), measured by the importance sampling estimate of the log-likelihood of the HM.
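
For context, the importance sampling estimate referred to here is the standard importance-weighted log-likelihood, with the recognition network as the proposal distribution. A hedged sketch of that estimator (standard form, not taken from the thesis; the argument names are hypothetical):

```python
import numpy as np

def iw_log_likelihood(log_p_xh, log_q_h):
    """Importance-sampling estimate of log p(x) for one data point.

    log_p_xh: array of log p(x, h_k) under the generative network,
    log_q_h:  array of log q(h_k | x) under the recognition network,
    both evaluated at K samples h_k ~ q(h | x).
    Returns log(1/K * sum_k exp(log_p_xh[k] - log_q_h[k])),
    computed with the log-sum-exp trick for numerical stability.
    """
    log_w = log_p_xh - log_q_h
    m = np.max(log_w)
    return m + np.log(np.mean(np.exp(log_w - m)))
```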

By adapting Accelerated Gradient (AG) methods to operate within the geometry defined by the FIM of the HM, we further improve the performance of the NRWS. Using first-order AG methods, such as Momentum and Nesterov Momentum, improves the convergence rate of the NRWS without any computational overhead. Additionally, we develop a regularizer based on the Maximum Entropy Principle, named the Entropy Regularizer (ER), which further improves the NRWS by reaching a lower optimization loss and narrowing the generalization gap without any extra time penalty; the ER can also be applied to non-geometric training methods. Conveniently, the NRWS framework is compatible with continuous random variables, and we show how the FIM can be derived for normally distributed hidden variables.
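
To illustrate why first-order acceleration adds no overhead: classical momentum can simply be applied to the already-computed natural gradient direction. A minimal sketch under that assumption (hypothetical names; `fim_inv_apply` stands for any routine that applies the block-wise inverse FIM to a gradient):

```python
import numpy as np

def natural_momentum_step(theta, velocity, grad, fim_inv_apply, lr=0.1, beta=0.9):
    """Classical momentum applied to the natural gradient direction.

    fim_inv_apply: callable mapping a gradient vector to F^{-1} grad,
    e.g. built from block-wise inverses of the FIM.
    Because the natural gradient is computed anyway, the momentum
    update adds only a vector addition, no extra FIM work.
    """
    nat_grad = fim_inv_apply(grad)
    velocity = beta * velocity - lr * nat_grad
    return theta + velocity, velocity
```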

Finally, we explore the possibility of using HMs with Convolutional Neural Networks (CNNs) by computing the FIM for such network topologies and showing that the resulting matrix also has a fine-grained block-diagonal structure. We conclude by presenting a hypothesis on the difficulties of combining CNNs with HMs and the NRWS. This thesis contributes to the fields of IG and HMs, with findings that can be further explored or reused in other research areas. Our results can serve as a starting point for future research on improving training algorithms for neural networks and deep learning models using geometric methods such as NG.
Subjects
Helmholtz machine
Natural gradient
Natural reweighted wake sleep
DDC Class
006.3: Artificial Intelligence
510: Mathematics
License
https://creativecommons.org/licenses/by/4.0/
Name
Varady_Csongor_Huba-Geometric_Learning_of_Latent_Parameters_with_Helmholtz_Machines.pdf
Size
11.87 MB
Format
Adobe PDF