TUHH Open Research
Dynamic structure investigation and spectra prediction of biomolecules using machine learning techniques

Citation Link: https://doi.org/10.15480/882.9689
Publication Type
Doctoral Thesis
Date Issued
2024
Language
English
Author(s)
Kotobi, Amir  
Advisor
Meißner, Robert
Referee
Bari, Sadia  
Title Granting Institution
Technische Universität Hamburg
Place of Title Granting Institution
Hamburg
Examination Date
2024-06-06
Institute
Modellierung weicher Materie M-29  
TORE-DOI
10.15480/882.9689
TORE-URI
https://hdl.handle.net/11420/47867
Citation
Technische Universität Hamburg (2024)
The investigation of biomolecular structures and the prediction of their spectra through experimental and theoretical studies in the gas phase are fundamental steps in understanding their intrinsic properties and biological functions. Nonetheless, the complexity of the potential energy surface of biomolecules, combined with limited computational resources, restricts the interpretation of experimental observations. Integrating supervised and unsupervised machine learning (ML) techniques into theoretical calculations is considered an effective way to address these challenges.
Infrared (IR) spectroscopy and X-ray absorption spectroscopy (XAS) have proven to be powerful experimental techniques for studying the electronic and spatial structure of biomolecules such as peptides and proteins. Reproducing and validating the features observed in the resulting spectra often requires sophisticated ab initio calculations and a comprehensive understanding of the biomolecules’ configurational space.
In this thesis, I introduce a novel approach to interpreting the experimental IR spectrum of a peptide, which aims to enhance the exploratory power of the configurational-space search by combining replica exchange molecular dynamics (REMD) simulations, unsupervised machine learning, and ab initio calculations. This scheme relies on a set of structural descriptors and a data-driven clustering technique, which accounts for the canonical ensemble at the real experimental conditions, to obtain an accurate computed spectrum. We show that by partitioning the configurational space into subensembles of similar conformations, i.e. clusters, an accurate IR spectrum can be calculated by averaging the IR contributions of the representative conformers of the clusters, each weighted by the population of its cluster. While this approach unravels important fingerprints of experimental spectroscopic data, the calculation of IR and particularly XAS spectra is inherently expensive and often computationally prohibitive even for medium-sized molecules.
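The population-weighted averaging described above can be illustrated with a minimal sketch. It assumes that each cluster is already represented by one computed spectrum on a shared frequency grid and that cluster populations are given as conformer counts. The function name `cluster_weighted_spectrum` and the toy numbers are hypothetical and are not taken from the thesis.

```python
import numpy as np


def cluster_weighted_spectrum(cluster_spectra, cluster_populations):
    """Average per-cluster representative spectra, weighted by cluster population.

    cluster_spectra: array-like, shape (n_clusters, n_points), one spectrum per
        cluster representative on a common frequency grid (assumed input).
    cluster_populations: array-like, shape (n_clusters,), number of conformers
        assigned to each cluster (assumed input).
    """
    spectra = np.asarray(cluster_spectra, dtype=float)
    populations = np.asarray(cluster_populations, dtype=float)
    if spectra.ndim != 2 or spectra.shape[0] != populations.shape[0]:
        raise ValueError("need one spectrum row per cluster population")
    total = populations.sum()
    if total <= 0:
        raise ValueError("total cluster population must be positive")
    weights = populations / total  # fractional population of each cluster
    return weights @ spectra       # sum_k w_k * I_k(nu)


# Toy data (hypothetical): three clusters, five grid points.
toy_spectra = [
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0, 1.0],
]
toy_populations = [600, 300, 100]
averaged = cluster_weighted_spectrum(toy_spectra, toy_populations)
print(averaged)
```

In this sketch the weights are plain population fractions, so a cluster holding 60 % of the sampled conformers contributes 60 % of the final intensity at every grid point.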
To remedy the computational obstacles associated with spectra prediction, we develop a data-driven supervised ML framework based on graph neural networks (GNNs), which are trained on a custom-generated XAS dataset to find a mapping between structures and spectroscopic signals, thus bypassing the need for expensive ab initio quantum chemistry calculations. To ensure the interpretability of the GNN models’ predictions, we employ feature attribution to determine the respective contributions of the various atoms in a molecule to the peaks observed in the XAS spectrum. Within this approach, we show that it is possible to link the peaks observed in the spectra to specific core and virtual orbitals from the quantum chemical calculations and obtain an in-depth understanding of the ML-predicted XAS spectrum.
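The idea of attributing a spectral peak to individual atoms can be sketched as follows. The abstract does not state which attribution method the thesis uses, so this example substitutes a simple occlusion scheme: each atom's features are zeroed in turn and the drop in predicted intensity at one energy point is recorded. The model `toy_spectrum_model` is a hypothetical additive stand-in for a trained GNN, and all names and numbers are illustrative assumptions.

```python
import numpy as np


def toy_spectrum_model(atom_features, weights):
    """Hypothetical stand-in for a trained GNN.

    Maps per-atom feature vectors (n_atoms, n_features) to a spectrum
    (n_energy_points,) by summing linear per-atom contributions.
    """
    return (atom_features @ weights).sum(axis=0)


def occlusion_attribution(model, atom_features, weights, peak_index):
    """Score each atom by the intensity lost at one peak when it is occluded."""
    baseline = model(atom_features, weights)[peak_index]
    scores = np.zeros(atom_features.shape[0])
    for atom in range(atom_features.shape[0]):
        occluded = atom_features.copy()
        occluded[atom] = 0.0  # remove this atom's features
        scores[atom] = baseline - model(occluded, weights)[peak_index]
    return scores


# Toy molecule (hypothetical): three atoms, two features, three energy points.
atom_features = np.array([[1.0, 0.0],
                          [0.0, 1.0],
                          [1.0, 1.0]])
weights = np.array([[0.5, 0.0, 0.1],
                    [0.0, 0.8, 0.1]])
scores = occlusion_attribution(toy_spectrum_model, atom_features, weights, peak_index=1)
print(scores)
```

Because the toy model is additive over atoms, the scores sum exactly to the predicted peak intensity. A real GNN mixes information between neighbouring atoms through message passing, so occlusion scores there would only approximate per-atom contributions.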
The results presented in this thesis show that the integration of supervised and unsupervised ML techniques can effectively enhance the interpretation of spectroscopic data and make efficient use of the expensive ab initio calculations.
Subjects
Machine learning
Infrared (IR)
X-ray absorption spectroscopy (XAS)
Graph neural networks (GNN)
Explainability AI
DDC Class
540: Chemistry
570: Life Sciences, Biology
510: Mathematics
Funding(s)
DASHH Helmholtz Graduiertenkolleg  
Funding Organisations
Helmholtz-Zentrum Hereon  
DASHH
Deutsches Elektronen-Synchrotron DESY  
License
https://creativecommons.org/licenses/by/4.0/
Name
Amir_Kotobi_dynamic_structure_investigation_and_spectra_prediction_of_biomolecules_using_machine_learning_techniques.pdf
Size
26.45 MB
Format
Adobe PDF