How to crack a SMILES: automatic crosschecked chemical structure resolution across multiple services using MoleculeResolver

Citation Link: https://doi.org/10.15480/882.15775
Publication Type
Journal Article
Date Issued
2025-08-04
Language
English
Author(s)
Müller, Simon
Thermische Verfahrenstechnik V-8  
TORE-DOI
10.15480/882.15775
TORE-URI
https://hdl.handle.net/11420/57004
License
https://creativecommons.org/licenses/by/4.0/
Journal
Journal of cheminformatics  
Volume
17
Issue
1
Article Number
117
Citation
Journal of Cheminformatics 17 (1): 117 (2025)
Publisher DOI
10.1186/s13321-025-01064-7
Scopus ID
2-s2.0-105012597653
Publisher
Springer Nature
Abstract: Accurate chemical structure resolution from textual identifiers such as names and CAS RN® is critical for computational modeling in chemistry and related fields. This paper introduces MoleculeResolver, an automated, robust Python-based tool designed to address inconsistencies and inaccuracies commonly encountered when converting chemical identifiers to canonical SMILES strings. MoleculeResolver systematically crosschecks structures retrieved from multiple reputable chemical databases, implements rigorous identifier plausibility checks, standardizes molecular structures, and intelligently selects the most accurate representation based on a unique resolution algorithm.
Scientific contribution: Benchmarks across diverse datasets confirm that MoleculeResolver significantly enhances precision, recall, and overall reliability compared to traditional single-source methods, proving its utility as a valuable resource for chemists, data scientists, and researchers engaged in high-quality molecular data analysis and predictive model development.
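Two of the ideas named in the abstract, identifier plausibility checks and crosschecking across services, can be illustrated with a minimal sketch. This is not MoleculeResolver's API or its resolution algorithm; the function names, the service labels, and the simple majority rule are assumptions made for illustration only. The CAS RN check uses the publicly documented check-digit rule. The consensus step assumes every service's result has already been standardized to the same canonical SMILES form, so that identical structures compare as identical strings.

```python
import re
from collections import Counter
from typing import Mapping, Optional


def is_plausible_cas_rn(cas_rn: str) -> bool:
    """Check the format and check digit of a CAS Registry Number."""
    # Format: 2 to 7 digits, hyphen, 2 digits, hyphen, 1 check digit.
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas_rn.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... from right to left; the weighted
    # sum modulo 10 must equal the check digit.
    total = sum(w * int(d) for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == check_digit


def pick_consensus(
    candidates: Mapping[str, str], min_agreement: int = 2
) -> Optional[str]:
    """Return the SMILES reported by the most services.

    `candidates` maps a (hypothetical) service label to an already
    canonicalized SMILES string. Returns None when there are no
    candidates, when fewer than `min_agreement` services agree, or
    when the top two structures are tied.
    """
    ranked = Counter(candidates.values()).most_common()
    if not ranked:
        return None
    best, votes = ranked[0]
    if votes < min_agreement:
        return None
    if len(ranked) > 1 and ranked[1][1] == votes:
        return None  # ambiguous: no single structure wins
    return best


if __name__ == "__main__":
    print(is_plausible_cas_rn("7732-18-5"))  # water: True
    print(is_plausible_cas_rn("7732-18-4"))  # wrong check digit: False
    # Hypothetical service labels; two of three agree on ethanol.
    results = {"service_a": "CCO", "service_b": "CCO", "service_c": "CO"}
    print(pick_consensus(results))  # CCO
```

Returning `None` on a tie or on weak agreement, rather than guessing, mirrors the abstract's emphasis on reliability over coverage; a real tool would presumably apply more refined ranking than a plain vote count.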
Subjects
Chemical structure retrieval
Identifier
ML
MoleculeResolver
Python
QSPR
SMILES
DDC Class
540: Chemistry
620: Engineering
Publication version
publishedVersion
Name
s13321-025-01064-7.pdf
Type
Main Article
Size
3.58 MB
Format
Adobe PDF