How to crack a SMILES: automatic crosschecked chemical structure resolution across multiple services using MoleculeResolver
Citation Link: https://doi.org/10.15480/882.15775
Publication Type
Journal Article
Date Issued
2025-08-04
Language
English
TORE-DOI
Journal
Journal of Cheminformatics
Volume
17
Issue
1
Article Number
117
Citation
Journal of Cheminformatics 17 (1): 117 (2025)
Publisher DOI
Scopus ID
Publisher
Springer Nature
Abstract: Accurate chemical structure resolution from textual identifiers such as names and CAS RN® is critical for computational modeling in chemistry and related fields. This paper introduces MoleculeResolver, an automated, robust Python-based tool designed to address inconsistencies and inaccuracies commonly encountered when converting chemical identifiers to canonical SMILES strings. MoleculeResolver systematically crosschecks structures retrieved from multiple reputable chemical databases, implements rigorous identifier plausibility checks, standardizes molecular structures, and intelligently selects the most accurate representation based on a unique resolution algorithm.
Scientific contribution: Benchmarks across diverse datasets confirm that MoleculeResolver significantly enhances precision, recall, and overall reliability compared to traditional single-source methods, proving its utility as a valuable resource for chemists, data scientists, and researchers engaged in high-quality molecular data analysis and predictive model development.
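The two ideas named in the abstract, identifier plausibility checks and crosschecking across services, can be illustrated with a minimal sketch. This is not MoleculeResolver's API or its resolution algorithm: the function names, the `min_agreement` parameter, and the service labels are hypothetical, and a simple majority vote stands in for the tool's own selection logic. The sketch also assumes that every SMILES string has already been canonicalized with the same toolkit, so that plain string equality is a meaningful comparison.

```python
import re
from collections import Counter
from typing import Dict, Optional


def is_plausible_cas_rn(cas: str) -> bool:
    """Check the format and the check digit of a CAS Registry Number."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas)
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    # Weights count up from the right: the digit next to the check digit has weight 1.
    total = sum(w * int(d) for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def crosscheck(candidates: Dict[str, str], min_agreement: int = 2) -> Optional[str]:
    """Return the SMILES most services agree on, or None if agreement is too weak.

    `candidates` maps a (hypothetical) service label to the canonical SMILES
    that service returned for the same identifier.
    """
    ranked = Counter(candidates.values()).most_common()
    if not ranked:
        return None
    best, votes = ranked[0]
    # Reject results with too few votes, and ties between the top two structures.
    if votes < min_agreement or (len(ranked) > 1 and ranked[1][1] == votes):
        return None
    return best


if __name__ == "__main__":
    print(is_plausible_cas_rn("64-17-5"))
    print(crosscheck({"service_a": "CCO", "service_b": "CCO", "service_c": "CO"}))
```

Returning `None` on a tie or on weak agreement keeps an ambiguous structure from passing silently into a dataset, at the cost of leaving some identifiers unresolved.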
Subjects
Chemical structure retrieval
Identifier
ML
MoleculeResolver
Python
QSPR
SMILES
DDC Class
540: Chemistry
620: Engineering
Publication version
publishedVersion
Name
s13321-025-01064-7.pdf
Type
Main Article
Size
3.58 MB
Format
Adobe PDF