Preliminary analysis on data quality for ML applications

Citation Link: https://doi.org/10.15480/882.4693
Publication Type
Conference Paper
Date Issued
2022-09
Language
English
Author(s)
Kiebler, Lorenz  
Moroff, Nikolas Ulrich  
Jacobsen, Jens Jakob  
Editor(s)
Kersten, Wolfgang
Jahn, Carlos
Blecker, Thorsten
Ringle, Christian M.
TORE-DOI
10.15480/882.4693
TORE-URI
http://hdl.handle.net/11420/13909
First published in
Proceedings of the Hamburg International Conference of Logistics (HICL)  
Number in series
33
Start Page
207
End Page
236
Citation
Hamburg International Conference of Logistics (HICL) 33: 207-236 (2022)
Contribution to Conference
Hamburg International Conference of Logistics (HICL) 2022  
Publisher Link
https://www.epubli.de/shop/buch/changing-tides-the-new-role-of-resilience-and-sustainability-in-logistics-and-supply-chain-management-wolfgang-kersten-9783756541959/130939
Publisher
epubli
Peer Reviewed
true
Purpose: This publication investigates preliminary data quality analyses that estimate, as early as the data understanding phase of an implementation, the effort required and the results to be expected when using data sets for ML solutions. Knowing the necessary data cleaning effort and the achievable result quality allows the potential of a data set to be assessed early in the process.
Methodology: A literature review analyses the characteristics of time series and methods of data cleaning. Based on the results, a test environment is implemented in Python that allows individual methods to be evaluated on sample data sets from the process industry and compared using different error analyses.
Findings: The publication gives a detailed overview of data cleaning procedures and provides a first indication of a connection between the final achievable forecast quality and the degree of error in the original data set. It also yields insights into how the choice of preprocessing method influences the achievable quality of the AI-based forecast.
Originality: The publication establishes the link between data characteristics in time series and preprocessing methods. This makes it possible to draw conclusions in advance about the quality improvement to be expected from selected data cleaning methods and provides decision support for selecting a method.
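Illustration: The kind of comparison the abstract describes can be sketched in a few lines of Python. This is a minimal sketch and not the authors' test environment: the synthetic series, the two cleaning methods (forward fill and linear interpolation), the mean absolute error, and all function names are assumptions chosen for illustration. It measures reconstruction error against a known clean series rather than the quality of an ML forecast.

```python
import math
import random


def make_series(n=200):
    """Synthetic seasonal series with a trend (assumed stand-in for real process data)."""
    return [10.0 + 3.0 * math.sin(2 * math.pi * t / 24) + 0.05 * t for t in range(n)]


def inject_missing(series, error_rate, seed=0):
    """Replace a fraction of interior points with None to simulate a degree of error."""
    rng = random.Random(seed)
    n_missing = int(round(error_rate * len(series)))
    # Keep the first and last points so every gap has two known neighbours.
    idx = set(rng.sample(range(1, len(series) - 1), n_missing))
    return [None if i in idx else v for i, v in enumerate(series)]


def forward_fill(series):
    """Fill each gap with the last known value."""
    out, last = [], None
    for v in series:
        if v is not None:
            last = v
        out.append(last)
    return out


def linear_interpolate(series):
    """Fill each gap on a straight line between its two known neighbours."""
    out = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = series[left] + step * (i - left)
    return out


def mae(truth, estimate):
    """Mean absolute error between the clean series and a cleaned series."""
    return sum(abs(a - b) for a, b in zip(truth, estimate)) / len(truth)


def evaluate(error_rates=(0.05, 0.2, 0.4)):
    """Score each cleaning method at several degrees of error."""
    truth = make_series()
    results = {}
    for rate in error_rates:
        corrupted = inject_missing(truth, rate)
        results[rate] = {
            "forward_fill": mae(truth, forward_fill(corrupted)),
            "linear_interpolate": mae(truth, linear_interpolate(corrupted)),
        }
    return results


if __name__ == "__main__":
    for rate, scores in evaluate().items():
        print(rate, scores)
```

Running the script prints one error score per cleaning method and error rate, which is the shape of comparison (method versus degree of error) that the abstract refers to.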
Subjects
Artificial Intelligence
Blockchain
DDC Class
004: Computer Science
330: Economics
380: Commerce, Communications, Transport
Publication version
publishedVersion
License
https://creativecommons.org/licenses/by-sa/4.0/
Name
Kiebler et al. (2022) - Preliminary Analysis on Data Quality for ML Applications.pdf
Size
872.31 KB
Format
Adobe PDF
