Title: Preliminary analysis on data quality for ML applications
Language: English
Authors: Kiebler, Lorenz 
Moroff, Nikolas Ulrich 
Jacobsen, Jens Jakob 
Editor: Kersten, Wolfgang  
Jahn, Carlos  
Blecker, Thorsten 
Ringle, Christian M.  
Keywords: Artificial Intelligence; Blockchain
Issue Date: Sep-2022
Publisher: epubli
Source: Hamburg International Conference of Logistics (HICL) 33: 207-236 (2022)
Abstract (English):
Purpose: This publication investigates preliminary data quality analyses that estimate, as early as the data understanding phase of an implementation, the effort required and the results to be expected when using data sets for ML solutions. Knowing the necessary data cleaning effort and the achievable result quality allows the potential of an ML application to be assessed early in the process.
Methodology: Through a literature review, the characteristics of time series and methods of data cleaning are analysed. Based on the results, a test environment is implemented in Python that allows individual methods to be evaluated on sample data sets from the process industry and compared using different error analyses.
Findings: The publication gives a detailed overview of data cleaning procedures and provides a first indication of a connection between the final achievable forecast quality and the degree of error in the original data set. It also yields insights into how the choice of preprocessing method influences the achievable quality of the AI-based forecast.
Originality: The publication establishes the link between data characteristics of time series and preprocessing methods, so that conclusions about the quality improvement to be expected from selected data cleaning methods can be drawn in advance and decision support can be provided for selecting a method.
Conference: Hamburg International Conference of Logistics (HICL) 2022 
DOI: 10.15480/882.4693
ISBN: 978-3-756541-95-9
ISSN: 2365-5070
Document Type: Chapter/Article (Proceedings)
Peer Reviewed: Yes
License: CC BY-SA 4.0 (Attribution-ShareAlike 4.0)
Part of Series: Proceedings of the Hamburg International Conference of Logistics (HICL) 
Volume number: 33
Appears in Collections: Publications with fulltext

Files in This Item:
File: Kiebler et al. (2022) - Preliminary Analysis on Data Quality for ML Applications.pdf
Description: Preliminary Analysis on Data Quality for ML Applications
Size: 872,31 kB
Format: Adobe PDF

This item is licensed under a Creative Commons License.