Please use this identifier to cite or link to this item: https://doi.org/10.15480/882.2671
Publisher DOI: 10.18420/inf2019_25
Title: Do we need real data? : testing and training algorithms with artificial geolocation data
Language: English
Authors: Kaiser, Jan 
Bavendiek, Kai 
Schupp, Sibylle 
Keywords: geolocation data;artificial data;data generation;neural networks generators;data quality
Issue Date: 2019
Publisher: Gesellschaft für Informatik
Source: In: David, K., Geihs, K., Lange, M. & Stumme, G. (eds.), INFORMATIK 2019: 50 Jahre Gesellschaft für Informatik – Informatik für Gesellschaft. Bonn: Gesellschaft für Informatik e.V. (pp. 205-218).
Part of Series: GI-Edition 
Volume number: 294
Abstract (English):
As big data becomes increasingly important, so do algorithms that operate on geolocation data. Privacy requirements and the cost of collecting large sets of geolocation data, however, make it difficult to test those algorithms with real data. Artificially generated data sets therefore present an appealing alternative. This paper explores the use of two types of neural networks as generators of geolocation data and introduces a method based on the Turing Test to determine whether generated geolocation data is indistinguishable from real data. In an extensive evaluation, we apply the method to data generated by our own implementation of neural networks and by the widely used BerlinMOD generator on the one hand, and to the four most prominent data sets of real geolocation data, covering a total of 65 million records, on the other. The experiments show that in eleven of twelve cases artificial data sets can be told from real ones. We conclude that, at present, the generators we tested provide no safe replacement for real data.
Conference: 50 Jahre Gesellschaft für Informatik – Informatik für Gesellschaft, conference held 23-26 September 2019 in Kassel
URI: http://hdl.handle.net/11420/4344
DOI: 10.15480/882.2671
ISBN: 978-3-88579-688-6
Institute: Softwaresysteme E-16 
Document Type: Chapter/Article (Proceedings)
License: CC BY-SA 4.0 (Attribution-ShareAlike 4.0)
Appears in Collections: Publications with fulltext

Files in This Item:
File: paper3_02.pdf
Description: Publisher's PDF
Size: 1.33 MB
Format: Adobe PDF