Adaptive Window Strategy for Topic Modeling in Document Streams
Publication Type
Conference Paper
Date Issued
2018-07
Language
English
Article Number
8489771
Citation
2018 International Joint Conference on Neural Networks (IJCNN 2018)
Extracting global themes from written text has recently become a major problem in computational intelligence, particularly in the Natural Language Processing community. Among the proposed solutions, Latent Dirichlet Allocation (LDA) has attracted considerable interest, and several variants have been proposed to adapt it to changing environments. With the emergence of data streams, for instance from social media, the field faces a new challenge: topic extraction in real time. In this paper, we propose a simple approach called Adaptive Window based Incremental LDA (AWILDA), which combines LDA with state-of-the-art methods from data stream mining. We train a new topic model only when a drift is detected, and we select the training data on the fly using the ADWIN algorithm. We provide both theoretical guarantees for our method and experimental validation on artificial and real-world data.
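The drift-triggered retraining loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state which signal is monitored, so the sketch assumes a per-document score in [0, 1] (for example, a normalized fit of the document under the current model). `SimpleAdwin` is a simplified ADWIN-style detector that checks every split of the window with the standard ADWIN cut threshold but omits the bucket compression of the full algorithm. `run_stream`, `train`, `score`, and `warmup` are hypothetical names introduced only for illustration.

```python
import math
from collections import deque


class SimpleAdwin:
    """Simplified ADWIN-style detector for values in [0, 1].

    Assumption: the whole window is stored and every split is tested,
    without the exponential-histogram compression of the full algorithm.
    """

    def __init__(self, delta=0.002):
        self.delta = delta      # confidence parameter of the cut test
        self.window = deque()   # most recent values judged to be stationary

    def _has_cut(self):
        n = len(self.window)
        if n < 2:
            return False
        total = sum(self.window)
        log_term = math.log(4.0 * n / self.delta)  # ln(4 / delta'), delta' = delta / n
        head = 0.0
        for n0, x in enumerate(self.window, start=1):
            if n0 == n:
                break
            head += x
            n1 = n - n0
            m = 1.0 / (1.0 / n0 + 1.0 / n1)  # harmonic-style mean of sub-window sizes
            eps_cut = math.sqrt(log_term / (2.0 * m))
            if abs(head / n0 - (total - head) / n1) > eps_cut:
                return True
        return False

    def update(self, value):
        """Add a value; drop the oldest values while a cut exists. Return True on drift."""
        self.window.append(value)
        drift = False
        while self._has_cut():
            self.window.popleft()
            drift = True
        return drift


def run_stream(docs, train, score, warmup=50, delta=0.002):
    """Retrain only on drift, using the documents kept in the adaptive window.

    Hypothetical interface: train(list_of_docs) -> model,
    score(model, doc) -> float in [0, 1].
    """
    docs = list(docs)
    model = train(docs[:warmup])          # initial model from a warm-up batch
    detector = SimpleAdwin(delta)
    recent = deque()                      # documents aligned with detector.window
    retrain_count = 0
    for doc in docs[warmup:]:
        drift = detector.update(score(model, doc))
        recent.append(doc)
        while len(recent) > len(detector.window):
            recent.popleft()              # forget documents the detector dropped
        if drift:
            model = train(list(recent))   # new model from the shrunken window
            retrain_count += 1
    return model, retrain_count
```

In practice, `train` could wrap any LDA implementation and `score` could map a per-document likelihood into [0, 1]; both choices are left open here. One simplification to note: after a retrain, the detector window still holds scores computed under the previous model.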