Options
Ensemble Topic Modeling via Matrix Factorization
Author(s)
Date Issued
2016-09-21
Date Available
2017-02-13T15:06:25Z
Abstract
Topic models can provide us with an insight into the underlying latent structure of a large corpus of documents, facilitating knowledge discovery and information summarization. A range of methods have been proposed in the literature, including probabilistic topic models and techniques based on matrix factorization. However, these methods tend to have stochastic elements in their initialization, which can lead to their output being unstable. That is, if a topic modeling algorithm is applied to the same data multiple times, the output will not necessarily always be the same. With this idea of stability in mind we ask the question – how can we produce a definitive topic model that is both stable and accurate? To address this, we propose a new ensemble topic modeling method, based on Non-negative Matrix Factorization (NMF), which combines a collection of unstable topic models to produce a definitive output. We evaluate this method on an annotated tweet corpus, where we show that this new approach is more accurate and stable than traditional NMF.
Sponsorship
Science Foundation Ireland
Type of Material
Conference Publication
Publisher
CEUR Workshop Proceedings
Volume
1751
Copyright (Published Version)
2016 the Authors
Keywords
Web versions
Language
English
Status of Item
Peer reviewed
Part of
Greene, D., Mac Namee, B. and Ross, R. (eds.). Proceedings of 24th Irish Conference on Artificial Intelligence and Cognitive Science (AICS'16)
Conference Details
24th Irish Conference on Artificial Intelligence and Cognitive Science (AICS'16), Dublin, Ireland, 20-21 September 2016
This item is made available under a Creative Commons License
File(s)
Owning collection
Views
1335
Acquisition Date
Apr 18, 2024
Apr 18, 2024
Downloads
356
Last Week
2
2
Last Month
4
4
Acquisition Date
Apr 18, 2024
Apr 18, 2024