Ensemble Topic Modeling via Matrix Factorization
|Title:||Ensemble Topic Modeling via Matrix Factorization|
|Authors:||Belford, Mark|
|Permanent link:||http://hdl.handle.net/10197/8336|
|Date:||21-Sep-2016|
|Abstract:||Topic models can provide us with an insight into the underlying latent structure of a large corpus of documents, facilitating knowledge discovery and information summarization. A range of methods have been proposed in the literature, including probabilistic topic models and techniques based on matrix factorization. However, these methods tend to have stochastic elements in their initialization, which can lead to their output being unstable. That is, if a topic modeling algorithm is applied to the same data multiple times, the output will not necessarily always be the same. With this idea of stability in mind we ask the question – how can we produce a definitive topic model that is both stable and accurate? To address this, we propose a new ensemble topic modeling method, based on Non-negative Matrix Factorization (NMF), which combines a collection of unstable topic models to produce a definitive output. We evaluate this method on an annotated tweet corpus, where we show that this new approach is more accurate and stable than traditional NMF.|
|Funding Details:||Science Foundation Ireland|
|Type of material:||Conference Publication|
|Publisher:||CEUR Workshop Proceedings|
|Copyright (published version):||2016 the Authors|
|Keywords:||Machine learning;Statistics|
|Language:||en|
|Status of Item:||Peer reviewed|
|Is part of:||Greene, D., Mac Namee, B. and Ross, R. (eds.). Proceedings of 24th Irish Conference on Artificial Intelligence and Cognitive Science (AICS'16)|
|Conference Details:||24th Irish Conference on Artificial Intelligence and Cognitive Science (AICS'16), Dublin, Ireland, 20-21 September 2016|
|Appears in Collections:||Insight Research Collection|
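The abstract describes two ideas that a small example can make concrete: randomly initialized NMF can give different topics on each run, and an ensemble can combine several such runs into a single output. The sketch below is only an illustration under stated assumptions, not the authors' implementation. It assumes NumPy, a toy document-term matrix, basic multiplicative-update NMF, and one plausible combination step (stacking the topic-term matrices of the base runs and factorizing the stack once more with a fixed seed). The names `nmf`, `ensemble_topics`, `seeds`, and `final_seed` are hypothetical.

```python
import numpy as np


def nmf(A, k, seed, n_iter=200, eps=1e-9):
    """Approximate A (docs x terms) as W @ H using multiplicative updates.

    The random initialization is the stochastic element: different seeds
    can lead to different factors for the same input.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_terms = A.shape
    W = rng.random((n_docs, k)) + eps
    H = rng.random((k, n_terms)) + eps
    for _ in range(n_iter):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H


def ensemble_topics(A, k, seeds, final_seed=0):
    """Combine several base NMF runs into one k x terms topic matrix.

    Assumption for illustration: the base topic-term matrices are stacked
    row-wise and the stack is factorized again with a fixed seed, so the
    same inputs always give the same ensemble output.
    """
    stacked = np.vstack([nmf(A, k, s)[1] for s in seeds])
    _, H_final = nmf(stacked, k, final_seed)
    return H_final


# Toy document-term matrix (6 documents, 5 terms), invented for this sketch.
A = np.array([
    [3, 2, 0, 0, 1],
    [2, 3, 0, 0, 0],
    [4, 1, 0, 1, 0],
    [0, 0, 3, 2, 0],
    [0, 1, 2, 3, 0],
    [0, 0, 4, 2, 1],
], dtype=float)

topics = ensemble_topics(A, k=2, seeds=range(5))
# Indices of the three highest-weighted terms for each ensemble topic.
top_terms = np.argsort(-topics, axis=1)[:, :3]
print(top_terms)
```

Calling `ensemble_topics` twice with the same `seeds` and `final_seed` returns the same matrix, whereas two single `nmf` runs with different seeds generally do not agree exactly. How the base runs are generated and combined in the paper itself should be taken from the full text.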
This item is available under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Ireland licence. No item may be reproduced for commercial purposes. For other possible restrictions on use, please refer to the publisher's URL where this item is made available, or to notes contained in the item itself. Other terms may apply.