Ensemble Topic Modeling via Matrix Factorization

Files in This Item:
insight_publication.pdf (247.25 kB, Adobe PDF)
Title: Ensemble Topic Modeling via Matrix Factorization
Authors: Belford, Mark; MacNamee, Brian; Greene, Derek
Permanent link: http://hdl.handle.net/10197/8336
Date: 21-Sep-2016
Abstract: Topic models can provide us with an insight into the underlying latent structure of a large corpus of documents, facilitating knowledge discovery and information summarization. A range of methods have been proposed in the literature, including probabilistic topic models and techniques based on matrix factorization. However, these methods tend to have stochastic elements in their initialization, which can lead to their output being unstable. That is, if a topic modeling algorithm is applied to the same data multiple times, the output will not necessarily always be the same. With this idea of stability in mind we ask the question – how can we produce a definitive topic model that is both stable and accurate? To address this, we propose a new ensemble topic modeling method, based on Non-negative Matrix Factorization (NMF), which combines a collection of unstable topic models to produce a definitive output. We evaluate this method on an annotated tweet corpus, where we show that this new approach is more accurate and stable than traditional NMF.
Funding Details: Science Foundation Ireland
Type of material: Conference Publication
Publisher: CEUR Workshop Proceedings
Copyright (published version): 2016 the Authors
Keywords: Machine learning;Statistics
Language: en
Status of Item: Peer reviewed
Is part of: Greene, D., Mac Namee, B. and Ross, R. (eds.). Proceedings of 24th Irish Conference on Artificial Intelligence and Cognitive Science (AICS'16)
Conference Details: 24th Irish Conference on Artificial Intelligence and Cognitive Science (AICS'16), Dublin, Ireland, 20-21 September 2016
Appears in Collections: Insight Research Collection


This item is available under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Ireland licence. No item may be reproduced for commercial purposes. For other possible restrictions on use, please refer to the publisher's URL where this item is made available, or to notes contained in the item itself. Other terms may apply.