Extending Jensen Shannon Divergence to Compare Multiple Corpora

Files in This Item:
File: insight_publication.pdf
Size: 1.19 MB
Format: Adobe PDF
Title: Extending Jensen Shannon Divergence to Compare Multiple Corpora
Authors: Lu, Jinghui; Henchion, Maeve; MacNamee, Brian
Permanent link: http://hdl.handle.net/10197/10517
Date: 1-Jan-2017
Online since: 2019-05-20T08:53:52Z
Abstract: Investigating public discourse on social media platforms has proven a viable way to reflect the impacts of political issues. In this paper we frame this as a corpus comparison problem in which the online discussions of different groups are treated as different corpora to be compared. We propose an extended version of the Jensen-Shannon divergence measure to compare multiple corpora and use the FP-growth algorithm to mix unigrams and bigrams in this comparison. We also propose a set of visualizations that can illustrate the results of this analysis. To demonstrate these approaches we compare the Twitter discourse surrounding Brexit in Ireland and Great Britain across a 14-week time period.
Funding Details: Teagasc
Type of material: Conference Publication
Publisher: CEUR-WS.org
Series/Report no.: CEUR Workshop Proceedings, Volume 2086, 2018
Keywords: Machine learning and statistics; Jensen-Shannon divergence measure; Visualisations; Brexit
Language: en
Status of Item: Peer reviewed
Is part of: McAuley, J., McKeever, S. (eds.). Proceedings of the 25th Irish Conference on Artificial Intelligence and Cognitive Science
Conference Details: 25th Irish Conference on Artificial Intelligence and Cognitive Science, Dublin, Ireland, 7 - 8 December 2017
Appears in Collections: Insight Research Collection
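
The abstract mentions comparing more than two corpora with a Jensen-Shannon divergence measure, but this record does not give the formula. As a rough illustration only, the sketch below computes the standard generalized Jensen-Shannon divergence over n word distributions (entropy of the weighted mixture minus the weighted mean of the individual entropies). It is not necessarily the extension proposed in the paper. The function names, the uniform default weights, and the plain unigram counting are all assumptions made for this example.

```python
import math
from collections import Counter


def entropy(dist):
    """Shannon entropy (in bits) of a {word: probability} mapping."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def generalized_jsd(corpora, weights=None):
    """Generalized Jensen-Shannon divergence of several token lists.

    Hypothetical helper for illustration: each corpus is a non-empty
    list of tokens, and weights default to uniform (an assumption).
    """
    n = len(corpora)
    if weights is None:
        weights = [1.0 / n] * n

    # Turn each corpus into a relative-frequency distribution.
    dists = []
    for tokens in corpora:
        counts = Counter(tokens)
        total = sum(counts.values())
        dists.append({w: c / total for w, c in counts.items()})

    # Weighted mixture of all corpus distributions.
    mixture = Counter()
    for pi, dist in zip(weights, dists):
        for w, p in dist.items():
            mixture[w] += pi * p

    # H(mixture) minus the weighted mean of the individual entropies.
    return entropy(mixture) - sum(
        pi * entropy(d) for pi, d in zip(weights, dists)
    )
```

With uniform weights this quantity is zero when all corpora share the same word distribution and reaches log2(n) bits when the n corpora have no words in common.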

This item is available under the Attribution-NonCommercial-NoDerivs 3.0 Ireland licence. No item may be reproduced for commercial purposes. For other possible restrictions on use please refer to the publisher's URL where this is made available, or to notes contained in the item itself. Other terms may apply.