Authors: Lynch, Gerard; Cunningham, Pádraig
2016-02-16
2014 Assoc
2014-06-27
ISBN: 978-1-941643-11-2
URI: http://hdl.handle.net/10197/7510
Conference: 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA 2014), Baltimore, Maryland, USA, 27 June 2014
Abstract: Determining relevant content automatically is a challenging task for any aggregation system. In the business intelligence domain, particularly in the application area of Online Reputation Management, it may be desirable to label tweets as either customer comments which deserve rapid attention or tweets from industry experts or sources regarding the higher-level operations of a particular entity. We present an approach using a combination of linguistic and Twitter-specific features to represent tweets and examine the efficacy of these in distinguishing between tweets which have been labelled using Amazon’s Mechanical Turk crowd sourcing platform. Features such as part-of-speech tags and function words prove highly effective at discriminating between the two categories of tweet related to several distinct entity types, with Twitter related metrics such as the presence of hash tags, retweets and user mentions also adding to classification accuracy. Accuracy of 86% is reported using an SVM classifier and a mixed set of the aforementioned features on a corpus of tweets related to seven business entities.
Language: en
Subjects: Social media; Text analytics
Title: Linguistically Informed Tweet Categorization for Online Reputation Management
Type: Conference Publication
7378
2015-11-16
License: https://creativecommons.org/licenses/by-nc-nd/3.0/ie/