  • Publication
    Score Normalization and Aggregation for Active Learning in Multi-label Classification
    (University College Dublin. School of Computer Science and Informatics, 2010-02)
    Active learning is useful in situations where labeled data is scarce, unlabeled data is available, and labeling a large number of examples is costly or impractical. These techniques help by identifying a minimal set of examples to label that will support the training of an effective classifier. Active learning is thus particularly relevant for the automation of annotation tasks in multimedia. In this paper we consider the problem of employing active learning for the assignment of multiple annotations or “tags” to images in personal image collections. This form of multi-label classification has received a lot of attention in recent years; however, active multi-label classification is still a new research area. The main challenge in active multi-label classification is the selection of unlabeled examples that will be informative for all tags under consideration. This selection task proves surprisingly difficult, primarily because of the paucity of labeled data available. In this paper we present some solutions to this problem based on aggregated rankings from classifiers for individual tags.
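    The aggregation idea in the abstract can be illustrated as follows. This is a minimal sketch, not the paper's actual method: it assumes per-tag binary classifiers that output probabilities, scores each unlabeled example by its uncertainty for each tag, and combines the per-tag rankings with a Borda count (one of several possible rank-aggregation schemes) to pick the single most informative example.

    ```python
    # Sketch: active multi-label selection by aggregating per-tag
    # uncertainty rankings with a Borda count. The aggregation scheme
    # and classifier interface are illustrative assumptions.

    def uncertainty(prob):
        """Uncertainty of a binary tag prediction: highest at prob = 0.5."""
        return 1.0 - abs(prob - 0.5) * 2.0

    def select_most_informative(tag_probs):
        """tag_probs: {tag: [P(tag | example_i) for each unlabeled example]}.
        Returns the index of the example with the best aggregated rank."""
        n = len(next(iter(tag_probs.values())))
        borda = [0] * n
        for probs in tag_probs.values():
            # Rank examples by uncertainty for this tag, most uncertain first
            order = sorted(range(n),
                           key=lambda i: uncertainty(probs[i]), reverse=True)
            for rank, idx in enumerate(order):
                borda[idx] += n - rank  # Borda points: higher = more informative
        return max(range(n), key=lambda i: borda[i])

    # Three unlabeled images, two tags: example 1 is uncertain for both tags
    probs = {"beach": [0.95, 0.52, 0.10], "sunset": [0.48, 0.60, 0.05]}
    print(select_most_informative(probs))  # → 1
    ```

    Aggregating ranks rather than raw scores sidesteps the problem that scores from independently trained per-tag classifiers are not directly comparable, which motivates the score normalization discussed in the title.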
  • Publication
    Taking the pulse of the web: assessing sentiment on topics in online media
    The task of identifying sentiment trends in the popular media has long been of interest to analysts and pundits. Until recently, this task has required professional annotators to manually inspect individual articles in order to identify their polarity. With the increased availability of large volumes of online news content via syndicated feeds, researchers have begun to examine ways to automate aspects of this process. In this work, we describe a sentiment analysis system that uses crowdsourcing to gather non-expert annotations for economic news articles. By using these annotations in conjunction with a supervised machine learning strategy, we can generalize to label a much larger set of articles, allowing us to effectively track sentiment in different news sources over time.
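    One step such a crowdsourcing pipeline typically needs is collapsing several non-expert polarity annotations per article into a single training label. A minimal sketch, under the assumption of majority voting with an agreement threshold (the names and threshold here are illustrative, not taken from the paper):

    ```python
    # Sketch: turning multiple non-expert annotations into one training
    # label by majority vote, discarding low-agreement articles.
    from collections import Counter

    def majority_label(annotations, min_agreement=0.6):
        """annotations: polarity labels from different annotators for one
        article. Returns the majority label, or None if agreement is too
        low to trust the result."""
        counts = Counter(annotations)
        label, votes = counts.most_common(1)[0]
        if votes / len(annotations) >= min_agreement:
            return label
        return None  # ambiguous article: exclude from the training set

    print(majority_label(["neg", "neg", "pos", "neg", "neu"]))  # → neg
    print(majority_label(["pos", "neg", "neu"]))                # → None
    ```

    The surviving (article, label) pairs can then feed a standard supervised learner, which generalizes the crowd's judgments to the much larger unannotated stream.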
  • Publication
    Deriving insights from national happiness indices
    In online social media, individuals produce vast amounts of content which in effect "instruments" the world around us. Users on sites such as Twitter are publicly broadcasting status updates that provide an indication of their mood at a given moment in time, often accompanied by geolocation information. A number of strategies exist to aggregate such content to produce sentiment scores in order to build a "happiness index". In this paper, we describe such a system based on Twitter that maintains a happiness index for nine US cities. The main contribution of this paper is a companion system called SentireCrowds that allows us to identify the underlying causes behind shifts in sentiment. This ability to analyse the components of the sentiment signal highlights a number of problems. It shows that sentiment scoring on social media data without considering context is difficult. More importantly, it highlights cases where sentiment scoring methods are susceptible to unexpected shifts due to noise and trending memes.
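    The aggregation step behind such an index can be sketched very simply: average per-tweet sentiment scores by city and day. This is an illustrative assumption about the data layout, not the paper's actual pipeline, which would also need the geolocation handling and noise filtering the abstract discusses.

    ```python
    # Sketch: a per-city, per-day "happiness index" as the mean of
    # individual tweet sentiment scores. Data layout is an assumption.
    from collections import defaultdict

    def happiness_index(scored_tweets):
        """scored_tweets: iterable of (city, day, score), score in [-1, 1].
        Returns {(city, day): mean sentiment score}."""
        totals = defaultdict(lambda: [0.0, 0])
        for city, day, score in scored_tweets:
            acc = totals[(city, day)]
            acc[0] += score  # running sum of scores
            acc[1] += 1      # running count of tweets
        return {key: s / n for key, (s, n) in totals.items()}

    tweets = [("NYC", "2012-03-01", 0.8), ("NYC", "2012-03-01", -0.2),
              ("Austin", "2012-03-01", 0.5)]
    index = happiness_index(tweets)
    ```

    The fragility the paper highlights follows directly from this design: a trending meme that skews the score distribution for one day shifts the mean, so a companion system like SentireCrowds is needed to inspect what content actually drove the change.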
  • Publication
    Using crowdsourcing and active learning to track sentiment in online media
    Tracking sentiment in the popular media has long been of interest to media analysts and pundits. With the availability of news content via online syndicated feeds, it is now possible to automate some aspects of this process. There is also great potential to crowdsource much of the annotation work that is required to train a machine learning system to perform sentiment scoring. We describe such a system for tracking economic sentiment in online media that has been deployed since August 2009. It uses annotations provided by a cohort of non-expert annotators to train a learning system to classify a large body of news items. We report on the design challenges addressed in managing the effort of the annotators and in making annotation an interesting experience.
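    Managing annotator effort with active learning can be sketched as simple uncertainty sampling: under the assumption of a classifier that outputs a positive-class probability per article (the interface and names are illustrative), the system asks annotators only about the articles the current model is least sure of.

    ```python
    # Sketch: directing annotator effort via uncertainty sampling.
    # Articles scored closest to the 0.5 decision boundary are queued
    # for human annotation; confident predictions are left to the model.

    def pick_for_annotation(scores, budget):
        """scores: {article_id: P(positive)} from the current model.
        Returns the `budget` articles closest to the decision boundary."""
        ranked = sorted(scores, key=lambda a: abs(scores[a] - 0.5))
        return ranked[:budget]

    scores = {"a1": 0.97, "a2": 0.51, "a3": 0.45, "a4": 0.02}
    print(pick_for_annotation(scores, 2))  # → ['a2', 'a3']
    ```

    Each round of annotation retrains the model, so the queue keeps shifting toward whatever the system currently finds hardest, which is one way to keep a limited annotator cohort productive over a long deployment.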