- Publication: Real time News Story Detection and Tracking with Hashtags
  Topic Detection and Tracking (TDT) is an important research topic in data mining and information retrieval and has been explored for many years. Most of the studies have approached the problem from the event tracking point of view. We argue that the definition of stories as events is not reflecting the full picture. In this work we propose a story tracking method built on crowd-tagging in social media, where news articles are labeled with hashtags in real-time. The social tags act as rich metadata for news articles, with the advantage that, if carefully employed, they can capture emerging concepts and address concept drift in a story. We present an approach for employing social tags for the purpose of story detection and tracking and show initial empirical results. We compare our method to classic keyword query retrieval and discuss an example of story tracking over time.
- Publication: On Supporting Digital Journalism: Case Studies in Co-Designing Journalistic Tools
  Since 2013, researchers at University College Dublin in the Insight Centre for Data Analytics have been involved in a significant research programme in digital journalism, specifically targeting tools and social media guidelines to support the work of journalists. Most of this programme was undertaken in collaboration with The Irish Times. This collaboration involved identifying key problems currently faced by digital journalists, developing tools as solutions to these problems, and then iteratively co-designing these tools with feedback from journalists. This paper reports on our experiences and learnings from this research programme, with a view to informing similar efforts in the future.
- Publication: Hashtagger+: Efficient High-Coverage Social Tagging of Streaming News
  News and social media now play a synergistic role and neither domain can be grasped in isolation. On one hand, platforms such as Twitter have taken a central role in the dissemination and consumption of news. On the other hand, news editors rely on social media for following their audiences' attention and for crowd-sourcing news stories. Twitter hashtags function as a key connection between Twitter crowds and the news media, by naturally naming and contextualizing stories, grouping the discussion of news and marking topic trends. In this work we propose Hashtagger+, an efficient learning-to-rank framework for merging news and social streams in real-time, by recommending Twitter hashtags to news articles. We provide an extensive study of different approaches for streaming hashtag recommendation, and show that pointwise learning-to-rank is more effective than multi-class classification as well as more complex learning-to-rank approaches. We improve the efficiency and coverage of a state-of-the-art hashtag recommendation model by proposing new techniques for data collection and feature computation. In our comprehensive evaluation on real data we show that we drastically outperform the accuracy and efficiency of prior methods. Our prototype system delivers recommendations in under 1 minute, with a Precision@1 of 94% and article coverage of 80%. This is an order of magnitude faster than prior approaches, and brings improvements of 5% in precision and 20% in coverage. By effectively linking the news stream to the social stream via the recommended hashtags, we open the door to solving many challenging problems related to story detection and tracking. To showcase this potential, we present an application of our recommendations to automated news story tracking via social tags. Our recommendation framework is implemented in a real-time Web system available from insight4news.ucd.ie.
- Publication: Topy: Real-time Story Tracking via Social Tags
  The Topy system automates real-time story tracking by utilizing crowd-sourced tagging on social media platforms. Topy employs a state-of-the-art Twitter hashtag recommender to continuously annotate news articles with hashtags, a rich meta-data source that allows connecting articles under drastically different timelines than typical keyword-based story tracking systems. Employing social tags for story tracking has the following advantages: (1) social annotation of news enables the detection of emerging concepts and topic drift in a story; (2) hashtags go beyond topics by grouping articles based on connected themes (e.g., #rip, #blacklivesmatter, #icantbreath); (3) hashtags link articles that focus on subplots of the same story (e.g., #palmyra, #isis, #refugeecrisis).
- Publication: SocialTree: Socially Augmented Structured Summaries of News Stories
  News story understanding entails having an effective summary of a related group of articles that may span different time ranges, involve different topics and entities, and have connections to other stories. In this work, we present an approach to efficiently extract structured summaries of news stories by augmenting news media with the structure of social discourse as reflected in social media in the form of social tags. Existing event detection, topic-modeling, clustering and summarization methods yield news story summaries based only on noun phrases and named entities. These representations are sensitive to the article wording and the keyword extraction algorithm. Moreover, keyword-based representations are rarely helpful for highlighting the inter-story connections or for reflecting the inner structure of the news story because of high word ambiguity and clutter from the large variety of keywords describing news stories. Our method combines the news and social media domains to create structured summaries of news stories in the form of hierarchies of keywords and social tags, named SocialTree. We show that the properties of social tags can be exploited to augment the construction of hierarchical summaries of news stories and to alleviate the weaknesses of existing keyword-based representations. In our quantitative and qualitative evaluation the proposed method strongly outperforms the state-of-the-art with regard to both coverage and informativeness of the summaries.