Ifrim, Georgiana
Preferred name
Ifrim, Georgiana
Official Name
Ifrim, Georgiana
Research Output
- Publication: Be In The Know: Connecting News Articles to Relevant Twitter Conversations (2016-09-19)
In this paper we propose a framework for tracking and automatically connecting news articles to Twitter conversations as captured by Twitter hashtags. For example, such a system could alert journalists about news that gets a lot of Twitter reaction, so they can investigate those conversations for new developments in the story, promote their article to a set of interested consumers, or discover general sentiment towards the story. Mapping articles to hashtags is nevertheless challenging, due to the different language styles of articles and tweets, the streaming aspect, and user behavior when marking tweet terms as hashtags. We track the Irish Times RSS feed and a focused Twitter stream over a two-month period, and present a system that assigns hashtags to each article, based on its Twitter echo. We propose a machine learning approach for classifying article-hashtag pairs. Our empirical study shows that our system delivers high precision for this task.
201
- Publication: Efficient Sequence Regression by Learning Linear Models in All-Subsequence Space
We present a new approach for learning a sequence regression function, i.e., a mapping from sequential observations to a numeric score. Our learning algorithm employs coordinate gradient descent with Gauss-Southwell optimization in the feature space of all subsequences. We give a tight upper bound for the coordinate-wise gradients of squared error loss, which enables efficient Gauss-Southwell selection. The proposed bound is built by separating the positive and the negative gradients of the loss function and exploits the structure of the feature space. Extensive experiments on simulated as well as real-world sequence regression benchmarks show that the bound is effective and our proposed learning algorithm is efficient and accurate. The resulting linear regression model provides the user with a list of the most predictive features selected during the learning stage, adding to the interpretability of the method.
Code and data related to this chapter are available at: https://github.com/svgsponer/SqLoss.
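As a rough illustration of the Gauss-Southwell idea mentioned in the abstract above, the following minimal sketch runs coordinate descent on squared error loss over a small dense feature matrix. It is not the authors' implementation: it enumerates every coordinate explicitly instead of searching the all-subsequence space with a gradient bound, and the function name `gauss_southwell_cd` and the toy data are assumptions made for this example.

```python
def gauss_southwell_cd(X, y, n_iters=100):
    """Least-squares fit by coordinate descent with Gauss-Southwell selection.

    X is a list of rows (dense features), y a list of numeric targets.
    Loss: 0.5 * sum_i (x_i . w - y_i)^2.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    # residual_i = x_i . w - y_i; with w = 0 this is simply -y_i.
    residual = [-yi for yi in y]
    # Squared column norms give the exact step size along each coordinate.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(d)]

    for _ in range(n_iters):
        # Gradient of the loss with respect to every coordinate.
        grad = [sum(X[i][j] * residual[i] for i in range(n)) for j in range(d)]
        # Gauss-Southwell rule: pick the coordinate with the largest |gradient|.
        j = max(range(d), key=lambda k: abs(grad[k]))
        if abs(grad[j]) < 1e-12 or col_sq[j] == 0.0:
            break  # (near-)stationary point reached
        # Exact minimisation of the quadratic loss along coordinate j.
        step = grad[j] / col_sq[j]
        w[j] -= step
        for i in range(n):
            residual[i] -= step * X[i][j]
    return w


# Toy data (hypothetical): targets generated exactly by w = [1, 2].
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]
w = gauss_southwell_cd(X, y)
```

Computing the full gradient at every step is only feasible here because the toy problem has two features; the abstract's contribution is precisely a bound that avoids this enumeration when the features are all subsequences.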
566
- Publication: Constructing Subsumption Hierarchies of Web Queries (2015-06-04)
In this work, we present an approach for automatically identifying subsumption relations between web queries, a difficult (due to feature sparseness and ambiguity) but extremely useful task for many applications, ranging from user profiling and semantic enhancement of query logs to traffic minimisation in distributed search environments (e.g., federations of digital libraries or cloud-based systems). We start by matching each query to the topics of a comprehensive web directory, and use these topics to apply query expansion in an iterative fashion. Subsequently, all expanded queries are mapped onto the DMOZ hierarchy, and the resulting subsumption relations are directly inferred from the directory structure once conflicts in the hierarchy are resolved. We evaluate our technique on real-world queries, and show that our approach is effective under all settings.
131
- Publication: Real time News Story Detection and Tracking with Hashtags
Topic Detection and Tracking (TDT) is an important research topic in data mining and information retrieval and has been explored for many years. Most of the studies have approached the problem from the event tracking point of view. We argue that defining stories as events does not reflect the full picture. In this work we propose a story tracking method built on crowd-tagging in social media, where news articles are labeled with hashtags in real time. The social tags act as rich metadata for news articles, with the advantage that, if carefully employed, they can capture emerging concepts and address concept drift in a story. We present an approach for employing social tags for the purpose of story detection and tracking and show initial empirical results. We compare our method to classic keyword query retrieval and discuss an example of story tracking over time.
544
- Publication: Analyzing the impact of electricity price forecasting on energy cost-aware scheduling
Energy cost-aware scheduling, i.e., scheduling that adapts to real-time energy price volatility, can save large energy consumers millions of dollars every year in electricity costs. Energy price forecasting coupled with energy price-aware scheduling is a step toward this goal. In this work, we study cost-aware schedules and the effect of various price forecasting schemes on the end schedule cost. We show that simply optimizing price forecasts based on classical regression error metrics (e.g., Mean Squared Error) does not work well for scheduling. Price forecasts that do result in significantly better schedules optimize a combination of metrics, each having a different impact on the end schedule cost. For example, both price estimation and price ranking are important for scheduling, but they carry different weight. We consider day-ahead energy price forecasting using the Irish Single Electricity Market as a case study, and test our price forecasts for two real-world scheduling applications: animal feed manufacturing and home energy management systems. We show that price forecasts that co-optimize price estimation and price ranking result in significant energy-cost savings. We believe our results are relevant for many real-life scheduling applications that are currently plagued with very large energy bills.
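The point that a low regression error does not guarantee a cheap schedule can be seen in a toy example. The sketch below is an illustration only, not the paper's method: the prices are made up, and the scheduler (a hypothetical `schedule_cost` helper) simply runs a unit load in the `k` hours the forecast ranks cheapest and pays the actual price.

```python
def mse(forecast, actual):
    """Mean squared error between forecast and actual prices."""
    return sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual)


def schedule_cost(forecast, actual, k):
    """Run a unit load in the k hours the forecast ranks cheapest,
    then pay the actual price for those hours."""
    chosen = sorted(range(len(forecast)), key=lambda h: forecast[h])[:k]
    return sum(actual[h] for h in chosen)


# Hypothetical hourly prices, chosen only to make the effect visible.
actual = [50.0, 40.0, 30.0, 60.0]

# Forecast A: small errors, but it swaps the order of the two cheapest hours.
forecast_a = [44.0, 36.0, 37.0, 58.0]
# Forecast B: every price is off by 20, but the ranking of hours is exact.
forecast_b = [70.0, 60.0, 50.0, 80.0]

mse_a, mse_b = mse(forecast_a, actual), mse(forecast_b, actual)
cost_a = schedule_cost(forecast_a, actual, k=1)
cost_b = schedule_cost(forecast_b, actual, k=1)
```

Here forecast A has the lower Mean Squared Error, yet forecast B yields the cheaper schedule because it ranks the hours correctly. This matches the abstract's observation that price estimation and price ranking both matter but carry different weight.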
570
Scopus© Citations 19
- Publication: Topy: Real-time Story Tracking via Social Tags
The Topy system automates real-time story tracking by utilizing crowd-sourced tagging on social media platforms. Topy employs a state-of-the-art Twitter hashtag recommender to continuously annotate news articles with hashtags, a rich metadata source that allows connecting articles under drastically different timelines than typical keyword-based story tracking systems. Employing social tags for story tracking has the following advantages: (1) social annotation of news enables the detection of emerging concepts and topic drift in a story; (2) hashtags go beyond topics by grouping articles based on connected themes (e.g., #rip, #blacklivesmatter, #icantbreath); (3) hashtags link articles that focus on subplots of the same story (e.g., #palmyra, #isis, #refugeecrisis).
637
Scopus© Citations 3
- Publication: Learning-to-Rank for Real-Time High-Precision Hashtag Recommendation for Streaming News
We address the problem of real-time recommendation of streaming Twitter hashtags to an incoming stream of news articles. The technical challenge can be framed as large-scale topic classification where the set of topics (i.e., hashtags) is huge and highly dynamic. Our main applications come from digital journalism, e.g., for promoting original content to Twitter communities and for social indexing of news to enable better retrieval, story tracking and summarisation. In contrast to state-of-the-art methods that focus on modelling each individual hashtag as a topic, we propose a learning-to-rank approach for modelling hashtag relevance, and present methods to extract time-aware features from highly dynamic content. We present the data collection and processing pipeline, as well as our methodology for achieving low-latency, high-precision recommendations. Our empirical results show that our method outperforms the state of the art, delivering more than 80% precision. Our techniques are implemented in a real-time system, and are currently under user trial with a big news organisation.
2460
Scopus© Citations 30