  • Publication
    UCD : Diachronic Text Classification with Character, Word, and Syntactic N-grams
    We present our submission to SemEval-2015 Task 7: Diachronic Text Evaluation, in which we approach the task of assigning a date to a text as a multi-class classification problem. We extract n-gram features from the text at the letter, word, and syntactic level, and use these to train a classifier on date-labeled training data. We also incorporate date probabilities of syntactic features as estimated from a very large external corpus of books. Our system achieved the highest performance of all systems on subtask 2: identifying texts by specific time language use.
  • Publication
    Helping News Editors Write Better Headlines: A Recommender to Improve the Keyword Contents and Shareability of News Headlines
    We present a software tool that employs state-of-the-art natural language processing (NLP) and machine learning techniques to help newspaper editors compose effective headlines for online publication. The system identifies the most salient keywords in a news article and ranks them based on both their overall popularity and their direct relevance to the article. The system also uses a supervised regression model to identify headlines that are likely to be widely shared on social media. The user interface is designed to simplify and speed the editor’s decision process on the composition of the headline. As such, the tool provides an efficient way to combine the benefits of automated predictors of engagement and search-engine optimization (SEO) with human judgments of overall headline quality.
  • Publication
    Discovering News Events That Move Markets
    Recently, there has been an explosion of interest in the use of textual sources (e.g., market reports, news articles, company reports) to predict changes in stock and commodity markets. Most of this research is on sentiment analysis, but some of it has tried to use the news itself to predict market movements. In this paper, we use 10 years of news articles – from a weekly, agricultural, trade newspaper – to predict price changes in a commodity market for beef. Two experiments explore the different ways in which news reports affect the market via (i) major market-impacting events (i.e., rare natural disasters or food scandals) or (ii) minor market-impacting events (e.g., mundane reports about inflation, oil prices, etc.). We find that different techniques need to be used to uncover major events (e.g., LLRs) as opposed to minor events (e.g., classifiers) and show that no single unified predictive model appears to be able to do both.
  • Publication
    Temporal Word Analogies: Identifying Lexical Replacement with Diachronic Word Embeddings
    (Association for Computational Linguistics, 2017-08-04)
    This paper introduces the concept of temporal word analogies: pairs of words which occupy the same semantic space at different points in time. One well-known property of word embeddings is that they are able to effectively model traditional word analogies ("word w1 is to word w2 as word w3 is to word w4") through vector addition. Here, I show that temporal word analogies ("word w1 at time tα is like word w2 at time tβ") can effectively be modeled with diachronic word embeddings, provided that the independent embedding spaces from each time period are appropriately transformed into a common vector space. When applied to a diachronic corpus of news articles, this method is able to identify temporal word analogies such as "Ronald Reagan in 1987 is like Bill Clinton in 1997", or "Walkman in 1987 is like iPod in 2007".
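The temporal word analogy lookup described in the last abstract can be illustrated with a minimal sketch. It assumes the embedding spaces for the two time periods have already been transformed into a common vector space (the abstract does not say how, and that step is not shown here). The function names and the three-dimensional toy vectors are hypothetical, invented purely for illustration; they are not taken from the paper or its data.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def temporal_analogy(word, space_a, space_b):
    """Return the word in space_b whose vector is closest to `word` in space_a.

    Assumes both spaces are already aligned to a common vector space.
    """
    query = space_a[word]
    return max(space_b, key=lambda w: cosine(query, space_b[w]))

# Hypothetical toy vectors (made up for this example, not real embeddings).
space_1987 = {"walkman": [0.9, 0.1, 0.0]}
space_2007 = {
    "ipod": [0.8, 0.2, 0.1],
    "walkman": [0.1, 0.9, 0.1],
    "newspaper": [0.1, 0.1, 0.9],
}

print(temporal_analogy("walkman", space_1987, space_2007))  # ipod
```

With these toy vectors, the 1987 vector for "walkman" lies closest to the 2007 vector for "ipod", mirroring the "Walkman in 1987 is like iPod in 2007" example. On real embeddings the result depends entirely on the quality of the alignment between time periods.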