  • Publication
    An Investigation into Information Navigation via Diverse Keyword-based Facets
    In the age of information overload, it is necessary to provide effective information navigation tools that operate over unstructured textual data. Current state-of-the-art methods are limited in terms of providing effective exploration capabilities for various information seeking tasks, especially those arising in domains such as online journalism. Here we argue for improvements in faceted search systems, via new strategies for identifying keyword-based facets. Our proposed technique utilises a PageRank model operating over the graph of terms appearing in documents, while also employing novel methods for biasing significant terms and named entities. In addition, we consider the notion of diversity within extracted keywords in an effort to maximize coverage over a range of topics. We perform experimental evaluations over issues relevant to the Irish General Elections 2016, demonstrating the superior performance of our proposed technique.
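    The biased PageRank idea described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the co-occurrence window, the `bias` weights, and the function names `build_term_graph` and `biased_pagerank` are all assumptions made for the example.

```python
from collections import defaultdict


def build_term_graph(documents, window=1):
    """Link each term to the `window` terms that follow it (undirected)."""
    graph = defaultdict(set)
    for tokens in documents:
        for i, term in enumerate(tokens):
            for other in tokens[i + 1 : i + 1 + window]:
                if other != term:
                    graph[term].add(other)
                    graph[other].add(term)
    return graph


def biased_pagerank(graph, bias=None, damping=0.85, iterations=50):
    """Power-iteration PageRank whose teleport step favours biased terms.

    `bias` maps a term to a positive weight (default 1.0); this weighting
    scheme is a hypothetical stand-in for the paper's biasing methods.
    """
    bias = bias or {}
    nodes = sorted(graph)
    total = sum(bias.get(n, 1.0) for n in nodes)
    teleport = {n: bias.get(n, 1.0) / total for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Every node in the graph has at least one neighbour by construction.
            incoming = sum(rank[m] / len(graph[m]) for m in graph[n])
            new_rank[n] = (1 - damping) * teleport[n] + damping * incoming
        rank = new_rank
    return rank


# Toy tokenised documents (invented for illustration).
docs = [
    ["election", "candidate", "vote"],
    ["candidate", "party", "vote"],
    ["party", "election", "budget"],
]
term_graph = build_term_graph(docs)
plain = biased_pagerank(term_graph)
boosted = biased_pagerank(term_graph, bias={"budget": 5.0})
```

    Raising the teleport weight of a term (for instance a named entity) lifts its score relative to the unbiased run, which is the effect the abstract attributes to biasing.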
  • Publication
    EVE: explainable vector based embedding technique using Wikipedia
    (Springer, 2019-08)
    We present an unsupervised explainable vector embedding technique, called EVE, which is built upon the structure of Wikipedia. The proposed model defines the dimensions of a semantic vector representing a concept using human-readable labels, so the vector is readily interpretable. Specifically, each vector is constructed using the Wikipedia category graph structure together with the Wikipedia article link structure. To test the effectiveness of the proposed model, we consider its usefulness in three fundamental tasks: 1) intruder detection, to evaluate its ability to identify a non-coherent vector from a list of coherent vectors; 2) ability to cluster, to evaluate its tendency to group related vectors together while keeping unrelated vectors in separate clusters; and 3) sorting relevant items first, to evaluate its ability to rank vectors (items) relevant to the query at the top of the results. For each task, we also propose a strategy to generate a task-specific human-interpretable explanation from the model. These demonstrate the overall effectiveness of the explainable embeddings generated by EVE. Finally, we compare EVE with the Word2Vec, FastText, and GloVe embedding techniques across the three tasks, and report improvements over the state-of-the-art.
  • Publication
    Lit@EVE: Explainable Recommendation based on Wikipedia Concept Vectors
    (Springer, 2017-12-30)
    We present an explainable recommendation system for novels and authors, called Lit@EVE, which is based on Wikipedia concept vectors. In this system, each novel or author is treated as a concept whose definition is extracted as a concept vector through the application of an explainable word embedding technique called EVE. Each dimension of the concept vector is labelled as either a Wikipedia article or a Wikipedia category name, making the vector representation readily interpretable. In order to recommend items, the Lit@EVE system uses these vectors to compute similarity scores between a target novel or author and all other candidate items. Finally, the system generates an ordered list of suggested items by showing the most informative features as human-readable labels, thereby making the recommendation explainable.
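    The similarity-and-explanation step described above might look like the sketch below. It assumes cosine similarity over sparse, labelled vectors and treats the shared dimensions with the largest weight product as the "most informative features"; both choices, the item names, and the vector values are hypothetical and not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(target, candidates, top_n=3, top_features=2):
    """Rank candidates by similarity and attach the labels that explain each match."""
    results = []
    for name, vec in candidates.items():
        # Contribution of each shared, human-readable dimension to the dot product.
        shared = {k: target[k] * vec[k] for k in target if k in vec}
        features = sorted(shared, key=shared.get, reverse=True)[:top_features]
        results.append((name, cosine(target, vec), features))
    results.sort(key=lambda r: r[1], reverse=True)
    return results[:top_n]


# Invented concept vectors; dimension labels mimic Wikipedia category names.
target_vector = {"Category:Gothic fiction": 0.9, "Category:Victorian novels": 0.4}
candidate_vectors = {
    "novel_a": {"Category:Gothic fiction": 0.8, "Category:Horror novels": 0.3},
    "novel_b": {"Category:Victorian novels": 0.7, "Category:Social novels": 0.6},
}
ranked = recommend(target_vector, candidate_vectors)
```

    Each entry in `ranked` carries the item name, its similarity score, and the labels that contributed most to the match, so the list itself doubles as the explanation.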
  • Publication
    TwitterCracy: Exploratory Monitoring of Twitter Streams for the 2016 U.S. Presidential Election Cycle
    We present TwitterCracy, an exploratory search system that allows users to search and monitor across the Twitter streams of political entities. Its exploratory capabilities stem from applying lightweight time-series-based clustering together with biased PageRank to extract facets from tweets and present them in a manner that facilitates exploration.
  • Publication
    TweetCric: A Twitter-based Accountability Mechanism for Cricket
    This paper demonstrates a Web service called TweetCric to uncover cricket insights from Twitter with the aim of facilitating sports analysts and journalists. It essentially arranges crowdsourced Twitter data about a team in comprehensive visualizations by incorporating domain-specific approaches to sentiment analysis.
  • Publication
    ZeChipC: Time Series Interpolation Method Based on Lebesgue Sampling
    In this paper, we present an interpolation method based on Lebesgue sampling that could help to develop systems based on time series more efficiently. Our method can transmit time series, frequently used in health monitoring, with the same level of accuracy but using far less data. Our method is based on Lebesgue sampling, which collects information depending on the values of the signal (e.g. the signal output is sampled when it crosses specific limits). Lebesgue sampling contains additional information about the shape of the signal in between two sampled points. Using this information allows generating an interpolated signal closer to the original one. In our contribution, we propose a novel time-series interpolation method designed explicitly for Lebesgue sampling called ZeChipC. ZeChipC is a combination of Zero-order hold and Piecewise Cubic Hermite Interpolating Polynomial (PCHIP) interpolation. ZeChipC includes new functionality to adapt the reconstructed signal to concave/convex regions. The proposed methods have been compared with state-of-the-art interpolation methods using Lebesgue sampling and have offered higher average performance.
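    To make the sampling idea concrete, the sketch below shows one common form of Lebesgue (event-based) sampling, a send-on-delta rule, followed by a plain zero-order hold reconstruction. It does not implement ZeChipC itself: the PCHIP stage and the concave/convex adaptation are omitted, and the `delta` threshold and function names are assumptions for illustration.

```python
def lebesgue_sample(signal, delta):
    """Keep (index, value) only when the value has moved by at least
    `delta` since the last kept sample (send-on-delta rule)."""
    samples = [(0, signal[0])]
    for i, value in enumerate(signal[1:], start=1):
        if abs(value - samples[-1][1]) >= delta:
            samples.append((i, value))
    return samples


def zero_order_hold(samples, length):
    """Rebuild `length` points by holding each sample until the next one."""
    reconstructed = []
    position = 0
    for i in range(length):
        # Sample indices strictly increase, so one step per point is enough.
        if position + 1 < len(samples) and samples[position + 1][0] <= i:
            position += 1
        reconstructed.append(samples[position][1])
    return reconstructed


# Invented signal values for illustration.
signal = [0.0, 0.1, 0.2, 0.6, 0.7, 1.3, 1.2, 0.4]
samples = lebesgue_sample(signal, delta=0.5)
held = zero_order_hold(samples, len(signal))
max_error = max(abs(a - b) for a, b in zip(signal, held))
```

    Because a new sample is emitted whenever the signal drifts `delta` away from the last one, the held reconstruction never deviates from the original by `delta` or more, which is the extra shape information the abstract refers to.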
  • Publication
    Valve Health Identification Using Sensors and Machine Learning Methods
    Predictive maintenance models attempt to identify developing issues with industrial equipment before they become critical. In this paper, we describe both supervised and unsupervised approaches to predictive maintenance for subsea valves in the oil and gas industry. The supervised approach is appropriate for valves for which a long history of operation along with manual assessments of the state of the valves exists, while the unsupervised approach is suitable to address the cold start problem when new valves, for which we do not have an operational history, come online. For the supervised prediction problem, we attempt to distinguish between healthy and unhealthy valve actuators using sensor data measuring hydraulic pressures and flows during valve opening and closing events. Unlike previous approaches that rely solely on raw sensor data, we derive frequency and time domain features, and experiment with a range of classification algorithms and different feature subsets. The best-performing models for the supervised approach were found to be AdaBoost and Random Forest ensembles. In the unsupervised approach, the goal is to detect abrupt changes in valve behaviour by comparing the sensor readings from consecutive opening or closing events. Our novel methodology does this by comparing the sequences of sensor readings captured during these events, using raw sensor readings as well as normalised and first-derivative versions of the sequences. We evaluate the effectiveness of a number of well-known time series similarity measures and find that using discrete Fréchet distance or dynamic time warping leads to the best results, with the Bray-Curtis similarity measure leading to only marginally poorer change detection but requiring considerably less computational effort.
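    The unsupervised comparison of consecutive events can be sketched as below, using textbook dynamic time warping and the standard Bray-Curtis dissimilarity. The `threshold` values, the `detect_changes` helper, and the sample readings are hypothetical; the paper's own feature versions (normalised, first derivative) and decision rule are not reproduced here.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with absolute-difference cost."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[len(a)][len(b)]


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity for equal-length sequences; linear time."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(abs(x + y) for x, y in zip(a, b))
    return numerator / denominator if denominator else 0.0


def detect_changes(events, distance, threshold):
    """Return indices of events that differ from their predecessor by more than `threshold`."""
    return [
        i
        for i in range(1, len(events))
        if distance(events[i - 1], events[i]) > threshold
    ]


# Invented sensor sequences for three consecutive valve events.
events = [
    [1.0, 2.0, 3.0, 2.0, 1.0],
    [1.0, 2.0, 3.0, 2.0, 1.0],
    [1.0, 1.0, 1.0, 1.0, 1.0],
]
dtw_flags = detect_changes(events, dtw_distance, threshold=1.0)
bc_flags = detect_changes(events, bray_curtis, threshold=0.1)
```

    Dynamic time warping fills a full cost matrix for every pair of events, whereas Bray-Curtis is a single pass over the two sequences, which is consistent with the abstract's remark about computational effort.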
  • Publication
    Topy: Real-time Story Tracking via Social Tags
    The Topy system automates real-time story tracking by utilizing crowd-sourced tagging on social media platforms. Topy employs a state-of-the-art Twitter hashtag recommender to continuously annotate news articles with hashtags, a rich meta-data source that allows connecting articles under drastically different timelines than typical keyword-based story tracking systems. Employing social tags for story tracking has the following advantages: (1) social annotation of news enables the detection of emerging concepts and topic drift in a story; (2) hashtags go beyond topics by grouping articles based on connected themes (e.g., #rip, #blacklivesmatter, #icantbreath); (3) hashtags link articles that focus on subplots of the same story (e.g., #palmyra, #isis, #refugeecrisis).