  • Publication
    Personalized, Health-Aware Recipe Recommendation: An Ensemble Topic Modeling Based Approach
    Food choices are personal and complex and have a significant impact on our long-term health and quality of life. By helping users to make informed and satisfying decisions, Recommender Systems (RS) have the potential to support users in making healthier food choices. Intelligent user modeling is a key challenge in achieving this potential. This paper investigates Ensemble Topic Modeling (EnsTM) based Feature Identification techniques for efficient user modeling and recipe recommendation. It builds on findings in EnsTM to propose a reduced data representation format and a smart user-modeling strategy that makes capturing user preferences fast, efficient and interactive. This approach enables personalization, even in a cold-start scenario. We compared three EnsTM based variations through a user study with 48 participants, using a large-scale, real-world corpus of 230,876 recipes, and compared them against a conventional Content Based (CB) approach. EnsTM based recommenders performed significantly better than the CB approach. Besides acknowledging multi-domain contents such as taste, demographics and cost, our proposed approach also considers users’ nutritional preferences and assists them in finding recipes under diverse nutritional categories. Furthermore, it provides excellent coverage and enables implicit understanding of users’ food practices. Subsequent analysis also exposed correlations between certain features and a healthier lifestyle.
  • Publication
    Anomaly Detection in Raw Audio Using Deep Autoregressive Networks
    (IEEE, 2019-05-17)
    Anomaly detection involves the recognition of patterns outside of what is considered normal, given a certain set of input data. This presents a unique set of challenges for machine learning, particularly if we assume a semi-supervised scenario in which anomalous patterns are unavailable at training time, meaning algorithms must rely on non-anomalous data alone. Anomaly detection in time series adds a further level of complexity given the contextual nature of anomalies. For time series modelling, autoregressive deep learning architectures such as WaveNet have proven to be powerful generative models, specifically in the field of speech synthesis. In this paper, we propose to extend the use of this type of architecture to anomaly detection in raw audio. In experiments using multiple audio datasets, we compare the performance of this approach to a baseline autoencoder model and show superior performance in almost all cases.
  • Publication
    Overlapping community finding with noisy pairwise constraints
    In many real applications of semi-supervised learning, the guidance provided by a human oracle might be “noisy” or inaccurate. Human annotators will often be imperfect, in the sense that they can make subjective decisions, they might only have partial knowledge of the task at hand, or they may simply complete a labeling task incorrectly due to the burden of annotation. Similarly, in the context of semi-supervised community finding in complex networks, information encoded as pairwise constraints may be unreliable or conflicting due to the human element in the annotation process. This study aims to address the challenge of handling noisy pairwise constraints in overlapping semi-supervised community detection, by framing the task as an outlier detection problem. We propose a general architecture which includes a process to “clean” or filter noisy constraints. Furthermore, we introduce multiple designs for the cleaning process which use different types of outlier detection models, including autoencoders. A comprehensive evaluation is conducted for each proposed methodology, which demonstrates the potential of the proposed architecture for reducing the impact of noisy supervision in the context of overlapping community detection.
  • Publication
    Handling Noisy Constraints in Semi-supervised Overlapping Community Finding
    Community structure is an essential property that helps us to understand the nature of complex networks. Since algorithms for detecting communities are unsupervised in nature, they can fail to uncover useful groupings, particularly when the underlying communities in a network are highly overlapping [1]. Recent work has sought to address this via semi-supervised learning [2], using a human annotator or “oracle” to provide limited supervision. This knowledge is typically encoded in the form of must-link and cannot-link constraints, which indicate that a pair of nodes should always be or should never be assigned to the same community. In this way, we can uncover communities which are otherwise difficult to identify via unsupervised techniques. However, in real semi-supervised learning applications, human supervision may be unreliable or “noisy”, relying on subjective decision making [3]. Annotators can disagree with one another, they might only have limited knowledge of a domain, or they might simply complete a labeling task incorrectly due to the burden of annotation. Thus, we might reasonably expect that the pairwise constraints used in a real semi-supervised community detection task could be imperfect or conflicting. The aim of this study is to explore the effect of noisy, incorrectly-labeled constraints on the performance of semi-supervised community finding algorithms for overlapping networks. Furthermore, we propose an approach to mitigate such cases in real-world network analysis tasks. We treat noisy pairwise constraints as anomalies, and use an autoencoder, a commonly used method in the domain of anomaly detection, to identify such constraints. Initial experiments on synthetic networks demonstrate the usefulness of this approach.
  • Publication
    Deep Context-Aware Novelty Detection
    A common assumption of novelty detection is that the distributions of both “normal” and “novel” data are static. This, however, is often not the case: for example, scenarios where data evolves over time, or where the definition of normal and novel depends on contextual information, both lead to changes in these distributions. This can lead to significant difficulties when attempting to train a model on datasets where the distribution of normal data in one scenario is similar to that of novel data in another scenario. In this paper we propose a context-aware approach to novelty detection for deep autoencoders to address these difficulties. We create a semi-supervised network architecture that utilises auxiliary labels to reveal contextual information and allow the model to adapt to a variety of contexts in which the definitions of normal and novel change. We evaluate our approach on both image data and real-world audio data displaying these characteristics and show that the performance of individually trained models can be achieved in a single model.