Handling Noisy Constraints in Semi-supervised Overlapping Community Finding
Files in This Item:
|insight_publication.pdf||76.9 kB||Adobe PDF||Request a copy|
|Title:||Handling Noisy Constraints in Semi-supervised Overlapping Community Finding|
|Authors:||Alghamdi, Elham; Rushe, Ellen; Hossein Zadeh Bazargani, Mehran; MacNamee, Brian; Greene, Derek|
|Permanent link:||http://hdl.handle.net/10197/11290|
|Date:||12-Dec-2019|
|Online since:||2020-02-27T11:24:00Z|
|Abstract:||Community structure is an essential property that helps us to understand the nature of complex networks. Since algorithms for detecting communities are unsupervised in nature, they can fail to uncover useful groupings, particularly when the underlying communities in a network are highly overlapping. Recent work has sought to address this via semi-supervised learning, using a human annotator or “oracle” to provide limited supervision. This knowledge is typically encoded in the form of must-link and cannot-link constraints, which indicate that a pair of nodes should always be or should never be assigned to the same community. In this way, we can uncover communities which are otherwise difficult to identify via unsupervised techniques. However, in real semi-supervised learning applications, human supervision may be unreliable or “noisy”, relying on subjective decision making. Annotators can disagree with one another, they might only have limited knowledge of a domain, or they might simply complete a labeling task incorrectly due to the burden of annotation. Thus, we might reasonably expect that the pairwise constraints used in a real semi-supervised community detection task could be imperfect or conflicting. The aim of this study is to explore the effect of noisy, incorrectly-labeled constraints on the performance of semi-supervised community finding algorithms for overlapping networks. Furthermore, we propose an approach to mitigate such cases in real-world network analysis tasks. We treat noisy pairwise constraints as anomalies, and use an autoencoder, a commonly used method in the domain of anomaly detection, to identify such constraints. Initial experiments on synthetic networks demonstrate the usefulness of this approach.|
|Funding Details:||Science Foundation Ireland|
|Type of material:||Conference Publication|
|Keywords:||Machine Learning & Statistics; Community structure; Complex networks|
|Other versions:||https://www.complexnetworks.org/|
|Language:||en|
|Status of Item:||Peer reviewed|
|Conference Details:||The 8th International Conference on Complex Networks and their Applications (Complex Networks 2019), Lisbon, Portugal, 10-12 December 2019|
|Appears in Collections:||Computer Science Research Collection; Insight Research Collection|
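The must-link and cannot-link constraints described in the abstract, and the way an unreliable annotator corrupts them, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy `memberships` mapping, the `noisy_oracle` helper, and the rule that two nodes are must-link when they share at least one community are all assumptions made here for illustration.

```python
import random
from itertools import combinations

# Hypothetical toy ground truth: node -> set of community ids.
# Node "c" belongs to two communities, so the communities overlap.
memberships = {
    "a": {0}, "b": {0}, "c": {0, 1}, "d": {1}, "e": {1}, "f": {2},
}


def true_label(u, v):
    """Assumed rule: must-link if the nodes share a community, else cannot-link."""
    return "must-link" if memberships[u] & memberships[v] else "cannot-link"


def noisy_oracle(pairs, noise_rate, seed=0):
    """Return {pair: label}, flipping each true label with probability noise_rate."""
    rng = random.Random(seed)
    flip = {"must-link": "cannot-link", "cannot-link": "must-link"}
    constraints = {}
    for u, v in pairs:
        label = true_label(u, v)
        if rng.random() < noise_rate:
            label = flip[label]  # simulate an annotator mistake
        constraints[(u, v)] = label
    return constraints


pairs = list(combinations(sorted(memberships), 2))
clean = noisy_oracle(pairs, noise_rate=0.0)
noisy = noisy_oracle(pairs, noise_rate=0.3)

# Constraints whose noisy label disagrees with the ground truth.
mislabeled = [p for p in pairs if noisy[p] != clean[p]]
print(f"{len(mislabeled)} of {len(pairs)} constraints are noisy")
```

In this sketch the mislabeled pairs are known only because the ground truth is available. The approach outlined in the abstract instead treats such pairs as anomalies to be identified with an autoencoder, a step this example does not attempt to reproduce.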
This item is available under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Ireland licence. No item may be reproduced for commercial purposes. For other possible restrictions on use, please refer to the publisher's URL where this is made available, or to notes contained in the item itself. Other terms may apply.