Prioritized Relationship Analysis in Heterogeneous Information Networks
|Title:||Prioritized Relationship Analysis in Heterogeneous Information Networks|
|Authors:||Liang, Jiongqian; Nicholson, Patrick K.|
|Permanent link:||http://hdl.handle.net/10197/10137|
|Date:||Apr-2018|
|Online since:||2019-04-24T13:32:43Z|
|Abstract:||An increasing number of applications are modeled and analyzed in network form, where nodes represent entities of interest and edges represent interactions or relationships between entities. Commonly, such relationship analysis tools assume homogeneity in both node type and edge type. Recent research has sought to redress the assumption of homogeneity and focused on mining heterogeneous information networks (HINs) where both nodes and edges can be of different types. Building on such efforts, in this work, we articulate a novel approach for mining relationships across entities in such networks while accounting for user preference over relationship type and interestingness metric. We formalize the problem as a top-k lightest paths problem, contextualized in a real-world communication network, and seek to find the k most interesting path instances matching the preferred relationship type. Our solution, PROphetic HEuristic Algorithm for Path Searching (PRO-HEAPS), leverages a combination of novel graph preprocessing techniques, well-designed heuristics and the venerable A* search algorithm. We run our algorithm on real-world large-scale graphs and show that our algorithm significantly outperforms a wide variety of baseline approaches with speedups as large as 100X. To widen the range of applications, we also extend PRO-HEAPS to (i) support relationship analysis between two groups of entities and (ii) allow pattern path in the query to contain logical statements with operators AND, OR, NOT, and wild-card “.”. We run experiments using this generalized version of PRO-HEAPS and demonstrate that the advantage of PRO-HEAPS becomes even more pronounced for these general cases. Furthermore, we conduct a comprehensive analysis to study how the performance of PRO-HEAPS varies with respect to various attributes of the input HIN. We finally conduct a case study to demonstrate valuable applications of our algorithm.|
|Type of material:||Journal Article|
|Publisher:||ACM|
|Journal:||ACM Transactions on Knowledge Discovery from Data|
|Volume:||12|
|Issue:||3|
|Copyright (published version):||2018 ACM|
|Keywords:||Heterogeneous information networks; Semantic relationship queries; Graph algorithms; Information systems; Data mining; Social networks|
|DOI:||10.1145/3154401|
|Language:||en|
|Status of Item:||Peer reviewed|
|Appears in Collections:||Computer Science Research Collection|
This item is available under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Ireland licence. No item may be reproduced for commercial purposes. For other possible restrictions on use, please refer to the publisher's URL where this item is made available, or to the notes contained in the item itself. Other terms may apply.