Prioritized Relationship Analysis in Heterogeneous Information Networks

DC FieldValueLanguage
dc.contributor.authorLiang, Jiongqian-
dc.contributor.authorAjwani, Deepak-
dc.contributor.authorNicholson, Patrick K.-
dc.contributor.authoret al.- ACMen_US
dc.identifier.citationACM Transactions on Knowledge Discovery from Dataen_US
dc.description.abstractAn increasing number of applications are modeled and analyzed in network form, where nodes represent entities of interest and edges represent interactions or relationships between entities. Commonly, such relationship analysis tools assume homogeneity in both node type and edge type. Recent research has sought to redress the assumption of homogeneity and focused on mining heterogeneous information networks (HINs) where both nodes and edges can be of different types. Building on such efforts, in this work, we articulate a novel approach for mining relationships across entities in such networks while accounting for user preference over relationship type and interestingness metric. We formalize the problem as a top-k lightest paths problem, contextualized in a real-world communication network, and seek to find the k most interesting path instances matching the preferred relationship type. Our solution, PROphetic HEuristic Algorithm for Path Searching (PRO-HEAPS), leverages a combination of novel graph preprocessing techniques, well-designed heuristics and the venerable A* search algorithm. We run our algorithm on real-world large-scale graphs and show that our algorithm significantly outperforms a wide variety of baseline approaches with speedups as large as 100X. To widen the range of applications, we also extend PRO-HEAPS to (i) support relationship analysis between two groups of entities and (ii) allow pattern path in the query to contain logical statements with operators AND, OR, NOT, and wild-card “.”. We run experiments using this generalized version of PRO-HEAPS and demonstrate that the advantage of PRO-HEAPS becomes even more pronounced for these general cases. Furthermore, we conduct a comprehensive analysis to study how the performance of PRO-HEAPS varies with respect to various attributes of the input HIN. We finally conduct a case study to demonstrate valuable applications of our algorithm.en_US
dc.rights© ACM, 2018. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Knowledge Discovery from Data, {12, 3, (2018)}
dc.subjectHeterogeneous information networksen_US
dc.subjectSemantic relationship queriesen_US
dc.subjectGraph algorithmsen_US
dc.subjectInformation systemsen_US
dc.subjectData miningen_US
dc.subjectSocial networksen_US
dc.titlePrioritized Relationship Analysis in Heterogeneous Information Networksen_US
dc.typeJournal Articleen_US
dc.statusPeer revieweden_US
dc.neeo.contributorNicholson|Patrick K.|aut|-
dc.neeo.contributoret al.||aut|-
dc.description.othersponsorshipNational Science Foundation (US)en_US
item.fulltextWith Fulltext-
Appears in Collections:Computer Science Research Collection
Files in This Item:
File Description SizeFormat 
ajwani_tkdd17.pdf18.23 MBAdobe PDFDownload
Show simple item record

Google ScholarTM



This item is available under the Attribution-NonCommercial-NoDerivs 3.0 Ireland. No item may be reproduced for commercial purposes. For other possible restrictions on use please refer to the publisher's URL where this is made available, or to notes contained in the item itself. Other terms may apply.