- ThemeCrowds: Multiresolution Summaries of Twitter Usage (University College Dublin. School of Computer Science and Informatics, 2011-06) Users of social media sites, such as Twitter, rapidly generate large volumes of text content on a daily basis. Visual summaries are needed to understand what groups of people are saying collectively in this unstructured text data. Users will typically discuss a wide variety of topics, where the number of authors talking about a specific topic can quickly grow or diminish over time, and what the collective is saying about the subject can shift as a situation develops. In this paper, we present a technique that summarises what collections of Twitter users are saying about certain topics over time. As the correct resolution for inspecting the data is unknown in advance, the users are clustered hierarchically over a fixed time interval based on the similarity of their posts. The visualisation technique takes this data structure as its input. Given a topic, it finds the correct resolution of users at each time interval and provides tags to summarise what the collective is discussing. The technique is tested on three microblogging corpora, consisting of up to tens of millions of tweets and over a million users. We provide some preliminary user feedback from a research group interested in the area of social media analysis, where this tool could be applied.
- Viewing the minimum dominating set and maximum coverage problems motivated by "word of mouth marketing" in a problem decomposition context (University College Dublin. School of Computer Science and Informatics, 2009) Modelling and analyzing the flow of influence is a key challenge in social network analysis. In scenarios where the network is too large to analyze in detail for computational reasons, graph partitioning is a useful aid to decompose the large graph into manageable subgraphs. The question that arises in such a situation is how to partition a given graph such that the solution obtained by combining the solutions from the individual subgraphs is as close as possible to the optimal solution obtained from the full graph (with respect to a particular objective). While graph cuts such as the min cut, ratio cut and normalised cut are a useful aid in breaking down the large problem into tractable subproblems, they may not yield the optimal graph partitioning with respect to a given objective. A natural question that arises in this scenario is “How close is the solution given by the graph cut to that of the optimal partitioning?” or, in other words, “Are the above graph cuts good heuristics?” In this report we pose the above questions with respect to two graph-theoretic problems, namely the minimum dominating set and maximum coverage. We partition the graphs using the normalised cut and present results that suggest that the normalised cut provides a “good partitioning” with respect to the defined objective.
- Multi-View Clustering for Mining Heterogeneous Social Network Data (University College Dublin. School of Computer Science and Informatics, 2009-03) Uncovering community structure is a core challenge in social network analysis. This is a significant challenge for large networks where there is a single type of relation in the network (e.g. friend or knows). In practice there may be other types of relation, for instance demographic or geographic information, that also reveal network structure. Uncovering structure in such multi-relational networks presents a greater challenge due to the difficulty of integrating information from different, often discordant views. In this paper we describe a system for performing cluster analysis on heterogeneous multi-view data, and present an analysis of the research themes in a bibliographic literature network, based on the integration of both co-citation links and text similarity relationships between papers in the network.
- An Analysis of Current Trends in CBR Research Using Multi-View Clustering (University College Dublin. School of Computer Science and Informatics, 2009-03) The European Conference on Case-Based Reasoning (CBR) in 2008 marked 15 years of international and European CBR conferences where almost seven hundred research papers were published. In this report we review the research themes covered in these papers and identify the topics that are active at the moment. The main mechanism for this analysis is a clustering of the research papers based on both co-citation links and text similarity. It is interesting to note that the core set of papers has attracted citations from almost three thousand papers outside the conference collection, so it is clear that the CBR conferences are a sub-part of a much larger whole. It is remarkable that the research themes revealed by this analysis do not map directly to the sub-topics of CBR that might appear in a textbook. Instead they reflect the applications-oriented focus of CBR research, and cover the promising application areas and research challenges that are faced.
- Mining Features and Sentiment from Review Experiences (Springer, 2013-07-11) Supplementing product information with user-generated content such as ratings and reviews can help to convert browsers into buyers. As a result this type of content is now front and centre for many major e-commerce sites such as Amazon. We believe that this type of content can provide a rich source of valuable information that is useful for a variety of purposes. In this work we are interested in harnessing past reviews to support the writing of new useful reviews, especially for novice contributors. We describe how automatic topic extraction and sentiment analysis can be used to mine valuable information from user-generated reviews, to make useful suggestions to users at review writing time about features that they may wish to cover in their own reviews. We describe the results of a live-user trial to show how the resulting system is capable of delivering high quality reviews that are comparable to the best that sites like Amazon have to offer in terms of information content and helpfulness.