University College Dublin
Parallel extraction of Regions-of-Interest from social media data

File(s)
insight_publication.pdf (8.57 MB)
Author(s)
Belcastro, Loris 
Kechadi, Tahar 
Marozzo, Fabrizio 
Pastore, Luca 
et al. 
Uri
http://hdl.handle.net/10197/11699
Date Issued
25 April 2021
Date Available
13 November 2020 (08:40:46Z)
Abstract
Geotagged data gathered from social media can be used to discover places‐of‐interest (PoIs) that have attracted many visitors. Since a PoI is generally identified by the geographical coordinates of a single point, it is hard to match it with people's trajectories. Therefore, we define an area, called region‐of‐interest (RoI), represented by the boundaries of a PoI. The main goal of this study is to discover RoIs from PoIs using spatial data mining techniques. In this paper, we propose a new parallel method for extracting RoIs from social media datasets. It consists of two main steps: (i) automatic keyword extraction and data grouping and (ii) parallel RoI extraction. The first step extracts keywords identifying the PoIs; these keywords are used to group social media items according to the places they refer to. The second step uses a Parallel Clustering Approach (ParCA) on spatial datasets to identify RoIs. ParCA exploits a parallel execution of DBSCAN on subsets of data to generate subclusters on each processing node and then merges overlapping subclusters to form global clusters. ParCA was implemented using the MapReduce model. Experiments performed over a set of PoIs in the city of Rome using social media data show that our approach is highly scalable and reaches an accuracy of 79% in detecting RoIs. On a parallel computer with 50 cores, we obtained a speedup of 52 by processing large datasets divided into 32 splits, compared with the execution time registered when each dataset is not partitioned.
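The abstract describes ParCA only at a high level: run DBSCAN on subsets of the data, then merge overlapping subclusters. The sketch below illustrates that split, cluster, merge pattern and is not the authors' implementation. It assumes round-robin splits, a naive quadratic DBSCAN, and bounding-box overlap as the merge test; none of these details are stated on this page. All function names (`dbscan`, `merge_subclusters`, `parca_sketch`) are hypothetical, and a thread pool stands in for the MapReduce map phase.

```python
from concurrent.futures import ThreadPoolExecutor
from math import dist


def dbscan(points, eps, min_pts):
    """Naive O(n^2) DBSCAN. Returns clusters as lists of points; noise is dropped."""
    labels = {}    # point index -> cluster id, or -1 for noise
    clusters = []

    def neighbours(i):
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    for i in range(len(points)):
        if i in labels:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1           # provisionally noise
            continue
        cid = len(clusters)
        clusters.append([points[i]])
        labels[i] = cid
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels.get(j) == -1:  # former noise becomes a border point
                labels[j] = cid
                clusters[cid].append(points[j])
                continue
            if j in labels:
                continue
            labels[j] = cid
            clusters[cid].append(points[j])
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # core point: keep expanding
                queue.extend(nbrs)
    return clusters


def bbox(cluster):
    xs, ys = zip(*cluster)
    return min(xs), min(ys), max(xs), max(ys)


def overlaps(a, b, margin):
    """True if two (min_x, min_y, max_x, max_y) boxes overlap within a margin."""
    return (a[0] - margin <= b[2] and b[0] - margin <= a[2]
            and a[1] - margin <= b[3] and b[1] - margin <= a[3])


def merge_subclusters(subclusters, margin):
    """Reduce step (assumed criterion): union subclusters whose boxes overlap."""
    parent = list(range(len(subclusters)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    boxes = [bbox(c) for c in subclusters]
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if overlaps(boxes[i], boxes[j], margin):
                parent[find(i)] = find(j)
    merged = {}
    for i, c in enumerate(subclusters):
        merged.setdefault(find(i), []).extend(c)
    return list(merged.values())


def parca_sketch(points, n_splits, eps, min_pts):
    splits = [points[i::n_splits] for i in range(n_splits)]  # round-robin split
    with ThreadPoolExecutor() as pool:                       # stand-in for map phase
        local = list(pool.map(lambda s: dbscan(s, eps, min_pts), splits))
    subclusters = [c for clusters in local for c in clusters]
    return merge_subclusters(subclusters, margin=eps)


# Two synthetic, well-separated 4x4 grids of points (made-up data).
blob_a = [(x * 0.1, y * 0.1) for x in range(4) for y in range(4)]
blob_b = [(10 + x * 0.1, 10 + y * 0.1) for x in range(4) for y in range(4)]
rois = parca_sketch(blob_a + blob_b, n_splits=2, eps=0.25, min_pts=3)
print([bbox(r) for r in rois])
```

In this toy run each split sees only half of every grid, so the local DBSCAN passes produce four subclusters that the merge step joins back into two global clusters. Because each split is sparser than the full dataset, `eps` and `min_pts` would likely need tuning per split size in practice; how the paper handles this is not stated here.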
Sponsorship
Science Foundation Ireland
Other Sponsorship
Insight Research Centre
Type of Material
Journal Article
Publisher
Wiley
Journal
Concurrency and Computation: Practice and Experience
Volume
33
Issue
8
Copyright (Published Version)
2020 Wiley
Keywords
  • Parallel clustering
  • Regions-of-interest
  • RoI mining
  • Scalability
  • Social media analysis...

DOI
10.1002/cpe.5638
Language
English
Status of Item
Peer reviewed
This item is made available under a Creative Commons License
https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
Owning collection
Insight Research Collection
