Distributed Spatial Data Clustering as a New Approach for Big Data Analysis

Files in This Item:
insight_publication.pdf (739.72 kB, Adobe PDF)
Title: Distributed Spatial Data Clustering as a New Approach for Big Data Analysis
Authors: Bendechache, Malika; Le-Khac, Nhien-An; Kechadi, Tahar
Permanent link: http://hdl.handle.net/10197/9649
Date: 20-Aug-2017
Online since: 2019-03-21T15:28:48Z
Abstract: In this paper we propose a new approach for Big Data mining and analysis. The approach works well on distributed datasets and addresses the data clustering task of the analysis. It consists of two main phases: the first phase executes a clustering algorithm on local data, assuming that the datasets were already distributed among the system's processing nodes. The second phase aggregates the local clusters to generate global clusters. This approach not only generates local clusters on each processing node in parallel, but also facilitates the formation of global clusters without prior knowledge of the number of clusters, which many partitioning clustering algorithms require. In this study, the approach was applied to spatial datasets. The proposed aggregation phase is very efficient and does not involve the exchange of large amounts of data between the processing nodes. The experimental results show that the approach has super-linear speed-up, scales up very well, and can take advantage of recent programming models, such as the MapReduce model, as its results are not affected by the types of communications.
Funding Details: Science Foundation Ireland
Type of material: Conference Publication
Publisher: Springer
Journal: Communications in Computer and Information Science
Volume: 845
Start page: 38
End page: 56
Copyright (published version): 2018 Springer Nature Singapore
Keywords: Distributed data mining; Distributed computing; Synchronous communication; Asynchronous communication; Super-speedup; Spatial data mining
DOI: 10.1007/978-981-13-0292-3
Other versions: http://ausdm17.azurewebsites.net/
https://arxiv.org/
Language: en
Status of Item: Peer reviewed
Conference Details: The 15th Australasian Data Mining Conference, Melbourne, Australia, 19-20 August 2017
Appears in Collections: Insight Research Collection
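
The two-phase scheme described in the abstract (local clustering on each node, then aggregation of compact cluster summaries) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not name the local clustering method or the summary that nodes exchange. The distance-threshold grouping, the bounding-box summary, the `eps` parameter, and all function names below are assumptions chosen only to show the data flow. Note that neither phase needs the number of clusters in advance, and only one small box per local cluster crosses node boundaries.

```python
from itertools import combinations
from math import dist, hypot


def connected_groups(n, linked):
    """Return groups of indices 0..n-1 that are connected through linked(i, j)."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if linked(i, j):
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def local_phase(points, eps):
    """Phase 1 (run on each node): cluster local 2-D points and
    return one bounding box (xmin, ymin, xmax, ymax) per local cluster."""
    groups = connected_groups(
        len(points), lambda i, j: dist(points[i], points[j]) <= eps
    )
    boxes = []
    for g in groups:
        xs = [points[i][0] for i in g]
        ys = [points[i][1] for i in g]
        boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes


def box_gap(a, b):
    """Euclidean gap between two axis-aligned boxes (0 if they touch or overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return hypot(dx, dy)


def global_phase(boxes, eps):
    """Phase 2: merge the boxes received from all nodes whose gap is at most eps."""
    merged = []
    for g in connected_groups(
        len(boxes), lambda i, j: box_gap(boxes[i], boxes[j]) <= eps
    ):
        merged.append((
            min(boxes[i][0] for i in g), min(boxes[i][1] for i in g),
            max(boxes[i][2] for i in g), max(boxes[i][3] for i in g),
        ))
    return merged


# Hypothetical data held by two processing nodes.
node_a = [(0.0, 0.0), (0.5, 0.2), (1.0, 0.1), (9.0, 9.0)]
node_b = [(1.6, 0.3), (2.1, 0.0), (20.0, 20.0), (20.4, 20.1)]

# Each node sends only its box summaries; raw points stay local.
summaries = local_phase(node_a, eps=1.0) + local_phase(node_b, eps=1.0)
global_clusters = global_phase(summaries, eps=1.0)
print(global_clusters)
```

In this toy run, the cluster that straddles the two nodes is reunited in the second phase, while the isolated groups remain separate. A bounding box is a coarse summary and can merge clusters that a point-level comparison would keep apart, so a tighter representation would be needed in practice.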

This item is available under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Ireland licence. No item may be reproduced for commercial purposes. For other possible restrictions on use, please refer to the publisher's URL where this is made available, or to notes contained in the item itself. Other terms may apply.