Distributed Spatial Data Clustering as a New Approach for Big Data Analysis

DC Field | Value | Language
dc.contributor.author | Bendechache, Malika | -
dc.contributor.author | Le-Khac, Nhien-An | -
dc.contributor.author | Kechadi, Tahar | -
dc.date.accessioned | 2019-03-21T15:28:48Z | -
dc.date.available | 2019-03-21T15:28:48Z | -
dc.date.copyright | 2018 Springer Nature Singapore | en_US
dc.date.issued | 2017-08-20 | -
dc.identifier.citation | Communications in Computer and Information Science | en_US
dc.identifier.uri | http://hdl.handle.net/10197/9649 | -
dc.description | The 15th Australasian Data Mining Conference, Melbourne, Australia, 19-20 August 2017 | en_US
dc.description.abstract | In this paper we propose a new approach for Big Data mining and analysis. This new approach works well on distributed datasets and deals with the data clustering task of the analysis. The approach consists of two main phases: the first phase executes a clustering algorithm on local data, assuming that the datasets were already distributed among the system processing nodes. The second phase aggregates the local clusters to generate global clusters. This approach not only generates local clusters on each processing node in parallel, but also facilitates the formation of global clusters without prior knowledge of the number of clusters, which many partitioning clustering algorithms require. In this study, the approach was applied to spatial datasets. The proposed aggregation phase is very efficient and does not involve the exchange of large amounts of data between the processing nodes. The experimental results show that the approach has super-linear speed-up, scales up very well, and can take advantage of recent programming models, such as the MapReduce model, as its results are not affected by the types of communications. | en_US
dc.description.sponsorship | Science Foundation Ireland | en_US
dc.language.iso | en | en_US
dc.publisher | Springer | en_US
dc.subject | Distributed data mining | en_US
dc.subject | Distributed computing | en_US
dc.subject | Synchronous communication | en_US
dc.subject | Asynchronous communication | en_US
dc.subject | Super-speedup | en_US
dc.subject | Spatial data mining | en_US
dc.title | Distributed Spatial Data Clustering as a New Approach for Big Data Analysis | en_US
dc.type | Conference Publication | en_US
dc.internal.webversions | http://ausdm17.azurewebsites.net/ | -
dc.internal.webversions | https://arxiv.org/ | -
dc.status | Peer reviewed | en_US
dc.identifier.volume | 845 | en_US
dc.identifier.startpage | 38 | en_US
dc.identifier.endpage | 56 | en_US
dc.identifier.doi | 10.1007/978-981-13-0292-3 | -
dc.neeo.contributor | Bendechache|Malika|aut| | -
dc.neeo.contributor | Le-Khac|Nhien-An|aut| | -
dc.neeo.contributor | Kechadi|Tahar|aut| | -
dc.date.updated | 2018-01-31T09:37:33Z | -
item.fulltext | With Fulltext | -
item.grantfulltext | open | -
Appears in Collections: Insight Research Collection
Files in This Item:
File | Description | Size | Format
insight_publication.pdf | | 739.72 kB | Adobe PDF

This item is available under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Ireland licence. No item may be reproduced for commercial purposes. For other possible restrictions on use, please refer to the publisher's URL where this item is made available, or to notes contained in the item itself. Other terms may apply.