Minimizing Network Traffic for Distributed Joins Using Lightweight Locality-Aware Scheduling

DC Field | Value | Language
dc.contributor.author | Cheng, Long | -
dc.contributor.author | Murphy, John | -
dc.contributor.author | Liu, Qingzhi | -
dc.contributor.author | et al. | -
dc.date.accessioned | 2019-04-18T11:48:29Z | -
dc.date.available | 2019-04-18T11:48:29Z | -
dc.date.copyright | 2018 Springer Nature Switzerland AG | en_US
dc.date.issued | 2018-08-31 | -
dc.identifier.isbn | 978-3-319-96983-1 | -
dc.identifier.uri | http://hdl.handle.net/10197/10051 | -
dc.description | The 24th International European Conference on Parallel and Distributed Computing (EURO-PAR 2018), Turin, Italy, 27-31 August 2018 | en_US
dc.description.abstract | Large computing systems such as data centers are becoming the mainstream infrastructure for big data processing. As one of the key data operators in such scenarios, the distributed join still challenges current techniques, since it always incurs a significant network communication cost. Various advanced approaches have been proposed to improve performance; however, most of them focus only on data skew handling, and algorithms designed specifically for communication reduction have received less attention. Moreover, although the state-of-the-art technique can minimize network traffic, it computes fine-grained optimal schedules for every individual join key, which can result in noticeable overhead. In this paper, we propose a new approach called LAS (Lightweight Locality-Aware Scheduling), which targets reducing network communication for large distributed joins in an efficient and effective manner. We present the detailed design and implementation of LAS, and conduct an experimental evaluation using large data joins. Our results show that LAS can effectively reduce scheduling overhead and achieve a reduction in network traffic comparable to that of the state of the art. | en_US
dc.description.sponsorship | European Commission Horizon 2020 | en_US
dc.language.iso | en | en_US
dc.publisher | Euro-Par | en_US
dc.relation.ispartof | Aldinucci, M., Padovani, L., Torquati, M. (eds.). Euro-Par 2018: Parallel Processing. 24th International Conference on Parallel and Distributed Computing, Turin, Italy, August 27-31, 2018, Proceedings | en_US
dc.subject | Distributed joins | en_US
dc.subject | Data locality | en_US
dc.subject | Network communication | en_US
dc.subject | Locality-aware scheduling | en_US
dc.title | Minimizing Network Traffic for Distributed Joins Using Lightweight Locality-Aware Scheduling | en_US
dc.type | Conference Publication | en_US
dc.internal.authorcontactother | long.cheng@ucd.ie | en_US
dc.internal.webversions | https://europar2018.org/ | -
dc.status | Unspecified | en_US
dc.identifier.startpage | 293 | en_US
dc.identifier.endpage | 305 | en_US
dc.identifier.doi | 10.1007/978-3-319-96983-1 | -
dc.neeo.contributor | Cheng|Long|aut| | -
dc.neeo.contributor | Murphy|John|aut| | -
dc.neeo.contributor | Liu|Qingzhi|aut| | -
dc.neeo.contributor | et al.||aut| | -
dc.date.updated | 2018-07-18T15:52:43Z | -
dc.identifier.project | info:eu-repo/grantAgreement/EC/H2020/799066//NEtwork-aware Optimization for Query Executions in Large Systems/NEO-QE | en_US
dc.identifier.grantid | 799066 | -
item.fulltext | With Fulltext | -
item.grantfulltext | open | -
Appears in Collections: Computer Science Research Collection
Files in This Item:
schedule.pdf (342.24 kB, Adobe PDF)

This item is available under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Ireland licence. No item may be reproduced for commercial purposes. For other possible restrictions on use, please refer to the publisher's URL where this item is made available, or to notes contained in the item itself. Other terms may apply.