Sitchinava, Nodari
Preferred name
Sitchinava, Nodari
Official Name
Sitchinava, Nodari
Research Output
- Publication: Geometric Algorithms for Private-Cache Chip Multiprocessors
  We study techniques for obtaining efficient algorithms for geometric problems on private-cache chip multiprocessors.
  257 | Scopus© Citations: 11
- Publication: Empirical Evaluation of the Parallel Distribution Sweeping Framework on Multicore Architectures
  In this paper, we perform an empirical evaluation of the Parallel External Memory (PEM) model in the context of geometric problems. In particular, we implement the parallel distribution sweeping framework of Ajwani, Sitchinava and Zeh to solve the batched 1-dimensional stabbing max problem. While modern processors have sophisticated memory systems (multiple levels of caches, set associativity, TLB, prefetching), we empirically show that algorithms designed in simple models that focus on minimizing the I/O transfers between shared memory and a single-level cache can lead to efficient software on current multicore architectures. Our implementation exhibits significantly fewer accesses to slow DRAM and, therefore, outperforms traditional approaches based on plane sweep and two-way divide and conquer.
  213 | Scopus© Citations: 2
- Publication: I/O-Optimal Distribution Sweeping on Private-Cache Chip Multiprocessors
  The parallel external memory (PEM) model has been used as a basis for the design and analysis of a wide range of algorithms for private-cache multi-core architectures. As a tool for developing geometric algorithms in this model, a parallel version of the I/O-efficient distribution sweeping framework was introduced recently, and a number of algorithms for problems on axis-aligned objects were obtained using this framework. The obtained algorithms were efficient but not optimal. In this paper, we improve the framework to obtain algorithms with the optimal I/O complexity of O(sort_P(N) + K/(PB)) for a number of problems on axis-aligned objects. Here P denotes the number of cores/processors, B denotes the number of elements that fit in a cache line, N and K denote the sizes of the input and output, respectively, and sort_P(N) denotes the I/O complexity of sorting N items using P processors in the PEM model. To obtain the above improvement, we present a new one-dimensional batched range counting algorithm on a sorted list of ranges and points that achieves an I/O complexity of O((N + K)/(PB)), where K is the sum of the counts of all the ranges. The key to achieving efficient load balancing among the processors in this algorithm is a new method to count the output without enumerating it, which might be of independent interest.
  365 | Scopus© Citations: 4