PPoPP B

50 papers

Year Title / Authors
2018 A microbenchmark to study GPU performance models.
Vasily Volkov
2018 A persistent lock-free queue for non-volatile memory.
Michal Friedman, Maurice Herlihy, Virendra J. Marathe, Erez Petrank
2018 A predictable synchronisation algorithm.
Stefan Reif, Wolfgang Schröder-Preikschat
2018 A scalable distance-1 vertex coloring algorithm for power-law graphs.
Jesun Sahariar Firoz, Marcin Zalewski, Andrew Lumsdaine
2018 A scalable queue for work distribution on GPUs.
Bernhard Kerbl, Joerg H. Mueller, Michael Kenzel, Dieter Schmalstieg, Markus Steinberger
2018 An effective fusion and tile size model for optimizing image processing pipelines.
Abhinav Jangda, Uday Bondhugula
2018 Automated code acceleration targeting heterogeneous OpenCL devices.
Heinrich Riebler, Gavin Vaz, Tobias Kenter, Christian Plessl
2018 Bridging the gap between deep learning and sparse matrix format selection.
Yue Zhao, Jiajia Li, Chunhua Liao, Xipeng Shen
2018 Cache-tries: concurrent lock-free hash tries with constant-time operations.
Aleksandar Prokopec
2018 Communication-avoiding parallel minimum cuts and connected components.
Lukas Gianinazzi, Pavel Kalvoda, Alessandro De Palma, Maciej Besta, Torsten Hoefler
2018 Designing scalable FPGA architectures using high-level synthesis.
Johannes de Fine Licht, Michaela Blott, Torsten Hoefler
2018 DisCVar: discovering critical variables using algorithmic differentiation for transient faults.
Harshitha Menon, Kathryn M. Mohror
2018 Efficient parallel determinacy race detection for two-dimensional DAGs.
Yifan Xu, I-Ting Angelina Lee, Kunal Agrawal
2018 Efficient shuffle management with SCache for DAG computing frameworks.
Zhouwang Fu, Tao Song, Zhengwei Qi, Haibing Guan
2018 Featherlight on-the-fly false-sharing detection.
Milind Chabbi, Shasha Wen, Xu Liu
2018 FlashR: parallelize and scale R for machine learning using SSDs.
Da Zheng, Disa Mhembere, Joshua T. Vogelstein, Carey E. Priebe, Randal C. Burns
2018 Graph partitioning applied to DAG scheduling to reduce NUMA effects.
Isaac Sánchez Barrera, Marc Casas, Miquel Moretó, Eduard Ayguadé, Jesús Labarta, Mateo Valero
2018 Griffin: uniting CPU and GPU in information retrieval systems for intra-query parallelism.
Yang Liu, Jianguo Wang, Steven Swanson
2018 HPVM: heterogeneous parallel virtual machine.
Maria Kotsifakou, Prakalp Srivastava, Matthew D. Sinclair, Rakesh Komuravelli, Vikram S. Adve, Sarita V. Adve
2018 Harnessing epoch-based reclamation for efficient range queries.
Maya Arbel-Raviv, Trevor Brown
2018 Hierarchical memory management for mutable state.
Adrien Guatto, Sam Westrick, Ram Raghunathan, Umut A. Acar, Matthew Fluet
2018 High-performance genomic analysis framework with in-memory computing.
Xueqi Li, Guangming Tan, Bingchen Wang, Ninghui Sun
2018 Interval-based memory reclamation.
Haosen Wen, Joseph Izraelevitz, Wentao Cai, H. Alan Beadle, Michael L. Scott
2018 Juggler: a dependence-aware task-based execution framework for GPUs.
Mehmet E. Belviranli, Seyong Lee, Jeffrey S. Vetter, Laxmi N. Bhuyan
2018 Layrub: layer-centric GPU memory reuse and data migration in extreme-scale deep learning systems.
Bo Liu, Wenbin Jiang, Hai Jin, Xuanhua Shi, Yang Ma
2018 Lazygraph: lazy data coherency for replicas in distributed graph-parallel computation.
Lei Wang, Liangji Zhuang, Junhang Chen, Huimin Cui, Fang Lv, Ying Liu, Xiaobing Feng
2018 Making pull-based graph processing performant.
Samuel Grossman, Heiner Litz, Christos Kozyrakis
2018 Optimizing N-dimensional, Winograd-based convolution for manycore CPUs.
Zhen Jia, Aleksandar Zlateski, Frédo Durand, Kai Li
2018 PAM: parallel augmented maps.
Yihan Sun, Daniel Ferizovic, Guy E. Blelloch
2018 Performance challenges in modular parallel programs.
Umut A. Acar, Vitaly Aksenov, Arthur Charguéraud, Mike Rainey
2018 Performance modeling for GPUs using abstract kernel emulation.
Changwan Hong, Aravind Sukumaran-Rajam, Jinsung Kim, Prashant Singh Rawat, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, P. Sadayappan
2018 Practical concurrent traversals in search trees.
Dana Drachsler-Cohen, Martin T. Vechev, Eran Yahav
2018 Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2018, Vienna, Austria, February 24-28, 2018
Andreas Krall, Thomas R. Gross
2018 Quantifying and reducing execution variance in STM via model driven commit optimization.
Girish Mururu, Ada Gavrilovska, Santosh Pande
2018 Reducing the burden of parallel loop schedulers for many-core processors.
Mahwish Arif, Hans Vandierendonck
2018 Reducing transaction aborts by looking to the future.
Nachshon Cohen, Erez Petrank, James R. Larus
2018 Register optimizations for stencils on GPUs.
Prashant Singh Rawat, Fabrice Rastello, Aravind Sukumaran-Rajam, Louis-Noël Pouchet, Atanas Rountev, P. Sadayappan
2018 Register-based implementation of the sparse general matrix-matrix multiplication on GPUs.
Junhong Liu, Xin He, Weifeng Liu, Guangming Tan
2018 Revealing parallel scans and reductions in sequential loops through function reconstruction.
Peng Jiang, Gagan Agrawal
2018 SIMD code generation for stencils on brick decompositions.
Tuowen Zhao, Mary W. Hall, Protonu Basu, Samuel Williams, Hans Johansen
2018 Safe privatization in transactional memory.
Artem Khyzha, Hagit Attiya, Alexey Gotsman, Noam Rinetzky
2018 SecureMR: secure MapReduce using homomorphic encryption and program partitioning.
Yao Dong, Ana L. Milanova, Julian Dolby
2018 Shared-memory parallelization of MTTKRP for dense tensors.
Koby Hayashi, Grey Ballard, Yujie Jiang, Michael J. Tobia
2018 Strong trylocks for reader-writer locks.
Andreia Correia, Pedro Ramalhete
2018 Superneurons: dynamic GPU memory management for training deep neural networks.
Linnan Wang, Jinmian Ye, Yiyang Zhao, Wei Wu, Ang Li, Shuaiwen Leon Song, Zenglin Xu, Tim Kraska
2018 Transparent GPU memory management for DNNs.
Jung-Ho Park, Hyungmin Cho, Wookeun Jung, Jaejin Lee
2018 Two concurrent data structures for efficient Datalog query processing.
Herbert Jordan, Bernhard Scholz, Pavle Subotic
2018 VerifiedFT: a verified, high-performance precise dynamic race detector.
James R. Wilcox, Cormac Flanagan, Stephen N. Freund
2018 swSpTRSV: a fast sparse triangular solve with sparse level tile layout on Sunway architectures.
Xinliang Wang, Weifeng Liu, Wei Xue, Li Wu
2018 vSensor: leveraging fixed-workload snippets of programs for performance variance detection.
Xiongchao Tang, Jidong Zhai, Xuehai Qian, Bingsheng He, Wei Xue, Wenguang Chen