PPoPP B

47 papers

Year Title / Authors
2022 A W-cycle algorithm for efficient batched SVD on GPUs.
Junmin Xiao, Qing Xue, Hui Ma, Xiaoyang Zhang, Guangming Tan
2022 A parallel branch-and-bound algorithm with history-based domination.
Taspon Gonggiatgul, Ghassan Shobaki, Pinar Muyan-Özçelik
2022 An LLVM-based open-source compiler for NVIDIA GPUs.
Da Yan, Wei Wang, Xiaowen Chu
2022 Asymmetry-aware scalable locking.
Nian Liu, Jinyu Gu, Dahai Tang, Kenli Li, Binyu Zang, Haibo Chen
2022 Automatic differentiation of parallel loops with formal methods.
Jan Hückelheim, Laurent Hascoët
2022 Automatic synthesis of parallel Unix commands and pipelines with KumQuat.
Jiasi Shen, Martin C. Rinard, Nikos Vasilakis
2022 BaGuaLu: targeting brain scale pretrained models with over 37 million cores.
Zixuan Ma, Jiaao He, Jiezhong Qiu, Huanqi Cao, Yuanwei Wang, Zhenbo Sun, Liyan Zheng, Haojie Wang, Shizhi Tang, Tianyu Zheng, Junyang Lin, Guanyu Feng, Zeqiang Huang, Jie Gao, Aohan Zeng, Jianwei Zhang, Runxin Zhong, Tianhui Shi, Sha Liu, Weimin Zheng, Jie Tang, Hongxia Yang, Xin Liu, Jidong Zhai, Wenguang Chen
2022 Bundling linked data structures for linearizable range queries.
Jacob Nelson-Slivon, Ahmed Hassan, Roberto Palmieri
2022 CASE: a compiler-assisted SchEduling framework for multi-GPU systems.
Chao Chen, Chris Porter, Santosh Pande
2022 Deadlock-free asynchronous message reordering in Rust with multiparty session types.
Zak Cutner, Nobuko Yoshida, Martin Vassor
2022 Detectable recovery of lock-free data structures.
Hagit Attiya, Ohad Ben-Baruch, Panagiota Fatourou, Danny Hendler, Eleftherios Kosmas
2022 Dopia: online parallelism management for integrated CPU/GPU architectures.
Younghyun Cho, Jiyeon Park, Florian Negele, Changyeon Jo, Thomas R. Gross, Bernhard Egger
2022 Elimination (a, b)-trees with fast, durable updates.
Anubhav Srivastava, Trevor Brown
2022 Extending the limit of molecular dynamics with
Zhuoqiang Guo, Denghui Lu, Yujin Yan, Siyu Hu, Rongrong Liu, Guangming Tan, Ninghui Sun, Wanrun Jiang, Lijun Liu, Yixiao Chen, Linfeng Zhang, Mohan Chen, Han Wang, Weile Jia
2022 FasterMoE: modeling and optimizing training of large-scale dynamic pre-trained models.
Jiaao He, Jidong Zhai, Tiago Antunes, Haojie Wang, Fuwen Luo, Shangfeng Shi, Qin Li
2022 FliT: a library for simple and efficient persistent algorithms.
Yuanhao Wei, Naama Ben-David, Michal Friedman, Guy E. Blelloch, Erez Petrank
2022 Hardening selective protection across multiple program inputs for HPC applications.
Yafan Huang, Shengjian Guo, Sheng Di, Guanpeng Li, Franck Cappello
2022 High performance GPU concurrent B+tree.
Weihua Zhang, Chuanlei Zhao, Lu Peng, Yuzhe Lin, Fengzhe Zhang, Jinhu Jiang
2022 Interference relation-guided SMT solving for multi-threaded program verification.
Hongyu Fan, Weiting Liu, Fei He
2022 Jiffy: a lock-free skip list with batch updates and snapshots.
Tadeusz Kobus, Maciej Kokocinski, Pawel T. Wojciechowski
2022 LB-HM: load balance-aware data placement on heterogeneous memory for task-parallel HPC applications.
Zhen Xie, Jie Liu, Sam Ma, Jiajia Li, Dong Li
2022 LOTUS: locality optimizing triangle counting.
Mohsen Koohi Esfahani, Peter Kilpatrick, Hans Vandierendonck
2022 Lock-free locks revisited.
Naama Ben-David, Guy E. Blelloch, Yuanhao Wei
2022 Mashup: making serverless computing useful for HPC workflows via hybrid execution.
Rohan Basu Roy, Tirthak Patel, Vijay Gadepally, Devesh Tiwari
2022 Multi-queues can be state-of-the-art priority schedulers.
Anastasiia Postnikova, Nikita Koval, Giorgi Nadiradze, Dan Alistarh
2022 Near-optimal sparse allreduce for distributed deep learning.
Shigang Li, Torsten Hoefler
2022 Optimizing consistency for partially replicated data stores.
Ivan Kuraj, Armando Solar-Lezama, Nadia Polikarpova
2022 Optimizing sparse computations jointly.
Kazem Cheshmi, Michelle Mills Strout, Maryam Mehri Dehnavi
2022 PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2 - 6, 2022
Jaejin Lee, Kunal Agrawal, Michael F. Spear
2022 ParGeo: a library for parallel computational geometry.
Yiqiu Wang, Shangdi Yu, Laxman Dhulipala, Yan Gu, Julian Shun
2022 Parallel algorithms for masked sparse matrix-matrix products.
Srdan Milakovic, Oguz Selvitopi, Israt Nisa, Zoran Budimlic, Aydin Buluç
2022 Parallel block-delayed sequences.
Sam Westrick, Mike Rainey, Daniel Anderson, Guy E. Blelloch
2022 PathCAS: an efficient middle ground for concurrent search data structures.
Trevor Brown, William Sigouin, Dan Alistarh
2022 PerFlow: a domain specific framework for automatic performance analysis of parallel applications.
Yuyang Jin, Haojie Wang, Runxin Zhong, Chen Zhang, Jidong Zhai
2022 QGTC: accelerating quantized graph neural networks via GPU tensor core.
Yuke Wang, Boyuan Feng, Yufei Ding
2022 RTNN: accelerating neighbor search using hardware ray tracing.
Yuhao Zhu
2022 Remote OpenMP offloading.
Atmn Patel, Johannes Doerfert
2022 Rethinking graph data placement for graph neural network training on multiple GPUs.
Shihui Song, Peng Jiang
2022 Scaling graph traversal to 281 trillion edges with 40 million cores.
Huanqi Cao, Yuanwei Wang, Haojie Wang, Heng Lin, Zixuan Ma, Wanwang Yin, Wenguang Chen
2022 Stream processing with dependency-guided synchronization.
Konstantinos Kallas, Filip Niksic, Caleb Stanford, Rajeev Alur
2022 The performance power of software combining in persistence.
Panagiota Fatourou, Nikolaos D. Kallimanis, Eleftherios Kosmas
2022 The problem-based benchmark suite (PBBS), V2.
Daniel Anderson, Guy E. Blelloch, Laxman Dhulipala, Magdalen Dobson, Yihan Sun
2022 TileSpGEMM: a tiled algorithm for parallel sparse general matrix-matrix multiplication on GPUs.
Yuyao Niu, Zhengyang Lu, Haonan Ji, Shuhui Song, Zhou Jin, Weifeng Liu
2022 Towards OmpSs-2 and OpenACC interoperation.
Orestis Korakitis, Simon Garcia De Gonzalo, Nicolas L. Guidotti, João Pedro Barreto, José C. Monteiro, Antonio J. Peña
2022 Understanding and detecting deep memory persistency bugs in NVM programs with DeepMC.
Benjamin Reidys, Jian Huang
2022 Vapro: performance variance detection and diagnosis for production-run parallel applications.
Liyan Zheng, Jidong Zhai, Xiongchao Tang, Haojie Wang, Teng Yu, Yuyang Jin, Shuaiwen Leon Song, Wenguang Chen
2022 wCQ: a fast wait-free queue with bounded memory usage.
Ruslan Nikolaev, Binoy Ravindran