| 2022 | A W-cycle algorithm for efficient batched SVD on GPUs. Junmin Xiao, Qing Xue, Hui Ma, Xiaoyang Zhang, Guangming Tan |
| 2022 | A parallel branch-and-bound algorithm with history-based domination. Taspon Gonggiatgul, Ghassan Shobaki, Pinar Muyan-Özçelik |
| 2022 | An LLVM-based open-source compiler for NVIDIA GPUs. Da Yan, Wei Wang, Xiaowen Chu |
| 2022 | Asymmetry-aware scalable locking. Nian Liu, Jinyu Gu, Dahai Tang, Kenli Li, Binyu Zang, Haibo Chen |
| 2022 | Automatic differentiation of parallel loops with formal methods. Jan Hückelheim, Laurent Hascoët |
| 2022 | Automatic synthesis of parallel Unix commands and pipelines with KumQuat. Jiasi Shen, Martin C. Rinard, Nikos Vasilakis |
| 2022 | BaGuaLu: targeting brain scale pretrained models with over 37 million cores. Zixuan Ma, Jiaao He, Jiezhong Qiu, Huanqi Cao, Yuanwei Wang, Zhenbo Sun, Liyan Zheng, Haojie Wang, Shizhi Tang, Tianyu Zheng, Junyang Lin, Guanyu Feng, Zeqiang Huang, Jie Gao, Aohan Zeng, Jianwei Zhang, Runxin Zhong, Tianhui Shi, Sha Liu, Weimin Zheng, Jie Tang, Hongxia Yang, Xin Liu, Jidong Zhai, Wenguang Chen |
| 2022 | Bundling linked data structures for linearizable range queries. Jacob Nelson-Slivon, Ahmed Hassan, Roberto Palmieri |
| 2022 | CASE: a compiler-assisted SchEduling framework for multi-GPU systems. Chao Chen, Chris Porter, Santosh Pande |
| 2022 | Deadlock-free asynchronous message reordering in Rust with multiparty session types. Zak Cutner, Nobuko Yoshida, Martin Vassor |
| 2022 | Detectable recovery of lock-free data structures. Hagit Attiya, Ohad Ben-Baruch, Panagiota Fatourou, Danny Hendler, Eleftherios Kosmas |
| 2022 | Dopia: online parallelism management for integrated CPU/GPU architectures. Younghyun Cho, Jiyeon Park, Florian Negele, Changyeon Jo, Thomas R. Gross, Bernhard Egger |
| 2022 | Elimination (a, b)-trees with fast, durable updates. Anubhav Srivastava, Trevor Brown |
| 2022 | Extending the limit of molecular dynamics with ab initio accuracy to 10 billion atoms. Zhuoqiang Guo, Denghui Lu, Yujin Yan, Siyu Hu, Rongrong Liu, Guangming Tan, Ninghui Sun, Wanrun Jiang, Lijun Liu, Yixiao Chen, Linfeng Zhang, Mohan Chen, Han Wang, Weile Jia |
| 2022 | FasterMoE: modeling and optimizing training of large-scale dynamic pre-trained models. Jiaao He, Jidong Zhai, Tiago Antunes, Haojie Wang, Fuwen Luo, Shangfeng Shi, Qin Li |
| 2022 | FliT: a library for simple and efficient persistent algorithms. Yuanhao Wei, Naama Ben-David, Michal Friedman, Guy E. Blelloch, Erez Petrank |
| 2022 | Hardening selective protection across multiple program inputs for HPC applications. Yafan Huang, Shengjian Guo, Sheng Di, Guanpeng Li, Franck Cappello |
| 2022 | High performance GPU concurrent B+tree. Weihua Zhang, Chuanlei Zhao, Lu Peng, Yuzhe Lin, Fengzhe Zhang, Jinhu Jiang |
| 2022 | Interference relation-guided SMT solving for multi-threaded program verification. Hongyu Fan, Weiting Liu, Fei He |
| 2022 | Jiffy: a lock-free skip list with batch updates and snapshots. Tadeusz Kobus, Maciej Kokocinski, Pawel T. Wojciechowski |
| 2022 | LB-HM: load balance-aware data placement on heterogeneous memory for task-parallel HPC applications. Zhen Xie, Jie Liu, Sam Ma, Jiajia Li, Dong Li |
| 2022 | LOTUS: locality optimizing triangle counting. Mohsen Koohi Esfahani, Peter Kilpatrick, Hans Vandierendonck |
| 2022 | Lock-free locks revisited. Naama Ben-David, Guy E. Blelloch, Yuanhao Wei |
| 2022 | Mashup: making serverless computing useful for HPC workflows via hybrid execution. Rohan Basu Roy, Tirthak Patel, Vijay Gadepally, Devesh Tiwari |
| 2022 | Multi-queues can be state-of-the-art priority schedulers. Anastasiia Postnikova, Nikita Koval, Giorgi Nadiradze, Dan Alistarh |
| 2022 | Near-optimal sparse allreduce for distributed deep learning. Shigang Li, Torsten Hoefler |
| 2022 | Optimizing consistency for partially replicated data stores. Ivan Kuraj, Armando Solar-Lezama, Nadia Polikarpova |
| 2022 | Optimizing sparse computations jointly. Kazem Cheshmi, Michelle Mills Strout, Maryam Mehri Dehnavi |
| 2022 | PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2 - 6, 2022. Jaejin Lee, Kunal Agrawal, Michael F. Spear |
| 2022 | ParGeo: a library for parallel computational geometry. Yiqiu Wang, Shangdi Yu, Laxman Dhulipala, Yan Gu, Julian Shun |
| 2022 | Parallel algorithms for masked sparse matrix-matrix products. Srdan Milakovic, Oguz Selvitopi, Israt Nisa, Zoran Budimlic, Aydin Buluç |
| 2022 | Parallel block-delayed sequences. Sam Westrick, Mike Rainey, Daniel Anderson, Guy E. Blelloch |
| 2022 | PathCAS: an efficient middle ground for concurrent search data structures. Trevor Brown, William Sigouin, Dan Alistarh |
| 2022 | PerFlow: a domain specific framework for automatic performance analysis of parallel applications. Yuyang Jin, Haojie Wang, Runxin Zhong, Chen Zhang, Jidong Zhai |
| 2022 | QGTC: accelerating quantized graph neural networks via GPU tensor core. Yuke Wang, Boyuan Feng, Yufei Ding |
| 2022 | RTNN: accelerating neighbor search using hardware ray tracing. Yuhao Zhu |
| 2022 | Remote OpenMP offloading. Atmn Patel, Johannes Doerfert |
| 2022 | Rethinking graph data placement for graph neural network training on multiple GPUs. Shihui Song, Peng Jiang |
| 2022 | Scaling graph traversal to 281 trillion edges with 40 million cores. Huanqi Cao, Yuanwei Wang, Haojie Wang, Heng Lin, Zixuan Ma, Wanwang Yin, Wenguang Chen |
| 2022 | Stream processing with dependency-guided synchronization. Konstantinos Kallas, Filip Niksic, Caleb Stanford, Rajeev Alur |
| 2022 | The performance power of software combining in persistence. Panagiota Fatourou, Nikolaos D. Kallimanis, Eleftherios Kosmas |
| 2022 | The problem-based benchmark suite (PBBS), V2. Daniel Anderson, Guy E. Blelloch, Laxman Dhulipala, Magdalen Dobson, Yihan Sun |
| 2022 | TileSpGEMM: a tiled algorithm for parallel sparse general matrix-matrix multiplication on GPUs. Yuyao Niu, Zhengyang Lu, Haonan Ji, Shuhui Song, Zhou Jin, Weifeng Liu |
| 2022 | Towards OmpSs-2 and OpenACC interoperation. Orestis Korakitis, Simon Garcia De Gonzalo, Nicolas L. Guidotti, João Pedro Barreto, José C. Monteiro, Antonio J. Peña |
| 2022 | Understanding and detecting deep memory persistency bugs in NVM programs with DeepMC. Benjamin Reidys, Jian Huang |
| 2022 | Vapro: performance variance detection and diagnosis for production-run parallel applications. Liyan Zheng, Jidong Zhai, Xiongchao Tang, Haojie Wang, Teng Yu, Yuyang Jin, Shuaiwen Leon Song, Wenguang Chen |
| 2022 | wCQ: a fast wait-free queue with bounded memory usage. Ruslan Nikolaev, Binoy Ravindran |