| 2023 | 2PLSF: Two-Phase Locking with Starvation-Freedom. Pedro Ramalhete, Andreia Correia, Pascal Felber |
| 2023 | A Programming Model for GPU Load Balancing. Muhammad Osama, Serban D. Porumbescu, John D. Owens |
| 2023 | A Scalable Hybrid Total FETI Method for Massively Parallel FEM Simulations. Kehao Lin, Chunbao Zhou, Yan Zeng, Ningming Nie, Jue Wang, Shigang Li, Yangde Feng, Yangang Wang, Kehan Yao, Tiechui Yao, Jilin Zhang, Jian Wan |
| 2023 | AArch64 Atomics: Might They Be Harming Your Performance? Ricardo Jesus, Michèle Weiland |
| 2023 | Block-STM: Scaling Blockchain Execution by Turning Ordering Curse to a Performance Blessing. Rati Gelashvili, Alexander Spiegelman, Zhuolun Xiang, George Danezis, Zekun Li, Dahlia Malkhi, Yu Xia, Runtian Zhou |
| 2023 | Boosting Performance and QoS for Concurrent GPU B+trees by Combining-Based Synchronization. Weihua Zhang, Chuanlei Zhao, Lu Peng, Yuzhe Lin, Fengzhe Zhang, Yunping Lu |
| 2023 | CuPBoP: A Framework to Make CUDA Portable. Ruobing Han, Jun Chen, Bhanu Garg, Jeffrey Young, Jaewoong Sim, Hyesoon Kim |
| 2023 | DSP: Efficient GNN Training with Multiple GPUs. Zhenkun Cai, Qihui Zhou, Xiao Yan, Da Zheng, Xiang Song, Chenguang Zheng, James Cheng, George Karypis |
| 2023 | Dynamic N:M Fine-Grained Structured Sparse Attention Mechanism. Zhaodong Chen, Zheng Qu, Yuying Quan, Liu Liu, Yufei Ding, Yuan Xie |
| 2023 | Efficient All-Reduce for Distributed DNN Training in Optical Interconnect Systems. Fei Dai, Yawen Chen, Zhiyi Huang, Haibo Zhang, Fangfang Zhang |
| 2023 | Efficient Direct Convolution Using Long SIMD Instructions. Alexandre de Limas Santana, Adrià Armejach, Marc Casas |
| 2023 | Elastic Averaging for Efficient Pipelined DNN Training. Zihao Chen, Chen Xu, Weining Qian, Aoying Zhou |
| 2023 | End-to-End LU Factorization of Large Matrices on GPUs. Yang Xia, Peng Jiang, Gagan Agrawal, Rajiv Ramnath |
| 2023 | Exploring the Use of WebAssembly in HPC. Mohak Chadha, Nils Krueger, Jophin John, Anshul Jindal, Michael Gerndt, Shajulin Benedict |
| 2023 | Fast Parallel Exact Inference on Bayesian Networks. Jiantong Jiang, Zeyi Wen, Atif Bin Mansoor, Ajmal Mian |
| 2023 | Fast Symmetric Eigenvalue Decomposition via WY Representation on Tensor Core. Shaoshuai Zhang, Ruchi Shah, Hiroyuki Ootomo, Rio Yokota, Panruo Wu |
| 2023 | Fast and Scalable Channels in Kotlin Coroutines. Nikita Koval, Dan Alistarh, Roman Elizarov |
| 2023 | Generating Fast FFT Kernels on CPUs via FFT-Specific Intrinsics. Zhihao Li, Haipeng Jia, Yunquan Zhang, Yuyan Sun, Yiwei Zhang, Tun Chen |
| 2023 | High-Performance Filters for GPUs. Hunter McCoy, Steven A. Hofmeyr, Katherine A. Yelick, Prashant Pandey |
| 2023 | High-Performance GPU-to-CPU Transpilation and Optimization via High-Level Parallel Constructs. William S. Moses, Ivan R. Ivanov, Jens Domke, Toshio Endo, Johannes Doerfert, Oleksandr Zinenko |
| 2023 | High-Performance and Scalable Agent-Based Simulation with BioDynaMo. Lukas Breitwieser, Ahmad Hesam, Fons Rademakers, Juan Gómez-Luna, Onur Mutlu |
| 2023 | High-Throughput GPU Random Walk with Fine-Tuned Concurrent Query Processing. Cheng Xu, Chao Li, Pengyu Wang, Xiaofeng Hou, Jing Wang, Shixuan Sun, Minyi Guo, Hanqing Wu, Dongbai Chen, Xiangwen Liu |
| 2023 | Improving Energy Saving of One-Sided Matrix Decompositions on CPU-GPU Heterogeneous Systems. Jieyang Chen, Xin Liang, Kai Zhao, Hadi Zamani Sabzi, Laxmi N. Bhuyan, Zizhong Chen |
| 2023 | Learning to Parallelize in a Shared-Memory Environment with Transformers. Re'em Harel, Yuval Pinter, Gal Oren |
| 2023 | Lifetime-Based Optimization for Simulating Quantum Circuits on a New Sunway Supercomputer. Yaojian Chen, Yong Liu, Xinmin Shi, Jiawei Song, Xin Liu, Lin Gan, Chu Guo, Haohuan Fu, Jie Gao, Dexun Chen, Guangwen Yang |
| 2023 | Merchandiser: Data Placement on Heterogeneous Memory for Task-Parallel HPC Applications with Load-Balance Awareness. Zhen Xie, Jie Liu, Jiajia Li, Dong Li |
| 2023 | OpenCilk: A Modular and Extensible Software Infrastructure for Fast Task-Parallel Code. Tao B. Schardl, I-Ting Angelina Lee |
| 2023 | PiPAD: Pipelined and Parallel Dynamic GNN Training on GPUs. Chunyang Wang, Desen Sun, Yuebin Bai |
| 2023 | Practically and Theoretically Efficient Garbage Collection for Multiversioning. Yuanhao Wei, Guy E. Blelloch, Panagiota Fatourou, Eric Ruppert |
| 2023 | Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, PPoPP 2023, Montreal, QC, Canada, 25 February 2023 - 1 March 2023. Maryam Mehri Dehnavi, Milind Kulkarni, Sriram Krishnamoorthy |
| 2023 | Provably Fast and Space-Efficient Parallel Biconnectivity. Xiaojun Dong, Letong Wang, Yan Gu, Yihan Sun |
| 2023 | Provably Good Randomized Strategies for Data Placement in Distributed Key-Value Stores. Zhe Wang, Jinhao Zhao, Kunal Agrawal, He Liu, Meng Xu, Jing Li |
| 2023 | Stream-K: Work-Centric Parallel Decomposition for Dense Matrix-Matrix Multiplication on the GPU. Muhammad Osama, Duane Merrill, Cris Cecka, Michael Garland, John D. Owens |
| 2023 | Swift: Expedited Failure Recovery for Large-Scale DNN Training. Yuchen Zhong, Guangming Sheng, Juncheng Liu, Jinhui Yuan, Chuan Wu |
| 2023 | TDC: Towards Extremely Efficient CNNs on GPUs via Hardware-Aware Tucker Decomposition. Lizhi Xiang, Miao Yin, Chengming Zhang, Aravind Sukumaran-Rajam, P. Sadayappan, Bo Yuan, Dingwen Tao |
| 2023 | TGOpt: Redundancy-Aware Optimizations for Temporal Graph Attention Networks. Yufeng Wang, Charith Mendis |
| 2023 | TL4x: Buffered Durable Transactions on Disk as Fast as in Memory. Gal Assa, Andreia Correia, Pedro Ramalhete, Valerio Schiavoni, Pascal Felber |
| 2023 | The ERA Theorem for Safe Memory Reclamation. Gali Sheffi, Erez Petrank |
| 2023 | The State-of-the-Art LCRQ Concurrent Queue Algorithm Does NOT Require CAS2. Raed Romanov, Nikita Koval |
| 2023 | Transactional Composition of Nonblocking Data Structures. Wentao Cai, Haosen Wen, Michael L. Scott |
| 2023 | Unexpected Scaling in Path Copying Trees. Vitaly Aksenov, Trevor Brown, Alexander Fedorov, Ilya Kokorin |
| 2023 | Visibility Algorithms for Dynamic Dependence Analysis and Distributed Coherence. Michael Bauer, Elliott Slaughter, Sean Treichler, Wonchan Lee, Michael Garland, Alex Aiken |
| 2023 | WISE: Predicting the Performance of Sparse Matrix Vector Multiplication with Machine Learning. Serif Yesil, Azin Heidarshenas, Adam Morrison, Josep Torrellas |
| 2023 | iQAN: Fast and Accurate Vector Search with Efficient Intra-Query Parallelism on Multi-Core Architectures. Zhen Peng, Minjia Zhang, Kai Li, Ruoming Jin, Bin Ren |