PPoPP B

44 papers

Year Title / Authors
2023 2PLSF: Two-Phase Locking with Starvation-Freedom.
Pedro Ramalhete, Andreia Correia, Pascal Felber
2023 A Programming Model for GPU Load Balancing.
Muhammad Osama, Serban D. Porumbescu, John D. Owens
2023 A Scalable Hybrid Total FETI Method for Massively Parallel FEM Simulations.
Kehao Lin, Chunbao Zhou, Yan Zeng, Ningming Nie, Jue Wang, Shigang Li, Yangde Feng, Yangang Wang, Kehan Yao, Tiechui Yao, Jilin Zhang, Jian Wan
2023 AArch64 Atomics: Might They Be Harming Your Performance?
Ricardo Jesus, Michèle Weiland
2023 Block-STM: Scaling Blockchain Execution by Turning Ordering Curse to a Performance Blessing.
Rati Gelashvili, Alexander Spiegelman, Zhuolun Xiang, George Danezis, Zekun Li, Dahlia Malkhi, Yu Xia, Runtian Zhou
2023 Boosting Performance and QoS for Concurrent GPU B+trees by Combining-Based Synchronization.
Weihua Zhang, Chuanlei Zhao, Lu Peng, Yuzhe Lin, Fengzhe Zhang, Yunping Lu
2023 CuPBoP: A Framework to Make CUDA Portable.
Ruobing Han, Jun Chen, Bhanu Garg, Jeffrey Young, Jaewoong Sim, Hyesoon Kim
2023 DSP: Efficient GNN Training with Multiple GPUs.
Zhenkun Cai, Qihui Zhou, Xiao Yan, Da Zheng, Xiang Song, Chenguang Zheng, James Cheng, George Karypis
2023 Dynamic N:M Fine-Grained Structured Sparse Attention Mechanism.
Zhaodong Chen, Zheng Qu, Yuying Quan, Liu Liu, Yufei Ding, Yuan Xie
2023 Efficient All-Reduce for Distributed DNN Training in Optical Interconnect Systems.
Fei Dai, Yawen Chen, Zhiyi Huang, Haibo Zhang, Fangfang Zhang
2023 Efficient Direct Convolution Using Long SIMD Instructions.
Alexandre de Limas Santana, Adrià Armejach, Marc Casas
2023 Elastic Averaging for Efficient Pipelined DNN Training.
Zihao Chen, Chen Xu, Weining Qian, Aoying Zhou
2023 End-to-End LU Factorization of Large Matrices on GPUs.
Yang Xia, Peng Jiang, Gagan Agrawal, Rajiv Ramnath
2023 Exploring the Use of WebAssembly in HPC.
Mohak Chadha, Nils Krueger, Jophin John, Anshul Jindal, Michael Gerndt, Shajulin Benedict
2023 Fast Parallel Exact Inference on Bayesian Networks.
Jiantong Jiang, Zeyi Wen, Atif Bin Mansoor, Ajmal Mian
2023 Fast Symmetric Eigenvalue Decomposition via WY Representation on Tensor Core.
Shaoshuai Zhang, Ruchi Shah, Hiroyuki Ootomo, Rio Yokota, Panruo Wu
2023 Fast and Scalable Channels in Kotlin Coroutines.
Nikita Koval, Dan Alistarh, Roman Elizarov
2023 Generating Fast FFT Kernels on CPUs via FFT-Specific Intrinsics.
Zhihao Li, Haipeng Jia, Yunquan Zhang, Yuyan Sun, Yiwei Zhang, Tun Chen
2023 High-Performance Filters for GPUs.
Hunter McCoy, Steven A. Hofmeyr, Katherine A. Yelick, Prashant Pandey
2023 High-Performance GPU-to-CPU Transpilation and Optimization via High-Level Parallel Constructs.
William S. Moses, Ivan R. Ivanov, Jens Domke, Toshio Endo, Johannes Doerfert, Oleksandr Zinenko
2023 High-Performance and Scalable Agent-Based Simulation with BioDynaMo.
Lukas Breitwieser, Ahmad Hesam, Fons Rademakers, Juan Gómez-Luna, Onur Mutlu
2023 High-Throughput GPU Random Walk with Fine-Tuned Concurrent Query Processing.
Cheng Xu, Chao Li, Pengyu Wang, Xiaofeng Hou, Jing Wang, Shixuan Sun, Minyi Guo, Hanqing Wu, Dongbai Chen, Xiangwen Liu
2023 Improving Energy Saving of One-Sided Matrix Decompositions on CPU-GPU Heterogeneous Systems.
Jieyang Chen, Xin Liang, Kai Zhao, Hadi Zamani Sabzi, Laxmi N. Bhuyan, Zizhong Chen
2023 Learning to Parallelize in a Shared-Memory Environment with Transformers.
Re'em Harel, Yuval Pinter, Gal Oren
2023 Lifetime-Based Optimization for Simulating Quantum Circuits on a New Sunway Supercomputer.
Yaojian Chen, Yong Liu, Xinmin Shi, Jiawei Song, Xin Liu, Lin Gan, Chu Guo, Haohuan Fu, Jie Gao, Dexun Chen, Guangwen Yang
2023 Merchandiser: Data Placement on Heterogeneous Memory for Task-Parallel HPC Applications with Load-Balance Awareness.
Zhen Xie, Jie Liu, Jiajia Li, Dong Li
2023 OpenCilk: A Modular and Extensible Software Infrastructure for Fast Task-Parallel Code.
Tao B. Schardl, I-Ting Angelina Lee
2023 PiPAD: Pipelined and Parallel Dynamic GNN Training on GPUs.
Chunyang Wang, Desen Sun, Yuebin Bai
2023 Practically and Theoretically Efficient Garbage Collection for Multiversioning.
Yuanhao Wei, Guy E. Blelloch, Panagiota Fatourou, Eric Ruppert
2023 Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, PPoPP 2023, Montreal, QC, Canada, 25 February 2023 - 1 March 2023
Maryam Mehri Dehnavi, Milind Kulkarni, Sriram Krishnamoorthy
2023 Provably Fast and Space-Efficient Parallel Biconnectivity.
Xiaojun Dong, Letong Wang, Yan Gu, Yihan Sun
2023 Provably Good Randomized Strategies for Data Placement in Distributed Key-Value Stores.
Zhe Wang, Jinhao Zhao, Kunal Agrawal, He Liu, Meng Xu, Jing Li
2023 Stream-K: Work-Centric Parallel Decomposition for Dense Matrix-Matrix Multiplication on the GPU.
Muhammad Osama, Duane Merrill, Cris Cecka, Michael Garland, John D. Owens
2023 Swift: Expedited Failure Recovery for Large-Scale DNN Training.
Yuchen Zhong, Guangming Sheng, Juncheng Liu, Jinhui Yuan, Chuan Wu
2023 TDC: Towards Extremely Efficient CNNs on GPUs via Hardware-Aware Tucker Decomposition.
Lizhi Xiang, Miao Yin, Chengming Zhang, Aravind Sukumaran-Rajam, P. Sadayappan, Bo Yuan, Dingwen Tao
2023 TGOpt: Redundancy-Aware Optimizations for Temporal Graph Attention Networks.
Yufeng Wang, Charith Mendis
2023 TL4x: Buffered Durable Transactions on Disk as Fast as in Memory.
Gal Assa, Andreia Correia, Pedro Ramalhete, Valerio Schiavoni, Pascal Felber
2023 The ERA Theorem for Safe Memory Reclamation.
Gali Sheffi, Erez Petrank
2023 The State-of-the-Art LCRQ Concurrent Queue Algorithm Does NOT Require CAS2.
Raed Romanov, Nikita Koval
2023 Transactional Composition of Nonblocking Data Structures.
Wentao Cai, Haosen Wen, Michael L. Scott
2023 Unexpected Scaling in Path Copying Trees.
Vitaly Aksenov, Trevor Brown, Alexander Fedorov, Ilya Kokorin
2023 Visibility Algorithms for Dynamic Dependence Analysis and Distributed Coherence.
Michael Bauer, Elliott Slaughter, Sean Treichler, Wonchan Lee, Michael Garland, Alex Aiken
2023 WISE: Predicting the Performance of Sparse Matrix Vector Multiplication with Machine Learning.
Serif Yesil, Azin Heidarshenas, Adam Morrison, Josep Torrellas
2023 iQAN: Fast and Accurate Vector Search with Efficient Intra-Query Parallelism on Multi-Core Architectures.
Zhen Peng, Minjia Zhang, Kai Li, Ruoming Jin, Bin Ren