PPoPP B

49 papers

Year Title / Authors
2021 A fast work-efficient SSSP algorithm for GPUs.
Kai Wang, Don Fussell, Calvin Lin
2021 A lock-free relaxed concurrent queue for fast work distribution.
Giorgos Kappes, Stergios V. Anastasiadis
2021 A more pragmatic implementation of the lock-free, ordered, linked list.
Jesper Larsson Träff, Manuel Pöter
2021 A novel memory-efficient deep learning training framework via error-bounded lossy compression.
Sian Jin, Guanpeng Li, Shuaiwen Leon Song, Dingwen Tao
2021 Advanced synchronization techniques for task-based runtime systems.
David Álvarez, Kevin Sala, Marcos Maroñas, Aleix Roca, Vicenç Beltran
2021 An efficient uncertain graph processing framework for heterogeneous architectures.
Heng Zhang, Lingda Li, Donglin Zhuang, Rui Liu, Shuang Song, Dingwen Tao, Yanjun Wu, Shuaiwen Leon Song
2021 An ownership policy and deadlock detector for promises.
Caleb Voss, Vivek Sarkar
2021 ApproxTuner: a compiler and runtime system for adaptive approximations.
Hashim Sharif, Yifan Zhao, Maria Kotsifakou, Akash Kothari, Ben Schreiber, Elizabeth Wang, Yasmin Sarita, Nathan Zhao, Keyur Joshi, Vikram S. Adve, Sasa Misailovic, Sarita V. Adve
2021 Are dynamic memory managers on GPUs slow?: a survey and benchmarks.
Martin Winter, Mathias Parger, Daniel Mlakar, Markus Steinberger
2021 Asynchrony versus bulk-synchrony for a generalized N-body problem from genomics.
Marquita Ellis, Aydin Buluç, Katherine A. Yelick
2021 BiPart: a parallel and deterministic hypergraph partitioner.
Sepideh Maleki, Udit Agarwal, Martin Burtscher, Keshav Pingali
2021 Bundled references: an abstraction for highly-concurrent linearizable range queries.
Jacob Nelson, Ahmed Hassan, Roberto Palmieri
2021 Compiler support for near data computing.
Mahmut Taylan Kandemir, Jihyun Ryoo, Xulong Tang, Mustafa Karaköy
2021 Constant-time snapshots with applications to concurrent data structures.
Yuanhao Wei, Naama Ben-David, Guy E. Blelloch, Panagiota Fatourou, Eric Ruppert, Yihan Sun
2021 Corder: cache-aware reordering for optimizing graph analytics.
Yuang Chen, Yeh-Ching Chung
2021 DAPPLE: a pipelined data parallel approach for training large models.
Shiqing Fan, Yi Rong, Chen Meng, Zongyan Cao, Siyu Wang, Zhen Zheng, Chuan Wu, Guoping Long, Jun Yang, Lixue Xia, Lansong Diao, Xiaoyong Liu, Wei Lin
2021 DFOGraph: an I/O- and communication-efficient system for distributed fully-out-of-core graph processing.
Jiping Yu, Wei Qin, Xiaowei Zhu, Zhenbo Sun, Jianqiang Huang, Xiaohan Li, Wenguang Chen
2021 Dynamic scaling for low-precision learning.
Ruobing Han, Min Si, James Demmel, Yang You
2021 EGEMM-TC: accelerating scientific computing on tensor cores with extended precision.
Boyuan Feng, Yuke Wang, Guoyang Chen, Weifeng Zhang, Yuan Xie, Yufei Ding
2021 Efficient algorithms for persistent transactional memory.
Pedro Ramalhete, Andreia Correia, Pascal Felber
2021 Efficiently reclaiming memory in concurrent search data structures while bounding wasted memory.
Daniel Solomon, Adam Morrison
2021 Efficiently running SpMV on long vector architectures.
Constantino Gómez, Filippo Mantovani, Erich Focht, Marc Casas
2021 Exploring deep reuse in Winograd CNN inference.
Ruofan Wu, Feng Zhang, Zhen Zheng, Xiaoyong Du, Xipeng Shen
2021 Extending MapReduce framework with locality keys.
Yifeng Chen, Bei Wang, Xiaolin Wang
2021 Extracting clean performance models from tainted programs.
Marcin Copik, Alexandru Calotoiu, Tobias Grosser, Nicolas Wicki, Felix Wolf, Torsten Hoefler
2021 FFT blitz: the tensor cores strike back.
Sultan Durrani, Muhammad Saad Chughtai, Abdul Dakkak, Wen-Mei Hwu, Lawrence Rauchwerger
2021 GPTune: multitask learning for autotuning exascale applications.
Yang Liu, Wissam M. Sid-Lakhdar, Osni Marques, Xinran Zhu, Chang Meng, James Weldon Demmel, Xiaoye S. Li
2021 I/O lower bounds for auto-tuning of convolutions in CNNs.
Xiaoyang Zhang, Junmin Xiao, Guangming Tan
2021 Improving communication by optimizing on-node data movement with data layout.
Tuowen Zhao, Mary W. Hall, Hans Johansen, Samuel Williams
2021 In-situ workflow auto-tuning through combining component models.
Tong Shu, Yanfei Guo, Justin M. Wozniak, Xiaoning Ding, Ian T. Foster, Tahsin M. Kurç
2021 Investigating the semantics of futures in transactional memory systems.
Jingna Zeng, Shady Issa, Paolo Romano, Luís E. T. Rodrigues, Seif Haridi
2021 Lightweight preemptive user-level threads.
Shumpei Shiina, Shintaro Iwasaki, Kenjiro Taura, Pavan Balaji
2021 Modernizing parallel code with pattern analysis.
Roberto Castañeda Lozano, Murray Cole, Björn Franke
2021 NBR: neutralization based reclamation.
Ajay Singh, Trevor Brown, Ali José Mashtizadeh
2021 On group mutual exclusion for dynamic systems.
Shreyas Gokhale, Sahil Dhoked, Neeraj Mittal
2021 On the parallel I/O optimality of linear algebra kernels: near-optimal LU factorization.
Grzegorz Kwasniewski, Tal Ben-Nun, Alexandros Nikolaos Ziogas, Timo Schneider, Maciej Besta, Torsten Hoefler
2021 OrcGC: automatic lock-free memory reclamation.
Andreia Correia, Pedro Ramalhete, Pascal Felber
2021 PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Virtual Event, Republic of Korea, February 27 - March 3, 2021
Jaejin Lee, Erez Petrank
2021 Parallel binary code analysis.
Xiaozhu Meng, Jonathon M. Anderson, John M. Mellor-Crummey, Mark W. Krentel, Barton P. Miller, Srdan Milakovic
2021 Reasoning about recursive tree traversals.
Yanjun Wang, Jinwei Liu, Dalin Zhang, Xiaokang Qiu
2021 Scaling implicit parallelism via dynamic control replication.
Michael Bauer, Wonchan Lee, Elliott Slaughter, Zhihao Jia, Mario Di Renzo, Manolis Papadakis, Galen M. Shipman, Patrick S. McCormick, Michael Garland, Alex Aiken
2021 ShadowVM: accelerating data plane for data analytics with bare metal CPUs and GPUs.
Zhifang Li, Mingcong Han, Shangwei Wu, Chuliang Weng
2021 Simplifying low-level GPU programming with GAS.
Da Yan, Wei Wang, Xiaowen Chu
2021 Sparta: high-performance, element-wise sparse tensor contraction on heterogeneous memory.
Jiawen Liu, Jie Ren, Roberto Gioiosa, Dong Li, Jiajia Li
2021 Synthesizing optimal collective algorithms.
Zixian Cai, Zhengyang Liu, Saeed Maleki, Madanlal Musuvathi, Todd Mytkowicz, Jacob Nelson, Olli Saarikivi
2021 TurboTransformers: an efficient GPU serving system for transformer models.
Jiarui Fang, Yang Yu, Chengduo Zhao, Jie Zhou
2021 Understanding a program's resiliency through error propagation.
Zhimin Li, Harshitha Menon, Kathryn M. Mohror, Peer-Timo Bremer, Yarden Livnat, Valerio Pascucci
2021 Understanding and bridging the gaps in current GNN performance optimizations.
Kezhao Huang, Jidong Zhai, Zhen Zheng, Youngmin Yi, Xipeng Shen
2021 Verifying C11-style weak memory libraries.
Sadegh Dalvandi, Brijesh Dongol