PPoPP - RankMe

61 papers

Year	Title / Authors
2019	A GPU memory efficient speed-up scheme for training ultra-deep neural networks: poster. Jinrong Guo, Wantao Liu, Wang Wang, Qu Lu, Songlin Hu, Jizhong Han, Ruixuan Li
2019	A coordinated tiling and batching framework for efficient GEMM on GPUs. Xiuhong Li, Yun Liang, Shengen Yan, Liancheng Jia, Yinghan Li
2019	A distributed hypervisor for resource aggregation: poster. Yubin Chen, Zhuocheng Ding, Jin Zhang, Yun Wang, Zhengwei Qi, Haibing Guan
2019	A pattern based algorithmic autotuner for graph processing on GPUs. Ke Meng, Jiajia Li, Guangming Tan, Ninghui Sun
2019	A round-efficient distributed betweenness centrality algorithm. Loc Hoang, Matteo Pontecorvi, Roshan Dathathri, Gurbinder Gill, Bozhi You, Keshav Pingali, Vijaya Ramachandran
2019	A specialized B-tree for concurrent datalog evaluation. Herbert Jordan, Pavle Subotic, David Zhao, Bernhard Scholz
2019	Accelerating distributed stochastic gradient descent with adaptive periodic parameter averaging: poster. Peng Jiang, Gagan Agrawal
2019	Adaptive sparse matrix-matrix multiplication on the GPU. Martin Winter, Daniel Mlakar, Rhaleb Zayer, Hans-Peter Seidel, Markus Steinberger
2019	Adaptive sparse tiling for sparse matrix multiplication. Changwan Hong, Aravind Sukumaran-Rajam, Israt Nisa, Kunal Singh, P. Sadayappan
2019	Automated multi-dimensional elasticity for streaming runtimes: poster. Xiang Ni, Scott Schneider, Raju Pavuluri, Jonathan Kaus, Kun-Lung Wu
2019	BASMAT: bottleneck-aware sparse matrix-vector multiplication auto-tuning on GPGPUs. Athena Elafrou, Georgios I. Goumas, Nectarios Koziris
2019	Beyond human-level accuracy: computational challenges in deep learning. Joel Hestness, Newsha Ardalani, Gregory F. Diamos
2019	Blockchain abstract data type: poster. Emmanuelle Anceaume, Antonella Del Pozzo, Romaric Ludinard, Maria Potop-Butucaru, Sara Tucci Piergiovanni
2019	Building parallel programming language constructs in the AbleC extensible C compiler framework: a PPoPP tutorial. Travis Carlson, Eric Van Wyk
2019	Checking linearizability using hitting families. Burcu Kulahcioglu Ozkan, Rupak Majumdar, Filip Niksic
2019	Compiler-assisted adaptive program scheduling in big.LITTLE systems: poster. Marcelo Novaes, Vinicius Petrucci, Abdoulaye Gamatié, Fernando Magno Quintão Pereira
2019	Corrected trees for reliable group communication. Martin Küttler, Maksym Planeta, Jan Bierbaum, Carsten Weinhold, Hermann Härtig, Amnon Barak, Torsten Hoefler
2019	Creating repeatable, reusable experimentation pipelines with popper: tutorial. Ivo Jimenez, Jay F. Lofstead, Carlos Maltzahn
2019	CuLDA_CGS: solving large-scale LDA problems on GPUs. Xiaolong Xie, Yun Liang, Xiuhong Li, Wei Tan
2019	Data-flow/dependence profiling for structured transformations. Fabian Gruber, Manuel Selva, Diogo Sampaio, Christophe Guillon, Antoine Moynault, Louis-Noël Pouchet, Fabrice Rastello
2019	Efficient race detection with futures. Robert Utterback, Kunal Agrawal, Jeremy T. Fineman, I-Ting Angelina Lee
2019	Encapsulated open nesting for STM: fine-grained higher-level conflict detection. Martin Bättig, Thomas R. Gross
2019	Engineering a high-performance GPU B-Tree. Muhammad A. Awad, Saman Ashkiani, Rob Johnson, Martin Farach-Colton, John D. Owens
2019	Exploiting the input sparsity to accelerate deep neural networks: poster. Xiao Dong, Lei Liu, Guangli Li, Jiansong Li, Peng Zhao, Xueying Wang, Xiaobing Feng
2019	GOPipe: a granularity-oblivious programming framework for pipelined stencil executions on GPU. Chanyoung Oh, Zhen Zheng, Xipeng Shen, Jidong Zhai, Youngmin Yi
2019	GPOP: a cache and memory-efficient framework for graph processing over partitions. Kartik Lakhotia, Rajgopal Kannan, Sourav Pati, Viktor K. Prasanna
2019	GPU-based 3D cryo-EM reconstruction with key-value streams: poster. Kunpeng Wang, Shizhen Xu, Hongkun Yu, Haohuan Fu, Guangwen Yang
2019	Harmonia: a high throughput B+tree for GPUs. Zhaofeng Yan, Yuzhe Lin, Lu Peng, Weihua Zhang
2019	High performance distributed deep learning: a beginner's guide. Dhabaleswar K. Panda, Ammar Ahmad Awan, Hari Subramoni
2019	High-throughput image alignment for connectomics using frugal snap judgments: poster. Tim Kaler, Brian Wheatman, Sarah Wooders
2019	Implementing parallel and concurrent tree structures. Yihan Sun, Guy E. Blelloch
2019	Incremental flattening for nested data parallelism. Troels Henriksen, Frederik Thorøe, Martin Elsman, Cosmin E. Oancea
2019	LOFT: lock-free transactional data structures. Avner Elizarov, Guy Golan-Gueta, Erez Petrank
2019	Leveraging hardware TM in Haskell. Ryan Yates, Michael L. Scott
2019	Lightweight hardware transactional memory profiling. Qingsen Wang, Pengfei Su, Milind Chabbi, Xu Liu
2019	Lock-free channels for programming via communicating sequential processes: poster. Nikita Koval, Dan Alistarh, Roman Elizarov
2019	Making concurrent algorithms detectable: poster. Naama Ben-David, Guy E. Blelloch, Michal Friedman, Yuanhao Wei
2019	Managing application parallelism via parallel efficiency regulation: poster. Sharanyan Srikanthan, Princeton Ferro, Sayak Chakraborti, Sandhya Dwarkadas
2019	Modular transactions: bounding mixed races in space and time. Brijesh Dongol, Radha Jagadeesan, James Riely
2019	Optimizing GPU programs by register demotion: poster. Putt Sakdhnagool, Amit Sabne, Rudolf Eigenmann
2019	Optimizing computation-communication overlap in asynchronous task-based programs: poster. Emilio Castillo, Nikhil Jain, Marc Casas, Miquel Moretó, Martin Schulz, Ramón Beivide, Mateo Valero, Abhinav Bhatele
2019	Optimizing graph processing on GPUs using approximate computing: poster. Somesh Singh, Rupesh Nasre
2019	Performance portable C++ programming with RAJA. David Beckingsale, Richard D. Hornung, Tom Scogland, Arturo Vargas
2019	Proactive work stealing for futures. Kyle Singer, Yifan Xu, I-Ting Angelina Lee
2019	Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2019, Washington, DC, USA, February 16-20, 2019. Jeffrey K. Hollingsworth, Idit Keidar
2019	Processing transactions in a predefined order. Mohamed M. Saad, Masoomeh Javidi Kishi, Shihao Jing, Sandeep Hans, Roberto Palmieri
2019	Profiling based out-of-core hybrid method for large neural networks: poster. Yuki Ito, Haruki Imai, Tung D. Le, Yasushi Negishi, Kiyokuni Kawachiya, Ryo Matsumiya, Toshio Endo
2019	Programming quantum computers: a primer with IBM Q and D-Wave exercises. Frank Mueller, Greg Byrd, Patrick Dreher
2019	Provably and practically efficient granularity control. Umut A. Acar, Vitaly Aksenov, Arthur Charguéraud, Mike Rainey
2019	QTLS: high-performance TLS asynchronous offload framework with Intel® QuickAssist technology. Xiaokang Hu, Changzheng Wei, Jian Li, Brian Will, Ping Yu, Lu Gong, Haibing Guan
2019	S-EnKF: co-designing for scalable ensemble Kalman filter. Junmin Xiao, Shijie Wang, Weiqiang Wan, Xuehai Hong, Guangming Tan
2019	SEP-graph: finding shortest execution paths for graph processing under a hybrid framework on GPU. Hao Wang, Liang Geng, Rubao Lee, Kaixi Hou, Yanfeng Zhang, Xiaodong Zhang
2019	Scheduling HPC workloads on heterogeneous-ISA architectures: poster. Mohamed Lamine Karaoui, Anthony Carno, Robert Lyerly, Sang-Hoon Kim, Pierre Olivier, Changwoo Min, Binoy Ravindran
2019	Semantics-aware scheduling policies for synchronization determinism. Qi Zhao, Zhengyi Qiu, Guoliang Jin
2019	Stretching the capacity of hardware transactional memory in IBM POWER architectures. Ricardo Filipe, Shady Issa, Paolo Romano, João Barreto
2019	T-thinker: a task-centric distributed framework for compute-intensive divide-and-conquer algorithms. Da Yan, Guimu Guo, Md Mashiur Rahman Chowdhury, M. Tamer Özsu, John C. S. Lui, Weida Tan
2019	Throughput-oriented GPU memory allocation. Isaac Gelado, Michael Garland
2019	Toward efficient architecture-independent algorithms for dynamic programs: poster. Mohammad Mahdi Javanmard, Pramod Ganapathi, Rathish Das, Zafar Ahmad, Stephen L. Tschudi, Rezaul Chowdhury
2019	Transitive joins: a sound and efficient online deadlock-avoidance policy. Caleb Voss, Tiago Cogumbreiro, Vivek Sarkar
2019	VEBO: a vertex- and edge-balanced ordering heuristic to load balance parallel graph processing. Jiawen Sun, Hans Vandierendonck, Dimitrios S. Nikolopoulos
2019	Verifying C11 programs operationally. Simon Doherty, Brijesh Dongol, Heike Wehrheim, John Derrick