PPoPP B

56 papers

Year Title / Authors
2016 A high-performance parallel algorithm for nonnegative matrix factorization.
Ramakrishnan Kannan, Grey Ballard, Haesun Park
2016 A programming system for future proofing performance critical libraries.
Li-Wen Chang, Izzat El Hajj, Hee-Seok Kim, Juan Gómez-Luna, Abdul Dakkak, Wen-mei W. Hwu
2016 A scalable lock-free hash table with open addressing.
Jesper Puge Nielsen, Sven Karlsson
2016 A wait-free queue as fast as fetch-and-add.
Chaoran Yang, John M. Mellor-Crummey
2016 AUTOGEN: automatic discovery of cache-oblivious parallel recursive algorithms for solving dynamic programs.
Rezaul Alam Chowdhury, Pramod Ganapathi, Jesmin Jahan Tithi, Charles Bachmeier, Bradley C. Kuszmaul, Charles E. Leiserson, Armando Solar-Lezama, Yuan Tang
2016 Adding approximate counters.
Guy L. Steele Jr., Jean-Baptiste Tristan
2016 Affinity-aware work-stealing for integrated CPU-GPU processors.
Naila Farooqui, Rajkishore Barik, Brian T. Lewis, Tatiana Shpeisman, Karsten Schwan
2016 An interval constrained memory allocator for the Givy GAS runtime.
François Gindraud, Fabrice Rastello, Albert Cohen, François Broquedis
2016 Articulation points guided redundancy elimination for betweenness centrality.
Lei Wang, Fan Yang, Liangji Zhuang, Huimin Cui, Fang Lv, Xiaobing Feng
2016 Be my guest: MCS lock now welcomes guests.
Tianzheng Wang, Milind Chabbi, Hideaki Kimura
2016 Benchmarking weak memory models.
Carl G. Ritson, Scott Owens
2016 CUDA acceleration for Xen virtual machines in infiniband clusters with rCUDA.
Javier Prades, Carlos Reaño, Federico Silla
2016 Causal consistency: beyond memory.
Matthieu Perrin, Achour Mostéfaoui, Claude Jard
2016 Coarse grain parallelization of deep neural networks.
Marc González Tallada
2016 Concurrent hash tables: fast
Tobias Maier, Peter Sanders, Roman Dementiev
2016 Contention-conscious, locality-preserving locks.
Milind Chabbi, John M. Mellor-Crummey
2016 DSMR: a shared and distributed memory algorithm for single-source shortest path problem.
Saeed Maleki, Donald Nguyen, Andrew Lenharth, María Jesús Garzarán, David A. Padua, Keshav Pingali
2016 Data-centric combinatorial optimization of parallel code.
Hao Luo, Guoyang Chen, Pengcheng Li, Chen Ding, Xipeng Shen
2016 Declarative coordination of graph-based parallel programs.
Flávio Cruz, Ricardo Rocha, Seth Copen Goldstein
2016 Distributed Halide.
Tyler Denniston, Shoaib Kamil, Saman P. Amarasinghe
2016 DomLock: a new multi-granularity locking technique for hierarchies.
Saurabh Kalikar, Rupesh Nasre
2016 Drinking from both glasses: combining pessimistic and optimistic tracking of cross-thread dependences.
Man Cao, Minjia Zhang, Aritra Sengupta, Michael D. Bond
2016 ESTIMA: extrapolating scalability of in-memory applications.
Georgios Chatzopoulos, Aleksandar Dragojevic, Rachid Guerraoui
2016 Effect of portable fine-grained locality on energy efficiency and performance in concurrent search trees.
Ibrahim Umar, Otto J. Anshus, Phuong Hoai Ha
2016 Efficient distributed workstealing via matchmaking.
Hrushit Parikh, Vinit Deodhar, Ada Gavrilovska, Santosh Pande
2016 Exploiting accelerators for efficient high dimensional similarity search.
Sandeep R. Agrawal, Christopher M. Dee, Alvin R. Lebeck
2016 GPU multisplit.
Saman Ashkiani, Andrew A. Davidson, Ulrich Meyer, John D. Owens
2016 Generic messages: capability-based shared memory parallelism for event-loop systems.
Luca Salucci, Daniele Bonetta, Stefan Marr, Walter Binder
2016 Grain graphs: OpenMP performance analysis made easy.
Ananya Muddukrishna, Peter A. Jonsson, Artur Podobas, Mats Brorsson
2016 Gunrock: a high-performance graph processing library on the GPU.
Yangzihao Wang, Andrew A. Davidson, Yuechao Pan, Yuduo Wu, Andy Riffel, John D. Owens
2016 High performance model based image reconstruction.
Xiao Wang, Amit Sabne, Sherman J. Kisner, Anand Raghunathan, Charles A. Bouman, Samuel P. Midkiff
2016 Hybrid CPU-GPU scheduling and execution of tree traversals.
Jianqiao Liu, Nikhil Hegde, Milind Kulkarni
2016 Improving efficacy of internal binary search trees using local recovery.
Arunmoezhi Ramachandran, Neeraj Mittal
2016 Keep calm and react with foresight: strategies for low-latency and energy-efficient elastic data stream processing.
Tiziano De Matteis, Gabriele Mencagli
2016 Lease/release: architectural support for scaling contended data structures.
Syed Kamran Haider, William Hasenplaugh, Dan Alistarh
2016 Merge-based sparse matrix-vector multiplication (SpMV) using the CSR storage format.
Duane Merrill, Michael Garland
2016 Multi-core on-the-fly SCC decomposition.
Vincent Bloemen, Alfons Laarman, Jaco van de Pol
2016 NUMA-aware scheduling and memory allocation for data-flow task-parallel applications.
Andi Drebes, Antoniu Pop, Karine Heydemann, Nathalie Drach, Albert Cohen
2016 OPR: deterministic group replay for one-sided communication.
Xuehai Qian, Koushik Sen, Paul Hargrove, Costin Iancu
2016 On designing NUMA-aware concurrency control for scalable transactional memory.
Mohamed Mohamedin, Roberto Palmieri, Sebastiano Peluso, Binoy Ravindran
2016 On ordering transaction commit.
Mohamed M. Saad, Roberto Palmieri, Binoy Ravindran
2016 Optimistic concurrency with OPTIK.
Rachid Guerraoui, Vasileios Trigonakis
2016 Parallel type-checking with haskell using saturating LVars and stream generators.
Ryan R. Newton, Ömer S. Agacan, Peter P. Fogg, Sam Tobin-Hochstadt
2016 Preemption-aware planning on big-data systems.
Marco Rabozzi, Matteo Mazzucchelli, Roberto Cordone, Giovanni Matteo Fumarola, Marco D. Santambrogio
2016 Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2016, Barcelona, Spain, March 12-16, 2016
Rafael Asenjo, Tim Harris
2016 Production-guided concurrency debugging.
Nuno Machado, Brandon Lucia, Luís E. T. Rodrigues
2016 Refined transactional lock elision.
Dave Dice, Alex Kogan, Yossi Lev
2016 SPIRIT: a runtime system for distributed irregular tree applications.
Nikhil Hegde, Jianqiao Liu, Milind Kulkarni
2016 Samsara parallel: a non-BSP parallel-in-time model.
Yifeng Chen, Kun Huang, Bei Wang, Guohui Li, Xiang Cui
2016 Scalable adaptive NUMA-aware lock: combining local locking and remote locking for efficient concurrency.
Mingzhe Zhang, Francis C. M. Lau, Cho-Li Wang, Luwei Cheng, Haibo Chen
2016 The virtues of conflict: analysing modern concurrency.
Ganesh Narayanaswamy, Saurabh Joshi, Daniel Kroening
2016 Tidex: a mutual exclusion lock.
Pedro Ramalhete, Andreia Correia
2016 Unifying fixed code and fixed data mapping of load-imbalanced pipelined loops.
Aristeidis Mastoras, Thomas R. Gross
2016 User-assisted storage reuse determination for dynamic task graphs.
Mehmet Can Kurt, Bin Ren, Sriram Krishnamoorthy, Gagan Agrawal
2016 Verification of MPI Java programs using software model checking.
Waqas ur Rehman, Muhammad Sohaib Ayub, Junaid Haroon Siddiqui
2016 Work stealing for interactive services to meet target latency.
Jing Li, Kunal Agrawal, Sameh Elnikety, Yuxiong He, I-Ting Angelina Lee, Chenyang Lu, Kathryn S. McKinley