PPoPP - RankMe

56 papers

Year	Title / Authors
2016	A high-performance parallel algorithm for nonnegative matrix factorization. Ramakrishnan Kannan, Grey Ballard, Haesun Park
2016	A programming system for future proofing performance critical libraries. Li-Wen Chang, Izzat El Hajj, Hee-Seok Kim, Juan Gómez-Luna, Abdul Dakkak, Wen-mei W. Hwu
2016	A scalable lock-free hash table with open addressing. Jesper Puge Nielsen, Sven Karlsson
2016	A wait-free queue as fast as fetch-and-add. Chaoran Yang, John M. Mellor-Crummey
2016	AUTOGEN: automatic discovery of cache-oblivious parallel recursive algorithms for solving dynamic programs. Rezaul Alam Chowdhury, Pramod Ganapathi, Jesmin Jahan Tithi, Charles Bachmeier, Bradley C. Kuszmaul, Charles E. Leiserson, Armando Solar-Lezama, Yuan Tang
2016	Adding approximate counters. Guy L. Steele Jr., Jean-Baptiste Tristan
2016	Affinity-aware work-stealing for integrated CPU-GPU processors. Naila Farooqui, Rajkishore Barik, Brian T. Lewis, Tatiana Shpeisman, Karsten Schwan
2016	An interval constrained memory allocator for the Givy GAS runtime. François Gindraud, Fabrice Rastello, Albert Cohen, François Broquedis
2016	Articulation points guided redundancy elimination for betweenness centrality. Lei Wang, Fan Yang, Liangji Zhuang, Huimin Cui, Fang Lv, Xiaobing Feng
2016	Be my guest: MCS lock now welcomes guests. Tianzheng Wang, Milind Chabbi, Hideaki Kimura
2016	Benchmarking weak memory models. Carl G. Ritson, Scott Owens
2016	CUDA acceleration for Xen virtual machines in InfiniBand clusters with rCUDA. Javier Prades, Carlos Reaño, Federico Silla
2016	Causal consistency: beyond memory. Matthieu Perrin, Achour Mostéfaoui, Claude Jard
2016	Coarse grain parallelization of deep neural networks. Marc González Tallada
2016	Concurrent hash tables: fast and general?(!). Tobias Maier, Peter Sanders, Roman Dementiev
2016	Contention-conscious, locality-preserving locks. Milind Chabbi, John M. Mellor-Crummey
2016	DSMR: a shared and distributed memory algorithm for single-source shortest path problem. Saeed Maleki, Donald Nguyen, Andrew Lenharth, María Jesús Garzarán, David A. Padua, Keshav Pingali
2016	Data-centric combinatorial optimization of parallel code. Hao Luo, Guoyang Chen, Pengcheng Li, Chen Ding, Xipeng Shen
2016	Declarative coordination of graph-based parallel programs. Flávio Cruz, Ricardo Rocha, Seth Copen Goldstein
2016	Distributed Halide. Tyler Denniston, Shoaib Kamil, Saman P. Amarasinghe
2016	DomLock: a new multi-granularity locking technique for hierarchies. Saurabh Kalikar, Rupesh Nasre
2016	Drinking from both glasses: combining pessimistic and optimistic tracking of cross-thread dependences. Man Cao, Minjia Zhang, Aritra Sengupta, Michael D. Bond
2016	ESTIMA: extrapolating scalability of in-memory applications. Georgios Chatzopoulos, Aleksandar Dragojevic, Rachid Guerraoui
2016	Effect of portable fine-grained locality on energy efficiency and performance in concurrent search trees. Ibrahim Umar, Otto J. Anshus, Phuong Hoai Ha
2016	Efficient distributed workstealing via matchmaking. Hrushit Parikh, Vinit Deodhar, Ada Gavrilovska, Santosh Pande
2016	Exploiting accelerators for efficient high dimensional similarity search. Sandeep R. Agrawal, Christopher M. Dee, Alvin R. Lebeck
2016	GPU multisplit. Saman Ashkiani, Andrew A. Davidson, Ulrich Meyer, John D. Owens
2016	Generic messages: capability-based shared memory parallelism for event-loop systems. Luca Salucci, Daniele Bonetta, Stefan Marr, Walter Binder
2016	Grain graphs: OpenMP performance analysis made easy. Ananya Muddukrishna, Peter A. Jonsson, Artur Podobas, Mats Brorsson
2016	Gunrock: a high-performance graph processing library on the GPU. Yangzihao Wang, Andrew A. Davidson, Yuechao Pan, Yuduo Wu, Andy Riffel, John D. Owens
2016	High performance model based image reconstruction. Xiao Wang, Amit Sabne, Sherman J. Kisner, Anand Raghunathan, Charles A. Bouman, Samuel P. Midkiff
2016	Hybrid CPU-GPU scheduling and execution of tree traversals. Jianqiao Liu, Nikhil Hegde, Milind Kulkarni
2016	Improving efficacy of internal binary search trees using local recovery. Arunmoezhi Ramachandran, Neeraj Mittal
2016	Keep calm and react with foresight: strategies for low-latency and energy-efficient elastic data stream processing. Tiziano De Matteis, Gabriele Mencagli
2016	Lease/release: architectural support for scaling contended data structures. Syed Kamran Haider, William Hasenplaugh, Dan Alistarh
2016	Merge-based sparse matrix-vector multiplication (SpMV) using the CSR storage format. Duane Merrill, Michael Garland
2016	Multi-core on-the-fly SCC decomposition. Vincent Bloemen, Alfons Laarman, Jaco van de Pol
2016	NUMA-aware scheduling and memory allocation for data-flow task-parallel applications. Andi Drebes, Antoniu Pop, Karine Heydemann, Nathalie Drach, Albert Cohen
2016	OPR: deterministic group replay for one-sided communication. Xuehai Qian, Koushik Sen, Paul Hargrove, Costin Iancu
2016	On designing NUMA-aware concurrency control for scalable transactional memory. Mohamed Mohamedin, Roberto Palmieri, Sebastiano Peluso, Binoy Ravindran
2016	On ordering transaction commit. Mohamed M. Saad, Roberto Palmieri, Binoy Ravindran
2016	Optimistic concurrency with OPTIK. Rachid Guerraoui, Vasileios Trigonakis
2016	Parallel type-checking with Haskell using saturating LVars and stream generators. Ryan R. Newton, Ömer S. Agacan, Peter P. Fogg, Sam Tobin-Hochstadt
2016	Preemption-aware planning on big-data systems. Marco Rabozzi, Matteo Mazzucchelli, Roberto Cordone, Giovanni Matteo Fumarola, Marco D. Santambrogio
2016	Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2016, Barcelona, Spain, March 12-16, 2016. Rafael Asenjo, Tim Harris
2016	Production-guided concurrency debugging. Nuno Machado, Brandon Lucia, Luís E. T. Rodrigues
2016	Refined transactional lock elision. Dave Dice, Alex Kogan, Yossi Lev
2016	SPIRIT: a runtime system for distributed irregular tree applications. Nikhil Hegde, Jianqiao Liu, Milind Kulkarni
2016	Samsara parallel: a non-BSP parallel-in-time model. Yifeng Chen, Kun Huang, Bei Wang, Guohui Li, Xiang Cui
2016	Scalable adaptive NUMA-aware lock: combining local locking and remote locking for efficient concurrency. Mingzhe Zhang, Francis C. M. Lau, Cho-Li Wang, Luwei Cheng, Haibo Chen
2016	The virtues of conflict: analysing modern concurrency. Ganesh Narayanaswamy, Saurabh Joshi, Daniel Kroening
2016	Tidex: a mutual exclusion lock. Pedro Ramalhete, Andreia Correia
2016	Unifying fixed code and fixed data mapping of load-imbalanced pipelined loops. Aristeidis Mastoras, Thomas R. Gross
2016	User-assisted storage reuse determination for dynamic task graphs. Mehmet Can Kurt, Bin Ren, Sriram Krishnamoorthy, Gagan Agrawal
2016	Verification of MPI Java programs using software model checking. Waqas ur Rehman, Muhammad Sohaib Ayub, Junaid Haroon Siddiqui
2016	Work stealing for interactive services to meet target latency. Jing Li, Kunal Agrawal, Sameh Elnikety, Yuxiong He, I-Ting Angelina Lee, Chenyang Lu, Kathryn S. McKinley