PPoPP - RankMe

45 papers

Year	Title / Authors
2015	A collection-oriented programming model for performance portability. Saurav Muralidharan, Michael Garland, Bryan Catanzaro, Albert Sidelnik, Mary W. Hall
2015	A framework for practical parallel fast matrix multiplication. Austin R. Benson, Grey Ballard
2015	A hierarchical approach to reducing communication in parallel graph algorithms. Harshvardhan, Nancy M. Amato, Lawrence Rauchwerger
2015	A library for portable and composable data locality optimizations for NUMA systems. Zoltan Majó, Thomas R. Gross
2015	A parallel algorithm for global states enumeration in concurrent systems. Yen-Jung Chang, Vijay K. Garg
2015	A programming model and runtime system for significance-aware energy-efficient computing. Vassilis Vassiliadis, Konstantinos Parasyris, Charalambos Chalios, Christos D. Antonopoulos, Spyros Lalis, Nikolaos Bellas, Hans Vandierendonck, Dimitrios S. Nikolopoulos
2015	An OpenACC-based unified programming model for multi-accelerator systems. Jungwon Kim, Seyong Lee, Jeffrey S. Vetter
2015	Are web applications ready for parallelism? Cosmin Radoi, Stephan Herhut, Jaswanth Sreeram, Danny Dig
2015	Automatic scalable atomicity via semantic locking. Guy Golan-Gueta, G. Ramalingam, Mooly Sagiv, Eran Yahav
2015	Barrier elision for production parallel programs. Milind Chabbi, Wim Lavrijsen, Wibe de Jong, Koushik Sen, John M. Mellor-Crummey, Costin Iancu
2015	CASTLE: fast concurrent internal binary search tree using edge-based locking. Arunmoezhi Ramachandran, Neeraj Mittal
2015	Cache-oblivious wavefront: improving parallelism of recursive dynamic programming algorithms without losing cache-efficiency. Yuan Tang, Ronghui You, Haibin Kan, Jesmin Jahan Tithi, Pramod Ganapathi, Rezaul Alam Chowdhury
2015	Combining phase identification and statistic modeling for automated parallel benchmark generation. Ye Jin, Mingliang Liu, Xiaosong Ma, Qing Liu, Jeremy Logan, Norbert Podhorszki, Jong Youl Choi, Scott Klasky
2015	Decoupled load balancing. Olga Pearce, Todd Gamblin, Bronis R. de Supinski, Martin Schulz, Nancy M. Amato
2015	Diagnosing the causes and severity of one-sided message contention. Nathan R. Tallent, Abhinav Vishnu, Hubertus Van Dam, Jeff Daily, Darren J. Kerbyson, Adolfy Hoisie
2015	Distributed memory code generation for mixed Irregular/Regular computations. Mahesh Ravishankar, Roshan Dathathri, Venmugil Elango, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, P. Sadayappan
2015	Dynamic deadlock verification for general barrier synchronisation. Tiago Cogumbreiro, Raymond Hu, Francisco Martins, Nobuko Yoshida
2015	Efficient and reasonable object-oriented concurrency. Scott West, Sebastian Nanz, Bertrand Meyer
2015	Fence placement for legacy data-race-free programs via synchronization read detection. Andrew J. McPherson, Vijay Nagarajan, Susmit Sarkar, Marcelo Cintra
2015	GStream: a graph streaming processing method for large-scale graphs on GPUs. Hyunseok Seo, Jinwook Kim, Min-Soo Kim
2015	Gunrock: a high-performance graph processing library on the GPU. Yangzihao Wang, Andrew A. Davidson, Yuechao Pan, Yuduo Wu, Andy Riffel, John D. Owens
2015	High performance locks for multi-level NUMA systems. Milind Chabbi, Michael W. Fagan, John M. Mellor-Crummey
2015	JAWS: a JavaScript framework for adaptive CPU-GPU work sharing. Xianglan Piao, Channoh Kim, Younghwan Oh, Huiying Li, Jincheon Kim, Hanjun Kim, Jae W. Lee
2015	Low-overhead software transactional memory with progress guarantees and strong semantics. Minjia Zhang, Jipeng Huang, Man Cao, Michael D. Bond
2015	MPI+Threads: runtime contention and remedies. Abdelhalim Amer, Huiwei Lu, Yanjie Wei, Pavan Balaji, Satoshi Matsuoka
2015	More than you ever wanted to know about synchronization: synchrobench, measuring the impact of the synchronization on concurrent algorithms. Vincent Gramoli
2015	NUMA-aware graph-structured analytics. Kaiyuan Zhang, Rong Chen, Haibo Chen
2015	On optimizing machine learning workloads via kernel fusion. Arash Ashari, Shirish Tatikonda, Matthias Boehm, Berthold Reinwald, Keith Campbell, John Keenleyside, P. Sadayappan
2015	Optimization of asynchronous graph processing on GPU with hybrid coloring model. Xuanhua Shi, Junling Liang, Sheng Di, Bingsheng He, Hai Jin, Lu Lu, Zhixiang Wang, Xuan Luo, Jianlong Zhong
2015	PLUTO+: near-complete modeling of affine transformations for parallelism and locality. Aravind Acharya, Uday Bondhugula
2015	Performance implications of dynamic memory allocators on transactional memory systems. Alexandro Baldassin, Edson Borin, Guido Araujo
2015	Predicate RCU: an RCU for scalable concurrent updates. Maya Arbel, Adam Morrison
2015	Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2015, San Francisco, CA, USA, February 7-11, 2015. Albert Cohen, David Grove
2015	SYNC or ASYNC: time to fuse for distributed graph-parallel computation. Chenning Xie, Rong Chen, Haibing Guan, Binyu Zang, Haibo Chen
2015	Scalable and efficient implementation of 3d unstructured meshes computation: a case study on matrix assembly. Loïc Thébault, Eric Petit, Quang Dinh
2015	Section based program analysis to reduce overhead of detecting unsynchronized thread communication. Madan Mohan Das, Gabriel Southern, Jose Renau
2015	SemCache++: semantics-aware caching for efficient multi-GPU offloading. Nabeel AlSaber, Milind Kulkarni
2015	Software partitioning of hardware transactions. Lingxiang Xiang, Michael L. Scott
2015	Static/Dynamic validation of MPI collective communications in multi-threaded context. Emmanuelle Saillard, Patrick Carribault, Denis Barthou
2015	The SprayList: a scalable relaxed priority queue. Dan Alistarh, Justin Kopinsky, Jerry Li, Nir Shavit
2015	The lazy happens-before relation: better partial-order reduction for systematic concurrency testing. Paul Thomson, Alastair F. Donaldson
2015	The lock-free k-LSM relaxed priority queue. Martin Wimmer, Jakob Gruber, Jesper Larsson Träff, Philippas Tsigas
2015	Tiles: a new language mechanism for heterogeneous parallelism. Yifeng Chen, Xiang Cui, Hong Mei
2015	Towards batched linear solvers on accelerated hardware platforms. Azzam Haidar, Tingxing Dong, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra
2015	VirtCL: a framework for OpenCL device abstraction and management. Yi-Ping You, Hen-Jung Wu, Yeh-Ning Tsai, Yen-Ting Chao