PPoPP B

47 papers

Year Title / Authors
2014 21st century computer architecture.
Mark D. Hill
2014 A decomposition for in-place matrix transposition.
Bryan Catanzaro, Alexander Keller, Michael Garland
2014 A general technique for non-blocking trees.
Trevor Brown, Faith Ellen, Eric Ruppert
2014 A practical wait-free simulation for lock-free data structures.
Shahar Timnat, Erez Petrank
2014 A tool to analyze the performance of multithreaded programs on NUMA architectures.
Xu Liu, John M. Mellor-Crummey
2014 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '14, Orlando, FL, USA, February 15-19, 2014
José E. Moreira, James R. Larus
2014 Automatic semantic locking.
Guy Golan-Gueta, G. Ramalingam, Mooly Sagiv, Eran Yahav
2014 Beyond parallel programming with domain specific languages.
Kunle Olukotun
2014 CUDA-NP: realizing nested thread-level parallelism in GPGPU applications.
Yi Yang, Huiyang Zhou
2014 Concurrency bug localization using shared memory access pairs.
Wenwen Wang, Chenggang Wu, Pen-Chung Yew, Xiang Yuan, Zhenjiang Wang, Jianjun Li, Xiaobing Feng
2014 Concurrency testing using schedule bounding: an empirical study.
Paul Thomson, Alastair F. Donaldson, Adam Betts
2014 Data structures for task-based priority scheduling.
Martin Wimmer, Francesco Versaci, Jesper Larsson Träff, Daniel Cederman, Philippas Tsigas
2014 Designing and auto-tuning parallel 3-D FFT for computation-communication overlap.
Sukhyun Song, Jeffrey K. Hollingsworth
2014 Detecting silent data corruption through data dynamic monitoring for scientific applications.
Leonardo Arturo Bautista-Gomez, Franck Cappello
2014 Efficient deterministic multithreading without global barriers.
Kai Lu, Xu Zhou, Tom Bergan, Xiaoping Wang
2014 Efficient search for inputs causing high floating-point errors.
Wei-Fan Chiang, Ganesh Gopalakrishnan, Zvonimir Rakamaric, Alexey Solovyev
2014 Eliminating global interpreter locks in Ruby through hardware transactional memory.
Rei Odaira, José G. Castaños, Hisanobu Tomari
2014 Extracting logical structure and identifying stragglers in parallel execution traces.
Katherine E. Isaacs, Todd Gamblin, Abhinav Bhatele, Peer-Timo Bremer, Martin Schulz, Bernd Hamann
2014 Fast concurrent lock-free binary search trees.
Aravind Natarajan, Neeraj Mittal
2014 Fine-grain parallel megabase sequence comparison with multiple heterogeneous GPUs.
Edans F. de O. Sandes, Guillermo Miranda, Alba Cristina Magalhaes Alves de Melo, Xavier Martorell, Eduard Ayguadé
2014 Heterogeneous computing: what does it mean for compiler research?
Norm Rubin
2014 In-place transposition of rectangular matrices on accelerators.
I-Jui Sung, Juan Gómez-Luna, José María González-Linares, Nicolás Guil, Wen-mei W. Hwu
2014 Infrastructure-free logging and replay of concurrent execution on multiple cores.
Kyu Hyung Lee, Dohyeong Kim, Xiangyu Zhang
2014 Initial study of multi-endpoint runtime for MPI+OpenMP hybrid programming model on multi-core systems.
Miao Luo, Xiaoyi Lu, Khaled Hamidouche, Krishna Chaitanya Kandalla, Dhabaleswar K. Panda
2014 Leveraging hardware message passing for efficient thread synchronization.
Darko Petrovic, Thomas Ropars, André Schiper
2014 Lock contention aware thread migrations.
Kishore Kumar Pusukuri, Rajiv Gupta, Laxmi Narayan Bhuyan
2014 Optimistic transactional boosting.
Ahmed Hassan, Roberto Palmieri, Binoy Ravindran
2014 PREDATOR: predictive false sharing detection.
Tongping Liu, Chen Tian, Ziang Hu, Emery D. Berger
2014 Parallelization hints via code skeletonization.
Cfir Aguston, Yosi Ben-Asher, Gadi Haber
2014 Parallelizing dynamic programming through rank convergence.
Saeed Maleki, Madanlal Musuvathi, Todd Mytkowicz
2014 Portable, MPI-interoperable Coarray Fortran.
Chaoran Yang, Wesley Bland, John M. Mellor-Crummey, Pavan Balaji
2014 Practical concurrent binary search trees via logical ordering.
Dana Drachsler, Martin T. Vechev, Eran Yahav
2014 Provably good scheduling for parallel programs that use data structures through implicit batching.
Kunal Agrawal, Jeremy T. Fineman, Brendan Sheridan, Jim Sukha, Robert Utterback
2014 Race directed scheduling of concurrent programs.
Mahdi Eslamimehr, Jens Palsberg
2014 Resilient X10: efficient failure-aware programming.
David Cunningham, David Grove, Benjamin Herta, Arun Iyengar, Kiyokuni Kawachiya, Hiroki Murata, Vijay A. Saraswat, Mikio Takeuchi, Olivier Tardieu
2014 Revisiting loop fusion in the polyhedral framework.
Sanyam Mehta, Pei-Hung Lin, Pen-Chung Yew
2014 SCCMulti: an improved parallel strongly connected components algorithm.
Daniel Tomkins, Timmie G. Smith, Nancy M. Amato, Lawrence Rauchwerger
2014 Singe: leveraging warp specialization for high performance on GPUs.
Michael Bauer, Sean Treichler, Alex Aiken
2014 Task mapping stencil computations for non-contiguous allocations.
Vitus J. Leung, David P. Bunde, Jonathan Ebbers, Stefan P. Feer, Nickolas W. Price, Zachary D. Rhodes, Matthew Swank
2014 Theoretical analysis of classic algorithms on highly-threaded many-core GPUs.
Lin Ma, Kunal Agrawal, Roger D. Chamberlain
2014 Time-warp: lightweight abort minimization in transactional memory.
Nuno Lourenco Diegues, Paolo Romano
2014 Towards fair and efficient SMP virtual machine scheduling.
Jia Rao, Xiaobo Zhou
2014 Trace driven dynamic deadlock detection and reproduction.
Malavika Samak, Murali Krishna Ramanathan
2014 Triolet: a programming system that unifies algorithmic skeleton interfaces for high-performance cluster computing.
Christopher I. Rodrigues, Thomas B. Jablin, Abdul Dakkak, Wen-mei W. Hwu
2014 Well-structured futures and cache locality.
Maurice Herlihy, Zhiyu Liu
2014 X10 and APGAS at Petascale.
Olivier Tardieu, Benjamin Herta, David Cunningham, David Grove, Prabhanjan Kambadur, Vijay A. Saraswat, Avraham Shinnar, Mikio Takeuchi, Mandana Vaziri
2014 yaSpMV: yet another SpMV framework on GPUs.
Shengen Yan, Chao Li, Yunquan Zhang, Huiyang Zhou