| 2010 | A distributed placement service for graph-structured and tree-structured data. Gregory Buehrer, Srinivasan Parthasarathy, Shirish Tatikonda |
| 2010 | A practical concurrent binary search tree. Nathan Grasso Bronson, Jared Casper, Hassan Chafi, Kunle Olukotun |
| 2010 | A symbolic verifier for CUDA programs. Guodong Li, Ganesh Gopalakrishnan, Robert M. Kirby, Daniel J. Quinlan |
| 2010 | An adaptive performance modeling tool for GPU architectures. Sara S. Baghsorkhi, Matthieu Delahaye, Sanjay J. Patel, William D. Gropp, Wen-mei W. Hwu |
| 2010 | An optimizing compiler for GPGPU programs with input-data sharing. Yi Yang, Ping Xiang, Jingfei Kong, Huiyang Zhou |
| 2010 | Analyzing lock contention in multithreaded applications. Nathan R. Tallent, John M. Mellor-Crummey, Allan Porterfield |
| 2010 | Application heartbeats for software performance and health. Henry Hoffmann, Jonathan Eastep, Marco D. Santambrogio, Jason E. Miller, Anant Agarwal |
| 2010 | Applying the concurrent collections programming model to asynchronous parallel dense linear algebra. Aparna Chandramowlishwaran, Kathleen Knobe, Richard W. Vuduc |
| 2010 | CUDAlign: using GPU to accelerate the comparison of megabase genomic sequences. Edans Flavius de Oliveira Sandes, Alba Cristina Magalhaes Alves de Melo |
| 2010 | Compiler aided selective lock assignment for improving the performance of software transactional memory. Sandya Mannarswamy, Dhruva R. Chakrabarti, Kaushik Rajan, Sujoy Saraswati |
| 2010 | Composable thread coloring. Dean F. Sutherland, William L. Scherlis |
| 2010 | Continuous speculative program parallelization in software. Chao Zhang, Chen Ding, Xiaoming Gu, Kirk Kelsey, Tongxin Bai, Xiaobing Feng |
| 2010 | Data transformations enabling loop vectorization on multithreaded data parallel architectures. Byunghyun Jang, Perhaad Mistry, Dana Schaa, Rodrigo Dominguez, David R. Kaeli |
| 2010 | Debugging programs that use atomic blocks and transactional memory. Ferad Zyulkyarov, Tim Harris, Osman S. Unsal, Adrián Cristal, Mateo Valero |
| 2010 | Does cache sharing on modern CMP matter to the performance of contemporary multithreaded programs? Eddy Z. Zhang, Yunlian Jiang, Xipeng Shen |
| 2010 | Effective communication and computation overlap with hybrid MPI/SMPSs. Vladimir Marjanovic, Jesús Labarta, Eduard Ayguadé, Mateo Valero |
| 2010 | Exascale computing: the challenges and opportunities in the next decade. Tilak Agerwala |
| 2010 | Extreme scale computing: challenges and opportunities. Josep Torrellas, Bill Gropp, Jaime H. Moreno, Kunle Olukotun, Vivek Sarkar |
| 2010 | Fast tridiagonal solvers on the GPU. Yao Zhang, Jonathan Cohen, John D. Owens |
| 2010 | Featherweight X10: a core calculus for async-finish parallelism. Jonathan K. Lee, Jens Palsberg |
| 2010 | GAMBIT: effective unit testing for concurrency libraries. Katherine E. Coons, Sebastian Burckhardt, Madanlal Musuvathi |
| 2010 | Helper locks for fork-join parallel programming. Kunal Agrawal, Charles E. Leiserson, Jim Sukha |
| 2010 | Improving parallelism and locality with asynchronous algorithms. Lixia Liu, Zhiyuan Li |
| 2010 | Input-driven dynamic execution prediction of streaming applications. Farhana Aleen, Monirul Sharif, Santosh Pande |
| 2010 | Intra-application shared cache partitioning for multithreaded applications. Sai Prashanth Muralidhara, Mahmut T. Kandemir, Padma Raghavan |
| 2010 | Is hardware innovation over? Arvind |
| 2010 | Is transactional programming actually easier? Christopher J. Rossbach, Owen S. Hofmann, Emmett Witchel |
| 2010 | KRASH: reproducible CPU load generation on many cores machines. Swann Perarnau, Guillaume Huard |
| 2010 | Lazy binary-splitting: a run-time adaptive work-stealing scheduler. Alexandros Tzannes, George C. Caragea, Rajeev Barua, Uzi Vishkin |
| 2010 | Leveraging parallel nesting in transactional memory. João Pedro Barreto, Aleksandar Dragojevic, Paulo Ferreira, Rachid Guerraoui, Michal Kapalka |
| 2010 | Load balancing on speed. Steven A. Hofmeyr, Costin Iancu, Filip Blagojevic |
| 2010 | Model-driven autotuning of sparse matrix-vector multiply on GPUs. Jeewhan Choi, Amik Singh, Richard W. Vuduc |
| 2010 | Modeling advanced collective communication algorithms on cell-based systems. Qasim Ali, Samuel P. Midkiff, Vijay S. Pai |
| 2010 | Modeling transactional memory workload performance. Donald E. Porter, Emmett Witchel |
| 2010 | NOrec: streamlining STM by abolishing ownership records. Luke Dalessandro, Michael F. Spear, Michael L. Scott |
| 2010 | New abstractions for effective performance analysis of STM programs. Dhruva R. Chakrabarti |
| 2010 | PHANTOM: predicting performance of parallel applications on large-scale parallel machines using a single node. Jidong Zhai, Wenguang Chen, Weimin Zheng |
| 2010 | Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2010, Bangalore, India, January 9-14, 2010. R. Govindarajan, David A. Padua, Mary W. Hall |
| 2010 | SLAW: a scalable locality-aware adaptive work-stealing scheduler for multi-core systems. Yi Guo, Yisheng Zhao, Vincent Cavé, Vivek Sarkar |
| 2010 | Scalable communication protocols for dynamic sparse data exchange. Torsten Hoefler, Christian Siebert, Andrew Lumsdaine |
| 2010 | Scaling LAPACK panel operations using parallel cache assignment. Anthony M. Castaldo, R. Clint Whaley |
| 2010 | Scheduling support for transactional memory contention management. Walther Maldonado, Patrick Marlier, Pascal Felber, Adi Suissa, Danny Hendler, Alexandra Fedorova, Julia L. Lawall, Gilles Muller |
| 2010 | Structure-driven optimizations for amorphous data-parallel programs. Mario Méndez-Lojo, Donald Nguyen, Dimitrios Prountzos, Xin Sui, Muhammad Amber Hassaan, Milind Kulkarni, Martin Burtscher, Keshav Pingali |
| 2010 | Supporting lock-free composition of concurrent data objects. Daniel Cederman, Philippas Tsigas |
| 2010 | Symbolic prefetching in transactional distributed shared memory. Alokika Dash, Brian Demsky |
| 2010 | The LOFAR correlator: implementation and performance analysis. John W. Romein, P. Chris Broekema, Jan David Mol, Rob van Nieuwpoort |
| 2010 | The pilot library for novice MPI programmers. John D. Carter, William B. Gardner, Gary Gréwal |
| 2010 | Thread to strand binding of parallel network applications in massive multi-threaded systems. Petar Radojkovic, Vladimir Cakarevic, Javier Verdú, Alex Pajuelo, Francisco J. Cazorla, Mario Nemirovsky, Mateo Valero |
| 2010 | Towards scalable and transparent parallelization of multiplayer games using transactional memory support. Daniel Lupei, Bogdan Simion, Don Pinto, Matthew Misler, Mihai Burcea, William Krick, Cristiana Amza |
| 2010 | Using data structure knowledge for efficient lock generation and strong atomicity. Gautam Upadhyaya, Samuel P. Midkiff, Vijay S. Pai |