| 2008 | A case study in SIMD text processing with parallel bit streams: UTF-8 to UTF-16 transcoding. Robert D. Cameron |
| 2008 | A portable runtime interface for multi-level memory hierarchies. Mike Houston, Ji Young Park, Manman Ren, Timothy J. Knight, Kayvon Fatahalian, Alex Aiken, William J. Dally, Pat Hanrahan |
| 2008 | All-window profiling of concurrent executions. Chen Ding, Trishul M. Chilimbi |
| 2008 | An adaptive memory conscious approach for mining frequent trees: implications for multi-core architectures. Shirish Tatikonda, Srinivasan Parthasarathy |
| 2008 | Assertional reasoning about data races in relaxed memory models. Beverly A. Sanders, Kyunghee Kim |
| 2008 | Automated application-level checkpointing based on live-variable analysis in MPI programs. Panfeng Wang, Xuejun Yang, Hongyi Fu, Yunfei Du, Zhiyun Wang, Jia Jia |
| 2008 | Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories. Muthu Manikandan Baskaran, Uday Bondhugula, Sriram Krishnamoorthy, J. Ramanujam, Atanas Rountev, P. Sadayappan |
| 2008 | Cache-aware iteration space partitioning. Arun Kejariwal, Alexandru Nicolau, Utpal Banerjee, Alexander V. Veidenbaum, Constantine D. Polychronopoulos |
| 2008 | Compiler optimizations for parallelizing general-purpose applications under thread-level speculation. Antonia Zhai, Shengyue Wang, Pen-Chung Yew, Guojin He |
| 2008 | Compiler-enhanced incremental checkpointing for OpenMP applications. Greg Bronevetsky, Daniel Marques, Keshav Pingali, Radu Rugina, Sally A. McKee |
| 2008 | Compilers and parallel computing systems. Frances E. Allen |
| 2008 | Concurrent GC leveraging transactional memory. Phil McGachey, Ali-Reza Adl-Tabatabai, Richard L. Hudson, Vijay Menon, Bratin Saha, Tatiana Shpeisman |
| 2008 | Design and implementation of a high-performance MPI for C# and the common language infrastructure. Douglas P. Gregor, Andrew Lumsdaine |
| 2008 | Dynamic performance tuning of word-based software transactional memory. Pascal Felber, Christof Fetzer, Torvald Riegel |
| 2008 | Enhancing the performance of MPI-IO applications by overlapping I/O, computation and communication. Christina M. Patrick, Seung Woo Son, Mahmut T. Kandemir |
| 2008 | Experience on optimizing irregular computation for memory hierarchy in manycore architecture. Guangming Tan, Dongrui Fan, Junchao Zhang, Andrew Russo, Guang R. Gao |
| 2008 | Experiences using adaptive concurrency in transactional memory with Lee's routing algorithm. Mohammad Ansari, Christos Kotselidis, Kim Jarvis, Mikel Luján, Chris C. Kirkham, Ian Watson |
| 2008 | Extracting coarse-grain parallelism in general-purpose programs. Sean Rul, Hans Vandierendonck, Koen De Bosschere |
| 2008 | FastForward for efficient pipeline parallelism: a cache-optimized concurrent lock-free queue. John Giacomoni, Tipp Moseley, Manish Vachharajani |
| 2008 | Formal specification of the MPI-2.0 standard in TLA+. Guodong Li, Michael Delisi, Ganesh Gopalakrishnan, Robert M. Kirby |
| 2008 | High performance dense linear algebra on a spatially distributed processor. Jeffrey R. Diamond, Behnam Robatmili, Stephen W. Keckler, Robert A. van de Geijn, Kazushige Goto, Doug Burger |
| 2008 | ISP: a tool for model checking MPI programs. Sarvani S. Vakkalanka, Subodh Sharma, Ganesh Gopalakrishnan, Robert M. Kirby |
| 2008 | Massive parallel LDPC decoding on GPU. Gabriel Falcão Paiva Fernandes, Leonel Sousa, Vítor Manuel Mendes da Silva |
| 2008 | Matrix product on heterogeneous master-worker platforms. Jack J. Dongarra, Jean-François Pineau, Yves Robert, Frédéric Vivien |
| 2008 | Modeling optimistic concurrency using quantitative dependence analysis. Christoph von Praun, Rajesh Bordawekar, Calin Cascaval |
| 2008 | Nested parallelism in transactional memory. Kunal Agrawal, Jeremy T. Fineman, Jim Sukha |
| 2008 | On the correctness of transactional memory. Rachid Guerraoui, Michal Kapalka |
| 2008 | Optimization principles and application performance evaluation of a multithreaded GPU using CUDA. Shane Ryoo, Christopher I. Rodrigues, Sara S. Baghsorkhi, Sam S. Stone, David Blair Kirk, Wen-mei W. Hwu |
| 2008 | Performance without pain = productivity: data layout and collective communication in UPC. Rajesh Nishtala, George Almási, Calin Cascaval |
| 2008 | Practical experiences with Java software transactional memory. Evgueni Brevnov, Yuri Dolgov, Boris Kuznetsov, Dmitry Yershov, Vyacheslav Shakin, Dong-yuan Chen, Vijay Menon, Suresh Srinivas |
| 2008 | Probabilistic advanced reservations for batch-scheduled parallel machines. Daniel Nurmi, Richard Wolski, John Brevik |
| 2008 | Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2008, Salt Lake City, UT, USA, February 20-23, 2008. Siddhartha Chatterjee, Michael L. Scott |
| 2008 | Programming with tiles. Jia Guo, Ganesh Bikshandi, Basilio B. Fraguela, María Jesús Garzarán, David A. Padua |
| 2008 | Quasi-static scheduling for safe futures. Armand Navabi, Xiangyu Zhang, Suresh Jagannathan |
| 2008 | Safer open-nested transactions through ownership. Kunal Agrawal, I-Ting Angelina Lee, Jim Sukha |
| 2008 | Scalable packet classification using interpreting: a cross-platform multi-core solution. Haipeng Cheng, Zheng Chen, Bei Hua, Xinan Tang |
| 2008 | Semantics-based distributed I/O for mpiBLAST. Pavan Balaji, Wu-chun Feng, Jeremy S. Archuleta, Heshan Lin, Rajkumar Kettimuthu, Rajeev Thakur, Xiaosong Ma |
| 2008 | Software transactional memory for large scale clusters. Robert L. Bocchino Jr., Vikram S. Adve, Bradford L. Chamberlain |
| 2008 | Split hardware transactions: true nesting of transactions using best-effort hardware transactional memory. Yossi Lev, Jan-Willem Maessen |
| 2008 | SuperMatrix: a multithreaded runtime scheduling system for algorithms-by-blocks. Ernie Chan, Field G. Van Zee, Paolo Bientinesi, Enrique S. Quintana-Ortí, Gregorio Quintana-Ortí, Robert A. van de Geijn |
| 2008 | Toward high performance nonblocking software transactional memory. Virendra J. Marathe, Mark Moir |
| 2008 | Transactional boosting: a methodology for highly-concurrent transactional objects. Maurice Herlihy, Eric Koskinen |
| 2008 | Type inference for locality analysis of distributed data structures. Satish Chandra, Vijay A. Saraswat, Vivek Sarkar, Rastislav Bodík |
| 2008 | Where will all the threads come from? John M. Mellor-Crummey |
| 2008 | ZOID: I/O-forwarding infrastructure for petascale architectures. Kamil Iskra, John W. Romein, Kazutomo Yoshii, Peter H. Beckman |