| 2008 | A compiler framework for optimization of affine loop nests for GPGPUs. Muthu Manikandan Baskaran, Uday Bondhugula, Sriram Krishnamoorthy, J. Ramanujam, Atanas Rountev, P. Sadayappan |
| 2008 | A freespace crossbar for multi-core processors. Michel N. Victor, Aris K. Silzars, Edward S. Davidson |
| 2008 | A projection-based optimization framework for abstractions with application to the unstructured mesh domain. Brian S. White, Sally A. McKee, Daniel J. Quinlan |
| 2008 | A regression-based approach to scalability prediction. Bradley J. Barnes, Barry Rountree, David K. Lowenthal, Jaxk Reeves, Bronis R. de Supinski, Martin Schulz |
| 2008 | Accurate memory signatures and synthetic address traces for HPC applications. Jonathan Weinberg, Allan Snavely |
| 2008 | Adaptive runtime tuning of parallel sparse matrix-vector multiplication on distributed memory systems. Seyong Lee, Rudolf Eigenmann |
| 2008 | Advanced collective communication in Aspen. Qasim Ali, Vijay S. Pai, Samuel P. Midkiff |
| 2008 | An approach for adaptive DRAM temperature and power management. Song Liu, Seda Ogrenci Memik, Yu Zhang, Gokhan Memik |
| 2008 | Analysis of dynamic power management on multi-core processors. William Lloyd Bircher, Lizy K. John |
| 2008 | Analyzing memory access intensity in parallel programs on multicore. Lixia Liu, Zhiyuan Li, Ahmed H. Sameh |
| 2008 | Automatic SIMD vectorization of chains of recurrences. Yixin Shou, Robert A. van Engelen |
| 2008 | Automatic analysis of speedup of MPI applications. Marc Casas, Rosa M. Badia, Jesús Labarta |
| 2008 | Autonomous learning for efficient resource utilization of dynamic VM migration. Hyung Won Choi, Hukeun Kwak, Andrew Sohn, Kyusik Chung |
| 2008 | Biomedical image analysis on a cooperative cluster of GPUs and multicores. Timothy D. R. Hartley, Ümit V. Çatalyürek, Antonio Ruiz, Francisco D. Igual, Rafael Mayo, Manuel Ujaldon |
| 2008 | CUBA: an architecture for efficient CPU/co-processor data communication. Isaac Gelado, John H. Kelm, Shane Ryoo, Steven S. Lumetta, Nacho Navarro, Wen-mei W. Hwu |
| 2008 | Can software reliability outperform hardware reliability on high performance interconnects?: a case study with MPI over InfiniBand. Matthew J. Koop, Rahul Kumar, Dhabaleswar K. Panda |
| 2008 | Challenges on the road to exascale computing. Tilak Agerwala |
| 2008 | CprFS: a user-level file system to support consistent file states for checkpoint and restart. Ruini Xue, Wenguang Chen, Weimin Zheng |
| 2008 | Data mining on the Cell Broadband Engine. Gregory Buehrer, Srinivasan Parthasarathy, Matthew Goyder |
| 2008 | Efficient computation of sum-products on GPUs through software-managed cache. Mark Silberstein, Assaf Schuster, Dan Geiger, Anjul Patney, John D. Owens |
| 2008 | Evaluating the effect of replacing CNK with Linux on the compute-nodes of Blue Gene/L. Edi Shmueli, George Almási, José R. Brunheroto, José G. Castaños, Gábor Dózsa, Sameer Kumar, Derek Lieber |
| 2008 | Exploiting idle register classes for fast spill destination. Fang Lu, Lei Wang, Xiaobing Feng, Zhiyuan Li, Zhaoqing Zhang |
| 2008 | Fast scan algorithms on graphics processors. Yuri Dotsenko, Naga K. Govindaraju, Peter-Pike J. Sloan, Charles Boyd, John Manferdelli |
| 2008 | Focused prefetching: performance oriented prefetching based on commit stalls. R. Manikantan, R. Govindarajan |
| 2008 | Implementing Wilson-Dirac operator on the Cell Broadband Engine. Khaled Z. Ibrahim, François Bodin |
| 2008 | Many-core GPU computing with NVIDIA CUDA. Mark J. Harris |
| 2008 | Optimizing irregular shared-memory applications for clusters. Seung-Jai Min, Rudolf Eigenmann |
| 2008 | Orchestrating data transfer for the Cell/B.E. processor. Tong Chen, Haibo Lin, Tao Zhang |
| 2008 | Performance portable optimizations for loops containing communication operations. Costin Iancu, Wei Chen, Katherine A. Yelick |
| 2008 | Petaflop/s, seriously. David E. Keyes |
| 2008 | Phasers: a unified deadlock-free construct for collective and point-to-point synchronization. Jun Shirako, David M. Peixotto, Vivek Sarkar, William N. Scherer III |
| 2008 | Power-aware dynamic placement of HPC applications. Akshat Verma, Puneet Ahuja, Anindya Neogi |
| 2008 | Preserving time in large-scale communication traces. Prasun Ratn, Frank Mueller, Bronis R. de Supinski, Martin Schulz |
| 2008 | Proceedings of the 22nd Annual International Conference on Supercomputing, ICS 2008, Island of Kos, Greece, June 7-12, 2008. Pin Zhou |
| 2008 | Rotating register allocation with multiple rotating branches. Suhyun Kim, Soo-Mook Moon |
| 2008 | Shifted declustering: a placement-ideal layout scheme for multi-way replication storage architecture. Huijun Zhu, Peng Gu, Jun Wang |
| 2008 | Soft error vulnerability of iterative linear algebra methods. Greg Bronevetsky, Bronis R. de Supinski |
| 2008 | The deep computing messaging framework: generalized scalable message passing on the Blue Gene/P supercomputer. Sameer Kumar, Gábor Dózsa, Gheorghe Almási, Philip Heidelberger, Dong Chen, Mark Giampapa, Michael Blocksome, Ahmad Faraj, Jeff Parker, Joe Ratterman, Brian E. Smith, Charles Archer |
| 2008 | The shared-thread multiprocessor. Jeffery A. Brown, Dean M. Tullsen |
| 2008 | Three-dimensional Delaunay refinement for multi-core processors. Andrey N. Chernikov, Nikos Chrisochoides |
| 2008 | Timely offloading of result-data in HPC centers. Henry M. Monti, Ali Raza Butt, Sudharshan S. Vazhkudai |