| 2008 | 0.374 Pflop/s trillion-particle kinetic modeling of laser plasma interaction on Roadrunner. Kevin J. Bowers, Brian J. Albright, Ben Bergen, Lin Yin, Kevin J. Barker, Darren J. Kerbyson |
| 2008 | 369 Tflop/s molecular dynamics simulations on the Roadrunner general-purpose heterogeneous supercomputer. Sriram Swaminarayan, Kai Kadau, Timothy C. Germann, Gordon C. Fossum |
| 2008 | A dynamic scheduler for balancing HPC applications. Carlos Boneti, Roberto Gioiosa, Francisco J. Cazorla, Mateo Valero |
| 2008 | A multi-level parallel simulation approach to electron transport in nano-scale transistors. Mathieu Luisier, Gerhard Klimeck |
| 2008 | A novel domain oriented approach for scientific grid workflow composition. Jun Qin, Thomas Fahringer |
| 2008 | A novel migration-based NUCA design for chip multiprocessors. Mahmut T. Kandemir, Feihui Li, Mary Jane Irwin, Seung Woo Son |
| 2008 | A scalable parallel framework for analyzing terascale molecular dynamics simulation trajectories. Tiankai Tu, Charles A. Rendleman, David W. Borhani, Ron O. Dror, Justin Gullingsrud, Morten Ø. Jensen, John L. Klepeis, Paul Maragakis, Patrick J. Miller, Kate A. Stafford, David E. Shaw |
| 2008 | Accelerating configuration interaction calculations for nuclear structure. Philip Sternberg, Esmond G. Ng, Chao Yang, Pieter Maris, James P. Vary, Masha Sosonkina, Hung Viet Le |
| 2008 | Adapting a message-driven parallel application to GPU-accelerated clusters. James C. Phillips, John E. Stone, Klaus Schulten |
| 2008 | An adaptive cut-off for task parallelism. Alejandro Duran, Julita Corbalán, Eduard Ayguadé |
| 2008 | An efficient parallel approach for identifying protein families in large-scale metagenomic data sets. Changjun Wu, Ananth Kalyanaraman |
| 2008 | Analysis of application heartbeats: learning structural and temporal features in time series data for identification of performance problems. Emma S. Buneci, Daniel A. Reed |
| 2008 | Applying double auctions for scheduling of workflows on the Grid. Marek Wieczorek, Stefan Podlipnig, Radu Prodan, Thomas Fahringer |
| 2008 | Asymmetric interactions in symmetric multi-core systems: analysis, enhancements and evaluation. Thomas Scogland, Pavan Balaji, Wu-chun Feng, Ganesh Narayanaswamy |
| 2008 | Bandwidth intensive 3-D FFT kernel for GPUs using CUDA. Akira Nukada, Yasuhiko Ogata, Toshio Endo, Satoshi Matsuoka |
| 2008 | Benchmarking GPUs to tune dense linear algebra. Vasily Volkov, James Demmel |
| 2008 | BitDew: a programmable environment for large-scale data management and distribution. Gilles Fedak, Haiwu He, Franck Cappello |
| 2008 | Capturing performance knowledge for automated analysis. Kevin A. Huck, Oscar R. Hernandez, Van Bui, Sunita Chandrasekaran, Barbara M. Chapman, Allen D. Malony, Lois C. McInnes, Boyana Norris |
| 2008 | Characterizing and predicting the I/O performance of HPC applications using a parameterized synthetic benchmark. Hongzhang Shan, Katie Antypas, John Shalf |
| 2008 | Characterizing application sensitivity to OS interference using kernel-level noise injection. Kurt B. Ferreira, Patrick G. Bridges, Ron Brightwell |
| 2008 | Communication avoiding Gaussian elimination. Laura Grigori, James Demmel, Hua Xiang |
| 2008 | Dendro: parallel algorithms for multigrid and AMR methods on 2:1 balanced octrees. Rahul S. Sampath, Santi S. Adavani, Hari Sundar, Ilya Lashuk, George Biros |
| 2008 | Dynamically adapting file domain partitioning methods for collective I/O based on underlying parallel file system locking protocols. Wei-keng Liao, Alok N. Choudhary |
| 2008 | Early evaluation of IBM BlueGene/P. Sadaf R. Alam, Richard F. Barrett, M. Bast, Mark R. Fahey, Jeffery A. Kuehn, Collin McCurdy, James H. Rogers, Philip C. Roth, Ramanan Sankaran, Jeffrey S. Vetter, Patrick H. Worley, Weikuan Yu |
| 2008 | Efficient auction-based grid reservations using dynamic programming. Andrew Mutz, Richard Wolski |
| 2008 | Efficient management of data center resources for massively multiplayer online games. Vlad Nae, Alexandru Iosup, Stefan Podlipnig, Radu Prodan, Dick H. J. Epema, Thomas Fahringer |
| 2008 | Entering the petaflop era: the architecture and performance of Roadrunner. Kevin J. Barker, Kei Davis, Adolfy Hoisie, Darren J. Kerbyson, Michael Lang, Scott Pakin, José Carlos Sancho |
| 2008 | EpiSimdemics: an efficient algorithm for simulating the spread of infectious disease over large realistic social networks. Christopher L. Barrett, Keith R. Bisset, Stephen G. Eubank, Xizhou Feng, Madhav V. Marathe |
| 2008 | Extending CC-NUMA systems to support write update optimizations. Liqun Cheng, John B. Carter |
| 2008 | Feedback-controlled resource sharing for predictable eScience. Sang-Min Park, Marty Humphrey |
| 2008 | Global trees: a framework for linked data structures on distributed memory parallel systems. D. Brian Larkins, James Dinan, Sriram Krishnamoorthy, Srinivasan Parthasarathy, Atanas Rountev, P. Sadayappan |
| 2008 | Hiding I/O latency with pre-execution prefetching for parallel applications. Yong Chen, Surendra Byna, Xian-He Sun, Rajeev Thakur, William Gropp |
| 2008 | High performance discrete Fourier transforms on graphics processors. Naga K. Govindaraju, Brandon Lloyd, Yuri Dotsenko, Burton Smith, John Manferdelli |
| 2008 | High performance multivariate visual data exploration for extremely large data. Oliver Rübel, Prabhat, Kesheng Wu, Hank Childs, Jeremy S. Meredith, Cameron G. R. Geddes, Estelle Cormier-Michel, Sean Ahern, Gunther H. Weber, Peter Messmer, Hans Hagen, Bernd Hamann, E. Wes Bethel |
| 2008 | High-frequency simulations of global seismic wave propagation using SPECFEM3D_GLOBE on 62K processors. Laura Carrington, Dimitri Komatitsch, Michael Laurenzano, Mustafa M. Tikir, David Michéa, Nicolas Le Goff, Allan Snavely, Jeroen Tromp |
| 2008 | High-radix crossbar switches enabled by proximity communication. Hans Eberle, Pedro Javier García, José Flich, José Duato, Robert J. Drost, Nils Gura, David Hopkins, Wladek Olesinski |
| 2008 | Lessons learned at 208K: towards debugging millions of cores. Gregory L. Lee, Dong H. Ahn, Dorian C. Arnold, Bronis R. de Supinski, Matthew P. LeGendre, Barton P. Miller, Martin Schulz, Ben Liblit |
| 2008 | Linearly scaling 3D fragment method for large-scale electronic structure calculations. Lin-Wang Wang, Byounghak Lee, Hongzhang Shan, Zhengji Zhao, Juan C. Meza, Erich Strohmaier, David H. Bailey |
| 2008 | Massively parallel genomic sequence search on the Blue Gene/P architecture. Heshan Lin, Pavan Balaji, Ruth Poole, Carlos P. Sosa, Xiaosong Ma, Wu-chun Feng |
| 2008 | Massively parallel volume rendering using 2-3 swap image compositing. Hongfeng Yu, Chaoli Wang, Kwan-Liu Ma |
| 2008 | Materialized community ground models for large-scale earthquake simulation. Steven W. Schlosser, Michael P. Ryan, Ricardo Taborda-Rios, Julio López, David R. O'Hallaron, Jacobo Bielak |
| 2008 | New algorithm to enable 400+ TFlop/s sustained performance in simulations of disorder effects in high-Tc superconductors. Gonzalo Alvarez, Michael S. Summers, Don E. Maxwell, Markus Eisenbach, Jeremy S. Meredith, Jeffrey M. Larkin, John M. Levesque, Thomas A. Maier, Paul R. C. Kent, Eduardo F. D'Azevedo, Thomas C. Schulthess |
| 2008 | Nimrod/K: towards massively parallel dynamic grid workflows. David Abramson, Colin Enticott, Ilkay Altintas |
| 2008 | PAM: a novel performance/power aware meta-scheduler for multi-core systems. Mohammad Banikazemi, Dan E. Poff, Bülent Abali |
| 2008 | Parallel I/O prefetching using MPI file caching and I/O signatures. Surendra Byna, Yong Chen, Xian-He Sun, Rajeev Thakur, William Gropp |
| 2008 | Parallel exact inference on the cell broadband engine processor. Yinglong Xia, Viktor K. Prasanna |
| 2008 | Performance optimization of TCP/IP over 10 gigabit ethernet by precise instrumentation. Takeshi Yoshino, Yutaka Sugawara, Katsushi Inagami, Junji Tamatsukuri, Mary Inaba, Kei Hiraki |
| 2008 | Performance prediction of large-scale parallell system and application using macro-level simulation. Ryutaro Susukita, Hisashige Ando, Mutsumi Aoyagi, Hiroaki Honda, Yuichi Inadomi, Koji Inoue, Shigeru Ishizuki, Yasunori Kimura, Hidemi Komatsu, Motoyoshi Kurokawa, Kazuaki J. Murakami, Hidetomo Shibamura, Shuji Yamamura, Yunqing Yu |
| 2008 | Positivity, posynomials and tile size selection. Lakshminarayanan Renganarayanan, Sanjay V. Rajopadhye |
| 2008 | Prefetch throttling and data pinning for improving performance of shared caches. Ozcan Ozturk, Seung Woo Son, Mahmut T. Kandemir, Mustafa Karaköy |
| 2008 | Proactive process-level live migration in HPC environments. Chao Wang, Frank Mueller, Christian Engelmann, Stephen L. Scott |
| 2008 | Proceedings of the ACM/IEEE Conference on High Performance Computing, SC 2008, November 15-21, 2008, Austin, Texas, USA |
| 2008 | Programming the Intel 80-core network-on-a-chip terascale processor. Timothy G. Mattson, Rob F. Van der Wijngaart, Michael A. Frumkin |
| 2008 | SMARTMAP: operating system support for efficient data sharing among processes on a multi-core processor. Ron Brightwell, Kevin T. Pedretti, Trammell Hudson |
| 2008 | Scalable adaptive mantle convection simulation on petascale supercomputers. Carsten Burstedde, Omar Ghattas, Michael Gurnis, Georg Stadler, Eh Tan, Tiankai Tu, Lucas C. Wilcox, Shijie Zhong |
| 2008 | Scalable load-balance measurement for SPMD codes. Todd Gamblin, Bronis R. de Supinski, Martin Schulz, Robert J. Fowler, Daniel A. Reed |
| 2008 | Scaling parallel I/O performance through I/O delegate and caching system. Arifa Nisar, Wei-keng Liao, Alok N. Choudhary |
| 2008 | Scientific application-based performance comparison of SGI Altix 4700, IBM POWER5+, and SGI ICE 8200 supercomputers. Subhash Saini, Dale Talcott, Dennis C. Jespersen, M. Jahed Djomehri, Haoqiang Jin, Rupak Biswas |
| 2008 | Server-storage virtualization: integration and load balancing in data centers. Aameek Singh, Madhukar R. Korupolu, Dushmanta Mohapatra |
| 2008 | Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures. Kaushik Datta, Mark Murphy, Vasily Volkov, Samuel Williams, Jonathan Carter, Leonid Oliker, David A. Patterson, John Shalf, Katherine A. Yelick |
| 2008 | The cost of doing science on the cloud: the Montage example. Ewa Deelman, Gurmeet Singh, Miron Livny, G. Bruce Berriman, John Good |
| 2008 | The role of MPI in development time: a case study. Lorin Hochstein, Forrest Shull, Lynn B. Reid |
| 2008 | Toward loosely coupled programming on petascale systems. Ioan Raicu, Zhao Zhang, Michael Wilde, Ian T. Foster, Peter H. Beckman, Kamil Iskra, Ben Clifford |
| 2008 | Using overlays for efficient data transfer over shared wide-area networks. Gaurav Khanna, Ümit V. Çatalyürek, Tahsin M. Kurç, Rajkumar Kettimuthu, P. Sadayappan, Ian T. Foster, Joel H. Saltz |
| 2008 | Using server-to-server communication in parallel file systems to simplify consistency and improve performance. Philip H. Carns, Bradley W. Settlemyer, Walter B. Ligon III |
| 2008 | Wide-area performance profiling of 10GigE and InfiniBand technologies. Nageswara S. V. Rao, Weikuan Yu, William R. Wing, Stephen W. Poole, Jeffrey S. Vetter |