| 2010 | 190 TFlops Astrophysical N-body Simulation on a Cluster of GPUs. Tsuyoshi Hamada, Keigo Nitadori |
| 2010 | 3.5-D Blocking Optimization for Stencil Computations on Modern CPUs and GPUs. Anthony D. Nguyen, Nadathur Satish, Jatin Chhugani, Changkyu Kim, Pradeep Dubey |
| 2010 | A Block-Oriented Language and Runtime System for Tensor Algebra with Very Large Arrays. Beverly A. Sanders, Rodney J. Bartlett, Erik Deumens, Victor Lotrich, Mark Ponton |
| 2010 | A Flexible Reservation Algorithm for Advance Network Provisioning. Mehmet Balman, Evangelos Chaniotakis, Arie Shoshani, Alex Sim |
| 2010 | A Multi-Scale Heart Simulation on Massively Parallel Computers. Akira Hosoi, Takumi Washio, Jun-ichi Okada, Yoshimasa Kadooka, Kengo Nakajima, Toshiaki Hisada |
| 2010 | A Parallel Implementation of Electron-Phonon Scattering in Nanoelectronic Devices up to 95k Cores. Mathieu Luisier |
| 2010 | A Scalable and Distributed Dynamic Formal Verifier for MPI Programs. Anh Vo, Sriram Aananthakrishnan, Ganesh Gopalakrishnan, Bronis R. de Supinski, Martin Schulz, Greg Bronevetsky |
| 2010 | Accelerating I/O Forwarding in IBM Blue Gene/P Systems. Venkatram Vishwanath, Mark Hereld, Kamil Iskra, Dries Kimpe, Vitali A. Morozov, Michael E. Papka, Robert B. Ross, Kazutomo Yoshii |
| 2010 | An 80-Fold Speedup, 15.0 TFlops Full GPU Acceleration of Non-Hydrostatic Weather Model ASUCA Production Code. Takashi Shimokawabe, Takayuki Aoki, Chiashi Muroi, Junichi Ishida, Kohei Kawano, Toshio Endo, Akira Nukada, Naoya Maruyama, Satoshi Matsuoka |
| 2010 | An Adaptive Framework for Simulation and Online Remote Visualization of Critical Climate Applications in Resource-constrained Environments. Preeti Malakar, Vijay Natarajan, Sathish S. Vadhiyar |
| 2010 | Automatic Run-time Parallelization and Transformation of I/O. Thorvald Natvig, Anne C. Elster, Jan Christian Meyer |
| 2010 | CPM in CMPs: Coordinated Power Management in Chip-Multiprocessors. Asit K. Mishra, Shekhar Srikantaiah, Mahmut T. Kandemir, Chita R. Das |
| 2010 | Characterizing the Influence of System Noise on Large-Scale Applications by Simulation. Torsten Hoefler, Timo Schneider, Andrew Lumsdaine |
| 2010 | Circuit-Switched Memory Access in Photonic Interconnection Networks for High-Performance Embedded Computing. Gilbert Hendry, Eric Robinson, Vitaliy Gleyzer, Johnnie Chan, Luca P. Carloni, Nadya Travinin Bliss, Keren Bergman |
| 2010 | Combined Iterative and Model-driven Optimization in an Automatic Parallelization Framework. Louis-Noël Pouchet, Uday Bondhugula, Cédric Bastoul, Albert Cohen, J. Ramanujam, P. Sadayappan |
| 2010 | Conference on High Performance Computing Networking, Storage and Analysis, SC 2010, New Orleans, LA, USA, November 13-19, 2010 |
| 2010 | DASH: a Recipe for a Flash-based Data Intensive Supercomputer. Jiahua He, Arun Jagatheesan, Sandeep K. S. Gupta, Jeffrey Bennett, Allan Snavely |
| 2010 | Data Sharing Options for Scientific Workflows on Amazon EC2. Gideon Juve, Ewa Deelman, Karan Vahi, Gaurang Mehta, G. Bruce Berriman, Benjamin P. Berman, Philip Maechling |
| 2010 | Design, Modeling, and Evaluation of a Scalable Multi-level Checkpointing System. Adam Moody, Greg Bronevetsky, Kathryn M. Mohror, Bronis R. de Supinski |
| 2010 | Diagnosis, Tuning, and Redesign for Multicore Performance: A Case Study of the Fast Multipole Method. Aparna Chandramowlishwaran, Kamesh Madduri, Richard W. Vuduc |
| 2010 | Direct Numerical Simulation of Particulate Flows on 294912 Processor Cores. Jan Götz, Klaus Iglberger, Markus Stürmer, Ulrich Rüde |
| 2010 | Elastic Cloud Caches for Accelerating Service-Oriented Computations. David Chiu, Apeksha Shetty, Gagan Agrawal |
| 2010 | Experiences with a Lightweight Supercomputer Kernel: Lessons Learned from Blue Gene's CNK. Mark Giampapa, Thomas Gooding, Todd Inglett, Robert W. Wisniewski |
| 2010 | Exploiting 162-Nanosecond End-to-End Communication Latency on Anton. Ron O. Dror, J. P. Grossman, Kenneth M. Mackenzie, Brian Towles, Edmond Chow, John K. Salmon, Cliff Young, Joseph A. Bank, Brannon Batson, Martin M. Deneroff, Jeffrey Kuskin, Richard H. Larson, Mark A. Moraes, David E. Shaw |
| 2010 | Exploring a Novel Gathering Method for Finite Element Codes on the Cell/B.E. Architecture. Mohammad Jowkar, Raúl de la Cruz, José María Cela |
| 2010 | Extreme-Scale AMR. Carsten Burstedde, Omar Ghattas, Michael Gurnis, Tobin Isaac, Georg Stadler, Tim Warburton, Lucas C. Wilcox |
| 2010 | Fast PGAS Implementation of Distributed Graph Algorithms. Guojing Cong, George Almási, Vijay A. Saraswat |
| 2010 | FlowChecker: Detecting Bugs in MPI Libraries via Message Flow Checking. Zhezhe Chen, Qi Gao, Wenbin Zhang, Feng Qin |
| 2010 | Functional Partitioning to Optimize End-to-End Performance on Many-core Architectures. Min Li, Sudharshan S. Vazhkudai, Ali Raza Butt, Fei Meng, Xiaosong Ma, Youngjae Kim, Christian Engelmann, Galen M. Shipman |
| 2010 | Hierarchical Diagonal Blocking and Precision Reduction Applied to Combinatorial Multigrid. Guy E. Blelloch, Ioannis Koutis, Gary L. Miller, Kanat Tangwongsan |
| 2010 | IOrchestrator: Improving the Performance of Multi-node I/O Systems via Inter-Server Coordination. Xuechen Zhang, Kei Davis, Song Jiang |
| 2010 | JAWS: Job-Aware Workload Scheduling for the Exploration of Turbulence Simulations. Xiaodan Wang, Eric A. Perlman, Randal C. Burns, Tanu Malik, Tamas Budavari, Charles Meneveau, Alexander S. Szalay |
| 2010 | Managing Variability in the IO Performance of Petascale Storage Systems. Jay F. Lofstead, Fang Zheng, Qing Liu, Scott Klasky, Ron A. Oldfield, Todd Kordenbrock, Karsten Schwan, Matthew Wolf |
| 2010 | Multiscale Simulation of Cardiovascular flows on the IBM Bluegene/P: Full Heart-Circulation System at Red-Blood Cell Resolution. Amanda Peters Randles, Simone Melchionna, Efthimios Kaxiras, Jonas Lätt, Joy K. Sircar, Massimo Bernaschi, Mauro Bisson, Sauro Succi |
| 2010 | Multithreaded Asynchronous Graph Traversal for In-Memory and Semi-External Memory. Roger A. Pearce, Maya B. Gokhale, Nancy M. Amato |
| 2010 | On-Chip Network Evaluation Framework. Hanjoon Kim, Seulki Heo, Junghoon Lee, Jaehyuk Huh, John Kim |
| 2010 | OpenMPC: Extended OpenMP Programming and Tuning for GPUs. Seyong Lee, Rudolf Eigenmann |
| 2010 | Optimal Utilization of Heterogeneous Resources for Biomolecular Simulations. Scott S. Hampton, Sadaf R. Alam, Paul S. Crozier, Pratul K. Agarwal |
| 2010 | Overlapping Methods of All-to-All Communication and FFT Algorithms for Torus-Connected Massively Parallel Supercomputers. Jun Doi, Yasushi Negishi |
| 2010 | Parallel Fast Gauss Transform. Rahul S. Sampath, Hari Sundar, Shravan K. Veerapaneni |
| 2010 | Parallelizing the QUDA Library for Multi-GPU Calculations in Lattice Quantum Chromodynamics. Ronald Babich, Michael A. Clark, Bálint Joó |
| 2010 | PerfExpert: An Easy-to-Use Performance Diagnosis Tool for HPC Applications. Martin Burtscher, Byoung-Do Kim, Jeffrey R. Diamond, John D. McCalpin, Lars Koesterke, James C. Browne |
| 2010 | Petascale Direct Numerical Simulation of Blood Flow on 200K Cores and Heterogeneous Architectures. Abtin Rahimian, Ilya Lashuk, Shravan K. Veerapaneni, Aparna Chandramowlishwaran, Dhairya Malhotra, Logan Moon, Rahul S. Sampath, Aashay Shringarpure, Jeffrey S. Vetter, Richard W. Vuduc, Denis Zorin, George Biros |
| 2010 | Power-Aware Consolidation of Scientific Workflows in Virtualized Environments. Qian Zhu, Jiedan Zhu, Gagan Agrawal |
| 2010 | Reducing Cache Pollution Through Detection and Elimination of Non-Temporal Memory Accesses. Andreas Sandberg, David Eklov, Erik Hagersten |
| 2010 | Scalable Earthquake Simulation on Petascale Supercomputers. Yifeng Cui, Kim B. Olsen, Thomas H. Jordan, Kwangyoon Lee, Jun Zhou, Patrick Small, Daniel Roten, Geoffrey Ely, Dhabaleswar K. Panda, Amit Chourasia, John M. Levesque, Steven M. Day, Philip Maechling |
| 2010 | Scalable Graph Exploration on Multicore Processors. Virat Agarwal, Fabrizio Petrini, Davide Pasetto, David A. Bader |
| 2010 | Scalable Identification of Load Imbalance in Parallel Executions Using Call Path Profiles. Nathan R. Tallent, Laksono Adhianto, John M. Mellor-Crummey |
| 2010 | Scalable Tile Communication-Avoiding QR Factorization on Multicore Cluster Systems. Fengguang Song, Hatem Ltaief, Bilel Hadri, Jack J. Dongarra |
| 2010 | Scaling Hierarchical N-body Simulations on GPU Clusters. Pritish Jetley, Lukasz Wesolowski, Filippo Gioachin, Laxmikant V. Kalé, Thomas R. Quinn |
| 2010 | Simple but Effective Heterogeneous Main Memory with On-Chip Memory Controller Support. Xiangyu Dong, Yuan Xie, Naveen Muralimanohar, Norman P. Jouppi |
| 2010 | Size Matters: Space/Time Tradeoffs to Improve GPGPU Applications Performance. Abdullah Gharaibeh, Matei Ripeanu |
| 2010 | Strider: Runtime Support for Optimizing Strided Data Accesses on Multi-Cores with Explicitly Managed Memories. Jae-Seung Yeom, Dimitrios S. Nikolopoulos |
| 2010 | The 48-core SCC Processor: the Programmer's View. Timothy G. Mattson, Michael Riepen, Thomas Lehnig, Paul Brett, Werner Haas, Patrick Kennedy, Jason Howard, Sriram R. Vangal, Nitin Borkar, Gregory Ruhl, Saurabh Dighe |
| 2010 | The Sharing Tracker: Using Ideas from Cache Coherence Hardware to Reduce Off-Chip Memory Traffic with Non-Coherent Caches. David Tarjan, Kevin Skadron |
| 2010 | Toward First Principles Electronic Structure Simulations of Excited States and Strong Correlations in Nano- and Materials Science. Anton Kozhevnikov, Adolfo G. Eguiluz, Thomas C. Schulthess |
| 2010 | Understanding the Impact of Emerging Non-Volatile Memories on High-Performance, IO-Intensive Computing. Adrian M. Caulfield, Joel Coburn, Todor I. Mollov, Arup De, Ameen Akel, Jiahua He, Arun Jagatheesan, Rajesh K. Gupta, Allan Snavely, Steven Swanson |
| 2010 | vSnoop: Improving TCP Throughput in Virtualized Environments via Acknowledgement Offload. Ardalan Kangarlou, Sahan Gamage, Ramana Rao Kompella, Dongyan Xu |