| 2012 | 4.45 Pflops astrophysical N-body simulation on K computer: the gravitational trillion-body problem. Tomoaki Ishiyama, Keigo Nitadori, Junichiro Makino |
| 2012 | A divide and conquer strategy for scaling weather simulations with multiple regions of interest. Preeti Malakar, Thomas George, Sameer Kumar, Rashmi Mittal, Vijay Natarajan, Yogish Sabharwal, Vaibhav Saxena, Sathish S. Vadhiyar |
| 2012 | A framework for low-communication 1-D FFT. Ping Tak Peter Tang, Jongsoo Park, Daehyun Kim, Vladimir Petrov |
| 2012 | A massively space-time parallel N-body solver. Robert Speck, Daniel Ruprecht, Rolf Krause, Matthew Emmett, Michael L. Minion, Mathias Winkel, Paul Gibbon |
| 2012 | A multi-objective auto-tuning framework for parallel codes. Herbert Jordan, Peter Thoman, Juan Jose Durillo Barrionuevo, Simone Pellegrini, Philipp Gschwandtner, Thomas Fahringer, Hans Moritsch |
| 2012 | A multithreaded algorithm for network alignment via approximate matching. Arif M. Khan, David F. Gleich, Alex Pothen, Mahantesh Halappanavar |
| 2012 | A new scalable parallel DBSCAN algorithm using the disjoint-set data structure. Md. Mostofa Ali Patwary, Diana Palsetia, Ankit Agrawal, Wei-keng Liao, Fredrik Manne, Alok N. Choudhary |
| 2012 | A parallel two-level preconditioner for cosmic microwave background map-making. Laura Grigori, Radek Stompor, Mikolaj Szydlarski |
| 2012 | A practical method for estimating performance degradation on multicore processors, and its application to HPC workloads. Tyler Dwyer, Alexandra Fedorova, Sergey Blagodurov, Mark Roth, Fabien Gaud, Jian Pei |
| 2012 | A scalable, numerically stable, high-performance tridiagonal solver using GPUs. Li-Wen Chang, John A. Stratton, Hee-Seok Kim, Wen-mei W. Hwu |
| 2012 | A study of DRAM failures in the field. Vilas Sridharan, Dean Liberty |
| 2012 | A study on data deduplication in HPC storage systems. Dirk Meister, Jürgen Kaiser, André Brinkmann, Toni Cortes, Michael Kuhn, Julian M. Kunkel |
| 2012 | ATLAS grid workload on NDGF resources: analysis, modeling, and workload generation. Dmytro Karpenko, Roman Vitenberg, Alexander L. Read |
| 2012 | Accelerating MapReduce on a coupled CPU-GPU architecture. Linchuan Chen, Xin Huo, Gagan Agrawal |
| 2012 | Alleviating scalability issues of checkpointing protocols. Rolf Riesen, Kurt B. Ferreira, Dilma Da Silva, Pierre Lemarinier, Dorian C. Arnold, Patrick G. Bridges |
| 2012 | Application data prefetching on the IBM Blue Gene/Q supercomputer. I-Hsin Chung, Changhoan Kim, Hui-Fang Wen, Guojing Cong |
| 2012 | Aspen: a domain specific language for performance modeling. Kyle Spafford, Jeffrey S. Vetter |
| 2012 | Automatic generation of software pipelines for heterogeneous parallel systems. Jacques A. Pienaar, Srimat T. Chakradhar, Anand Raghunathan |
| 2012 | Bamboo: translating MPI applications to a latency-tolerant, data-driven form. Tan Nguyen, Pietro Cicotti, Eric J. Bylaska, Dan Quinlan, Scott B. Baden |
| 2012 | Billion-particle SIMD-friendly two-point correlation on large-scale HPC cluster systems. Jatin Chhugani, Changkyu Kim, Hemant Shukla, Jongsoo Park, Pradeep Dubey, John Shalf, Horst D. Simon |
| 2012 | Breaking the speed and scalability barriers for graph exploration on distributed-memory machines. Fabio Checconi, Fabrizio Petrini, Jeremiah Willcock, Andrew Lumsdaine, Anamitra R. Choudhury, Yogish Sabharwal |
| 2012 | Byte-precision level of detail processing for variable precision analytics. John Jenkins, Eric R. Schendel, Sriram Lakshminarasimhan, David A. Boyuka II, Terry Rogers, Stéphane Ethier, Robert B. Ross, Scott Klasky, Nagiza F. Samatova |
| 2012 | Characterizing and mitigating work time inflation in task parallel programs. Stephen Olivier, Bronis R. de Supinski, Martin Schulz, Jan F. Prins |
| 2012 | Characterizing output bottlenecks in a supercomputer. Bing Xie, Jeffrey S. Chase, David Dillow, Oleg Drokin, Scott Klasky, Sarp Oral, Norbert Podhorszki |
| 2012 | Classifying soft error vulnerabilities in extreme-scale scientific applications using a binary instrumentation tool. Dong Li, Jeffrey S. Vetter, Weikuan Yu |
| 2012 | Code generation for parallel execution of a class of irregular loops on distributed memory systems. Mahesh Ravishankar, John Eisenlohr, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, P. Sadayappan |
| 2012 | Combining in-situ and in-transit processing to enable extreme-scale scientific analysis. Janine Bennett, Hasan Abbasi, Peer-Timo Bremer, Ray W. Grout, Attila Gyulassy, Tong Jin, Scott Klasky, Hemanth Kolla, Manish Parashar, Valerio Pascucci, Philippe P. Pébay, David C. Thompson, Hongfeng Yu, Fan Zhang, Jacqueline Chen |
| 2012 | Communication avoiding and overlapping for numerical linear algebra. Evangelos Georganas, Jorge González-Domínguez, Edgar Solomonik, Yili Zheng, Juan Touriño, Katherine A. Yelick |
| 2012 | Communication-avoiding parallel Strassen: implementation and performance. Benjamin Lipshitz, Grey Ballard, James Demmel, Oded Schwartz |
| 2012 | Compass: a scalable simulator for an architecture for cognitive computing. Robert Preissl, Theodore M. Wong, Pallab Datta, Myron Flickner, Raghavendra Singh, Steven K. Esser, William P. Risk, Horst D. Simon, Dharmendra S. Modha |
| 2012 | Compiler-directed file layout optimization for hierarchical storage systems. Wei Ding, Yuanrui Zhang, Mahmut T. Kandemir, Seung Woo Son |
| 2012 | Containment domains: a scalable, efficient, and flexible resilience scheme for exascale systems. Jinsuk Chung, Ikhwan Lee, Michael B. Sullivan, Jee Ho Ryoo, Dong-Wan Kim, Doe Hyun Yoon, Larry Kaplan, Mattan Erez |
| 2012 | Cost- and deadline-constrained provisioning for scientific workflow ensembles in IaaS clouds. Maciej Malawski, Gideon Juve, Ewa Deelman, Jarek Nabrzyski |
| 2012 | Cray Cascade: a scalable HPC system based on a Dragonfly network. Greg Faanes, Abdulla Bataineh, Duncan Roweth, Tom Court, Edwin Froese, Robert Alverson, Tim Johnson, Joe Kopnick, Mike Higgins, James Reinhard |
| 2012 | Critical lock analysis: diagnosing critical section bottlenecks in multithreaded applications. Guancheng Chen, Per Stenström |
| 2012 | Data-intensive spatial filtering in large numerical simulation datasets. Kalin Kanov, Randal C. Burns, Gregory L. Eyink, Charles Meneveau, Alexander S. Szalay |
| 2012 | Dataflow-driven GPU performance projection for multi-kernel transformations. Jiayuan Meng, Vitali A. Morozov, Venkatram Vishwanath, Kalyan Kumaran |
| 2012 | Demonstrating Lustre over a 100Gbps wide area network of 3,500km. Robert Henschel, Stephen C. Simms, David Y. Hancock, Scott Michael, Tom Johnson, Nathan Heald, Thomas William, Donald K. Berry, Matthew Allen, Richard Knepper, Matt Davy, Matthew R. Link, Craig A. Stewart |
| 2012 | Design and analysis of data management in scalable parallel scripting. Zhao Zhang, Daniel S. Katz, Justin M. Wozniak, Allan Espinosa, Ian T. Foster |
| 2012 | Design and implementation of an intelligent end-to-end network QoS system. Sushant Sharma, Dimitrios Katramatos, Dantong Yu, Li Shi |
| 2012 | Design and modeling of a non-blocking checkpointing system. Kento Sato, Naoya Maruyama, Kathryn M. Mohror, Adam Moody, Todd Gamblin, Bronis R. de Supinski, Satoshi Matsuoka |
| 2012 | Design of a scalable InfiniBand topology service to enable network-topology-aware placement of processes. Hari Subramoni, Sreeram Potluri, Krishna Chaitanya Kandalla, Bill Barth, Jérôme Vienne, Jeff Keasler, Karen A. Tomko, Karl W. Schulz, Adam Moody, Dhabaleswar K. Panda |
| 2012 | Designing a unified programming model for heterogeneous machines. Michael Garland, Manjunath Kudlur, Yili Zheng |
| 2012 | Detection and correction of silent data corruption for large-scale high-performance computing. David Fiala, Frank Mueller, Christian Engelmann, Rolf Riesen, Kurt B. Ferreira, Ron Brightwell |
| 2012 | Direction-optimizing breadth-first search. Scott Beamer, Krste Asanovic, David A. Patterson |
| 2012 | Early evaluation of directive-based GPU programming models for productive exascale computing. Seyong Lee, Jeffrey S. Vetter |
| 2012 | Efficient and reliable network tomography in heterogeneous networks using BitTorrent broadcasts and clustering algorithms. Kiril Dichev, Fergal Reid, Alexey L. Lastovetsky |
| 2012 | Efficient backprojection-based synthetic aperture radar computation with many-core processors. Jongsoo Park, Ping Tak Peter Tang, Mikhail Smelyanskiy, Daehyun Kim, Thomas Benson |
| 2012 | Efficient data restructuring and aggregation for I/O acceleration in PIDX. Sidharth Kumar, Venkatram Vishwanath, Philip H. Carns, Joshua A. Levine, Robert Latham, Giorgio Scorzelli, Hemanth Kolla, Ray W. Grout, Robert B. Ross, Michael E. Papka, Jacqueline Chen, Valerio Pascucci |
| 2012 | Extending the BT NAS parallel benchmark to exascale computing. Rob F. Van der Wijngaart, Srinivas Sridharan, Victor W. Lee |
| 2012 | Extreme-scale UQ for Bayesian inverse problems governed by PDEs. Tan Bui-Thanh, Carsten Burstedde, Omar Ghattas, James Martin, Georg Stadler, Lucas C. Wilcox |
| 2012 | Fault prediction under the microscope: a closer look into HPC systems. Ana Gainaru, Franck Cappello, Marc Snir, William Kramer |
| 2012 | First-ever full observable universe simulation. Jean-Michel Alimi, Vincent Bouillot, Yann Rasera, Vincent Reverdy, Pier-Stefano Corasaniti, Irène Balmès, Stéphane Requena, Xavier Delaruelle, Jean-Noel Richet |
| 2012 | Forward and adjoint simulations of seismic wave propagation on emerging large-scale GPU architectures. Max Rietmann, Peter Messmer, Tarje Nissen-Meyer, Daniel Peter, Piero Basini, Dimitri Komatitsch, Olaf Schenk, Jeroen Tromp, Lapo Boschi, Domenico Giardini |
| 2012 | GRAPE-8: an accelerator for gravitational N-body simulation with 20.5Gflops/W performance. Junichiro Makino, Hiroshi Daisaka |
| 2012 | Hardware-software coherence protocol for the coexistence of caches and local memories. Lluc Alvarez, Lluís Vilanova, Marc González, Xavier Martorell, Nacho Navarro, Eduard Ayguadé |
| 2012 | Heuristic static load-balancing algorithm applied to the fragment molecular orbital method. Yuri Alexeev, Ashutosh Mahajan, Sven Leyffer, Graham Fletcher, Dmitri G. Fedorov |
| 2012 | Hierarchical task mapping of cell-based AMR cosmology simulations. Jingjin Wu, Zhiling Lan, Xuanxing Xiong, Nickolay Y. Gnedin, Andrey V. Kravtsov |
| 2012 | High performance RDMA-based design of HDFS over InfiniBand. Nusrat S. Islam, Md. Wasi-ur-Rahman, Jithin Jose, Raghunath Rajachandrasekar, Hao Wang, Hari Subramoni, Chet Murthy, Dhabaleswar K. Panda |
| 2012 | High performance radiation transport simulations: preparing for Titan. Christopher Baker, Gregory G. Davidson, Thomas M. Evans, Steven P. Hamilton, Joshua J. Jarrell, Wayne Joubert |
| 2012 | High throughput software for direct numerical simulations of compressible two-phase flows. Babak Hejazialhosseini, Diego Rossinelli, Christian Conti, Petros Koumoutsakos |
| 2012 | High-performance general solver for extremely large-scale semidefinite programming problems. Katsuki Fujisawa, Hitoshi Sato, Satoshi Matsuoka, Toshio Endo, Makoto Yamashita, Maho Nakata |
| 2012 | Host load prediction in a Google compute cloud with a Bayesian model. Sheng Di, Derrick Kondo, Walfredo Cirne |
| 2012 | Hybridizing S3D into an exascale application using OpenACC: an approach for moving to multi-petaflops and beyond. John M. Levesque, Ramanan Sankaran, Ray W. Grout |
| 2012 | Large-scale energy-efficient graph traversal: a path to efficient data-intensive supercomputing. Nadathur Satish, Changkyu Kim, Jatin Chhugani, Pradeep Dubey |
| 2012 | Legion: expressing locality and independence with logical regions. Michael Bauer, Sean Treichler, Elliott Slaughter, Alex Aiken |
| 2012 | Looking under the hood of the IBM Blue Gene/Q network. Dong Chen, Noel Eisley, Philip Heidelberger, Sameer Kumar, Amith R. Mamidala, Fabrizio Petrini, Robert M. Senger, Yutaka Sugawara, Robert Walkup, Burkhard D. Steinmacher-Burow, Anamitra R. Choudhury, Yogish Sabharwal, Swati Singhal, Jeffrey J. Parker |
| 2012 | MAGE: adaptive granularity and ECC for resilient and power efficient memory systems. Sheng Li, Doe Hyun Yoon, Ke Chen, Jishen Zhao, Jung Ho Ahn, Jay B. Brockman, Yuan Xie, Norman P. Jouppi |
| 2012 | MPI runtime error detection with MUST: advances in deadlock detection. Tobias Hilbrich, Joachim Protze, Martin Schulz, Bronis R. de Supinski, Matthias S. Müller |
| 2012 | Managing data-movement for effective shared-memory parallelization of out-of-core sparse solvers. Haim Avron, Anshul Gupta |
| 2012 | Mapping applications with collectives over sub-communicators on torus networks. Abhinav Bhatele, Todd Gamblin, Steve H. Langer, Peer-Timo Bremer, Erik W. Draeger, Bernd Hamann, Katherine E. Isaacs, Aaditya G. Landge, Joshua A. Levine, Valerio Pascucci, Martin Schulz, Charles H. Still |
| 2012 | Massively parallel X-ray scattering simulations. Abhinav Sarje, Xiaoye S. Li, Slim Chourou, Elaine R. Chan, Alexander Hexemer |
| 2012 | McrEngine: a scalable checkpointing system using data-aware aggregation and compression. Tanzima Zerin Islam, Kathryn M. Mohror, Saurabh Bagchi, Adam Moody, Bronis R. de Supinski, Rudolf Eigenmann |
| 2012 | Measuring interference between live datacenter applications. Melanie Kambadur, Tipp Moseley, Rick Hank, Martha A. Kim |
| 2012 | NUMA-aware graph mining techniques for performance and energy efficiency. Michael R. Frasca, Kamesh Madduri, Padma Raghavan |
| 2012 | Novel views of performance data to analyze large-scale adaptive applications. Abhinav Bhatele, Todd Gamblin, Katherine E. Isaacs, Brian T. N. Gunney, Martin Schulz, Peer-Timo Bremer, Bernd Hamann |
| 2012 | On distributed file tree walk of parallel file systems. Jharrod Lafon, Satyajayant Misra, Jon Bringhurst |
| 2012 | On the effectiveness of application-aware self-management for scientific discovery in volunteer computing systems. Trilce Estrada, Michela Taufer |
| 2012 | On using virtual circuits for GridFTP transfers. Zhengyang Liu, Malathi Veeraraghavan, Zhenzhen Yan, Chris Tracy, Jing Tie, Ian T. Foster, John M. Dennis, Jason Hick, Yee-Ting Li, W. Yang |
| 2012 | Optimization of geometric multigrid for emerging multi- and manycore processors. Samuel Williams, Dhiraj D. Kalamkar, Amik Singh, Anand M. Deshpande, Brian van Straalen, Mikhail Smelyanskiy, Ann S. Almgren, Pradeep Dubey, John Shalf, Leonid Oliker |
| 2012 | Optimization principles for collective neighborhood communications. Torsten Hoefler, Timo Schneider |
| 2012 | Optimizing fine-grained communication in a biomolecular simulation application on Cray XK6. Yanhua Sun, Gengbin Zheng, Chao Mei, Eric J. Bohm, James C. Phillips, Laxmikant V. Kalé, Terry R. Jones |
| 2012 | Optimizing overlay-based virtual networking through optimistic interrupts and cut-through forwarding. Zheng Cui, Lei Xia, Patrick G. Bridges, Peter A. Dinda, John R. Lange |
| 2012 | Optimizing the computation of n-point correlations on large-scale astronomical data. William B. March, Kenneth Czechowski, Marat Dukhan, Thomas Benson, Dongryeol Lee, Andrew J. Connolly, Richard W. Vuduc, Edmond Chow, Alexander G. Gray |
| 2012 | Parallel Bayesian network structure learning with application to gene networks. Olga Nikolova, Srinivas Aluru |
| 2012 | Parallel I/O, analysis, and visualization of a trillion particle simulation. Surendra Byna, Jerry Chi-Yuan Chou, Oliver Rübel, Prabhat, Homa Karimabadi, William S. Daughton, Vadim Roytershteyn, E. Wes Bethel, Mark Howison, Ke-Jou Hsu, Kuan-Wu Lin, Arie Shoshani, Andrew Uselton, Kesheng Wu |
| 2012 | Parallel geometric-algebraic multigrid on unstructured forests of octrees. Hari Sundar, George Biros, Carsten Burstedde, Johann Rudi, Omar Ghattas, Georg Stadler |
| 2012 | Parallel particle advection and FTLE computation for time-varying flow fields. Boonthanome Nouanesengsy, Teng-Yok Lee, Kewei Lu, Han-Wei Shen, Tom Peterka |
| 2012 | Parametric flows: automated behavior equivalencing for symbolic analysis of races in CUDA programs. Peng Li, Guodong Li, Ganesh Gopalakrishnan |
| 2012 | Patus for convenient high-performance stencils: evaluation in earthquake simulations. Matthias Christen, Olaf Schenk, Yifeng Cui |
| 2012 | Peta-scale lattice quantum chromodynamics on a Blue Gene/Q supercomputer. Jun Doi |
| 2012 | Portable section-level tuning of compiler parallelized applications. Dheya Mustafa, Rudolf Eigenmann |
| 2012 | Protocols for wide-area data-intensive applications: design and performance issues. Yufei Ren, Tan Li, Dantong Yu, Shudong Jin, Thomas G. Robertazzi, Brian Tierney, Eric Pouyoul |
| 2012 | RAMZzz: rank-aware DRAM power management with dynamic migrations and demotions. Donghong Wu, Bingsheng He, Xueyan Tang, Jianliang Xu, Minyi Guo |
| 2012 | SC Conference on High Performance Computing Networking, Storage and Analysis, SC '12, Salt Lake City, UT, USA - November 11 - 15, 2012. Jeffrey K. Hollingsworth |
| 2012 | SGI® UV2: a fused computation and data analysis machine. Greg Thorson, Michael Woodacre |
| 2012 | Scalable multi-GPU 3-D FFT for TSUBAME 2.0 supercomputer. Akira Nukada, Kento Sato, Satoshi Matsuoka |
| 2012 | Scalia: an adaptive scheme for efficient multi-cloud storage. Thanasis G. Papaioannou, Nicolas Bonvin, Karl Aberer |
| 2012 | The universe at extreme scale: multi-petaflop sky simulation on the BG/Q. Salman Habib, Vitali A. Morozov, Hal Finkel, Adrian Pope, Katrin Heitmann, Kalyan Kumaran, Tom Peterka, Joseph A. Insley, David Daniel, Patricia K. Fasel, Nicholas Frontiere, Zarija Lukic |
| 2012 | Tiling stencil computations to maximize parallelism. Vinayaka Bandishti, Irshad Pananilath, Uday Bondhugula |
| 2012 | Toward real-time modeling of human heart ventricles at cellular resolution: simulation of drug-induced arrhythmias. Arthur A. Mirin, David F. Richards, James N. Glosli, Erik W. Draeger, Bor Chan, Jean-Luc Fattebert, William D. Krauss, Tomas Oppelstrup, John Jeremy Rice, John A. Gunnels, Viatcheslav Gurev, Changhoan Kim, John Magerlein, Matthias Reumann, Hui-Fang Wen |
| 2012 | Unleashing the high-performance and low-power of multi-core DSPs for general-purpose HPC. Francisco D. Igual, Murtaza Ali, Arnon Friedmann, Eric Stotzer, Timothy Wentz, Robert A. van de Geijn |
| 2012 | Usage behavior of a large-scale scientific archive. Ian F. Adams, Brian A. Madden, Joel Cameron Frank, Mark W. Storer, Ethan L. Miller, Gene Harano |
| 2012 | ValuePack: value-based scheduling framework for CPU-GPU clusters. Vignesh T. Ravi, Michela Becchi, Gagan Agrawal, Srimat T. Chakradhar |
| 2012 | What scientific applications can benefit from hardware transactional memory? Martin Schindewolf, Barna L. Bihari, John C. Gyllenhaal, Martin Schulz, Amy Wang, Wolfgang Karl |