| 2016 | An Analytical Model-Based Auto-tuning Framework for Locality-Aware Loop Scheduling. Rengan Xu, Sunita Chandrasekaran, Xiaonan Tian, Barbara M. Chapman |
| 2016 | An Efficient Parallel Load-Balancing Framework for Orthogonal Decomposition of Geometrical Data. Bruno R. C. Magalhães, Farhan Tauheed, Thomas Heinis, Anastasia Ailamaki, Felix Schürmann |
| 2016 | AutoMOMML: Automatic Multi-objective Modeling with Machine Learning. Prasanna Balaprakash, Ananta Tiwari, Stefan M. Wild, Laura Carrington, Paul D. Hovland |
| 2016 | Comparing Runtime Systems with Exascale Ambitions Using the Parallel Research Kernels. Rob F. Van der Wijngaart, Abdullah Kayi, Jeff R. Hammond, Gabriele Jost, Tom St. John, Srinivas Sridharan, Timothy G. Mattson, John Abercrombie, Jacob Nelson |
| 2016 | Distributed Job Allocation for Large-Scale Manycores. Subramanian Ramachandran, Frank Mueller |
| 2016 | Dynamic Sparse-Matrix Allocation on GPUs. James King, Thomas Gilray, Robert M. Kirby, Matthew Might |
| 2016 | Efficiency of High Order Spectral Element Methods on Petascale Architectures. Maxwell Hutchinson, Alexander Heinecke, Hans Pabst, Greg Henry, Matteo Parsani, David E. Keyes |
| 2016 | Efficient and Predictable Group Communication for Manycore NoCs. Karthik Yagna, Onkar Patil, Frank Mueller |
| 2016 | High Order Seismic Simulations on the Intel Xeon Phi Processor (Knights Landing). Alexander Heinecke, Alexander Breuer, Michael Bader, Pradeep Dubey |
| 2016 | High Performance Computing - 31st International Conference, ISC High Performance 2016, Frankfurt, Germany, June 19-23, 2016, Proceedings. Julian M. Kunkel, Pavan Balaji, Jack J. Dongarra |
| 2016 | INAM2: InfiniBand Network Analysis and Monitoring with MPI. Hari Subramoni, Albert Mathews Augustine, Mark Daniel Arnold, Jonathan L. Perkins, Xiaoyi Lu, Khaled Hamidouche, Dhabaleswar K. Panda |
| 2016 | Leveraging a Cluster-Booster Architecture for Brain-Scale Simulations. Pramod S. Kumbhar, Michael L. Hines, Aleksandr Ovcharenko, Damián A. Mallón, James Gonzalo King, Florentino Sainz, Felix Schürmann, Fabien Delalondre |
| 2016 | Many-Core Acceleration of a Discrete Ordinates Transport Mini-App at Extreme Scale. Tom Deakin, Simon McIntosh-Smith, Wayne P. Gaudin |
| 2016 | Mitigating MPI Message Matching Misery. Mario Flajslik, James Dinan, Keith D. Underwood |
| 2016 | Multi-versioning Performance Opportunities in BGAS System for Resilience. Nan Dun, Dirk Pleiter, Aiman Fang, Nicolas Vandenbergen, Andrew A. Chien |
| 2016 | OpenAtom: Scalable Ab-Initio Molecular Dynamics with Diverse Capabilities. Nikhil Jain, Eric J. Bohm, Eric Mikida, Subhasish Mandal, Minjung Kim, Prateek Jindal, Qi Li, Sohrab Ismail-Beigi, Glenn J. Martyna, Laxmikant V. Kalé |
| 2016 | Parallel Community Detection Algorithm Using a Data Partitioning Strategy with Pairwise Subdomain Duplication. Diana Palsetia, William Hendrix, Sunwoo Lee, Ankit Agrawal, Wei-keng Liao, Alok N. Choudhary |
| 2016 | Performance, Design, and Autotuning of Batched GEMM for GPUs. Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra |
| 2016 | Predictive Modeling for Job Power Consumption in HPC Systems. Andrea Borghesi, Andrea Bartolini, Michele Lombardi, Michela Milano, Luca Benini |
| 2016 | Resource Management for Running HPC Applications in Container Clouds. Stephen Herbein, Ayush Dusia, Aaron Myles Landwehr, Sean McDaniel, José Monsalve Diaz, Yang Yang, Seetharami R. Seelam, Michela Taufer |
| 2016 | SPRITE: A Fast Parallel SNP Detection Pipeline. Vasudevan Rengasamy, Kamesh Madduri |
| 2016 | Scalability of Partial Differential Equations Preconditioner Resilient to Soft and Hard Faults. Karla Morris, Francesco Rizzi, Khachik Sargsyan, Kathryn Dahlgren, Paul Mycek, Cosmin Safta, Olivier P. Le Maître, Omar M. Knio, Bert J. Debusschere |
| 2016 | Supercomputing Centers and Electricity Service Providers: A Geographically Distributed Perspective on Demand Management in Europe and the United States. Tapasya Patki, Natalie J. Bates, Girish Ghatikar, Anders Clausen, Sonja Klingert, Ghaleb Abdulla, Mehdi Sheikhalishahi |
| 2016 | TCU: A Multi-Objective Hardware Thread Mapping Unit for HPC Clusters. Ravi Kumar Pujari, Thomas Wild, Andreas Herkersdorf |
| 2016 | TiDA: High-Level Programming Abstractions for Data Locality Management. Didem Unat, Tan Nguyen, Weiqun Zhang, Muhammed Nufail Farooqi, Burak Bastem, George Michelogiannakis, Ann S. Almgren, John Shalf |
| 2016 | Towards Machine Learning on the Automata Processor. Tommy Tracy II, Yao Fu, Indranil Roy, Eric Jonas, Paul Glendenning |