| 2015 | A case for application-oblivious energy-efficient MPI runtime. Akshay Venkatesh, Abhinav Vishnu, Khaled Hamidouche, Nathan R. Tallent, Dhabaleswar K. Panda, Darren J. Kerbyson, Adolfy Hoisie |
| 2015 | A kernel-independent FMM in general dimensions. William B. March, Bo Xiao, Sameer Tharakan, Chenhan D. Yu, George Biros |
| 2015 | A parallel connectivity algorithm for de Bruijn graphs in metagenomic applications. Patrick Flick, Chirag Jain, Tony Pan, Srinivas Aluru |
| 2015 | A practical approach to reconciling availability, performance, and capacity in provisioning extreme-scale storage systems. Lipeng Wan, Feiyi Wang, Sarp Oral, Devesh Tiwari, Sudharshan S. Vazhkudai, Qing Cao |
| 2015 | A work-efficient algorithm for parallel unordered depth-first search. Umut A. Acar, Arthur Charguéraud, Mike Rainey |
| 2015 | Adaptive and transparent cache bypassing for GPUs. Ang Li, Gert-Jan van den Braak, Akash Kumar, Henk Corporaal |
| 2015 | Adaptive data placement for staging-based coupled scientific workflows. Qian Sun, Tong Jin, Melissa Romanus, Hoang Bui, Fan Zhang, Hongfeng Yu, Hemanth Kolla, Scott Klasky, Jacqueline Chen, Manish Parashar |
| 2015 | An elegant sufficiency: load-aware differentiated scheduling of data transfers. Rajkumar Kettimuthu, Gayane Vardoyan, Gagan Agrawal, P. Sadayappan, Ian T. Foster |
| 2015 | An extreme-scale implicit solver for complex PDEs: highly heterogeneous flow in earth's mantle. Johann Rudi, A. Cristiano I. Malossi, Tobin Isaac, Georg Stadler, Michael Gurnis, Peter W. J. Staar, Yves Ineichen, Costas Bekas, Alessandro Curioni, Omar Ghattas |
| 2015 | An input-adaptive and in-place approach to dense tensor-times-matrix multiply. Jiajia Li, Casey Battaglino, Ioakeim Perros, Jimeng Sun, Richard W. Vuduc |
| 2015 | AnalyzeThis: an analysis workflow-aware storage system. Hyogi Sim, Youngjae Kim, Sudharshan S. Vazhkudai, Devesh Tiwari, Ali Anwar, Ali Raza Butt, Lavanya Ramakrishnan |
| 2015 | Analyzing and mitigating the impact of manufacturing variability in power-constrained supercomputing. Yuichi Inadomi, Tapasya Patki, Koji Inoue, Mutsumi Aoyagi, Barry Rountree, Martin Schulz, David K. Lowenthal, Yasutaka Wada, Keiichiro Fukazawa, Masatsugu Ueda, Masaaki Kondo, Ikuo Miyoshi |
| 2015 | Automatic sharing classification and timely push for cache-coherent systems. Malek Musleh, Vijay S. Pai |
| 2015 | BD-CATS: big data clustering at trillion particle scale. Md. Mostofa Ali Patwary, Surendra Byna, Nadathur Rajagopalan Satish, Narayanan Sundaram, Zarija Lukic, Vadim Roytershteyn, Michael J. Anderson, Yushu Yao, Prabhat, Pradeep Dubey |
| 2015 | Big omics data experience. Patricia H. Kovatch, Anthony Costa, Zachary Giles, Eugene Fluder, Hyung Min Cho, Svetlana Mazurkova |
| 2015 | Bridging OpenCL and CUDA: a comparative analysis and translation. Junghyun Kim, Thanh Tuan Dao, Jaehoon Jung, Jinyoung Joo, Jaejin Lee |
| 2015 | CIVL: the concurrency intermediate verification language. Stephen F. Siegel, Manchun Zheng, Ziqing Luo, Timothy K. Zirkel, Andre V. Marianiello, John G. Edenhofner, Matthew B. Dwyer, Michael S. Rogers |
| 2015 | CilkSpec: optimistic concurrency for Cilk. Shaizeen Aga, Sriram Krishnamoorthy, Satish Narayanasamy |
| 2015 | Clock delta compression for scalable order-replay of non-deterministic parallel applications. Kento Sato, Dong H. Ahn, Ignacio Laguna, Gregory L. Lee, Martin Schulz |
| 2015 | Cost-effective diameter-two topologies: analysis and evaluation. Georgios Kathareios, Cyriel Minkenberg, Bogdan Prisacari, Germán Rodríguez, Torsten Hoefler |
| 2015 | Data partitioning strategies for graph workloads on heterogeneous clusters. Michael LeBeane, Shuang Song, Reena Panda, Jee Ho Ryoo, Lizy K. John |
| 2015 | Dynamic power sharing for higher job throughput. Daniel A. Ellsworth, Allen D. Malony, Barry Rountree, Martin Schulz |
| 2015 | ELF: maximizing memory-level parallelism for GPUs with coordinated warp and fetch scheduling. Jason Jong Kyu Park, Yongjun Park, Scott A. Mahlke |
| 2015 | Efficient implementation of quantum materials simulations on distributed CPU-GPU systems. Raffaele Solcà, Anton Kozhevnikov, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra, Thomas C. Schulthess |
| 2015 | Elastic job bundling: an adaptive resource request strategy for large-scale parallel applications. Feng Liu, Jon B. Weissman |
| 2015 | Energy-aware data transfer algorithms. Ismail Alan, Engin Arslan, Tevfik Kosar |
| 2015 | Engineering inhibitory proteins with InSiPS: the in-silico protein synthesizer. Andrew Schoenrock, Daniel J. Burnside, Houman Moteshareie, Alex Wong, Ashkan Golshani, Frank Dehne |
| 2015 | Enterprise: breadth-first graph traversal on GPUs. Hang Liu, H. Howie Huang |
| 2015 | Exploiting asynchrony from exact forward recovery for DUE in iterative solvers. Luc Jaulmes, Marc Casas, Miquel Moretó, Eduard Ayguadé, Jesús Labarta, Mateo Valero |
| 2015 | Exploring network optimizations for large-scale graph analytics. Xinyu Que, Fabio Checconi, Fabrizio Petrini, Xing Liu, Daniele Buono |
| 2015 | Fault tolerant MapReduce-MPI for HPC clusters. Yanfei Guo, Wesley Bland, Pavan Balaji, Xiaobo Zhou |
| 2015 | Finding the limits of power-constrained application performance. Peter E. Bailey, Aniruddha Marathe, David K. Lowenthal, Barry Rountree, Martin Schulz |
| 2015 | Frugal ECC: efficient and versatile memory error protection through fine-grained compression. Jungrae Kim, Michael B. Sullivan, Seong-Lyong Gong, Mattan Erez |
| 2015 | Full correlation matrix analysis of fMRI data on Intel® Xeon Phi™ coprocessors. Yida Wang, Michael J. Anderson, Jonathan D. Cohen, Alexander Heinecke, Kai Li, Nadathur Satish, Narayanan Sundaram, Nicholas B. Turk-Browne, Theodore L. Willke |
| 2015 | GossipMap: a distributed community detection algorithm for billion-edge directed graphs. Seung-Hee Bae, Bill Howe |
| 2015 | GraphBIG: understanding graph computing in the context of industrial solutions. Lifeng Nai, Yinglong Xia, Ilie Gabriel Tanase, Hyesoon Kim, Ching-Yung Lin |
| 2015 | GraphReduce: processing large-scale graphs on accelerator-based systems. Dipanjan Sengupta, Shuaiwen Leon Song, Kapil Agarwal, Karsten Schwan |
| 2015 | High-performance algebraic multigrid solver optimized for multi-core based distributed parallel systems. Jongsoo Park, Mikhail Smelyanskiy, Ulrike Meier Yang, Dheevatsa Mudigere, Pradeep Dubey |
| 2015 | HipMer: an extreme-scale de novo genome assembler. Evangelos Georganas, Aydin Buluç, Jarrod Chapman, Steven A. Hofmeyr, Chaitanya Aluru, Rob Egan, Leonid Oliker, Daniel Rokhsar, Katherine A. Yelick |
| 2015 | HydraDB: a resilient RDMA-driven key-value middleware for in-memory cluster computing. Yandong Wang, Li Zhang, Jian Tan, Min Li, Yuqing Gao, Xavier Guerin, Xiaoqiao Meng, Shicong Meng |
| 2015 | IOrchestra: supporting high-performance data-intensive applications in the cloud via collaborative virtualization. Ron Chi-Lung Chiang, H. Howie Huang, Timothy Wood, Changbin Liu, Oliver Spatscheck |
| 2015 | Implicit nonlinear wave simulation with 1.08T DOF and 0.270T unstructured finite elements to enhance comprehensive earthquake simulation. Tsuyoshi Ichimura, Kohei Fujita, Pher Errol Balde Quinay, Lalith Maddegedara, Muneo Hori, Seizo Tanaka, Yoshihisa Shizawa, Hiroshi Kobayashi, Kazuo Minami |
| 2015 | Improving backfilling by using machine learning to predict running times. Éric Gaussier, David Glesser, Valentin Reis, Denis Trystram |
| 2015 | Improving concurrency and asynchrony in multithreaded MPI applications using software offloading. Karthikeyan Vaidyanathan, Dhiraj D. Kalamkar, Kiran Pamnany, Jeff R. Hammond, Pavan Balaji, Dipankar Das, Jongsoo Park, Bálint Joó |
| 2015 | Improving the scalability of the ocean barotropic solver in the Community Earth System Model. Yong Hu, Xiaomeng Huang, Allison H. Baker, Yu-heng Tseng, Frank O. Bryan, John M. Dennis, Guangwen Yang |
| 2015 | Large-scale compute-intensive analysis via a combined in-situ and co-scheduling workflow approach. Christopher M. Sewell, Katrin Heitmann, Hal Finkel, George Zagaris, Suzanne Parete-Koon, Patricia K. Fasel, Adrian Pope, Nicholas Frontiere, Li-Ta Lo, O. E. Bronson Messer, Salman Habib, James P. Ahrens |
| 2015 | Local recovery and failure masking for stencil-based applications at extreme scales. Marc Gamell, Keita Teranishi, Michael A. Heroux, Jackson R. Mayo, Hemanth Kolla, Jacqueline Chen, Manish Parashar |
| 2015 | Mantle: a programmable metadata load balancer for the Ceph file system. Michael A. Sevilla, Noah Watkins, Carlos Maltzahn, Ike Nassi, Scott A. Brandt, Sage A. Weil, Greg Farnum, Sam Fineberg |
| 2015 | Massively parallel models of the human circulatory system. Amanda Randles, Erik W. Draeger, Tomas Oppelstrup, Liam Krauss, John A. Gunnels |
| 2015 | Massively parallel phase-field simulations for ternary eutectic directional solidification. Martin Bauer, Johannes Hötzer, Marcus Jainta, Philipp Steinmetz, Marco Berghoff, Florian Schornbaum, Christian Godenschwager, Harald Köstler, Britta Nestler, Ulrich Rüde |
| 2015 | Memory access patterns: the missing piece of the multi-GPU puzzle. Tal Ben-Nun, Ely Levy, Amnon Barak, Eri Rubin |
| 2015 | Monetary cost optimizations for MPI-based HPC applications on Amazon clouds: checkpoints and replicated execution. Yifan Gong, Bingsheng He, Amelie Chi Zhou |
| 2015 | Multi-objective job placement in clusters. Sergey Blagodurov, Alexandra Fedorova, Evgeny Vinnik, Tyler Dwyer, Fabien Hermenier |
| 2015 | Network endpoint congestion control for fine-grained communication. Nan Jiang, Larry R. Dennison, William J. Dally |
| 2015 | Node variability in large-scale power measurements: perspectives from the Green500, Top500 and EEHPCWG. Thomas Scogland, Jonathan Azose, David Rohr, Suzanne Rivoire, Natalie J. Bates, Daniel Hackenberg |
| 2015 | Optimal scheduling of in-situ analysis for large-scale scientific simulations. Preeti Malakar, Venkatram Vishwanath, Todd S. Munson, Christopher Knight, Mark Hereld, Sven Leyffer, Michael E. Papka |
| 2015 | PGX.D: a fast distributed graph processing engine. Sungpack Hong, Siegfried Depner, Thomas Manhardt, Jan Van Der Lugt, Merijn Verstraaten, Hassan Chafi |
| 2015 | Parallel distributed memory construction of suffix and longest common prefix arrays. Patrick Flick, Srinivas Aluru |
| 2015 | Parallel implementation and performance optimization of the configuration-interaction method. Hongzhang Shan, Samuel Williams, Calvin W. Johnson, Kenneth S. McElvain, W. Erich Ormand |
| 2015 | Particle tracking in open simulation laboratories. Kalin Kanov, Randal C. Burns |
| 2015 | Performance of random sampling for computing low-rank approximations of a dense matrix on GPUs. Théo Mary, Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra |
| 2015 | Performance optimization for the k-nearest neighbors kernel on x86 architectures. Chenhan D. Yu, Jianyu Huang, Woody Austin, Bo Xiao, George Biros |
| 2015 | Practical scalable consensus for pseudo-synchronous distributed systems. Thomas Hérault, Aurélien Bouteiller, George Bosilca, Marc Gamell, Keita Teranishi, Manish Parashar, Jack J. Dongarra |
| 2015 | Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2015, Austin, TX, USA, November 15-20, 2015. Jackie Kern, Jeffrey S. Vetter |
| 2015 | Profile-based power shifting in interconnection networks with on/off links. Shinobu Miwa, Hiroshi Nakamura |
| 2015 | Pushing back the limit of Mauro Calderara, Sascha Brück, Andreas Pedersen, Mohammad H. Bani-Hashemian, Joost VandeVondele, Mathieu Luisier |
| 2015 | Randomized algorithms to update partial singular value decomposition on a hybrid CPU/GPU cluster. Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Jack J. Dongarra |
| 2015 | Recovering logical structure from Charm++ event traces. Katherine E. Isaacs, Abhinav Bhatele, Jonathan Lifflander, David Böhme, Todd Gamblin, Martin Schulz, Bernd Hamann, Peer-Timo Bremer |
| 2015 | Regent: a high-productivity programming language for HPC with logical regions. Elliott Slaughter, Wonchan Lee, Sean Treichler, Michael Bauer, Alex Aiken |
| 2015 | Relative debugging for a highly parallel hybrid computer system. Luiz De Rose, Andrew Gontarek, Aaron Vose, Robert Moench, David Abramson, Minh Ngoc Dinh, Chao Jin |
| 2015 | Reliability lessons learned from GPU experience with the Titan supercomputer at Oak Ridge Leadership Computing Facility. Devesh Tiwari, Saurabh Gupta, George Gallarno, Jim Rogers, Don Maxwell |
| 2015 | Runtime-driven shared last-level cache management for task-parallel programs. Abhisek Pan, Vijay S. Pai |
| 2015 | STELLA: a domain-specific tool for structured grid methods in weather and climate models. Tobias Gysi, Carlos Osuna, Oliver Fuhrer, Mauro Bianco, Thomas C. Schulthess |
| 2015 | STS-k: a multilevel sparse triangular solution scheme for NUMA multicores. Humayun Kabir, Joshua Dennis Booth, Guillaume Aupy, Anne Benoit, Yves Robert, Padma Raghavan |
| 2015 | ScaAnalyzer: a tool to identify memory scalability bottlenecks in parallel programs. Xu Liu, Bo Wu |
| 2015 | Scalable sparse tensor decompositions in distributed memory systems. Oguz Kaya, Bora Uçar |
| 2015 | Scaling iterative graph computations with GraphMap. Kisung Lee, Ling Liu, Karsten Schwan, Calton Pu, Qi Zhang, Yang Zhou, Emre Yigitoglu, Pingpeng Yuan |
| 2015 | Scientific benchmarking of parallel computing systems: twelve ways to tell the masses when reporting performance results. Torsten Hoefler, Roberto Belli |
| 2015 | Smart: a MapReduce-like framework for in-situ scientific analytics. Yi Wang, Gagan Agrawal, Tekin Bicer, Wei Jiang |
| 2015 | The Spack package manager: bringing order to HPC software chaos. Todd Gamblin, Matthew P. LeGendre, Michael R. Collette, Gregory L. Lee, Adam Moody, Bronis R. de Supinski, Scott Futral |
| 2015 | The in-silico lab-on-a-chip: petascale and high-throughput simulations of microfluidics at cell resolution. Diego Rossinelli, Yu-Hang Tang, Kirill Lykov, Dmitry Alexeev, Massimo Bernaschi, Panagiotis E. Hadjidoukas, Mauro Bisson, Wayne Joubert, Christian Conti, George E. Karniadakis, Massimiliano Fatica, Igor Pivkin, Petros Koumoutsakos |
| 2015 | Understanding the propagation of transient errors in HPC applications. Rizwan A. Ashraf, Roberto Gioiosa, Gokcen Kestor, Ronald F. DeMara, Chen-Yong Cher, Pradip Bose |
| 2015 | VOCL-FT: introducing techniques for efficient soft error coprocessor recovery. Antonio J. Peña, Wesley Bland, Pavan Balaji |