| 2017 | 0.5 petabyte simulation of a 45-qubit quantum circuit. Thomas Häner, Damian S. Steiger |
| 2017 | 18.9-Pflops nonlinear earthquake simulation on Sunway TaihuLight: enabling depiction of 18-Hz and 8-meter scenarios. Haohuan Fu, Conghui He, Bingwei Chen, Zekun Yin, Zhenguo Zhang, Wenqiang Zhang, Tingjian Zhang, Wei Xue, Weiguo Liu, Wanwang Yin, Guangwen Yang, Xiaofei Chen |
| 2017 | A comparative study of SDN and adaptive routing on dragonfly networks. Peyman Faizian, Md Atiqul Mollah, Zhou Tong, Xin Yuan, Michael Lang |
| 2017 | A configurable rule based classful token bucket filter network request scheduler for the Lustre file system. Yingjin Qian, Xi Li, Shuichi Ihara, Lingfang Zeng, Jürgen Kaiser, Tim Süß, André Brinkmann |
| 2017 | A framework for scalable biophysics-based image analysis. Amir Gholami, Andreas Mang, Klaudius Scheufele, Christos Davatzikos, Miriam Mehl, George Biros |
| 2017 | An efficient MPI/OpenMP parallelization of the Hartree-Fock method for the second generation of Intel Xeon Phi processor. Vladimir A. Mironov, Yuri Alexeev, Kristopher Keipert, Michael D'Mello, Alexander A. Moskovsky, Mark S. Gordon |
| 2017 | CAPES: unsupervised storage performance tuning using neural network-based deep reinforcement learning. Yan Li, Kenneth Chang, Oceane Bel, Ethan L. Miller, Darrell D. E. Long |
| 2017 | Charliecloud: unprivileged containers for user-defined software stacks in HPC. Reid Priedhorsky, Tim Randles |
| 2017 | Control replication: compiling implicit parallelism to efficient SPMD with logical regions. Elliott Slaughter, Wonchan Lee, Sean Treichler, Wen Zhang, Michael Bauer, Galen M. Shipman, Patrick S. McCormick, Alex Aiken |
| 2017 | Correcting soft errors online in fast Fourier transform. Xin Liang, Jieyang Chen, Dingwen Tao, Sihuan Li, Panruo Wu, Hongbo Li, Kaiming Ouyang, Yuanlai Liu, Fengguang Song, Zizhong Chen |
| 2017 | DataRaceBench: a benchmark suite for systematic evaluation of data race detection tools. Chunhua Liao, Pei-Hung Lin, Joshua Asplund, Markus Schordan, Ian Karlin |
| 2017 | Deep learning at 15PF: supervised and semi-supervised classification for scientific data. Thorsten Kurth, Jian Zhang, Nadathur Satish, Evan Racah, Ioannis Mitliagkas, Md. Mostofa Ali Patwary, Tareq M. Malas, Narayanan Sundaram, Wahid Bhimji, Mikhail Smorkalov, Jack Deslippe, Mikhail Shiryaev, Srinivas Sridharan, Prabhat, Pradeep Dubey |
| 2017 | Designing vector-friendly compact BLAS and LAPACK kernels. Kyungjoo Kim, Timothy B. Costa, Mehmet Deveci, Andrew M. Bradley, Simon D. Hammond, Murat Efe Guney, Sarah Knepper, Shane Story, Sivasankaran Rajamanickam |
| 2017 | Distributed Southwell: an iterative method with low communication costs. Jordi Wolfson-Pou, Edmond Chow |
| 2017 | Efficient and scalable calculation of complex band structure using Sakurai-Sugiura method. Shigeru Iwase, Yasunori Futamura, Akira Imakura, Tetsuya Sakurai, Tomoya Ono |
| 2017 | Efficient process mapping in geo-distributed cloud data centers. Amelie Chi Zhou, Yifan Gong, Bingsheng He, Jidong Zhai |
| 2017 | Egeria: a framework for automatic synthesis of HPC advising tools through multi-layered natural language processing. Hui Guan, Xipeng Shen, Hamid Krim |
| 2017 | Embracing a new era of highly efficient and productive quantum Monte Carlo simulations. Amrita Mathuriya, Ye Luo, Raymond C. Clay III, Anouar Benali, Luke Shulenburger, Jeongnim Kim |
| 2017 | Experimental and analytical study of Xeon Phi reliability. Daniel Oliveira, Laércio Lima Pilla, Nathan DeBardeleben, Sean Blanchard, Heather Quinn, Israel Koren, Philippe O. A. Navaux, Paolo Rech |
| 2017 | Exploring and analyzing the real impact of modern on-package memory on HPC scientific kernels. Ang Li, Weifeng Liu, Mads Ruben Burgdorff Kristensen, Brian Vinter, Hao Wang, Kaixi Hou, Andres Marquez, Shuaiwen Leon Song |
| 2017 | Extreme scale multi-physics simulations of the tsunamigenic 2004 Sumatra megathrust earthquake. Carsten Uphoff, Sebastian Rettenberger, Michael Bader, Elizabeth H. Madden, Thomas Ulrich, Stephanie Wollherr, Alice-Agnes Gabriel |
| 2017 | Failures in large scale systems: long-term measurement, analysis, and implications. Saurabh Gupta, Tirthak Patel, Christian Engelmann, Devesh Tiwari |
| 2017 | GPU triggered networking for intra-kernel communications. Michael LeBeane, Khaled Hamidouche, Brad Benton, Maurício Breternitz, Steven K. Reinhardt, Lizy K. John |
| 2017 | GUIDE: a scalable information directory service to collect, federate, and analyze logs for operational insights into a leadership HPC facility. Sudharshan S. Vazhkudai, Ross G. Miller, Devesh Tiwari, Christopher Zimmer, Feiyi Wang, Sarp Oral, Raghul Gunasekaran, Deryl Steinert |
| 2017 | Galactos: computing the anisotropic 3-point correlation function for 2 billion galaxies. Brian Friesen, Md. Mostofa Ali Patwary, Brian Austin, Nadathur Satish, Zachary Slepian, Narayanan Sundaram, Deborah Bard, Daniel J. Eisenstein, Jack Deslippe, Pradeep Dubey, Prabhat |
| 2017 | Geometry-oblivious FMM for compressing dense SPD matrices. Chenhan D. Yu, James Levitt, Severin Reiz, George Biros |
| 2017 | Gravel: fine-grain GPU-initiated network messages. Marc S. Orr, Shuai Che, Bradford M. Beckmann, Mark Oskin, Steven K. Reinhardt, David A. Wood |
| 2017 | Input-aware auto-tuning of compute-bound HPC kernels. Philippe Tillet, David D. Cox |
| 2017 | Large-scale adaptive mesh simulations through non-volatile byte-addressable memory. Bao Nguyen, Hua Tan, Xuechen Zhang |
| 2017 | Leveraging near data processing for high-performance checkpoint/restart. Abhinav Agrawal, Gabriel H. Loh, James Tuck |
| 2017 | LocoFS: a loosely-coupled metadata service for distributed file systems. Siyang Li, Youyou Lu, Jiwu Shu, Yang Hu, Tao Li |
| 2017 | Low communication FMM-accelerated FFT on GPUs. Cris Cecka |
| 2017 | Massively parallel 3D image reconstruction. Xiao Wang, Amit Sabne, Putt Sakdhnagool, Sherman J. Kisner, Charles A. Bouman, Samuel P. Midkiff |
| 2017 | Melissa: large scale in transit sensitivity analysis avoiding intermediate files. Théophile Terraz, Alejandro Ribés, Yvan Fournier, Bertrand Iooss, Bruno Raffin |
| 2017 | Obtaining dynamic scheduling policies with simulation and machine learning. Danilo Carastan-Santos, Raphael Y. de Camargo |
| 2017 | Optimizing geometric multigrid method computation using a DSL approach. Vinay Vasista, Kumudha Narasimhan, Siddharth Bhat, Uday Bondhugula |
| 2017 | Optimizing the query performance of block index through data analysis and I/O modeling. Tzu-Hsien Wu, Jerry Chi-Yuan Chou, Shyng Hao, Bin Dong, Scott Klasky, Kesheng Wu |
| 2017 | PapyrusKV: a high-performance parallel key-value store for distributed NVM architectures. Jungwon Kim, Seyong Lee, Jeffrey S. Vetter |
| 2017 | Parastack: efficient hang detection for MPI programs at large scale. Hongbo Li, Zizhong Chen, Rajiv Gupta |
| 2017 | Performance modeling under resource constraints using deep transfer learning. Aniruddha Marathe, Rushil Anirudh, Nikhil Jain, Abhinav Bhatele, Jayaraman J. Thiagarajan, Bhavya Kailkhura, Jae-Seung Yeom, Barry Rountree, Todd Gamblin |
| 2017 | Predicting the performance impact of different fat-tree configurations. Nikhil Jain, Abhinav Bhatele, Louis H. Howell, David Böhme, Ian Karlin, Edgar A. León, Misbah Mubarak, Noah Wolfe, Todd Gamblin, Matthew L. Leininger |
| 2017 | Probabilistic guarantees of execution duration for Amazon spot instances. Rich Wolski, John Brevik, Ryan Chard, Kyle Chard |
| 2017 | Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2017, Denver, CO, USA, November 12 - 17, 2017. Bernd Mohr, Padma Raghavan |
| 2017 | REFINE: realistic fault injection via compiler-based instrumentation for accuracy, portability and speed. Giorgis Georgakoudis, Ignacio Laguna, Dimitrios S. Nikolopoulos, Martin Schulz |
| 2017 | Redesigning CAM-SE for peta-scale climate modeling performance and ultra-high resolution on Sunway TaihuLight. Haohuan Fu, Junfeng Liao, Nan Ding, Xiaohui Duan, Lin Gan, Yishuang Liang, Xinliang Wang, Jinzhe Yang, Yan Zheng, Weiguo Liu, Lanning Wang, Guangwen Yang |
| 2017 | Representative paths analysis. Nathan R. Tallent, Darren J. Kerbyson, Adolfy Hoisie |
| 2017 | Run-to-run variability on Xeon Phi based Cray XC systems. Sudheer Chunduri, Kevin Harms, Scott Parker, Vitali A. Morozov, Samuel Oshin, Naveen Cherukuri, Kalyan Kumaran |
| 2017 | Scalable reduction collectives with data partitioning-based multi-leader design. Mohammadreza Bayatpour, Sourav Chakraborty, Hari Subramoni, Xiaoyi Lu, Dhabaleswar K. Panda |
| 2017 | Scaling betweenness centrality using communication-efficient sparse matrix multiplication. Edgar Solomonik, Maciej Besta, Flavio Vella, Torsten Hoefler |
| 2017 | Scaling deep learning on GPU and Knights Landing clusters. Yang You, Aydin Buluç, James Demmel |
| 2017 | Scientific user behavior and data-sharing trends in a petascale file system. Seung-Hwan Lim, Hyogi Sim, Raghul Gunasekaran, Sudharshan S. Vazhkudai |
| 2017 | ScrubJay: deriving knowledge from the disarray of HPC performance data. Alfredo Giménez, Todd Gamblin, Abhinav Bhatele, Chad Wood, Kathleen Shoga, Aniruddha Marathe, Peer-Timo Bremer, Bernd Hamann, Martin Schulz |
| 2017 | Securing HPC: development of a low cost, open source multi-factor authentication infrastructure. W. Cyrus Proctor, Patrick Storm, Matthew R. Hanlon, Nathaniel Mendoza |
| 2017 | Sympiler: transforming sparse matrix codes by decoupling symbolic analysis. Kazem Cheshmi, Shoaib Kamil, Michelle Mills Strout, Maryam Mehri Dehnavi |
| 2017 | Tagit: an integrated indexing and search service for file systems. Hyogi Sim, Youngjae Kim, Sudharshan S. Vazhkudai, Geoffroy R. Vallée, Seung-Hwan Lim, Ali Raza Butt |
| 2017 | Tessellating stencils. Liang Yuan, Yunquan Zhang, Peng Guo, Shan Huang |
| 2017 | Topology-aware GPU scheduling for learning workloads in cloud environments. Marcelo Amaral, Jordà Polo, David Carrera, Seetharami R. Seelam, Malgorzata Steinder |
| 2017 | Toward standardized near-data processing with unrestricted data placement for GPUs. Gwangsun Kim, Niladrish Chatterjee, Mike O'Connor, Kevin Hsieh |
| 2017 | Towards fine-grained dynamic tuning of HPC applications on modern multi-core architectures. Mohammed Sourouri, Espen Birger Raknes, Nico Reissmann, Johannes Langguth, Daniel Hackenberg, Robert Schöne, Per Gunnar Kjeldsberg |
| 2017 | Transactional NVM cache with high performance and crash consistency. Qingsong Wei, Chundong Wang, Cheng Chen, Yechao Yang, Jun Yang, Mingdi Xue |
| 2017 | Understanding error propagation in deep learning neural network (DNN) accelerators and applications. Guanpeng Li, Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, Karthik Pattabiraman, Joel S. Emer, Stephen W. Keckler |
| 2017 | Understanding object-level memory access patterns across the spectrum. Xu Ji, Chao Wang, Nosayba El-Sayed, Xiaosong Ma, Youngjae Kim, Sudharshan S. Vazhkudai, Wei Xue, Daniel Sánchez |
| 2017 | Unimem: runtime data management on non-volatile memory-based heterogeneous main memory. Kai Wu, Yingchao Huang, Dong Li |
| 2017 | Why is MPI so slow?: analyzing the fundamental limits in implementing MPI-3.1. Ken Raffenetti, Abdelhalim Amer, Lena Oden, Charles Archer, Wesley Bland, Hajime Fujita, Yanfei Guo, Tomislav Janjusic, Dmitry Durnov, Michael Blocksome, Min Si, Sangmin Seo, Akhil Langer, Gengbin Zheng, Masamichi Takagi, Paul K. Coffman, Jithin Jose, Sayantan Sur, Alexander Sannikov, Sergey Oblomov, Michael Chuvelev, Masayuki Hatanaka, Xin Zhao, Paul F. Fischer, Thilina Rathnayake, Matthew Otten, Misun Min, Pavan Balaji |
| 2017 | sPIN: high-performance streaming processing in the network. Torsten Hoefler, Salvatore Di Girolamo, Konstantin Taranov, Ryan E. Grant, Ron Brightwell |