SC A

102 papers

Year Title / Authors
2020 A 1024-member ensemble data assimilation with 3.5-km mesh global weather simulations.
Hisashi Yashiro, Koji Terasaki, Yuta Kawai, Shuhei Kudo, Takemasa Miyoshi, Toshiyuki Imamura, Kazuo Minami, Hikaru Inoue, Tatsuo Nishiki, Takayuki Saji, Masaki Satoh, Hirofumi Tomita
2020 A hierarchical and load-aware design for large message neighborhood collectives.
S. Mahdieh Ghazimirsaeed, Qinghua Zhou, Amit Ruhela, Mohammadreza Bayatpour
2020 A parallel framework for constraint-based Bayesian network learning via Markov blanket discovery.
Ankit Srivastava, Sriram P. Chockalingam, Srinivas Aluru
2020 A performance-portable nonhydrostatic atmospheric dycore for the Energy Exascale Earth System Model running at cloud-resolving resolutions.
Luca Bertagna, Oksana Guba, Mark A. Taylor, James G. Foucar, Jeff Larkin, Andrew M. Bradley, Sivasankaran Rajamanickam, Andrew G. Salinger
2020 A submatrix-based method for approximate matrix function evaluation in the quantum chemistry code CP2K.
Michael Lass, Robert Schade, Thomas D. Kühne, Christian Plessl
2020 ANT-man: towards agile power management in the microservice era.
Xiaofeng Hou, Chao Li, Jiacheng Liu, Lu Zhang, Yang Hu, Minyi Guo
2020 Accelerating large-scale excited-state GW calculations on leadership HPC systems.
Mauro Del Ben, Charlene Yang, Zhenglu Li, Felipe H. da Jornada, Steven G. Louie, Jack Deslippe
2020 Accelerating sparse DNN models without hardware-support via tile-wise sparsity.
Cong Guo, Bo Yang Hsueh, Jingwen Leng, Yuxian Qiu, Yue Guan, Zehuan Wang, Xiaoying Jia, Xipeng Li, Minyi Guo, Yuhao Zhu
2020 Acceleration of fusion plasma turbulence simulations using the mixed-precision communication-avoiding Krylov method.
Yasuhiro Idomura, Takuya Ina, Yussuf Ali, Toshiyuki Imamura
2020 Alias-free, matrix-free,
Ammar Hakim, James Juno
2020 Alita: comprehensive performance isolation through bias resource management for public clouds.
Quan Chen, Shuai Xue, Shang Zhao, Shanpei Chen, Yihao Wu, Yu Xu, Zhuo Song, Tao Ma, Yong Yang, Minyi Guo
2020 An efficient and non-intrusive GPU scheduling framework for deep learning training systems.
Shaoqi Wang, Oscar J. Gonzalez, Xiaobo Zhou, Thomas Williams, Brian D. Friedman, Martin Havemann, Thomas Y. C. Woo
2020 An in-depth analysis of the Slingshot interconnect.
Daniele De Sensi, Salvatore Di Girolamo, Kim H. McMahon, Duncan Roweth, Torsten Hoefler
2020 Architecture and performance studies of 3D-Hyper-FleX-LION for reconfigurable all-to-all HPC networks.
Gengchen Liu, Roberto Proietti, Marjan Fariborz, Pouya Fotouhi, Xian Xiao, S. J. Ben Yoo
2020 BORA: a bag optimizer for robotic analysis.
Jian Zhang, Tao Xie, Yuzhuo Jing, Yanjie Song, Guanzhou Hu, Si Chen, Shu Yin
2020 Batch: machine learning inference serving on serverless platforms with adaptive batching.
Ahsan Ali, Riccardo Pinciroli, Feng Yan, Evgenia Smirni
2020 BiQGEMM: matrix multiplication with lookup table for binary-coding-based quantized DNNs.
Yongkweon Jeon, Baeseong Park, Se Jung Kwon, Byeongwook Kim, Jeongin Yun, Dongsoo Lee
2020 C-SAW: a framework for graph sampling and random walk on GPUs.
Santosh Pandey, Lingda Li, Adolfy Hoisie, Xiaoye S. Li, Hang Liu
2020 CAB-MPI: exploring interprocess work-stealing towards balanced MPI communication.
Kaiming Ouyang, Min Si, Atsushi Hori, Zizhong Chen, Pavan Balaji
2020 CCAMP: an integrated translation and optimization framework for OpenACC and OpenMP.
Jacob Lambert, Seyong Lee, Jeffrey S. Vetter, Allen D. Malony
2020 CRAC: checkpoint-restart architecture for CUDA with streams and UVM.
Twinkle Jain, Gene Cooperman
2020 Cell-list based molecular dynamics on many-core processors: a case study on Sunway TaihuLight supercomputer.
Xiaohui Duan, Ping Gao, Meng Zhang, Tingjian Zhang, Hongsong Meng, Yuxuan Li, Bertil Schmidt, Haohuan Fu, Lin Gan, Wei Xue, Weiguo Liu, Guangwen Yang
2020 Chronicles of Astra: challenges and lessons from the first petascale Arm supercomputer.
Kevin T. Pedretti, Andrew J. Younge, Simon D. Hammond, James H. Laros III, Matthew L. Curry, Michael J. Aguilar, Robert J. Hoekstra, Ron Brightwell
2020 Co-design for A64FX manycore processor and "Fugaku".
Mitsuhisa Sato, Yutaka Ishikawa, Hirofumi Tomita, Yuetsu Kodama, Tetsuya Odajima, Miwako Tsuji, Hisashi Yashiro, Masaki Aoki, Naoyuki Shida, Ikuo Miyoshi, Kouichi Hirai, Atsushi Furuya, Akira Asato, Kuniki Morita, Toshiyuki Shimizu
2020 Compiler-based timing for extremely fine-grain preemptive parallelism.
Souradip Ghosh, Michael Cuevas, Simone Campanoni, Peter A. Dinda
2020 Compiling generalized histograms for GPU.
Troels Henriksen, Sune Hellfritzsch, Ponnuswamy Sadayappan, Cosmin E. Oancea
2020 Convolutional neural network training with distributed K-FAC.
J. Gregory Pauloski, Zhao Zhang, Lei Huang, Weijia Xu, Ian T. Foster
2020 Cost-aware prediction of uncorrected DRAM errors in the field.
Isaac Boixaderas, Darko Zivanovic, Sergi Moré, Javier Bartolome, David Vicente, Marc Casas, Paul M. Carpenter, Petar Radojkovic, Eduard Ayguadé
2020 Density matrix quantum circuit simulation via the BSP machine on modern GPU clusters.
Ang Li, Omer Subasi, Xiu Yang, Sriram Krishnamoorthy
2020 Distributed many-to-many protein sequence alignment using sparse matrices.
Oguz Selvitopi, Saliya Ekanayake, Giulia Guidi, Georgios A. Pavlopoulos, Ariful Azad, Aydin Buluç
2020 Distributed-memory DMRG via sparse and dense parallel tensor contractions.
Ryan Levy, Edgar Solomonik, Bryan K. Clark
2020 Distributed-memory parallel symmetric nonnegative matrix factorization.
Srinivas Eswar, Koby Hayashi, Grey Ballard, Ramakrishnan Kannan, Richard W. Vuduc, Haesun Park
2020 DrCCTProf: a fine-grained call path profiler for ARM-based clusters.
Qidong Zhao, Xu Liu, Milind Chabbi
2020 Efficient 2D tensor network simulation of quantum systems.
Yuchen Pang, Tianyi Hao, Annika Dugad, Yiqing Zhou, Edgar Solomonik
2020 Efficient tiled sparse matrix multiplication through matrix signatures.
Süreyya Emre Kurt, Aravind Sukumaran-Rajam, Fabrice Rastello, P. Sadayappan
2020 Evaluation of a minimally synchronous algorithm for 2:1 octree balance.
Hansol Suh, Tobin Isaac
2020 Experimental evaluation of NISQ quantum computers: error measurement, characterization, and implications.
Tirthak Patel, Abhay Potharaju, Baolin Li, Rohan Basu Roy, Devesh Tiwari
2020 Fast stencil-code computation on a wafer-scale processor.
Kamil Rocki, Dirk Van Essendelft, Ilya Sharapov, Robert Schreiber, Michael Morrison, Vladimir Kibardin, Andrey Portnoy, Jean-Francois Dietiker, Madhava Syamlal, Michael James
2020 FatPaths: routing in supercomputers and data centers when shortest paths fall short.
Maciej Besta, Marcel Schneider, Marek Konieczny, Karolina Cynk, Erik Henriksson, Salvatore Di Girolamo, Ankit Singla, Torsten Hoefler
2020 FeatGraph: a flexible and efficient backend for graph neural network systems.
Yuwei Hu, Zihao Ye, Minjie Wang, Jiali Yu, Da Zheng, Mu Li, Zheng Zhang, Zhiru Zhang, Yida Wang
2020 Foresight: analysis that matters for data reduction.
Pascal Grosset, Christopher M. Biwer, Jesus Pulido, Arvind T. Mohan, Ayan Biswas, John Patchett, Terece L. Turton, David H. Rogers, Daniel Livescu, James P. Ahrens
2020 GE-SpMM: general-purpose sparse matrix-matrix multiplication on GPUs for graph neural networks.
Guyue Huang, Guohao Dai, Yu Wang, Huazhong Yang
2020 GEMS: GPU-enabled memory-aware model-parallelism system for distributed DNN training.
Arpan Jain, Ammar Ahmad Awan, Asmaa M. Aljuhani, Jahanzeb Maqbool Hashmi, Quentin G. Anthony, Hari Subramoni, Dhabaleswar K. Panda, Raghu Machiraju, Anil Parwani
2020 GPU lifetimes on Titan supercomputer: survival analysis and reliability.
George Ostrouchov, Don Maxwell, Rizwan A. Ashraf, Christian Engelmann, Mallikarjun Shankar, James H. Rogers
2020 GPU-trident: efficient modeling of error propagation in GPU programs.
Abdul Rehman Anwer, Guanpeng Li, Karthik Pattabiraman, Michael B. Sullivan, Timothy Tsai, Siva Kumar Sastry Hari
2020 GVProf: a value profiler for GPU-based clusters.
Keren Zhou, Yueming Hao, John M. Mellor-Crummey, Xiaozhu Meng, Xu Liu
2020 GraphPi: high performance graph pattern matching through effective redundancy elimination.
Tianhui Shi, Mingshu Zhai, Yi Xu, Jidong Zhai
2020 HPC I/O throughput bottleneck analysis with explainable local models.
Mihailo Isakov, Eliakin Del Rosario, Sandeep Madireddy, Prasanna Balaprakash, Philip H. Carns, Robert B. Ross, Michel A. Kinsy
2020 Herring: rethinking the parameter server at scale for the cloud.
Indu Thangakrishnan, Derya Cavdar, Can Karakus, Piyush Ghai, Yauheni Selivonchyk, Cory Pruce
2020 High-performance parallel graph coloring with strong guarantees on work, depth, and quality.
Maciej Besta, Armon Carigiet, Kacper Janda, Zur Vonarburg-Shmaria, Lukas Gianinazzi, Torsten Hoefler
2020 INEC: fast and coherent in-network erasure coding.
Haiyang Shi, Xiaoyi Lu
2020 Improving all-to-many personalized communication in two-phase I/O.
Qiao Kang, Robert B. Ross, Robert Latham, Sunwoo Lee, Ankit Agrawal, Alok N. Choudhary, Wei-keng Liao
2020 Iris: allocation banking and identity and access management for the exascale era.
Gabor Torok, Mark R. Day, Rebecca Hartman-Baker, Cory Snavely
2020 Job characteristics on large-scale systems: long-term analysis, quantification, and implications.
Tirthak Patel, Zhengchun Liu, Raj Kettimuthu, Paul Rich, William E. Allcock, Devesh Tiwari
2020 Kraken: memory-efficient continual learning for large-scale real-time recommendations.
Minhui Xie, Kai Ren, Youyou Lu, Guangxu Yang, Qingxing Xu, Bihai Wu, Jiazhen Lin, Hongbo Ao, Wanhong Xu, Jiwu Shu
2020 Live forensics for HPC systems: a case study on distributed storage systems.
Saurabh Jha, Shengkun Cui, Subho S. Banerjee, Tianyin Xu, Jeremy Enos, Mike Showerman, Zbigniew T. Kalbarczyk, Ravishankar K. Iyer
2020 Massive parallelization for finding shortest lattice vectors based on ubiquity generator framework.
Nariaki Tateiwa, Yuji Shinano, Satoshi Nakamura, Akihiro Yoshida, Shizuo Kaji, Masaya Yasuda, Katsuki Fujisawa
2020 MeshfreeFlowNet: a physics-constrained deep continuous space-time super-resolution framework.
Chiyu Max Jiang, Soheil Esmaeilzadeh, Kamyar Azizzadenesheli, Karthik Kashinath, Mustafa Mustafa, Hamdi A. Tchelepi, Philip Marcus, Prabhat, Anima Anandkumar
2020 Metis: learning to schedule long-running applications in shared container clusters at scale.
Luping Wang, Qizhen Weng, Wei Wang, Chen Chen, Bo Li
2020 MoHA: a composable system for efficient in-situ analytics on heterogeneous HPC systems.
Haoyuan Xing, Gagan Agrawal, Rajiv Ramnath
2020 Multi-node multi-GPU diffeomorphic image registration for large-scale imaging problems.
Malte Brunn, Naveen Himthani, George Biros, Miriam Mehl, Andreas Mang
2020 Newton-ADMM: a distributed GPU-accelerated optimizer for multiclass classification problems.
Chih-Hao Fang, Sudhir B. Kylasa, Fred Roosta, Michael W. Mahoney, Ananth Grama
2020 OMPRacer: a scalable and precise static race detector for OpenMP programs.
Bradley Swain, Yanze Li, Peiming Liu, Ignacio Laguna, Giorgis Georgakoudis, Jeff Huang
2020 Optimizing deep learning recommender systems training on CPU cluster architectures.
Dhiraj D. Kalamkar, Evangelos Georganas, Sudarshan Srinivasan, Jianping Chen, Mikhail Shiryaev, Alexander Heinecke
2020 Pencil: a pipelined algorithm for distributed stencils.
Hengjie Wang, Aparna Chandramowlishwaran
2020 Petascale XCT: 3D image reconstruction with hierarchical communications on multi-GPU nodes.
Mert Hidayetoglu, Tekin Bicer, Simon Garcia De Gonzalo, Bin Ren, Vincent De Andrade, Doga Gürsoy, Raj Kettimuthu, Ian T. Foster, Wen-mei W. Hwu
2020 Preempt: scalable epidemic interventions using submodular optimization on multi-GPU systems.
Marco Minutoli, Prathyush Sambaturu, Mahantesh Halappanavar, Antonino Tumeo, Ananth Kalyanaraman, Anil Vullikanti
2020 Preparing nuclear astrophysics for exascale.
Max P. Katz, Ann S. Almgren, Maria Barrios Sazo, Kiran Eiden, Kevin Gott, Alice Harpole, Jean M. Sexton, Donald E. Willcox, Weiqun Zhang, Michael Zingale
2020 Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2020, Virtual Event / Atlanta, Georgia, USA, November 9-19, 2020
Christine Cuicchi, Irene Qualters, William T. Kramer
2020 Processing full-scale Square Kilometre Array data on the Summit supercomputer.
Ruonan Wang, Rodrigo Tobar, Markus Dolensky, Tao An, Andreas Wicenec, Chen Wu, Fred Dulwich, Norbert Podhorszki, Valentine Anantharaj, Eric Suchyta, Bao-qiang Lao, Scott Klasky
2020 Pushing the limit of molecular dynamics with
Weile Jia, Han Wang, Mohan Chen, Denghui Lu, Lin Lin, Roberto Car, Weinan E, Linfeng Zhang
2020 RDMP-KV: designing remote direct memory persistence based key-value stores with PMEM.
Tianxi Li, Dipti Shankar, Shashank Gugnani, Xiaoyi Lu
2020 RLScheduler: an automated HPC batch job scheduler using reinforcement learning.
Di Zhang, Dong Dai, Youbiao He, Forrest Sheng Bao, Bing Xie
2020 Recurrent neural network architecture search for geophysical emulation.
Romit Maulik, Romain Egele, Bethany Lusch, Prasanna Balaprakash
2020 Reducing communication in graph neural network training.
Alok Tripathy, Katherine A. Yelick, Aydin Buluç
2020 Rocket: efficient and scalable all-pairs computations on heterogeneous platforms.
Stijn Heldens, Pieter Hijma, Ben van Werkhoven, Jason Maassen, Henri E. Bal, Rob van Nieuwpoort
2020 Runtime-guided ECC protection using online estimation of memory vulnerability.
Luc Jaulmes, Miquel Moretó, Mateo Valero, Mattan Erez, Marc Casas
2020 SEFEE: lightweight storage error forecasting in large-scale enterprise storage systems.
Amirhessam Yazdi, Xing Lin, Lei Yang, Feng Yan
2020 ScalAna: automating scaling loss detection with graph analysis.
Yuyang Jin, Haojie Wang, Teng Yu, Xiongchao Tang, Torsten Hoefler, Xu Liu, Jidong Zhai
2020 Scalable heterogeneous execution of a coupled-cluster model with perturbative triples.
Jinsung Kim, Ajay Panyala, Bo Peng, Karol Kowalski, P. Sadayappan, Sriram Krishnamoorthy
2020 Scalable knowledge graph analytics at 136 petaflop/s.
Ramakrishnan Kannan, Piyush Sao, Hao Lu, Drahomira Herrmannova, Vijay Thakkar, Robert M. Patton, Richard W. Vuduc, Thomas E. Potok
2020 Scalable yet rigorous floating-point error analysis.
Arnab Das, Ian Briggs, Ganesh Gopalakrishnan, Sriram Krishnamoorthy, Pavel Panchekha
2020 Scaling distributed deep learning workloads beyond the memory capacity with KARMA.
Mohamed Wahib, Haoyu Zhang, Truong Thao Nguyen, Aleksandr Drozd, Jens Domke, Lingqi Zhang, Ryousei Takano, Satoshi Matsuoka
2020 Scaling the Hartree-Fock matrix build on Summit.
Giuseppe M. J. Barca, David L. Poole, Jorge L. Galvez Vallejo, Melisa Alkan, Colleen Bertoni, Alistair P. Rendell, Mark S. Gordon
2020 SegAlign: a scalable GPU-based whole genome aligner.
Sneha D. Goenka, Yatish Turakhia, Benedict Paten, Mark Horowitz
2020 Smart-PGSim: using neural network to accelerate AC-OPF power grid simulation.
Wenqian Dong, Zhen Xie, Gokcen Kestor, Dong Li
2020 SpTFS: sparse tensor format selection for MTTKRP via deep learning.
Qingxiao Sun, Yi Liu, Ming Dun, Hailong Yang, Zhongzhi Luan, Lin Gan, Guangwen Yang, Depei Qian
2020 Sparse GPU kernels for deep learning.
Trevor Gale, Matei Zaharia, Cliff Young, Erich Elsen
2020 Speeding up SpMV for power-law graph analytics by enhancing locality & vectorization.
Serif Yesil, Azin Heidarshenas, Adam Morrison, Josep Torrellas
2020 TAGO: rethinking routing design in high performance reconfigurable networks.
Min Yee Teh, Yu-Han Hung, George Michelogiannakis, Shijia Yan, Madeleine Glick, John Shalf, Keren Bergman
2020 TOSS-2020: a commodity software stack for HPC.
Edgar A. León, Trent D'Hooge, Nathan Hanford, Ian Karlin, Ramesh Pankajakshan, Jim Foraker, Chris Chambreau, Matthew L. Leininger
2020 Taming I/O variation on QoS-less HPC storage: what can applications do?
Zhenbo Qiao, Qing Liu, Norbert Podhorszki, Scott Klasky, Jieyang Chen
2020 Task bench: a parameterized benchmark for evaluating parallel runtime performance.
Elliott Slaughter, Wei Wu, Yuankun Fu, Legend Brandenburg, Nicolai Garcia, Wilhem Kautz, Emily Marx, Kaleb S. Morris, Qinglei Cao, George Bosilca, Seema Mirchandaney, Wonchan Lee, Sean Treichler, Patrick S. McCormick, Alex Aiken
2020 Term quantization: furthering quantization at run time.
Hsiang-Tsung Kung, Bradley McDanel, Sai Qian Zhang
2020 Toward realization of numerical towing-tank tests by wall-resolved large eddy simulation based on 32 billion grid finite-element computation.
Chisachi Kato, Yoshinobu Yamade, Katsuhiro Nagano, Kiyoshi Kumahata, Kazuo Minami, Tatsuo Nishikawa
2020 Tuning floating-point precision using dynamic program information and temporal locality.
Hugo Brunie, Costin Iancu, Khaled Z. Ibrahim, Philip Brisk, Brandon Cook
2020 Veritas: accurately estimating the correct output on noisy intermediate-scale quantum computers.
Tirthak Patel, Devesh Tiwari
2020 Waiting game: optimally provisioning fixed resources for cloud-enabled schedulers.
Pradeep Ambati, Noman Bashir, David Irwin, Prashant J. Shenoy
2020 ZeRO: memory optimizations toward training trillion parameter models.
Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He
2020 ZeroSpy: exploring software inefficiency with redundant zeros.
Xin You, Hailong Yang, Zhongzhi Luan, Depei Qian, Xu Liu
2020 fBLAS: streaming linear algebra on FPGA.
Tiziano De Matteis, Johannes de Fine Licht, Torsten Hoefler
2020 pLiner: isolating lines of floating-point code for compiler-induced variability.
Hui Guo, Ignacio Laguna, Cindy Rubio-González