| 2021 | 2PGraph: Accelerating GNN Training over Large Graphs on GPU Clusters. Lizhi Zhang, Zhiquan Lai, Shengwei Li, Yu Tang, Feng Liu, Dongsheng Li |
| 2021 | A Conceptual Framework for HPC Operational Data Analytics. Alessio Netti, Woong Shin, Michael Ott, Torsten Wilde, Natalie J. Bates |
| 2021 | A Deep Learning-Based Particle-in-Cell Method for Plasma Simulations. Xavier Aguilar, Stefano Markidis |
| 2021 | A Dynamic Power Capping Library for HPC Applications. Sahil Sharma, Zhiling Lan, Xingfu Wu, Valerie Taylor |
| 2021 | A Generative Approach to Visualizing Satellite Data. Saptashwa Mitra, Daniel Rammer, Shrideep Pallickara, Sangmi Lee Pallickara |
| 2021 | A Roadmap to Robust Science for High-throughput Applications: The Developers' Perspective. Michela Taufer, Ewa Deelman, Rafael Ferreira da Silva, Trilce Estrada, Mary W. Hall, Miron Livny |
| 2021 | A Scalability Study of Data Exchange in HPC Multi-component Workflows. Jie Yin, Atsushi Hori, Balazs Gerofi, Yutaka Ishikawa |
| 2021 | A Transfer Learning Scheme for Time Series Forecasting Using Facebook Prophet. Menuka Warushavithana, Saptashwa Mitra, Mazdak Arabi, F. Jay Breidt, Sangmi Lee Pallickara, Shrideep Pallickara |
| 2021 | A memory bandwidth improvement with memory space partitioning for single-precision floating-point FFT on Stratix 10 FPGA. Takaaki Miyajima, Kentaro Sano |
| 2021 | A64FX - Your Compiler You Must Decide! Jens Domke |
| 2021 | A64FX performance: experience on Ookami. Md Abdullah Shahneous Bari, Barbara M. Chapman, Anthony Curtis, Robert J. Harrison, Eva Siegmann, Nikolay A. Simakov, Matthew D. Jones |
| 2021 | AMR-Net: Convolutional Neural Networks for Multi-resolution Steady Flow Prediction. Yuuichi Asahi, Sora Hatayama, Takashi Shimokawabe, Naoyuki Onodera, Yuta Hasegawa, Yasuhiro Idomura |
| 2021 | Accelerating DNN Architecture Search at Scale Using Selective Weight Transfer. Hongyuan Liu, Bogdan Nicolae, Sheng Di, Franck Cappello, Adwait Jog |
| 2021 | Accelerating GPU Message Communication for Autonomous Navigation Systems. Hao Wu, Jiangming Jin, Jidong Zhai, Yifan Gong, Wei Liu |
| 2021 | Accelerating advection for atmospheric modelling on Xilinx and Intel FPGAs. Nick Brown |
| 2021 | An Execution Fingerprint Dictionary for HPC Application Recognition. Thomas Jakobsche, Nicolas Lachiche, Aurélien Cavelan, Florina M. Ciorba |
| 2021 | An FPGA-based storage control with load balancing. Naoya Umezu, Yoshiki Yamaguchi, Taisuke Boku |
| 2021 | An Integrated Job Monitor, Analyzer and Predictor. Ashish Pal, Preeti Malakar |
| 2021 | Automatic Parallelisation of Structured Mesh Computations with SYCL. Gábor Dániel Balogh, István Z. Reguly |
| 2021 | Backfilling HPC Jobs with a Multimodal-Aware Predictor. Kenneth Lamar, Alexander V. Goponenko, Christina L. Peterson, Benjamin A. Allan, Jim M. Brandt, Damian Dechev |
| 2021 | Bellamy: Reusing Performance Models for Distributed Dataflow Jobs Across Contexts. Dominik Scheinert, Lauritz Thamsen, Houkun Zhu, Jonathan Will, Alexander Acker, Thorsten Wittkopp, Odej Kao |
| 2021 | Building A Fast and Efficient LSM-tree Store by Integrating Local Storage with Cloud Storage. Peng Xu, Nannan Zhao, Jiguang Wan, Wei Liu, Shuning Chen, Yuanhui Zhou, Hadeel Albahar, Hanyang Liu, Liu Tang, Changsheng Xie |
| 2021 | CASQ: Accelerate Distributed Deep Learning with Sketch-Based Gradient Quantization. Keshi Ge, Yiming Zhang, Yongquan Fu, Zhiquan Lai, Xiaoge Deng, Dongsheng Li |
| 2021 | CSWAP: A Self-Tuning Compression Framework for Accelerating Tensor Swapping in GPUs. Ping Chen, Shuibing He, Xuechen Zhang, Shuaiben Chen, Peiyi Hong, Yanlong Yin, Xian-He Sun, Gang Chen |
| 2021 | CVFCC: CV-Based Framework for Container Consolidation in Cloud Data Centers. Yuting Li, Yun Xu, Xuehai Zhou |
| 2021 | Characterizing Impacts of Storage Faults on HPC Applications: A Methodology and Insights. Bo Fang, Daoce Wang, Sian Jin, Quincey Koziol, Zhao Zhang, Qiang Guan, Suren Byna, Sriram Krishnamoorthy, Dingwen Tao |
| 2021 | Cluster of emerging technology: evaluation of a production HPC system based on A64FX. Fabio Banchelli, Kilian Peiro, Guillem Ramirez-Gargallo, Joan Vinyals, David Vicente, Marta Garcia-Gasulla, Filippo Mantovani |
| 2021 | Combining One-Sided Communications with Task-Based Programming Models. Kevin Sala, Sandra Macià, Vicenç Beltran |
| 2021 | Computational Storage to Increase the Analysis Capability of Tier-2 HEP Data Sites. Chen Zou, Andrew A. Chien, Robert W. Gardner, Ilija Vukotic |
| 2021 | Cooling the Data Center: Design of a Mechanical Controls Owner Project Requirements (OPR) Template. Stefan A. Robila, David Grant, Chris DePrater, Vali Sorell, Terry L. Rodgers, David Martinez, Shlomo Novotny |
| 2021 | DPZ: Improving Lossy Compression Ratio with Information Retrieval on Scientific Data. Jialing Zhang, Jiaxi Chen, Xiaoyan Zhuo, Aekyeung Moon, Seung Woo Son |
| 2021 | Daps: A Dynamic Asynchronous Progress Stealing Model for MPI Communication. Kaiming Ouyang, Min Si, Atsushi Hori, Zizhong Chen, Pavan Balaji |
| 2021 | Distributed Computation of Persistent Homology from Partitioned Big Data. Nicholas O. Malott, Rishi R. Verma, Rohit P. Singh, Philip A. Wilsey |
| 2021 | Distributed Work Stealing at Scale via Matchmaking. Hrushit Parikh, Vinit Deodhar, Ada Gavrilovska, Santosh Pande |
| 2021 | Dynamic and Adaptive Monitoring and Analysis for Many-task Ensemble Computing. Shantenu Jha, Allen D. Malony |
| 2021 | Early Evaluation of Fugaku A64FX Architecture Using Climate Workloads. Sarat Sreepathi, Mark Taylor |
| 2021 | Energy Efficiency Aspects of the AMD Zen 2 Architecture. Robert Schöne, Thomas Ilsche, Mario Bielert, Markus Velten, Markus Schmidl, Daniel Hackenberg |
| 2021 | Evaluation of SPEC CPU and SPEC OMP on the A64FX. Yuetsu Kodama, Masaaki Kondo, Mitsuhisa Sato |
| 2021 | Explicit uncore frequency scaling for energy optimisation policies with EAR in Intel architectures. Julita Corbalán, Oriol Vidal, Lluis Alonso, Jordi Aneas |
| 2021 | Exploring Autoencoder-based Error-bounded Compression for Scientific Data. Jinyang Liu, Sheng Di, Kai Zhao, Sian Jin, Dingwen Tao, Xin Liang, Zizhong Chen, Franck Cappello |
| 2021 | Exploring Node Connection Modes in Multi-Rail Fat-tree. Yuyang Wang, Fei Lei, Dezun Dong |
| 2021 | FIRESTARTER 2: Dynamic Code Generation for Processor Stress Tests. Robert Schöne, Markus Schmidl, Mario Bielert, Daniel Hackenberg |
| 2021 | FineQuery: Fine-Grained Query Processing on CPU-GPU Integrated Architectures. Dalin Wang, Feng Zhang, Weitao Wan, Hourun Li, Xiaoyong Du |
| 2021 | From Domain-Specific Languages to Memory-Optimized Accelerators for Fluid Dynamics. Karl F. A. Friebel, Stephanie Soldavini, Gerald Hempel, Christian Pilato, Jerónimo Castrillón |
| 2021 | HBM2 Memory System for HPC Applications on an FPGA. Norihisa Fujita, Ryohei Kobayashi, Yoshiki Yamaguchi, Taisuke Boku |
| 2021 | HFlow: A Dynamic and Elastic Multi-Layered I/O Forwarder. Jaime Cernuda Garcia, Hariharan Devarajan, Luke Logan, Keith Bateman, Neeraj Rajesh, Jie Ye, Anthony Kougkas, Xian-He Sun |
| 2021 | HNGraph: Parallel Graph Processing in Hybrid Memory Based NUMA Systems. Wei Liu, Haikun Liu, Xiaofei Liao, Hai Jin, Yu Zhang |
| 2021 | HPC AI500 V2.0: The Methodology, Tools, and Metrics for Benchmarking HPC AI Systems. Zihan Jiang, Wanling Gao, Fei Tang, Lei Wang, Xingwang Xiong, Chunjie Luo, Chuanxin Lan, Hongxiao Li, Jianfeng Zhan |
| 2021 | Halcyon: Unified HPC Center Operations. Kevin D. Colby, Shawn Rice |
| 2021 | Higgs Boson Classification: Brain-inspired BCPNN Learning with StreamBrain. Martin Svedin, Artur Podobas, Steven Wei Der Chien, Stefano Markidis |
| 2021 | Hybrid workflow of Simulation and Deep Learning on HPC: A Case Study for Material Behavior Determination. Li Zhong, Dennis Hoppe, Naweiluo Zhou, Oleksandr Shcherbakov |
| 2021 | IEEE International Conference on Cluster Computing, CLUSTER 2021, Portland, OR, USA, September 7-10, 2021 |
| 2021 | Incorporating Fault-Tolerance Awareness into System-Level Modeling and Simulation. Trokon Johnson, Herman Lam |
| 2021 | Lazy-WL: A Wear-aware Load Balanced Data Redistribution Method for Efficient SSD Array Scaling. Hanchen Guo, Zhehan Lin, Yunfei Gu, Chentao Wu, Li Jiang, Jie Li, Guangtao Xue, Minyi Guo |
| 2021 | Load Balancing Policies for Nested Fork-Join. Mia Reitz |
| 2021 | MONARCH: Hierarchical Storage Management for Deep Learning Frameworks. Marco Dantas, Diogo Leitão, Cláudia Correia, Ricardo Macedo, Weijia Xu, João Paulo |
| 2021 | Malleability Implementation in a MPI Iterative Method. Iker Martín-Álvarez, José Ignacio Aliaga, María Isabel Castillo, Rafael Mayo, Sergio Iserte |
| 2021 | MiniMod: A Modular Miniapplication Benchmarking Framework for HPC. W. Pepper Marts, Matthew G. F. Dosanjh, Scott Levy, Whit Schonbein, Ryan E. Grant, Patrick G. Bridges |
| 2021 | Modeling the Linux page cache for accurate simulation of data-intensive applications. Hoang-Dung Do, Valérie Hayot-Sasson, Rafael Ferreira da Silva, Christopher Steele, Henri Casanova, Tristan Glatard |
| 2021 | Monitoring Large Scale Supercomputers: A Case Study with the Lassen Supercomputer. Tapasya Patki, Adam Bertsch, Ian Karlin, Dong H. Ahn, Brian Van Essen, Barry Rountree, Bronis R. de Supinski, Nathan Besaw |
| 2021 | NUMA-aware I/O System Call Steering. Chan-Gyu Lee, Hyun-Wook Jin |
| 2021 | O(1) Communication for Distributed SGD through Two-Level Gradient Averaging. Subhadeep Bhattacharya, Weikuan Yu, Fahim Tahmid Chowdhury, Kathryn M. Mohror |
| 2021 | Octo-Tiger's New Hydro Module and Performance Using HPX+CUDA on ORNL's Summit. Patrick Diehl, Gregor Daiß, Dominic Marcello, Kevin A. Huck, Sagiv Shiber, Hartmut Kaiser, Juhan Frank, Geoffrey C. Clayton, Dirk Pflüger |
| 2021 | On-the-Fly, Robust Translation of MPI Libraries. Edgar A. León, Marc Joos, Nathan Hanford, Adrien Cotte, Tony Delforge, François Diakhaté, Vincent Ducrot, Ian Karlin, Marc Pérache |
| 2021 | Optimisation of an FPGA Credit Default Swap engine by embracing dataflow techniques. Nick Brown, Mark Klaisoongnoen, Oliver Thomson Brown |
| 2021 | Optimizing Barrier Synchronization on ARMv8 Many-Core Architectures. Wanrong Gao, Jianbin Fang, Chun Huang, Chuanfu Xu, Zheng Wang |
| 2021 | Optimizing Distributed Load Balancing for Workloads with Time-Varying Imbalance. Jonathan Lifflander, Nicole Lemaster Slattengren, Philippe P. Pébaÿ, Phil Miller, Francesco Rizzi, Matthew T. Bettencourt |
| 2021 | Optimizing Error-Bounded Lossy Compression for Scientific Data on GPUs. Jiannan Tian, Sheng Di, Xiaodong Yu, Cody Rivera, Kai Zhao, Sian Jin, Yunhe Feng, Xin Liang, Dingwen Tao, Franck Cappello |
| 2021 | Packet Forwarding Cache of Commodity Switches for Parallel Computers. Shoichi Hirasawa, Hayato Yamaki, Michihiro Koibuchi |
| 2021 | Parallel I/O Evaluation Techniques and Emerging HPC Workloads: A Perspective. Sarah Neuwirth, Arnab Kumar Paul |
| 2021 | Performance Evaluation and Analysis of A64FX many-core Processor for the Fiber Miniapp Suite. Miwako Tsuji, Mitsuhisa Sato |
| 2021 | Pipelined Preconditioned s-step Conjugate Gradient Methods for Distributed Memory Systems. Manasi Tiwari, Sathish Vadhiyar |
| 2021 | READYS: A Reinforcement Learning Based Strategy for Heterogeneous Dynamic Scheduling. Nathan Grinsztajn, Olivier Beaumont, Emmanuel Jeannot, Philippe Preux |
| 2021 | RELAR: A Reinforcement Learning Framework for Adaptive Routing in Network-on-Chips. Changhong Wang, Dezun Dong, Zicong Wang, Xiaoyun Zhang, Zhenyu Zhao |
| 2021 | RISE: Reducing I/O Contention in Staging-based Extreme-Scale In-situ Workflows. Pradeep Subedi, Philip E. Davis, Manish Parashar |
| 2021 | RPTCN: Resource Prediction for High-dynamic Workloads in Clouds based on Deep Learning. Wenyan Chen, Chengzhi Lu, Kejiang Ye, Yang Wang, Cheng-Zhong Xu |
| 2021 | Reproducible Performance Optimization of Complex Applications on the Edge-to-Cloud Continuum. Daniel Rosendo, Alexandru Costan, Gabriel Antoniu, Matthieu Simonin, Jean-Christophe Lombardo, Alexis Joly, Patrick Valduriez |
| 2021 | Reusability First: Toward FAIR Workflows. Matthew Wolf, Jeremy Logan, Kshitij Mehta, Daniel A. Jacobson, Mikaela Cashman, Angelica M. Walker, Greg Eisenhauer, Patrick M. Widener, Ashley Cliff |
| 2021 | Robustness Analysis of Loop-Free Floating-Point Programs via Symbolic Automatic Differentiation. Arnab Das, Tanmay Tirpankar, Ganesh Gopalakrishnan, Sriram Krishnamoorthy |
| 2021 | SAP-SGD: Accelerating Distributed Parallel Training with High Communication Efficiency on Heterogeneous Clusters. Jing Cao, Zongwei Zhu, Xuehai Zhou |
| 2021 | SDIS: A PB-level seismic data index system with ML methods. Shaoheng Luo, Lei Wang, Yufeng Liu, Changhai Zhao, Xudong Zhang |
| 2021 | Sequence-RTG: Efficient and Production-Ready Pattern Mining in System Log Messages. Louise Harding, Fabien Wernli, Frédéric Suter |
| 2021 | Sequences of Sparse Matrix-Vector Multiplication on Fugaku's A64FX processors. Jérôme Gurhem, Maxence Vandromme, Miwako Tsuji, Serge G. Petiton, Mitsuhisa Sato |
| 2021 | Special function neural network (SFNN) models. Yuzhen Liu, Oana Marin |
| 2021 | Streamlining distributed Deep Learning I/O with ad hoc file systems. Frederic Schimmelpfennig, Marc-André Vef, Reza Salkhordeh, Alberto Miranda, Ramon Nou, André Brinkmann |
| 2021 | Supporting Elastic Compaction of LSM-tree with a FaaS Cluster. Xiaoliang Wang, Jianchuan Li, Peiquan Jin, Kuankuan Guo, Yuanjin Lin, Ming Zhao |
| 2021 | TIGRA: A Tightly Integrated Generic RISC-V Accelerator Interface. Brad Green, Dillon Todd, Jon C. Calhoun, Melissa C. Smith |
| 2021 | Tackling Cold Start of Serverless Applications by Efficient and Adaptive Container Runtime Reusing. Kun Suo, Junggab Son, Dazhao Cheng, Wei Chen, Sabur Baidya |
| 2021 | The Case for Storage Optimization Decoupling in Deep Learning Frameworks. Ricardo Macedo, Cláudia Correia, Marco Dantas, Cláudia Brito, Weijia Xu, Yusuke Tanimura, Jason Haga, João Paulo |
| 2021 | The Challenge of Disproportionate Importance of Temporal Features in Predicting HPC Power Consumption. Chengcheng Li, Ahmad Maroof Karimi, Woong Shin, Hairong Qi, Feiyi Wang |
| 2021 | Thinking More about RDMA Memory Semantics. Teng Ma, Kang Chen, Shaonan Ma, Zhuo Song, Yongwei Wu |
| 2021 | Thrifty Label Propagation: Fast Connected Components for Skewed-Degree Graphs. Mohsen Koohi Esfahani, Peter Kilpatrick, Hans Vandierendonck |
| 2021 | Toward a Comprehensive Benchmark Suite for Evaluating GASPI in HPC Environments. Sarah Neuwirth |
| 2021 | Two-Chains: High Performance Framework for Function Injection and Execution. Megan Grodowitz, Luis E. Peña, Curtis Dunham, Dong Zhong, Pavel Shamis, Steve Poole |
| 2021 | Understanding Soft Error Sensitivity of Deep Learning Models and Frameworks through Checkpoint Alteration. Elvis Rojas, Diego Pérez, Jon C. Calhoun, Leonardo Bautista-Gomez, Terry R. Jones, Esteban Meneses |
| 2021 | Understanding the Effects of DRAM Correctable Error Logging at Scale. Kurt B. Ferreira, Scott Levy, Victor Kuhns, Nathan DeBardeleben, Sean Blanchard |
| 2021 | Virtual Log-Structured Storage for High-Performance Streaming. Ovidiu-Cristian Marcu, Alexandru Costan, Bogdan Nicolae, Gabriel Antoniu |
| 2021 | WIRE: Resource-efficient Scaling with Online Prediction for DAG-based Workflows. Bing Xie, Qiang Cao, Mayuresh Kunjir, Linli Wan, Jeffrey S. Chase, Anirban Mandal, Mats Rynge |
| 2021 | csTuner: Scalable Auto-tuning Framework for Complex Stencil Computation on GPUs. Qingxiao Sun, Yi Liu, Hailong Yang, Zhonghui Jiang, Xiaoyan Liu, Ming Dun, Zhongzhi Luan, Depei Qian |
| 2021 | cuZ-Checker: A GPU-Based Ultra-Fast Assessment System for Lossy Compressions. Xiaodong Yu, Sheng Di, Ali Murat Gok, Dingwen Tao, Franck Cappello |
| 2021 | pMEMCPY: a simple, lightweight, and portable I/O library for storing data in persistent memory. Luke Logan, Jay F. Lofstead, Scott Levy, Patrick M. Widener, Xian-He Sun, Anthony Kougkas |
| 2021 | tcFFT: A Fast Half-Precision FFT Library for NVIDIA Tensor Cores. Bin-Rui Li, Shenggan Cheng, James Lin |