| 2021 | A Fast, General System for Buffered Persistent Data Structures. Haosen Wen, Wentao Cai, Mingzhe Du, Louis Jenkins, Benjamin Valpey, Michael L. Scott |
| 2021 | A Graph-Assisted Out-of-Place Update Scheme for Erasure Coded Storage Systems. Haiwei Deng, Ranhao Jia, Chentao Wu |
| 2021 | A Novel Multi-CPU/GPU Collaborative Computing Framework for SGD-based Matrix Factorization. Yizhi Huang, Yanlong Yin, Yan Liu, Shuibing He, Yang Bai, Renfa Li |
| 2021 | A Universal Construction to implement Concurrent Data Structure for NUMA-muticore. Zhengming Yi, Yiping Yao, Kai Chen |
| 2021 | ADA: An Application-Conscious Data Acquirer for Visual Molecular Dynamics. Hanpei Wu, Tongliang Deng, Yanliang Zou, Shu Yin, Si Chen, Tao Xie |
| 2021 | AMPS-Inf: Automatic Model Partitioning for Serverless Inference with Cost Efficiency. Jananie Jarachanthan, Li Chen, Fei Xu, Bo Li |
| 2021 | ASLDP: An Active Semi-supervised Learning method for Disk Failure Prediction. Yang Zhou, Fang Wang, Dan Feng |
| 2021 | Accelerated Device Placement Optimization with Contrastive Learning. Hao Lan, Li Chen, Baochun Li |
| 2021 | Accelerating DBSCAN Algorithm with AI Chips for Large Datasets. Zhuoran Ji, Cho-Li Wang |
| 2021 | Accelerating Sequence-to-Graph Alignment on Heterogeneous Processors. Zonghao Feng, Qiong Luo |
| 2021 | Accurate Matrix Multiplication on Binary128 Format Accelerated by Ozaki Scheme. Daichi Mukunoki, Katsuhisa Ozaki, Takeshi Ogita, Toshiyuki Imamura |
| 2021 | An Edge-Fencing Strategy for Optimizing SSSP Computations on Large-Scale Graphs. Huashan Yu, Xiaolin Wang, Yingwei Luo |
| 2021 | An Evaluation of Task-Parallel Frameworks for Sparse Solvers on Multicore and Manycore CPU Architectures. Abdullah Alperen, Md. Afibuzzaman, Fazlay Rabbi, M. Yusuf Özkaya, Ümit V. Çatalyürek, Hasan Metin Aktulga |
| 2021 | Ascetic: Enhancing Cross-Iterations Data Efficiency in Out-of-Memory Graph Processing on GPUs. Ruiqi Tang, Ziyi Zhao, Kailun Wang, Xiaoli Gong, Jin Zhang, Wenwen Wang, Pen-Chung Yew |
| 2021 | Automatic Code Generation and Optimization of Large-scale Stencil Computation on Many-core Processors. Mingzhen Li, Yi Liu, Hailong Yang, Yongmin Hu, Qingxiao Sun, Bangduo Chen, Xin You, Xiaoyan Liu, Zhongzhi Luan, Depei Qian |
| 2021 | Automatic Generation of High-Performance Inference Kernels for Graph Neural Networks on Multi-Core Systems. Qiang Fu, H. Howie Huang |
| 2021 | BGPQ: A Heap-Based Priority Queue Design for GPUs. Yan-Hao Chen, Fei Hua, Yuwei Jin, Eddy Z. Zhang |
| 2021 | Best VM Selection for Big Data Applications across Multiple Frameworks by Transfer Learning. Yuewen Wu, Heng Wu, Yuanjia Xu, Yi Hu, Wenbo Zhang, Hua Zhong, Tao Huang |
| 2021 | BitX: Empower Versatile Inference with Hardware Runtime Pruning. Hongyan Li, Hang Lu, Jiawen Huang, Wenxu Wang, Mingzhe Zhang, Wei Chen, Liang Chang, Xiaowei Li |
| 2021 | CD-SGD: Distributed Stochastic Gradient Descent with Compression and Delay Compensation. Enda Yu, Dezun Dong, Yemao Xu, Shuo Ouyang, Xiangke Liao |
| 2021 | CERES: Container-Based Elastic Resource Management System for Mixed Workloads. Jinyu Yu, Dan Feng, Wei Tong, Pengze Lv, Yufei Xiong |
| 2021 | CNN+LSTM Accelerated Turbulent Flow Simulation with Link-Wise Artificial Compressibility Method. Sijiang Fan, Jiawei Fei, Xiaowei Guo, Canqun Yang, Alistair Revell |
| 2021 | Combining Dynamic Concurrency Throttling with Voltage and Frequency Scaling on Task-based Programming Models. Antoni Navarro Muñoz, Arthur Francisco Lorenzon, Eduard Ayguadé Parra, Vicenç Beltran Querol |
| 2021 | Communication Avoiding All-Pairs Shortest Paths Algorithm for Sparse Graphs. Lin Zhu, Qiang-Sheng Hua, Hai Jin |
| 2021 | ComputeCOVID19+: Accelerating COVID-19 Diagnosis and Monitoring via High-Performance Deep Learning on CT Images. Garvit Goel, Atharva Gondhalekar, Jingyuan Qi, Zhicheng Zhang, Guohua Cao, Wu-chun Feng |
| 2021 | Context-aware Data Operation Strategies in Edge Systems for High Application Performance. Tanmoy Sen, Haiying Shen |
| 2021 | Coupling Right-Provisioned Cold Storage Data Centers with Deduplication. Liangfeng Cheng, Yuchong Hu, Zhaokang Ke, Zhongjie Wu |
| 2021 | Crash-Consistency-Aware Encryption for Non-Volatile Memories. Mengya Lei, Fang Wang, Dan Feng, Fan Li, Xueliang Wei |
| 2021 | CuART - a CUDA-based, scalable Radix-Tree lookup and update engine. Martin Koppehel, Tobias Groth, Sven Groppe, Thilo Pionteck |
| 2021 | Distributed Game-Theoretical Route Navigation for Vehicular Crowdsensing. En Wang, Dongming Luan, Yongjian Yang, Zihe Wang, Pengmin Dong, Dawei Li, Wenbin Liu, Jie Wu |
| 2021 | Dubhe: Towards Data Unbiasedness with Homomorphic Encryption in Federated Learning Client Selection. Shulai Zhang, Zirui Li, Quan Chen, Wenli Zheng, Jingwen Leng, Minyi Guo |
| 2021 | Efficient Complete Event Trend Detection over High-Velocity Streams. Huiyao Mei, Hanhua Chen, Hai Jin, Qiang-Sheng Hua, Bing Bing Zhou |
| 2021 | Efficient GPU-Implementation for Integer Sorting Based on Histogram and Prefix-Sums. Seiya Kozakai, Noriyuki Fujimoto, Koichi Wada |
| 2021 | Efficient Modeling of Random Sampling-Based LRU. Junyao Yang, Yuchen Wang, Zhenlin Wang |
| 2021 | Efficient Parallel Algorithms for String Comparison. Nikita Mishin, Daniil Berezun, Alexander Tiskin |
| 2021 | Efficiently Parallelizable Strassen-Based Multiplication of a Matrix by its Transpose. Viviana Arrigoni, Filippo Maggioli, Annalisa Massini, Emanuele Rodolà |
| 2021 | Enabling Efficient SIMD Acceleration for Virtual Radio Access Network. Jianda Wang, Yang Hu |
| 2021 | Exploiting in-Hub Temporal Locality in SpMV-based Graph Processing. Mohsen Koohi Esfahani, Peter Kilpatrick, Hans Vandierendonck |
| 2021 | Exploiting system level heterogeneity to improve the performance of a GeoStatistics multi-phase task-based application. Lucas Leandro Nesi, Arnaud Legrand, Lucas Mello Schnorr |
| 2021 | Exploring HW/SW Co-Optimizations for Accelerating Large-scale Texture Identification on Distributed GPUs. Junsong Wang, Xiaofan Zhang, Yubo Li, Yonghua Lin |
| 2021 | FIFL: A Fair Incentive Mechanism for Federated Learning. Liang Gao, Li Li, Yingwen Chen, Wenli Zheng, Chengzhong Xu, Ming Xu |
| 2021 | Fast Reconstruction for Large Disk Enclosures Based on RAID2.0. Qiliang Li, Min Lyu, Liangliang Xu, Yinlong Xu, Wei Wang |
| 2021 | Fast and Consistent Remote Direct Access to Non-volatile Memory. Jingwen Du, Fang Wang, Dan Feng, Weiguang Li, Fan Li |
| 2021 | Fast and Scalable Sparse Triangular Solver for Multi-GPU Based HPC Architectures. Chenhao Xie, Jieyang Chen, Jesun Firoz, Jiajia Li, Shuaiwen Leon Song, Kevin J. Barker, Mark Raugas, Ang Li |
| 2021 | FastPSO: Towards Efficient Swarm Intelligence Algorithm on GPUs. Hanfeng Liu, Zeyi Wen, Wei Cai |
| 2021 | FedCav: Contribution-aware Model Aggregation on Distributed Heterogeneous Data in Federated Learning. Hui Zeng, Tongqing Zhou, Yeting Guo, Zhiping Cai, Fang Liu |
| 2021 | Fourth-Order Exhaustive Epistasis Detection for the xPU Era. Ricardo Nobre, Aleksandar Ilic, Sergio Santander-Jiménez, Leonel Sousa |
| 2021 | GVT-Guided Demand-Driven Scheduling in Parallel Discrete Event Simulation. Ali Eker, David Timmerman, Barry Williams, Kenneth Chiu, Dmitry Ponomarev |
| 2021 | Generalized Skyline Interval Coloring and Dynamic Geometric Bin Packing Problems. Runtian Ren, Xueyan Tang |
| 2021 | HDNH: a read-efficient and write-optimized hashing scheme for hybrid DRAM-NVM memory. Junhao Zhu, Kaixin Huang, Xiaomin Zou, Chenglong Huang, Nuo Xu, Liang Fang |
| 2021 | HiPa: Hierarchical Partitioning for Fast PageRank on NUMA Multicore Systems. Yuang Chen, Yeh-Ching Chung |
| 2021 | Hippie: A Data-Paralleled Pipeline Approach to Improve Memory-Efficiency and Scalability for Large DNN Training. Xiangyu Ye, Zhiquan Lai, Shengwei Li, Lei Cai, Ding Sun, Linbo Qiao, Dongsheng Li |
| 2021 | ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9 - 12, 2021. Xian-He Sun, Sameer Shende, Laxmikant V. Kalé, Yong Chen |
| 2021 | IMPECCABLE: Integrated Modeling PipelinE for COVID Cure by Assessing Better LEads. Aymen Al Saadi, Dario Alfè, Yadu N. Babuji, Agastya Bhati, Ben Blaiszik, Alexander Brace, Thomas S. Brettin, Kyle Chard, Ryan Chard, Austin Clyde, Peter V. Coveney, Ian T. Foster, Tom Gibbs, Shantenu Jha, Kristopher Keipert, Dieter Kranzlmüller, Thorsten Kurth, Hyungro Lee, Zhuozhao Li, Heng Ma, Gerald Mathias, André Merzky, Alexander Partin, Arvind Ramanathan, Ashka Shah, Abraham C. Stern, Rick Stevens, Li Tan, Mikhail Titov, Anda Trifan, Aristeidis Tsaris, Matteo Turilli, Huub J. J. Van Dam, Shunzhou Wan, David Wifling, Junqi Yin |
| 2021 | Interferences between Communications and Computations in Distributed HPC Systems. Alexandre Denis, Emmanuel Jeannot, Philippe Swartvagher |
| 2021 | Intra-page Cache Update in SLC-mode with Partial Programming in High Density SSDs. Jun Li, Minjun Li, Zhigang Cai, François Trahay, Mohamed Wahib, Balazs Gerofi, Zhiming Liu, Min Huang, Jianwei Liao |
| 2021 | Joint Optimization of DNN Partition and Scheduling for Mobile Cloud Computing. Yubin Duan, Jie Wu |
| 2021 | LoWino: Towards Efficient Low-Precision Winograd Convolutions on Modern CPUs. Guangli Li, Zhen Jia, Xiaobing Feng, Yida Wang |
| 2021 | Matryoshka: A Coalesced Delta Sequence Prefetcher. Shizhi Jiang, Yiwei Ci, Qiusong Yang, Mingshu Li |
| 2021 | MetaCache-GPU: Ultra-Fast Metagenomic Classification. Robin Kobus, André Müller, Daniel Jünger, Christian Hundt, Bertil Schmidt |
| 2021 | Multi-Agent Reinforcement Learning based Distributed Renewable Energy Matching for Datacenters. Haoyu Wang, Haiying Shen, Jiechao Gao, Kevin Zheng, Xiaoying Li |
| 2021 | Multi-Resource List Scheduling of Moldable Parallel Jobs under Precedence Constraints. Lucas Perotin, Hongyang Sun, Padma Raghavan |
| 2021 | Multi-level Forwarding and Scheduling Repair Technique in Heterogeneous Network for Erasure-coded Clusters. Hai Zhou, Dan Feng, Yuchong Hu |
| 2021 | NoStop: A Novel Configuration Optimization Scheme for Spark Streaming. Qianwen Ye, Wuji Liu, Chase Q. Wu |
| 2021 | Optimizing Flow Completion Time via Adaptive Buffer Management in Data Center Networks. Sen Liu, Xiang Lin, Zehua Guo, Yi Wang, Mohamed Adel Serhani, Yang Xu |
| 2021 | Optimizing Massively Parallel Winograd Convolution on ARM Processor. Dongsheng Li, Dan Huang, Zhiguang Chen, Yutong Lu |
| 2021 | Optimizing Winograd-Based Convolution with Tensor Cores. Junhong Liu, Dongxu Yang, Junjie Lai |
| 2021 | Optimizing Work Stealing Communication with Structured Atomic Operations. Hannah Cartier, James Dinan, D. Brian Larkins |
| 2021 | PREP: Predicting Job Runtime with Job Running Path on Supercomputers. Longfang Zhou, Xiaorong Zhang, Wenxiang Yang, Yongguo Han, Fang Wang, Yadong Wu, Jie Yu |
| 2021 | Parallel Multi-split Extendible Hashing for Persistent Memory. Jing Hu, Jianxi Chen, Yifeng Zhu, Qing Yang, Zhouxuan Peng, Ya Yu |
| 2021 | Parallel Tucker Decomposition with Numerically Accurate SVD. Zitong Li, Qiming Fang, Grey Ballard |
| 2021 | Paratick: Reducing Timer Overhead in Virtual Machines. Stijn Schildermans, Kris Aerts, Jianchen Shan, Xiaoning Ding |
| 2021 | Processor-Aware Cache-Oblivious Algorithms. Yuan Tang, Weiguo Gao |
| 2021 | Progressive Memory Adjustment with Performance Guarantee in Virtualized Systems. Lulu Yao, Yongkun Li, Jiawei Li, Weijie Wu, Yinlong Xu |
| 2021 | Prophet: Speeding up Distributed DNN Training with Predictable Communication Scheduling. Zhenwei Zhang, Qiang Qi, Ruitao Shang, Li Chen, Fei Xu |
| 2021 | ROBOTune: High-Dimensional Configuration Tuning for Cluster-Based Data Analytics. Md. Muhib Khan, Weikuan Yu |
| 2021 | Receiver-Driven Congestion Control for InfiniBand. Yiran Zhang, Kun Qian, Fengyuan Ren |
| 2021 | Recursion Brings Speedup to Out-of-Core TensorCore-based Linear Algebra Algorithms: A Case Study of Classic Gram-Schmidt QR Factorization. Shaoshuai Zhang, Panruo Wu |
| 2021 | Regu2D: Accelerating Vectorization of SpMV on Intel Processors through 2D-partitioning and Regular Arrangement. Xiang Fei, Youhui Zhang |
| 2021 | SPMFS: A Scalable Persistent Memory File System on Optane Persistent Memory. Yang Yang, Qiang Cao, Jie Yao, Yuanyuan Dong, Weikang Kong |
| 2021 | Scaling Generalized N-Body Problems, A Case Study from Genomics. Marquita Ellis, Aydin Buluç, Katherine A. Yelick |
| 2021 | Sparker: Efficient Reduction for More Scalable Machine Learning with Spark. Bowen Yu, Huanqi Cao, Tianyi Shan, Haojie Wang, Xiongchao Tang, Wenguang Chen |
| 2021 | Teddy: An Efficient SIMD-based Literal Matching Engine for Scalable Deep Packet Inspection. Kun Qiu, Harry Chang, Yang Hong, Wenjun Zhu, Xiang Wang, Baoqian Li |
| 2021 | Tool-Supported Mini-App Extraction to Facilitate Program Analysis and Parallelization. Jan-Patrick Lehr, Christian H. Bischof, Florian Dewald, Heiko Mantel, Mohammad Norouzi, Felix Wolf |
| 2021 | Tridiagonal GPU Solver with Scaled Partial Pivoting at Maximum Bandwidth. Christoph Klein, Robert Strzodka |
| 2021 | Using Vectorized Execution to Improve SQL Query Performance on Spark. Yijie Shen, Jin Xiong, Dejun Jiang |
| 2021 | Wave-PIM: Accelerating Wave Simulation Using Processing-in-Memory. Bagus Hanindhito, Ruihao Li, Dimitrios Gourounas, Arash Fathi, Karan Govil, Dimitar Trenev, Andreas Gerstlauer, Lizy Kurian John |
| 2021 | gem5 + rtl: A Framework to Enable RTL Models Inside a Full-System Simulator. Guillem López-Paradís, Adrià Armejach, Miquel Moretó |
| 2021 | sRouting: Towards a Better Flow Size Estimation Performance through Routing and Sketch Configuration. Yang Shi, Mei Wen |