| 2023 | ADARNet: Deep Learning Predicts Adaptive Mesh Refinement. Octavi Obiols-Sales, Abhinav Vishnu, Nicholas Malaya, Aparna Chandramowlishwaran |
| 2023 | ASFL: Adaptive Semi-asynchronous Federated Learning for Balancing Model Accuracy and Total Latency in Mobile Edge Networks. Jieling Yu, Ruiting Zhou, Chen Chen, Bo Li, Fang Dong |
| 2023 | Accelerating Large-Scale CFD Simulations with Lattice Boltzmann Method on a 40-Million-Core Sunway Supercomputer. Zhao Liu, Xuesen Chu, Xiaojing Lv, Hanyue Liu, Haohuan Fu, Guangwen Yang |
| 2023 | An Improved Parallel Overset Grid Method for Fluid Simulation with Moving Boundary. Ran Zhao, Chao Li, Xiaowei Guo, Yi Liu, Sifan Long, Sen Zhang, Yanlong Qiu, Canqun Yang |
| 2023 | AsyncGBP: Unleashing the Potential of Heterogeneous Computing for SSL/TLS with GPU-based Provider. Yi Bian, Fangyu Zheng, Yuewu Wang, Lingguang Lei, Yuan Ma, Jiankuo Dong, Jiwu Jing |
| 2023 | BEEP: Balanced Efficient subgraph Enumeration in Parallel. Samiran Kawtikwar, Mohammad Almasri, Wen-Mei Hwu, Rakesh Nagi, Jinjun Xiong |
| 2023 | BIRP: Batch-aware Inference Workload Redistribution and Parallel Scheme for Edge Collaboration. Hesheng Sun, Xinyi Chen, Zhuzhong Qian, Zengji Li, Ning Chen, Tuo Cao, Suwei Xu, Yitong Zhou |
| 2023 | BitColor: Accelerating Large-Scale Graph Coloring on FPGA with Parallel Bit-Wise Engines. Haishuang Fan, Ming Li, Jingya Wu, Wenyan Lu, Xiaowei Li, Guihai Yan |
| 2023 | BlockPilot: A Proposer-Validator Parallel Execution Framework for Blockchain. Haowen Zhang, Jing Li, He Zhao, Tong Zhou, Nianzu Sheng, Hengyu Pan |
| 2023 | CoTrain: Efficient Scheduling for Large-Model Training upon GPU and CPU in Parallel. Zhenxing Li, Qiang Cao, Yajie Chen, Wenrui Yan |
| 2023 | CoTuner: A Hierarchical Learning Framework for Coordinately Optimizing Resource Partitioning and Parameter Tuning. Tiannuo Yang, Ruobing Chen, Yusen Li, Xiaoguang Liu, Gang Wang |
| 2023 | Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training. Shenggui Li, Hongxin Liu, Zhengda Bian, Jiarui Fang, Haichen Huang, Yuliang Liu, Boxiang Wang, Yang You |
| 2023 | Communication Optimizations for State-vector Quantum Simulator on CPU+GPU Clusters. Chenyang Jiao, Weihua Zhang, Li Shen |
| 2023 | Communication-Avoiding Optimizations for Large-Scale Unstructured-Mesh Applications with OP2. Suneth Dasantha Ekanayake, István Zoltan Reguly, Fabio Luporini, Gihan Ravideva Mudalige |
| 2023 | Communication-Efficient Generalized Neuron Matching for Federated Learning. Sixu Hu, Qinbin Li, Bingsheng He |
| 2023 | Composable Workflow for Accelerating Neural Architecture Search Using In Situ Analytics for Protein Classification. Georgia Channing, Ria Patel, Paula Olaya, Ariel Keller Rorabaugh, Osamu Miyashita, Silvina Caíno-Lores, Catherine D. Schuman, Florence Tama, Michela Taufer |
| 2023 | Computing the k-th Eigenvalue of Symmetric H2-Matrices. M. Ridwan Apriansyah, Rio Yokota |
| 2023 | Conflux: Exploiting Persistent Memory and RDMA Bandwidth via Adaptive I/O Mode Selection. Zhenlin Qi, Shengan Zheng, Yifeng Hui, Bowen Zhang, Linpeng Huang |
| 2023 | Connectivity-Aware Link Analysis for Skewed Graphs. Yuang Chen, Yeh-Ching Chung |
| 2023 | Credit-based Differential Privacy Stochastic Model Aggregation Algorithm for Robust Federated Learning via Blockchain. Mengyao Du, Miao Zhang, Lin Liu, Kai Xu, Quanjun Yin |
| 2023 | DAG-Aware Optimization for Geo-Distributed Data Analytics. Qingyuan Wang, Bin Gao, Zhi Zhou, Fei Xu, Chenghao Ouyang |
| 2023 | DArray: A High Performance RDMA-Based Distributed Array. Baorong Ding, Mingcong Han, Rong Chen |
| 2023 | DComp: Efficient Offload of LSM-tree Compaction with Data Processing Units. Chen Ding, Jian Zhou, Jiguang Wan, Yiqin Xiong, Sicen Li, Shuning Chen, Hanyang Liu, Liu Tang, Ling Zhan, Kai Lu, Peng Xu |
| 2023 | DEFT: Exploiting Gradient Norm Difference between Model Layers for Scalable Gradient Sparsification. Daegun Yoon, Sangyoon Oh |
| 2023 | DeepPower: Deep Reinforcement Learning based Power Management for Latency Critical Applications in Multi-core Systems. Jingrun Zhang, Guangba Yu, Zilong He, Liang Ai, Pengfei Chen |
| 2023 | DiffLex: A High-Performance, Memory-Efficient and NUMA-Aware Learned Index using Differentiated Management. Lixiao Cui, Kedi Yang, Yusen Li, Gang Wang, Xiaoguang Liu |
| 2023 | Dystri: A Dynamic Inference based Distributed DNN Service Framework on Edge. Xueyu Hou, Yongjie Guan, Tao Han |
| 2023 | EC-SpMM: Efficient Compilation of SpMM Kernel on GPUs. Junqing Lin, Honghe Zhang, Xiaolong Shi, Jingwei Sun, Xianzhi Yu, Jun Yao, Guangzhong Sun |
| 2023 | Embracing Uncertainty for Equity in Resource Allocation in ML Training. Suraiya Tairin, Haiying Shen, Zeyu Zhang |
| 2023 | Exploiting Subgraph Similarities for Efficient Auto-tuning of Tensor Programs. Mingzhen Li, Hailong Yang, Shanjun Zhang, Fengwei Yu, Ruihao Gong, Yi Liu, Zhongzhi Luan, Depei Qian |
| 2023 | FaST-GShare: Enabling Efficient Spatio-Temporal GPU Sharing in Serverless Computing for Deep Learning Inference. Jianfeng Gu, Yichao Zhu, Puxuan Wang, Mohak Chadha, Michael Gerndt |
| 2023 | Fast Parallel Index Construction for Efficient K-truss-based Local Community Detection in Large Graphs. Md Abdul Motaleb Faysal, Maximilian H. Bremer, Cy P. Chan, John Shalf, Shaikh Arifuzzaman |
| 2023 | Fast tree-based algorithms for DBSCAN for low-dimensional data on GPUs. Andrey Prokopenko, Damien Lebrun-Grandié, Daniel Arndt |
| 2023 | FastDimeNet++: Training DimeNet++ in 22 minutes. Feiwen Zhu, Michal Futrega, Han Bao, Sukru Burc Eryilmaz, Fei Kong, Kefeng Duan, Xinnian Zheng, Nimrod Angel, Matthias Jouanneaux, Maximilian Stadler, Michal Marcinkiewicz, Fung Xie, June Yang, Michael Andersch |
| 2023 | GFFT: a Task Graph Based Fast Fourier Transform Optimization Framework. Qinglin Lu, Xinyu Wang, Wenjing Ma, Yuwen Zhao, Daokun Chen, Fangfang Liu |
| 2023 | GPU Performance Acceleration via Intra-Group Sharing TLB. Weiming Huang, Yajuan Du, Mingyang Liu |
| 2023 | General-purpose Asynchronous Periodic Checkpointing in Hybrid Memory. Masaki Nakata, Shigeyuki Sato, Tomoharu Ugawa |
| 2023 | Group-based Hierarchical Federated Learning: Convergence, Group Formation, and Sampling. Jiyao Liu, Xinliang Wei, Xuanzhang Liu, Hongchang Gao, Yu Wang |
| 2023 | HASpGEMM: Heterogeneity-Aware Sparse General Matrix-Matrix Multiplication on Modern Asymmetric Multicore Processors. Helin Cheng, Wenxuan Li, Yuechen Lu, Weifeng Liu |
| 2023 | Hector: A Framework to Design and Evaluate Scheduling Strategies in Persistent Key-Value Stores. Louis-Claude Canon, Anthony Dugois, Loris Marchal, Etienne Rivière |
| 2023 | HighRPM: Combining Integrated Measurement and Software Power Modeling for High-Resolution Power Monitoring. Xinxin Qi, Juan Chen, Yong Dong, Yuan Yuan, Tao Xu, Rongyu Deng, Zekai Li, Kexing Zhou, Zheng Wang |
| 2023 | ITIF: Integrated Transformers Inference Framework for Multiple Tenants on GPU. Yuning Zhang, Zao Zhang, Wei Bao, Dong Yuan |
| 2023 | Impact of Cache Coherence on the Performance of Shared-Memory based MPI Primitives: A Case Study for Broadcast on Intel Xeon Scalable Processors. George Katevenis, Manolis Ploumidis, Manolis Marazakis |
| 2023 | Implementing OpenMP's SIMD Directive in LLVM's GPU Runtime. Eric Wright, Johannes Doerfert, Shilei Tian, Barbara M. Chapman, Sunita Chandrasekaran |
| 2023 | Improving the Scaling of an Asynchronous Many-Task Runtime with a Lightweight Communication Engine. Omri Mor, George Bosilca, Marc Snir |
| 2023 | Investigating Dependency Graph Discovery Impact on Task-based MPI+OpenMP Applications Performances. Romain Pereira, Adrien Roussel, Patrick Carribault, Thierry Gautier |
| 2023 | JOSS: Joint Exploration of CPU-Memory DVFS and Task Scheduling for Energy Efficiency. Jing Chen, Madhavan Manivannan, Bhavishya Goel, Miquel Pericàs |
| 2023 | JSweep: A Patch-centric Data-driven Approach for Parallel Sweeps on Large-scale Meshes. Jie Yan, Zhang Yang, Aiqing Zhang, Zeyao Mo |
| 2023 | Learning From Your Neighbours: Mobility-Driven Device-Edge-Cloud Federated Learning. Songli Zhang, Zhenzhe Zheng, Fan Wu, Bingshuai Li, Yunfeng Shao, Guihai Chen |
| 2023 | MARS: Fault Localization in Programmable Networking Systems with Low-cost In-Band Network Telemetry. Benran Wang, Hongyang Chen, Pengfei Chen, Zilong He, Guangba Yu |
| 2023 | Marlin: A Concurrent and Write-Optimized B+-tree Index on Disaggregated Memory. Hang An, Fang Wang, Dan Feng, Xiaomin Zou, Zefeng Liu, Jianshun Zhang |
| 2023 | Mercury: Fast and Optimal Device Placement for Large Deep Learning Models. Hengwei Xu, Pengyuan Zhou, Haiyong Xie, Yong Liao |
| 2023 | Minimizing Network and Storage Costs for Consensus with Flexible Erasure Coding. Mi Zhang, Qihan Kang, Patrick P. C. Lee |
| 2023 | Modeling and Benchmarking the Potential Benefit of Early-Bird Transmission in Fine-Grained Communication. Whit Schonbein, Scott Levy, Matthew G. F. Dosanjh, W. Pepper Marts, Elizabeth Reid, Ryan E. Grant |
| 2023 | NeiLatS: Neighbor-Aware Latency-Sensitive Application Scheduling in Heterogeneous Cloud-Edge Environment. Huadong Li, Hui Liu, Changyuan Liu, Aoqi Chen, Zhaocheng Niu, Junzhao Du |
| 2023 | O(N) distributed direct factorization of structured dense matrices using runtime systems. Sameer Deshmukh, Rio Yokota, George Bosilca, Qianxiang Ma |
| 2023 | ORAQL - Optimistic Responses to Alias Queries in LLVM. Jan Hückelheim, Johannes Doerfert |
| 2023 | OSP: Boosting Distributed Model Training with 2-stage Synchronization. Zixuan Chen, Lei Shi, Xuandong Liu, Jiahui Li, Sen Liu, Yang Xu |
| 2023 | On Optimizing Traffic Scheduling for Multi-replica Containerized Microservices. Xianzhi Zhu, Yongkun Li, Lulu Yao, Zhihao Qi, Yinlong Xu, Pengcheng Wang, Weiguang Wang, Xia Zhu |
| 2023 | Output-Directed Dynamic Quantization for DNN Acceleration. Beilei Jiang, Xianwei Cheng, Yuan Li, Jocelyn Zhang, Song Fu, Qing Yang, Mingxiong Liu, Alejandro Olvera |
| 2023 | PFDRL: Personalized Federated Deep Reinforcement Learning for Residential Energy Management. Jiechao Gao, Wenpeng Wang, Fateme Nikseresht, Viswajith Govinda Rajan, Bradford Campbell |
| 2023 | PMLDS: An LSM-Tree Direct Managed Storage for Key-Value Stores on Byte-Addressable Devices. Ziyi Lu, Qiang Cao, Shucheng Wang, Jie Yao, Xiangrui Yang |
| 2023 | PSRA-HGADMM: A Communication Efficient Distributed ADMM Algorithm. Yongwen Qiu, Yongmei Lei, Guozheng Wang |
| 2023 | Parallel Order-Based Core Maintenance in Dynamic Graphs. Bin Guo, Emil Sekerinski |
| 2023 | Performance-Aware Energy-Efficient GPU Frequency Selection using DNN-based Models. Ghazanfar Ali, Mert Side, Sridutt Bhalachandra, Nicholas J. Wright, Yong Chen |
| 2023 | Proceedings of the 52nd International Conference on Parallel Processing, ICPP 2023, Salt Lake City, UT, USA, August 7-10, 2023 |
| 2023 | Quantifying the Performance Benefits of Partitioned Communication in MPI. Thomas Gillis, Ken Raffenetti, Hui Zhou, Yanfei Guo, Rajeev Thakur |
| 2023 | RBC: A bandwidth controller to reduce write-stalls and tail latency. Zepeng Wang, Shu Yin |
| 2023 | RLB: Reordering-Robust Load Balancing in Lossless Datacenter Networks. Jinbin Hu, Yi He, Jin Wang, Wangqing Luo, Jiawei Huang |
| 2023 | RadarSSD: A Computational Storage for Radar Signal Processing. Jiali Li, Xianzhang Chen, Duo Liu, Ao Ren, Zhaoyang Zeng, Yujuan Tan |
| 2023 | Re-aligning Across-page Requests for Flash-based Solid-state Drives. Zhigang Cai, Chengyong Tang, Minjun Li, François Trahay, Jun Li, Zhibing Sha, Jiaojiao Wu, Fan Yang, Jianwei Liao |
| 2023 | Recoil: Parallel rANS Decoding with Decoder-Adaptive Scalability. Fangzheng Lin, Kasidis Arunruangsirilert, Heming Sun, Jiro Katto |
| 2023 | SEECHIP: A Scalable and Energy-Efficient Chiplet-based GPU Architecture Using Photonic Links. Hao Zhang, Yawen Chen, Zhiyi Huang, Haibo Zhang, Fei Dai |
| 2023 | SNICIT: Accelerating Sparse Neural Network Inference via Compression at Inference Time on GPU. Shui Jiang, Tsung-Wei Huang, Bei Yu, Tsung-Yi Ho |
| 2023 | SPLIT: QoS-Aware DNN Inference on Shared GPU via Evenly-Sized Model Splitting. Diaohan Luo, Tian Yu, Yuewen Wu, Heng Wu, Tao Wang, Wenbo Zhang |
| 2023 | Scalable Incremental Checkpointing using GPU-Accelerated De-Duplication. Nigel Tan, Jakob Lüttgau, Jack Marquez, Keita Teranishi, Nicolas M. Morales, Sanjukta Bhowmick, Franck Cappello, Michela Taufer, Bogdan Nicolae |
| 2023 | Scheduling Dependent Batching Tasks. Hehuan Shi, Lin Chen, Ming Lin, Raphael C.-W. Phan |
| 2023 | Smart Cache Insertion and Promotion Policy for Content Delivery Networks. Peng Wang, Yu Liu, Zhelong Zhao, Ke Zhou, Zhihai Huang, Yanxiong Chen |
| 2023 | Tango: Harmonious Management and Scheduling for Mixed Services Co-located among Distributed Edge-Clouds. Yicheng Feng, Shihao Shen, Mengwei Xu, Yuanming Ren, Xiaofei Wang, Victor C. M. Leung, Wenyu Wang |
| 2023 | Toward Optimal Repair and Load Balance in Locally Repairable Codes. Hao Zhao, Si Wu, Haifeng Liu, Zhixiang Tang, Xiaochun He, Yinlong Xu |
| 2023 | WFAsic: A High-Performance ASIC Accelerator for DNA Sequence Alignment on a RISC-V SoC. Abbas Haghi, Lluc Alvarez, Jordi Fornt, Juan Miguel De Haro Ruiz, Roger Figueras, Max Doblas, Santiago Marco-Sola, Miquel Moretó |
| 2023 | Warped-MC: An Efficient Memory Controller Scheme for Massively Parallel Processors. Jong-Hyun Jeong, Myung Kuk Yoon, Yunho Oh, Gunjae Koo |
| 2023 | Wrht: Efficient All-reduce for Distributed DNN Training in Optical Interconnect Systems. Fei Dai, Yawen Chen, Zhiyi Huang, Haibo Zhang |