| 2025 | A Pattern-Aware Finite Element Matrix Assembly Method on GPUs. Shuai Yang, Jiaxin Ding, Changyou Zhang, Zhuo Tian, Guangzhao Li, Chen Ju |
| 2025 | A Versatile Simulated Data Transport Layer for in Situ Workflows Performance Evaluation. Frédéric Suter |
| 2025 | Accelerating Key-Value Data Structures Using AVX-512 SIMD Extensions. MohammadReza HoseinyFarahabady, Javid Taheri, Albert Y. Zomaya |
| 2025 | Are We There Yet? Predicting the Queue Wait Times for HPC Jobs. Christin Whitton, William M. Jones, Craig S. Walker, Vanessa Job, Steven T. Senator, Nathan DeBardeleben |
| 2025 | BMPipe: Bubble-Memory Co-Optimization Strategy Planner for Very-Large DNN Training. Ruiwen Wang, Chong Li, Thibaut Tachon, Raja Appuswamy, Teng Su |
| 2025 | Bridging Metadata Service and CXL: A Metadata-Grained and Directory-Aware Storage Engine for Distributed Storage Systems. Xinyu Xu, Xuchao Xie, Xinghan Qiao, Lei Tian, Qiulin Wu, Wenhao Gu, Liquan Xiao |
| 2025 | CFseq: A Framework for Constructing Compression-Friendly Field Sequences for Network Logs. Yunwei Dai, Tao Huang, Shuo Wang, Yong Wang |
| 2025 | Cache Less to Save More: A Cost-Based Distributed Caching Strategy for ICN. Lydia Ait-Oucheggou, Stéphane Rubini, Abdella Battou, Jalil Boukhobza |
| 2025 | Capricorn: Efficient In-Memory Checkpointing for MoE Model Training with Dynamicity Awareness. Wenqian Xie, Zhiquan Lai, Shengwei Li, Weijie Liu, Wei Wang, Yanqi Hao, Dongsheng Li |
| 2025 | Cascade: a Collaborative Algorithm for Scalable and Efficient Neighborhood Allgather. Hamed Sharifian, Amir Hossein Sojoodi, Ahmad Afsahi |
| 2025 | Closing the HPC-Cloud Convergence Gap: Multi-Tenant Slingshot RDMA for Kubernetes. Philipp A. Friese, Ahmed Eleliemy, Utz-Uwe Haus, Martin Schulz |
| 2025 | Communication Notification Through User-Level Interrupts for the BXI Network. Charles Goedefroit, Alexandre Denis, Mathieu Barbe, Brice Goglin, Grégoire Pichon |
| 2025 | DDRM: An SLO-aware Deep Dynamic Resource Management Framework for Microservices. Liangping Tang, Jin Wang, Wanyou Wang, Gaotao Shi, Zhijun Li |
| 2025 | DaCe AD: Unifying High-Performance Automatic Differentiation for Machine Learning and Scientific Computing. Afif Boudaoud, Alexandru Calotoiu, Marcin Copik, Torsten Hoefler |
| 2025 | Deadline-Aware Resource Allocation and Scheduling of Serverless Workloads on Heterogeneous Clusters. Matthias Fritz, Siegfried Benkner, Enes Bajrovic |
| 2025 | Detecting Silent Data Corruption from Hardware Counters. Minseop Choi, Taha Azzaoui, Kyle Chaisson, Orlando Arias, Seung Woo Son |
| 2025 | Efficient Multi-GPU Programming in Python: Reducing Synchronization and Access Overheads. Lena Oden, Klaus Nölp |
| 2025 | EquilibrIO: Taming the I/O Tides in High-Performance Computing. Taylan Özden, Ahmad Tarraf, Felix Wolf |
| 2025 | FIFO-MEP: An Efficient Multi-Eviction-Point FIFO Cache with Stable Demotion for Burst-Oriented Access Mitigation. Ranhao Jia, Yunfei Gu, Chentao Wu, Jie Li, Minyi Guo, Liqiang Zhang, Zaigui Zhang, Haijun Zhang |
| 2025 | Fine-Grain Energy Consumption Modeling of HPC Task-Based Programs. Jules Risse, Amina Guermouche, François Trahay |
| 2025 | GreenK8s: Green-aware Scheduling for Sustainable Kubernetes Cluster Management. Yifan Sun, Minxian Xu, Adel Nadjaran Toosi |
| 2025 | IEEE International Conference on Cluster Computing, CLUSTER 2025, Edinburgh, United Kingdom, September 2-5, 2025 |
| 2025 | Lessons from Profiling and Optimizing Placement in AMR Codes. Ankush Jain, Charles D. Cranor, Qing Zheng, Dominic Manno, George Amvrosiadis, Gary A. Grider |
| 2025 | Multi-agent Independent PPO-based Automatic ECN Tuning for High-Speed Data Center Networks. Ting Wang, Kai Cheng, Xiao Du |
| 2025 | NSYS2PRV: Detailed and Quantitative Analysis of Large-Scale GPU Execution Traces with Paraver. Marc Clascà, Jesús Labarta, Marta Garcia-Gasulla |
| 2025 | PIAR: Path-Improved Adaptive Routing for Dragonfly Networks. Zhenghao Wang, Qiang Wang, Mingche Lai, Jiaqing Xu, Jinbo Xu, Min Xie, Guo Chen |
| 2025 | PRT: An Efficient Pipeline Reuse Technology for Large Models Training. Zeyu Ji, Banghao Zhai, Zhonghao Zhang, Qi Chu, Bin Liu |
| 2025 | Parallel Selected Inversion of Block-Tridiagonal with Arrowhead Matrices. Vincent Maillou, Lisa Gaedke-Merzhäuser, Alexandros Nikolaos Ziogas, Olaf Schenk, Mathieu Luisier |
| 2025 | Parallel Tall-and-Skinny QR Factorization Based on LU-CholeskyQR Algorithm. Yuki Uchino, Toshiyuki Imamura |
| 2025 | Proactive SSD Failure Prediction with A Gradient-Guided LSTM-xLSTM Hybrid Model. Xiaofei Wang, Yang Zhang, Junyan Chen, Xin Wu, Daiwei Du, Feng Wang, Gang Wang, Xiaozhou Liu, Xiaoguang Liu, Yu Zhang |
| 2025 | RAN: Accelerating Data Repair with Available Nodes in Erasure-Coded Storage. Canghai Yang, Kan Zhong, Yujuan Tan, Ao Ren, Duo Liu |
| 2025 | Revisiting Fragmentation for Deduplication in Clustered Primary Storage Systems. Lin Wang, Yuchong Hu, Shilong Mao, Mingqi Li, Ziling Duan, Yue Huang, Leihua Qin, Dan Feng, Zehui Chen, Ruliang Dong |
| 2025 | Rock: Serving Multimodal Models in Cloud with Heterogeneous-Aware Resource Orchestration for Thousands of LoRA Adapters. Shuaipeng Wu, Yanying Lin, Shijie Peng, Wenyan Chen, Chong Ma, Min Shen, Le Chen, Chengzhong Xu, Kejiang Ye |
| 2025 | Scalable and Fast Inference Serving via Hybrid Communication Scheduling on Heterogeneous Networks. Gonglong Chen, Jiamei Lv, Kejiang Ye, Tao Gu, Cheng-Zhong Xu |
| 2025 | Scaling Deep Learning Molecular Dynamics to 500M Atoms on 4096-Node ARMv8 Clusters. Qi Du, Feng Wang, Chengkun Wu, Han Wang, Yongpeng Liu, Zhaoyin Zhou, Kenli Li |
| 2025 | SoCL: Scalable and Latency-Optimized Microservices in Serverless Edge Computing. Shuaibing Lu, Bojin Xiang, Jie Wu, Ziyu You, Wentong Cai |
| 2025 | SplitQuant: Resource-Efficient LLM Offline Serving on Heterogeneous GPUs via Phase-Aware Model Partition and Adaptive Quantization. Juntao Zhao, Borui Wan, Yanghua Peng, Haibin Lin, Chuan Wu |
| 2025 | TRACE: A Targeted Recommender for VM Assignment in Cloud Environment. Hongji Dong, Yunlong Cheng, Tin Ping Chan, Xiaofeng Gao, Guihai Chen |
| 2025 | Towards Dynamic Message Passing Protocols for Stencil-Based Communication Patterns. Kaushik Kandadi Suresh, Bharath Ramesh, Goutham Kalikrishna Reddy Kuncham, Hari Subramoni, Dhabaleswar K. Panda |
| 2025 | Towards High-Performance and Portable Molecular Docking on CPUs Through Vectorization. Gianmarco Accordi, Jens Domke, Theresa Pollinger, Davide Gadioli, Gianluca Palermo |
| 2025 | Uniconn: A Uniform High-Level Communication Library for Portable Multi-GPU Programming. Dogan Sagbili, Sinan Ekmekçibasi, Khaled Z. Ibrahim, Tan Nguyen, Didem Unat |