| 2024 | A Distributed Framework for Subgraph Isomorphism Leveraging CPU and GPU Heterogeneous Computing. Chen Chen, Li Shen, Yingwen Chen |
| 2024 | A Hybrid Machine Learning Method for Cross-Platform Performance Prediction of Parallel Applications. Kaveh Mahdavi |
| 2024 | A Motion Trace Decomposition-based overset grid method for parallel CFD simulations with moving boundaries. Ran Zhao, Chao Li, Xiaowei Guo, Sen Zhang, Xi Yang, Tao Tang, Canqun Yang |
| 2024 | AUTOHET: An Automated Heterogeneous ReRAM-Based Accelerator for DNN Inference. Tong Wu, Shuibing He, Jianxin Zhu, Weijian Chen, Siling Yang, Ping Chen, Yanlong Yin, Xuechen Zhang, Xian-He Sun, Gang Chen |
| 2024 | Accelerated Constrained Sparse Tensor Factorization on Massively Parallel Architectures. Yongseok Soh, Ramakrishnan Kannan, Piyush Sao, Jee W. Choi |
| 2024 | Achieving Efficient Scheduling based on Accurate Measurement of Small Flows in Data Center. Jiawei Huang, Qile Wang, Zhaoyi Li, Yijun Li, Zihao Chen, Sitan Li, Jing Shao, Jingling Liu, Min Zhan, Jianxin Wang |
| 2024 | Achieving High Efficiency for Datacenter Multicast using Skewed Bloom Filter. Jiawei Huang, Zihao Chen, Yiting Wang, Hui Li, Zhaoyi Li, Qile Wang, Sitan Li, Zhidong He, Wanchun Jiang |
| 2024 | AdCoalescer: An Adaptive Coalescer to Reduce the Inter-Module Traffic in MCM-GPUs. Xu Zhang, Guangda Zhang, Lu Wang, Shiqing Zhang, Xia Zhao |
| 2024 | Arlo: Serving Transformer-based Language Models with Dynamic Input Lengths. Xin Tan, Jiamin Li, Yitao Yang, Jingzong Li, Hong Xu |
| 2024 | AutoPipe: Automatic Configuration of Pipeline Parallelism in Shared GPU Cluster. Jinbin Hu, Ying Liu, Hao Wang, Jin Wang |
| 2024 | BandSlim: A Novel Bandwidth and Space-Efficient KV-SSD with an Escape-from-Block Approach. Junhyeok Park, Chang-Gyu Lee, Soon Hwang, Soonyeal Yang, Jungki Noh, Woosuk Chung, Junghee Lee, Youngjae Kim |
| 2024 | Bandwidth-Aware and Overlap-Weighted Compression for Communication-Efficient Federated Learning. Zichen Tang, Junlin Huang, Rudan Yan, Yuxin Wang, Zhenheng Tang, Shaohuai Shi, Amelie Chi Zhou, Xiaowen Chu |
| 2024 | Bitmap-Based Sparse Matrix-Vector Multiplication with Tensor Cores. Yuang Chen, Jeffrey Xu Yu |
| 2024 | BoostN: Optimizing Imbalanced Neighborhood Communication on Homogeneous Many-Core System. Haopeng Huang, Yuyang Jin, Wei Xue |
| 2024 | BrickDL: Graph-Level Optimizations for DNNs with Fine-Grained Data Blocking on GPUs. Mahesh Lakshminarasimhan, Mary W. Hall, Samuel Williams, Oscar Antepara |
| 2024 | CAMLB-SpMV: An Efficient Cache-Aware Memory Load-Balancing SpMV on CPU. Jihu Guo, Rui Xia, Jie Liu, Xiaoxiong Zhu, Xiang Zhang |
| 2024 | CIM-KF: Efficient Computing-in-memory Circuits for Full-Process Execution of Kalman Filter Algorithm. Pingdan Xiao, Qinghui Hong, Sichun Du, Jiliang Zhang |
| 2024 | CR2: Community-aware Compressed Regular Representation for Graph Processing on a GPU. Shinnung Jeong, Sungjun Cho, Yongwoo Lee, Hyunjun Park, Seonyeong Heo, Gwangsun Kim, Youngsok Kim, Hanjun Kim |
| 2024 | Cache Line Pinning for Mitigating Row Hammer Attack. Praseetha M, Madhu Mutyam, Venkata Kalyan Tavva |
| 2024 | ChronusFed: Reinforcement-Based Adaptive Partial Training for Heterogeneous Federated Learning. Fuyuan Xia, Chenhao Ying, David S. L. Wei, Wei Chen, Weiting Zhang, Haiming Jin, Yuan Luo |
| 2024 | Co-Design of Convolutional Algorithms and Long Vector RISC-V Processors for Efficient CNN Model Serving. Sonia Rani Gupta, Nikela Papadopoulou, Jing Chen, Miquel Pericàs |
| 2024 | Coupling Congestion Control and Flow Pausing in Data Center Network. Jiawei Huang, Shengwen Zhou, Zhaoyi Li, Yijun Li, Zihao Chen, Xiaojun Zhu, Jing Shao, Sitan Li, Wanchun Jiang, Jianxin Wang, Ping Zhong |
| 2024 | DB-SpGEMM: A Massively Distributed Block-Sparse Matrix-Matrix Multiplication for Linear-Scaling DFT Calculations. Zhong Zheng, Junshi Chen, Yang Zhao, Longsheng Song, Xinming Qin, Hong An |
| 2024 | DPC: DPU-accelerated High-Performance File System Client. Kan Zhong, Zhiwang Yu, Qiao Li, Xianqiang Luo, Linbo Long, Yujuan Tan, Ao Ren, Duo Liu |
| 2024 | DeInfer: A GPU resource allocation algorithm with spatial sharing for near-deterministic inferring tasks. Yingwen Chen, Wenxin Li, Huan Zhou, Xiangrui Yang, Yanfei Yin |
| 2024 | Designing Non-uniform Locally Repairable Codes for Wide Stripes under Skewed File Accesses. Guantian Lin, Si Wu, Cheng Li, Yinlong Xu |
| 2024 | Detailed Analysis and Optimization of Irregular-Shaped Matrix Multiplication on Multi-Core DSPs. Haotian Mo, Qinglin Wang, Linyu Liao, Biao Li, Lihua Chi, Jie Liu |
| 2024 | DiStore: A Fully Memory Disaggregation Friendly Key-Value Store with Improved Tail Latency and Space Efficiency. Ziwei Xiong, Dejun Jiang, Jin Xiong |
| 2024 | Diminishing cold starts in serverless computing with approximation algorithms. Tomasz Kanas, Krzysztof Rzadca |
| 2024 | Dissecting Convolutional Neural Networks for Runtime and Scalability Prediction. Tim Beringer, Jakob Stock, Arya Mazaheri, Felix Wolf |
| 2024 | Distributed Minimax Fair Optimization over Hierarchical Networks. Wen Xu, Juncheng Wang, Ben Liang, Gary Boudreau, Hamza Umit Sokun |
| 2024 | Enabling Performance Observability for Heterogeneous HPC Workflows with SOMA. Dewi Yokelson, Mikhail Titov, Srinivasan Ramesh, Ozgur O. Kilic, Matteo Turilli, Shantenu Jha, Allen D. Malony |
| 2024 | Enhancing Heterogeneous Computing Through OpenMP and GPU Graph. Chenle Yu, Sara Royuela, Eduardo Quiñones |
| 2024 | Evaluating and optimising compiler code generation for NVIDIA Grace. Ricardo Jesus, Michèle Weiland |
| 2024 | Exploring Scalability in C++ Parallel STL Implementations. Ruben Laso, Diego Krupitza, Sascha Hunold |
| 2024 | Extending Segment Tree for Polygon Clipping and Parallelizing using OpenMP and OpenACC Directives. Buddhi Ashan Mallika Kankanamalage, Satish Puri, Sushil K. Prasad |
| 2024 | FNCC: Fast Notification Congestion Control in Data Center Networks. Jing Xu, Zhan Wang, Fan Yang, Ning Kang, Zhenlong Ma, Guojun Yuan, Guangming Tan, Ninghui Sun |
| 2024 | FP16 Acceleration in Structured Multigrid Preconditioner for Real-World Applications. Yi Zong, Peinan Yu, Haopeng Huang, Wei Xue |
| 2024 | Fast Leiden Algorithm for Community Detection in Shared Memory Setting. Subhajit Sahu, Kishore Kothapalli, Dip Sankar Banerjee |
| 2024 | FedCA: Efficient Federated Learning with Client Autonomy. Na Lv, Zhi Shen, Chen Chen, Zhifeng Jiang, Jiayi Zhang, Quan Chen, Minyi Guo |
| 2024 | FedClust: Tackling Data Heterogeneity in Federated Learning through Weight-Driven Client Clustering. Md Sirajul Islam, Simin Javaherian, Fei Xu, Xu Yuan, Li Chen, Nian-Feng Tzeng |
| 2024 | Federated Edge Learning with Blurred or Pseudo Data Sharing. Yinlong Li, Hao Zhang, Siyao Cheng, Jie Liu |
| 2024 | FlatDD: A High-Performance Quantum Circuit Simulator using Decision Diagram and Flat Array. Shui Jiang, Rongliang Fu, Lukas Burgholzer, Robert Wille, Tsung-Yi Ho, Tsung-Wei Huang |
| 2024 | FlexSP: (1 + β)-Choice based Flexible Stream Partitioning for Stateful Operators. Siyuan Chen, Decheng Zuo, Zhan Zhang |
| 2024 | FreeStencil: A Fine-Grained Solver Compiler with Graph and Kernel Optimizations on Structured Meshes for Modern GPUs. Qianchao Zhu |
| 2024 | GMM: An Efficient GPU Memory Management-based Model Serving System for Multiple DNN Inference Models. XinYu Piao, Jong-Kook Kim |
| 2024 | GNNDrive: Reducing Memory Contention and I/O Congestion for Disk-based GNN Training. Qisheng Jiang, Lei Jia, Chundong Wang |
| 2024 | GPU Algorithms for Fastest Path Problem in Temporal Graphs. Mithinti Srikanth, Prashant Singh, G. Ramakrishna |
| 2024 | GSAP: A GPU-Accelerated Stochastic Graph Partitioner. Chih-Chun Chang, Boyang Zhang, Tsung-Wei Huang |
| 2024 | Gradient Free Personalized Federated Learning. Haoyu Chen, Yuxin Zhang, Jin Zhao, Xin Wang, Yuedong Xu |
| 2024 | HASFL: Harnessing Heterogeneous Models Across Diverse Devices for Enhanced Federated Learning. Jiangshan Hao, Fang Dong, Bingheng Cen, Shucun Fu, Ruiting Zhou, Ding Ding |
| 2024 | HMT: A Hybrid Mitigating and Transferring Approach on I/O Throughput Degradation for Erasure Coded Storage Systems. Piao Hu, Huangzhen Xue, Chentao Wu, Jie Li, Minyi Guo |
| 2024 | HStream: A hierarchical data streaming engine for high-throughput scientific applications. Jaime Cernuda, Jie Ye, Anthony Kougkas, Xian-He Sun |
| 2024 | Hardware Acceleration of Minimap2 Genomic Sequence Alignment Algorithm. Jie Cheng, Lifu Hu, Wei Xu, Hanhua Chen, Tian Xia |
| 2024 | Harnessing Integrated CPU-GPU System Memory for HPC: a first look into Grace Hopper. Gabin Schieffer, Jacob Wahlgren, Jie Ren, Jennifer Faj, Ivy Peng |
| 2024 | Hi-ZNS: High Space Efficiency and Zero-Copy LSM-Tree Based Stores on ZNS SSDs. Renping Liu, Junhua Chen, Peng Chen, Linbo Long, Anping Xiong, Duo Liu |
| 2024 | High-Performance 3D convolution on the Latest Generation Sunway Processor. Jialin Li, Zhichen Feng, Yaqian Gao, Shaobo Tian, Haoyuan Zhang, Huang Ye, Jian Zhang |
| 2024 | High-Performance Sorting-Based K-mer Counting in Distributed Memory with Flexible Hybrid Parallelism. Yifan Li, Giulia Guidi |
| 2024 | High-Performance, Accurate Large-Scale Quantum Chemistry Calculations on GPU Supercomputers using Coulomb-Perturbed Fragmentation. Fazeleh S. Kazemian, Jorge L. Galvez Vallejo, Giuseppe M. J. Barca |
| 2024 | Holmes: Towards Distributed Training Across Clusters with Heterogeneous NIC Environment. Fei Yang, Shuang Peng, Ning Sun, Fangyu Wang, Yuanyuan Wang, Fu Wu, Jiezhong Qiu, Aimin Pan |
| 2024 | HyperDB: a Novel Key Value Store for Reducing Background Traffic in Heterogeneous SSD Storage. Ruisong Zhou, Yuzhan Zhang, Chunhua Li, Ke Zhou, Peng Wang, Gong Zhang, Ji Zhang, Guangyu Zhang |
| 2024 | IMI: In-memory Multi-job Inference Acceleration for Large Language Models. Bin Gao, Zhehui Wang, Zhuomin He, Tao Luo, Weng-Fai Wong, Zhi Zhou |
| 2024 | Im2col-Winograd: An Efficient and Flexible Fused-Winograd Convolution for NHWC Format on GPUs. Zhiyi Zhang, Pengfei Zhang, Zhuopin Xu, Bingjie Yan, Qi Wang |
| 2024 | Improving Performance on Replica-Exchange Molecular Dynamics Simulations by Optimizing GPU Core Utilization. Taisuke Boku, Masatake Sugita, Ryohei Kobayashi, Shinnosuke Furuya, Takuya Fujie, Masahito Ohue, Yutaka Akiyama |
| 2024 | Improving efficiency of Monte Carlo method via code intrinsic framework. Qifeng Pan, Ralf Schneider |
| 2024 | In-Situ Binary Segmentation of 3D time-dependent Flows into Laminar and Turbulent Regions. Jiahui Liu, Tobias Edwards, Kristina Durovic, Philipp Schlatter, Tino Weinkauf |
| 2024 | Jigsaw: Accelerating SpMM with Vector Sparsity on Sparse Tensor Core. Kaige Zhang, Xiaoyan Liu, Hailong Yang, Tianyu Feng, Xinyu Yang, Yi Liu, Zhongzhi Luan, Depei Qian |
| 2024 | Kanva: A Lock-free Learned Search Data Structure. Gaurav Bhardwaj, Bapi Chatterjee, Abhinav Sharma, Sathya Peri, Siddharth Nayak |
| 2024 | Large-scale Phase-Field Simulations for Solid-Solid Phase Transformations involving Elastic Energy. Yaqian Gao, Jian Zhang, Huang Ye, Xuebin Chi |
| 2024 | LpaqHP: A High-Performance FPGA Accelerator for LPAQ Compression. Weilin Zhu, Wei Tong, Hujun Ge, Zuoxian Zhang, Mengran Zhang, Wen Zhou |
| 2024 | MIGER: Integrating Multi-Instance GPU and Multi-Process Service for Deep Learning Clusters. Bowen Zhang, Shuxin Li, Zhuozhao Li |
| 2024 | Mapping Large Memory-constrained Workflows onto Heterogeneous Platforms. Svetlana Kulagina, Henning Meyerhenke, Anne Benoit |
| 2024 | Massively Parallel Inverse Block-sorting Transforms for bzip2 Decompression on GPUs. André Weißenberger, Bertil Schmidt |
| 2024 | Multi-level Load Balancing Strategies for Massively Parallel Smoothed Particle Hydrodynamics Simulation. Yi Zhang, Ziyu Zhang, Yang Zhao, Junshi Chen, Hong An, Zhanming Wang, Longkui Chen |
| 2024 | Murmuration: On-the-fly DNN Adaptation for SLO-Aware Distributed Inference in Dynamic Edge Environments. Jieyu Lin, Minghao Li, Sai Qian Zhang, Alberto Leon-Garcia |
| 2024 | Nebula: An Edge-Cloud Collaborative Learning Framework for Dynamic Edge Environments. Yan Zhuang, Zhenzhe Zheng, Yunfeng Shao, Bingshuai Li, Fan Wu, Guihai Chen |
| 2024 | NetSmith: An Optimization Framework for Machine-Discovered Network Topologies. Conor James Green, Mithuna Thottethodi |
| 2024 | OP-PIC - an Unstructured-Mesh Particle-in-Cell DSL for Developing Nuclear Fusion Simulations. Zaman Lantra, Steven A. Wright, Gihan R. Mudalige |
| 2024 | Online Non-preemptive Multi-Resource Scheduling for Weighted Completion Time on Multiple Machines. Donney Fan, Ben Liang |
| 2024 | Online Scheduling and Pricing for Multi-LoRA Fine-Tuning Tasks. Ying Zheng, Lei Jiao, Han Yang, Lulu Chen, Ying Liu, Yuxiao Wang, Yuedong Xu, Xin Wang, Zongpeng Li |
| 2024 | Optimizing SpMV on Heterogeneous Multi-Core DSPs through Improved Locality and Vectorization. Deshun Bi, Shengguo Li, Dezun Dong, Peng Zhang, Jianbin Fang |
| 2024 | Optimizing Stencil Computation on Multi-core DSPs. Fugeng Zhu, Xinxin Qi, Peng Zhang, Jianbin Fang, Tao Tang, Yonggang Che, Kainan Yu, Jing Xie, Chun Huang, Jie Ren |
| 2024 | Optimizing a Super-Fast Eigensolver for Hierarchically Semiseparable Matrices. Abhishek V. N. Taraka Josyula, Pritesh Verma, Amar Gaonkar, Amlan Barua, Nikhil Hegde |
| 2024 | PANDORA: A Parallel Dendrogram Construction Algorithm for Single Linkage Clustering on GPU. Piyush Sao, Andrey Prokopenko, Damien Lebrun-Grandié |
| 2024 | PASCI: A Scalable Framework for Heterogeneous Parallel Calculation of Dynamical Electron Correlation. Runfeng Jin, Wenhao Liang, Haoyuan Zhang, Yinxuan Song, Zhen Luo, Haibo Ma, Yingjin Ma, Zhong Jin |
| 2024 | PREACT: Predictive Resource Allocation for Bursty Workloads in a Co-located Data Center. Dingyu Yang, Ziyang Xiao, Dongxiang Zhang, Shuhao Zhang, Jian Cao, Gang Chen |
| 2024 | PRoof: A Comprehensive Hierarchical Profiling Framework for Deep Neural Networks with Roofline Analysis. Siyu Wu, Hailong Yang, Xin You, Ruihao Gong, Yi Liu, Zhongzhi Luan, Depei Qian |
| 2024 | Parallel Iterative Mistake Minimization (IMM) clustering algorithm for shared-memory systems. Wojciech Kwedlo |
| 2024 | Parallel Optimization for Accelerating the Generation of Correctly Rounded Elementary Functions. Xianglin Wang, Xin Yi, Hengbiao Yu, Chun Huang, Lin Peng |
| 2024 | Parallel Task Scheduling in Autonomous Robotic Systems: An Event-Driven Multimodal Prediction Approach. Wen Gao, Zhiwen Yu, Hui Xiong, Bin Guo, Liang Wang, Yuan Yao |
| 2024 | Parallelization of the Banded Needleman & Wunsch Algorithm on UPMEM PiM Architecture for Long DNA Sequence Alignment. Meven Mognol, Dominique Lavenier, Julien Legriel |
| 2024 | PheCon: Fine-Grained VM Consolidation with Nimble Resource Defragmentation in Public Cloud Platforms. Jiazhen Zhu, Wenda Tang, Xianglong Meng, Nan Gong, Tianxiang Ai, Guanghui Li, Bin Yu, Xin Yang |
| 2024 | Pluto and Charon: A Time and Memory Efficient Collaborative Edge AI Framework for Personal LLMs Fine-tuning. Bei Ouyang, Shengyuan Ye, Liekang Zeng, Tianyi Qian, Jingyi Li, Xu Chen |
| 2024 | Proceedings of the 53rd International Conference on Parallel Processing, ICPP 2024, Gotland, Sweden, August 12-15, 2024 |
| 2024 | RIA: Return on Investment Auto-scaler for Serverless Edge Functions. Huadong Li, Hui Liu, Aoqi Chen, Xirui Ma, Qiaoqiao Liu, Junzhao Du |
| 2024 | RMASanitizer: Generalized Runtime Detection of Data Races in Remote Memory Access Applications. Simon Schwitanski, Yussur Mustafa Oraji, Cornelius Pätzold, Joachim Jenke, Felix Tomski, Matthias S. Müller |
| 2024 | ReDy: A Novel ReRAM-centric Dynamic Quantization Approach for Energy-efficient CNNs. Mohammad Sabri Abrebekoh, Marc Riera Villanueva, Antonio González |
| 2024 | Rethinking Low-Carbon Edge Computing System Design with Renewable Energy Sharing. Hanlong Liao, Guoming Tang, Deke Guo, Yi Wang, Ruide Cao |
| 2024 | Rethinking Personalized Federated Learning from Knowledge Perspective. Dezhong Yao, Ziquan Zhu, Tongtong Liu, Zhiqiang Xu, Hai Jin |
| 2024 | Revisiting Learned Index with Byte-addressable Persistent Storage. Rui Zhang, Yukai Huang, Sicheng Liang, Shangyi Sun, Shaonan Ma, Chengying Huan, Lulu Chen, Zhihui Lu, Yang Xu, Ming Yan, Jie Wu |
| 2024 | RoDMap: A Reserve-on-Demand Mapper for Spatially-Configured Coarse-Grained Reconfigurable Arrays. Kyle Zhao Bin Chen, Tarek S. Abdelrahman, Reza Azimi, Tomasz S. Czajkowski, Maziar Goudarzi |
| 2024 | SIndex: An SSD-based Large-scale Indexing with Deterministic Latency for Cloud Block Storage. Shucheng Wang, Kaiye Zhou, Zhandong Guo, Qiang Cao, Jun Xu, Jie Yao |
| 2024 | SPHINX: Search Space-Pruning Heterogeneous Task Scheduling for Deep Neural Networks. Bowen Yuchi, Heng Shi, Guoqing Bao |
| 2024 | SaSpGEMM: Sorting-Avoiding Sparse General Matrix-Matrix Multiplication on Multi-Core Processors. Chuhe Hong, Qinglin Wang, Runzhang Mao, Yuechao Liang, Rui Xia, Jie Liu |
| 2024 | Scheduling Machine Learning Compressible Inference Tasks with Limited Energy Budget. Tiago Da Silva Barros, Davide Ferré, Frédéric Giroire, Ramon Aparicio-Pardo, Stephane Perennes |
| 2024 | Scratchpad Memory Management for Deep Learning Accelerators. Stavroula Zouzoula, Mohammad Ali Maleki, Muhammad Waqar Azhar, Pedro Trancoso |
| 2024 | Selective Memory Compression for GPU Memory Oversubscription Management. Abdun Nihaal, Madhu Mutyam |
| 2024 | Significantly Improving Fixed-Ratio Compression Framework for Resource-limited Applications. Tri Nguyen, Md Hasanur Rahman, Sheng Di, Michela Becchi |
| 2024 | Sparse Gradient Communication with AlltoAll for Accelerating Distributed Deep Learning. Jing Peng, Zihan Li, Shaohuai Shi, Bo Li |
| 2024 | Sparsity-Aware Communication for Distributed Graph Neural Network Training. Ujjaini Mukhopadhyay, Alok Tripathy, Oguz Selvitopi, Katherine A. Yelick, Aydin Buluç |
| 2024 | SpeedCore: Space-efficient and Dependency-aware GPU Parallel Framework for Core Decomposition. Chen Zhao, Ting Yu, Zhigao Zheng, Yuanyuan Zhu, Song Jin, Bo Du, Dacheng Tao |
| 2024 | SuperCSR: A Space-Time-Efficient CSR Representation for Large-scale Graph Applications on Supercomputers. Xinbiao Gan, Tiejun Li, Qiang Zhang, Bo Yang, Xinhai Chen, Jie Liu |
| 2024 | SyncMalloc: A Synchronized Host-Device Co-Management System for GPU Dynamic Memory Allocation across All Scales. Jiajian Zhang, Fangyu Wu, Hai Jiang, Guangliang Cheng, Genlang Chen, Qiufeng Wang |
| 2024 | TESLA: Thermally Safe, Load-Aware, and Energy-Efficient Cooling Control System for Data Centers. Hanfei Geng, Yi Sun, Yuanzhe Li, Jichao Leng, Xiangyu Zhu, Xianyuan Zhan, Yuanchun Li, Feng Zhao, Yunxin Liu |
| 2024 | TeMCO: Tensor Memory Compiler Optimization across Tensor Decompositions in Deep Learning Inference. Seungbin Song, Ju Min Lee, Haeeun Jeong, Hyunho Kwon, Shinnung Jeong, Jaeho Lee, Hanjun Kim |
| 2024 | Thawbringer: An Orchestrator to Mitigate Cascading Cold Starts of Serverless Function Chains. Huadong Li, Hui Liu, Aoqi Chen, Xirui Ma, Junzhao Du |
| 2024 | The Blind and the Elephant: A Preference-aware Edge Video Analytics Scheduler for Maximizing System Benefit. Liang Zhang, Hongzi Zhu, Yunzhe Li, Jiangang Shen, Minyi Guo |
| 2024 | The Case for Co-Designing Model Architectures with Hardware. Quentin Anthony, Jacob Hatef, Deepak Narayanan, Stella Biderman, Stas Bekman, Junqi Yin, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda |
| 2024 | Viper: A High-Performance I/O Framework for Transparently Updating, Storing, and Transferring Deep Neural Network Models. Jie Ye, Jaime Cernuda, Neeraj Rajesh, Keith Bateman, Orcun Yildiz, Tom Peterka, Arnur Nigmetov, Dmitriy Morozov, Xian-He Sun, Anthony Kougkas, Bogdan Nicolae |
| 2024 | VitBit: Enhancing Embedded GPU Performance for AI Workloads through Register Operand Packing. Jaebeom Jeon, Minseong Gil, Junsu Kim, Jaeyong Park, Gunjae Koo, Myung Kuk Yoon, Yunho Oh |
| 2024 | Yggdrasil: Reducing Network I/O Tax with (CXL-Based) Distributed Shared Memory. Wenda Tang, Ying Han, Tianxiang Ai, Guanghui Li, Bin Yu, Xin Yang |
| 2024 | zQoS: Unleashing full performance capabilities of NVMe SSDs while enforcing SLOs in distributed storage systems. Liuying Ma, Zhenqing Liu, Jin Xiong, Yue Wu, Renhai Chen, Xi Peng, Ying Zhang, Gong Zhang, Dejun Jiang |