| 2025 | A4: Microarchitecture-Aware LLC Management for Datacenter Servers with Emerging I/O Devices. Haneul Park, Jiaqi Lou, Sangjin Lee, Yifan Yuan, KyoungSoo Park, Yongseok Son, Ipoom Jeong, Nam Sung Kim |
| 2025 | AIM: Software and Hardware Co-design for Architecture-level IR-drop Mitigation in High-performance PIM. Yuanpeng Zhang, Xing Hu, Xi Chen, Zhihang Yuan, Cong Li, Jingchen Zhu, Zhao Wang, Chenguang Zhang, Xin Si, Wei Gao, Qiang Wu, Runsheng Wang, Guangyu Sun |
| 2025 | AMALI: An Analytical Model for Accurately Modeling LLM Inference on Modern GPUs. Shiheng Cao, Junmin Wu, Junshi Chen, Hong An, Zhibin Yu |
| 2025 | ANSMET: Approximate Nearest Neighbor Search with Near-Memory Processing and Hybrid Early Termination. Yiwei Li, Yuxin Jin, Boyu Tian, Huanchen Zhang, Mingyu Gao |
| 2025 | ANVIL: An In-Storage Accelerator for Name-Value Data Stores. Ryan Wong, Nikita Kim, Aniket Das, Kevin Higgs, Engin Ipek, Sapan Agarwal, Saugata Ghose, Ben Feinberg |
| 2025 | AQB8: Energy-Efficient Ray Tracing Accelerator through Multi-Level Quantization. Yen-Chieh Huang, Chen-Pin Yang, Tsung Tai Yeh |
| 2025 | ARTERY: Fast Quantum Feedback using Branch Prediction. Wuwei Tian, Liqiang Lu, Siwei Tan, Yun Liang, Tingting Li, Kaiwen Zhou, Xinghui Jia, Jianwei Yin |
| 2025 | ATiM: Autotuning Tensor Programs for Processing-in-DRAM. Yongwon Shin, Dookyung Kang, Hyojin Sung |
| 2025 | Accelerating Simulation of Quantum Circuits under Noise via Computational Reuse. Meng Wang, Swamit Tannu, Prashant J. Nair |
| 2025 | Adaptive CHERI Compartmentalization for Heterogeneous Accelerators. Jianyi Cheng, A. Theodore Markettos, Alexandre Joannou, Paul Metzger, Matthew Naylor, Peter Rugg, Timothy M. Jones |
| 2025 | AiF: Accelerating On-Device LLM Inference Using In-Flash Processing. Jaeyong Lee, Hyeunjoo Kim, Sanghun Oh, Myoungjun Chun, Myungsuk Kim, Jihong Kim |
| 2025 | ArtMem: Adaptive Migration in Reinforcement Learning-Enabled Tiered Memory. Xinyue Yi, Hongchao Du, Yu Wang, Jie Zhang, Qiao Li, Chun Jason Xue |
| 2025 | Assassyn: A Unified Abstraction for Architectural Simulation and Implementation. Jian Weng, Boyang Han, Derui Gao, Ruijie Gao, Wanning Zhang, An Zhong, Ceyu Xu, Jihao Xin, Yangzhixin Luo, Lisa Wu Wills, Marco Canini |
| 2025 | Avalanche: Optimizing Cache Utilization via Matrix Reordering for Sparse Matrix Multiplication Accelerator. Gwangeun Byeon, Seongwook Kim, Hyungjin Kim, Sukhyun Han, Jinkwon Kim, Prashant J. Nair, Taewook Kang, Seokin Hong |
| 2025 | Avant-Garde: Empowering GPUs with Scaled Numeric Formats. Minseong Gil, Dongho Ha, Simla Burcu Harma, Myung Kuk Yoon, Babak Falsafi, Won Woo Ro, Yunho Oh |
| 2025 | BingoGCN: Towards Scalable and Efficient GNN Acceleration with Fine-Grained Partitioning and SLT. Jiale Yan, Hiroaki Ito, Yuta Nagahara, Kazushi Kawamura, Masato Motomura, Thiem Van Chu, Daichi Fujiki |
| 2025 | Bishop: Sparsified Bundling Spiking Transformers on Heterogeneous Cores with Error-constrained Pruning. Boxun Xu, Yuxuan Yin, Vikram Iyer, Peng Li |
| 2025 | CORD: Low-Latency, Bandwidth-Efficient and Scalable Release Consistency via Directory Ordering. Yanpeng Yu, Nicolai Oswald, Anurag Khandelwal |
| 2025 | CaliQEC: In-situ Qubit Calibration for Surface Code Quantum Error Correction. Xiang Fang, Keyi Yin, Yuchen Zhu, Jixuan Ruan, Dean Tullsen, Zhiding Liang, Andrew Sornborger, Ang Li, Travis S. Humble, Yufei Ding, Yunong Shi |
| 2025 | Cambricon-SR: An Accelerator for Neural Scene Representation with Sparse Encoding Table. Tianbo Liu, Xinkai Song, Zhifei Yue, Rui Wen, Xing Hu, Zhuoran Song, Yuanbo Wen, Yifan Hao, Wei Li, Zidong Du, Rui Zhang, Jiaming Guo, Di Huang, Shaohui Peng, Guangzhong Sun, Qi Guo, Tianshi Chen |
| 2025 | Caravan: A Hardware/Software Co-Design for Efficient SIMD Neighbor Search on Point Clouds. Pedro Henrique Exenberger Becker, Franyell Silfa, José-María Arnau, Antonio González |
| 2025 | Cassandra: Efficient Enforcement of Sequential Execution for Cryptographic Programs. Ali Hajiabadi, Trevor E. Carlson |
| 2025 | Chimera: Communication Fusion for Hybrid Parallelism in Large Language Models. Le Qin, Junwei Cui, Weilin Cai, Jiayi Huang |
| 2025 | Chip Architectures Under Advanced Computing Sanctions. August Ning, David Wentzlaff |
| 2025 | Concorde: Fast and Accurate CPU Performance Modeling with Compositional Analytical-ML Fusion. Arash Nasr-Esfahany, Mohammad Alizadeh, Victor Lee, Hanna Alam, Brett W. Coon, David E. Culler, Vidushi Dadu, Martin Dixon, Henry M. Levy, Santosh Pandey, Parthasarathy Ranganathan, Amir Yazdanbakhsh |
| 2025 | Constant-Rate Entanglement Distillation for Fast Quantum Interconnects. Christopher A. Pattison, Gefen Baranes, Juan Pablo Bonilla Ataides, Mikhail D. Lukin, Hengyun Zhou |
| 2025 | CoopRT: Accelerating BVH Traversal for Ray Tracing via Cooperative Threads. Yavuz Selim Tozlu, Huiyang Zhou |
| 2025 | Cramming a Data Center into One Cabinet, a Co-Exploration of Computing and Hardware Architecture of Waferscale Chip. Xingmao Yu, Dingcheng Jiang, Jinyi Deng, Jingyao Liu, Chao Li, Shouyi Yin, Yang Hu |
| 2025 | DCPerf: An Open-Source, Battle-Tested Performance Benchmark Suite for Datacenter Workloads. Wei Su, Abhishek Dhanotia, Carlos Torres, Jayneel Gandhi, Neha Gholkar, Shobhit O. Kanaujia, Maxim Naumov, Kalyan Subramanian, Valentin Andrei, Yifan Yuan, Chunqiang Tang |
| 2025 | DREAM: Enabling Low-Overhead Rowhammer Mitigation via Directed Refresh Management. Hritvik Taneja, Moinuddin K. Qureshi |
| 2025 | DReX: Accurate and Scalable Dense Retrieval Acceleration via Algorithmic-Hardware Codesign. Derrick Quinn, E. Ezgi Yücel, Martin Prammer, Zhenxing Fan, Kevin Skadron, Jignesh M. Patel, José F. Martínez, Mohammad Alian |
| 2025 | DS-TPU: Dynamical System for on-Device Lifelong Graph Learning with Nonlinear Node Interaction. Chunshu Wu, Ruibing Song, Chuan Liu, Pouya Haghi, Ang Li, Michael Huang, Tony Tong Geng |
| 2025 | DX100: Programmable Data Access Accelerator for Indirection. Alireza Khadem, Kamalavasan Kamalakkannan, Zhenyan Zhu, Akash Poptani, Yufeng Gu, Jered Benjamin Dominguez-Trujillo, Nishil Talati, Daichi Fujiki, Scott A. Mahlke, Galen M. Shipman, Reetuparna Das |
| 2025 | Dadu-Corki: Algorithm-Architecture Co-Design for Embodied AI-powered Robotic Manipulation. Yiyang Huang, Yuhui Hao, Bo Yu, Feng Yan, Yuxin Yang, Feng Min, Yinhe Han, Lin Ma, Shaoshan Liu, Qiang Liu, Yiming Gan |
| 2025 | Debunking the CUDA Myth Towards GPU-based AI Systems: Evaluation of the Performance and Programmability of Intel's Gaudi NPU for AI Model Serving. Yunjae Lee, Juntaek Lim, Jehyeon Bang, Eunyeong Cho, Huijong Jeong, Taesu Kim, Hyungjun Kim, Joonhyung Lee, Jinseop Im, Ranggi Hwang, Se Jung Kwon, Dongsoo Lee, Minsoo Rhu |
| 2025 | DiTile-DGNN: An Efficient Accelerator for Distributed Dynamic Graph Neural Network Inference. Jiaqi Yang, Hao Zheng, Ahmed Louri |
| 2025 | Dynamic Load Balancer in Intel Xeon Scalable Processor: Performance Analyses, Enhancements, and Guidelines. Jiaqi Lou, Srikar Vanavasam, Yifan Yuan, Ren Wang, Nam Sung Kim |
| 2025 | EOD: Enabling Low Latency GNN Inference via Near-Memory Concatenate Aggregation. Taehwan Kim, Yunki Han, Seohye Ha, Jiwan Kim, Lee-Sup Kim |
| 2025 | Ecco: Improving Memory Bandwidth and Capacity for LLMs via Entropy-Aware Cache Compression. Feng Cheng, Cong Guo, Chiyue Wei, Junyao Zhang, Changchun Zhou, Edward Hanson, Jiaqi Zhang, Xiaoxiao Liu, Hai Li, Yiran Chen |
| 2025 | Enabling Ahead Prediction with Practical Energy Constraints. Lingzhe Chester Cai, Aniket Deshmukh, Yale N. Patt |
| 2025 | Evaluating Ruche Networks: Physically Scalable, Cost-Effective, Bandwidth-Flexible NoCs. Dai Cheol Jung, Michael B. Taylor |
| 2025 | FAST: An FHE Accelerator for Scalable-parallelism with Tunable-bit. Shengyu Fan, Xianglong Deng, Liang Kong, Guiming Shi, Guang Fan, Dan Meng, Rui Hou, Mingzhe Zhang |
| 2025 | FATE: Boosting the Performance of Hyper-Dimensional Computing Intelligence with Flexible Numerical DAta TypE. Haomin Li, Fangxin Liu, Yichi Chen, Zongwu Wang, Shiyuan Huang, Ning Yang, Dongxu Lyu, Li Jiang |
| 2025 | FRED: A Wafer-scale Fabric for 3D Parallel DNN Training. Saeed Rashidi, William Won, Sudarshan Srinivasan, Puneet Gupta, Tushar Krishna |
| 2025 | Fair-CO2: Fair Attribution for Cloud Carbon Emissions. Leo Han, Jash Kakadia, Benjamin C. Lee, Udit Gupta |
| 2025 | Finesse: An Agile Design Framework for Pairing-based Cryptography via Software/Hardware Co-Design. Tianwei Pan, Tianao Dai, Jianlei Yang, Hongbin Jing, Yang Su, Zeyu Hao, Xiaotao Jia, Chunming Hu, Weisheng Zhao |
| 2025 | FlexNeRFer: A Multi-Dataflow, Adaptive Sparsity-Aware Accelerator for On-Device NeRF Rendering. Seock-Hwan Noh, Banseok Shin, Jeik Choi, Seungpyo Lee, Jaeha Kung, Yeseong Kim |
| 2025 | Folded Banks: 3D-Stacked HBM Design for Fine-Grained Random-Access Bandwidth. Vignesh Adhinarayanan, Bradford M. Beckmann, Wantong Li, Mohammad Seyedzadeh, Sergey Blagodurov, Derrick Aguren, Hayden Hyungdong Lee |
| 2025 | Forest: Access-aware GPU UVM Management. Mao Lin, Yuan Feng, Guilherme Cox, Hyeran Jeon |
| 2025 | GCStack+GCScaler: Fast and Accurate GPU Performance Analyses Using Fine-Grained Stall Cycle Accounting and Interval Analysis. Hanna Cha, Sungchul Lee, Jounghoo Lee, Yeonan Ha, Joonsung Kim, Youngsok Kim |
| 2025 | GPUs All Grown-Up: Fully Device-Driven SpMV Using GPU Work Graphs. Fabian Wildgrube, Pete Ehrett, Paul Trojahn, Richard Membarth, Bradford M. Beckmann, Dominik Baumeister, Matthäus G. Chajdas |
| 2025 | Garibaldi: A Pairwise Instruction-Data Management for Enhancing Shared Last-Level Cache Performance in Server Workloads. Jaewon Kwon, Yongju Lee, Jiwan Kim, Enhyeok Jang, Hongju Kal, Won Woo Ro |
| 2025 | Genesis: A Compiler for Hamiltonian Simulation on Hybrid CV-DV Quantum Computers. Zihan Chen, Jiakang Li, Minghao Guo, Henry Chen, Zirui Li, Joel Bierman, Yipeng Huang, Huiyang Zhou, Yuan Liu, Eddy Z. Zhang |
| 2025 | H Cong Li, Yihan Yin, Xintong Wu, Jingchen Zhu, Zhutianya Gao, Dimin Niu, Qiang Wu, Xin Si, Yuan Xie, Chen Zhang, Guangyu Sun |
| 2025 | HPVM-HDC: A Heterogeneous Programming System for Accelerating Hyperdimensional Computing. Russel Arbore, Xavier Routh, Abdul Rafae Noor, Akash Kothari, Haichao Yang, Weihong Xu, Sumukh Pinge, Minxuan Zhou, Tajana Rosing, Vikram S. Adve |
| 2025 | HYTE: Flexible Tiling for Sparse Accelerators via Hybrid Static-Dynamic Approaches. Xintong Li, Zhiyao Li, Mingyu Gao |
| 2025 | HardHarvest: Hardware-Supported Core Harvesting for Microservices. Jovan Stojkovic, Chunao Liu, Muhammad Shahbaz, Josep Torrellas |
| 2025 | Hardware-aware Calibration Protocol for Quantum Computers. Yuchen Zhu, Jinglei Cheng, Boxi Li, Kecheng Liu, Yidong Zhou, Hanrui Wang, Yufei Ding, Zhiding Liang |
| 2025 | Heliostat: Harnessing Ray Tracing Accelerators for Page Table Walks. Yuan Feng, Yuke Li, Jiwon Lee, Won Woo Ro, Hyeran Jeon |
| 2025 | Hermes: Algorithm-System Co-design for Efficient Retrieval-Augmented Generation At-Scale. Michael Shen, Muhammad Umar, Kiwan Maeng, G. Edward Suh, Udit Gupta |
| 2025 | HeterRAG: Heterogeneous Processing-in-Memory Acceleration for Retrieval-augmented Generation. Chaoqiang Liu, Haifeng Liu, Dan Chen, Yu Huang, Yi Zhang, Wenjing Xiao, Xiaofei Liao, Hai Jin |
| 2025 | HiPER: Hierarchically-Composed Processing for Efficient Robot Learning-Based Control. Justin Ting, Minsik Kim, Junkang Zhu, Haotian Sheng, Zhengya Zhang |
| 2025 | Hybe: GPU-NPU Hybrid System for Efficient LLM Inference with Million-Token Context Window. Seungjae Moon, Junseo Cha, Hyunjun Park, Joo-Young Kim |
| 2025 | Hybrid SLC-MLC RRAM Mixed-Signal Processing-in-Memory Architecture for Transformer Acceleration via Gradient Redistribution. Chang Eun Song, Priyansh Bhatnagar, Zihan Xia, Nam Sung Kim, Tajana Rosing, Mingu Kang |
| 2025 | IDEA-GP: Instruction-Driven Architecture with Efficient Online Workload Allocation for Geometric Perception. Suquan Zhang, Yu Hu, Yunfei Xiang, Dawei Zhao, Yuanfan Xu, Qingmin Liao, Jincheng Yu, Yu Wang |
| 2025 | In-Storage Acceleration of Retrieval Augmented Generation as a Service. Rohan Mahapatra, Harsha Santhanam, Christopher Priebe, Hanyang Xu, Hadi Esmaeilzadeh |
| 2025 | InfiniMind: A Learning-Optimized Large-Scale Brain-Computer Interface. Yeongwoo Jang, Daye Jung, Seunghyun Song, Hunjun Lee, Jangwoo Kim |
| 2025 | Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures. Chenggang Zhao, Chengqi Deng, Chong Ruan, Damai Dai, Huazuo Gao, Jiashi Li, Liyue Zhang, Panpan Huang, Shangyan Zhou, Shirong Ma, Wenfeng Liang, Ying He, Yuqing Wang, Yuxuan Liu, Y. X. Wei |
| 2025 | LIA: A Single-GPU LLM Inference Acceleration with Cooperative AMX-Enabled CPU-GPU Computation and CXL Offloading. Hyungyo Kim, Nachuan Wang, Qirong Xia, Jinghan Huang, Amir Yazdanbakhsh, Nam Sung Kim |
| 2025 | LUT Tensor Core: A Software-Hardware Co-Design for LUT-Based Low-Bit LLM Inference. Zhiwen Mo, Lei Wang, Jianyu Wei, Zhichen Zeng, Shijie Cao, Lingxiao Ma, Naifeng Jing, Ting Cao, Jilong Xue, Fan Yang, Mao Yang |
| 2025 | Leveraging control-flow similarity to reduce branch predictor cold effects in microservices. Haris Volos, Stylianos Vassiliou, Georgia Antoniou, Davide Basilio Bartolini, Yiannakis Sazeides |
| 2025 | Light-weight Cache Replacement for Instruction Heavy Workloads. Saba Mostofi, Setu Gupta, Ahmad Hassani, Krishnam Tibrewala, Elvira Teran, Paul V. Gratz, Daniel A. Jiménez |
| 2025 | LightML: A Photonic Accelerator for Efficient General Purpose Machine Learning. Liang Liu, Sadra Rahimi Kari, Xin Xin, Nathan Youngblood, Youtao Zhang, Jun Yang |
| 2025 | LightNobel: Improving Sequence Length Limitation in Protein Structure Prediction Model via Adaptive Activation Quantization. Seunghee Han, Soongyu Choi, Joo-Young Kim |
| 2025 | Lumina: Real-Time Neural Rendering by Exploiting Computational Redundancy. Yu Feng, Weikai Lin, Yuge Cheng, Zihan Liu, Jingwen Leng, Minyi Guo, Chen Chen, Shixuan Sun, Yuhao Zhu |
| 2025 | MD-pipe: A Strong Scaling Enhanced Pipeline Architecture for Ab Initio Accuracy Molecular Dynamics. Ning Kang, Guojun Yuan, Zihan Yan, Beining Zhang, Boyang Li, Zeyu Li, Shuo Wang, Guanglei Chen, Jiayi Rao, Zhan Wang, Weile Jia, Ninghui Sun, Guangming Tan |
| 2025 | Magellan: A High-Performance Loop-Guided Prefetcher for Indirect Memory Access. Gelin Fu, Tian Xia, Mingzhuo Yin, Prashant J. Nair, Mieszko Lis, Pengju Ren |
| 2025 | MagiCache: A Virtual In-Cache Computing Engine. Renhao Fan, Yikai Cui, Weike Li, Mingyu Wang, Zhaolin Li |
| 2025 | MeshSlice: Efficient 2D Tensor Parallelism for Distributed DNN Training. Hyoungwook Nam, Gerasimos Gerogiannis, Josep Torrellas |
| 2025 | Meta's Second Generation AI Chip: Model-Chip Co-Design and Productionization Experiences. Joel Coburn, Chunqiang Tang, Sameer Abu Asal, Neeraj Agrawal, Raviteja Chinta, Harish Dattatraya Dixit, Brian Dodds, Saritha Dwarakapuram, Amin Firoozshahian, Cao Gao, Kaustubh Gondkar, Tyler Graf, Junhan Hu, Jian Huang, Sterling Hughes, Adam Hutchin, Bhasker Jakka, Guoqiang Jerry Chen, Indu Kalyanaraman, Ashwin Kamath, Pankaj Kansal, Erum Kazi, Roman Levenstein, Mahesh Maddury, Alex Mastro, Siji Medaiyese, Pritesh Modi, Jack Montgomery, Nadathur Satish, Amit Nagpal, Ashwin Narasimha, Maxim Naumov, Eleanor Ozer, Jongsoo Park, Poorvaja Ramani, Harikrishna Reddy, David Reiss, Deboleena Roy, Sathish Sekar, Arushi Sharma, Pavan Shetty, Aravind Sukumaran-Rajam, Eran Tal, Mike Tsai, Shreya Varshini, Richard Wareing, Olívia Wu, Xiaolong Xie, Jinghan Yang, Hangchen Yu, Tanmay Zargar, Zitong Zeng, Feixiong Zhang, Ajit Mathews, Xun Jiao, Jiyuan Zhang, Emmanuel Menage, Truls Edvard Stokke, Mohammed Sourouri |
| 2025 | MicroScopiQ: Accelerating Foundational Models through Outlier-Aware Microscaling Quantization. Akshat Ramachandran, Souvik Kundu, Tushar Krishna |
| 2025 | MoPAC: Efficiently Mitigating Rowhammer with Probabilistic Activation Counting. Suhas Vittal, Salman Qazi, Poulami Das, Moinuddin Qureshi |
| 2025 | NMP-PaK: Near-Memory Processing Acceleration of Scalable De Novo Genome Assembly. Heewoo Kim, Sanjay Sri Vallabh Singapuram, Haojie Ye, Joseph Izraelevitz, Trevor N. Mudge, Ronald G. Dreslinski, Nishil Talati |
| 2025 | NUPEA: Optimizing Critical Loads on Spatial Dataflow Architectures via Non-Uniform Processing-Element Access. Souradip Ghosh, Graham Gobieski, Keyi Zhang, Brandon Lucia, Nathan Beckmann, Tony Nowatzki |
| 2025 | Need for zkSpeed: Accelerating HyperPlonk for Zero-Knowledge Proofs. Alhad Daftardar, Jianqiao Mo, Joey Ah-kiow, Benedikt Bünz, Ramesh Karri, Siddharth Garg, Brandon Reagen |
| 2025 | Neo: Towards Efficient Fully Homomorphic Encryption Acceleration using Tensor Core. Dian Jiao, Xianglong Deng, Zhiwei Wang, Shengyu Fan, Yi Chen, Dan Meng, Rui Hou, Mingzhe Zhang |
| 2025 | Neoscope: How Resilient Is My SoC to Workload Churn? Joseph Rogers, Lieven Eeckhout, Taha Soliman, Magnus Jahre |
| 2025 | NetCrafter: Tailoring Network Traffic for Non-Uniform Bandwidth Multi-GPU Systems. Amel Fatima, Yang Yang, Yifan Sun, Rachata Ausavarungnirun, Adwait Jog |
| 2025 | Nyx: Virtualizing dataflow execution on shared FPGA platforms. Panagiotis Miliadis, Dimitris Theodoropoulos, Nectarios Koziris, Dionisios N. Pnevmatikatos |
| 2025 | Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization. Minsu Kim, Seongmin Hong, Ryeowook Ko, Soongyu Choi, Hunjong Lee, Junsoo Kim, Joo-Young Kim, Jongse Park |
| 2025 | OptiPIM: Optimizing Processing-in-Memory Acceleration Using Integer Linear Programming. Jiantao Liu, Minxuan Zhou, Yue Pan, Chien-Yi Yang, Lana Josipovic, Tajana Rosing |
| 2025 | PD Constraint-aware Physical/Logical Topology Co-Design for Network on Wafer. Qize Yang, Taiquan Wei, Sihan Guan, Chengran Li, Haoran Shang, Jinyi Deng, Huizheng Wang, Chao Li, Lei Wang, Yan Zhang, Shouyi Yin, Yang Hu |
| 2025 | Phi: Leveraging Pattern-based Hierarchical Sparsity for High-Efficiency Spiking Neural Networks. Chiyue Wei, Bowen Duan, Cong Guo, Jingyang Zhang, Qingyue Song, Hai Li, Yiran Chen |
| 2025 | Precise exceptions in relaxed architectures. Ben Simner, Alasdair Armstrong, Thomas Bauereiss, Brian Campbell, Ohad Kammar, Jean Pichon-Pharabod, Peter Sewell |
| 2025 | Proceedings of the 52nd Annual International Symposium on Computer Architecture, ISCA 2025, Tokyo, Japan, June 21-25, 2025 |
| 2025 | Process Only Where You Look: Hardware and Algorithm Co-optimization for Efficient Gaze-Tracked Foveated Rendering in Virtual Reality. Haiyu Wang, Wenxuan Liu, Kenneth Chen, Qi Sun, Sai Qian Zhang |
| 2025 | Profile-Guided Temporal Prefetching. Mengming Li, Qijun Zhang, Yichuan Gao, Wenji Fang, Yao Lu, Yongqing Ren, Zhiyao Xie |
| 2025 | PuDHammer: Experimental Analysis of Read Disturbance Effects of Processing-using-DRAM in Real DRAM Chips. Ismail Emir Yuksel, Akash Sood, Ataberk Olgun, Oguzhan Canpolat, Haocong Luo, Nisa Bostanci, Mohammad Sadrosadati, A. Giray Yaglikçi, Onur Mutlu |
| 2025 | QPlacer: Frequency-Aware Component Placement for Superconducting Quantum Computers. Junyao Zhang, Hanrui Wang, Qi Ding, Jiaqi Gu, Reouven Assouly, William D. Oliver, Song Han, Kenneth R. Brown, Hai Li, Yiran Chen |
| 2025 | QR-Map: A Map-Based Approach to Quantum Circuit Abstraction for Qubit Reuse Optimization. Hyungseok Kim, Enhyeok Jang, Seungwoo Choi, Youngmin Kim, Won Woo Ro |
| 2025 | Qtenon: Towards Low-Latency Architecture Integration for Accelerating Hybrid Quantum-Classical Computing. Chenning Tao, Liqiang Lu, Size Zheng, Li-Wen Chang, Minghua Shen, Hanyu Zhang, Fangxin Liu, Kaiwen Zhou, Jianwei Yin |
| 2025 | RAGO: Systematic Performance Optimization for Retrieval-Augmented Generation Serving. Wenqi Jiang, Suvinay Subramanian, Cat Graves, Gustavo Alonso, Amir Yazdanbakhsh, Vidushi Dadu |
| 2025 | RAP: Reconfigurable Automata Processor. Ziyuan Wen, Alexis Le Glaunec, Konstantinos Mamouras, Kaiyuan Yang |
| 2025 | REIS: A High-Performance and Energy-Efficient Retrieval System with In-Storage Processing. Kangqi Chen, Rakesh Nadig, Manos Frouzakis, Nika Mansouri-Ghiasi, Yu Liang, Haiyu Mao, Jisung Park, Mohammad Sadrosadati, Onur Mutlu |
| 2025 | RTSpMSpM: Harnessing Ray Tracing for Efficient Sparse Matrix Computations. Hongrui Zhang, Yunan Zhang, Hung-Wei Tseng |
| 2025 | Reconfigurable Stream Network Architecture. Chengyue Wang, Xiaofan Zhang, Jason Cong, James C. Hoe |
| 2025 | Reinforcement Learning-Guided Graph State Generation in Photonic Quantum Computers. Yingheng Li, Yue Dai, Aditya Pawar, Rongchao Dong, Jun Yang, Youtao Zhang, Xulong Tang |
| 2025 | Resource Analysis of Low-Overhead Transversal Architectures for Reconfigurable Atom Arrays. Hengyun Zhou, Casey Duckering, Chen Zhao, Dolev Bluvstein, Madelyn Cain, Aleksander Kubica, Sheng-Tao Wang, Mikhail D. Lukin |
| 2025 | Rethinking Prefetching for Intermittent Computing. Gan Fang, Jianping Zeng, Aditya Gupta, Changhee Jung |
| 2025 | S-SYNC: Shuttle and Swap Co-Optimization in Quantum Charge-Coupled Devices. Chenghong Zhu, Xian Wu, Jingbo Wang, Xin Wang |
| 2025 | SEAL: A Single-Event Architecture for In-Sensor Visual Localization. Ryan Hou, Thomas Twomey, Vasileios Milionis, Evangelos Dikopoulos, Tianrui Ma, Yuhao Zhu, Georgios Tzimpragos |
| 2025 | SWIPER: Minimizing Fault-Tolerant Quantum Program Latency via Speculative Window Decoding. Joshua Viszlai, Jason D. Chadwick, Sarang Joshi, Gokul Subramanian Ravi, Yanjing Li, Frederic T. Chong |
| 2025 | Scaling Llama 3 Training with Efficient Parallelism Strategies. Weiwei Chu, Xinfeng Xie, Jiecao Yu, Jie Wang, Amar Phanishayee, Chunqiang Tang, Yuchen Hao, Jianyu Huang, Mustafa Ozdal, Jun Wang, Vedanuj Goswami, Naman Goyal, Abhishek Kadian, Andrew Gu, Chris Cai, Feng Tian, Xiaodong Wang, Min Si, Pavan Balaji, Ching-Hsiang Chu, Jongsoo Park |
| 2025 | Single Spike Artificial Neural Networks. Rhys Gretsch, Michael Beyeler, Jeremy Lau, Timothy Sherwood |
| 2025 | Single-Address-Space FaaS with Jord. Yuanlong Li, Atri Bhattacharyya, Madhur Kumar, Abhishek Bhattacharjee, Yoav Etsion, Babak Falsafi, Sanidhya Kashyap, Mathias Payer |
| 2025 | SpecASan: Mitigating Transient Execution Attacks Using Speculative Address Sanitization. Saber Ganjisaffar, Esmaeil Mohammadian Koruyeh, Jason Zellmer, Hodjat Asghari Esfeden, Chengyu Song, Nael B. Abu-Ghazaleh |
| 2025 | SpecEE: Accelerating Large Language Model Inference with Speculative Early Exiting. Jiaming Xu, Jiayi Pan, Yongkang Zhou, Siming Chen, Jinhao Li, Yaoxiu Lian, Junyi Wu, Guohao Dai |
| 2025 | SwitchQNet: Optimizing Distributed Quantum Computing for Quantum Data Centers with Switch Networks. Hezi Zhang, Yiran Xu, Haotian Hu, Keyi Yin, Hassan Shapourian, Jiapeng Zhao, Ramana Rao Kompella, Reza Nejabati, Yufei Ding |
| 2025 | Synchronization for Fault-Tolerant Quantum Computers. Satvik Maurya, Swamit Tannu |
| 2025 | TRACI: Network Acceleration of Input-Dynamic Communication for Large-Scale Deep Learning Recommendation Model. Guyue Huang, Hao Li, Le Qin, Jiayi Huang, Yangwook Kang, Yufei Ding, Yuan Xie |
| 2025 | Telos: A Dataflow Accelerator for Sparse Triangular Solver of Partial Differential Equations. Xiaochen Hao, Hao Luo, Chu Wang, Chao Yang, Yun Liang |
| 2025 | The Sparsity-Aware LazyGPU Architecture. Changxi Liu, Miao Yu, Yifan Sun, Trevor E. Carlson |
| 2025 | The XOR Cache: A Catalyst for Compression. Zhewen Pan, Joshua San Miguel |
| 2025 | Topology-Aware Virtualization over Inter-Core Connected Neural Processing Units. Dahu Feng, Erhu Feng, Dong Du, Pinjie Xu, Yubin Xia, Haibo Chen, Rong Zhao |
| 2025 | Transitive Array: An Efficient GEMM Accelerator with Result Reuse. Cong Guo, Chiyue Wei, Jiaming Tang, Bowen Duan, Song Han, Hai Li, Yiran Chen |
| 2025 | TrioSim: A Lightweight Simulator for Large-Scale DNN Workloads on Multi-GPU Systems. Ying Li, Yuhui Bao, Gongyu Wang, Xinxin Mei, Pranav Vaid, Anandaroop Ghosh, Adwait Jog, Darius Bunandar, Ajay Joshi, Yifan Sun |
| 2025 | UGPU: Dynamically Constructing Unbalanced GPUs for Enhanced Resource Efficiency. Xia Zhao, Guangda Zhang, Lu Wang, Huadong Dai |
| 2025 | UPP: Universal Predicate Pushdown to Smart Storage. Ipoom Jeong, Jinghan Huang, Chuxuan Hu, Dohyun Park, Jaeyoung Kang, Nam Sung Kim, Yongjoo Park |
| 2025 | Unified Memory Protection with Multi-granular MAC and Integrity Tree for Heterogeneous Processors. Sunho Lee, Seonjin Na, Jeongwon Choi, Jinwon Pyo, Jaehyuk Huh |
| 2025 | Variational Quantum Algorithms in the era of Early Fault Tolerance. Siddharth Dangwal, Suhas Vittal, Lennart Maximilian Seifert, Frederic T. Chong, Gokul Subramanian Ravi |
| 2025 | WSC-LLM: Efficient LLM Service and Architecture Co-exploration for Wafer-scale Chips. Zheng Xu, Dehao Kong, Jiaxin Liu, Jinxi Li, Jingxiang Hou, Xu Dai, Chao Li, Shaojun Wei, Yang Hu, Shouyi Yin |
| 2025 | WarmCache: Exploiting STT-RAM Cache for Low-Power Intermittent Systems. Noureldin Hassan, Byounguk Min, Changhee Jung, Yan Solihin, Jongouk Choi |
| 2025 | When Mitigations Backfire: Timing Channel Attacks and Defense for PRAC-Based RowHammer Mitigations. Jeonghyun Woo, Joyce Qu, Gururaj Saileshwar, Prashant Jayaprakash Nair |
| 2025 | WindServe: Efficient Phase-Disaggregated LLM Serving with Stream-based Dynamic Scheduling. Jingqi Feng, Yukai Huang, Rui Zhang, Sicheng Liang, Ming Yan, Jie Wu |
| 2025 | XHarvest: Rethinking High-Performance and Cost-Efficient SSD Architecture with CXL-Driven Harvesting. Li Peng, Wenbo Wu, Shushu Yi, Xianzhang Chen, Chenxi Wang, Shengwen Liang, Zhe Wang, Nong Xiao, Qiao Li, Mingzhe Zhang, Jie Zhang |
| 2025 | Zettafly: A Network Topology with Flexible Non-blocking Regions for Large-scale AI and HPC Systems. Dezun Dong, Ziyu Wang, Fei Lei |