| 2024 | (MC)²: Lazy MemCopy at the Memory Controller. Aditya K. Kamath, Simon Peter |
| 2024 | 51st ACM/IEEE Annual International Symposium on Computer Architecture, ISCA 2024, Buenos Aires, Argentina, June 29 - July 3, 2024 |
| 2024 | A New Formulation of Neural Data Prefetching. Quang Duong, Akanksha Jain, Calvin Lin |
| 2024 | A SAT Scalpel for Lattice Surgery: Representation and Synthesis of Subroutines for Surface-Code Fault-Tolerant Quantum Computing. Daniel Bochen Tan, Murphy Yuezhen Niu, Craig Gidney |
| 2024 | A Tale of Two Domains: Exploring Efficient Architecture Design for Truly Autonomous Things. Xiaofeng Hou, Tongqiao Xu, Chao Li, Cheng Xu, Jiacheng Liu, Yang Hu, Jieru Zhao, Jingwen Leng, Kwang-Ting Cheng, Minyi Guo |
| 2024 | AIO: An Abstraction for Performance Analysis Across Diverse Accelerator Architectures. Joseph Rogers, Taha Soliman, Magnus Jahre |
| 2024 | ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching. Youpeng Zhao, Di Wu, Jun Wang |
| 2024 | AVM-BTB: Adaptive and Virtualized Multi-level Branch Target Buffer. Yunzhe Liu, Xinyu Li, Tingting Zhang, Tianyi Liu, Qi Guo, Fuxin Zhang, Jian Wang |
| 2024 | Alternate Path Fetch. Aniket Deshmukh, Lingzhe Chester Cai, Yale N. Patt |
| 2024 | Alternate Path μ-op Cache Prefetching. Sawan Singh, Arthur Perais, Alexandra Jimborean, Alberto Ros |
| 2024 | Atomique: A Quantum Compiler for Reconfigurable Neutral Atom Arrays. Hanrui Wang, Pengyu Liu, Daniel Bochen Tan, Yilian Liu, Jiaqi Gu, David Z. Pan, Jason Cong, Umut A. Acar, Song Han |
| 2024 | BLESS: Bandwidth and Locality Enhanced SMEM Seeding Acceleration for DNA Sequencing. Seunghee Han, Seungjae Moon, Teokkyu Suh, Jaehoon Heo, Joo-Young Kim |
| 2024 | Barre Chord: Efficient Virtual Memory Translation for Multi-Chip-Module GPUs. Yuan Feng, Seonjin Na, Hyesoon Kim, Hyeran Jeon |
| 2024 | BitNN: A Bit-Serial Accelerator for K-Nearest Neighbor Search in Point Clouds. Meng Han, Liang Wang, Limin Xiao, Hao Zhang, Tianhao Cai, Jiale Xu, Yibo Wu, Chenhao Zhang, Xiangrong Xu |
| 2024 | BlissCam: Boosting Eye Tracking Efficiency with Learned In-Sensor Sparse Sampling. Yu Feng, Tianrui Ma, Yuhao Zhu, Xuan Zhang |
| 2024 | BlitzCoin: Fully Decentralized Hardware Power Management for Accelerator-Rich SoCs. Martin Cochet, Karthik Swaminathan, Erik Jens Loscalzo, Joseph Zuckerman, Maico Cassel dos Santos, Davide Giri, Alper Buyuktosunoglu, Tianyu Jia, David Brooks, Gu-Yeon Wei, Kenneth L. Shepard, Luca P. Carloni, Pradip Bose |
| 2024 | Bosehedral: Compiler Optimization for Bosonic Quantum Computing. Junyu Zhou, Yuhao Liu, Yunong Shi, Ali Javadi-Abhari, Gushu Li |
| 2024 | Cambricon-D: Full-Network Differential Acceleration for Diffusion Models. Weihao Kong, Yifan Hao, Qi Guo, Yongwei Zhao, Xinkai Song, Xiaqing Li, Mo Zou, Zidong Du, Rui Zhang, Chang Liu, Yuanbo Wen, Pengwei Jin, Xing Hu, Wei Li, Zhiwei Xu, Tianshi Chen |
| 2024 | Cicero: Addressing Algorithmic and Architectural Bottlenecks in Neural Rendering by Radiance Warping and Memory Optimizations. Yu Feng, Zihan Liu, Jingwen Leng, Minyi Guo, Yuhao Zhu |
| 2024 | Circular Reconfigurable Parallel Processor for Edge Computing: Industrial Product. Yuan Li, Jianbin Zhu, Yao Fu, Yu Lei, Toshio Nagata, Ryan Braidwood, Haohuan Fu, Juepeng Zheng, Wayne Luk, Hongxiang Fan |
| 2024 | Collision Prediction for Robotics Accelerators. Deval Shah, Tor M. Aamodt |
| 2024 | Compiler-Directed Whole-System Persistence. Jianping Zeng, Tong Zhang, Changhee Jung |
| 2024 | Constable: Improving Performance and Power Efficiency by Safely Eliminating Load Instruction Execution. Rahul Bera, Adithya Ranganathan, Joydeep Rakshit, Sujit Mahto, Anant V. Nori, Jayesh Gaur, Ataberk Olgun, Konstantinos Kanellopoulos, Mohammad Sadrosadati, Sreenivas Subramoney, Onur Mutlu |
| 2024 | Counter-light Memory Encryption. Xin Wang, Jagadish Kotra, Alex Jones, Wenjie Xiong, Xun Jian |
| 2024 | DACAPO: Accelerating Continuous Learning in Autonomous Systems for Video Analytics. Yoonsung Kim, Changhun Oh, Jinwoo Hwang, Wonung Kim, Seongryong Oh, Yubin Lee, Hardik Sharma, Amir Yazdanbakhsh, Jongse Park |
| 2024 | DRAMScope: Uncovering DRAM Microarchitecture and Characteristics by Issuing Memory Commands. Hwayong Nam, Seungmin Baek, Minbok Wi, Michael Jaemin Kim, Jaehyun Park, Chihun Song, Nam Sung Kim, Jung Ho Ahn |
| 2024 | DS-GL: Advancing Graph Learning via Harnessing Nature's Power within Scalable Dynamical Systems. Ruibing Song, Chunshu Wu, Chuan Liu, Ang Li, Michael C. Huang, Tong Geng |
| 2024 | Derm: SLA-aware Resource Management for Highly Dynamic Microservices. Liao Chen, Shutian Luo, Chenyu Lin, Zizhao Mo, Huanle Xu, Kejiang Ye, Chengzhong Xu |
| 2024 | Designing Cloud Servers for Lower Carbon. Jaylen Wang, Daniel S. Berger, Fiodar Kazhamiaka, Celine Irvene, Chaojie Zhang, Esha Choukse, Kali Frost, Rodrigo Fonseca, Brijesh Warrier, Chetan Bansal, Jonathan Stern, Ricardo Bianchini, Akshitha Sriraman |
| 2024 | Determining the Minimum Number of Virtual Networks for Different Coherence Protocols. Weihang Li, Andrés Goens, Nicolai Oswald, Vijay Nagarajan, Daniel J. Sorin |
| 2024 | DyLeCT: Achieving Huge-page-like Translation Performance for Hardware-compressed Memory. Gagandeep Panwar, Muhammad Laghari, Esha Choukse, Xun Jian |
| 2024 | EcoFaaS: Rethinking the Design of Serverless Environments for Energy Efficiency. Jovan Stojkovic, Nikoleta Iliakopoulou, Tianyin Xu, Hubertus Franke, Josep Torrellas |
| 2024 | ElasticRec: A Microservice-based Model Serving Architecture Enabling Elastic Resource Scaling for Recommendation Models. Yujeong Choi, Jiin Kim, Minsoo Rhu |
| 2024 | Enabling Efficient Large Recommendation Model Training with Near CXL Memory Processing. Haifeng Liu, Long Zheng, Yu Huang, Jingyi Zhou, Chaoqiang Liu, Runze Wang, Xiaofei Liao, Hai Jin, Jingling Xue |
| 2024 | Exploiting Similarity Opportunities of Emerging Vision AI Models on Hybrid Bonding Architecture. Zhiheng Yue, Huizheng Wang, Jiahao Fang, Jinyi Deng, Guangyang Lu, Fengbin Tu, Ruiqi Guo, Yuxuan Li, Yubin Qin, Yang Wang, Chao Li, Huiming Han, Shaojun Wei, Yang Hu, Shouyi Yin |
| 2024 | FEATHER: A Reconfigurable Accelerator with Data Reordering Support for Low-Cost On-Chip Dataflow Switching. Jianming Tong, Anirudh Itagi, Prasanth Chatarasi, Tushar Krishna |
| 2024 | FireAxe: Partitioned FPGA-Accelerated Simulation of Large-Scale RTL Designs. Joonho Whangbo, Edwin Lim, Chengyi Lux Zhang, Kevin Anderson, Abraham Gonzalez, Raghav Gupta, Nivedha Krishnakumar, Sagar Karandikar, Borivoje Nikolic, Yakun Sophia Shao, Krste Asanovic |
| 2024 | Flagger: Cooperative Acceleration for Large-Scale Cross-Silo Federated Learning Aggregation. Xiurui Pan, Yuda An, Shengwen Liang, Bo Mao, Mingzhe Zhang, Qiao Li, Myoungsoo Jung, Jie Zhang |
| 2024 | GameStreamSR: Enabling Neural-Augmented Game Streaming on Commodity Mobile Platforms. Sandeepa Bhuyan, Ziyu Ying, Mahmut T. Kandemir, Mahanth Gowda, Chita R. Das |
| 2024 | GhOST: a GPU Out-of-Order Scheduling Technique for Stall Reduction. Ishita Chaturvedi, Bhargav Reddy Godala, Yucan Wu, Ziyang Xu, Konstantinos Iliakis, Panagiotis-Eleftherios Eleftherakis, Sotirios Xydis, Dimitrios Soudris, Tyler Sorensen, Simone Campanoni, Tor M. Aamodt, David I. August |
| 2024 | HADES: Hardware-Assisted Distributed Transactions in the Age of Fast Networks and SmartNICs. Apostolos Kokolis, Antonis Psistakis, Benjamin Reidys, Jian Huang, Josep Torrellas |
| 2024 | HAL: Hardware-assisted Load Balancing for Energy-efficient SNIC-Host Cooperative Computing. Jinghan Huang, Jiaqi Lou, Srikar Vanavasam, Xinhao Kong, Houxiang Ji, Ipoom Jeong, Danyang Zhuo, Eun Kyung Lee, Nam Sung Kim |
| 2024 | HEAP: A Fully Homomorphic Encryption Accelerator with Parallelized Bootstrapping. Rashmi S. Agrawal, Anantha P. Chandrakasan, Ajay Joshi |
| 2024 | Harpocrates: Breaking the Silence of CPU Faults through Hardware-in-the-Loop Program Generation. Nikos Karystinos, Odysseas Chatzopoulos, George-Marios Fragkoulis, George Papadimitriou, Dimitris Gizopoulos, Sudhanva Gurumurthi |
| 2024 | Heterogeneous Acceleration Pipeline for Recommendation System Training. Muhammad Adnan, Yassaman Ebrahimzadeh Maboud, Divya Mahajan, Prashant J. Nair |
| 2024 | HiFi-DRAM: Enabling High-fidelity DRAM Research by Uncovering Sense Amplifiers with IC Imaging. Michele Marazzi, Tristan Sachsenweger, Flavien Solt, Peng Zeng, Kubo Takashi, Maksym Yarema, Kaveh Razavi |
| 2024 | Intel Accelerators Ecosystem: An SoC-Oriented Perspective: Industry Product. Yifan Yuan, Ren Wang, Narayan Ranganathan, Nikhil Rao, Sanjay Kumar, Philip Lantz, Vivekananthan Sanjeepan, Jorge Cabrera, Atul Kwatra, Rajesh Sankaran, Ipoom Jeong, Nam Sung Kim |
| 2024 | LLMCompass: Enabling Efficient Hardware Design for Large Language Model Inference. Hengrui Zhang, August Ning, Rohan Baskar Prabhakar, David Wentzlaff |
| 2024 | MAD-Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems. Samuel Hsia, Alicia Golden, Bilge Acun, Newsha Ardalani, Zachary DeVito, Gu-Yeon Wei, David Brooks, Carole-Jean Wu |
| 2024 | MECLA: Memory-Compute-Efficient LLM Accelerator with Scaling Sub-matrix Partition. Yubin Qin, Yang Wang, Zhiren Zhao, Xiaolong Yang, Yang Zhou, Shaojun Wei, Yang Hu, Shouyi Yin |
| 2024 | MegIS: High-Performance, Energy-Efficient, and Low-Cost Metagenomic Analysis with In-Storage Processing. Nika Mansouri-Ghiasi, Mohammad Sadrosadati, Harun Mustafa, Arvid Gollwitzer, Can Firtina, Julien Eudine, Haiyu Mao, Joël Lindegger, Meryem Banu Cavlak, Mohammed Alser, Jisung Park, Onur Mutlu |
| 2024 | Memento: An Adaptive, Compiler-Assisted Register File Cache for GPUs. Mojtaba Abaie Shoushtary, José-María Arnau, Jordi Tubella Murgadas, Antonio González |
| 2024 | MetaLeak: Uncovering Side Channels in Secure Processor Architectures Exploiting Metadata. Md Hafizul Islam Chowdhuryy, Hao Zheng, Fan Yao |
| 2024 | Mind the Gap: Attainable Data Movement and Operational Intensity Bounds for Tensor Algorithms. Qijing Huang, Po-An Tsai, Joel S. Emer, Angshuman Parashar |
| 2024 | Mirage: An RNS-Based Photonic Accelerator for DNN Training. Cansu Demirkiran, Guowei Yang, Darius Bunandar, Ajay Joshi |
| 2024 | NDPBridge: Enabling Cross-Bank Coordination in Near-DRAM-Bank Processing Architectures. Boyu Tian, Yiwei Li, Li Jiang, Shuangyu Cai, Mingyu Gao |
| 2024 | NDSEARCH: Accelerating Graph-Traversal-Based Approximate Nearest Neighbor Search through Near Data Processing. Yitu Wang, Shiyu Li, Qilin Zheng, Linghao Song, Zongwang Li, Andrew Chang, Hai Li, Yiran Chen |
| 2024 | Native DRAM Cache: Re-architecting DRAM as a Large-Scale Cache for Data Centers. Yesin Ryu, Yoojin Kim, Giyong Jung, Jung Ho Ahn, Jungrae Kim |
| 2024 | NeuraChip: Accelerating GNN Computations with a Hash-based Decoupled Spatial Accelerator. Kaustubh Shivdikar, Nicolas Bohm Agostini, Malith Jayaweera, Gilbert Jonatan, José L. Abellán, Ajay Joshi, John Kim, David R. Kaeli |
| 2024 | On Error Correction for Nonvolatile Processing-In-Memory. Hüsrev Cilasun, Salonik Resch, Zamshed I. Chowdhury, Masoud Zabihi, Yang Lv, Brandon Zink, Jianping Wang, Sachin S. Sapatnekar, Ulya R. Karpuzcu |
| 2024 | PID-Comm: A Fast and Flexible Collective Communication Framework for Commodity Processing-in-DIMM Devices. Si Ung Noh, Junguk Hong, Chaemin Lim, Seongyeon Park, Jeehyun Kim, Hanjun Kim, Youngsok Kim, Jinho Lee |
| 2024 | Perspective: A Principled Framework for Pliable and Secure Speculation in Operating Systems. Tae Hoon Kim, David Rudo, Kaiyang Zhao, Zirui Neil Zhao, Dimitrios Skarlatos |
| 2024 | PrIDE: Achieving Secure Rowhammer Mitigation with Low-Cost In-DRAM Trackers. Aamer Jaleel, Gururaj Saileshwar, Stephen W. Keckler, Moinuddin K. Qureshi |
| 2024 | Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference. Ranggi Hwang, Jianyu Wei, Shijie Cao, Changho Hwang, Xiaohu Tang, Ting Cao, Mao Yang |
| 2024 | PreSto: An In-Storage Data Preprocessing System for Training Recommendation Models. Yunjae Lee, Hyeseong Kim, Minsoo Rhu |
| 2024 | QUETZAL: Vector Acceleration Framework for Modern Genome Sequence Analysis Algorithms. Julian Pavon, Iván Vargas Valdivieso, Carlos Rojas, César Hernández, Mehmet Aslan, Roger Figueras, Yichao Yuan, Joël Lindegger, Mohammed Alser, Francesc Moll, Santiago Marco-Sola, Oguz Ergin, Nishil Talati, Onur Mutlu, Osman S. Unsal, Mateo Valero, Adrián Cristal |
| 2024 | QuTracer: Mitigating Quantum Gate and Measurement Errors by Tracing Subsets of Qubits. Peiyi Li, Ji Liu, Alvin Gonzales, Zain H. Saleem, Huiyang Zhou, Paul D. Hovland |
| 2024 | ReAIM: A ReRAM-based Adaptive Ising Machine for Solving Combinatorial Optimization Problems. Hao-Wei Chiang, Chin-Fu Nien, Hsiang-Yun Cheng, Kuei-Po Huang |
| 2024 | Realizing the AMD Exascale Heterogeneous Processor Vision: Industry Product. Alan Smith, Gabriel H. Loh, Michael J. Schulte, Mike Ignatowski, Samuel Naffziger, Mike Mantor, Nathan Kalyanasundharam, Vamsi Alla, Nicholas Malaya, Joseph L. Greathouse, Eric Chapman, Raja Swaminathan |
| 2024 | Scalable, Programmable and Dense: The HammerBlade Open-Source RISC-V Manycore. Dai Cheol Jung, Max Ruttenberg, Paul Gao, Scott Davidson, Daniel Petrisko, Kangli Li, Aditya K. Kamath, Lin Cheng, Shaolin Xie, Peitian Pan, Zhongyuan Zhao, Zichao Yue, Bandhav Veluri, Sripathi Muralitharan, Adrian Sampson, Andrew Lumsdaine, Zhiru Zhang, Christopher Batten, Mark Oskin, Dustin Richmond, Michael Bedford Taylor |
| 2024 | SmartOClock: Workload- and Risk-Aware Overclocking in the Cloud. Jovan Stojkovic, Pulkit A. Misra, Íñigo Goiri, Sam Whitlock, Esha Choukse, Mayukh Das, Chetan Bansal, Jason Lee, Zoey Sun, Haoran Qiu, Reed Zimmermann, Savyasachi Samal, Brijesh Warrier, Ashish Raniwala, Ricardo Bianchini |
| 2024 | Soter: Analytical Tensor-Architecture Modeling and Automatic Tensor Program Tuning for Spatial Accelerators. Fuyu Wang, Minghua Shen, Yufei Ding, Nong Xiao |
| 2024 | Splitwise: Efficient Generative LLM Inference Using Phase Splitting. Pratyush Patel, Esha Choukse, Chaojie Zhang, Aashaka Shah, Íñigo Goiri, Saeed Maleki, Ricardo Bianchini |
| 2024 | Suppressing Correlated Noise in Quantum Computers via Context-Aware Compiling. Alireza Seif, Haoran Liao, Vinay Tripathi, Kevin Krsulich, Moein Malekakhlagh, Mirko Amico, Petar Jurcevic, Ali Javadi-Abhari |
| 2024 | TCP: A Tensor Contraction Processor for AI Workloads Industrial Product. Hanjoon Kim, Younggeun Choi, Junyoung Park, Byeongwook Bae, Hyunmin Jeong, Sang Min Lee, Jeseung Yeon, Minho Kim, Changjae Park, Boncheol Gu, Changman Lee, Jaeick Bae, SungGyeong Bae, Yojung Cha, Wooyoung Choe, Jonguk Choi, Juho Ha, Hyuck Han, Namoh Hwang, Seokha Hwang, Kiseok Jang, Haechan Je, Hojin Jeon, Jaewoo Jeon, Hyunjun Jeong, Yeonsu Jung, Dongok Kang, Hyewon Kim, Minjae Kim, Muhwan Kim, Sewon Kim, Suhyung Kim, Won Kim, Yong Kim, Youngsik Kim, Younki Ku, Jeong Ki Lee, Juyun Lee, Kyungjae Lee, Seokho Lee, Minwoo Noh, Hyuntaek Oh, Gyunghee Park, Sanguk Park, Jimin Seo, Jungyoung Seong, June Paik, Nuno P. Lopes, Sungjoo Yoo |
| 2024 | Tartan: Microarchitecting a Robotic Processor. Mohammad Bakhshalipour, Phillip B. Gibbons |
| 2024 | Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization. Jungi Lee, Wonbeom Lee, Jaewoong Sim |
| 2024 | Tetris: A Compilation Framework for VQA Applications in Quantum Computing. Yuwei Jin, Zirui Li, Fei Hua, Tianyi Hao, Huiyang Zhou, Yipeng Huang, Eddy Z. Zhang |
| 2024 | The Case For Data Centre Hyperloops. Guillem López-Paradís, Isaac M. Hair, Sid Kannan, Roman Rabbat, Parker Murray, Alex Lopes, Rory Zahedi, Winston Zuo, Jonathan Balkind |
| 2024 | The Dataflow Abstract Machine Simulator Framework. Nathan Zhang, Rubens Lacouture, Gina Sohn, Paul Mure, Qizheng Zhang, Fredrik Kjolstad, Kunle Olukotun |
| 2024 | The Maya Cache: A Storage-efficient and Secure Fully-associative Last-level Cache. Anubhav Bhatla, Navneet, Biswabandan Panda |
| 2024 | Trapezoid: A Versatile Accelerator for Dense and Sparse Matrix Multiplications. Yifan Yang, Joel S. Emer, Daniel Sánchez |
| 2024 | Triangel: A High-Performance, Accurate, Timely On-Chip Temporal Prefetcher. Sam Ainsworth, Lev Mukhanov |
| 2024 | UDP: Utility-Driven Fetch Directed Instruction Prefetching. Surim Oh, Mingsheng Xu, Tanvir Ahmed Khan, Baris Kasikci, Heiner Litz |
| 2024 | UM-PIM: DRAM-based PIM with Uniform & Shared Memory Space. Yilong Zhao, Mingyu Gao, Fangxin Liu, Yiwei Hu, Zongwu Wang, Han Lin, Jin Li, He Xian, Hanlin Dong, Tao Yang, Naifeng Jing, Xiaoyao Liang, Li Jiang |
| 2024 | Waferscale Network Switches. Shuangliang Chen, Saptadeep Pal, Rakesh Kumar |
| 2024 | pSyncPIM: Partially Synchronous Execution of Sparse Matrix Operations for All-Bank PIM Architectures. Daehyeon Baek, Soojin Hwang, Jaehyuk Huh |
| 2024 | sNPU: Trusted Execution Environments on Integrated NPUs. Erhu Feng, Dahu Feng, Dong Du, Yubin Xia, Haibo Chen |