ISCA A*

136 papers

Year Title / Authors
2025 A4: Microarchitecture-Aware LLC Management for Datacenter Servers with Emerging I/O Devices.
Haneul Park, Jiaqi Lou, Sangjin Lee, Yifan Yuan, KyoungSoo Park, Yongseok Son, Ipoom Jeong, Nam Sung Kim
2025 AIM: Software and Hardware Co-design for Architecture-level IR-drop Mitigation in High-performance PIM.
Yuanpeng Zhang, Xing Hu, Xi Chen, Zhihang Yuan, Cong Li, Jingchen Zhu, Zhao Wang, Chenguang Zhang, Xin Si, Wei Gao, Qiang Wu, Runsheng Wang, Guangyu Sun
2025 AMALI: An Analytical Model for Accurately Modeling LLM Inference on Modern GPUs.
Shiheng Cao, Junmin Wu, Junshi Chen, Hong An, Zhibin Yu
2025 ANSMET: Approximate Nearest Neighbor Search with Near-Memory Processing and Hybrid Early Termination.
Yiwei Li, Yuxin Jin, Boyu Tian, Huanchen Zhang, Mingyu Gao
2025 ANVIL: An In-Storage Accelerator for Name-Value Data Stores.
Ryan Wong, Nikita Kim, Aniket Das, Kevin Higgs, Engin Ipek, Sapan Agarwal, Saugata Ghose, Ben Feinberg
2025 AQB8: Energy-Efficient Ray Tracing Accelerator through Multi-Level Quantization.
Yen-Chieh Huang, Chen-Pin Yang, Tsung Tai Yeh
2025 ARTERY: Fast Quantum Feedback using Branch Prediction.
Wuwei Tian, Liqiang Lu, Siwei Tan, Yun Liang, Tingting Li, Kaiwen Zhou, Xinghui Jia, Jianwei Yin
2025 ATiM: Autotuning Tensor Programs for Processing-in-DRAM.
Yongwon Shin, Dookyung Kang, Hyojin Sung
2025 Accelerating Simulation of Quantum Circuits under Noise via Computational Reuse.
Meng Wang, Swamit Tannu, Prashant J. Nair
2025 Adaptive CHERI Compartmentalization for Heterogeneous Accelerators.
Jianyi Cheng, A. Theodore Markettos, Alexandre Joannou, Paul Metzger, Matthew Naylor, Peter Rugg, Timothy M. Jones
2025 AiF: Accelerating On-Device LLM Inference Using In-Flash Processing.
Jaeyong Lee, Hyeunjoo Kim, Sanghun Oh, Myoungjun Chun, Myungsuk Kim, Jihong Kim
2025 ArtMem: Adaptive Migration in Reinforcement Learning-Enabled Tiered Memory.
Xinyue Yi, Hongchao Du, Yu Wang, Jie Zhang, Qiao Li, Chun Jason Xue
2025 Assassyn: A Unified Abstraction for Architectural Simulation and Implementation.
Jian Weng, Boyang Han, Derui Gao, Ruijie Gao, Wanning Zhang, An Zhong, Ceyu Xu, Jihao Xin, Yangzhixin Luo, Lisa Wu Wills, Marco Canini
2025 Avalanche: Optimizing Cache Utilization via Matrix Reordering for Sparse Matrix Multiplication Accelerator.
Gwangeun Byeon, Seongwook Kim, Hyungjin Kim, Sukhyun Han, Jinkwon Kim, Prashant J. Nair, Taewook Kang, Seokin Hong
2025 Avant-Garde: Empowering GPUs with Scaled Numeric Formats.
Minseong Gil, Dongho Ha, Simla Burcu Harma, Myung Kuk Yoon, Babak Falsafi, Won Woo Ro, Yunho Oh
2025 BingoGCN: Towards Scalable and Efficient GNN Acceleration with Fine-Grained Partitioning and SLT.
Jiale Yan, Hiroaki Ito, Yuta Nagahara, Kazushi Kawamura, Masato Motomura, Thiem Van Chu, Daichi Fujiki
2025 Bishop: Sparsified Bundling Spiking Transformers on Heterogeneous Cores with Error-constrained Pruning.
Boxun Xu, Yuxuan Yin, Vikram Iyer, Peng Li
2025 CORD: Low-Latency, Bandwidth-Efficient and Scalable Release Consistency via Directory Ordering.
Yanpeng Yu, Nicolai Oswald, Anurag Khandelwal
2025 CaliQEC: In-situ Qubit Calibration for Surface Code Quantum Error Correction.
Xiang Fang, Keyi Yin, Yuchen Zhu, Jixuan Ruan, Dean Tullsen, Zhiding Liang, Andrew Sornborger, Ang Li, Travis S. Humble, Yufei Ding, Yunong Shi
2025 Cambricon-SR: An Accelerator for Neural Scene Representation with Sparse Encoding Table.
Tianbo Liu, Xinkai Song, Zhifei Yue, Rui Wen, Xing Hu, Zhuoran Song, Yuanbo Wen, Yifan Hao, Wei Li, Zidong Du, Rui Zhang, Jiaming Guo, Di Huang, Shaohui Peng, Guangzhong Sun, Qi Guo, Tianshi Chen
2025 Caravan: A Hardware/Software Co-Design for Efficient SIMD Neighbor Search on Point Clouds.
Pedro Henrique Exenberger Becker, Franyell Silfa, José-María Arnau, Antonio González
2025 Cassandra: Efficient Enforcement of Sequential Execution for Cryptographic Programs.
Ali Hajiabadi, Trevor E. Carlson
2025 Chimera: Communication Fusion for Hybrid Parallelism in Large Language Models.
Le Qin, Junwei Cui, Weilin Cai, Jiayi Huang
2025 Chip Architectures Under Advanced Computing Sanctions.
August Ning, David Wentzlaff
2025 Concorde: Fast and Accurate CPU Performance Modeling with Compositional Analytical-ML Fusion.
Arash Nasr-Esfahany, Mohammad Alizadeh, Victor Lee, Hanna Alam, Brett W. Coon, David E. Culler, Vidushi Dadu, Martin Dixon, Henry M. Levy, Santosh Pandey, Parthasarathy Ranganathan, Amir Yazdanbakhsh
2025 Constant-Rate Entanglement Distillation for Fast Quantum Interconnects.
Christopher A. Pattison, Gefen Baranes, Juan Pablo Bonilla Ataides, Mikhail D. Lukin, Hengyun Zhou
2025 CoopRT: Accelerating BVH Traversal for Ray Tracing via Cooperative Threads.
Yavuz Selim Tozlu, Huiyang Zhou
2025 Cramming a Data Center into One Cabinet, a Co-Exploration of Computing and Hardware Architecture of Waferscale Chip.
Xingmao Yu, Dingcheng Jiang, Jinyi Deng, Jingyao Liu, Chao Li, Shouyi Yin, Yang Hu
2025 DCPerf: An Open-Source, Battle-Tested Performance Benchmark Suite for Datacenter Workloads.
Wei Su, Abhishek Dhanotia, Carlos Torres, Jayneel Gandhi, Neha Gholkar, Shobhit O. Kanaujia, Maxim Naumov, Kalyan Subramanian, Valentin Andrei, Yifan Yuan, Chunqiang Tang
2025 DREAM: Enabling Low-Overhead Rowhammer Mitigation via Directed Refresh Management.
Hritvik Taneja, Moinuddin K. Qureshi
2025 DReX: Accurate and Scalable Dense Retrieval Acceleration via Algorithmic-Hardware Codesign.
Derrick Quinn, E. Ezgi Yücel, Martin Prammer, Zhenxing Fan, Kevin Skadron, Jignesh M. Patel, José F. Martínez, Mohammad Alian
2025 DS-TPU: Dynamical System for on-Device Lifelong Graph Learning with Nonlinear Node Interaction.
Chunshu Wu, Ruibing Song, Chuan Liu, Pouya Haghi, Ang Li, Michael Huang, Tony Tong Geng
2025 DX100: Programmable Data Access Accelerator for Indirection.
Alireza Khadem, Kamalavasan Kamalakkannan, Zhenyan Zhu, Akash Poptani, Yufeng Gu, Jered Benjamin Dominguez-Trujillo, Nishil Talati, Daichi Fujiki, Scott A. Mahlke, Galen M. Shipman, Reetuparna Das
2025 Dadu-Corki: Algorithm-Architecture Co-Design for Embodied AI-powered Robotic Manipulation.
Yiyang Huang, Yuhui Hao, Bo Yu, Feng Yan, Yuxin Yang, Feng Min, Yinhe Han, Lin Ma, Shaoshan Liu, Qiang Liu, Yiming Gan
2025 Debunking the CUDA Myth Towards GPU-based AI Systems: Evaluation of the Performance and Programmability of Intel's Gaudi NPU for AI Model Serving.
Yunjae Lee, Juntaek Lim, Jehyeon Bang, Eunyeong Cho, Huijong Jeong, Taesu Kim, Hyungjun Kim, Joonhyung Lee, Jinseop Im, Ranggi Hwang, Se Jung Kwon, Dongsoo Lee, Minsoo Rhu
2025 DiTile-DGNN: An Efficient Accelerator for Distributed Dynamic Graph Neural Network Inference.
Jiaqi Yang, Hao Zheng, Ahmed Louri
2025 Dynamic Load Balancer in Intel Xeon Scalable Processor: Performance Analyses, Enhancements, and Guidelines.
Jiaqi Lou, Srikar Vanavasam, Yifan Yuan, Ren Wang, Nam Sung Kim
2025 EOD: Enabling Low Latency GNN Inference via Near-Memory Concatenate Aggregation.
Taehwan Kim, Yunki Han, Seohye Ha, Jiwan Kim, Lee-Sup Kim
2025 Ecco: Improving Memory Bandwidth and Capacity for LLMs via Entropy-Aware Cache Compression.
Feng Cheng, Cong Guo, Chiyue Wei, Junyao Zhang, Changchun Zhou, Edward Hanson, Jiaqi Zhang, Xiaoxiao Liu, Hai Li, Yiran Chen
2025 Enabling Ahead Prediction with Practical Energy Constraints.
Lingzhe Chester Cai, Aniket Deshmukh, Yale N. Patt
2025 Evaluating Ruche Networks: Physically Scalable, Cost-Effective, Bandwidth-Flexible NoCs.
Dai Cheol Jung, Michael B. Taylor
2025 FAST: An FHE Accelerator for Scalable-parallelism with Tunable-bit.
Shengyu Fan, Xianglong Deng, Liang Kong, Guiming Shi, Guang Fan, Dan Meng, Rui Hou, Mingzhe Zhang
2025 FATE: Boosting the Performance of Hyper-Dimensional Computing Intelligence with Flexible Numerical DAta TypE.
Haomin Li, Fangxin Liu, Yichi Chen, Zongwu Wang, Shiyuan Huang, Ning Yang, Dongxu Lyu, Li Jiang
2025 FRED: A Wafer-scale Fabric for 3D Parallel DNN Training.
Saeed Rashidi, William Won, Sudarshan Srinivasan, Puneet Gupta, Tushar Krishna
2025 Fair-CO2: Fair Attribution for Cloud Carbon Emissions.
Leo Han, Jash Kakadia, Benjamin C. Lee, Udit Gupta
2025 Finesse: An Agile Design Framework for Pairing-based Cryptography via Software/Hardware Co-Design.
Tianwei Pan, Tianao Dai, Jianlei Yang, Hongbin Jing, Yang Su, Zeyu Hao, Xiaotao Jia, Chunming Hu, Weisheng Zhao
2025 FlexNeRFer: A Multi-Dataflow, Adaptive Sparsity-Aware Accelerator for On-Device NeRF Rendering.
Seock-Hwan Noh, Banseok Shin, Jeik Choi, Seungpyo Lee, Jaeha Kung, Yeseong Kim
2025 Folded Banks: 3D-Stacked HBM Design for Fine-Grained Random-Access Bandwidth.
Vignesh Adhinarayanan, Bradford M. Beckmann, Wantong Li, Mohammad Seyedzadeh, Sergey Blagodurov, Derrick Aguren, Hayden Hyungdong Lee
2025 Forest: Access-aware GPU UVM Management.
Mao Lin, Yuan Feng, Guilherme Cox, Hyeran Jeon
2025 GCStack+GCScaler: Fast and Accurate GPU Performance Analyses Using Fine-Grained Stall Cycle Accounting and Interval Analysis.
Hanna Cha, Sungchul Lee, Jounghoo Lee, Yeonan Ha, Joonsung Kim, Youngsok Kim
2025 GPUs All Grown-Up: Fully Device-Driven SpMV Using GPU Work Graphs.
Fabian Wildgrube, Pete Ehrett, Paul Trojahn, Richard Membarth, Bradford M. Beckmann, Dominik Baumeister, Matthäus G. Chajdas
2025 Garibaldi: A Pairwise Instruction-Data Management for Enhancing Shared Last-Level Cache Performance in Server Workloads.
Jaewon Kwon, Yongju Lee, Jiwan Kim, Enhyeok Jang, Hongju Kal, Won Woo Ro
2025 Genesis: A Compiler for Hamiltonian Simulation on Hybrid CV-DV Quantum Computers.
Zihan Chen, Jiakang Li, Minghao Guo, Henry Chen, Zirui Li, Joel Bierman, Yipeng Huang, Huiyang Zhou, Yuan Liu, Eddy Z. Zhang
2025 H
Cong Li, Yihan Yin, Xintong Wu, Jingchen Zhu, Zhutianya Gao, Dimin Niu, Qiang Wu, Xin Si, Yuan Xie, Chen Zhang, Guangyu Sun
2025 HPVM-HDC: A Heterogeneous Programming System for Accelerating Hyperdimensional Computing.
Russel Arbore, Xavier Routh, Abdul Rafae Noor, Akash Kothari, Haichao Yang, Weihong Xu, Sumukh Pinge, Minxuan Zhou, Tajana Rosing, Vikram S. Adve
2025 HYTE: Flexible Tiling for Sparse Accelerators via Hybrid Static-Dynamic Approaches.
Xintong Li, Zhiyao Li, Mingyu Gao
2025 HardHarvest: Hardware-Supported Core Harvesting for Microservices.
Jovan Stojkovic, Chunao Liu, Muhammad Shahbaz, Josep Torrellas
2025 Hardware-aware Calibration Protocol for Quantum Computers.
Yuchen Zhu, Jinglei Cheng, Boxi Li, Kecheng Liu, Yidong Zhou, Hanrui Wang, Yufei Ding, Zhiding Liang
2025 Heliostat: Harnessing Ray Tracing Accelerators for Page Table Walks.
Yuan Feng, Yuke Li, Jiwon Lee, Won Woo Ro, Hyeran Jeon
2025 Hermes: Algorithm-System Co-design for Efficient Retrieval-Augmented Generation At-Scale.
Michael Shen, Muhammad Umar, Kiwan Maeng, G. Edward Suh, Udit Gupta
2025 HeterRAG: Heterogeneous Processing-in-Memory Acceleration for Retrieval-augmented Generation.
Chaoqiang Liu, Haifeng Liu, Dan Chen, Yu Huang, Yi Zhang, Wenjing Xiao, Xiaofei Liao, Hai Jin
2025 HiPER: Hierarchically-Composed Processing for Efficient Robot Learning-Based Control.
Justin Ting, Minsik Kim, Junkang Zhu, Haotian Sheng, Zhengya Zhang
2025 Hybe: GPU-NPU Hybrid System for Efficient LLM Inference with Million-Token Context Window.
Seungjae Moon, Junseo Cha, Hyunjun Park, Joo-Young Kim
2025 Hybrid SLC-MLC RRAM Mixed-Signal Processing-in-Memory Architecture for Transformer Acceleration via Gradient Redistribution.
Chang Eun Song, Priyansh Bhatnagar, Zihan Xia, Nam Sung Kim, Tajana Rosing, Mingu Kang
2025 IDEA-GP: Instruction-Driven Architecture with Efficient Online Workload Allocation for Geometric Perception.
Suquan Zhang, Yu Hu, Yunfei Xiang, Dawei Zhao, Yuanfan Xu, Qingmin Liao, Jincheng Yu, Yu Wang
2025 In-Storage Acceleration of Retrieval Augmented Generation as a Service.
Rohan Mahapatra, Harsha Santhanam, Christopher Priebe, Hanyang Xu, Hadi Esmaeilzadeh
2025 InfiniMind: A Learning-Optimized Large-Scale Brain-Computer Interface.
Yeongwoo Jang, Daye Jung, Seunghyun Song, Hunjun Lee, Jangwoo Kim
2025 Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures.
Chenggang Zhao, Chengqi Deng, Chong Ruan, Damai Dai, Huazuo Gao, Jiashi Li, Liyue Zhang, Panpan Huang, Shangyan Zhou, Shirong Ma, Wenfeng Liang, Ying He, Yuqing Wang, Yuxuan Liu, Y. X. Wei
2025 LIA: A Single-GPU LLM Inference Acceleration with Cooperative AMX-Enabled CPU-GPU Computation and CXL Offloading.
Hyungyo Kim, Nachuan Wang, Qirong Xia, Jinghan Huang, Amir Yazdanbakhsh, Nam Sung Kim
2025 LUT Tensor Core: A Software-Hardware Co-Design for LUT-Based Low-Bit LLM Inference.
Zhiwen Mo, Lei Wang, Jianyu Wei, Zhichen Zeng, Shijie Cao, Lingxiao Ma, Naifeng Jing, Ting Cao, Jilong Xue, Fan Yang, Mao Yang
2025 Leveraging control-flow similarity to reduce branch predictor cold effects in microservices.
Haris Volos, Stylianos Vassiliou, Georgia Antoniou, Davide Basilio Bartolini, Yiannakis Sazeides
2025 Light-weight Cache Replacement for Instruction Heavy Workloads.
Saba Mostofi, Setu Gupta, Ahmad Hassani, Krishnam Tibrewala, Elvira Teran, Paul V. Gratz, Daniel A. Jiménez
2025 LightML: A Photonic Accelerator for Efficient General Purpose Machine Learning.
Liang Liu, Sadra Rahimi Kari, Xin Xin, Nathan Youngblood, Youtao Zhang, Jun Yang
2025 LightNobel: Improving Sequence Length Limitation in Protein Structure Prediction Model via Adaptive Activation Quantization.
Seunghee Han, Soongyu Choi, Joo-Young Kim
2025 Lumina: Real-Time Neural Rendering by Exploiting Computational Redundancy.
Yu Feng, Weikai Lin, Yuge Cheng, Zihan Liu, Jingwen Leng, Minyi Guo, Chen Chen, Shixuan Sun, Yuhao Zhu
2025 MD-pipe: A Strong Scaling Enhanced Pipeline Architecture for Ab Initio Accuracy Molecular Dynamics.
Ning Kang, Guojun Yuan, Zihan Yan, Beining Zhang, Boyang Li, Zeyu Li, Shuo Wang, Guanglei Chen, Jiayi Rao, Zhan Wang, Weile Jia, Ninghui Sun, Guangming Tan
2025 Magellan: A High-Performance Loop-Guided Prefetcher for Indirect Memory Access.
Gelin Fu, Tian Xia, Mingzhuo Yin, Prashant J. Nair, Mieszko Lis, Pengju Ren
2025 MagiCache: A Virtual In-Cache Computing Engine.
Renhao Fan, Yikai Cui, Weike Li, Mingyu Wang, Zhaolin Li
2025 MeshSlice: Efficient 2D Tensor Parallelism for Distributed DNN Training.
Hyoungwook Nam, Gerasimos Gerogiannis, Josep Torrellas
2025 Meta's Second Generation AI Chip: Model-Chip Co-Design and Productionization Experiences.
Joel Coburn, Chunqiang Tang, Sameer Abu Asal, Neeraj Agrawal, Raviteja Chinta, Harish Dattatraya Dixit, Brian Dodds, Saritha Dwarakapuram, Amin Firoozshahian, Cao Gao, Kaustubh Gondkar, Tyler Graf, Junhan Hu, Jian Huang, Sterling Hughes, Adam Hutchin, Bhasker Jakka, Guoqiang Jerry Chen, Indu Kalyanaraman, Ashwin Kamath, Pankaj Kansal, Erum Kazi, Roman Levenstein, Mahesh Maddury, Alex Mastro, Siji Medaiyese, Pritesh Modi, Jack Montgomery, Nadathur Satish, Amit Nagpal, Ashwin Narasimha, Maxim Naumov, Eleanor Ozer, Jongsoo Park, Poorvaja Ramani, Harikrishna Reddy, David Reiss, Deboleena Roy, Sathish Sekar, Arushi Sharma, Pavan Shetty, Aravind Sukumaran-Rajam, Eran Tal, Mike Tsai, Shreya Varshini, Richard Wareing, Olívia Wu, Xiaolong Xie, Jinghan Yang, Hangchen Yu, Tanmay Zargar, Zitong Zeng, Feixiong Zhang, Ajit Mathews, Xun Jiao, Jiyuan Zhang, Emmanuel Menage, Truls Edvard Stokke, Mohammed Sourouri
2025 MicroScopiQ: Accelerating Foundational Models through Outlier-Aware Microscaling Quantization.
Akshat Ramachandran, Souvik Kundu, Tushar Krishna
2025 MoPAC: Efficiently Mitigating Rowhammer with Probabilistic Activation Counting.
Suhas Vittal, Salman Qazi, Poulami Das, Moinuddin Qureshi
2025 NMP-PaK: Near-Memory Processing Acceleration of Scalable De Novo Genome Assembly.
Heewoo Kim, Sanjay Sri Vallabh Singapuram, Haojie Ye, Joseph Izraelevitz, Trevor N. Mudge, Ronald G. Dreslinski, Nishil Talati
2025 NUPEA: Optimizing Critical Loads on Spatial Dataflow Architectures via Non-Uniform Processing-Element Access.
Souradip Ghosh, Graham Gobieski, Keyi Zhang, Brandon Lucia, Nathan Beckmann, Tony Nowatzki
2025 Need for zkSpeed: Accelerating HyperPlonk for Zero-Knowledge Proofs.
Alhad Daftardar, Jianqiao Mo, Joey Ah-kiow, Benedikt Bünz, Ramesh Karri, Siddharth Garg, Brandon Reagen
2025 Neo: Towards Efficient Fully Homomorphic Encryption Acceleration using Tensor Core.
Dian Jiao, Xianglong Deng, Zhiwei Wang, Shengyu Fan, Yi Chen, Dan Meng, Rui Hou, Mingzhe Zhang
2025 Neoscope: How Resilient Is My SoC to Workload Churn?
Joseph Rogers, Lieven Eeckhout, Taha Soliman, Magnus Jahre
2025 NetCrafter: Tailoring Network Traffic for Non-Uniform Bandwidth Multi-GPU Systems.
Amel Fatima, Yang Yang, Yifan Sun, Rachata Ausavarungnirun, Adwait Jog
2025 Nyx: Virtualizing dataflow execution on shared FPGA platforms.
Panagiotis Miliadis, Dimitris Theodoropoulos, Nectarios Koziris, Dionisios N. Pnevmatikatos
2025 Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization.
Minsu Kim, Seongmin Hong, Ryeowook Ko, Soongyu Choi, Hunjong Lee, Junsoo Kim, Joo-Young Kim, Jongse Park
2025 OptiPIM: Optimizing Processing-in-Memory Acceleration Using Integer Linear Programming.
Jiantao Liu, Minxuan Zhou, Yue Pan, Chien-Yi Yang, Lana Josipovic, Tajana Rosing
2025 PD Constraint-aware Physical/Logical Topology Co-Design for Network on Wafer.
Qize Yang, Taiquan Wei, Sihan Guan, Chengran Li, Haoran Shang, Jinyi Deng, Huizheng Wang, Chao Li, Lei Wang, Yan Zhang, Shouyi Yin, Yang Hu
2025 Phi: Leveraging Pattern-based Hierarchical Sparsity for High-Efficiency Spiking Neural Networks.
Chiyue Wei, Bowen Duan, Cong Guo, Jingyang Zhang, Qingyue Song, Hai Li, Yiran Chen
2025 Precise exceptions in relaxed architectures.
Ben Simner, Alasdair Armstrong, Thomas Bauereiss, Brian Campbell, Ohad Kammar, Jean Pichon-Pharabod, Peter Sewell
2025 Proceedings of the 52nd Annual International Symposium on Computer Architecture, ISCA 2025, Tokyo, Japan, June 21-25, 2025
2025 Process Only Where You Look: Hardware and Algorithm Co-optimization for Efficient Gaze-Tracked Foveated Rendering in Virtual Reality.
Haiyu Wang, Wenxuan Liu, Kenneth Chen, Qi Sun, Sai Qian Zhang
2025 Profile-Guided Temporal Prefetching.
Mengming Li, Qijun Zhang, Yichuan Gao, Wenji Fang, Yao Lu, Yongqing Ren, Zhiyao Xie
2025 PuDHammer: Experimental Analysis of Read Disturbance Effects of Processing-using-DRAM in Real DRAM Chips.
Ismail Emir Yuksel, Akash Sood, Ataberk Olgun, Oguzhan Canpolat, Haocong Luo, Nisa Bostanci, Mohammad Sadrosadati, A. Giray Yaglikçi, Onur Mutlu
2025 QPlacer: Frequency-Aware Component Placement for Superconducting Quantum Computers.
Junyao Zhang, Hanrui Wang, Qi Ding, Jiaqi Gu, Reouven Assouly, William D. Oliver, Song Han, Kenneth R. Brown, Hai Li, Yiran Chen
2025 QR-Map: A Map-Based Approach to Quantum Circuit Abstraction for Qubit Reuse Optimization.
Hyungseok Kim, Enhyeok Jang, Seungwoo Choi, Youngmin Kim, Won Woo Ro
2025 Qtenon: Towards Low-Latency Architecture Integration for Accelerating Hybrid Quantum-Classical Computing.
Chenning Tao, Liqiang Lu, Size Zheng, Li-Wen Chang, Minghua Shen, Hanyu Zhang, Fangxin Liu, Kaiwen Zhou, Jianwei Yin
2025 RAGO: Systematic Performance Optimization for Retrieval-Augmented Generation Serving.
Wenqi Jiang, Suvinay Subramanian, Cat Graves, Gustavo Alonso, Amir Yazdanbakhsh, Vidushi Dadu
2025 RAP: Reconfigurable Automata Processor.
Ziyuan Wen, Alexis Le Glaunec, Konstantinos Mamouras, Kaiyuan Yang
2025 REIS: A High-Performance and Energy-Efficient Retrieval System with In-Storage Processing.
Kangqi Chen, Rakesh Nadig, Manos Frouzakis, Nika Mansouri-Ghiasi, Yu Liang, Haiyu Mao, Jisung Park, Mohammad Sadrosadati, Onur Mutlu
2025 RTSpMSpM: Harnessing Ray Tracing for Efficient Sparse Matrix Computations.
Hongrui Zhang, Yunan Zhang, Hung-Wei Tseng
2025 Reconfigurable Stream Network Architecture.
Chengyue Wang, Xiaofan Zhang, Jason Cong, James C. Hoe
2025 Reinforcement Learning-Guided Graph State Generation in Photonic Quantum Computers.
Yingheng Li, Yue Dai, Aditya Pawar, Rongchao Dong, Jun Yang, Youtao Zhang, Xulong Tang
2025 Resource Analysis of Low-Overhead Transversal Architectures for Reconfigurable Atom Arrays.
Hengyun Zhou, Casey Duckering, Chen Zhao, Dolev Bluvstein, Madelyn Cain, Aleksander Kubica, Sheng-Tao Wang, Mikhail D. Lukin
2025 Rethinking Prefetching for Intermittent Computing.
Gan Fang, Jianping Zeng, Aditya Gupta, Changhee Jung
2025 S-SYNC: Shuttle and Swap Co-Optimization in Quantum Charge-Coupled Devices.
Chenghong Zhu, Xian Wu, Jingbo Wang, Xin Wang
2025 SEAL: A Single-Event Architecture for In-Sensor Visual Localization.
Ryan Hou, Thomas Twomey, Vasileios Milionis, Evangelos Dikopoulos, Tianrui Ma, Yuhao Zhu, Georgios Tzimpragos
2025 SWIPER: Minimizing Fault-Tolerant Quantum Program Latency via Speculative Window Decoding.
Joshua Viszlai, Jason D. Chadwick, Sarang Joshi, Gokul Subramanian Ravi, Yanjing Li, Frederic T. Chong
2025 Scaling Llama 3 Training with Efficient Parallelism Strategies.
Weiwei Chu, Xinfeng Xie, Jiecao Yu, Jie Wang, Amar Phanishayee, Chunqiang Tang, Yuchen Hao, Jianyu Huang, Mustafa Ozdal, Jun Wang, Vedanuj Goswami, Naman Goyal, Abhishek Kadian, Andrew Gu, Chris Cai, Feng Tian, Xiaodong Wang, Min Si, Pavan Balaji, Ching-Hsiang Chu, Jongsoo Park
2025 Single Spike Artificial Neural Networks.
Rhys Gretsch, Michael Beyeler, Jeremy Lau, Timothy Sherwood
2025 Single-Address-Space FaaS with Jord.
Yuanlong Li, Atri Bhattacharyya, Madhur Kumar, Abhishek Bhattacharjee, Yoav Etsion, Babak Falsafi, Sanidhya Kashyap, Mathias Payer
2025 SpecASan: Mitigating Transient Execution Attacks Using Speculative Address Sanitization.
Saber Ganjisaffar, Esmaeil Mohmmadian Koruyeh, Jason Zellmer, Hodjat Asghari Esfeden, Chengyu Song, Nael B. Abu-Ghazaleh
2025 SpecEE: Accelerating Large Language Model Inference with Speculative Early Exiting.
Jiaming Xu, Jiayi Pan, Yongkang Zhou, Siming Chen, Jinhao Li, Yaoxiu Lian, Junyi Wu, Guohao Dai
2025 SwitchQNet: Optimizing Distributed Quantum Computing for Quantum Data Centers with Switch Networks.
Hezi Zhang, Yiran Xu, Haotian Hu, Keyi Yin, Hassan Shapourian, Jiapeng Zhao, Ramana Rao Kompella, Reza Nejabati, Yufei Ding
2025 Synchronization for Fault-Tolerant Quantum Computers.
Satvik Maurya, Swamit Tannu
2025 TRACI: Network Acceleration of Input-Dynamic Communication for Large-Scale Deep Learning Recommendation Model.
Guyue Huang, Hao Li, Le Qin, Jiayi Huang, Yangwook Kang, Yufei Ding, Yuan Xie
2025 Telos: A Dataflow Accelerator for Sparse Triangular Solver of Partial Differential Equations.
Xiaochen Hao, Hao Luo, Chu Wang, Chao Yang, Yun Liang
2025 The Sparsity-Aware LazyGPU Architecture.
Changxi Liu, Miao Yu, Yifan Sun, Trevor E. Carlson
2025 The XOR Cache: A Catalyst for Compression.
Zhewen Pan, Joshua San Miguel
2025 Topology-Aware Virtualization over Inter-Core Connected Neural Processing Units.
Dahu Feng, Erhu Feng, Dong Du, Pinjie Xu, Yubin Xia, Haibo Chen, Rong Zhao
2025 Transitive Array: An Efficient GEMM Accelerator with Result Reuse.
Cong Guo, Chiyue Wei, Jiaming Tang, Bowen Duan, Song Han, Hai Li, Yiran Chen
2025 TrioSim: A Lightweight Simulator for Large-Scale DNN Workloads on Multi-GPU Systems.
Ying Li, Yuhui Bao, Gongyu Wang, Xinxin Mei, Pranav Vaid, Anandaroop Ghosh, Adwait Jog, Darius Bunandar, Ajay Joshi, Yifan Sun
2025 UGPU: Dynamically Constructing Unbalanced GPUs for Enhanced Resource Efficiency.
Xia Zhao, Guangda Zhang, Lu Wang, Huadong Dai
2025 UPP: Universal Predicate Pushdown to Smart Storage.
Ipoom Jeong, Jinghan Huang, Chuxuan Hu, Dohyun Park, Jaeyoung Kang, Nam Sung Kim, Yongjoo Park
2025 Unified Memory Protection with Multi-granular MAC and Integrity Tree for Heterogeneous Processors.
Sunho Lee, Seonjin Na, Jeongwon Choi, Jinwon Pyo, Jaehyuk Huh
2025 Variational Quantum Algorithms in the era of Early Fault Tolerance.
Siddharth Dangwal, Suhas Vittal, Lennart Maximilian Seifert, Frederic T. Chong, Gokul Subramanian Ravi
2025 WSC-LLM: Efficient LLM Service and Architecture Co-exploration for Wafer-scale Chips.
Zheng Xu, Dehao Kong, Jiaxin Liu, Jinxi Li, Jingxiang Hou, Xu Dai, Chao Li, Shaojun Wei, Yang Hu, Shouyi Yin
2025 WarmCache: Exploiting STT-RAM Cache for Low-Power Intermittent Systems.
Noureldin Hassan, Byounguk Min, Changhee Jung, Yan Solihin, Jongouk Choi
2025 When Mitigations Backfire: Timing Channel Attacks and Defense for PRAC-Based RowHammer Mitigations.
Jeonghyun Woo, Joyce Qu, Gururaj Saileshwar, Prashant Jayaprakash Nair
2025 WindServe: Efficient Phase-Disaggregated LLM Serving with Stream-based Dynamic Scheduling.
Jingqi Feng, Yukai Huang, Rui Zhang, Sicheng Liang, Ming Yan, Jie Wu
2025 XHarvest: Rethinking High-Performance and Cost-Efficient SSD Architecture with CXL-Driven Harvesting.
Li Peng, Wenbo Wu, Shushu Yi, Xianzhang Chen, Chenxi Wang, Shengwen Liang, Zhe Wang, Nong Xiao, Qiao Li, Mingzhe Zhang, Jie Zhang
2025 Zettafly: A Network Topology with Flexible Non-blocking Regions for Large-scale AI and HPC Systems.
Dezun Dong, Ziyu Wang, Fei Lei