| 2024 | 57th IEEE/ACM International Symposium on Microarchitecture, MICRO 2024, Austin, TX, USA, November 2-6, 2024 |
| 2024 | A Case for Speculative Address Translation with Rapid Validation for GPUs. Junhyeok Park, Osang Kwon, Yongho Lee, Seongwook Kim, Gwangeun Byeon, Jihun Yoon, Prashant J. Nair, Seokin Hong |
| 2024 | A Compiler-Like Framework for Optimizing Cryptographic Big Integer Multiplication on GPUs. Zhuoran Ji, Jianyu Zhao, Zhaorui Zhang, Jiming Xu, Shoumeng Yan, Lei Ju |
| 2024 | A Framework for Fine-Grained Program Versioning. Yishen Chen, Saman P. Amarasinghe |
| 2024 | A Mess of Memory System Benchmarking, Simulation and Application Profiling. Pouya Esmaili-Dokht, Francesco Sgherzi, Valéria Soldera Girelli, Isaac Boixaderas, Mariana Carmin, Alireza Monemi, Adrià Armejach, Estanislao Mercadal, Germán Llort, Petar Radojkovic, Miquel Moretó, Judit Giménez, Xavier Martorell, Eduard Ayguadé, Jesús Labarta, Emanuele Confalonieri, Rishabh Dubey, Jason Adlard |
| 2024 | A Scalable, Efficient, and Robust Dynamic Memory Management Library for HLS-based FPGAs. Qinggang Wang, Long Zheng, Zhaozeng An, Shuyi Xiong, Runze Wang, Yu Huang, Pengcheng Yao, Xiaofei Liao, Hai Jin, Jingling Xue |
| 2024 | Acamar: A Dynamically Reconfigurable Scientific Computing Accelerator for Robust Convergence and Minimal Resource Underutilization. Ubaid Bakhtiar, Helya Hosseini, Bahar Asgari |
| 2024 | Accelerating Zero-Knowledge Proofs Through Hardware-Algorithm Co-Design. Nikola Samardzic, Simon Langowski, Srinivas Devadas, Daniel Sánchez |
| 2024 | ActiveN: A Scalable and Flexibly-Programmable Event-Driven Neuromorphic Processor. Xiaoyi Liu, Zhongzhu Pu, Peng Qu, Weimin Zheng, Youhui Zhang |
| 2024 | AdapTiV: Sign-Similarity Based Image-Adaptive Token Merging for Vision Transformer Acceleration. Seungjae Yoo, Hangyeol Kim, Joo-Young Kim |
| 2024 | Ares-Flash: Efficient Parallel Integer Arithmetic Operations Using NAND Flash Memory. Jian Chen, Congming Gao, Youyou Lu, Yuhao Zhang, Jiwu Shu |
| 2024 | Atomic Cache: Enabling Efficient Fine-Grained Synchronization with Relaxed Memory Consistency on GPGPUs Through In-Cache Atomic Operations. Yicong Zhang, Mingyu Wang, Wangguang Wang, Yangzhan Mai, Haiqiu Huang, Zhiyi Yu |
| 2024 | Azul: An Accelerator for Sparse Iterative Solvers Leveraging Distributed On-Chip Memory. Axel Feldmann, Courtney Golden, Yifan Yang, Joel S. Emer, Daniel Sánchez |
| 2024 | BABOL: A Software-Defined NAND Flash Controller. Kibin Park, Alberto Lerner, Sangjin Lee, Philippe Bonnet, Yong Ho Song, Philippe Cudré-Mauroux, Jungwook Choi |
| 2024 | BBS: Bi-Directional Bit-Level Sparsity for Deep Learning Acceleration. Yuzong Chen, Jian Meng, Jae-sun Seo, Mohamed S. Abdelfattah |
| 2024 | Beehive: A Flexible Network Stack for Direct-Attached Accelerators. Katie Lim, Matthew Giordano, Theano Stavrinos, Irene Zhang, Jacob Nelson, Baris Kasikci, Thomas E. Anderson |
| 2024 | Blenda: Dynamically-Reconfigurable Stacked DRAM. Mohammad Bakhshalipour, HamidReza Zare, Farid Samandi, Fatemeh Golshan, Pejman Lotfi-Kamran, Hamid Sarbazi-Azad |
| 2024 | BreakHammer: Enhancing RowHammer Mitigations by Carefully Throttling Suspect Threads. Oguzhan Canpolat, A. Giray Yaglikçi, Ataberk Olgun, Ismail Emir Yuksel, Yahya Can Tugrul, Konstantinos Kanellopoulos, Oguz Ergin, Onur Mutlu |
| 2024 | Bridging the Gap Between LLMs and LNS with Dynamic Data Format and Architecture Codesign. Pouya Haghi, Chunshu Wu, Zahra Azad, Yanfei Li, Andrew Gui, Yuchen Hao, Ang Li, Tony Tong Geng |
| 2024 | COMPASS: SRAM-Based Computing-in-Memory SNN Accelerator with Adaptive Spike Speculation. Zongwu Wang, Fangxin Liu, Ning Yang, Shiyuan Huang, Haomin Li, Li Jiang |
| 2024 | CPElide: Efficient Multi-Chiplet GPU Implicit Synchronization. Preyesh Dalmia, Rajesh Shashi Kumar, Matthew D. Sinclair |
| 2024 | CacheCraft: Enhancing GPU Performance under Memory Protection through Reconstructed Caching. Soyoung Park, Hojung Namkoong, Boyeol Choi, Michael B. Sullivan, Jungrae Kim |
| 2024 | CamPU: A Multi-Camera Processing Unit for Deep Learning-based 3D Spatial Computing Systems. Dongseok Im, Hoi-Jun Yoo |
| 2024 | Cambricon-C: Efficient 4-Bit Matrix Unit via Primitivization. Yi Chen, Yongwei Zhao, Yifan Hao, Yuanbo Wen, Yuntao Dai, Xiaqing Li, Yang Liu, Rui Zhang, Mo Zou, Xinkai Song, Xing Hu, Zidong Du, Huaping Chen, Qi Guo, Tianshi Chen |
| 2024 | Cambricon-LLM: A Chiplet-Based Hybrid Architecture for On-Device Inference of 70B LLM. Zhongkai Yu, Shengwen Liang, TianYun Ma, Yunke Cai, Ziyuan Nan, Di Huang, Xinkai Song, Yifan Hao, Jie Zhang, Tian Zhi, Yongwei Zhao, Zidong Du, Xing Hu, Qi Guo, Tianshi Chen |
| 2024 | Cambricon-M: A Fibonacci-Coded Charge-Domain SRAM-Based CIM Accelerator for DNN Inference. Hongrui Guo, Mo Zou, Yifan Hao, Zidong Du, Erxiang Ren, Yang Liu, Yongwei Zhao, Tianrui Ma, Rui Zhang, Xing Hu, Fei Qiao, Zhiwei Xu, Qi Guo, Tianshi Chen |
| 2024 | Chaining Transactions for Effective Concurrency Management in Hardware Transactional Memory. Víctor Nicolás-Conesa, J. Rubén Titos Gil, Ricardo Fernández-Pascual, Manuel E. Acacio, Alberto Ros |
| 2024 | Concurrency-Aware Register Stacks for Efficient GPU Function Calls. Ni Kang, Ahmad Alawneh, Mengchi Zhang, Timothy G. Rogers |
| 2024 | Customizing Cache Indexing Through Entropy Estimation. Kevin Weston, Avery Johnson, Vahid Janfaza, Farabi Mahmud, Abdullah Muzahid |
| 2024 | DRCTL: A Disorder-Resistant Computation Translation Layer Enhancing the Lifetime and Performance of Memristive CIM Architecture. Heng Zhou, Bing Wu, Huan Cheng, Jinpeng Liu, Taoming Lei, Dan Feng, Wei Tong |
| 2024 | Defending Against EMI Attacks on Just-In-Time Checkpoint for Resilient Intermittent Systems. Jaeseok Choi, Hyunwoo Joe, Changhee Jung, Jongouk Choi |
| 2024 | DelayAVF: Calculating Architectural Vulnerability Factors for Delay Faults. Peter W. Deutsch, Vincent Quentin Ulitzsch, Sudhanva Gurumurthi, Vilas Sridharan, Joel S. Emer, Mengjia Yan |
| 2024 | Demystifying a CXL Type-2 Device: A Heterogeneous Cooperative Computing Perspective. Houxiang Ji, Srikar Vanavasam, Yang Zhou, Qirong Xia, Jinghan Huang, Yifan Yuan, Ren Wang, Pekon Gupta, Bhushan Chitlur, Ipoom Jeong, Nam Sung Kim |
| 2024 | Distributed Page Table: Harnessing Physical Memory as an Unbounded Hashed Page Table. Osang Kwon, Yongho Lee, Junhyeok Park, Sungbin Jang, Byungchul Tak, Seokin Hong |
| 2024 | Duplex: A Device for Large Language Models with Mixture of Experts, Grouped Query Attention, and Continuous Batching. Sungmin Yun, Kwanhee Kyung, Juhwan Cho, Jaewan Choi, Jongmin Kim, Byeongho Kim, Sukhan Lee, Kyomin Sohn, Jung Ho Ahn |
| 2024 | Elastic Translations: Fast Virtual Memory with Multiple Translation Sizes. Stratos Psomadakis, Chloe Alverti, Vasileios Karakostas, Christos Katsakioris, Dimitrios Siakavaras, Konstantinos Nikas, Georgios I. Goumas, Nectarios Koziris |
| 2024 | Extending GPU Ray-Tracing Units for Hierarchical Search Acceleration. Aaron Barnes, Fangjia Shen, Timothy G. Rogers |
| 2024 | Flag-Proxy Networks: Overcoming the Architectural, Scheduling and Decoding Obstacles of Quantum LDPC Codes. Suhas Vittal, Ali Javadi-Abhari, Andrew W. Cross, Lev S. Bishop, Moinuddin Qureshi |
| 2024 | FloatAP: Supporting High-Performance Floating-Point Arithmetic in Associative Processors. Kailin Yang, José F. Martínez |
| 2024 | FuseMax: Leveraging Extended Einsums to Optimize Attention Accelerator Design. Nandeeka Nayak, Xinrui Wu, Toluwanimi O. Odemuyiwa, Michael Pellauer, Joel S. Emer, Christopher W. Fletcher |
| 2024 | Fusion-3D: Integrated Acceleration for Instant 3D Reconstruction and Real-Time Rendering. Sixu Li, Yang Zhao, Chaojian Li, Bowei Guo, Jingqun Zhang, Wenbo Zhu, Zhifan Ye, Cheng Wan, Yingyan Celine Lin |
| 2024 | GauSPU: 3D Gaussian Splatting Processor for Real-Time SLAM Systems. Lizhou Wu, Haozhe Zhu, Siqi He, Jiapei Zheng, Chixiao Chen, Xiaoyang Zeng |
| 2024 | Generalizing Ray Tracing Accelerators for Tree Traversals on GPUs. Dongho Ha, Lufei Liu, Yuan-Hsi Chou, Seokjin Go, Won Woo Ro, Hung-Wei Tseng, Tor M. Aamodt |
| 2024 | Genie Cache: Non-Blocking Miss Handling and Replacement in Page-Table-Based DRAM Cache. Youngin Kim, William J. Song |
| 2024 | Ghost Arbitration: Mitigating Interconnect Side-Channel Timing Attacks in GPU. Zhixian Jin, Jaeguk Ahn, Jiho Kim, Hans Kasan, Jina Song, WonJun Song, John Kim |
| 2024 | Hardware-Assisted Virtualization of Neural Processing Units for Cloud Platforms. Yuqi Xue, Yiqi Liu, Lifeng Nai, Jian Huang |
| 2024 | Hestia: An Efficient Cross-Level Debugger for High-Level Synthesis. Ruifan Xu, Jin Luo, Yawen Zhang, Yibo Lin, Runsheng Wang, Ru Huang, Yun Liang |
| 2024 | HgPCN: A Heterogeneous Architecture for E2E Embedded Point Cloud Inference. Yiming Gao, Chao Jiang, Wesley Piard, Xiangru Chen, Bhavesh Patel, Herman Lam |
| 2024 | HyFiSS: A Hybrid Fidelity Stall-Aware Simulator for GPGPUs. Jianchao Yang, Mei Wen, Dong Chen, Zhaoyun Chen, Zeyu Xue, Yuhang Li, Junzhong Shen, Yang Shi |
| 2024 | HyperTEE: A Decoupled TEE Architecture with Secure Enclave Management. Yunkai Bai, Peinan Li, Yubiao Huang, Michael C. Huang, Shijun Zhao, Lutan Zhao, Fengwei Zhang, Dan Meng, Rui Hou |
| 2024 | ICED: An Integrated CGRA Framework Enabling DVFS-Aware Acceleration. Cheng Tan, Miaomiao Jiang, Deepak Patil, Yanghui Ou, Zhaoying Li, Lei Ju, Tulika Mitra, Hyunchul Park, Antonino Tumeo, Jeff Zhang |
| 2024 | ImPress: Securing DRAM Against Data-Disturbance Errors via Implicit Row-Press Mitigation. Anish Saxena, Aamer Jaleel, Moinuddin Qureshi |
| 2024 | IvLeague: Side Channel-Resistant Secure Architectures Using Isolated Domains of Dynamic Integrity Trees. Md Hafizul Islam Chowdhuryy, Fan Yao |
| 2024 | LIBRA: Memory Bandwidth- and Locality-Aware Parallel Tile Rendering. Aurora Tomás, Juan L. Aragón, Joan-Manuel Parcerisa, Antonio González |
| 2024 | LUCIE: A Universal Chiplet-Interposer Design Framework for Plug-and-Play Integration. Zixi Li, David Wentzlaff |
| 2024 | Leveraging Cache Coherence to Detect and Repair False Sharing On-the-fly. Vipin Patel, Swarnendu Biswas, Mainak Chaudhuri |
| 2024 | Leviathan: A Unified System for General-Purpose Near-Data Computing. Brian C. Schwedock, Nathan Beckmann |
| 2024 | LightWSP: Whole-System Persistence on the Cheap. Yuchen Zhou, Jianping Zeng, Changhee Jung |
| 2024 | LoAS: Fully Temporal-Parallel Dataflow for Dual-Sparse Spiking Neural Networks. Ruokai Yin, Youngeun Kim, Di Wu, Priyadarshini Panda |
| 2024 | Localizing the Tag Comparisons in the Wakeup Logic to Reduce Energy Consumption of the Issue Queue. Kenichiro Mori, Sota Kosugi, Hiroto Yoshida, Hajime Shimada, Hideki Ando |
| 2024 | Looking into the Black Box: Monitoring Computer Architecture Simulations in Real-Time with AkitaRTM. Ali Mosallaei, Katherine E. Isaacs, Yifan Sun |
| 2024 | Low-Overhead General-Purpose Near-Data Processing in CXL Memory Expanders. Hyungkyu Ham, Jeongmin Hong, Geonwoo Park, Yunseon Shin, Okkyun Woo, Wonhyuk Yang, Jinhoon Bae, Eunhyeok Park, Hyojin Sung, Euicheol Lim, Gwangsun Kim |
| 2024 | MINT: Securely Mitigating Rowhammer with a Minimalist in-DRAM Tracker. Moinuddin Qureshi, Salman Qazi, Aamer Jaleel |
| 2024 | MeMCISA: Memristor-Enabled Memory-Centric Instruction-Set Architecture for Database Workloads. Yihang Zhu, Lei Cai, Lianfeng Yu, Anjunyi Fan, Longhao Yan, Zhaokun Jing, Bonan Yan, Pek Jun Tiw, Yuqi Li, Yaoyu Tao, Yuchao Yang |
| 2024 | Memory Allocation Under Hardware Compression. Muhammad Laghari, Yuqing Liu, Gagandeep Panwar, David Bears, Chandler Jearls, Raghavendra Srinivas, Esha Choukse, Kirk W. Cameron, Ali Raza Butt, Xun Jian |
| 2024 | Message from the MICRO 2024 General Chairs: "Hi, How Are you?" - "Jeremiah The Innocent" Mural. Daniel Johnson |
| 2024 | Message from the MICRO 2024 Program Chairs. Daniel A. Jiménez, Alaa R. Alameldeen |
| 2024 | Mosaic: Harnessing the Micro-Architectural Resources of Servers in Serverless Environments. Jovan Stojkovic, Esha Choukse, Enrique Saurez, Íñigo Goiri, Josep Torrellas |
| 2024 | Multi-Issue Butterfly Architecture for Sparse Convex Quadratic Programming. Maolin Wang, Ian McInerney, Bartolomeo Stellato, Fengbin Tu, Stephen P. Boyd, Hayden Kwok-Hay So, Kwang-Ting Cheng |
| 2024 | NeoMem: Hardware/Software Co-Design for CXL-Native Memory Tiering. Zhe Zhou, Yiqi Chen, Tao Zhang, Yang Wang, Ran Shu, Shuotao Xu, Peng Cheng, Lei Qu, Yongqiang Xiong, Jie Zhang, Guangyu Sun |
| 2024 | Over-Synchronization in GPU Programs. Ajay Nayak, Arkaprava Basu |
| 2024 | PIFS-Rec: Process-In-Fabric-Switch for Large-Scale Recommendation System Inferences. Pingyi Huo, Anusha Devulapally, Hasan Al Maruf, Minseo Park, Krishnakumar Nair, Meena Arunachalam, Gulsum Gudukbay Akbulut, Mahmut Taylan Kandemir, Vijaykrishnan Narayanan |
| 2024 | PIM-MMU: A Memory Management Unit for Accelerating Data Transfers in Commercial PIM Systems. Dongjae Lee, Bongjoon Hyun, Taehun Kim, Minsoo Rhu |
| 2024 | PointCIM: A Computing-in-Memory Architecture for Accelerating Deep Point Cloud Analytics. Xuan-Jun Chen, Han-Ping Chen, Chia-Lin Yang |
| 2024 | Polymorphic Error Correction. Evgeny Manzhosov, Simha Sethumadhavan |
| 2024 | Pushing the Performance Envelope of DNN-based Recommendation Systems Inference on GPUs. Rishabh Jain, Vivek M. Bhasi, Adwait Jog, Anand Sivasubramaniam, Mahmut T. Kandemir, Chita R. Das |
| 2024 | PyPIM: Integrating Digital Processing-in-Memory from Microarchitectural Design to Python Tensors. Orian Leitersdorf, Ronny Ronen, Shahar Kvatinsky |
| 2024 | Qoncord: A Multi-Device Job Scheduling Framework for Variational Quantum Algorithms. Meng Wang, Poulami Das, Prashant J. Nair |
| 2024 | RAHP: A Redundancy-aware Accelerator for High-performance Hypergraph Neural Network. Hui Yu, Yu Zhang, Ligang He, Yingqi Zhao, Xintao Li, Ruida Xin, Jin Zhao, Xiaofei Liao, Haikun Liu, Bingsheng He, Hai Jin |
| 2024 | RTL2MμPATH: Multi-μPATH Synthesis with Applications to Hardware Security Verification. Yao Hsiao, Nikos Nikoleris, Artem Khyzha, Dominic P. Mulligan, Gustavo Petri, Christopher W. Fletcher, Caroline Trippel |
| 2024 | Rearchitecting a Neuromorphic Processor for Spike-Driven Brain-Computer Interfacing. Hunjun Lee, Yeongwoo Jang, Daye Jung, Seunghyun Song, Jangwoo Kim |
| 2024 | Ring Road: A Scalable Polar-Coordinate-based 2D Network-on-Chip Architecture. Yinxiao Feng, Wei Li, Kaisheng Ma |
| 2024 | SCALE: A Structure-Centric Accelerator for Message Passing Graph Neural Networks. Lingxiang Yin, Sanjay Gandham, Mingjie Lin, Hao Zheng |
| 2024 | SCAR: Scheduling Multi-Model AI Workloads on Heterogeneous Multi-Chiplet Module Accelerators. Mohanad Odema, Luke Chen, Hyoukjun Kwon, Mohammad Abdullah Al Faruque |
| 2024 | SOFA: A Compute-Memory Optimized Sparsity Accelerator via Cross-Stage Coordinated Tiling. Huizheng Wang, Jiahao Fang, Xinru Tang, Zhiheng Yue, Jinxi Li, Yubin Qin, Sihan Guan, Qinze Yang, Yang Wang, Chao Li, Yang Hu, Shouyi Yin |
| 2024 | SOPHGO BM1684X: A Commercial High Performance Terminal AI Processor with Large Model Support. Peng Gao, Yang Liu, Jun Wang, Wanlin Cai, Guangchong Shen, Zonghui Hong, Jiali Qu, Ning Wang |
| 2024 | SOPHIE: A Scalable Recurrent Ising Machine Using Optically Addressed Phase Change Memory. Guowei Yang, Sina Karimi, Carlos A. Ríos Ocampo, Ayse K. Coskun, Ajay Joshi |
| 2024 | SRender: Boosting Neural Radiance Field Efficiency via Sensitivity-Aware Dynamic Precision Rendering. Zhuoran Song, Houshu He, Fangxin Liu, Yifan Hao, Xinkai Song, Li Jiang, Xiaoyao Liang |
| 2024 | STAR: Sub-Entry Sharing-Aware TLB for Multi-Instance GPU. Bingyao Li, Yueqi Wang, Tianyu Wang, Lieven Eeckhout, Jun Yang, Aamer Jaleel, Xulong Tang |
| 2024 | SUV: Static Analysis Guided Unified Virtual Memory. Pratheek B, Guilherme Cox, Ján Veselý, Arkaprava Basu |
| 2024 | SambaNova SN40L: Scaling the AI Memory Wall with Dataflow and Composition of Experts. Raghu Prabhakar, Ram Sivaramakrishnan, Darshan Gandhi, Yun Du, Mingran Wang, Xiangyu Song, Kejie Zhang, Tianren Gao, Angela Wang, Xiaoyan Li, Yongning Sheng, Joshua Brot, Denis Sokolov, Apurv Vivek, Calvin Leung, Arjun Sabnis, Jiayu Bai, Tuowen Zhao, Mark Gottscho, David Jackson, Mark Luttrell, Manish K. Shah, Zhengyu Chen, Kaizhao Liang, Swayambhoo Jain, Urmish Thakker, Dawei Huang, Sumti Jairath, Kevin J. Brown, Kunle Olukotun |
| 2024 | Scalar Vector Runahead. Jaime Roelandts, Ajeya Naithani, Sam Ainsworth, Timothy M. Jones, Lieven Eeckhout |
| 2024 | Secure Prefetching for Secure Cache Systems. Sumon Nath, Agustín Navarro-Torres, Alberto Ros, Biswabandan Panda |
| 2024 | Self-Managing DRAM: A Low-Cost Framework for Enabling Autonomous and Efficient DRAM Maintenance Operations. Hasan Hassan, Ataberk Olgun, A. Giray Yaglikçi, Haocong Luo, Onur Mutlu |
| 2024 | Sparsepipe: Sparse Inter-operator Dataflow Architecture with Cross-Iteration Reuse. Yunan Zhang, Po-An Tsai, Hung-Wei Tseng |
| 2024 | StarNUMA: Mitigating NUMA Challenges with Memory Pooling. Albert Cho, Alexandros Daglis |
| 2024 | Stellar: An Automated Design Framework for Dense and Sparse Spatial Accelerators. Hasan Nazim Genc, Hansung Kim, Prashanth Ganesh, Yakun Sophia Shao |
| 2024 | Stream-Based Data Placement for Near-Data Processing with Extended Memory. Yiwei Li, Boyu Tian, Yi Ren, Mingyu Gao |
| 2024 | SuperCore: An Ultra-Fast Superconducting Processor for Cryogenic Applications. Junhyuk Choi, Ilkwon Byun, Juwon Hong, Dongmoon Min, Junpyo Kim, Jungmin Cho, Hyeonseong Jeong, Masamitsu Tanaka, Koji Inoue, Jangwoo Kim |
| 2024 | Surf-Deformer: Mitigating Dynamic Defects on Surface Code via Adaptive Deformation. Keyi Yin, Xiang Fang, Travis S. Humble, Ang Li, Yunong Shi, Yufei Ding |
| 2024 | TACOS: Topology-Aware Collective Algorithm Synthesizer for Distributed Machine Learning. William Won, Midhilesh Elavazhagan, Sudarshan Srinivasan, Swati Gupta, Tushar Krishna |
| 2024 | TMiner: A Vertex-Based Task Scheduling Architecture for Graph Pattern Mining. Zerun Li, Xiaoming Chen, Yinhe Han |
| 2024 | Temporarily Unauthorized Stores: Write First, Ask for Permission Later. Juan M. Cebrian, Magnus Jahre, Alberto Ros |
| 2024 | Terminus: A Programmable Accelerator for Read and Update Operations on Sparse Data Structures. Hyun Ryong Lee, Daniel Sánchez |
| 2024 | The Last-Level Branch Predictor. David Schall, Andreas Sandberg, Boris Grot |
| 2024 | The TYR Dataflow Architecture: Improving Locality by Taming Parallelism. Nikhil Agarwal, Mitchell Fream, Souradip Ghosh, Brian C. Schwedock, Nathan Beckmann |
| 2024 | ThreadFuser: A SIMT Analysis Framework for MIMD Programs. Ahmad Alawneh, Ni Kang, Mahmoud Khairy, Timothy G. Rogers |
| 2024 | Timely, Efficient, and Accurate Branch Precomputation. Aniket Deshmukh, Lingzhe Chester Cai, Yale N. Patt |
| 2024 | Trinity: A General Purpose FHE Accelerator. Xianglong Deng, Shengyu Fan, Zhicheng Hu, Zhuoyu Tian, Zihao Yang, Jiangrui Yu, Dingyuan Cao, Dan Meng, Rui Hou, Meng Li, Qian Lou, Mingzhe Zhang |
| 2024 | UFC: A Unified Accelerator for Fully Homomorphic Encryption. Minxuan Zhou, Yujin Nam, Xuan Wang, Youhak Lee, Chris Wilkerson, Raghavan Kumar, Sachin Taneja, Sanu Mathew, Rosario Cammarota, Tajana Rosing |
| 2024 | Uncovering Real GPU NoC Characteristics: Implications on Interconnect Architecture. Zhixian Jin, Christopher Rocca, Jiho Kim, Hans Kasan, Minsoo Rhu, Ali Bakhoda, Tor M. Aamodt, John Kim |
| 2024 | Unleashing CPU Potential for Executing GPU Programs Through Compiler/Runtime Optimizations. Ruobing Han, Jisheng Zhao, Hyesoon Kim |
| 2024 | VGA: Hardware Accelerator for Scalable Long Sequence Model Inference. Seung Yul Lee, Hyunseung Lee, Jihoon Hong, SangLyul Cho, Jae W. Lee |
| 2024 | Veiled Pathways: Investigating Covert and Side Channels Within GPU Uncore. Yuanqing Miao, Yingtian Zhang, Dinghao Wu, Danfeng Zhang, Gang Tan, Rui Zhang, Mahmut Taylan Kandemir |
| 2024 | Weeding out Front-End Stalls with Uneven Block Size Instruction Cache. Roman Brunner, Rakesh Kumar |
| 2024 | vTrain: A Simulation Framework for Evaluating Cost-Effective and Compute-Optimal Large Language Model Training. Jehyeon Bang, Yujeong Choi, Myeongwoo Kim, Yongdeok Kim, Minsoo Rhu |