HPCA A*

82 papers

YearTitle / Authors
2024A Quantum Computer Trusted Execution Environment.
Theodoros Trochatos, Chuanqi Xu, Sanjay Deshpande, Yao Lu, Yongshan Ding, Jakub Szefer
2024A Two Level Neural Approach Combining Off-Chip Prediction with Adaptive Prefetch Filtering.
Alexandre Valentin Jamet, Georgios Vavouliotis, Daniel A. Jiménez, Lluc Alvarez, Marc Casas
2024ASADI: Accelerating Sparse Attention Using Diagonal-based In-Situ Computing.
Huize Li, Zhaoying Li, Zhenyu Bai, Tulika Mitra
2024Agile-DRAM: Agile Trade-Offs in Memory Capacity, Latency, and Energy for Data Centers.
Jaeyoon Lee, Wonyeong Jung, Dongwhee Kim, Daero Kim, Junseung Lee, Jungrae Kim
2024An LPDDR-based CXL-PNM Platform for TCO-efficient Inference of Transformer-based Large Language Models.
Sangsoo Park, Kyungsoo Kim, Jinin So, Jin Jung, Jonggeon Lee, Kyoungwan Woo, Nayeon Kim, Younghyun Lee, Hyungyo Kim, Yongsuk Kwon, JinHyun Kim, Jieun Lee, Yeongon Cho, Yongmin Tai, Jeonghyeon Cho, Hoyoung Song, Jung Ho Ahn, Nam Sung Kim
2024An Optimizing Framework on MLIR for Efficient FPGA-based Accelerator Generation.
Weichuang Zhang, Jieru Zhao, Guan Shen, Quan Chen, Chen Chen, Minyi Guo
2024Are Superpages Super-fast? Distilling Flash Blocks to Unify Flash Pages of a Superpage in an SSD.
Shih-Hung Tseng, Tseng-Yi Chen, Ming-Chang Yang
2024Bandwidth-Effective DRAM Cache for GPUs with Storage-Class Memory.
Jeongmin Hong, Sungjun Cho, Geonwoo Park, Wonhyuk Yang, Young-Ho Gong, Gwangsun Kim
2024BeaconGNN: Large-Scale GNN Acceleration with Out-of-Order Streaming In-Storage Computing.
Yuyue Wang, Xiurui Pan, Yuda An, Jie Zhang, Glenn Reinman
2024BitWave: Exploiting Column-Based Bit-Level Sparsity for Deep Learning Acceleration.
Man Shi, Vikram Jain, Antony Joseph, Maurice Meijer, Marian Verhelst
2024CAMEL: Co-Designing AI Models and eDRAMs for Efficient On-Device Learning.
Sai Qian Zhang, Thierry Tambe, Nestor Cuevas, Gu-Yeon Wei, David Brooks
2024CHROME: Concurrency-Aware Holistic Cache Management Framework with Online Reinforcement Learning.
Xiaoyang Lu, Hamed Najafi, Jason Liu, Xian-He Sun
2024Celeritas: Out-of-Core Based Unsupervised Graph Neural Network via Cross-Layer Computing.
Yi Li, Tsun-Yu Yang, Ming-Chang Yang, Zhaoyan Shen, Bingzhe Li
2024Cepheus: Accelerating Datacenter Applications with High-Performance RoCE-Capable Multicast.
Wenxue Li, Junyi Zhang, Yufei Liu, Gaoxiong Zeng, Zilong Wang, Chaoliang Zeng, Pengpeng Zhou, Qiaoling Wang, Kai Chen
2024CoMeT: Count-Min-Sketch-based Row Tracking to Mitigate RowHammer at Low Cost.
F. Nisa Bostanci, Ismail Emir Yüksel, Ataberk Olgun, Konstantinos Kanellopoulos, Yahya Can Tugrul, A. Giray Yaglikçi, Mohammad Sadrosadati, Onur Mutlu
2024Computational CXL-Memory Solution for Accelerating Memory-Intensive Applications.
Joonseop Sim, Soohong Ahn, Taeyoung Ahn, Seungyong Lee, Myunghyun Rhee, Jooyoung Kim, Kwangsik Shin, Donguk Moon, Euiseok Kim, Kyoung Park
2024Data Enclave: A Data-Centric Trusted Execution Environment.
Yuanchao Xu, James Pangia, Chencheng Ye, Yan Solihin, Xipeng Shen
2024Data Motion Acceleration: Chaining Cross-Domain Multi Accelerators.
Shu-Ting Wang, Hanyang Xu, Amin Mamandipoor, Rohan Mahapatra, Byung Hoon Ahn, Soroush Ghodrati, Krishnan Kailas, Mohammad Alian, Hadi Esmaeilzadeh
2024Differential-Matching Prefetcher for Indirect Memory Access.
Gelin Fu, Tian Xia, Zhongpei Luo, Ruiyang Chen, Wenzhe Zhao, Pengju Ren
2024DockerSSD: Containerized In-Storage Processing and Hardware Acceleration for Computational SSDs.
Donghyun Gouk, Miryeong Kwon, Hanyeoreum Bae, Myoungsoo Jung
2024E2EMap: End-to-End Reinforcement Learning for CGRA Compilation via Reverse Mapping.
Dajiang Liu, Yuxin Xia, Jiaxing Shang, Jiang Zhong, Peng Ouyang, Shouyi Yin
2024ECO-CHIP: Estimation of Carbon Footprint of Chiplet-based Architectures for Sustainable VLSI.
Chetan Choppali Sudarshan, Nikhil Matkar, Sarma B. K. Vrudhula, Sachin S. Sapatnekar, Vidya A. Chhabria
2024Effective Context-Sensitive Memory Dependence Prediction.
Sebastian S. Kim, Alberto Ros
2024Enabling Large Dynamic Neural Network Training with Learning-based Memory Management.
Jie Ren, Dong Xu, Shuangyan Yang, Jiacheng Zhao, Zhicheng Li, Christian Navasca, Chenxi Wang, Guoqing Harry Xu, Dong Li
2024Enhancing Collective Communication in MCM Accelerators for Deep Learning Training.
Sabuj Laskar, Pranati Majhi, Sungkeun Kim, Farabi Mahmud, Abdullah Muzahid, Eun Jung Kim
2024Enterprise-Class Cache Compression Design.
Alper Buyuktosunoglu, David Trilla, Bülent Abali, Deanna Postles Dunn Berger, Craig R. Walters, Jang-Soo Lee
2024Exploitation of Security Vulnerability on Retirement.
Ke Xu, Ming Tang, Quancheng Wang, Han Wang
2024FIGNA: Integer Unit-Based Accelerator Design for FP-INT GEMM Preserving Numerical Accuracy.
Jaeyong Jang, Yulhwa Kim, Juheun Lee, Jae-Joon Kim
2024FlashGNN: An In-SSD Accelerator for GNN Training.
Fuping Niu, Jianhui Yue, Jiangqiu Shen, Xiaofei Liao, Hai Jin
2024FlipBit: Approximate Flash Memory for IoT Devices.
Alexander Buck, Karthik Ganesan, Natalie Enright Jerger
2024Functionally-Complete Boolean Logic in Real DRAM Chips: Experimental Characterization and Analysis.
Ismail Emir Yüksel, Yahya Can Tugrul, Ataberk Olgun, F. Nisa Bostanci, Abdullah Giray Yaglikçi, Geraldo F. Oliveira, Haocong Luo, Juan Gómez-Luna, Mohammad Sadrosadati, Onur Mutlu
2024GADGETSPINNER: A New Transient Execution Primitive Using the Loop Stream Detector.
Yun Chen, Ali Hajiabadi, Trevor E. Carlson
2024GPU Scale-Model Simulation.
Hossein SeyyedAghaei, Mahmood Naderan-Tahan, Lieven Eeckhout
2024GRIT: Enhancing Multi-GPU Performance with Fine-Grained Dynamic Page Placement.
Yueqi Wang, Bingyao Li, Aamer Jaleel, Jun Yang, Xulong Tang
2024Gem5-MARVEL: Microarchitecture-Level Resilience Analysis of Heterogeneous SoC Architectures.
Odysseas Chatzopoulos, George Papadimitriou, Vasileios Karakostas, Dimitris Gizopoulos
2024Gemini: Mapping and Architecture Co-exploration for Large-scale DNN Chiplet Accelerators.
Jingwei Cai, Zuotong Wu, Sen Peng, Yuchen Wei, Zhanhong Tan, Guiming Shi, Mingyu Gao, Kaisheng Ma
2024Guser: A GPGPU Power Stressmark Generator.
Yalong Shan, Yongkui Yang, Xuehai Qian, Zhibin Yu
2024HotTiles: Accelerating SpMM with Heterogeneous Accelerator Architectures.
Gerasimos Gerogiannis, Sriram Aananthakrishnan, Josep Torrellas, Ibrahim Hur
2024IEEE International Symposium on High-Performance Computer Architecture, HPCA 2024, Edinburgh, United Kingdom, March 2-6, 2024
2024LUTein: Dense-Sparse Bit-Slice Architecture With Radix-4 LUT-Based Slice-Tensor Processing Units.
Dongseok Im, Hoi-Jun Yoo
2024LearnedFTL: A Learning-Based Page-Level FTL for Reducing Double Reads in Flash-Based SSDs.
Shengzhe Wang, Zihang Lin, Suzhen Wu, Hong Jiang, Jie Zhang, Bo Mao
2024LibPreemptible: Enabling Fast, Adaptive, and Hardware-Assisted User-Space Scheduling.
Yueying Li, Nikita Lazarev, David Koufaty, Tenny Yin, Andy Anderson, Zhiru Zhang, G. Edward Suh, Kostis Kaffes, Christina Delimitrou
2024LightPool: A NVMe-oF-based High-performance and Lightweight Storage Pool Architecture for Cloud-Native Distributed Database.
Jiexiong Xu, Yiquan Chen, Yijing Wang, Wenhui Shi, Guoju Fang, Yi Chen, Huasheng Liao, Yang Wang, Hai Lin, Zhen Jin, Qiang Liu, Wenzhi Chen
2024Lightening-Transformer: A Dynamically-Operated Optically-Interconnected Photonic Transformer Accelerator.
Hanqing Zhu, Jiaqi Gu, Hanrui Wang, Zixuan Jiang, Zhekai Zhang, Rongxing Tang, Chenghao Feng, Song Han, Ray T. Chen, David Z. Pan
2024MEGA: A Memory-Efficient GNN Accelerator Exploiting Degree-Aware Mixed-Precision Quantization.
Zeyu Zhu, Fanrong Li, Gang Li, Zejian Liu, Zitao Mo, Qinghao Hu, Xiaoyao Liang, Jian Cheng
2024MIMDRAM: An End-to-End Processing-Using-DRAM System for High-Throughput, Energy-Efficient and Programmer-Transparent Multiple-Instruction Multiple-Data Computing.
Geraldo F. Oliveira, Ataberk Olgun, Abdullah Giray Yaglikçi, F. Nisa Bostanci, Juan Gómez-Luna, Saugata Ghose, Onur Mutlu
2024MINOS: Distributed Consistency and Persistency Protocol Implementation & Offloading to SmartNICs.
Antonis Psistakis, Fabien Chaix, Josep Torrellas
2024MIRAGE: Quantum Circuit Decomposition and Routing Collaborative Design Using Mirror Gates.
Evan McKinney, Michael Hatridge, Alex K. Jones
2024MOPED: Efficient Motion Planning Engine with Flexible Dimension Support.
Lingyi Huang, Yu Gong, Yang Sui, Xiao Zang, Bo Yuan
2024Midas Touch: Invalid-Data Assisted Reliability and Performance Boost for 3D High-Density Flash.
Qiao Li, Hongyang Dang, Zheng Wan, Congming Gao, Min Ye, Jie Zhang, Tei-Wei Kuo, Chun Jason Xue
2024Mitigating Write Disturbance in Non-Volatile Memory via Coupling Machine Learning with Out-of-Place Updates.
Ronglong Wu, Zhirong Shen, Zhiwei Yang, Jiwu Shu
2024Modeling, Derivation, and Automated Analysis of Branch Predictor Security Vulnerabilities.
Quancheng Wang, Ming Tang, Ke Xu, Han Wang
2024Morphling: A Throughput-Maximized TFHE-based Accelerator using Transform-domain Reuse.
Prasetiyo, Adiwena Putra, Joo-Young Kim
2024PREFETCHX: Cross-Core Cache-Agnostic Prefetcher-based Side-Channel Attacks.
Yun Chen, Ali Hajiabadi, Lingfeng Pei, Trevor E. Carlson
2024Pathfinding Future PIM Architectures by Demystifying a Commercial PIM Technology.
Bongjoon Hyun, Taehun Kim, Dongjae Lee, Minsoo Rhu
2024Prosper: Program Stack Persistence in Hybrid Memory Systems.
K. P. Arun, Debadatta Mishra, Biswabandan Panda
2024PruneGNN: Algorithm-Architecture Pruning Framework for Graph Neural Network Acceleration.
Deniz Gurevin, Mohsin Shan, Shaoyi Huang, Md Amit Hasan, Caiwen Ding, Omer Khan
2024RELIEF: Relieving Memory Pressure In SoCs Via Data Movement-Aware Accelerator Scheduling.
Sudhanshu Gupta, Sandhya Dwarkadas
2024Rapper: A Parameter-Aware Repair-in-Memory Accelerator for Blockchain Storage Platform.
Chenlin Ma, Yingping Wang, Fuwen Chen, Jing Liao, Yi Wang, Rui Mao
2024Revet: A Language and Compiler for Dataflow Threads.
Alexander C. Rucker, Shiv Sundram, Coleman Smith, Matthew Vilim, Raghu Prabhakar, Fredrik Kjølstad, Kunle Olukotun
2024RiF: Improving Read Performance of Modern SSDs Using an On-Die Early-Retry Engine.
Myoungjun Chun, Jaeyong Lee, Myungsuk Kim, Jisung Park, Jihong Kim
2024SACHI: A Stationarity-Aware, All-Digital, Near-Memory, Ising Architecture.
Siddhartha Raman Sundara Raman, Lizy K. John, Jaydeep P. Kulkarni
2024SPADE: Sparse Pillar-based 3D Object Detection Accelerator for Autonomous Driving.
Minjae Lee, Seongmin Park, Hyungmin Kim, Minyong Yoon, Janghwan Lee, Jun Won Choi, Nam Sung Kim, Mingu Kang, Jungwook Choi
2024SPARK: Scalable and Precision-Aware Acceleration of Neural Networks via Efficient Encoding.
Fangxin Liu, Ning Yang, Haomin Li, Zongwu Wang, Zhuoran Song, Songwen Pei, Li Jiang
2024START: Scalable Tracking for any Rowhammer Threshold.
Anish Saxena, Moinuddin K. Qureshi
2024Salus: Efficient Security Support for CXL-Expanded GPU Memory.
Rahaf Abdullah, Hyokeun Lee, Huiyang Zhou, Amro Awad
2024SegScope: Probing Fine-grained Interrupts via Architectural Footprints.
Xin Zhang, Zhi Zhang, Qingni Shen, Wenhao Wang, Yansong Gao, Zhuoxi Yang, Jiliang Zhang
2024Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System.
Hongsun Jang, Jaeyong Song, Jaewon Jung, Jaeyoung Park, Youngsok Kim, Jinho Lee
2024SmartDIMM: In-Memory Acceleration of Upper Layer Protocols.
Neel Patel, Amin Mamandipoor, Mohammad Nouri, Mohammad Alian
2024Spatial Variation-Aware Read Disturbance Defenses: Experimental Analysis of Real DRAM Chips and Implications on Future Solutions.
Abdullah Giray Yaglikçi, Yahya Can Tugrul, Geraldo F. Oliveira, Ismail Emir Yüksel, Ataberk Olgun, Haocong Luo, Onur Mutlu
2024SpecFL: An Efficient Speculative Federated Learning System for Tree-based Model Training.
Yuhui Zhang, Lutan Zhao, Cheng Che, Xiaofeng Wang, Dan Meng, Rui Hou
2024Stellar: Energy-Efficient and Low-Latency SNN Algorithm and Hardware Co-Design with Spatiotemporal Computation.
Ruixin Mao, Lin Tang, Xingyu Yuan, Ye Liu, Jun Zhou
2024StreamPIM: Streaming Matrix Computation in Racetrack Memory.
Yuda An, Yunxiao Tang, Shushu Yi, Li Peng, Xiurui Pan, Guangyu Sun, Zhaochu Luo, Qiao Li, Jie Zhang
2024Supporting Secure Multi-GPU Computing with Dynamic and Batched Metadata Management.
Seonjin Na, Jungwoo Kim, Sunho Lee, Jaehyuk Huh
2024TALCO: Tiling Genome Sequence Alignment Using Convergence of Traceback Pointers.
Sumit Walia, Cheng Ye, Arkid Bera, Dhruvi Lodhavia, Yatish Turakhia
2024Tessel: Boosting Distributed Execution of Large DNN Models via Flexible Schedule Search.
Zhiqi Lin, Youshan Miao, Guanbin Xu, Cheng Li, Olli Saarikivi, Saeed Maleki, Fan Yang
2024TinyTS: Memory-Efficient TinyML Model Compiler Framework on Microcontrollers.
Yu-Yuan Liu, Hong-Sheng Zheng, Yu Fang Hu, Chen-Fong Hsu, Tsung Tai Yeh
2024Uncovering and Exploiting AMD Speculative Memory Access Predictors for Fun and Profit.
Chang Liu, Dongsheng Wang, Yongqiang Lyu, Pengfei Qiu, Yu Jin, Zhuoyuan Lu, Yinqian Zhang, Gang Qu
2024Unleashing the Potential of PIM: Accelerating Large Batched Inference of Transformer-Based Generative Models.
Jaewan Choi, Jaehyun Park, Kwanhee Kyung, Nam Sung Kim, Jung Ho Ahn
2024Ursa: Lightweight Resource Management for Cloud-Native Microservices.
Yanqi Zhang, Zhuangzhuang Zhou, Sameh Elnikety, Christina Delimitrou
2024Usas: A Sustainable Continuous-Learning Framework for Edge Servers.
Cyan Subhra Mishra, Jack Sampson, Mahmut Taylan Kandemir, Vijaykrishnan Narayanan, Chita R. Das
2024WASP: Exploiting GPU Pipeline Parallelism with Hardware-Accelerated Automatic Warp Specialization.
Neal Clayton Crago, Sana Damani, Karthikeyan Sankaralingam, Stephen W. Keckler