| 2022 | A GPU Multiversion B-Tree. Muhammad A. Awad, Serban D. Porumbescu, John D. Owens |
| 2022 | A Specialized BTB Organization for Servers. Truls Asheim, Boris Grot, Rakesh Kumar |
| 2022 | A Thermal-Aware Data Replica Placement Strategy for Data-Intensive Data Centers. Jie Li, Yuhui Deng, Zhaorui Wu, Shujie Pang |
| 2022 | An Architecture for Resilient Federated Learning through Parallel Recognition. Jeongeun Kim, Young Woo Jeong, Su-Yeon Jang, Seung Eun Lee |
| 2022 | Analysing Dataflow Programs with Causation Traces. Michail Boulasikis, Flavius Gruian, Gareth Callanan, Jörn W. Janneck |
| 2022 | Athena: An Early-Fetch Architecture to Reduce on-Chip Page Walk Latencies. Seyed Armin Vakil-Ghahani, Soheil Khadirsharbiyani, Jagadish B. Kotra, Mahmut T. Kandemir |
| 2022 | Auto-Partitioning Heterogeneous Task-Parallel Programs with StreamBlocks. Mahyar Emami, Endri Bezati, Jörn W. Janneck, James R. Larus |
| 2022 | Batched Graph Community Detection on GPUs. Han-Yi Chou, Sayan Ghosh |
| 2022 | BenchPress: A Deep Active Benchmark Generator. Foivos Tsimpourlas, Pavlos Petoumenos, Min Xu, Chris Cummins, Kim M. Hazelwood, Ajitha Rajan, Hugh Leather |
| 2022 | Breaking the Vendor Lock: Performance Portable Programming through OpenMP as Target Independent Runtime Layer. Johannes Doerfert, Marc Jasper, Joseph Huber, Khaled Abdelaal, Giorgis Georgakoudis, Thomas Scogland, Konstantinos Parasyris |
| 2022 | Collage: Seamless Integration of Deep Learning Backends with Automatic Placement. Byungsoo Jeon, Sunghyun Park, Peiyuan Liao, Sheng Xu, Tianqi Chen, Zhihao Jia |
| 2022 | Com-CAS: Effective Cache Apportioning under Compiler Guidance. Bodhisatwa Chatterjee, Sharjeel Khan, Santosh Pande |
| 2022 | Combining Run-Time Checks and Compile-Time Analysis to Improve Control Flow Auto-Vectorization. Bangtian Liu, Avery Laird, Wai Hung Tsang, Bardia Mahjour, Maryam Mehri Dehnavi |
| 2022 | Custom High-Performance Vector Code Generation for Data-Specific Sparse Computations. Marcos Horro, Louis-Noël Pouchet, Gabriel Rodríguez, Juan Touriño |
| 2022 | DSDP: Dual Stream Data Prefetcher. Mingjian He, Hua Wang, Ke Zhou, Kaichao Cui, Huabing Yan, Chang Guo, Rongfeng He |
| 2022 | Decoupling Schedule, Topology Layout, and Algorithm to Easily Enlarge the Tuning Space of GPU Graph Processing. Shinnung Jeong, Yongwoo Lee, Jaeho Lee, Heelim Choi, Seungbin Song, Jinho Lee, Youngsok Kim, Hanjun Kim |
| 2022 | Effective Performance Modeling and Domain-Specific Compiler Optimization of CNNs for GPUs. Yufan Xu, Qiwei Yuan, Erik Curtis Barton, Rui Li, P. Sadayappan, Aravind Sukumaran-Rajam |
| 2022 | Efficient Atomic Durability on eADR-Enabled Persistent Memory. Taiyu Zhou, Yajuan Du, Fan Yang, Xiaojian Liao, Youyou Lu |
| 2022 | Efficient Task-Mapping of Parallel Applications Using a Space-Filling Curve. Oh-Kyoung Kwon, Ji Hoon Kang, Seungchul Lee, Wonjung Kim, Junehwa Song |
| 2022 | FlatPack: Flexible Compaction of Compressed Memory. Albin Eldstål-Ahrens, Angelos Arelakis, Ioannis Sourdis |
| 2022 | FlexPointer: Fast Address Translation Based on Range TLB and Tagged Pointers. Dongwei Chen, Dong Tong, Chun Yang, Jiangfang Yi, Xu Cheng |
| 2022 | GNNear: Accelerating Full-Batch Training of Graph Neural Networks with near-Memory Processing. Zhe Zhou, Cong Li, Xuechao Wei, Xiaoyang Wang, Guangyu Sun |
| 2022 | GPU Adaptive In-situ Parallel Analytics (GAP). Haoyuan Xing, Gagan Agrawal, Rajiv Ramnath |
| 2022 | GPUPool: A Holistic Approach to Fine-Grained GPU Sharing in the Cloud. Xiaodan Serina Tan, Pavel Golikov, Nandita Vijaykumar, Gennady Pekhimenko |
| 2022 | HBMax: Optimizing Memory Efficiency for Parallel Influence Maximization on Multicore Architectures. Xinyu Chen, Marco Minutoli, Jiannan Tian, Mahantesh Halappanavar, Ananth Kalyanaraman, Dingwen Tao |
| 2022 | High-Performance Architecture Aware Sparse Convolutional Neural Networks for GPUs. Lizhi Xiang, P. Sadayappan, Aravind Sukumaran-Rajam |
| 2022 | Improving Convolution via Cache Hierarchy Tiling and Reduced Packing. Victor Ferrari, Rafael C. F. Sousa, Márcio Machado Pereira, João P. L. de Carvalho, José Nelson Amaral, Guido Araujo |
| 2022 | Locality-Aware Optimizations for Improving Remote Memory Latency in Multi-GPU Systems. Leul Belayneh, Haojie Ye, Kuan-Yu Chen, David T. Blaauw, Trevor N. Mudge, Ronald G. Dreslinski, Nishil Talati |
| 2022 | MLIR Loop Optimizations for High-Level Synthesis: A Case Study. Serena Curzel, Sofija Jovic, Michele Fiorito, Antonino Tumeo, Fabrizio Ferrandi |
| 2022 | Massively Parallel Open Modification Spectral Library Searching with Hyperdimensional Computing. Jaeyoung Kang, Weihong Xu, Wout Bittremieux, Tajana Rosing |
| 2022 | NaviSim: A Highly Accurate GPU Simulator for AMD RDNA GPUs. Yuhui Bao, Yifan Sun, Zlatan Feric, Michael Tian Shen, Micah Weston, José L. Abellán, Trinayan Baruah, John Kim, Ajay Joshi, David R. Kaeli |
| 2022 | Optimizing Aggregate Computation of Graph Neural Networks with on-GPU Interpreter-Style Programming. Zhuoran Ji, Cho-Li Wang |
| 2022 | Optimizing Regular Expressions via Rewrite-Guided Synthesis. Jedidiah McClurg, Miles Claver, Jackson Garner, Jake Vossen, Jordan Schmerge, Mehmet E. Belviranli |
| 2022 | Parallelizing Neural Network Models Effectively on GPU by Implementing Reductions Atomically. Jie Zhao, Cédric Bastoul, Yanzhi Yi, Jiahui Hu, Wang Nie, Renwei Zhang, Zhen Geng, Chong Li, Thibaut Tachon, Zhiliang Gan |
| 2022 | Pavise: Integrating Fault Tolerance Support for Persistent Memory Applications. Han Jie Qiu, Sihang Liu, Xinyang Song, Samira Manabi Khan, Gennady Pekhimenko |
| 2022 | Probing the Efficacy of Hardware-Aware Weight Pruning to Optimize the SpMM Routine on Ampere GPUs. Roberto L. Castro, Diego Andrade, Basilio B. Fraguela |
| 2022 | Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, PACT 2022, Chicago, Illinois, October 8-12, 2022. Andreas Klöckner, José Moreira |
| 2022 | Q-gym: An Equality Saturation Framework for DNN Inference Exploiting Weight Repetition. Cheng Fu, Hanxian Huang, Bram Wasti, Chris Cummins, Riyadh Baghdadi, Kim M. Hazelwood, Yuandong Tian, Jishen Zhao, Hugh Leather |
| 2022 | ReACT: Redundancy-Aware Code Generation for Tensor Expressions. Tong Zhou, Ruiqin Tian, Rizwan A. Ashraf, Roberto Gioiosa, Gokcen Kestor, Vivek Sarkar |
| 2022 | SampleMine: A Framework for Applying Random Sampling to Subgraph Pattern Mining through Loop Perforation. Peng Jiang, Yihua Wei, Jiya Su, Rujia Wang, Bo Wu |
| 2022 | Slice-and-Forge: Making Better Use of Caches for Graph Convolutional Network Accelerators. Mingi Yoo, Jaeyong Song, Hyeyoon Lee, Jounghoo Lee, Namhyung Kim, Youngsok Kim, Jinho Lee |
| 2022 | Squaring the circle: Executing Sparse Matrix Computations on FlexTPU - A TPU-Like Processor. Xin He, Kuan-Yu Chen, Siying Feng, Hun-Seok Kim, David T. Blaauw, Ronald G. Dreslinski, Trevor N. Mudge |
| 2022 | T-GCN: A Sampling Based Streaming Graph Neural Network System with Hybrid Architecture. Chengying Huan, Shuaiwen Leon Song, Yongchao Liu, Heng Zhang, Hang Liu, Charles He, Kang Chen, Jinlei Jiang, Yongwei Wu |
| 2022 | Tiered Hashing: Revamping Hash Indexing under a Unified Memory-Storage Hierarchy. Jian Zhou, Jianfeng Wu, Weizhou Huang, You Zhou, Fei Wu, Liu Shi, Xiaoyi Zhang, Kun Wang, Feng Zhu, Shu Li |
| 2022 | Towards Supporting Semiring in MLIR-Based COMET Compiler. Luanzheng Guo, Rizwan A. Ashraf, Ryan D. Friese, Gokcen Kestor |
| 2022 | Transfer-Tuning: Reusing Auto-Schedules for Efficient Tensor Program Code Generation. Perry Gibson, José Cano |
| 2022 | UPIR: Toward the Design of Unified Parallel Intermediate Representation for Parallel Programming Models. Anjia Wang, Xinyao Yi, Yonghong Yan |
| 2022 | Understanding and Reaching the Performance Limit of Schedule Tuning on Stable Synchronization Determinism. Qi Zhao, Zhengyi Qiu, Shudi Shao, Xinning Hui, Hassan Ali Khan, Guoliang Jin |
| 2022 | VoxelCache: Accelerating Online Mapping in Robotics and 3D Reconstruction Tasks. Sankeerth Durvasula, Raymond Kiguru, Samarth Mathur, Jenny Xu, Jimmy Lin, Nandita Vijaykumar |
| 2022 | Weightless Neural Networks for Efficient Edge Inference. Zachary Susskind, Aman Arora, Igor D. S. Miranda, Luis Armando Quintanilla Villon, Rafael Fontella Katopodis, Leandro Santiago de Araújo, Diego Leonel Cadette Dutra, Priscila M. V. Lima, Felipe M. G. França, Maurício Breternitz, Lizy K. John |
| 2022 | mu-grind: A Framework for Dynamically Instrumenting HLS-Generated RTL. Parmida Vahdatniya, Amirali Sharifian, Reza Hojabr, Arrvindh Shriraman |