PACT B

26 papers

Year Title / Authors
2021 30th International Conference on Parallel Architectures and Compilation Techniques, PACT 2021, Atlanta, GA, USA, September 26-29, 2021
Jaejin Lee, Albert Cohen
2021 A Flexible Approach to Autotuning Multi-Pass Machine Learning Compilers.
Phitchaya Mangpo Phothilimthana, Amit Sabne, Nikhil Sarda, Karthik Srinivasa Murthy, Yanqi Zhou, Christof Angermueller, Mike Burrows, Sudip Roy, Ketan Mandke, Rezsa Farahani, Yu Emma Wang, Berkin Ilbeyi, Blake A. Hechtman, Bjarke Roune, Shen Wang, Yuanzhong Xu, Samuel J. Kaufman
2021 AIBench Scenario: Scenario-Distilling AI Benchmarking.
Wanling Gao, Fei Tang, Jianfeng Zhan, Xu Wen, Lei Wang, Zheng Cao, Chuanxin Lan, Chunjie Luo, Xiaoli Liu, Zihan Jiang
2021 Accelerating Fourier and Number Theoretic Transforms using Tensor Cores and Warp Shuffles.
Sultan Durrani, Muhammad Saad Chughtai, Mert Hidayetoglu, Rashid Tahir, Abdul Dakkak, Lawrence Rauchwerger, Fareed Zaffar, Wen-mei W. Hwu
2021 CBP: Coordinated management of cache partitioning, bandwidth partitioning and prefetch throttling.
Nadja Ramhöj Holtryd, Madhavan Manivannan, Per Stenström, Miquel Pericàs
2021 CoPlace: Effectively Mitigating Cache Conflicts in Modern Clouds.
Xiaowei Shang, Weiwei Jia, Jianchen Shan, Xiaoning Ding
2021 Dryadic: Flexible and Fast Graph Pattern Matching at Scale.
Daniel Mawhirter, Sam Reinehr, Wei Han, Noah Fields, Miles Claver, Connor Holmes, Jedidiah McClurg, Tongping Liu, Bo Wu
2021 Google Neural Network Models for Edge Devices: Analyzing and Mitigating Machine Learning Inference Bottlenecks.
Amirali Boroumand, Saugata Ghose, Berkin Akin, Ravi Narayanaswami, Geraldo F. Oliveira, Xiaoyu Ma, Eric Shiu, Onur Mutlu
2021 HERTI: A Reinforcement Learning-Augmented System for Efficient Real-Time Inference on Heterogeneous Embedded Systems.
Myeonggyun Han, Woongki Baek
2021 InnerSP: A Memory Efficient Sparse Matrix Multiplication Accelerator with Locality-Aware Inner Product Processing.
Daehyeon Baek, Soojin Hwang, Taekyung Heo, Daehoon Kim, Jaehyuk Huh
2021 Invalidate or Update? Revisiting Coherence for Tomorrow's Cache Hierarchies.
Mingcan Zhu, Amna Shahab, Antonios Katsarakis, Boris Grot
2021 NLP-Fast: A Fast, Scalable, and Flexible System to Accelerate Large-Scale Heterogeneous NLP Models.
Joonsung Kim, Suyeon Hur, Eunbok Lee, Seungho Lee, Jangwoo Kim
2021 PIM-DL: Boosting DNN Inference on Digital Processing In-Memory Architectures via Data Layout Optimizations.
Minxuan Zhou, Guoyang Chen, Mohsen Imani, Saransh Gupta, Weifeng Zhang, Tajana Rosing
2021 PolyGym: Polyhedral Optimizations as an Environment for Reinforcement Learning.
Alexander Brauckmann, Andrés Goens, Jerónimo Castrillón
2021 Polygeist: Raising C to Polyhedral MLIR.
William S. Moses, Lorenzo Chelini, Ruizhe Zhao, Oleksandr Zinenko
2021 Precision Batching: Bitserial Decomposition for Efficient Neural Network Inference on GPUs.
Maximilian Lam, Zachary Yedidia, Colby R. Banbury, Vijay Janapa Reddi
2021 Program Lifting using Gray-Box Behavior.
Bruce Collie, Michael F. P. O'Boyle
2021 SEER: A Time Prediction Model for CNNs from GPU Kernel's View.
Guodong Liu, Sa Wang, Yungang Bao
2021 SURFNet: Super-Resolution of Turbulent Flows with Transfer Learning using Small Datasets.
Octavi Obiols-Sales, Abhinav Vishnu, Nicholas Malaya, Aparna Chandramowlishwaran
2021 Skywalker: Efficient Alias-Method-Based Graph Sampling and Random Walk on GPUs.
Pengyu Wang, Chao Li, Jing Wang, Taolei Wang, Lu Zhang, Jingwen Leng, Quan Chen, Minyi Guo
2021 SumPA: Efficient Pattern-Centric Graph Mining with Pattern Abstraction.
Chuangyi Gui, Xiaofei Liao, Long Zheng, Pengcheng Yao, Qinggang Wang, Hai Jin
2021 Ultra Efficient Acceleration for De Novo Genome Assembly via Near-Memory Computing.
Minxuan Zhou, Lingxi Wu, Muzhou Li, Niema Moshiri, Kevin Skadron, Tajana Rosing
2021 Union: A Unified HW-SW Co-Design Ecosystem in MLIR for Evaluating Tensor Operations on Spatial Accelerators.
Geonhwa Jeong, Gokcen Kestor, Prasanth Chatarasi, Angshuman Parashar, Po-An Tsai, Sivasankaran Rajamanickam, Roberto Gioiosa, Tushar Krishna
2021 Write Prediction for Persistent Memory Systems.
Suyash Mahar, Sihang Liu, Korakit Seemakhupt, Vinson Young, Samira Manabat Khan
2021 X-Layer: Building Composable Pipelined Dataflows for Low-Rank Convolutions.
Naveen Vedula, Reza Hojabr, Ahmad Khonsari, Arrvindh Shriraman
2021 nuKSM: NUMA-aware Memory De-duplication on Multi-socket Servers.
Akash Panda, Ashish Panwar, Arkaprava Basu