| 2021 | 33rd IEEE International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2021, Belo Horizonte, Brazil, October 26-29, 2021 |
| 2021 | A Low-Power Hardware Accelerator for ORB Feature Extraction in Self-Driving Cars. Raúl Taranco, José-María Arnau, Antonio González |
| 2021 | A Task-based Execution Engine for Distributed Operating Systems Tailored to Lightweight Manycores with Limited On-Chip Memory. João Vicente Souto, Márcio Castro, Pedro Henrique Penna |
| 2021 | DACHash: A Dynamic, Cache-Aware and Concurrent Hash Table on GPUs. Hao Zhou, David Troendle, Byunghyun Jang |
| 2021 | Design and evaluation of associative processing kernels. Jonathas Silveira, Lucas Wanner |
| 2021 | Efficient Online 4D Magnetic Resonance Imaging. Marco Barbone, Andreas Wetscherek, Thomas Yung, Uwe Oelfke, Wayne Luk, Georgi Gaydadjiev |
| 2021 | Efficient Tensor Slicing for Multicore NPUs using Memory Burst Modeling. Rafael C. F. Sousa, Byungmin Jung, Jaehwa Kwak, Michael Frank, Guido Araujo |
| 2021 | Employing Simulation to Facilitate the Design of Dynamic Binary Translators. Vanderson Martins do Rosario, Raphael Zinsly, Sandro Rigo, Edson Borin |
| 2021 | Enabling microservices management for Deep Learning applications across the Edge-Cloud Continuum. Zeina Houmani, Daniel Balouek-Thomert, Eddy Caron, Manish Parashar |
| 2021 | FAIR: Fully-Adaptive Framework for Improving Resource Provisioning in Collaborative CPU-FPGA Cloud Environments. Michael Guilherme Jordan, Guilherme Korol, Mateus Beck Rutzig, Antonio Carlos Schneider Beck |
| 2021 | FSCHOL: An OpenCL-based HPC Framework for Accelerating Sparse Cholesky Factorization on FPGAs. Erfan Bank Tavakoli, Michael Riera, Masudul Hassan Quraishi, Fengbo Ren |
| 2021 | Functional Approximation and Approximate Parallelization with the ACCEPT compiler. Lucas Reis, Lucas Wanner |
| 2021 | HPC Data Storage at a Glance: The Santos Dumont Experience. André Ramos Carneiro, Jean Luca Bez, Carla Osthoff, Lucas Mello Schnorr, Philippe O. A. Navaux |
| 2021 | Improving Phased Transactional Memory via Commit Throughput and Capacity Estimation. Catalina Munoz Morales, Bruno C. Honorio, Alexandro Baldassin, Guido Araujo |
| 2021 | Opening the Black Box: Performance Estimation during Code Generation for GPUs. Dominik Ernst, Georg Hager, Matthias Knorr, Gerhard Wellein, Markus Holzer |
| 2021 | Register Flush-free Runahead Execution for Modern Vector Processors. Hikaru Takayashiki, Masayuki Sato, Kazuhiko Komatsu, Hiroaki Kobayashi |
| 2021 | SECDA: Efficient Hardware/Software Co-Design of FPGA-based DNN Accelerators for Edge Inference. Jude Haris, Perry Gibson, José Cano, Nicolas Bohm Agostini, David R. Kaeli |
| 2021 | Sampling-based Sparse Format Selection on GPUs. Gangyi Zhu, Gagan Agrawal |
| 2021 | Shelf schedules for independent moldable tasks to minimize the energy consumption. Anne Benoit, Louis-Claude Canon, Redouane Elghazi, Pierre-Cyrille Héam |
| 2021 | Sparbit: a new logarithmic-cost and data locality-aware MPI Allgather algorithm. Wilton Jaciel Loch, Guilherme Piêgas Koslovski |
| 2021 | Sparsity-aware Power Gating for Tensor Cores. Ehsan Atoofian |
| 2021 | Synchronization Strategies on Many-Core SMT Systems. Agustín Navarro-Torres, Jesús Alastruey-Benedé, Pablo Ibáñez-Marín, Maria Carpen-Amarie |