CGO A

34 papers

Year  Title / Authors
2019  A Code Generator for High-Performance Tensor Contractions on GPUs.
Jinsung Kim, Aravind Sukumaran-Rajam, Vineeth Thumma, Sriram Krishnamoorthy, Ajay Panyala, Louis-Noël Pouchet, Atanas Rountev, P. Sadayappan
2019  A Shared BTB Design for Multicore Systems.
Moumita Das, Ansuman Banerjee, Bhaskar Sardar
2019  A Tool for Performance Analysis of GPU-Accelerated Applications.
Keren Zhou, John M. Mellor-Crummey
2019  Accelerating GPU Computing at Runtime with Binary Optimization.
Guangli Li, Lei Liu, Xiaobing Feng
2019  An Optimization-Driven Incremental Inline Substitution Algorithm for Just-in-Time Compilers.
Aleksandar Prokopec, Gilles Duboscq, David Leopoldseder, Thomas Würthinger
2019  Automatic Equivalence Checking for Assembly Implementations of Cryptography Libraries.
Jay P. Lim, Santosh Nagarakatte
2019  Automatic Generation of Warp-Level Primitives and Atomic Instructions for Fast and Portable Parallel Reduction on GPUs.
Simon Garcia De Gonzalo, Sitao Huang, Juan Gómez-Luna, Simon D. Hammond, Onur Mutlu, Wen-Mei Hwu
2019  Automatic Parallelization of Irregular x86-64 Loops.
Brandon Neth, Michelle Mills Strout
2019  BOLT: A Practical Binary Optimizer for Data Centers and Beyond.
Maksim Panchenko, Rafael Auler, Bill Nell, Guilherme Ottoni
2019  CSOD: Context-Sensitive Overflow Detection.
Hongyu Liu, Sam Silvestro, Xiaoyin Wang, Lide Duan, Tongping Liu
2019  Code Generation from Formal Models for Automatic RTOS Portability.
Renata Martins Gomes, Marcel Baunach
2019  Decoding CUDA Binary.
Ari B. Hayes, Fei Hua, Jin Huang, Yan-Hao Chen, Eddy Z. Zhang
2019  Extending LLVM for Lightweight SPMD Vectorization: Using SIMD and Vector Instructions Easily from Any Language.
Robin Kruppe, Julian Oppermann, Lukas Sommer, Andreas Koch
2019  From Loop Fusion to Kernel Fusion: A Domain-Specific Approach to Locality Optimization.
Bo Qiao, Oliver Reiche, Frank Hannig, Jürgen Teich
2019  Function Merging by Sequence Alignment.
Rodrigo C. O. Rocha, Pavlos Petoumenos, Zheng Wang, Murray Cole, Hugh Leather
2019  Generation of In-Bounds Inputs for Arrays in Memory-Unsafe Languages.
Marcus Rodrigues, Breno Guimarães, Fernando Magno Quintão Pereira
2019  IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2019, Washington, DC, USA, February 16-20, 2019
Mahmut Taylan Kandemir, Alexandra Jimborean, Tipp Moseley
2019  IGC: The Open Source Intel Graphics Compiler.
Anupama Chandrasekhar, Gang Chen, Po-Yu Chen, Wei-Yu Chen, Junjie Gu, Peng Guo, Shruthi Hebbur Prasanna Kumar, Guei-Yuan Lueh, Pankaj Mistry, Wei Pan, Thomas Raoux, Konrad Trifunovic
2019  Janus: Statically-Driven and Profile-Guided Automatic Dynamic Binary Parallelisation.
Ruoyu Zhou, Timothy M. Jones
2019  Kernel Fusion/Decomposition for Automatic GPU-Offloading.
Alok Mishra, Martin Kong, Barbara M. Chapman
2019  Locus: A System and a Language for Program Optimization.
Thiago S. F. X. Teixeira, Corinne Ancourt, David A. Padua, William Gropp
2019  Multi-target Compiler for the Deployment of Machine Learning Models.
Oscar Castro-López, Inés Fernando Vega López
2019  Optimizing RNA-RNA Interaction Computations.
Swetha Varadarajan
2019  Quantifying and Reducing Execution Variance in STM via Model Driven Commit Optimization.
Girish Mururu, Ada Gavrilovska, Santosh Pande
2019  Reasoning about the Node.js Event Loop using Async Graphs.
Haiyang Sun, Daniele Bonetta, Filippo Schiavio, Walter Binder
2019  Smokestack: Thwarting DOP Attacks with Runtime Stack Layout Randomization.
Misiker Tadesse Aga, Todd M. Austin
2019  Super-Node SLP: Optimized Vectorization for Code Sequences Containing Operators and Their Inverse Elements.
Vasileios Porpodas, Rodrigo C. O. Rocha, Evgueni Brevnov, Luís F. W. Góes, Timothy G. Mattson
2019  Tensor Algebra Compilation with Workspaces.
Fredrik Kjolstad, Willow Ahrens, Shoaib Kamil, Saman P. Amarasinghe
2019  Tiramisu: A Polyhedral Compiler for Expressing Fast and Portable Code.
Riyadh Baghdadi, Jessica Ray, Malek Ben Romdhane, Emanuele Del Sozzo, Abdurrahman Akkas, Yunming Zhang, Patricia Suriana, Shoaib Kamil, Saman P. Amarasinghe
2019  Transforming Query Sequences for High-Throughput B+ Tree Processing on Many-Core Processors.
Ruiqin Tian, Junqiao Qiu, Zhijia Zhao, Xu Liu, Bin Ren
2019  Translating CUDA to OpenCL for Hardware Generation using Neural Machine Translation.
Yonghae Kim, Hyesoon Kim
2019  Translating Traditional SIMD Instructions to Vector Length Agnostic Architectures.
Sheng-Yu Fu, Wei-Chung Hsu
2019  Understanding RDMA Behavior in NUMA Systems.
Jacob Nelson, Roberto Palmieri
2019  White-Box Program Tuning.
Wen-Chuan Lee, Yingqi Liu, Peng Liu, Shiqing Ma, Hongjun Choi, Xiangyu Zhang, Rajiv Gupta