| 2018 | 3D-Xpath: high-density managed DRAM architecture with cost-effective alternative paths for memory transactions. Sukhan Lee, Kiwon Lee, Min Chul Sung, Mohammad Alian, Chankyung Kim, Wooyeong Cho, Reum Oh, Seongil O, Jung Ho Ahn, Nam Sung Kim |
| 2018 | A portable, automatic data quantizer for deep neural networks. Young H. Oh, Quan Quan, Daeyeon Kim, Seonghak Kim, Jun Heo, Sungjun Jung, Jaeyoung Jang, Jae W. Lee |
| 2018 | An efficient graph accelerator with parallel data conflict management. Pengcheng Yao, Long Zheng, Xiaofei Liao, Hai Jin, Bingsheng He |
| 2018 | Architectural support for convolutional neural networks on modern CPUs. Animesh Jain, Michael A. Laurenzano, Gilles A. Pokam, Jason Mars, Lingjia Tang |
| 2018 | Attributed consistent hashing for heterogeneous storage systems. Jiang Zhou, Yong Chen, Weiping Wang |
| 2018 | Automatic annotation of tasks in structured code. Pedro Ramos, Gleison Souza Diniz Mendonca, Divino Soares, Guido Araújo, Fernando Magno Quintão Pereira |
| 2018 | Biased reference counting: minimizing atomic operations in garbage collection. Jiho Choi, Thomas Shull, Josep Torrellas |
| 2018 | Cimple: instruction and memory level parallelism: a DSL for uncovering ILP and MLP. Vladimir Kiriansky, Haoran Xu, Martin C. Rinard, Saman P. Amarasinghe |
| 2018 | ComP-net: command processor networking for efficient intra-kernel communications on GPUs. Michael LeBeane, Khaled Hamidouche, Brad Benton, Maurício Breternitz, Steven K. Reinhardt, Lizy K. John |
| 2018 | Compiler assisted coalescing. Sooraj Puthoor, Mikko H. Lipasti |
| 2018 | Cost effective speculation with the omnipredictor. Arthur Perais, André Seznec |
| 2018 | Cost-driven thread coarsening for GPU kernels. Prithayan Barua, Jun Shirako, Vivek Sarkar |
| 2018 | DART: distributed adaptive radix tree for efficient affix-based keyword search on HPC systems. Wei Zhang, Houjun Tang, Suren Byna, Yong Chen |
| 2018 | Data motifs: a lens towards fully understanding big data and AI workloads. Wanling Gao, Jianfeng Zhan, Lei Wang, Chunjie Luo, Daoyi Zheng, Fei Tang, Biwei Xie, Chen Zheng, Xu Wen, Xiwen He, Hainan Ye, Rui Ren |
| 2018 | E-PUR: an energy-efficient processing unit for recurrent neural networks. Franyell Silfa, Gem Dot, José-María Arnau, Antonio González |
| 2018 | EAR: ECC-aided refresh reduction through 2-D zero compression. Jeongkyu Hong, Hyeonggyu Kim, Soontae Kim |
| 2018 | GMOD: a dynamic GPU memory overflow detector. Bang Di, Jianhua Sun, Dong Li, Hao Chen, Zhe Quan |
| 2018 | GraphPhi: efficient parallel graph processing on emerging throughput-oriented architectures. Zhen Peng, Alexander Powell, Bo Wu, Tekin Bicer, Bin Ren |
| 2018 | Hybrid optimization/heuristic instruction scheduling for programmable accelerator codesign. Tony Nowatzki, Newsha Ardalani, Karthikeyan Sankaralingam, Jian Weng |
| 2018 | HyPart: a hybrid technique for practical memory bandwidth partitioning on commodity servers. Jinsu Park, Seongbeom Park, Myeonggyun Han, Jihoon Hyun, Woongki Baek |
| 2018 | In-DRAM near-data approximate acceleration for GPUs. Amir Yazdanbakhsh, Choungki Song, Jacob Sacks, Pejman Lotfi-Kamran, Hadi Esmaeilzadeh, Nam Sung Kim |
| 2018 | Log(graph): a near-optimal high-performance graph representation. Maciej Besta, Dimitri Stanojevic, Tijana Zivic, Jagpreet Singh, Maurice Hoerold, Torsten Hoefler |
| 2018 | Mage: online and interference-aware scheduling for multi-scale heterogeneous systems. Francisco Romero, Christina Delimitrou |
| 2018 | Massively parallel skyline computation for processing-in-memory architectures. Vasileios Zois, Divya Gupta, Vassilis J. Tsotras, Walid A. Najjar, Jean-François Roy |
| 2018 | Maximizing system utilization via parallelism management for co-located parallel applications. Younghyun Cho, Camilo A. Celis Guzman, Bernhard Egger |
| 2018 | MemoDyn: exploiting weakly consistent data structures for dynamic parallel memoization. Prakash Prabhu, Stephen R. Beard, Sotiris Apostolakis, Ayal Zaks, David I. August |
| 2018 | Near-side prefetch throttling: adaptive prefetching for high-performance many-core processors. Wim Heirman, Kristof Du Bois, Yves Vandriessche, Stijn Eyerman, Ibrahim Hur |
| 2018 | On-the-fly workload partitioning for integrated CPU/GPU architectures. Younghyun Cho, Florian Negele, Seohong Park, Bernhard Egger, Thomas R. Gross |
| 2018 | Optimizing remote data transfers in X10. Arun Thangamani, V. Krishna Nandivada |
| 2018 | Performance extraction and suitability analysis of multi- and many-core architectures for next generation sequencing secondary analysis. Sanchit Misra, Tony C. Pan, Kanak Mahadik, George Powley, Priya N. Vaidya, Md. Vasimuddin, Srinivas Aluru |
| 2018 | Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, PACT 2018, Limassol, Cyprus, November 01-04, 2018. Skevos Evripidou, Per Stenström, Michael F. P. O'Boyle |
| 2018 | Revealing parallel scans and reductions in recurrences through function reconstruction. Peng Jiang, Linchuan Chen, Gagan Agrawal |
| 2018 | Stencil codes on a vector length agnostic architecture. Adrià Armejach, Helena Caminal, Juan M. Cebrian, Rekai González-Alberquilla, Chris Adeniyi-Jones, Mateo Valero, Marc Casas, Miquel Moretó |
| 2018 | Synergistic cache layout for reuse and compression. Biswabandan Panda, André Seznec |
| 2018 | Towards concurrency race debugging: an integrated approach for constraint solving and dynamic slicing. Long Zheng, Xiaofei Liao, Hai Jin, Bingsheng He, Jingling Xue, Haikun Liu |
| 2018 | Transactional pre-abort handlers in hardware transactional memory. Sunjae Park, Christopher J. Hughes, Milos Prvulovic |
| 2018 | VW-SLP: auto-vectorization with adaptive vector width. Vasileios Porpodas, Rodrigo C. O. Rocha, Luís F. W. Góes |