PACT B

56 papers

Year Title / Authors
2016 A DSL Compiler for Accelerating Image Processing Pipelines on FPGAs.
Nitin Chugh, Vinay Vasista, Suresh Purini, Uday Bondhugula
2016 A Static Cut-off for Task Parallel Programs.
Shintaro Iwasaki, Kenjiro Taura
2016 Accelerating Linked-list Traversal Through Near-Data Processing.
Byungchul Hong, Gwangsun Kim, Jung Ho Ahn, Yongkee Kwon, Hongsik Kim, John Kim
2016 Auto-tuning Spark Big Data Workloads on POWER8: Prediction-Based Dynamic SMT Threading.
Zhen Jia, Chao Xue, Guancheng Chen, Jianfeng Zhan, Lixin Zhang, Yonghua Lin, H. Peter Hofstee
2016 Automatically Exploiting Implicit Pipeline Parallelism from Multiple Dependent Kernels for GPUs.
Gwangsun Kim, Jiyun Jeong, John Kim, Mark Stephenson
2016 Big Data Analytics on Flash Storage with Accelerators.
Arvind
2016 Bridging the Semantic Gaps of GPU Acceleration for Scale-out CNN-based Big Data Processing: Think Big, See Small.
Mingcong Song, Yang Hu, Yunlong Xu, Chao Li, Huixiang Chen, Jingling Yuan, Tao Li
2016 CAF: Core to Core Communication Acceleration Framework.
Yipeng Wang, Ren Wang, Andrew Herdrich, James Tsai, Yan Solihin
2016 Characterizing and Optimizing the Performance of Multithreaded Programs Under Interference.
Yong Zhao, Jia Rao, Qing Yi
2016 Combating the Reliability Challenge of GPU Register File at Low Supply Voltage.
Jingweijia Tan, Shuaiwen Leon Song, Kaige Yan, Xin Fu, Andrés Márquez, Darren J. Kerbyson
2016 EXCITE-VM: Extending the Virtual Memory System to Support Snapshot Isolation Transactions.
Heiner Litz, Benjamin Braun, David R. Cheriton
2016 Energy Aware Persistence: Reducing Energy Overheads of Memory-based Persistence in NVMs.
Sudarsun Kannan, Moinuddin K. Qureshi, Ada Gavrilovska, Karsten Schwan
2016 Fusion of Parallel Array Operations.
Mads Ruben Burgdorff Kristensen, Simon Andreas Frimann Lund, Troels Blum, James Avery
2016 Greater Performance and Better Efficiency: Predicated Execution has shown us the way.
Yale N. Patt
2016 Hash Map Inlining.
Dibakar Gope, Mikko H. Lipasti
2016 Integrating Algorithmic Parameters into Benchmarking and Design Space Exploration in 3D Scene Understanding.
Bruno Bodin, Luigi Nardi, M. Zeeshan Zia, Harry Wagstaff, Govind Sreekar Shenoy, Murali Krishna Emani, John Mawer, Christos Kotselidis, Andy Nisbet, Mikel Luján, Björn Franke, Paul H. J. Kelly, Michael F. P. O'Boyle
2016 MicroSpec: Speculation-Centric Fine-Grained Parallelization for FSM Computations.
Junqiao Qiu, Zhijia Zhao, Bin Ren
2016 OAWS: Memory Occlusion Aware Warp Scheduling.
Bin Wang, Yue Zhu, Weikuan Yu
2016 Online Scalability Characterization of Data-Parallel Programs on Many Cores.
Younghyun Cho, Surim Oh, Bernhard Egger
2016 Optimizing Indirect Memory References with milk.
Vladimir Kiriansky, Yunming Zhang, Saman P. Amarasinghe
2016 POSTER: An Integrated Vector-Scalar Design on an In-order ARM Core.
Milan Stanic, Oscar Palomar, Timothy Hayes, Ivan Ratkovic, Osman S. Unsal, Adrián Cristal, Mateo Valero
2016 POSTER: An Optimization of Dataflow Architectures for Scientific Applications.
Xiaowei Shen, Xiaochun Ye, Xu Tan, Da Wang, Zhimin Zhang, Dongrui Fan, Zhimin Tang
2016 POSTER: Collective Dynamic Parallelism for Directive Based GPU Programming Languages and Compilers.
Guray Ozen, Eduard Ayguadé, Jesús Labarta
2016 POSTER: Easy PRAM-based High-Performance Parallel Programming with ICE.
Fady Ghanim, Rajeev Barua, Uzi Vishkin
2016 POSTER: Efficient Self-Invalidation/Self-Downgrade for Critical Sections with Relaxed Semantics.
Alberto Ros, Carl Leonardsson, Christos Sakalis, Stefanos Kaxiras
2016 POSTER: Exploiting Asymmetric Multi-Core Processors with Flexible System Sofware.
Kallia Chronaki, Miquel Moretó, Marc Casas, Alejandro Rico, Rosa M. Badia, Eduard Ayguadé, Jesús Labarta, Mateo Valero
2016 POSTER: Fault-tolerant Execution on COTS Multi-core Processors with Hardware Transactional Memory Support.
Florian Haas, Sebastian Weis, Theo Ungerer, Gilles Pokam, Youfeng Wu
2016 POSTER: Firestorm: Operating Systems for Power-Constrained Architectures.
Sankaralingam Panneerselvam, Michael M. Swift
2016 POSTER: Fly-Over: A Light-Weight Distributed Power-Gating Mechanism For Energy-Efficient Networks-on-Chip.
Rahul Boyapati, Jiayi Huang, Ningyuan Wang, Kyung Hoon Kim, Ki Hwan Yum, Eun Jung Kim
2016 POSTER: Hybrid Data Dependence Analysis for Loop Transformations.
Diogo Nunes Sampaio, Alain Ketterlin, Louis-Noël Pouchet, Fabrice Rastello
2016 POSTER: Pagoda: A Runtime System to Maximize GPU Utilization in Data Parallel Tasks with Limited Parallelism.
Tsung Tai Yeh, Amit Sabne, Putt Sakdhnagool, Rudolf Eigenmann, Timothy G. Rogers
2016 POSTER: SILC-FM: Subblocked InterLeaved Cache-Like Flat Memory Organization.
Jee Ho Ryoo, Mitesh R. Meswani, Reena Panda, Lizy K. John
2016 POSTER: hVISC: A Portable Abstraction for Heterogeneous Parallel Systems.
Prakalp Srivastava, Maria Kotsifakou, Matthew D. Sinclair, Rakesh Komuravelli, Vikram S. Adve, Sarita V. Adve
2016 POSTER: ξ-TAO: A Cache-centric Execution Model and Runtime for Deep Parallel Multicore Topologies.
Miquel Pericàs
2016 Power Tuning HPC Jobs on Power-Constrained Systems.
Neha Gholkar, Frank Mueller, Barry Rountree
2016 Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, PACT 2016, Haifa, Israel, September 11-15, 2016
Ayal Zaks, Bilha Mendelson, Lawrence Rauchwerger, Wen-mei W. Hwu
2016 Reducing Cache Coherence Traffic with Hierarchical Directory Cache and NUMA-Aware Runtime Scheduling.
Paul Caheny, Marc Casas, Miquel Moretó, Hervé Gloaguen, Maxime Saintes, Eduard Ayguadé, Jesús Labarta, Mateo Valero
2016 Reduction Drawing: Language Constructs and Polyhedral Compilation for Reductions on GPU.
Chandan Reddy, Michael Kruse, Albert Cohen
2016 Resource Conscious Reuse-Driven Tiling for GPUs.
Prashant Singh Rawat, Changwan Hong, Mahesh Ravishankar, Vinod Grover, Louis-Noël Pouchet, Atanas Rountev, P. Sadayappan
2016 Rinnegan: Efficient Resource Use in Heterogeneous Architectures.
Sankaralingam Panneerselvam, Michael M. Swift
2016 Scalable Task Parallelism for NUMA: A Uniform Abstraction for Coordinated Scheduling and Memory Management.
Andi Drebes, Antoniu Pop, Karine Heydemann, Albert Cohen, Nathalie Drach
2016 Scaling Data Analytics with Moore's Law.
Kunle Olukotun
2016 Scheduling Techniques for GPU Architectures with Processing-In-Memory Capabilities.
Ashutosh Pattnaik, Xulong Tang, Adwait Jog, Onur Kayiran, Asit K. Mishra, Mahmut T. Kandemir, Onur Mutlu, Chita R. Das
2016 Sparso: Context-driven Optimizations of Sparse Linear Algebra.
Hongbo Rong, Jongsoo Park, Lingxiang Xiang, Todd A. Anderson, Mikhail Smelyanskiy
2016 Speculatively Exploiting Cross-Invocation Parallelism.
Jialu Huang, Prakash Prabhu, Thomas B. Jablin, Soumyadeep Ghosh, Sotiris Apostolakis, Jae W. Lee, David I. August
2016 Student Research Poster: A Low Complexity Cache Sharing Mechanism to Address System Fairness.
Vicent Selfa, Julio Sahuquillo, Salvador Petit, María Engracia Gómez
2016 Student Research Poster: A Scalable General Purpose System for Large-Scale Graph Processing.
Jiawen Sun
2016 Student Research Poster: Compiling Boolean Circuits to Non-deterministic Branching Programs to be Implemented by Light Switching Circuits.
Vladislav Tartakovsky
2016 Student Research Poster: From Processing-in-Memory to Processing-in-Storage.
Roman Kaplan
2016 Student Research Poster: Network Controller Emulation on a Sidecore for Unmodified Virtual Machines.
Arthur Kiyanovski
2016 Student Research Poster: Slack-Aware Shared Bandwidth Management in GPUs.
Saumay Dublish
2016 Student Research Poster: Software Out-of-Order Execution for In-Order Architectures.
Kim-Anh Tran
2016 Tardis 2.0: Optimized Time Traveling Coherence for Relaxed Consistency Models.
Xiangyao Yu, Hongzhe Liu, Ethan Zou, Srinivas Devadas
2016 Vectorization of Multibyte Floating Point Data Formats.
Andrew Anderson, David Gregg
2016 WearCore: A Core for Wearable Workloads.
Sanyam Mehta, Josep Torrellas
2016 μC-States: Fine-grained GPU Datapath Power Management.
Onur Kayiran, Adwait Jog, Ashutosh Pattnaik, Rachata Ausavarungnirun, Xulong Tang, Mahmut T. Kandemir, Gabriel H. Loh, Onur Mutlu, Chita R. Das