| 2013 | A unified view of non-monotonic core selection and application steering in heterogeneous chip multiprocessors. Sandeep Navada, Niket K. Choudhary, Salil V. Wadhavkar, Eric Rotenberg |
| 2013 | APOGEE: Adaptive prefetching on GPUs for energy efficiency. Ankit Sethia, Ganesh S. Dasika, Mehrzad Samadi, Scott A. Mahlke |
| 2013 | An empirical model for predicting cross-core performance interference on multicore processors. Jiacheng Zhao, Xiaobing Feng, Huimin Cui, Youliang Yan, Jingling Xue, Wensen Yang |
| 2013 | An opportunistic prediction-based thread scheduling to maximize throughput/watt in AMPs. Arunachalam Annamalai, Rance Rodrigues, Israel Koren, Sandip Kundu |
| 2013 | Automatic OpenCL work-group size selection for multicore CPUs. Sangmin Seo, Jun Lee, Gangwon Jo, Jaejin Lee |
| 2013 | Automatic vectorization of tree traversals. Youngjoon Jo, Michael Goldfarb, Milind Kulkarni |
| 2013 | Breaking SIMD shackles with an exposed flexible microarchitecture and the access execute PDG. Venkatraman Govindaraju, Tony Nowatzki, Karthikeyan Sankaralingam |
| 2013 | Building expressive, area-efficient coherence directories. Lei Fang, Peng Liu, Qi Hu, Michael C. Huang, Guofan Jiang |
| 2013 | Can lock-free and combining techniques co-exist? A novel approach on concurrent queue. Changwoo Min, Young Ik Eom |
| 2013 | Concurrent predicates: A debugging technique for every parallel programmer. Justin Emile Gottschlich, Gilles Pokam, Cristiano Pereira, Youfeng Wu |
| 2013 | Coordinated power-performance optimization in manycores. Hiroshi Sasaki, Satoshi Imamura, Koji Inoue |
| 2013 | DANBI: Dynamic scheduling of irregular stream programs for many-core systems. Changwoo Min, Young Ik Eom |
| 2013 | Do inputs matter? Using data-dependence profiling to evaluate thread level speculation in BG/Q. Arnamoy Bhattacharyya |
| 2013 | Dynamic memory access monitoring based on tagged memory. Mikhail A. Gorelov, Lev Mukhanov |
| 2013 | Exploring hybrid memory for GPU energy efficiency through software-hardware co-design. Bin Wang, Bo Wu, Dong Li, Xipeng Shen, Weikuan Yu, Yizheng Jiao, Jeffrey S. Vetter |
| 2013 | Exposing ILP in custom hardware with a dataflow compiler IR. Ali Mustafa Zaidi |
| 2013 | Fairness-aware scheduling on single-ISA heterogeneous multi-cores. Kenzo Van Craeynest, Shoaib Akram, Wim Heirman, Aamer Jaleel, Lieven Eeckhout |
| 2013 | General chairs' welcome message. Michael F. P. O'Boyle, Christian Fensch |
| 2013 | Generating efficient data movement code for heterogeneous architectures with distributed-memory. Roshan Dathathri, Chandan Reddy, Thejas Ramashekar, Uday Bondhugula |
| 2013 | INSPIRE: The Insieme parallel intermediate representation. Herbert Jordan, Simone Pellegrini, Peter Thoman, Klaus Kofler, Thomas Fahringer |
| 2013 | Interprocedural strength reduction of critical sections in explicitly-parallel programs. Rajkishore Barik, Jisheng Zhao, Vivek Sarkar |
| 2013 | Jigsaw: Scalable software-defined caches. Nathan Beckmann, Daniel Sánchez |
| 2013 | Keynote talk: A comprehensive approach to HW/SW codesign. David J. Kuck |
| 2013 | Keynote talk: Parallel programming for mobile computing. Calin Cascaval |
| 2013 | Keynote talk: Towards automatic resource management in parallel architectures. Per Stenström |
| 2013 | L1-bandwidth aware thread allocation in multicore SMT processors. Josué Feliu, Julio Sahuquillo, Salvador Petit, José Duato |
| 2013 | Managing shared last-level cache in a heterogeneous multicore processor. Vineeth Mekkat, Anup Holey, Pen-Chung Yew, Antonia Zhai |
| 2013 | McRouter: Multicast within a router for high performance network-on-chips. Yuan He, Hiroshi Sasaki, Shinobu Miwa, Hiroshi Nakamura |
| 2013 | Meeting midway: Improving CMP performance with memory-side prefetching. Praveen Yedlapalli, Jagadish Kotra, Emre Kultursay, Mahmut T. Kandemir, Chita R. Das, Anand Sivasubramaniam |
| 2013 | Memory-centric system interconnect design with Hybrid Memory Cubes. Gwangsun Kim, John Kim, Jung Ho Ahn, Jaeha Kim |
| 2013 | Message from the program chairs. André Seznec, François Bodin |
| 2013 | Neither more nor less: Optimizing thread-level parallelism for GPGPUs. Onur Kayiran, Adwait Jog, Mahmut T. Kandemir, Chita R. Das |
| 2013 | PS-cache: An energy-efficient cache design for chip multiprocessors. Joan J. Valls, Alberto Ros, Julio Sahuquillo, María Engracia Gómez |
| 2013 | Parallel flow-sensitive pointer analysis by graph-rewriting. Vaivaswatha Nagaraj, R. Govindarajan |
| 2013 | Parallel frame rendering: Trading responsiveness for energy on a mobile GPU. José-María Arnau, Joan-Manuel Parcerisa, Polychronis Xekalakis |
| 2013 | Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, Edinburgh, United Kingdom, September 7-11, 2013. Christian Fensch, Michael F. P. O'Boyle, André Seznec, François Bodin |
| 2013 | RSVM: A Region-based Software Virtual Memory for GPU. Feng Ji, Heshan Lin, Xiaosong Ma |
| 2013 | Reshaping cache misses to improve row-buffer locality in multicore systems. Wei Ding, Jun Liu, Mahmut T. Kandemir, Mary Jane Irwin |
| 2013 | S-CAVE: Effective SSD caching to improve virtual machine storage performance. Tian Luo, Siyuan Ma, Rubao Lee, Xiaodong Zhang, Deng Liu, Li Zhou |
| 2013 | SMT-centric power-aware thread placement in chip multiprocessors. Augusto Vega, Alper Buyuktosunoglu, Pradip Bose |
| 2013 | Starchart: Hardware and software optimization using recursive partitioning regression trees. Wenhao Jia, Kelly A. Shaw, Margaret Martonosi |
| 2013 | TCPT - Thread criticality-driven prefetcher throttling. Biswabandan Panda, Shankar Balachandran |
| 2013 | Task sampling: Computer architecture simulation in the many-core era. Thomas Grass |
| 2013 | The case for a scalable coherence protocol for complex on-chip cache hierarchies in many-core systems. Lucia G. Menezo, Valentin Puente, José-Ángel Gregorio |
| 2013 | ThermOS: System support for dynamic thermal management of chip multi-processors. Filippo Sironi, Martina Maggio, Riccardo Cattaneo, Giovanni F. Del Nero, Donatella Sciuto, Marco D. Santambrogio |
| 2013 | Traffic steering between a low-latency unswitched TL ring and a high-throughput switched on-chip interconnect. Jungju Oh, Alenka G. Zajic, Milos Prvulovic |
| 2013 | Transparent CPU-GPU collaboration for data-parallel kernels on heterogeneous systems. Janghaeng Lee, Mehrzad Samadi, Yongjun Park, Scott A. Mahlke |
| 2013 | Vectorization past dependent branches through speculation. Majedul Haque Sujon, R. Clint Whaley, Qing Yi |
| 2013 | Writeback-aware bandwidth partitioning for multi-core systems with PCM. Miao Zhou, Yu Du, Bruce R. Childers, Rami G. Melhem, Daniel Mossé |