| 2010 | 19th International Conference on Parallel Architectures and Compilation Techniques, PACT 2010, Vienna, Austria, September 11-15, 2010. Valentina Salapura, Michael Gschwind, Jens Knoop |
| 2010 | A case for NUMA-aware contention management on multicore systems. Sergey Blagodurov, Sergey Zhuravlev, Alexandra Fedorova, Ali Kamali |
| 2010 | A model for fusion and code motion in an automatic parallelizing compiler. Uday Bondhugula, Oktay Günlük, Sanjeeb Dash, Lakshminarayanan Renganarayanan |
| 2010 | A programmable parallel accelerator for learning and classification. Srihari Cadambi, Abhinandan Majumdar, Michela Becchi, Srimat T. Chakradhar, Hans Peter Graf |
| 2010 | A software-SVM-based transactional memory for multicore accelerator architectures with local memory. Jun Lee, Sangmin Seo, Jaejin Lee |
| 2010 | AKULA: a toolset for experimenting and developing thread placement algorithms on multicore systems. Sergey Zhuravlev, Sergey Blagodurov, Alexandra Fedorova |
| 2010 | AM++: a generalized active message framework. Jeremiah Willcock, Torsten Hoefler, Nicholas Gerard Edmonds, Andrew Lumsdaine |
| 2010 | ATAC: a 1000-core cache-coherent processor with on-chip optical network. George Kurian, Jason E. Miller, James Psota, Jonathan Eastep, Jifeng Liu, Jürgen Michel, Lionel C. Kimerling, Anant Agarwal |
| 2010 | Accelerating multicore reuse distance analysis with sampling and parallelization. Derek L. Schuff, Milind Kulkarni, Vijay S. Pai |
| 2010 | Adaptive spatiotemporal node selection in dynamic networks. Pradip Hari, John B. P. McCabe, Jonathan Banafato, Marcus Henry, Kevin Ko, Emmanouil Koukoumidis, Ulrich Kremer, Margaret Martonosi, Li-Shiuan Peh |
| 2010 | An OpenCL framework for heterogeneous multicores with local memory. Jaejin Lee, Jungwon Kim, Sangmin Seo, Seungkyun Kim, Jung-Ho Park, Honggyu Kim, Thanh Tuan Dao, Yongjin Cho, Sung Jong Seo, Seung Hak Lee, Seung Mo Cho, Hyo Jung Song, Sang-Bum Suh, Jong-Deok Choi |
| 2010 | An empirical characterization of stream programs and its implications for language and compiler design. William Thies, Saman P. Amarasinghe |
| 2010 | An integer programming framework for optimizing shared memory use on GPUs. Wenjing Ma, Gagan Agrawal |
| 2010 | An intra-tile cache set balancing scheme. Mohammad Hammoud, Sangyeun Cho, Rami G. Melhem |
| 2010 | Analyzing cache performance bottlenecks of STM applications and addressing them with compiler's help. Sandya S. Mannarswamy, Ramaswamy Govindarajan |
| 2010 | Approximating age-based arbitration in on-chip networks. Michael Mihn-Jong Lee, John Kim, Dennis Abts, Michael R. Marty, Jae W. Lee |
| 2010 | Automatic vector instruction selection for dynamic compilation. Rajkishore Barik, Jisheng Zhao, Vivek Sarkar |
| 2010 | Avoiding deadlock avoidance. Hari K. Pyla, Srinidhi Varadarajan |
| 2010 | Believe it or not!: multi-core CPUs can match GPU performance for a FLOP-intensive application! Rajesh Bordawekar, Uday Bondhugula, Ravi Rao |
| 2010 | Build Watson: an overview of DeepQA for the Jeopardy! challenge. David A. Ferrucci |
| 2010 | Compiler-assisted data distribution for chip multiprocessors. Yong Li, Ahmed Abousamra, Rami G. Melhem, Alex K. Jones |
| 2010 | CoreGenesis: erasing core boundaries for robust and configurable performance. Shantanu Gupta, Shuguang Feng, Amin Ansari, Ganesh S. Dasika, Scott A. Mahlke |
| 2010 | Criticality-driven superscalar design space exploration. Sandeep Navada, Niket Kumar Choudhary, Eric Rotenberg |
| 2010 | DAFT: decoupled acyclic fault tolerance. Yun Zhang, Jae W. Lee, Nick P. Johnson, David I. August |
| 2010 | DMATiler: revisiting loop tiling for direct memory access. Haibo Lin, Tao Liu, Huoding Li, Tong Chen, Lakshminarayanan Renganarayanan, Kevin O'Brien, Ling Shao |
| 2010 | Data layout transformation exploiting memory-level parallelism in structured grid many-core applications. I-Jui Sung, John A. Stratton, Wen-mei W. Hwu |
| 2010 | Design and implementation of the PLUG architecture for programmable and efficient network lookups. Amit Kumar, Lorenzo De Carli, Sung Jin Kim, Marc de Kruijf, Karthikeyan Sankaralingam, Cristian Estan, Somesh Jha |
| 2010 | Discovering and understanding performance bottlenecks in transactional applications. Ferad Zyulkyarov, Srdjan Stipic, Tim Harris, Osman S. Unsal, Adrián Cristal, Ibrahim Hur, Mateo Valero |
| 2010 | Dynamically managed multithreaded reconfigurable architectures for chip multiprocessors. Matthew A. Watkins, David H. Albonesi |
| 2010 | Efficient runahead threads. Tanausú Ramírez, Alex Pajuelo, Oliverio J. Santana, Onur Mutlu, Mateo Valero |
| 2010 | Efficient sequential consistency using conditional fences. Changhui Lin, Vijay Nagarajan, Rajiv Gupta |
| 2010 | Energy efficient speculative threads: dynamic thread allocation in Same-ISA heterogeneous multicore systems. Yangchun Luo, Venkatesan Packirisamy, Wei-Chung Hsu, Antonia Zhai |
| 2010 | Exploiting subtrace-level parallelism in clustered processors. Rafael Ubal, Julio Sahuquillo, Salvador Petit, Pedro López, José Duato |
| 2010 | Feedback-directed pipeline parallelism. M. Aater Suleman, Moinuddin K. Qureshi, Khubaib, Yale N. Patt |
| 2010 | Handling the problems and opportunities posed by multiple on-chip memory controllers. Manu Awasthi, David W. Nellans, Kshitij Sudan, Rajeev Balasubramonian, Al Davis |
| 2010 | Improving speculative loop parallelization via selective squash and speculation reuse. Santhosh Sharma Ananthramu, Deepak Majeti, Sanjeev Kumar Aggarwal, Mainak Chaudhuri |
| 2010 | MEDICS: ultra-portable processing for medical image reconstruction. Ganesh S. Dasika, Ankit Sethia, Vincentius Robby, Trevor N. Mudge, Scott A. Mahlke |
| 2010 | MapCG: writing parallel program portable between CPU and GPU. Chuntao Hong, Dehao Chen, Wenguang Chen, Weimin Zheng, Haibo Lin |
| 2010 | Moths: mobile threads for on-chip networks. Matthew Misler, Natalie D. Enright Jerger |
| 2010 | NUcache: a multicore cache organization based on next-use distance. R. Manikantan, Kaushik Rajan, R. Govindarajan |
| 2010 | NoC-aware cache design for chip multiprocessors. Ahmed Abousamra, Rami G. Melhem, Alex K. Jones |
| 2010 | Ocelot: a dynamic optimization framework for bulk-synchronous applications in heterogeneous systems. Gregory Frederick Diamos, Andrew Kerr, Sudhakar Yalamanchili, Nathan Clark |
| 2010 | On mitigating memory bandwidth contention through bandwidth-aware scheduling. Di Xu, Chenggang Wu, Pen-Chung Yew |
| 2010 | On-chip network design considerations for compute accelerators. Ali Bakhoda, John Kim, Tor M. Aamodt |
| 2010 | Online cache modeling for commodity multicore processors. Richard West, Puneet Zaroo, Carl A. Waldspurger, Xiao Zhang |
| 2010 | Ordered and unordered algorithms for parallel breadth first search. Muhammad Amber Hassaan, Martin Burtscher, Keshav Pingali |
| 2010 | Partitioning streaming parallelism for multi-cores: a machine learning based approach. Zheng Wang, Michael F. P. O'Boyle |
| 2010 | Power and thermal characterization of POWER6 system. Víctor Jiménez, Francisco J. Cazorla, Roberto Gioiosa, Mateo Valero, Carlos Boneti, Eren Kursun, Chen-Yong Cher, Canturk Isci, Alper Buyuktosunoglu, Pradip Bose |
| 2010 | Proximity coherence for chip multiprocessors. Nick Barrow-Williams, Christian Fensch, Simon W. Moore |
| 2010 | Raising the level of many-core programming with compiler technology: meeting a grand challenge. Wen-mei W. Hwu |
| 2010 | Reducing task creation and termination overhead in explicitly parallel programs. Jisheng Zhao, Jun Shirako, V. Krishna Nandivada, Vivek Sarkar |
| 2010 | Revisiting sorting for GPGPU stream architectures. Duane Merrill, Andrew S. Grimshaw |
| 2010 | SPACE: sharing pattern-based directory coherence for multicore scalability. Hongzhou Zhao, Arrvindh Shriraman, Sandhya Dwarkadas |
| 2010 | SWEL: hardware cache coherence protocols to map shared data onto shared caches. Seth H. Pugsley, Josef B. Spjut, David W. Nellans, Rajeev Balasubramonian |
| 2010 | Scalable hardware support for conditional parallelization. Zheng Li, Olivier Certner, José Duato, Olivier Temam |
| 2010 | Scalable thread scheduling and global power management for heterogeneous many-core architectures. Jonathan A. Winter, David H. Albonesi, Christine A. Shoemaker |
| 2010 | Scaling of the PARSEC benchmark inputs. Christian Bienia, Kai Li |
| 2010 | Semi-automatic extraction and exploitation of hierarchical pipeline parallelism using profiling information. Georgios Tournavitis, Björn Franke |
| 2010 | Simple and fast biased locks. Nalini Vasudevan, Kedar S. Namjoshi, Stephen A. Edwards |
| 2010 | Speculative-aware execution: a simple and efficient technique for utilizing multi-cores to improve single-thread performance. Rania H. Mameesh, Manoj Franklin |
| 2010 | StatCC: a statistical cache contention model. David Eklov, David Black-Schaffer, Erik Hagersten |
| 2010 | Subspace snooping: filtering snoops with operating system support. Daehoon Kim, Jeongseob Ahn, Jaehong Kim, Jaehyuk Huh |
| 2010 | System-level max power (SYMPO): a systematic approach for escalating system-level power consumption using synthetic benchmarks. Karthik Ganesan, Jungho Jo, William Lloyd Bircher, Dimitris Kaseridis, Zhibin Yu, Lizy K. John |
| 2010 | The Paralax infrastructure: automatic parallelization with a helping hand. Hans Vandierendonck, Sean Rul, Koen De Bosschere |
| 2010 | The potential of using dynamic information flow analysis in data value prediction. Walid J. Ghandour, Haitham Akkary, Wes Masri |
| 2010 | Tiled-MapReduce: optimizing resource usages of data-parallel applications on multicore with tiling. Rong Chen, Haibo Chen, Binyu Zang |
| 2010 | Towards a science of parallel programming. Keshav Pingali |
| 2010 | Twin peaks: a software platform for heterogeneous computing on general-purpose and graphics processors. Jayanth Gummaraju, Laurent Morichetti, Michael Houston, Ben Sander, Benedict R. Gaster, Bixia Zheng |
| 2010 | Using dead blocks as a virtual victim cache. Samira Manabi Khan, Daniel A. Jiménez, Doug Burger, Babak Falsafi |
| 2010 | Using memory mapping to support cactus stacks in work-stealing runtime systems. I-Ting Angelina Lee, Silas Boyd-Wickizer, Zhiyi Huang, Charles E. Leiserson |
| 2010 | WAYPOINT: scaling coherence to thousand-core architectures. John H. Kelm, Matthew R. Johnson, Steven S. Lumetta, Sanjay J. Patel |