PACT B

71 papers

Year Title / Authors
2010 19th International Conference on Parallel Architectures and Compilation Techniques, PACT 2010, Vienna, Austria, September 11-15, 2010
Valentina Salapura, Michael Gschwind, Jens Knoop
2010 A case for NUMA-aware contention management on multicore systems.
Sergey Blagodurov, Sergey Zhuravlev, Alexandra Fedorova, Ali Kamali
2010 A model for fusion and code motion in an automatic parallelizing compiler.
Uday Bondhugula, Oktay Günlük, Sanjeeb Dash, Lakshminarayanan Renganarayanan
2010 A programmable parallel accelerator for learning and classification.
Srihari Cadambi, Abhinandan Majumdar, Michela Becchi, Srimat T. Chakradhar, Hans Peter Graf
2010 A software-SVM-based transactional memory for multicore accelerator architectures with local memory.
Jun Lee, Sangmin Seo, Jaejin Lee
2010 AKULA: a toolset for experimenting and developing thread placement algorithms on multicore systems.
Sergey Zhuravlev, Sergey Blagodurov, Alexandra Fedorova
2010 AM++: a generalized active message framework.
Jeremiah Willcock, Torsten Hoefler, Nicholas Gerard Edmonds, Andrew Lumsdaine
2010 ATAC: a 1000-core cache-coherent processor with on-chip optical network.
George Kurian, Jason E. Miller, James Psota, Jonathan Eastep, Jifeng Liu, Jürgen Michel, Lionel C. Kimerling, Anant Agarwal
2010 Accelerating multicore reuse distance analysis with sampling and parallelization.
Derek L. Schuff, Milind Kulkarni, Vijay S. Pai
2010 Adaptive spatiotemporal node selection in dynamic networks.
Pradip Hari, John B. P. McCabe, Jonathan Banafato, Marcus Henry, Kevin Ko, Emmanouil Koukoumidis, Ulrich Kremer, Margaret Martonosi, Li-Shiuan Peh
2010 An OpenCL framework for heterogeneous multicores with local memory.
Jaejin Lee, Jungwon Kim, Sangmin Seo, Seungkyun Kim, Jung-Ho Park, Honggyu Kim, Thanh Tuan Dao, Yongjin Cho, Sung Jong Seo, Seung Hak Lee, Seung Mo Cho, Hyo Jung Song, Sang-Bum Suh, Jong-Deok Choi
2010 An empirical characterization of stream programs and its implications for language and compiler design.
William Thies, Saman P. Amarasinghe
2010 An integer programming framework for optimizing shared memory use on GPUs.
Wenjing Ma, Gagan Agrawal
2010 An intra-tile cache set balancing scheme.
Mohammad Hammoud, Sangyeun Cho, Rami G. Melhem
2010 Analyzing cache performance bottlenecks of STM applications and addressing them with compiler's help.
Sandya S. Mannarswamy, Ramaswamy Govindarajan
2010 Approximating age-based arbitration in on-chip networks.
Michael Mihn-Jong Lee, John Kim, Dennis Abts, Michael R. Marty, Jae W. Lee
2010 Automatic vector instruction selection for dynamic compilation.
Rajkishore Barik, Jisheng Zhao, Vivek Sarkar
2010 Avoiding deadlock avoidance.
Hari K. Pyla, Srinidhi Varadarajan
2010 Believe it or not!: mult-core CPUs can match GPU performance for a FLOP-intensive application!
Rajesh Bordawekar, Uday Bondhugula, Ravi Rao
2010 Build Watson: an overview of DeepQA for the Jeopardy! challenge.
David A. Ferrucci
2010 Compiler-assisted data distribution for chip multiprocessors.
Yong Li, Ahmed Abousamra, Rami G. Melhem, Alex K. Jones
2010 CoreGenesis: erasing core boundaries for robust and configurable performance.
Shantanu Gupta, Shuguang Feng, Amin Ansari, Ganesh S. Dasika, Scott A. Mahlke
2010 Criticality-driven superscalar design space exploration.
Sandeep Navada, Niket Kumar Choudhary, Eric Rotenberg
2010 DAFT: decoupled acyclic fault tolerance.
Yun Zhang, Jae W. Lee, Nick P. Johnson, David I. August
2010 DMATiler: revisiting loop tiling for direct memory access.
Haibo Lin, Tao Liu, Huoding Li, Tong Chen, Lakshminarayanan Renganarayanan, Kevin O'Brien, Ling Shao
2010 Data layout transformation exploiting memory-level parallelism in structured grid many-core applications.
I-Jui Sung, John A. Stratton, Wen-mei W. Hwu
2010 Design and implementation of the PLUG architecture for programmable and efficient network lookups.
Amit Kumar, Lorenzo De Carli, Sung Jin Kim, Marc de Kruijf, Karthikeyan Sankaralingam, Cristian Estan, Somesh Jha
2010 Discovering and understanding performance bottlenecks in transactional applications.
Ferad Zyulkyarov, Srdjan Stipic, Tim Harris, Osman S. Unsal, Adrián Cristal, Ibrahim Hur, Mateo Valero
2010 Dynamically managed multithreaded reconfigurable architectures for chip multiprocessors.
Matthew A. Watkins, David H. Albonesi
2010 Efficient runahead threads.
Tanausú Ramírez, Alex Pajuelo, Oliverio J. Santana, Onur Mutlu, Mateo Valero
2010 Efficient sequential consistency using conditional fences.
Changhui Lin, Vijay Nagarajan, Rajiv Gupta
2010 Energy efficient speculative threads: dynamic thread allocation in Same-ISA heterogeneous multicore systems.
Yangchun Luo, Venkatesan Packirisamy, Wei-Chung Hsu, Antonia Zhai
2010 Exploiting subtrace-level parallelism in clustered processors.
Rafael Ubal, Julio Sahuquillo, Salvador Petit, Pedro López, José Duato
2010 Feedback-directed pipeline parallelism.
M. Aater Suleman, Moinuddin K. Qureshi, Khubaib, Yale N. Patt
2010 Handling the problems and opportunities posed by multiple on-chip memory controllers.
Manu Awasthi, David W. Nellans, Kshitij Sudan, Rajeev Balasubramonian, Al Davis
2010 Improving speculative loop parallelization via selective squash and speculation reuse.
Santhosh Sharma Ananthramu, Deepak Majeti, Sanjeev Kumar Aggarwal, Mainak Chaudhuri
2010 MEDICS: ultra-portable processing for medical image reconstruction.
Ganesh S. Dasika, Ankit Sethia, Vincentius Robby, Trevor N. Mudge, Scott A. Mahlke
2010 MapCG: writing parallel program portable between CPU and GPU.
Chuntao Hong, Dehao Chen, Wenguang Chen, Weimin Zheng, Haibo Lin
2010 Moths: mobile threads for on-chip networks.
Matthew Misler, Natalie D. Enright Jerger
2010 NUcache: a multicore cache organization based on next-use distance.
R. Manikantan, Kaushik Rajan, R. Govindarajan
2010 NoC-aware cache design for chip multiprocessors.
Ahmed Abousamra, Rami G. Melhem, Alex K. Jones
2010 Ocelot: a dynamic optimization framework for bulk-synchronous applications in heterogeneous systems.
Gregory Frederick Diamos, Andrew Kerr, Sudhakar Yalamanchili, Nathan Clark
2010 On mitigating memory bandwidth contention through bandwidth-aware scheduling.
Di Xu, Chenggang Wu, Pen-Chung Yew
2010 On-chip network design considerations for compute accelerators.
Ali Bakhoda, John Kim, Tor M. Aamodt
2010 Online cache modeling for commodity multicore processors.
Richard West, Puneet Zaroo, Carl A. Waldspurger, Xiao Zhang
2010 Ordered and unordered algorithms for parallel breadth first search.
Muhammad Amber Hassaan, Martin Burtscher, Keshav Pingali
2010 Partitioning streaming parallelism for multi-cores: a machine learning based approach.
Zheng Wang, Michael F. P. O'Boyle
2010 Power and thermal characterization of POWER6 system.
Víctor Jiménez, Francisco J. Cazorla, Roberto Gioiosa, Mateo Valero, Carlos Boneti, Eren Kursun, Chen-Yong Cher, Canturk Isci, Alper Buyuktosunoglu, Pradip Bose
2010 Proximity coherence for chip multiprocessors.
Nick Barrow-Williams, Christian Fensch, Simon W. Moore
2010 Raising the level of many-core programming with compiler technology: meeting a grand challenge.
Wen-mei W. Hwu
2010 Reducing task creation and termination overhead in explicitly parallel programs.
Jisheng Zhao, Jun Shirako, V. Krishna Nandivada, Vivek Sarkar
2010 Revisiting sorting for GPGPU stream architectures.
Duane Merrill, Andrew S. Grimshaw
2010 SPACE: sharing pattern-based directory coherence for multicore scalability.
Hongzhou Zhao, Arrvindh Shriraman, Sandhya Dwarkadas
2010 SWEL: hardware cache coherence protocols to map shared data onto shared caches.
Seth H. Pugsley, Josef B. Spjut, David W. Nellans, Rajeev Balasubramonian
2010 Scalable hardware support for conditional parallelization.
Zheng Li, Olivier Certner, José Duato, Olivier Temam
2010 Scalable thread scheduling and global power management for heterogeneous many-core architectures.
Jonathan A. Winter, David H. Albonesi, Christine A. Shoemaker
2010 Scaling of the PARSEC benchmark inputs.
Christian Bienia, Kai Li
2010 Semi-automatic extraction and exploitation of hierarchical pipeline parallelism using profiling information.
Georgios Tournavitis, Björn Franke
2010 Simple and fast biased locks.
Nalini Vasudevan, Kedar S. Namjoshi, Stephen A. Edwards
2010 Speculative-aware execution: a simple and efficient technique for utilizing multi-cores to improve single-thread performance.
Rania H. Mameesh, Manoj Franklin
2010 StatCC: a statistical cache contention model.
David Eklov, David Black-Schaffer, Erik Hagersten
2010 Subspace snooping: filtering snoops with operating system support.
Daehoon Kim, Jeongseob Ahn, Jaehong Kim, Jaehyuk Huh
2010 System-level max power (SYMPO): a systematic approach for escalating system-level power consumption using synthetic benchmarks.
Karthik Ganesan, Jungho Jo, William Lloyd Bircher, Dimitris Kaseridis, Zhibin Yu, Lizy K. John
2010 The Paralax infrastructure: automatic parallelization with a helping hand.
Hans Vandierendonck, Sean Rul, Koen De Bosschere
2010 The potential of using dynamic information flow analysis in data value prediction.
Walid J. Ghandour, Haitham Akkary, Wes Masri
2010 Tiled-MapReduce: optimizing resource usages of data-parallel applications on multicore with tiling.
Rong Chen, Haibo Chen, Binyu Zang
2010 Towards a science of parallel programming.
Keshav Pingali
2010 Twin peaks: a software platform for heterogeneous computing on general-purpose and graphics processors.
Jayanth Gummaraju, Laurent Morichetti, Michael Houston, Ben Sander, Benedict R. Gaster, Bixia Zheng
2010 Using dead blocks as a virtual victim cache.
Samira Manabi Khan, Daniel A. Jiménez, Doug Burger, Babak Falsafi
2010 Using memory mapping to support cactus stacks in work-stealing runtime systems.
I-Ting Angelina Lee, Silas Boyd-Wickizer, Zhiyi Huang, Charles E. Leiserson
2010 WAYPOINT: scaling coherence to thousand-core architectures.
John H. Kelm, Matthew R. Johnson, Steven S. Lumetta, Sanjay J. Patel