ICS A

41 papers

Year Title / Authors
2008 A compiler framework for optimization of affine loop nests for GPGPUs.
Muthu Manikandan Baskaran, Uday Bondhugula, Sriram Krishnamoorthy, J. Ramanujam, Atanas Rountev, P. Sadayappan
2008 A freespace crossbar for multi-core processors.
Michel N. Victor, Aris K. Silzars, Edward S. Davidson
2008 A projection-based optimization framework for abstractions with application to the unstructured mesh domain.
Brian S. White, Sally A. McKee, Daniel J. Quinlan
2008 A regression-based approach to scalability prediction.
Bradley J. Barnes, Barry Rountree, David K. Lowenthal, Jaxk Reeves, Bronis R. de Supinski, Martin Schulz
2008 Accurate memory signatures and synthetic address traces for HPC applications.
Jonathan Weinberg, Allan Snavely
2008 Adaptive runtime tuning of parallel sparse matrix-vector multiplication on distributed memory systems.
Seyong Lee, Rudolf Eigenmann
2008 Advanced collective communication in Aspen.
Qasim Ali, Vijay S. Pai, Samuel P. Midkiff
2008 An approach for adaptive DRAM temperature and power management.
Song Liu, Seda Ogrenci Memik, Yu Zhang, Gokhan Memik
2008 Analysis of dynamic power management on multi-core processors.
William Lloyd Bircher, Lizy K. John
2008 Analyzing memory access intensity in parallel programs on multicore.
Lixia Liu, Zhiyuan Li, Ahmed H. Sameh
2008 Automatic SIMD vectorization of chains of recurrences.
Yixin Shou, Robert A. van Engelen
2008 Automatic analysis of speedup of MPI applications.
Marc Casas, Rosa M. Badia, Jesús Labarta
2008 Autonomous learning for efficient resource utilization of dynamic VM migration.
Hyung Won Choi, Hukeun Kwak, Andrew Sohn, Kyusik Chung
2008 Biomedical image analysis on a cooperative cluster of GPUs and multicores.
Timothy D. R. Hartley, Ümit V. Çatalyürek, Antonio Ruiz, Francisco D. Igual, Rafael Mayo, Manuel Ujaldon
2008 CUBA: an architecture for efficient CPU/co-processor data communication.
Isaac Gelado, John H. Kelm, Shane Ryoo, Steven S. Lumetta, Nacho Navarro, Wen-mei W. Hwu
2008 Can software reliability outperform hardware reliability on high performance interconnects?: a case study with MPI over InfiniBand.
Matthew J. Koop, Rahul Kumar, Dhabaleswar K. Panda
2008 Challenges on the road to exascale computing.
Tilak Agerwala
2008 CprFS: a user-level file system to support consistent file states for checkpoint and restart.
Ruini Xue, Wenguang Chen, Weimin Zheng
2008 Data mining on the Cell Broadband Engine.
Gregory Buehrer, Srinivasan Parthasarathy, Matthew Goyder
2008 Efficient computation of sum-products on GPUs through software-managed cache.
Mark Silberstein, Assaf Schuster, Dan Geiger, Anjul Patney, John D. Owens
2008 Evaluating the effect of replacing CNK with Linux on the compute-nodes of Blue Gene/L.
Edi Shmueli, George Almási, José R. Brunheroto, José G. Castaños, Gábor Dózsa, Sameer Kumar, Derek Lieber
2008 Exploiting idle register classes for fast spill destination.
Fang Lu, Lei Wang, Xiaobing Feng, Zhiyuan Li, Zhaoqing Zhang
2008 Fast scan algorithms on graphics processors.
Yuri Dotsenko, Naga K. Govindaraju, Peter-Pike J. Sloan, Charles Boyd, John Manferdelli
2008 Focused prefetching: performance oriented prefetching based on commit stalls.
R. Manikantan, R. Govindarajan
2008 Implementing Wilson-Dirac operator on the Cell Broadband Engine.
Khaled Z. Ibrahim, François Bodin
2008 Many-core GPU computing with NVIDIA CUDA.
Mark J. Harris
2008 Optimizing irregular shared-memory applications for clusters.
Seung-Jai Min, Rudolf Eigenmann
2008 Orchestrating data transfer for the Cell/B.E. processor.
Tong Chen, Haibo Lin, Tao Zhang
2008 Performance portable optimizations for loops containing communication operations.
Costin Iancu, Wei Chen, Katherine A. Yelick
2008 Petaflop/s, seriously.
David E. Keyes
2008 Phasers: a unified deadlock-free construct for collective and point-to-point synchronization.
Jun Shirako, David M. Peixotto, Vivek Sarkar, William N. Scherer III
2008 Power-aware dynamic placement of HPC applications.
Akshat Verma, Puneet Ahuja, Anindya Neogi
2008 Preserving time in large-scale communication traces.
Prasun Ratn, Frank Mueller, Bronis R. de Supinski, Martin Schulz
2008 Proceedings of the 22nd Annual International Conference on Supercomputing, ICS 2008, Island of Kos, Greece, June 7-12, 2008
Pin Zhou
2008 Rotating register allocation with multiple rotating branches.
Suhyun Kim, Soo-Mook Moon
2008 Shifted declustering: a placement-ideal layout scheme for multi-way replication storage architecture.
Huijun Zhu, Peng Gu, Jun Wang
2008 Soft error vulnerability of iterative linear algebra methods.
Greg Bronevetsky, Bronis R. de Supinski
2008 The deep computing messaging framework: generalized scalable message passing on the Blue Gene/P supercomputer.
Sameer Kumar, Gábor Dózsa, Gheorghe Almási, Philip Heidelberger, Dong Chen, Mark Giampapa, Michael Blocksome, Ahmad Faraj, Jeff Parker, Joe Ratterman, Brian E. Smith, Charles Archer
2008 The shared-thread multiprocessor.
Jeffery A. Brown, Dean M. Tullsen
2008 Three-dimensional Delaunay refinement for multi-core processors.
Andrey N. Chernikov, Nikos Chrisochoides
2008 Timely offloading of result-data in HPC centers.
Henry M. Monti, Ali Raza Butt, Sudharshan S. Vazhkudai