ICS A

46 papers

Year Title / Authors
2014 2014 International Conference on Supercomputing, ICS'14, Muenchen, Germany, June 10-13, 2014
Arndt Bode, Michael Gerndt, Per Stenström, Lawrence Rauchwerger, Barton P. Miller, Martin Schulz
2014 21st century computer architecture keynote at 2014 International Conference on Supercomputing (ICS).
Mark D. Hill
2014 A performance perspective on energy efficient HPC links.
Karthikeyan P. Saravanan, Paul M. Carpenter, Alex Ramírez
2014 A programming system for Xeon Phis with runtime SIMD parallelization.
Xin Huo, Bin Ren, Gagan Agrawal
2014 Accelerating cache coherence mechanism with speculation.
Jun Ohno, Kei Hiraki
2014 Acceleration of derivative calculations with application to radial basis function: finite-differences on the Intel MIC architecture.
Gordon Erlebacher, Erik Saule, Natasha Flyer, Evan F. Bollig
2014 Addressing bandwidth contention in SMT multicores through scheduling.
Josué Feliu, Julio Sahuquillo, Salvador Petit, José Duato
2014 An adaptive cross-architecture combination method for graph traversal.
Yang You, Shuaiwen Leon Song, Darren J. Kerbyson
2014 An efficient two-dimensional blocking strategy for sparse matrix-vector multiplication on GPUs.
Arash Ashari, Naser Sedaghati, John Eisenlohr, P. Sadayappan
2014 An end-to-end analysis of file system features on sparse virtual disks.
Ruijin Zhou, Sankaran Sivathanu, Jinpyo Kim, Bing Tsai, Tao Li
2014 An optimal distributed load balancing algorithm for homogeneous work units.
Akhil Langer
2014 Automating and optimizing data transfers for many-core coprocessors.
Bin Ren, Nishkam Ravi, Yi Yang, Min Feng, Gagan Agrawal, Srimat T. Chakradhar
2014 Block value based insertion policy for high performance last-level caches.
Lingda Li, Junlin Lu, Xu Cheng
2014 Collective memory transfers for multi-core chips.
George Michelogiannakis, Alexander Williams, Samuel Williams, John Shalf
2014 DTail: a flexible approach to DRAM refresh management.
Zehan Cui, Sally A. McKee, Zhongbin Zha, Yungang Bao, Mingyu Chen
2014 DWC: dynamic write consolidation for phase change memory systems.
Fei Xia, Dejun Jiang, Jin Xiong, Mingyu Chen, Lixin Zhang, Ninghui Sun
2014 Effective automatic computation placement and data allocation for parallelization of regular programs.
Chandan Reddy, Uday Bondhugula
2014 Evaluation of methods to integrate analysis into a large-scale shock physics code.
Ron A. Oldfield, Kenneth Moreland, Nathan Fabian, David H. Rogers
2014 Galaxy: a high-performance energy-efficient multi-chip architecture using photonic interconnects.
Yigit Demir, Yan Pan, Seokwoo Song, Nikos Hardavellas, John Kim, Gokhan Memik
2014 HOMR: a hybrid approach to exploit maximum overlapping in MapReduce over high performance interconnects.
Md. Wasi-ur-Rahman, Xiaoyi Lu, Nusrat Sharmin Islam, Dhabaleswar K. Panda
2014 HPC for the Human Brain Project.
Thomas Lippert
2014 Hardware-assisted scalable flow control of shared receive queue.
Teruo Tanimoto, Takatsugu Ono, Kohta Nakashima, Takashi Miyoshi
2014 Implementing a classic: zero-copy all-to-all communication with MPI datatypes.
Jesper Larsson Träff, Antoine Rougier, Sascha Hunold
2014 Improving performance by matching imbalanced workloads with heterogeneous platforms.
Jie Shen, Ana Lucia Varbanescu, Peng Zou, Yutong Lu, Henk J. Sips
2014 Input-adaptive parallel sparse fast Fourier transform for stream processing.
Shuo Chen, Xiaoming Li
2014 LAWS: locality-aware work-stealing for multi-socket multi-core architectures.
Quan Chen, Minyi Guo, Haibing Guan
2014 Last-level cache deduplication.
Yingying Tian, Samira Manabi Khan, Daniel A. Jiménez, Gabriel H. Loh
2014 Load balancing n-body simulations with highly non-uniform density.
Olga Pearce, Todd Gamblin, Bronis R. de Supinski, Tom Arsenlis, Nancy M. Amato
2014 Long-term resource fairness: towards economic fairness on pay-as-you-use computing systems.
Shanjiang Tang, Bu-Sung Lee, Bingsheng He, Haikun Liu
2014 MT-MPI: multithreaded MPI for many-core environments.
Min Si, Antonio J. Peña, Pavan Balaji, Masamichi Takagi, Yutaka Ishikawa
2014 Multi-stage coordinated prefetching for present-day processors.
Sanyam Mehta, Zhenman Fang, Antonia Zhai, Pen-Chung Yew
2014 On the conditions for efficient interoperability with threads: an experience with PGAS languages using Cray communication domains.
Khaled Z. Ibrahim, Katherine A. Yelick
2014 Palm: easing the burden of analytical performance modeling.
Nathan R. Tallent, Adolfy Hoisie
2014 Parallelizing and optimizing sparse tensor computations.
Muthu Manikandan Baskaran, Benoît Meister, Richard Lethin
2014 Reducing energy consumption of NoC by router bypassing.
Takahiro Naruko
2014 Revealing applications' access pattern in collective I/O for cache management.
Yin Lu, Yong Chen, Robert Latham, Yu Zhuang
2014 Scalable analysis of multicore data reuse and sharing.
Miquel Pericàs, Kenjiro Taura, Satoshi Matsuoka
2014 Scalable performance analysis of exascale MPI programs through signature-based clustering algorithms.
Amir Bahmani, Frank Mueller
2014 Scaling up matrix computations on shared-memory manycore systems with 1000 CPU cores.
Fengguang Song, Jack J. Dongarra
2014 Supporting storage configuration for I/O intensive workflows.
Lauro Beltrão Costa, Samer Al-Kiswany, Hao Yang, Matei Ripeanu
2014 The future of supercomputing.
Marc Snir
2014 Thread-cooperative, bit-parallel computation of Levenshtein distance on GPU.
Alejandro Chacón, Santiago Marco-Sola, Antonio Espinosa, Paolo Ribeca, Juan Carlos Moure
2014 Understanding the impact of threshold voltage on MLC flash memory performance and reliability.
Wei Wang, Tao Xie, Deng Zhou
2014 Unified on-chip memory allocation for SIMT architecture.
Ari B. Hayes, Eddy Z. Zhang
2014 Value influence analysis for message passing applications.
Philip C. Roth, Jeremy S. Meredith
2014 Verifying micro-architecture simulators using event traces.
Hui Meen Nyew, Nilufer Onder, Soner Önder, Zhenlin Wang