ICS A

46 papers

Year	Title / Authors
2014	2014 International Conference on Supercomputing, ICS'14, Muenchen, Germany, June 10-13, 2014. Arndt Bode, Michael Gerndt, Per Stenström, Lawrence Rauchwerger, Barton P. Miller, Martin Schulz
2014	21st century computer architecture keynote at 2014 international conference on supercomputing (ICS). Mark D. Hill
2014	A performance perspective on energy efficient HPC links. Karthikeyan P. Saravanan, Paul M. Carpenter, Alex Ramírez
2014	A programming system for Xeon Phis with runtime SIMD parallelization. Xin Huo, Bin Ren, Gagan Agrawal
2014	Accelerating cache coherence mechanism with speculation. Jun Ohno, Kei Hiraki
2014	Acceleration of derivative calculations with application to radial basis function: finite-differences on the Intel MIC architecture. Gordon Erlebacher, Erik Saule, Natasha Flyer, Evan F. Bollig
2014	Addressing bandwidth contention in SMT multicores through scheduling. Josué Feliu, Julio Sahuquillo, Salvador Petit, José Duato
2014	An adaptive cross-architecture combination method for graph traversal. Yang You, Shuaiwen Leon Song, Darren J. Kerbyson
2014	An efficient two-dimensional blocking strategy for sparse matrix-vector multiplication on GPUs. Arash Ashari, Naser Sedaghati, John Eisenlohr, P. Sadayappan
2014	An end-to-end analysis of file system features on sparse virtual disks. Ruijin Zhou, Sankaran Sivathanu, Jinpyo Kim, Bing Tsai, Tao Li
2014	An optimal distributed load balancing algorithm for homogeneous work units. Akhil Langer
2014	Automating and optimizing data transfers for many-core coprocessors. Bin Ren, Nishkam Ravi, Yi Yang, Min Feng, Gagan Agrawal, Srimat T. Chakradhar
2014	Block value based insertion policy for high performance last-level caches. Lingda Li, Junlin Lu, Xu Cheng
2014	Collective memory transfers for multi-core chips. George Michelogiannakis, Alexander Williams, Samuel Williams, John Shalf
2014	DTail: a flexible approach to DRAM refresh management. Zehan Cui, Sally A. McKee, Zhongbin Zha, Yungang Bao, Mingyu Chen
2014	DWC: dynamic write consolidation for phase change memory systems. Fei Xia, Dejun Jiang, Jin Xiong, Mingyu Chen, Lixin Zhang, Ninghui Sun
2014	Effective automatic computation placement and data allocation for parallelization of regular programs. Chandan Reddy, Uday Bondhugula
2014	Evaluation of methods to integrate analysis into a large-scale shock physics code. Ron A. Oldfield, Kenneth Moreland, Nathan Fabian, David H. Rogers
2014	Galaxy: a high-performance energy-efficient multi-chip architecture using photonic interconnects. Yigit Demir, Yan Pan, Seokwoo Song, Nikos Hardavellas, John Kim, Gokhan Memik
2014	HOMR: a hybrid approach to exploit maximum overlapping in MapReduce over high performance interconnects. Md. Wasi-ur-Rahman, Xiaoyi Lu, Nusrat Sharmin Islam, Dhabaleswar K. Panda
2014	HPC for the Human Brain Project. Thomas Lippert
2014	Hardware-assisted scalable flow control of shared receive queue. Teruo Tanimoto, Takatsugu Ono, Kohta Nakashima, Takashi Miyoshi
2014	Implementing a classic: zero-copy all-to-all communication with MPI datatypes. Jesper Larsson Träff, Antoine Rougier, Sascha Hunold
2014	Improving performance by matching imbalanced workloads with heterogeneous platforms. Jie Shen, Ana Lucia Varbanescu, Peng Zou, Yutong Lu, Henk J. Sips
2014	Input-adaptive parallel sparse fast Fourier transform for stream processing. Shuo Chen, Xiaoming Li
2014	LAWS: locality-aware work-stealing for multi-socket multi-core architectures. Quan Chen, Minyi Guo, Haibing Guan
2014	Last-level cache deduplication. Yingying Tian, Samira Manabi Khan, Daniel A. Jiménez, Gabriel H. Loh
2014	Load balancing n-body simulations with highly non-uniform density. Olga Pearce, Todd Gamblin, Bronis R. de Supinski, Tom Arsenlis, Nancy M. Amato
2014	Long-term resource fairness: towards economic fairness on pay-as-you-use computing systems. Shanjiang Tang, Bu-Sung Lee, Bingsheng He, Haikun Liu
2014	MT-MPI: multithreaded MPI for many-core environments. Min Si, Antonio J. Peña, Pavan Balaji, Masamichi Takagi, Yutaka Ishikawa
2014	Multi-stage coordinated prefetching for present-day processors. Sanyam Mehta, Zhenman Fang, Antonia Zhai, Pen-Chung Yew
2014	On the conditions for efficient interoperability with threads: an experience with PGAS languages using Cray communication domains. Khaled Z. Ibrahim, Katherine A. Yelick
2014	Palm: easing the burden of analytical performance modeling. Nathan R. Tallent, Adolfy Hoisie
2014	Parallelizing and optimizing sparse tensor computations. Muthu Manikandan Baskaran, Benoît Meister, Richard Lethin
2014	Reducing energy consumption of NoC by router bypassing. Takahiro Naruko
2014	Revealing applications' access pattern in collective I/O for cache management. Yin Lu, Yong Chen, Robert Latham, Yu Zhuang
2014	Scalable analysis of multicore data reuse and sharing. Miquel Pericàs, Kenjiro Taura, Satoshi Matsuoka
2014	Scalable performance analysis of exascale MPI programs through signature-based clustering algorithms. Amir Bahmani, Frank Mueller
2014	Scaling up matrix computations on shared-memory manycore systems with 1000 CPU cores. Fengguang Song, Jack J. Dongarra
2014	Supporting storage configuration for I/O intensive workflows. Lauro Beltrão Costa, Samer Al-Kiswany, Hao Yang, Matei Ripeanu
2014	The future of supercomputing. Marc Snir
2014	Thread-cooperative, bit-parallel computation of Levenshtein distance on GPU. Alejandro Chacón, Santiago Marco-Sola, Antonio Espinosa, Paolo Ribeca, Juan Carlos Moure
2014	Understanding the impact of threshold voltage on MLC flash memory performance and reliability. Wei Wang, Tao Xie, Deng Zhou
2014	Unified on-chip memory allocation for SIMT architecture. Ari B. Hayes, Eddy Z. Zhang
2014	Value influence analysis for message passing applications. Philip C. Roth, Jeremy S. Meredith
2014	Verifying micro-architecture simulators using event traces. Hui Meen Nyew, Nilufer Onder, Soner Önder, Zhenlin Wang