IPDPS - RankMe

118 papers

Year	Title / Authors
2014	2014 IEEE 28th International Parallel and Distributed Processing Symposium, Phoenix, AZ, USA, May 19-23, 2014
2014	A Case for a Flexible Scalar Unit in SIMT Architecture. Yi Yang, Ping Xiang, Michael Mantor, Norman Rubin, Lisa R. Hsu, Qunfeng Dong, Huiyang Zhou
2014	A Coprocessor Sharing-Aware Scheduler for Xeon Phi-Based Compute Clusters. Giuseppe Coviello, Srihari Cadambi, Srimat T. Chakradhar
2014	A Framework for Lattice QCD Calculations on GPUs. Frank Tobias Winter, Mike A. Clark, Robert G. Edwards, Bálint Joó
2014	A Medium-Grain Method for Fast 2D Bipartitioning of Sparse Matrices. Daniël Maria Pelt, Rob H. Bisseling
2014	A Multi-core Parallel Branch-and-Bound Algorithm Using Factorial Number System. Mohand Mezmaz, Rudi Leroy, Nouredine Melab, Daniel Tuyttens
2014	A New Scalable Parallel Algorithm for Fock Matrix Construction. Xing Liu, Aftab Patel, Edmond Chow
2014	A Spatio-temporal Coupling Method to Reduce the Time-to-Solution of Cardiovascular Simulations. Amanda Peters Randles, Efthimios Kaxiras
2014	A Step towards Energy Efficient Computing: Redesigning a Hydrodynamic Application on CPU-GPU. Tingxing Dong, Veselin Dobrev, Tzanio V. Kolev, Robert N. Rieben, Stanimire Tomov, Jack J. Dongarra
2014	Accelerating MPI Collective Communications through Hierarchical Algorithms Without Sacrificing Inter-Node Communication Flexibility. Benjamin S. Parsons, Vijay S. Pai
2014	Active Measurement of Memory Resource Consumption. Marc Casas, Greg Bronevetsky
2014	Active Measurement of the Impact of Network Switch Utilization on Application Performance. Marc Casas, Greg Bronevetsky
2014	Algorithmic Time, Energy, and Power on Candidate HPC Compute Building Blocks. Jee W. Choi, Marat Dukhan, Xing Liu, Richard W. Vuduc
2014	An Accelerated Recursive Doubling Algorithm for Block Tridiagonal Systems. Sudip K. Seal
2014	An Efficient GPU General Sparse Matrix-Matrix Multiplication for Irregular Data. Weifeng Liu, Brian Vinter
2014	An Efficient Method for Stream Semantics over RDMA. Patrick MacArthur, Robert D. Russell
2014	An Evaluation of One-Sided and Two-Sided Communication Paradigms on Relaxed-Ordering Interconnect. Khaled Z. Ibrahim, Paul Hargrove, Costin Iancu, Katherine A. Yelick
2014	An Improved Router Design for Reliable On-Chip Networks. Pavan Poluri, Ahmed Louri
2014	Analytically Modeling Application Execution for Software-Hardware Co-design. Jichi Guo, Jiayuan Meng, Qing Yi, Vitali A. Morozov, Kalyan Kumaran
2014	Anatomy of High-Performance Many-Threaded Matrix Multiplication. Tyler M. Smith, Robert A. van de Geijn, Mikhail Smelyanskiy, Jeff R. Hammond, Field G. Van Zee
2014	Astrophysical Applications of Machine Learning at Scale and under Duress. Joshua S. Bloom
2014	Auto-Tuning Dedispersion for Many-Core Accelerators. Alessio Sclocco, Henri E. Bal, Jason W. T. Hessels, Joeri van Leeuwen, Rob van Nieuwpoort
2014	BFS and Coloring-Based Parallel Algorithms for Strongly Connected Components and Related Problems. George M. Slota, Sivasankaran Rajamanickam, Kamesh Madduri
2014	Balancing CPU-GPU Collaborative High-Order CFD Simulations on the Tianhe-1A Supercomputer. Chuanfu Xu, Lilun Zhang, Xiaogang Deng, Jianbin Fang, Guangxue Wang, Wei Cao, Yonggang Che, Yongxian Wang, Wei Liu
2014	Balancing On-Chip Network Latency in Multi-application Mapping for Chip-Multiprocessors. Di Zhu, Lizhong Chen, Siyu Yue, Timothy Mark Pinkston, Massoud Pedram
2014	BigKernel - High Performance CPU-GPU Communication Pipelining for Big Data-Style Applications. Reza Mokhtari, Michael Stumm
2014	Bipartite Matching Heuristics with Quality Guarantees on Shared Memory Parallel Computers. Fanny Dufossé, Kamer Kaya, Bora Uçar
2014	Bursting the Cloud Data Bubble: Towards Transparent Storage Elasticity in IaaS Clouds. Bogdan Nicolae, Pierre Riteau, Kate Keahey
2014	CALCioM: Mitigating I/O Interference in HPC Systems through Cross-Application Coordination. Matthieu Dorier, Gabriel Antoniu, Robert B. Ross, Dries Kimpe, Shadi Ibrahim
2014	Characterization and Optimization of Memory-Resident MapReduce on HPC Systems. Yandong Wang, Robin Goldstone, Weikuan Yu, Teng Wang
2014	Characterization of Impact of Transient Faults and Detection of Data Corruption Errors in Large-Scale N-Body Programs Using Graphics Processing Units. Keun Soo Yim
2014	Collaborative Network Configuration in Hybrid Electrical/Optical Data Center Networks. Zhiyang Guo, Yuanyuan Yang
2014	Communication-Efficient Distributed Variance Monitoring and Outlier Detection for Multivariate Time Series. Moshe Gabel, Assaf Schuster, Daniel Keren
2014	Comparative Performance Analysis of Intel (R) Xeon Phi (TM), GPU, and CPU: A Case Study from Microscopy Image Analysis. George Teodoro, Tahsin M. Kurç, Jun Kong, Lee Cooper, Joel H. Saltz
2014	Complex Network Analysis Using Parallel Approximate Motif Counting. George M. Slota, Kamesh Madduri
2014	Computational Co-design of a Multiscale Plasma Application: A Process and Initial Results. Joshua Payne, Dana A. Knoll, Allen McPherson, William T. Taitano, Luis Chacón, Guangye Chen, Scott Pakin
2014	Cost-Efficient and Resilient Job Life-Cycle Management on Hybrid Clouds. Hsuan-Yi Chu, Yogesh Simmhan
2014	Cost-Optimal Execution of Boolean Query Trees with Shared Streams. Henri Casanova, Lipyeow Lim, Yves Robert, Frédéric Vivien, Dounia Zaidouni
2014	DEX: Self-Healing Expanders. Gopal Pandurangan, Peter Robinson, Amitabh Trehan
2014	DataMPI: Extending MPI to Hadoop-Like Big Data Computing. Xiaoyi Lu, Fan Liang, Bing Wang, Li Zha, Zhiwei Xu
2014	Designing Bit-Reproducible Portable High-Performance Applications. Andrea Arteaga, Oliver Fuhrer, Torsten Hoefler
2014	Designing LU-QR Hybrid Solvers for Performance and Stability. Mathieu Faverge, Julien Herrmann, Julien Langou, Bradley R. Lowery, Yves Robert, Jack J. Dongarra
2014	EDM: An Endurance-Aware Data Migration Scheme for Load Balancing in SSD Storage Clusters. Jiaxin Ou, Jiwu Shu, Youyou Lu, Letian Yi, Wei Wang
2014	Effectively Exploiting Parallel Scale for All Problem Sizes in LU Factorization. Md Rakib Hasan, R. Clint Whaley
2014	Efficient Data Race Detection for C/C++ Programs Using Dynamic Granularity. Young Wn Song, Yann-Hang Lee
2014	Efficient Multi-GPU Computation of All-Pairs Shortest Paths. Hristo N. Djidjev, Sunil Thulasidasan, Guillaume Chapuis, Rumen Andonov, Dominique Lavenier
2014	Enabling In-Situ Data Analysis for Large Protein-Folding Trajectory Datasets. Boyu Zhang, Trilce Estrada, Pietro Cicotti, Michela Taufer
2014	Enabling and Scaling a Global Shallow-Water Atmospheric Model on Tianhe-2. Wei Xue, Chao Yang, Haohuan Fu, Xinliang Wang, Yangtong Xu, Lin Gan, Yutong Lu, Xiaoqian Zhu
2014	Energy Efficient HPC on Embedded SoCs: Optimization Techniques for Mali GPU. Ivan Grasso, Petar Radojkovic, Nikola Rajovic, Isaac Gelado, Alex Ramírez
2014	Energy-Efficient Time-Division Multiplexed Hybrid-Switched NoC for Heterogeneous Multicore Systems. Jieming Yin, Pingqiang Zhou, Sachin S. Sapatnekar, Antonia Zhai
2014	Evaluating the Impact of SDC on the GMRES Iterative Solver. James Elliott, Mark Hoemmen, Frank Mueller
2014	Exploiting Geometric Partitioning in Task Mapping for Parallel Computers. Mehmet Deveci, Sivasankaran Rajamanickam, Vitus J. Leung, Kevin T. Pedretti, Stephen L. Olivier, David P. Bunde, Ümit V. Çatalyürek, Karen D. Devine
2014	F-SEFI: A Fine-Grained Soft Error Fault Injection Tool for Profiling Application Vulnerability. Qiang Guan, Nathan DeBardeleben, Sean Blanchard, Song Fu
2014	F2C2-STM: Flux-Based Feedback-Driven Concurrency Control for STMs. Kaushik Ravichandran, Santosh Pande
2014	FMI: Fault Tolerant Messaging Interface for Fast and Transparent Recovery. Kento Sato, Adam Moody, Kathryn M. Mohror, Todd Gamblin, Bronis R. de Supinski, Naoya Maruyama, Satoshi Matsuoka
2014	Fair Maximal Independent Sets. Jeremy T. Fineman, Calvin C. Newport, Micah Sherr, Tonghe Wang
2014	Finding Motifs in Biological Sequences Using the Micron Automata Processor. Indranil Roy, Srinivas Aluru
2014	Generalizing Run-Time Tiling with the Loop Chain Abstraction. Michelle Mills Strout, Fabio Luporini, Christopher D. Krieger, Carlo Bertolli, Gheorghe-Teodor Bercea, Catherine Olschanowsky, J. Ramanujam, Paul H. J. Kelly
2014	HPMMAP: Lightweight Memory Management for Commodity Operating Systems. Brian Kocoloski, John R. Lange
2014	Heterogeneity-Aware Workload Placement and Migration in Distributed Sustainable Datacenters. Dazhao Cheng, Changjun Jiang, Xiaobo Zhou
2014	High Performance Alltoall and Allgather Designs for InfiniBand MIC Clusters. Akshay Venkatesh, Sreeram Potluri, Raghunath Rajachandrasekar, Miao Luo, Khaled Hamidouche, Dhabaleswar K. Panda
2014	How Well Do Graph-Processing Platforms Perform? An Empirical Performance Evaluation and Analysis. Yong Guo, Marcin Biczak, Ana Lucia Varbanescu, Alexandru Iosup, Claudio Martella, Theodore L. Willke
2014	Identifying Code Phases Using Piece-Wise Linear Regressions. Harald Servat, Germán Llort, Juan Gonzalez, Judit Giménez, Jesús Labarta
2014	Improved Time Bounds for Linearizable Implementations of Abstract Data Types. Jiaqi Wang, Edward Talmage, Hyunyoung Lee, Jennifer L. Welch
2014	Improving Communication Performance and Scalability of Native Applications on Intel Xeon Phi Coprocessor Clusters. Karthikeyan Vaidyanathan, Kiran Pamnany, Dhiraj D. Kalamkar, Alexander Heinecke, Mikhail Smelyanskiy, Jongsoo Park, Daehyun Kim, Aniruddha G. Shet, Bharat Kaul, Bálint Joó, Pradeep Dubey
2014	Improving the Performance of CA-GMRES on Multicores with Multiple GPUs. Ichitaro Yamazaki, Hartwig Anzt, Stanimire Tomov, Mark Hoemmen, Jack J. Dongarra
2014	Interactive Program Debugging and Optimization for Directive-Based, Efficient GPU Computing. Seyong Lee, Dong Li, Jeffrey S. Vetter
2014	It's About Time: On Optimal Virtual Network Embeddings under Temporal Flexibilities. Matthias Rost, Stefan Schmid, Anja Feldmann
2014	LFTI: A New Performance Metric for Assessing Interconnect Designs for Extreme-Scale HPC Systems. Xin Yuan, Santosh Mahapatra, Michael Lang, Scott Pakin
2014	Large-Scale Hydrodynamic Brownian Simulations on Multicore and Manycore Architectures. Xing Liu, Edmond Chow
2014	Locating Parallelization Potential in Object-Oriented Data Structures. Korbinian Molitorisz, Thomas Karcher, Alexander Biele, Walter F. Tichy
2014	MIC-SVM: Designing a Highly Efficient Support Vector Machine for Advanced Modern Multi-core and Many-Core Architectures. Yang You, Shuaiwen Leon Song, Haohuan Fu, Andres Marquez, Maryam Mehri Dehnavi, Kevin J. Barker, Kirk W. Cameron, Amanda Peters Randles, Guangwen Yang
2014	MapReuse: Reusing Computation in an In-Memory MapReduce System. Devesh Tiwari, Yan Solihin
2014	Mitigating the Mismatch between the Coherence Protocol and Conflict Detection in Hardware Transactional Memory. Lihang Zhao, Lizhong Chen, Jeffrey T. Draper
2014	MobiStreams: A Reliable Distributed Stream Processing System for Mobile Devices. Huayong Wang, Li-Shiuan Peh
2014	Multi-resource Real-Time Reader/Writer Locks for Multiprocessors. Bryan C. Ward, James H. Anderson
2014	New Effective Multithreaded Matching Algorithms. Fredrik Manne, Mahantesh Halappanavar
2014	Nitro: A Framework for Adaptive Code Variant Tuning. Saurav Muralidharan, Manu Shantharam, Mary W. Hall, Michael Garland, Bryan Catanzaro
2014	Online Server and Workload Management for Joint Optimization of Electricity Cost and Carbon Footprint Across Data Centers. Zahra Abbasi, Madhurima Pore, Sandeep K. S. Gupta
2014	Optimization of Multi-level Checkpoint Model for Large Scale HPC Applications. Sheng Di, Mohamed-Slim Bouguerra, Leonardo Arturo Bautista-Gomez, Franck Cappello
2014	Optimizing Bandwidth Allocation in Flex-Grid Optical Networks with Application to Scheduling. Hadas Shachnai, Ariella Voloshin, Shmuel Zaks
2014	Optimizing Sparse Matrix-Multiple Vectors Multiplication for Nuclear Configuration Interaction Calculations. Hasan Metin Aktulga, Aydin Buluç, Samuel Williams, Chao Yang
2014	Overcoming the Limitations Posed by TCR-beta Repertoire Modeling through a GPU-Based In-Silico DNA Recombination Algorithm. Gregory M. Striemer, Harsha Krovi, Ali Akoglu, Benjamin Vincent, Ben Hopson, Jeffrey Frelinger, Adam Buntzman
2014	Overcoming the Scalability Challenges of Epidemic Simulations on Blue Waters. Jae-Seung Yeom, Abhinav Bhatele, Keith R. Bisset, Eric J. Bohm, Abhishek Gupta, Laxmikant V. Kalé, Madhav V. Marathe, Dimitrios S. Nikolopoulos, Martin Schulz, Lukasz Wesolowski
2014	PAGE: A Framework for Easy PArallelization of GEnomic Applications. Mücahid Kutlu, Gagan Agrawal
2014	POD: Performance Oriented I/O Deduplication for Primary Storage Systems in the Cloud. Bo Mao, Hong Jiang, Suzhen Wu, Lei Tian
2014	Parallel Mutual Information Based Construction of Whole-Genome Networks on the Intel (R) Xeon Phi (TM) Coprocessor. Sanchit Misra, Kiran Pamnany, Srinivas Aluru
2014	Performance and Energy Analysis of the Restricted Transactional Memory Implementation on Haswell. Bhavishya Goel, J. Rubén Titos Gil, Anurag Negi, Sally A. McKee, Per Stenström
2014	Petascale Application of a Coupled CPU-GPU Algorithm for Simulation and Analysis of Multiphase Flow Solutions in Porous Medium Systems. James E. McClure, Hao Wang, Jan F. Prins, Cass T. Miller, Wu-chun Feng
2014	Petascale General Solver for Semidefinite Programming Problems with Over Two Million Constraints. Katsuki Fujisawa, Toshio Endo, Yuichiro Yasui, Hitoshi Sato, Naoki Matsuzawa, Satoshi Matsuoka, Hayato Waki
2014	Pipelined Compaction for the LSM-Tree. Zigang Zhang, Yinliang Yue, Bingsheng He, Jin Xiong, Mingyu Chen, Lixin Zhang, Ninghui Sun
2014	Power and Performance Characterization and Modeling of GPU-Accelerated Systems. Yuki Abe, Hiroshi Sasaki, Shinpei Kato, Koji Inoue, Masato Edahiro, Martin Peres
2014	Power-Efficient Multiple Producer-Consumer. Ramy Medhat, Borzoo Bonakdarpour, Sebastian Fischmeister
2014	Pythia: Faster Big Data in Motion through Predictive Software-Defined Network Optimization at Runtime. Marcelo Veiga Neves, César A. F. De Rose, Kostas Katrinis, Hubertus Franke
2014	RCMP: Enabling Efficient Recomputation Based Failure Resilience for Big Data Analytics. Florin Dinu, T. S. Eugene Ng
2014	ReDHiP: Recalibrating Deep Hierarchy Prediction for Energy Efficiency. Xun Li, Diana Franklin, Ricardo Bianchini, Frederic T. Chong
2014	Reading the Tea-Leaves: How Architecture Has Evolved at the High End. Peter M. Kogge
2014	Reconstructing Householder Vectors from Tall-Skinny QR. Grey Ballard, James Demmel, Laura Grigori, Mathias Jacquelin, Hong Diep Nguyen, Edgar Solomonik
2014	Remote Invalidation: Optimizing the Critical Path of Memory Transactions. Ahmed Hassan, Roberto Palmieri, Binoy Ravindran
2014	Revisiting Asynchronous Linear Solvers: Provable Convergence Rate through Randomization. Haim Avron, Alex Druinsky, Anshul Gupta
2014	Runtime-Guided Cache Coherence Optimizations in Multi-core Architectures. Madhavan Manivannan, Per Stenström
2014	Scalability-Centric HPC System Design. Yutong Lu
2014	Scalable Single Source Shortest Path Algorithms for Massively Parallel Systems. Venkatesan T. Chakaravarthy, Fabio Checconi, Fabrizio Petrini, Yogish Sabharwal
2014	Scalar Waving: Improving the Efficiency of SIMD Execution on GPUs. Ayse Yilmazer, Zhongliang Chen, David R. Kaeli
2014	Scaling Irregular Applications through Data Aggregation and Software Multithreading. Alessandro Morari, Antonino Tumeo, Daniel G. Chavarría-Miranda, Oreste Villa, Mateo Valero
2014	Scibox: Online Sharing of Scientific Data via the Cloud. Jian Huang, Xuechen Zhang, Greg Eisenhauer, Karsten Schwan, Matthew Wolf, Stéphane Ethier, Scott Klasky
2014	Shedding Light on Lithium/Air Batteries Using Millions of Threads on the BG/Q Supercomputer. Valéry Weber, Costas Bekas, Teodoro Laino, Alessandro Curioni, Adam Bertsch, Scott Futral
2014	Skywalk: A Topology for HPC Networks with Low-Delay Switches. Ikki Fujiwara, Michihiro Koibuchi, Hiroki Matsutani, Henri Casanova
2014	TBPoint: Reducing Simulation Time for Large-Scale GPGPU Kernels. Jen-Cheng Huang, Lifeng Nai, Hyesoon Kim, Hsien-Hsin S. Lee
2014	Traversing Trillions of Edges in Real Time: Graph Exploration on Large-Scale Parallel Machines. Fabio Checconi, Fabrizio Petrini
2014	UPC++: A PGAS Extension for C++. Yili Zheng, Amir Kamil, Michael B. Driscoll, Hongzhang Shan, Katherine A. Yelick
2014	Unified Development for Mixed Multi-GPU and Multi-coprocessor Environments Using a Lightweight Runtime Environment. Azzam Haidar, Chongxiao Cao, Asim YarKhan, Piotr Luszczek, Stanimire Tomov, Khairul Kabir, Jack J. Dongarra
2014	Using Load Balancing to Scalably Parallelize Sampling-Based Motion Planning Algorithms. Adam Fidel, Sam Ade Jacobs, Shishir Sharma, Nancy M. Amato, Lawrence Rauchwerger
2014	Using Multiple Threads to Accelerate Single Thread Performance. Zehra Sura, Kevin O'Brien, José R. Brunheroto
2014	Victim Selection and Distributed Work Stealing Performance: A Case Study. Swann Perarnau, Mitsuhisa Sato
2014	Work-Efficient Parallel GPU Methods for Single-Source Shortest Paths. Andrew A. Davidson, Sean Baxter, Michael Garland, John D. Owens
2014	cuBLASTP: Fine-Grained Parallelization of Protein Sequence Search on a GPU. Jing Zhang, Hao Wang, Heshan Lin, Wu-chun Feng
2014	s-Step Krylov Subspace Methods as Bottom Solvers for Geometric Multigrid. Samuel Williams, Mike Lijewski, Ann S. Almgren, Brian van Straalen, Erin Carson, Nicholas Knight, James Demmel