IPDPS A

118 papers

Year Title / Authors
2014 2014 IEEE 28th International Parallel and Distributed Processing Symposium, Phoenix, AZ, USA, May 19-23, 2014
2014 A Case for a Flexible Scalar Unit in SIMT Architecture.
Yi Yang, Ping Xiang, Michael Mantor, Norman Rubin, Lisa R. Hsu, Qunfeng Dong, Huiyang Zhou
2014 A Coprocessor Sharing-Aware Scheduler for Xeon Phi-Based Compute Clusters.
Giuseppe Coviello, Srihari Cadambi, Srimat T. Chakradhar
2014 A Framework for Lattice QCD Calculations on GPUs.
Frank Tobias Winter, Mike A. Clark, Robert G. Edwards, Bálint Joó
2014 A Medium-Grain Method for Fast 2D Bipartitioning of Sparse Matrices.
Daniël Maria Pelt, Rob H. Bisseling
2014 A Multi-core Parallel Branch-and-Bound Algorithm Using Factorial Number System.
Mohand Mezmaz, Rudi Leroy, Nouredine Melab, Daniel Tuyttens
2014 A New Scalable Parallel Algorithm for Fock Matrix Construction.
Xing Liu, Aftab Patel, Edmond Chow
2014 A Spatio-temporal Coupling Method to Reduce the Time-to-Solution of Cardiovascular Simulations.
Amanda Peters Randles, Efthimios Kaxiras
2014 A Step towards Energy Efficient Computing: Redesigning a Hydrodynamic Application on CPU-GPU.
Tingxing Dong, Veselin Dobrev, Tzanio V. Kolev, Robert N. Rieben, Stanimire Tomov, Jack J. Dongarra
2014 Accelerating MPI Collective Communications through Hierarchical Algorithms Without Sacrificing Inter-Node Communication Flexibility.
Benjamin S. Parsons, Vijay S. Pai
2014 Active Measurement of Memory Resource Consumption.
Marc Casas, Greg Bronevetsky
2014 Active Measurement of the Impact of Network Switch Utilization on Application Performance.
Marc Casas, Greg Bronevetsky
2014 Algorithmic Time, Energy, and Power on Candidate HPC Compute Building Blocks.
Jee W. Choi, Marat Dukhan, Xing Liu, Richard W. Vuduc
2014 An Accelerated Recursive Doubling Algorithm for Block Tridiagonal Systems.
Sudip K. Seal
2014 An Efficient GPU General Sparse Matrix-Matrix Multiplication for Irregular Data.
Weifeng Liu, Brian Vinter
2014 An Efficient Method for Stream Semantics over RDMA.
Patrick MacArthur, Robert D. Russell
2014 An Evaluation of One-Sided and Two-Sided Communication Paradigms on Relaxed-Ordering Interconnect.
Khaled Z. Ibrahim, Paul Hargrove, Costin Iancu, Katherine A. Yelick
2014 An Improved Router Design for Reliable On-Chip Networks.
Pavan Poluri, Ahmed Louri
2014 Analytically Modeling Application Execution for Software-Hardware Co-design.
Jichi Guo, Jiayuan Meng, Qing Yi, Vitali A. Morozov, Kalyan Kumaran
2014 Anatomy of High-Performance Many-Threaded Matrix Multiplication.
Tyler M. Smith, Robert A. van de Geijn, Mikhail Smelyanskiy, Jeff R. Hammond, Field G. Van Zee
2014 Astrophysical Applications of Machine Learning at Scale and under Duress.
Joshua S. Bloom
2014 Auto-Tuning Dedispersion for Many-Core Accelerators.
Alessio Sclocco, Henri E. Bal, Jason W. T. Hessels, Joeri van Leeuwen, Rob van Nieuwpoort
2014 BFS and Coloring-Based Parallel Algorithms for Strongly Connected Components and Related Problems.
George M. Slota, Sivasankaran Rajamanickam, Kamesh Madduri
2014 Balancing CPU-GPU Collaborative High-Order CFD Simulations on the Tianhe-1A Supercomputer.
Chuanfu Xu, Lilun Zhang, Xiaogang Deng, Jianbin Fang, Guangxue Wang, Wei Cao, Yonggang Che, Yongxian Wang, Wei Liu
2014 Balancing On-Chip Network Latency in Multi-application Mapping for Chip-Multiprocessors.
Di Zhu, Lizhong Chen, Siyu Yue, Timothy Mark Pinkston, Massoud Pedram
2014 BigKernel - High Performance CPU-GPU Communication Pipelining for Big Data-Style Applications.
Reza Mokhtari, Michael Stumm
2014 Bipartite Matching Heuristics with Quality Guarantees on Shared Memory Parallel Computers.
Fanny Dufossé, Kamer Kaya, Bora Uçar
2014 Bursting the Cloud Data Bubble: Towards Transparent Storage Elasticity in IaaS Clouds.
Bogdan Nicolae, Pierre Riteau, Kate Keahey
2014 CALCioM: Mitigating I/O Interference in HPC Systems through Cross-Application Coordination.
Matthieu Dorier, Gabriel Antoniu, Robert B. Ross, Dries Kimpe, Shadi Ibrahim
2014 Characterization and Optimization of Memory-Resident MapReduce on HPC Systems.
Yandong Wang, Robin Goldstone, Weikuan Yu, Teng Wang
2014 Characterization of Impact of Transient Faults and Detection of Data Corruption Errors in Large-Scale N-Body Programs Using Graphics Processing Units.
Keun Soo Yim
2014 Collaborative Network Configuration in Hybrid Electrical/Optical Data Center Networks.
Zhiyang Guo, Yuanyuan Yang
2014 Communication-Efficient Distributed Variance Monitoring and Outlier Detection for Multivariate Time Series.
Moshe Gabel, Assaf Schuster, Daniel Keren
2014 Comparative Performance Analysis of Intel (R) Xeon Phi (TM), GPU, and CPU: A Case Study from Microscopy Image Analysis.
George Teodoro, Tahsin M. Kurç, Jun Kong, Lee Cooper, Joel H. Saltz
2014 Complex Network Analysis Using Parallel Approximate Motif Counting.
George M. Slota, Kamesh Madduri
2014 Computational Co-design of a Multiscale Plasma Application: A Process and Initial Results.
Joshua Payne, Dana A. Knoll, Allen McPherson, William T. Taitano, Luis Chacón, Guangye Chen, Scott Pakin
2014 Cost-Efficient and Resilient Job Life-Cycle Management on Hybrid Clouds.
Hsuan-Yi Chu, Yogesh Simmhan
2014 Cost-Optimal Execution of Boolean Query Trees with Shared Streams.
Henri Casanova, Lipyeow Lim, Yves Robert, Frédéric Vivien, Dounia Zaidouni
2014 DEX: Self-Healing Expanders.
Gopal Pandurangan, Peter Robinson, Amitabh Trehan
2014 DataMPI: Extending MPI to Hadoop-Like Big Data Computing.
Xiaoyi Lu, Fan Liang, Bing Wang, Li Zha, Zhiwei Xu
2014 Designing Bit-Reproducible Portable High-Performance Applications.
Andrea Arteaga, Oliver Fuhrer, Torsten Hoefler
2014 Designing LU-QR Hybrid Solvers for Performance and Stability.
Mathieu Faverge, Julien Herrmann, Julien Langou, Bradley R. Lowery, Yves Robert, Jack J. Dongarra
2014 EDM: An Endurance-Aware Data Migration Scheme for Load Balancing in SSD Storage Clusters.
Jiaxin Ou, Jiwu Shu, Youyou Lu, Letian Yi, Wei Wang
2014 Effectively Exploiting Parallel Scale for All Problem Sizes in LU Factorization.
Md Rakib Hasan, R. Clint Whaley
2014 Efficient Data Race Detection for C/C++ Programs Using Dynamic Granularity.
Young Wn Song, Yann-Hang Lee
2014 Efficient Multi-GPU Computation of All-Pairs Shortest Paths.
Hristo N. Djidjev, Sunil Thulasidasan, Guillaume Chapuis, Rumen Andonov, Dominique Lavenier
2014 Enabling In-Situ Data Analysis for Large Protein-Folding Trajectory Datasets.
Boyu Zhang, Trilce Estrada, Pietro Cicotti, Michela Taufer
2014 Enabling and Scaling a Global Shallow-Water Atmospheric Model on Tianhe-2.
Wei Xue, Chao Yang, Haohuan Fu, Xinliang Wang, Yangtong Xu, Lin Gan, Yutong Lu, Xiaoqian Zhu
2014 Energy Efficient HPC on Embedded SoCs: Optimization Techniques for Mali GPU.
Ivan Grasso, Petar Radojkovic, Nikola Rajovic, Isaac Gelado, Alex Ramírez
2014 Energy-Efficient Time-Division Multiplexed Hybrid-Switched NoC for Heterogeneous Multicore Systems.
Jieming Yin, Pingqiang Zhou, Sachin S. Sapatnekar, Antonia Zhai
2014 Evaluating the Impact of SDC on the GMRES Iterative Solver.
James Elliott, Mark Hoemmen, Frank Mueller
2014 Exploiting Geometric Partitioning in Task Mapping for Parallel Computers.
Mehmet Deveci, Sivasankaran Rajamanickam, Vitus J. Leung, Kevin T. Pedretti, Stephen L. Olivier, David P. Bunde, Ümit V. Çatalyürek, Karen D. Devine
2014 F-SEFI: A Fine-Grained Soft Error Fault Injection Tool for Profiling Application Vulnerability.
Qiang Guan, Nathan DeBardeleben, Sean Blanchard, Song Fu
2014 F2C2-STM: Flux-Based Feedback-Driven Concurrency Control for STMs.
Kaushik Ravichandran, Santosh Pande
2014 FMI: Fault Tolerant Messaging Interface for Fast and Transparent Recovery.
Kento Sato, Adam Moody, Kathryn M. Mohror, Todd Gamblin, Bronis R. de Supinski, Naoya Maruyama, Satoshi Matsuoka
2014 Fair Maximal Independent Sets.
Jeremy T. Fineman, Calvin C. Newport, Micah Sherr, Tonghe Wang
2014 Finding Motifs in Biological Sequences Using the Micron Automata Processor.
Indranil Roy, Srinivas Aluru
2014 Generalizing Run-Time Tiling with the Loop Chain Abstraction.
Michelle Mills Strout, Fabio Luporini, Christopher D. Krieger, Carlo Bertolli, Gheorghe-Teodor Bercea, Catherine Olschanowsky, J. Ramanujam, Paul H. J. Kelly
2014 HPMMAP: Lightweight Memory Management for Commodity Operating Systems.
Brian Kocoloski, John R. Lange
2014 Heterogeneity-Aware Workload Placement and Migration in Distributed Sustainable Datacenters.
Dazhao Cheng, Changjun Jiang, Xiaobo Zhou
2014 High Performance Alltoall and Allgather Designs for InfiniBand MIC Clusters.
Akshay Venkatesh, Sreeram Potluri, Raghunath Rajachandrasekar, Miao Luo, Khaled Hamidouche, Dhabaleswar K. Panda
2014 How Well Do Graph-Processing Platforms Perform? An Empirical Performance Evaluation and Analysis.
Yong Guo, Marcin Biczak, Ana Lucia Varbanescu, Alexandru Iosup, Claudio Martella, Theodore L. Willke
2014 Identifying Code Phases Using Piece-Wise Linear Regressions.
Harald Servat, Germán Llort, Juan Gonzalez, Judit Giménez, Jesús Labarta
2014 Improved Time Bounds for Linearizable Implementations of Abstract Data Types.
Jiaqi Wang, Edward Talmage, Hyunyoung Lee, Jennifer L. Welch
2014 Improving Communication Performance and Scalability of Native Applications on Intel Xeon Phi Coprocessor Clusters.
Karthikeyan Vaidyanathan, Kiran Pamnany, Dhiraj D. Kalamkar, Alexander Heinecke, Mikhail Smelyanskiy, Jongsoo Park, Daehyun Kim, Aniruddha G. Shet, Bharat Kaul, Bálint Joó, Pradeep Dubey
2014 Improving the Performance of CA-GMRES on Multicores with Multiple GPUs.
Ichitaro Yamazaki, Hartwig Anzt, Stanimire Tomov, Mark Hoemmen, Jack J. Dongarra
2014 Interactive Program Debugging and Optimization for Directive-Based, Efficient GPU Computing.
Seyong Lee, Dong Li, Jeffrey S. Vetter
2014 It's About Time: On Optimal Virtual Network Embeddings under Temporal Flexibilities.
Matthias Rost, Stefan Schmid, Anja Feldmann
2014 LFTI: A New Performance Metric for Assessing Interconnect Designs for Extreme-Scale HPC Systems.
Xin Yuan, Santosh Mahapatra, Michael Lang, Scott Pakin
2014 Large-Scale Hydrodynamic Brownian Simulations on Multicore and Manycore Architectures.
Xing Liu, Edmond Chow
2014 Locating Parallelization Potential in Object-Oriented Data Structures.
Korbinian Molitorisz, Thomas Karcher, Alexander Biele, Walter F. Tichy
2014 MIC-SVM: Designing a Highly Efficient Support Vector Machine for Advanced Modern Multi-core and Many-Core Architectures.
Yang You, Shuaiwen Leon Song, Haohuan Fu, Andres Marquez, Maryam Mehri Dehnavi, Kevin J. Barker, Kirk W. Cameron, Amanda Peters Randles, Guangwen Yang
2014 MapReuse: Reusing Computation in an In-Memory MapReduce System.
Devesh Tiwari, Yan Solihin
2014 Mitigating the Mismatch between the Coherence Protocol and Conflict Detection in Hardware Transactional Memory.
Lihang Zhao, Lizhong Chen, Jeffrey T. Draper
2014 MobiStreams: A Reliable Distributed Stream Processing System for Mobile Devices.
Huayong Wang, Li-Shiuan Peh
2014 Multi-resource Real-Time Reader/Writer Locks for Multiprocessors.
Bryan C. Ward, James H. Anderson
2014 New Effective Multithreaded Matching Algorithms.
Fredrik Manne, Mahantesh Halappanavar
2014 Nitro: A Framework for Adaptive Code Variant Tuning.
Saurav Muralidharan, Manu Shantharam, Mary W. Hall, Michael Garland, Bryan Catanzaro
2014 Online Server and Workload Management for Joint Optimization of Electricity Cost and Carbon Footprint Across Data Centers.
Zahra Abbasi, Madhurima Pore, Sandeep K. S. Gupta
2014 Optimization of Multi-level Checkpoint Model for Large Scale HPC Applications.
Sheng Di, Mohamed-Slim Bouguerra, Leonardo Arturo Bautista-Gomez, Franck Cappello
2014 Optimizing Bandwidth Allocation in Flex-Grid Optical Networks with Application to Scheduling.
Hadas Shachnai, Ariella Voloshin, Shmuel Zaks
2014 Optimizing Sparse Matrix-Multiple Vectors Multiplication for Nuclear Configuration Interaction Calculations.
Hasan Metin Aktulga, Aydin Buluç, Samuel Williams, Chao Yang
2014 Overcoming the Limitations Posed by TCR-beta Repertoire Modeling through a GPU-Based In-Silico DNA Recombination Algorithm.
Gregory M. Striemer, Harsha Krovi, Ali Akoglu, Benjamin Vincent, Ben Hopson, Jeffrey Frelinger, Adam Buntzman
2014 Overcoming the Scalability Challenges of Epidemic Simulations on Blue Waters.
Jae-Seung Yeom, Abhinav Bhatele, Keith R. Bisset, Eric J. Bohm, Abhishek Gupta, Laxmikant V. Kalé, Madhav V. Marathe, Dimitrios S. Nikolopoulos, Martin Schulz, Lukasz Wesolowski
2014 PAGE: A Framework for Easy PArallelization of GEnomic Applications.
Mücahid Kutlu, Gagan Agrawal
2014 POD: Performance Oriented I/O Deduplication for Primary Storage Systems in the Cloud.
Bo Mao, Hong Jiang, Suzhen Wu, Lei Tian
2014 Parallel Mutual Information Based Construction of Whole-Genome Networks on the Intel (R) Xeon Phi (TM) Coprocessor.
Sanchit Misra, Kiran Pamnany, Srinivas Aluru
2014 Performance and Energy Analysis of the Restricted Transactional Memory Implementation on Haswell.
Bhavishya Goel, J. Rubén Titos Gil, Anurag Negi, Sally A. McKee, Per Stenström
2014 Petascale Application of a Coupled CPU-GPU Algorithm for Simulation and Analysis of Multiphase Flow Solutions in Porous Medium Systems.
James E. McClure, Hao Wang, Jan F. Prins, Cass T. Miller, Wu-chun Feng
2014 Petascale General Solver for Semidefinite Programming Problems with Over Two Million Constraints.
Katsuki Fujisawa, Toshio Endo, Yuichiro Yasui, Hitoshi Sato, Naoki Matsuzawa, Satoshi Matsuoka, Hayato Waki
2014 Pipelined Compaction for the LSM-Tree.
Zigang Zhang, Yinliang Yue, Bingsheng He, Jin Xiong, Mingyu Chen, Lixin Zhang, Ninghui Sun
2014 Power and Performance Characterization and Modeling of GPU-Accelerated Systems.
Yuki Abe, Hiroshi Sasaki, Shinpei Kato, Koji Inoue, Masato Edahiro, Martin Peres
2014 Power-Efficient Multiple Producer-Consumer.
Ramy Medhat, Borzoo Bonakdarpour, Sebastian Fischmeister
2014 Pythia: Faster Big Data in Motion through Predictive Software-Defined Network Optimization at Runtime.
Marcelo Veiga Neves, César A. F. De Rose, Kostas Katrinis, Hubertus Franke
2014 RCMP: Enabling Efficient Recomputation Based Failure Resilience for Big Data Analytics.
Florin Dinu, T. S. Eugene Ng
2014 ReDHiP: Recalibrating Deep Hierarchy Prediction for Energy Efficiency.
Xun Li, Diana Franklin, Ricardo Bianchini, Frederic T. Chong
2014 Reading the Tea-Leaves: How Architecture Has Evolved at the High End.
Peter M. Kogge
2014 Reconstructing Householder Vectors from Tall-Skinny QR.
Grey Ballard, James Demmel, Laura Grigori, Mathias Jacquelin, Hong Diep Nguyen, Edgar Solomonik
2014 Remote Invalidation: Optimizing the Critical Path of Memory Transactions.
Ahmed Hassan, Roberto Palmieri, Binoy Ravindran
2014 Revisiting Asynchronous Linear Solvers: Provable Convergence Rate through Randomization.
Haim Avron, Alex Druinsky, Anshul Gupta
2014 Runtime-Guided Cache Coherence Optimizations in Multi-core Architectures.
Madhavan Manivannan, Per Stenström
2014 Scalability-Centric HPC System Design.
Yutong Lu
2014 Scalable Single Source Shortest Path Algorithms for Massively Parallel Systems.
Venkatesan T. Chakaravarthy, Fabio Checconi, Fabrizio Petrini, Yogish Sabharwal
2014 Scalar Waving: Improving the Efficiency of SIMD Execution on GPUs.
Ayse Yilmazer, Zhongliang Chen, David R. Kaeli
2014 Scaling Irregular Applications through Data Aggregation and Software Multithreading.
Alessandro Morari, Antonino Tumeo, Daniel G. Chavarría-Miranda, Oreste Villa, Mateo Valero
2014 Scibox: Online Sharing of Scientific Data via the Cloud.
Jian Huang, Xuechen Zhang, Greg Eisenhauer, Karsten Schwan, Matthew Wolf, Stéphane Ethier, Scott Klasky
2014 Shedding Light on Lithium/Air Batteries Using Millions of Threads on the BG/Q Supercomputer.
Valéry Weber, Costas Bekas, Teodoro Laino, Alessandro Curioni, Adam Bertsch, Scott Futral
2014 Skywalk: A Topology for HPC Networks with Low-Delay Switches.
Ikki Fujiwara, Michihiro Koibuchi, Hiroki Matsutani, Henri Casanova
2014 TBPoint: Reducing Simulation Time for Large-Scale GPGPU Kernels.
Jen-Cheng Huang, Lifeng Nai, Hyesoon Kim, Hsien-Hsin S. Lee
2014 Traversing Trillions of Edges in Real Time: Graph Exploration on Large-Scale Parallel Machines.
Fabio Checconi, Fabrizio Petrini
2014 UPC++: A PGAS Extension for C++.
Yili Zheng, Amir Kamil, Michael B. Driscoll, Hongzhang Shan, Katherine A. Yelick
2014 Unified Development for Mixed Multi-GPU and Multi-coprocessor Environments Using a Lightweight Runtime Environment.
Azzam Haidar, Chongxiao Cao, Asim YarKhan, Piotr Luszczek, Stanimire Tomov, Khairul Kabir, Jack J. Dongarra
2014 Using Load Balancing to Scalably Parallelize Sampling-Based Motion Planning Algorithms.
Adam Fidel, Sam Ade Jacobs, Shishir Sharma, Nancy M. Amato, Lawrence Rauchwerger
2014 Using Multiple Threads to Accelerate Single Thread Performance.
Zehra Sura, Kevin O'Brien, José R. Brunheroto
2014 Victim Selection and Distributed Work Stealing Performance: A Case Study.
Swann Perarnau, Mitsuhisa Sato
2014 Work-Efficient Parallel GPU Methods for Single-Source Shortest Paths.
Andrew A. Davidson, Sean Baxter, Michael Garland, John D. Owens
2014 cuBLASTP: Fine-Grained Parallelization of Protein Sequence Search on a GPU.
Jing Zhang, Hao Wang, Heshan Lin, Wu-chun Feng
2014 s-Step Krylov Subspace Methods as Bottom Solvers for Geometric Multigrid.
Samuel Williams, Mike Lijewski, Ann S. Almgren, Brian van Straalen, Erin Carson, Nicholas Knight, James Demmel