| 2016 | 2016 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2016, Chicago, IL, USA, May 23-27, 2016 |
| 2016 | A Case Study of Complex Graph Analysis in Distributed Memory: Implementation and Optimization. George M. Slota, Sivasankaran Rajamanickam, Kamesh Madduri |
| 2016 | A Fast Selected Inversion Algorithm for Green's Function Calculation in Many-Body Quantum Monte Carlo Simulations. Chengming Jiang, Zhaojun Bai, Richard Scalettar |
| 2016 | A Fast Tridiagonal Solver for Intel MIC Architecture. Xinliang Wang, Wei Xue, Jidong Zhai, Yangtong Xu, Weimin Zheng, Hai-Xiang Lin |
| 2016 | A Hartree-Fock Application Using UPC++ and the New DArray Library. David Ozog, Amir Kamil, Yili Zheng, Paul Hargrove, Jeff R. Hammond, Allen D. Malony, Wibe de Jong, Kathy Yelick |
| 2016 | A Hybrid Decomposition Parallel Algorithm for Multi-scale Simulation of Viscoelastic Fluids. Xiaowei Guo, Xinhai Xu, Qian Wang, Hao Li, Xiaoguang Ren, Liyang Xu, Xuejun Yang |
| 2016 | A Medium-Grained Algorithm for Sparse Tensor Factorization. Shaden Smith, George Karypis |
| 2016 | A Methodology for Modeling Dynamic and Static Power Consumption for Multicore Processors. Bhavishya Goel, Sally A. McKee |
| 2016 | A New Approximation Algorithm for Matrix Partitioning in Presence of Strongly Heterogeneous Processors. Olivier Beaumont, Lionel Eyraud-Dubois, Thomas Lambert |
| 2016 | A Practical Parallel Algorithm for Diameter Approximation of Massive Weighted Graphs. Matteo Ceccarello, Andrea Pietracaprina, Geppino Pucci, Eli Upfal |
| 2016 | A Relaxed Synchronization Approach for Solving Parallel Quadratic Programming Problems with Guaranteed Convergence. Kooktae Lee, Raktim Bhattacharya, Jyotikrishna Dass, V. N. S. Prithvi Sakuru, Rabi N. Mahapatra |
| 2016 | AAlign: A SIMD Framework for Pairwise Sequence Alignment on x86-Based Multi- and Many-Core Processors. Kaixi Hou, Hao Wang, Wu-chun Feng |
| 2016 | ARCHER: Effectively Spotting Data Races in Large OpenMP Applications. Simone Atzeni, Ganesh Gopalakrishnan, Zvonimir Rakamaric, Dong H. Ahn, Ignacio Laguna, Martin Schulz, Gregory L. Lee, Joachim Protze, Matthias S. Müller |
| 2016 | Agile Live Migration of Virtual Machines. Umesh Deshpande, Danny Chan, Ten-Young Guh, James Edouard, Kartik Gopalan, Nilton Bila |
| 2016 | Algorithm and Architecture Independent Benchmarking with SEAK. Nathan R. Tallent, Joseph B. Manzano, Nitin A. Gawande, Seunghwa Kang, Darren J. Kerbyson, Adolfy Hoisie, Joseph K. Cross |
| 2016 | Algorithmic Techniques for Solving Graph Problems on the Automata Processor. Indranil Roy, Nagakishore Jammula, Srinivas Aluru |
| 2016 | An Early Performance Study of Large-Scale POWER8 SMP Systems. Xing Liu, Daniele Buono, Fabio Checconi, Jee W. Choi, Xinyu Que, Fabrizio Petrini, John A. Gunnels, Jeff Stuecheli |
| 2016 | Analyzing Network Health and Congestion in Dragonfly-Based Supercomputers. Abhinav Bhatele, Nikhil Jain, Yarden Livnat, Valerio Pascucci, Peer-Timo Bremer |
| 2016 | Architecting and Programming a Hardware-Incoherent Multiprocessor Cache Hierarchy. Wooil Kim, Sanket Tavarageri, P. Sadayappan, Josep Torrellas |
| 2016 | Are Static Schedules so Bad? A Case Study on Cholesky Factorization. Emmanuel Agullo, Olivier Beaumont, Lionel Eyraud-Dubois, Suraj Kumar |
| 2016 | Asymptotic Optimality of Parallel Short Division. Niall Emmart, Charles C. Weems |
| 2016 | Automatic Parallel Pattern Detection in the Algorithm Structure Design Space. Zia Ul Huda, Rohit Atre, Ali Jannesari, Felix Wolf |
| 2016 | Balancing Scalar and Vector Execution on GPU Architectures. Zhongliang Chen, David R. Kaeli |
| 2016 | CATA: Criticality Aware Task Acceleration for Multicore Processors. Emilio Castillo, Miquel Moretó, Marc Casas, Lluc Alvarez, Enrique Vallejo, Kallia Chronaki, Rosa M. Badia, José Luis Bosque, Ramón Beivide, Eduard Ayguadé, Jesús Labarta, Mateo Valero |
| 2016 | CRC-Based Memory Reliability for Task-Parallel HPC Applications. Omer Subasi, Osman S. Ünsal, Jesús Labarta, Gulay Yalcin, Adrián Cristal |
| 2016 | Communication Efficient Algorithms for Top-k Selection Problems. Lorenz Hübschle-Schneider, Peter Sanders |
| 2016 | Communication-Avoiding Parallel Sparse-Dense Matrix-Matrix Multiplication. Penporn Koanantakool, Ariful Azad, Aydin Buluç, Dmitriy Morozov, Sang-Yun Oh, Leonid Oliker, Katherine A. Yelick |
| 2016 | Compiler-Assisted Workload Consolidation for Efficient Dynamic Parallelism on GPU. Hancheng Wu, Da Li, Michela Becchi |
| 2016 | DataNet: A Data Distribution-Aware Method for Sub-Dataset Analysis on Distributed File Systems. Jun Wang, Jiangling Yin, Jian Zhou, Xuhong Zhang, Ruijun Wang |
| 2016 | Deflection Containment for Bufferless Network-on-Chips. Xi-Yue Xiang, Nian-Feng Tzeng |
| 2016 | Design and Implementation of a Parallel Research Kernel for Assessing Dynamic Load-Balancing Capabilities. Evangelos Georganas, Rob F. Van der Wijngaart, Timothy G. Mattson |
| 2016 | Differentiated Scheduling of Response-Critical and Best-Effort Wide-Area Data Transfers. Rajkumar Kettimuthu, Gagan Agrawal, P. Sadayappan, Ian T. Foster |
| 2016 | Discrete Cache Insertion Policies for Shared Last Level Cache Management on Large Multicores. Aswinkumar Sridharan, André Seznec |
| 2016 | Disruptive Research and Innovation. Kai Li |
| 2016 | Distributed-Memory Algorithms for Maximum Cardinality Matching in Bipartite Graphs. Ariful Azad, Aydin Buluç |
| 2016 | Dynamic Acceleration of Parallel Applications in Cloud Platforms by Adaptive Time-Slice Control. Song Wu, Zhenjiang Xie, Haibao Chen, Sheng Di, Xinyu Zhao, Hai Jin |
| 2016 | Efficient Checkpointing of Multi-threaded Applications as a Tool for Debugging, Performance Tuning, and Resiliency. Max Grossman, Vivek Sarkar |
| 2016 | Eliminating Intra-Warp Load Imbalance in Irregular Nested Patterns via Collaborative Task Engagement. Farzad Khorasani, Bryan Rowe, Rajiv Gupta, Laxmi N. Bhuyan |
| 2016 | Enhancing Scalability and Load Balancing of Parallel Selected Inversion via Tree-Based Asynchronous Communication. Mathias Jacquelin, Lin Lin, Nathan Wichmann, Chao Yang |
| 2016 | Evaluating and Improving Thread-Level Speculation in Hardware Transactional Memories. Juan Salamanca, José Nelson Amaral, Guido Araujo |
| 2016 | Exploiting Maximal Overlap for Non-Contiguous Data Movement Processing on Modern GPU-Enabled Systems. Ching-Hsiang Chu, Khaled Hamidouche, Akshay Venkatesh, Dip Sankar Banerjee, Hari Subramoni, Dhabaleswar K. Panda |
| 2016 | Exploiting Variant-Based Parallelism for Data Mining of Space Weather Phenomena. Michael G. Gowanlock, David M. Blair, Victor Pankratius |
| 2016 | Fast Classification of MPI Applications Using Lamport's Logical Clocks. Zhou Tong, Scott Pakin, Michael Lang, Xin Yuan |
| 2016 | Fast Error-Bounded Lossy HPC Data Compression with SZ. Sheng Di, Franck Cappello |
| 2016 | FastBFS: Fast Breadth-First Graph Search on a Single Server. Shu-han Cheng, Guangyan Zhang, Jiwu Shu, Qingda Hu, Weimin Zheng |
| 2016 | Fault Modeling of Extreme Scale Applications Using Machine Learning. Abhinav Vishnu, Hubertus Van Dam, Nathan R. Tallent, Darren J. Kerbyson, Adolfy Hoisie |
| 2016 | GPU-Accelerated Outlier Detection for Continuous Data Streams. Chandima Hewa Nadungodage, Yuni Xia, John Jaehwan Lee |
| 2016 | Gathering a Closed Chain of Robots on a Grid. Sebastian Abshoff, Andreas Cord-Landwehr, Matthias Fischer, Daniel Jung, Friedhelm Meyer auf der Heide |
| 2016 | GinFlow: A Decentralised Adaptive Workflow Execution Manager. Javier Rojas Balderrama, Matthieu Simonin, Cédric Tedeschi |
| 2016 | GraphPad: Optimized Graph Primitives for Parallel and Distributed Platforms. Michael J. Anderson, Narayanan Sundaram, Nadathur Satish, Md. Mostofa Ali Patwary, Theodore L. Willke, Pradeep Dubey |
| 2016 | GreenMatch: Renewable-Aware Workload Scheduling for Massive Storage Systems. Xiaoyang Qu, Jiguang Wan, Jun Wang, Liqiong Liu, Dan Luo, Changsheng Xie |
| 2016 | Hierarchical Parallel Dynamic Dependence Analysis for Recursively Task-Parallel Programs. Nikolaos Papakonstantinou, Foivos S. Zakkak, Polyvios Pratikakis |
| 2016 | High Performance Parallel Stochastic Gradient Descent in Shared Memory. Scott Sallinen, Nadathur Satish, Mikhail Smelyanskiy, Samantika S. Sury, Christopher Ré |
| 2016 | High Performance Pattern Matching Using the Automata Processor. Indranil Roy, Ankit Srivastava, Marziyeh Nourian, Michela Becchi, Srinivas Aluru |
| 2016 | High-Performance Hybrid Key-Value Store on Modern Clusters with RDMA Interconnects and SSDs: Non-blocking Extensions, Designs, and Benefits. Dipti Shankar, Xiaoyi Lu, Nusrat S. Islam, Md. Wasi-ur-Rahman, Dhabaleswar K. Panda |
| 2016 | Hybrid Dynamic Trees for Extreme-Resolution 3D Sparse Data Modeling. Mohammad M. Hossain, Thomas M. Tucker, Thomas R. Kurfess, Richard W. Vuduc |
| 2016 | I/O Aware Power Shifting. Lee Savoie, David K. Lowenthal, Bronis R. de Supinski, Tanzima Z. Islam, Kathryn M. Mohror, Barry Rountree, Martin Schulz |
| 2016 | INV-ASKIT: A Parallel Fast Direct Solver for Kernel Matrices. Chenhan D. Yu, William B. March, Bo Xiao, George Biros |
| 2016 | Integrating Abstractions to Enhance the Execution of Distributed Applications. Matteo Turilli, Feng Liu, Zhao Zhang, André Merzky, Michael Wilde, Jon B. Weissman, Daniel S. Katz, Shantenu Jha |
| 2016 | Key/Value-Enabled Flash Memory for Complex Scientific Workflows with On-Line Analysis and Visualization. Stefan Eilemann, Fabien Delalondre, Jon Bernard, Judit Planas, Felix Schürmann, John Biddiscombe, Costas Bekas, Alessandro Curioni, Bernard Metzler, Peter Kaltstein, Peter Morjan, Joachim Fenkes, Ralph Bellofatto, Lars Schneidenbach, T. J. Christopher Ward, Blake G. Fitch |
| 2016 | Lazy Repair for Addition of Fault-Tolerance to Distributed Programs. Mohammad Roohitavaf, Yiyan Lin, Sandeep S. Kulkarni |
| 2016 | MEMTUNE: Dynamic Memory Management for In-Memory Data Analytic Platforms. Luna Xu, Min Li, Li Zhang, Ali Raza Butt, Yandong Wang, Zane Zhenhua Hu |
| 2016 | MPMD Framework for Offloading Load Balance Computation. Olga Pearce, Todd Gamblin, Bronis R. de Supinski, Martin Schulz, Nancy M. Amato |
| 2016 | Markov Chain-Based Adaptive Scheduling in Software Transactional Memory. Pierangelo di Sanzo, Marco Sannicandro, Bruno Ciciani, Francesco Quaglia |
| 2016 | Massively Parallel First-Principles Simulation of Electron Dynamics in Materials. Erik W. Draeger, Xavier Andrade, John A. Gunnels, Abhinav Bhatele, Andre Schleife, Alfredo A. Correa |
| 2016 | Memory, Storage and Processing in Future Parallel and Distributed Processing Systems. J. Thomas Pawlowski |
| 2016 | Mendel: A Distributed Storage Framework for Similarity Searching over Sequencing Data. Cameron Tolooee, Sangmi Lee Pallickara, Asa Ben-Hur |
| 2016 | Minimal Aggregated Shared Memory Messaging on Distributed Memory Supercomputers. Benjamin F. Jamroz, John M. Dennis |
| 2016 | Mitigation of Denial of Service Attack with Hardware Trojans in NoC Architectures. Travis Boraten, Avinash Karanth Kodi |
| 2016 | Mystic: Predictive Scheduling for GPU Based Cloud Servers Using Machine Learning. Yash Ukidave, Xiangyu Li, David R. Kaeli |
| 2016 | NEPTUNE: Real Time Stream Processing for Internet of Things and Sensing Environments. Thilina Buddhika, Shrideep Pallickara |
| 2016 | Never Say Never - Probabilistic and Temporal Failure Detectors. Dacfey Dzung, Rachid Guerraoui, David Kozhaya, Yvonne-Anne Pignolet |
| 2016 | NiMC: Characterizing and Eliminating Network-Induced Memory Contention. Taylor L. Groves, Ryan E. Grant, Dorian C. Arnold |
| 2016 | On Competitive Algorithms for Approximations of Top-k-Position Monitoring of Distributed Streams. Alexander Mäcker, Manuel Malatyali, Friedhelm Meyer auf der Heide |
| 2016 | On First Fit Bin Packing for Online Cloud Server Allocation. Xueyan Tang, Yusen Li, Runtian Ren, Wentong Cai |
| 2016 | On the Root Causes of Cross-Application I/O Interference in HPC Storage Systems. Orcun Yildiz, Matthieu Dorier, Shadi Ibrahim, Robert B. Ross, Gabriel Antoniu |
| 2016 | On the Scalability, Performance Isolation and Device Driver Transparency of the IHK/McKernel Hybrid Lightweight Kernel. Balazs Gerofi, Masamichi Takagi, Atsushi Hori, Gou Nakamura, Tomoki Shirasawa, Yutaka Ishikawa |
| 2016 | Online Algorithm-Based Fault Tolerance for Cholesky Decomposition on Heterogeneous Systems with GPUs. Jieyang Chen, Xin Liang, Zizhong Chen |
| 2016 | Online-Autotuning of Parallel SAH kD-Trees. Martin Peter Tillmann, Philip Pfaffe, Christopher Kaag, Walter F. Tichy |
| 2016 | OpenACC to FPGA: A Framework for Directive-Based High-Performance Reconfigurable Computing. Seyong Lee, Jungwon Kim, Jeffrey S. Vetter |
| 2016 | Optimal Algorithms for Graphs and Images on a Shared Memory Mesh. Yujie An, Quentin F. Stout |
| 2016 | Optimal Resilience Patterns to Cope with Fail-Stop and Silent Errors. Anne Benoit, Aurélien Cavelan, Yves Robert, Hongyang Sun |
| 2016 | Optimization and Analysis of MPI Collective Communication on Fat-Tree Networks. Sameer Kumar, Sameh Sharkawi, K. A. Nysal Jan |
| 2016 | Optimization of an Electromagnetics Code with Multicore Wavefront Diamond Blocking and Multi-dimensional Intra-Tile Parallelization. Tareq M. Malas, Julian Hornich, Georg Hager, Hatem Ltaief, Christoph Pflaum, David E. Keyes |
| 2016 | Order-Invariant Real Number Summation: Circumventing Accuracy Loss for Multimillion Summands on Multiple Parallel Architectures. Patrick E. Small, Rajiv K. Kalia, Aiichiro Nakano, Priya Vashishta |
| 2016 | PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures. Md. Mostofa Ali Patwary, Nadathur Rajagopalan Satish, Narayanan Sundaram, Jialin Liu, Peter J. Sadowski, Evan Racah, Surendra Byna, Craig Tull, Wahid Bhimji, Prabhat, Pradeep Dubey |
| 2016 | Parallel Graph Coloring for Manycore Architectures. Mehmet Deveci, Erik G. Boman, Karen D. Devine, Sivasankaran Rajamanickam |
| 2016 | Parallel Tensor Compression for Large-Scale Scientific Data. Woody Austin, Grey Ballard, Tamara G. Kolda |
| 2016 | Partitioned Feasibility Tests for Sporadic Tasks on Heterogeneous Machines. Shaurya Ahuja, Kefu Lu, Benjamin Moseley |
| 2016 | Petascale Local Time Stepping for the ADER-DG Finite Element Method. Alexander Breuer, Alexander Heinecke, Michael Bader |
| 2016 | Polynomial-Time Construction of Optimal MPI Derived Datatype Trees. Robert Ganian, Martin Kalany, Stefan Szeider, Jesper Larsson Träff |
| 2016 | RUPS: Fixing Relative Distances among Urban Vehicles with Context-Aware Trajectories. Hongzi Zhu, Shan Chang, Li Lu, Wei Zhang |
| 2016 | Rabbit Order: Just-in-Time Parallel Reordering for Fast Graph Analysis. Junya Arai, Hiroaki Shiokawa, Takeshi Yamamuro, Makoto Onizuka, Sotetsu Iwamura |
| 2016 | Random Regular Graph and Generalized De Bruijn Graph with k-Shortest Path Routing. Peyman Faizian, Md Atiqul Mollah, Xin Yuan, Scott Pakin, Michael Lang |
| 2016 | Re-NUCA: A Practical NUCA Architecture for ReRAM Based Last-Level Caches. Jagadish Kotra, Mohammad Arjomand, Diana R. Guttman, Mahmut T. Kandemir, Chita R. Das |
| 2016 | Reducing Waste in Extreme Scale Systems through Introspective Analysis. Leonardo Arturo Bautista-Gomez, Ana Gainaru, Swann Perarnau, Devesh Tiwari, Saurabh Gupta, Christian Engelmann, Franck Cappello, Marc Snir |
| 2016 | Refree: A Refresh-Free Hybrid DRAM/PCM Main Memory System. Bahareh Pourshirazi, Zhichun Zhu |
| 2016 | Reusable Resource Scheduling via Colored Interval Covering. Venkatesan T. Chakaravarthy, Sreyash Kenkre, Sakib A. Mondal, Vinayaka Pandit, Yogish Sabharwal |
| 2016 | Security RBSG: Protecting Phase Change Memory with Security-Level Adjustable Dynamic Mapping. Fangting Huang, Dan Feng, Wen Xia, Wen Zhou, Yucheng Zhang, Min Fu, Chuntao Jiang, Yukun Zhou |
| 2016 | Smoothed Online Resource Allocation in Multi-tier Distributed Cloud Networks. Lei Jiao, Antonia M. Tulino, Jaime Llorca, Yue Jin, Alessandra Sala |
| 2016 | Solving Open MIP Instances with ParaSCIP on Supercomputers Using up to 80,000 Cores. Yuji Shinano, Tobias Achterberg, Timo Berthold, Stefan Heinz, Thorsten Koch, Michael Winkler |
| 2016 | Stochastic Matrix-Function Estimators: Scalable Big-Data Kernels with High Performance. Peter W. J. Staar, Panagiotis Kl. Barkoutsos, Roxana Istrate, A. Cristiano I. Malossi, Ivano Tavernelli, Nikolaj Moll, Heiner Giefers, Christoph Hagleitner, Costas Bekas, Alessandro Curioni |
| 2016 | Storage-Optimized Data-Atomic Algorithms for Handling Erasures and Errors in Distributed Storage Systems. Kishori M. Konwar, N. Prakash, Erez Kantor, Nancy A. Lynch, Muriel Médard, Alexander A. Schwarzmann |
| 2016 | Structural Clustering: A New Approach to Support Performance Analysis at Scale. Matthias Weber, Ronny Brendel, Tobias Hilbrich, Kathryn M. Mohror, Martin Schulz, Holger Brunst |
| 2016 | Subgraph Counting: Color Coding Beyond Trees. Venkatesan T. Chakaravarthy, Michael Kapralov, Prakash Murali, Fabrizio Petrini, Xinyu Que, Yogish Sabharwal, Baruch Schieber |
| 2016 | Synchronization Trade-Offs in GPU Implementations of Graph Algorithms. Rashid Kaleem, Anand Venkat, Sreepathi Pai, Mary W. Hall, Keshav Pingali |
| 2016 | System Noise Revisited: Enabling Application Scalability and Reproducibility with SMT. Edgar A. León, Ian Karlin, Adam Moody |
| 2016 | TECfan: Coordinating Thermoelectric Cooler, Fan, and DVFS for CMP Energy Optimization. Wenli Zheng, Kai Ma, Xiaorui Wang |
| 2016 | TintMalloc: Reducing Memory Access Divergence via Controller-Aware Coloring. Xing Pan, Yasaswini Jyothi Gownivaripalli, Frank Mueller |
| 2016 | Towards a Restrained Use of Non-Equivocation for Achieving Iterative Approximate Byzantine Consensus. Chuanyou Li, Michel Hurfin, Yun Wang, Lei Yu |
| 2016 | Unlocking the Mysteries of the Universe with Supercomputers. Katrin Heitmann |
| 2016 | Utility Maximizing Thread Assignment and Resource Allocation. Pan Lai, Rui Fan, Wei Zhang, Fang Liu |
| 2016 | VNRE: Flexible and Efficient Acceleration for Network Redundancy Elimination. Xiongzi Ge, Yi Liu, Chengtao Lu, Jim Diehl, David H. C. Du, Liang Zhang, Jian Chen |
| 2016 | Write-Avoiding Algorithms. Erin Carson, James Demmel, Laura Grigori, Nicholas Knight, Penporn Koanantakool, Oded Schwartz, Harsha Vardhan Simhadri |
| 2016 | X: A Comprehensive Analytic Model for Parallel Machines. Ang Li, Shuaiwen Leon Song, Eric Brugel, Akash Kumar, Daniel G. Chavarría-Miranda, Henk Corporaal |
| 2016 | ZCCloud: Exploring Wasted Green Power for High-Performance Computing. Fan Yang, Andrew A. Chien |
| 2016 | ZNN - A Fast and Scalable Algorithm for Training 3D Convolutional Networks on Multi-core and Many-Core Shared Memory Machines. Aleksandar Zlateski, Kisuk Lee, H. Sebastian Seung |
| 2016 | cusFFT: A High-Performance Sparse Fast Fourier Transform Algorithm on GPUs. Cheng Wang, Sunita Chandrasekaran, Barbara M. Chapman |