| 2016 | 45th International Conference on Parallel Processing, ICPP 2016, Philadelphia, PA, USA, August 16-19, 2016 |
| 2016 | A Comparison of Accelerator Architectures for Radio-Astronomical Signal-Processing Algorithms. John W. Romein |
| 2016 | A Parallel Hill-Climbing Refinement Algorithm for Graph Partitioning. Dominique LaSalle, George Karypis |
| 2016 | A Scalability Comparison Study of Data Management Approaches for Smart Metering Systems. Houssem Chihoub, Christine Collet |
| 2016 | AccuracyTrader: Accuracy-Aware Approximate Processing for Low Tail Latency and High Result Accuracy in Cloud Online Services. Rui Han, Siguang Huang, Fei Tang, Fu-Gui Chang, Jianfeng Zhan |
| 2016 | An Efficient Wireless Power Transfer System to Balance the State of Charge of Electric Vehicles. Ankur Sarker, Chenxi Qiu, Haiying Shen, Andrea Gil, Joachim Taiber, Mashrur Chowdhury, Jim Martin, Mac Devine, Andrew J. Rindos |
| 2016 | An Unbounded Nonblocking Double-Ended Queue. Matthew Graichen, Joseph Izraelevitz, Michael L. Scott |
| 2016 | AppBag: Application-Aware Bandwidth Allocation for Virtual Machines in Cloud Environment. Dian Shen, Junzhou Luo, Fang Dong, Junxue Zhang |
| 2016 | CoARC: Co-operative, Aggressive Recovery and Caching for Failures in Erasure Coded Hadoop. Pradeep Subedi, Ping Huang, Tong Liu, Joseph Moore, Stan Skelton, Xubin He |
| 2016 | Criticality-Aware Partitioning for Multicore Mixed-Criticality Systems. Jian-Jun Han, Xin Tao, Dakai Zhu, Hakan Aydin |
| 2016 | DC-Top-k: A Novel Top-k Selecting Algorithm and Its Parallelization. Zhengyuan Xue, Ruixuan Li, Heng Zhang, Xiwu Gu, Zhiyong Xu |
| 2016 | Declarative Tuning for Locality in Parallel Programs. Sanjay Chatterjee, Nick Vrvilo, Zoran Budimlic, Kathleen Knobe, Vivek Sarkar |
| 2016 | EchoLoc: Accurate Device-Free Hand Localization Using COTS Devices. Huijie Chen, Fan Li, Yu Wang |
| 2016 | Efficient 2-Body Statistics Computation on GPUs: Parallelization & Beyond. Napath Pitaksirianan, Zhila Nouri, Yi-Cheng Tu |
| 2016 | Efficient Parallel Algorithms for k-Center Clustering. Jessica McClintock, Anthony Wirth |
| 2016 | Efficient Virtual Network Embedding for Variable Size Virtual Machines in Fat-Tree Data Centers. Jun Duan, Yuanyuan Yang |
| 2016 | Ensemble Toolkit: Scalable and Flexible Execution of Ensembles of Tasks. Vivekanandan Balasubramanian, Antons Treikalis, Ole Weidner, Shantenu Jha |
| 2016 | Exploiting Real-Time Traffic Light Scheduling with Taxi Traces. Zongjian He, Daqiang Zhang, Jiannong Cao, Xuefeng Liu, Xiaopeng Fan, Cheng-Zhong Xu |
| 2016 | Exploring Variation-Aware Fault-Tolerant Cache under Near-Threshold Computing. Jing Wang, Yanjun Liu, Weigong Zhang, Kezhong Lu, Keni Qiu, Xin Fu, Tao Li |
| 2016 | Fast RFID Polling Protocols. Jia Liu, Bin Xiao, Xuan Liu, Lijun Chen |
| 2016 | Fault Tolerant Support Vector Machines. Sameh Shohdy, Abhinav Vishnu, Gagan Agrawal |
| 2016 | GFlink: An In-Memory Computing Architecture on Heterogeneous CPU-GPU Clusters for Big Data. Cen Chen, Kenli Li, Aijia Ouyang, Zhuo Tang, Keqin Li |
| 2016 | Guaranteed Bang for the Buck: Modeling VDI Applications with Guaranteed Quality of Service. Hao Wen, David Hung-Chang Du, Milan Shetti, Doug Voigt, Shanshan Li |
| 2016 | Help-Optimal and Language-Portable Lock-Free Concurrent Data Structures. Bapi Chatterjee, Ivan Walulya, Philippas Tsigas |
| 2016 | High Performance MPI Library for Container-Based HPC Cloud on InfiniBand Clusters. Jie Zhang, Xiaoyi Lu, Dhabaleswar K. Panda |
| 2016 | High Performance Parallel Algorithms for the Tucker Decomposition of Sparse Tensors. Oguz Kaya, Bora Uçar |
| 2016 | HppCnn: A High-Performance, Portable Deep-Learning Library for GPGPUs. Yi Yang, Min Feng, Srimat T. Chakradhar |
| 2016 | Improving Data Transfer Throughput with Direct Search Optimization. Prasanna Balaprakash, Vitali A. Morozov, Rajkumar Kettimuthu, Kalyan Kumaran, Ian T. Foster |
| 2016 | Improving RAID Performance Using an Endurable SSD Cache. Chu Li, Dan Feng, Yu Hua, Fang Wang |
| 2016 | In Situ Storage Layout Optimization for AMR Spatio-temporal Read Accesses. Houjun Tang, Suren Byna, Steve Harenberg, Wenzhao Zhang, Xiaocheng Zou, Daniel F. Martin, Bin Dong, Dharshi Devendran, Kesheng Wu, David Trebotich, Scott Klasky, Nagiza F. Samatova |
| 2016 | Locality-Aware Laplacian Mesh Smoothing. Guillaume Aupy, JeongHyung Park, Padma Raghavan |
| 2016 | MIC: An Efficient Anonymous Communication System in Data Center Networks. Tingwei Zhu, Dan Feng, Yu Hua, Fang Wang, Qingyu Shi, Jiahao Liu |
| 2016 | MPI Overlap: Benchmark and Analysis. Alexandre Denis, François Trahay |
| 2016 | Making In-Memory Frequent Pattern Mining Durable and Energy Efficient. Yi Lin, Po-Chun Huang, Duo Liu, Xiao Zhu, Liang Liang |
| 2016 | Managing I/O Interference in a Shared Burst Buffer System. Sagar Thapaliya, Purushotham V. Bangalore, Jay F. Lofstead, Kathryn M. Mohror, Adam Moody |
| 2016 | Massively-Parallel Lossless Data Decompression. Evangelia A. Sitaridi, René Müller, Tim Kaldewey, Guy M. Lohman, Kenneth A. Ross |
| 2016 | MobiSensing: Exploiting Human Mobility for Multi-application Mobile Data Sensing with Low User Intervention. Kang Chen, Haiying Shen |
| 2016 | On the Impact of Widening Vector Registers on Sequence Alignment. Jeffrey Daily, Ananth Kalyanaraman, Sriram Krishnamoorthy, Bin Ren |
| 2016 | One-Sided Interface for Matrix Operations Using MPI-3 RMA: A Case Study with Elemental. Sayan Ghosh, Jeff R. Hammond, Antonio J. Peña, Pavan Balaji, Assefaw Hadish Gebremedhin, Barbara M. Chapman |
| 2016 | Optimal Collision/Conflict-Free Distance-2 Coloring in Wireless Synchronous Broadcast/Receive Tree Networks. Davide Frey, Hicham Lakhlef, Michel Raynal |
| 2016 | Optimal Multi-taxi Dispatch for Mobile Taxi-Hailing Systems. Guoju Gao, Mingjun Xiao, Zhenhua Zhao |
| 2016 | Optimizing GPU Register Usage: Extensions to OpenACC and Compiler Optimizations. Xiaonan Tian, Dounia Khaldi, Deepak Eachempati, Rengan Xu, Barbara M. Chapman |
| 2016 | PARVMEC: An Efficient, Scalable Implementation of the Variational Moments Equilibrium Code. Sudip K. Seal, Steven P. Hirshman, Andreas Wingen, Robert S. Wilcox, Mark R. Cianciosa, Ezekial A. Unterberg |
| 2016 | PCAF: Scalable, High Precision k-NN Search Using Principal Component Analysis Based Filtering. Huan Feng, David M. Eyers, Steven Mills, Yongwei Wu, Zhiyi Huang |
| 2016 | Parallel Tree Traversal for Nearest Neighbor Query on the GPU. Moohyeon Nam, Jinwoong Kim, Beomseok Nam |
| 2016 | Parallel Two-Dimensional Unstructured Anisotropic Delaunay Mesh Generation of Complex Domains for Aerospace Applications. Juliette Pardue, Andrey N. Chernikov |
| 2016 | Parallel k-Means++ for Multiple Shared-Memory Architectures. Patrick Mackey, Robert R. Lewis |
| 2016 | Partial Flattening: A Compilation Technique for Irregular Nested Parallelism on GPGPUs. Ming-Hsiang Huang, Wuu Yang |
| 2016 | Performance Analysis of GPU-Based Convolutional Neural Networks. Xiaqing Li, Guangyan Zhang, H. Howie Huang, Zhufan Wang, Weimin Zheng |
| 2016 | Performance Boosting Opportunities under Communication Imbalance in Power-Constrained HPC Clusters. Leonardo Piga, Indrani Paul, Wei Huang |
| 2016 | Performance Maximization via Frequency Oscillation on Temperature Constrained Multi-core Processors. Shi Sha, Wujie Wen, Ming Fan, Shaolei Ren, Gang Quan |
| 2016 | Piccolo: A Fast and Efficient Rollback System for Virtual Machine Clusters. Lei Cui, Zhiyu Hao, Chonghua Wang, Haiqiang Fei, Zhenquan Ding |
| 2016 | Programming Techniques for the Automata Processor. Indranil Roy, Ankit Srivastava, Srinivas Aluru |
| 2016 | Proxy-Guided Load Balancing of Graph Processing Workloads on Heterogeneous Clusters. Shuang Song, Meng Li, Xinnian Zheng, Michael LeBeane, Jee Ho Ryoo, Reena Panda, Andreas Gerstlauer, Lizy K. John |
| 2016 | RCHC: A Holistic Runtime System for Concurrent Heterogeneous Computing. Jinsu Park, Woongki Baek |
| 2016 | RMD: A Resemblance and Mergence Based Approach for High Performance Deduplication. Panfeng Zhang, Ping Huang, Xubin He, Hua Wang, Lingyu Yan, Ke Zhou |
| 2016 | ROP: Alleviating Refresh Overheads via Reviving the Memory System in Frozen Cycles. Ping Huang, Wenjie Liu, Kun Tang, Xubin He, Ke Zhou |
| 2016 | RRect: A Novel Server-centric Data Center Network with High Availability. Zhenhua Li, Yuanyuan Yang |
| 2016 | Randomly Optimized Grid Graph for Low-Latency Interconnection Networks. Koji Nakano, Daisuke Takafuji, Satoshi Fujita, Hiroki Matsutani, Ikki Fujiwara, Michihiro Koibuchi |
| 2016 | RegTT: Accelerating Tree Traversals on GPUs by Exploiting Regularities. Feng Zhang, Peng Di, Hao Zhou, Xiangke Liao, Jingling Xue |
| 2016 | RepEx: A Flexible Framework for Scalable Replica Exchange Molecular Dynamics Simulations. Antons Treikalis, André Merzky, Haoyuan Chen, Tai-Sung Lee, Darrin M. York, Shantenu Jha |
| 2016 | Resilient Application Co-scheduling with Processor Redistribution. Anne Benoit, Loïc Pottier, Yves Robert |
| 2016 | Run-Time Performance Estimation and Fairness-Oriented Scheduling Policy for Concurrent GPGPU Applications. Qingda Hu, Jiwu Shu, Jie Fan, Youyou Lu |
| 2016 | SWAP-Assembler 2: Optimization of De Novo Genome Assembler at Extreme Scale. Jintao Meng, Sangmin Seo, Pavan Balaji, Yanjie Wei, Bingqiang Wang, Shengzhong Feng |
| 2016 | Scalable Hierarchical Polyhedral Compilation. Benoît Pradelle, Benoît Meister, Muthu Manikandan Baskaran, Athanasios Konstantinidis, Thomas Henretty, Richard Lethin |
| 2016 | Sparse Matrix Format Selection with Multiclass SVM for SpMV on GPU. Akrem Benatia, Weixing Ji, Yizhuo Wang, Feng Shi |
| 2016 | TECH: A Thermal-Aware and Cost Efficient Mechanism for Colocation Demand Response. Ziqi Zhao, Fan Wu, Shaolei Ren, Xiaofeng Gao, Guihai Chen, Yong Cui |
| 2016 | Tetris Write: Exploring More Write Parallelism Considering PCM Asymmetries. Zheng Li, Fang Wang, Dan Feng, Yu Hua, Wei Tong, Jingning Liu, Xiang Liu |
| 2016 | The Case for Cross-Component Power Coordination on Power Bounded Systems. Rong Ge, Xizhou Feng, Yangyang He, Pengfei Zou |
| 2016 | The Future(s) of Transactional Memory. Jingna Zeng, João Pedro Barreto, Seif Haridi, Luís E. T. Rodrigues, Paolo Romano |
| 2016 | Think Global, Act Local: A Buffer Cache Design for Global Ordering and Parallel Processing in the WAFL File System. Peter R. Denz, Matthew Curtis-Maury, Vinay Devadas |
| 2016 | Thread Similarity Matrix: Visualizing Branch Divergence in GPGPU Programs. Zhibin Yu, Lieven Eeckhout, Cheng-Zhong Xu |
| 2016 | Understanding the Architectural Characteristics of EDA Algorithms. Xin Wang, Xiaofeng Ji, Yunping Lu, Yi Li, Weijia Zhou, Weihua Zhang, Wenyun Zhao |