| 2017 | 2017 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2017, Orlando, FL, USA, May 29 - June 2, 2017 |
| 2017 | 26 PFLOPS Stencil Computations for Atmospheric Modeling on Sunway TaihuLight. Yulong Ao, Chao Yang, Xinliang Wang, Wei Xue, Haohuan Fu, Fangfang Liu, Lin Gan, Ping Xu, Wenjing Ma |
| 2017 | A Parallel FastTrack Data Race Detector on Multi-core Systems. Young Wn Song, Yann-Hang Lee |
| 2017 | A Robust Parallel Preconditioner for Indefinite Systems Using Hierarchical Matrices and Randomized Sampling. Pieter Ghysels, Xiaoye Sherry Li, Christopher Gorman, François-Henry Rouet |
| 2017 | A Scalable System Architecture to Addressing the Next Generation of Predictive Simulation Workflows with Coupled Compute and Data Intensive Applications. Mark Seager |
| 2017 | A Scalable and Resilient Microarchitecture Based on Multiport Binding for High-Radix Router Design. Yi Dai, Kefei Wang, Gang Qu, Liquan Xiao, Dezun Dong, Xingyun Qi |
| 2017 | A Work-Efficient Parallel Sparse Matrix-Sparse Vector Multiplication Algorithm. Ariful Azad, Aydin Buluç |
| 2017 | ATM: Approximate Task Memoization in the Runtime System. Iulian Brumar, Marc Casas, Miquel Moretó, Mateo Valero, Gurindar S. Sohi |
| 2017 | Accelerating Graph and Machine Learning Workloads Using a Shared Memory Multicore Architecture with Auxiliary Support for In-hardware Explicit Messaging. Halit Dogan, Farrukh Hijaz, Masab Ahmad, Brian Kahne, Peter Wilson, Omer Khan |
| 2017 | Accelerating Spark Datasets by Inlining Deserialization. Jan Wroblewski, Kazuaki Ishizaki, Hiroshi Inoue, Moriyoshi Ohara |
| 2017 | Accommodating Thread-Level Heterogeneity in Coupled Parallel Applications. Samuel K. Gutierrez, Kei Davis, Dorian C. Arnold, Randal S. Baker, Robert W. Robey, Patrick S. McCormick, Daniel Holladay, Jon A. Dahl, Joe Zerr, Florian Weik, Christoph Junghans |
| 2017 | Aces4: A Platform for Computational Chemistry Calculations with Extremely Large Block-Sparse Arrays. Beverly A. Sanders, Jason N. Byrd, Nakul Jindal, Victor F. Lotrich, Dmitry I. Lyakh, Ajith Perera, Rodney J. Bartlett |
| 2017 | Adaptive Software Caching for Efficient NVRAM Data Persistence. Pengcheng Li, Dhruva R. Chakrabarti, Chen Ding, Liang Yuan |
| 2017 | Addressing Performance Heterogeneity in MapReduce Clusters with Elastic Tasks. Wei Chen, Jia Rao, Xiaobo Zhou |
| 2017 | Algorithms for Hierarchical and Semi-Partitioned Parallel Scheduling. Vincenzo Bonifaci, Gianlorenzo D'Angelo, Alberto Marchetti-Spaccamela |
| 2017 | An Adaptive Core-Specific Runtime for Energy Efficiency. Sridutt Bhalachandra, Allan Porterfield, Stephen L. Olivier, Jan F. Prins |
| 2017 | An N log N Parallel Fast Direct Solver for Kernel Matrices. Chenhan D. Yu, William B. March, George Biros |
| 2017 | Apollo: Reusable Models for Fast, Dynamic Tuning of Input-Dependent Code. David Beckingsale, Olga Pearce, Ignacio Laguna, Todd Gamblin |
| 2017 | Application Level Reordering of Remote Direct Memory Access Operations. Wim Lavrijsen, Costin Iancu |
| 2017 | Approximation Proofs of a Fast and Efficient List Scheduling Algorithm for Task-Based Runtime Systems on Multicores and GPUs. Olivier Beaumont, Lionel Eyraud-Dubois, Suraj Kumar |
| 2017 | Argo NodeOS: Toward Unified Resource Management for Exascale. Swann Perarnau, Judicael A. Zounmevo, Matthieu Dreher, Brian C. Van Essen, Roberto Gioiosa, Kamil Iskra, Maya B. Gokhale, Kazutomo Yoshii, Peter H. Beckman |
| 2017 | Automatic Collapsing of Non-Rectangular Loops. Philippe Clauss, Ervin Altintas, Matthieu Kuhn |
| 2017 | Automatic-Signal Monitors with Multi-object Synchronization. Wei-Lun Hung, Vijay K. Garg |
| 2017 | Autonomic Resource Management for Program Orchestration in Large-Scale Data Analysis. Masahiro Tanaka, Kenjiro Taura, Kentaro Torisawa |
| 2017 | Autotuning Stencil Computations with Structural Ordinal Regression Learning. Biagio Cosenza, Juan José Durillo, Stefano Ermon, Ben H. H. Juurlink |
| 2017 | Bidiagonalization and R-Bidiagonalization: Parallel Tiled Algorithms, Critical Paths and Distributed-Memory Implementation. Mathieu Faverge, Julien Langou, Yves Robert, Jack J. Dongarra |
| 2017 | Bounded Reordering Allows Efficient Reliable Message Transmission. Keishla D. Ortiz-Lopez, Jennifer L. Welch |
| 2017 | Capability Models for Manycore Memory Systems: A Case-Study with Xeon Phi KNL. Sabela Ramos, Torsten Hoefler |
| 2017 | Characterizing and Modeling Power and Energy for Extreme-Scale In-Situ Visualization. Vignesh Adhinarayanan, Wu-chun Feng, David H. Rogers, James P. Ahrens, Scott Pakin |
| 2017 | Clustering Throughput Optimization on the GPU. Michael G. Gowanlock, Cody M. Rude, David M. Blair, Justin D. Li, Victor Pankratius |
| 2017 | Co-Run Scheduling with Power Cap on Integrated CPU-GPU Systems. Qi Zhu, Bo Wu, Xipeng Shen, Li Shen, Zhiying Wang |
| 2017 | Communication Optimization on GPU: A Case Study of Sequence Alignment Algorithms. Jie Wang, Xinfeng Xie, Jason Cong |
| 2017 | Communication-Avoiding Parallel Algorithms for Solving Triangular Systems of Linear Equations. Tobias Wicky, Edgar Solomonik, Torsten Hoefler |
| 2017 | Community Detection on the GPU. Md. Naim, Fredrik Manne, Mahantesh Halappanavar, Antonino Tumeo |
| 2017 | Computational Challenges in Constructing the Tree of Life. Tandy J. Warnow |
| 2017 | Container-Based Cloud Platform for Mobile Computation Offloading. Song Wu, Chao Niu, Jia Rao, Hai Jin, Xiaohai Dai |
| 2017 | Content-Aware Non-Volatile Cache Replacement. Qi Zeng, Jih-Kwon Peir |
| 2017 | Cooling-Aware Job Scheduling and Node Allocation for Overprovisioned HPC Systems. Thang Cao, Wei Huang, Yuan He, Masaaki Kondo |
| 2017 | Corrected Gossip Algorithms for Fast Reliable Broadcast on Unreliable Systems. Torsten Hoefler, Amnon Barak, Amnon Shiloh, Zvi Drezner |
| 2017 | DC Jiyan Sun, Yan Zhang, Xin Wang, Shihan Xiao, Zhen Xu, Hongjing Wu, Xin Chen, Yanni Han |
| 2017 | DEFT-Cache: A Cost-Effective and Highly Reliable SSD Cache for RAID Storage. Jiguang Wan, Wei Wu, Ling Zhan, Qing Yang, Xiaoyang Qu, Changsheng Xie |
| 2017 | DR-BW: Identifying Bandwidth Contention in NUMA Architectures with Supervised Learning. Hao Xu, Shasha Wen, Alfredo Giménez, Todd Gamblin, Xu Liu |
| 2017 | Data Centric Performance Measurement Techniques for Chapel Programs. Hui Zhang, Jeffrey K. Hollingsworth |
| 2017 | Design and Implementation of Papyrus: Parallel Aggregate Persistent Storage. Jungwon Kim, Kittisak Sajjapongse, Seyong Lee, Jeffrey S. Vetter |
| 2017 | Directive-Based Partitioning and Pipelining for Graphics Processing Units. Xuewen Cui, Thomas R. W. Scogland, Bronis R. de Supinski, Wu-chun Feng |
| 2017 | Distributed Vehicle Routing Approximation. Akhil Krishnan, Mikhail Markov, Borzoo Bonakdarpour |
| 2017 | Dynamic Adaptation in Wireless Networks Under Comprehensive Interference via Carrier Sense. Dongxiao Yu, Yuexuan Wang, Tigran Tonoyan, Magnús M. Halldórsson |
| 2017 | Dynamic Memory-Aware Task-Tree Scheduling. Guillaume Aupy, Clement Brasseur, Loris Marchal |
| 2017 | E^2MC: Entropy Encoding Based Memory Compression for GPUs. Sohan Lal, Jan Lucas, Ben H. H. Juurlink |
| 2017 | Efficient and Deterministic Scheduling for Parallel State Machine Replication. Odorico Machado Mendizabal, Ruda S. T. De Moura, Fernando Luís Dotti, Fernando Pedone |
| 2017 | Elastic Consistent Hashing for Distributed Storage Systems. Wei Xie, Yong Chen |
| 2017 | Elastic Data Compression with Improved Performance and Space Efficiency for Flash-Based Storage Systems. Bo Mao, Hong Jiang, Suzhen Wu, Yaodong Yang, Zaifa Xi |
| 2017 | Elastic-Cache: GPU Cache Architecture for Efficient Fine- and Coarse-Grained Cache-Line Management. Bingchao Li, Jizhou Sun, Murali Annavaram, Nam Sung Kim |
| 2017 | Eliminating Irregularities of Protein Sequence Search on Multicore Architectures. Jing Zhang, Sanchit Misra, Hao Wang, Wu-chun Feng |
| 2017 | Enhancing Datacenter Resource Management through Temporal Logic Constraints. Hao He, Jiang Hu, Dilma Da Silva |
| 2017 | Exploring DataVortex Systems for Irregular Applications. Roberto Gioiosa, Antonino Tumeo, Jian Yin, Thomas Warfel, David J. Haglin, Santiago Betelú |
| 2017 | FFQ: A Fast Single-Producer/Multiple-Consumer Concurrent FIFO Queue. Sergei Arnautov, Pascal Felber, Christof Fetzer, Bohdan Trach |
| 2017 | Fault-Tolerant Online Packet Scheduling on Parallel Channels. Pawel Garncarek, Tomasz Jurdzinski, Krzysztof Lorys |
| 2017 | Fault-Tolerant Robot Gathering Problems on Graphs With Arbitrary Appearing Times. Sergio Rajsbaum, Armando Castañeda, David Flores-Peñaloza, Manuel Alcantara |
| 2017 | FlexVC: Flexible Virtual Channel Management in Low-Diameter Networks. Pablo Fuentes, Enrique Vallejo, Ramón Beivide, Cyriel Minkenberg, Mateo Valero |
| 2017 | Fly-Over: A Light-Weight Distributed Power-Gating Mechanism for Energy-Efficient Networks-on-Chip. Rahul Boyapati, Jiayi Huang, Ningyuan Wang, Kyung Hoon Kim, Ki Hwan Yum, Eun Jung Kim |
| 2017 | General Purpose Task-Dependence Management Hardware for Task-Based Dataflow Programming Models. Xubin Tan, Jaume Bosch, Miquel Vidal, Carlos Álvarez, Daniel Jiménez-González, Eduard Ayguadé, Mateo Valero |
| 2017 | Generating Families of Practical Fast Matrix Multiplication Algorithms. Jianyu Huang, Leslie Rice, Devin A. Matthews, Robert A. van de Geijn |
| 2017 | Generating Performance Models for Irregular Applications. Ryan D. Friese, Nathan R. Tallent, Abhinav Vishnu, Darren J. Kerbyson, Adolfy Hoisie |
| 2017 | HOMP: Automated Distribution of Parallel Loops and Data in Highly Parallel Accelerator-Based Systems. Yonghong Yan, Jiawen Liu, Kirk W. Cameron, Mariam Umar |
| 2017 | High-Performance Virtual Machine Migration Framework for MPI Applications on SR-IOV Enabled InfiniBand Clusters. Jie Zhang, Xiaoyi Lu, Dhabaleswar K. Panda |
| 2017 | Image-Domain Gridding on Graphics Processors. Bram Veenboer, Matthias Petschow, John W. Romein |
| 2017 | Improving the Integration of Task Nesting and Dependencies in OpenMP. Josep M. Pérez, Vicenç Beltran, Jesús Labarta, Eduard Ayguadé |
| 2017 | Language-Based Optimizations for Persistence on Nonvolatile Main Memory Systems. Joel Edward Denny, Seyong Lee, Jeffrey S. Vetter |
| 2017 | Large Scale Manycore-Aware PIC Simulation with Efficient Particle Binning. Hiroshi Nakashima, Yoshiki Summura, Keisuke Kikura, Yohei Miyake |
| 2017 | Leader Election in Asymmetric Labeled Unidirectional Rings. Karine Altisen, Ajoy K. Datta, Stéphane Devismes, Anaïs Durand, Lawrence L. Larmore |
| 2017 | Leader Election in a Smartphone Peer-to-Peer Network. Calvin Newport |
| 2017 | Localized Fault Recovery for Nested Fork-Join Programs. Gokcen Kestor, Sriram Krishnamoorthy, Wenjing Ma |
| 2017 | MOCHA: Morphable Locality and Compression Aware Architecture for Convolutional Neural Networks. Syed Mohammad Asad Hassan Jafri, Ahmed Hemani, Kolin Paul, Naeem Abbas |
| 2017 | MRapid: An Efficient Short Job Optimizer on Hadoop. Hong Zhang, Hai Huang, Liqiang Wang |
| 2017 | Memory Compression Techniques for Network Address Management in MPI. Yanfei Guo, Charles J. Archer, Michael Blocksome, Scott Parker, Wesley Bland, Ken Raffenetti, Pavan Balaji |
| 2017 | MetaKV: A Key-Value Store for Metadata Management of Distributed Burst Buffers. Teng Wang, Adam Moody, Yue Zhu, Kathryn M. Mohror, Kento Sato, Tanzima Z. Islam, Weikuan Yu |
| 2017 | Mimir: Memory-Efficient and Scalable MapReduce for Large Supercomputing Systems. Tao Gao, Yanfei Guo, Boyu Zhang, Pietro Cicotti, Yutong Lu, Pavan Balaji, Michela Taufer |
| 2017 | Model-Driven Sparse CP Decomposition for Higher-Order Tensors. Jiajia Li, Jee Choi, Ioakeim Perros, Jimeng Sun, Richard W. Vuduc |
| 2017 | Monitoring Properties of Large, Distributed, Dynamic Graphs. Gal Yehuda, Daniel Keren, Islam Akaria |
| 2017 | Multi-GPU Graph Analytics. Yuechao Pan, Yangzihao Wang, Yuduo Wu, Carl Yang, John D. Owens |
| 2017 | Multigrain Parallelism: Bridging Coarse-Grain Parallel Programs and Fine-Grain Event-Driven Multithreading. Jaime Arteaga Molina, Stéphane Zuckerman, Guang R. Gao |
| 2017 | NVIDIA Deep Learning Tutorial. Julie Bernauer |
| 2017 | O(log N)-Time Complete Visibility for Asynchronous Robots with Lights. Gokarna Sharma, Ramachandran Vaidyanathan, Jerry L. Trahan, Costas Busch, Suresh Rai |
| 2017 | On Optimizing Distributed Tucker Decomposition for Dense Tensors. Venkatesan T. Chakaravarthy, Jee W. Choi, Douglas J. Joseph, Xing Liu, Prakash Murali, Yogish Sabharwal, Dheeraj Sreedhar |
| 2017 | One-Way Wave Equation Migration at Scale on GPUs Using Directive Based Programming. Kshitij Mehta, Maxime R. Hugues, Oscar R. Hernandez, David E. Bernholdt, Henri Calandra |
| 2017 | Optimal Algorithms for a Mesh-Connected Computer with Limited Additional Global Bandwidth. Yujie An, Quentin F. Stout |
| 2017 | Optimization and Parallelization of B-Spline Based Orbital Evaluations in QMC on Multi/Many-Core Shared Memory Processors. Amrita Mathuriya, Ye Luo, Anouar Benali, Luke Shulenburger, Jeongnim Kim |
| 2017 | PUNAS: A Parallel Ungapped-Alignment-Featured Seed Verification Algorithm for Next-Generation Sequencing Read Alignment. Yuandong Chan, Kai Xu, Haidong Lan, Weiguo Liu, Yongchao Liu, Bertil Schmidt |
| 2017 | PaPar: A Parallel Data Partitioning Framework for Big Data Applications. Hao Wang, Jing Zhang, Da Zhang, Sarunya Pumma, Wu-chun Feng |
| 2017 | Parallel Construction of Suffix Trees and the All-Nearest-Smaller-Values Problem. Patrick Flick, Srinivas Aluru |
| 2017 | Parallelism and Garbage Collection Aware I/O Scheduler with Improved SSD Performance. Jiayang Guo, Yiming Hu, Bo Mao, Suzhen Wu |
| 2017 | Partitioning Low-Diameter Networks to Eliminate Inter-Job Interference. Nikhil Jain, Abhinav Bhatele, Xiang Ni, Todd Gamblin, Laxmikant V. Kalé |
| 2017 | Partitioning Trillion-Edge Graphs in Minutes. George M. Slota, Sivasankaran Rajamanickam, Karen D. Devine, Kamesh Madduri |
| 2017 | PhiOpenSSL: Using the Xeon Phi Coprocessor for Efficient Cryptographic Calculations. Shun Yao, Dantong Yu |
| 2017 | Power Efficient Sharing-Aware GPU Data Management. Abdulaziz Tabbakh, Murali Annavaram, Xuehai Qian |
| 2017 | Production Hardware Overprovisioning: Real-World Performance Optimization Using an Extensible Power-Aware Resource Management Framework. Ryuichi Sakamoto, Thang Cao, Masaaki Kondo, Koji Inoue, Masatsugu Ueda, Tapasya Patki, Daniel A. Ellsworth, Barry Rountree, Martin Schulz |
| 2017 | Proximity-Aware Balanced Allocations in Cache Networks. Ali Pourmiri, Mahdi Jafari Siavoshani, Seyed Pooya Shariatpanahi |
| 2017 | RCube: A Power Efficient and Highly Available Network for Data Centers. Zhenhua Li, Yuanyuan Yang |
| 2017 | Rational Fair Consensus in the Gossip Model. Andrea Clementi, Luciano Gualà, Guido Proietti, Giacomo Scornavacca |
| 2017 | Reducing PageRank Communication via Propagation Blocking. Scott Beamer, Krste Asanovic, David A. Patterson |
| 2017 | Relaxations for High-Performance Message Passing on Massively Parallel SIMT Processors. Benjamin Klenk, Holger Fröning, Hans Eberle, Larry Dennison |
| 2017 | Respin: Rethinking Near-Threshold Multiprocessor Design with Non-volatile Memory. Xiang Pan, Anys Bacha, Radu Teodorescu |
| 2017 | Runtime Aware Architectures. Mateo Valero |
| 2017 | SWhybrid: A Hybrid-Parallel Framework for Large-Scale Protein Sequence Database Search. Haidong Lan, Weiguo Liu, Yongchao Liu, Bertil Schmidt |
| 2017 | ScalaIOExtrap: Elastic I/O Tracing and Extrapolation. Xiaoqing Luo, Frank Mueller, Philip H. Carns, Jonathan Jenkins, Robert Latham, Robert B. Ross, Shane Snyder |
| 2017 | Scalable Graph Traversal on Sunway TaihuLight with Ten Million Cores. Heng Lin, Xiongchao Tang, Bowen Yu, Youwei Zhuo, Wenguang Chen, Jidong Zhai, Wanwang Yin, Weimin Zheng |
| 2017 | Scalable Lock-Free Vector with Combining. Ivan Walulya, Philippas Tsigas |
| 2017 | Significantly Improving Lossy Compression for Scientific Data Sets Based on Multidimensional Prediction and Error-Controlled Quantization. Dingwen Tao, Sheng Di, Zizhong Chen, Franck Cappello |
| 2017 | SimProf: A Sampling Framework for Data Analytic Workloads. Jen-Cheng Huang, Lifeng Nai, Pranith Kumar, Hyojong Kim, Hyesoon Kim |
| 2017 | Similarity Search on Automata Processors. Vincent T. Lee, Justin Kotalik, Carlo C. del Mundo, Armin Alaghi, Luis Ceze, Mark Oskin |
| 2017 | SlimSell: A Vectorizable Graph Representation for Breadth-First Search. Maciej Besta, Florian Marending, Edgar Solomonik, Torsten Hoefler |
| 2017 | Sparse Tensor Factorization on Many-Core Processors with High-Bandwidth Memory. Shaden Smith, Jongsoo Park, George Karypis |
| 2017 | The Reverse Cuthill-McKee Algorithm in Distributed-Memory. Ariful Azad, Mathias Jacquelin, Aydin Buluç, Esmond G. Ng |
| 2017 | The SEPO Model of Computation to Enable Larger-Than-Memory Hash Tables for GPU-Accelerated Big Data Analytics. Reza Mokhtari, Michael Stumm |
| 2017 | Tight Load Balancing Via Randomized Local Search. Petra Berenbrink, Peter Kling, Christopher Liaw, Abbas Mehrabian |
| 2017 | Toucan - A Translator for Communication Tolerant MPI Applications. Sergio M. Martin, Marsha J. Berger, Scott B. Baden |
| 2017 | Towards Highly Scalable Ab Initio Molecular Dynamics (AIMD) Simulations on the Intel Knights Landing Manycore Processor. Mathias Jacquelin, Wibe A. de Jong, Eric J. Bylaska |
| 2017 | Transparent Caching for RMA Systems. Salvatore Di Girolamo, Flavio Vella, Torsten Hoefler |
| 2017 | When Neurons Fail. El Mahdi El Mhamdi, Rachid Guerraoui |
| 2017 | swDNN: A Library for Accelerating Deep Learning Applications on Sunway TaihuLight. Jiarui Fang, Haohuan Fu, Wenlai Zhao, Bingwei Chen, Weijie Zheng, Guangwen Yang |