IPDPS A

121 papers

Year  Title / Authors
2017  2017 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2017, Orlando, FL, USA, May 29 - June 2, 2017
2017  26 PFLOPS Stencil Computations for Atmospheric Modeling on Sunway TaihuLight.
Yulong Ao, Chao Yang, Xinliang Wang, Wei Xue, Haohuan Fu, Fangfang Liu, Lin Gan, Ping Xu, Wenjing Ma
2017  A Parallel FastTrack Data Race Detector on Multi-core Systems.
Young Wn Song, Yann-Hang Lee
2017  A Robust Parallel Preconditioner for Indefinite Systems Using Hierarchical Matrices and Randomized Sampling.
Pieter Ghysels, Xiaoye Sherry Li, Christopher Gorman, François-Henry Rouet
2017  A Scalable System Architecture to Addressing the Next Generation of Predictive Simulation Workflows with Coupled Compute and Data Intensive Applications.
Mark Seager
2017  A Scalable and Resilient Microarchitecture Based on Multiport Binding for High-Radix Router Design.
Yi Dai, Kefei Wang, Gang Qu, Liquan Xiao, Dezun Dong, Xingyun Qi
2017  A Work-Efficient Parallel Sparse Matrix-Sparse Vector Multiplication Algorithm.
Ariful Azad, Aydin Buluç
2017  ATM: Approximate Task Memoization in the Runtime System.
Iulian Brumar, Marc Casas, Miquel Moretó, Mateo Valero, Gurindar S. Sohi
2017  Accelerating Graph and Machine Learning Workloads Using a Shared Memory Multicore Architecture with Auxiliary Support for In-hardware Explicit Messaging.
Halit Dogan, Farrukh Hijaz, Masab Ahmad, Brian Kahne, Peter Wilson, Omer Khan
2017  Accelerating Spark Datasets by Inlining Deserialization.
Jan Wroblewski, Kazuaki Ishizaki, Hiroshi Inoue, Moriyoshi Ohara
2017  Accommodating Thread-Level Heterogeneity in Coupled Parallel Applications.
Samuel K. Gutierrez, Kei Davis, Dorian C. Arnold, Randal S. Baker, Robert W. Robey, Patrick S. McCormick, Daniel Holladay, Jon A. Dahl, Joe Zerr, Florian Weik, Christoph Junghans
2017  Aces4: A Platform for Computational Chemistry Calculations with Extremely Large Block-Sparse Arrays.
Beverly A. Sanders, Jason N. Byrd, Nakul Jindal, Victor F. Lotrich, Dmitry I. Lyakh, Ajith Perera, Rodney J. Bartlett
2017  Adaptive Software Caching for Efficient NVRAM Data Persistence.
Pengcheng Li, Dhruva R. Chakrabarti, Chen Ding, Liang Yuan
2017  Addressing Performance Heterogeneity in MapReduce Clusters with Elastic Tasks.
Wei Chen, Jia Rao, Xiaobo Zhou
2017  Algorithms for Hierarchical and Semi-Partitioned Parallel Scheduling.
Vincenzo Bonifaci, Gianlorenzo D'Angelo, Alberto Marchetti-Spaccamela
2017  An Adaptive Core-Specific Runtime for Energy Efficiency.
Sridutt Bhalachandra, Allan Porterfield, Stephen L. Olivier, Jan F. Prins
2017  An N log N Parallel Fast Direct Solver for Kernel Matrices.
Chenhan D. Yu, William B. March, George Biros
2017  Apollo: Reusable Models for Fast, Dynamic Tuning of Input-Dependent Code.
David Beckingsale, Olga Pearce, Ignacio Laguna, Todd Gamblin
2017  Application Level Reordering of Remote Direct Memory Access Operations.
Wim Lavrijsen, Costin Iancu
2017  Approximation Proofs of a Fast and Efficient List Scheduling Algorithm for Task-Based Runtime Systems on Multicores and GPUs.
Olivier Beaumont, Lionel Eyraud-Dubois, Suraj Kumar
2017  Argo NodeOS: Toward Unified Resource Management for Exascale.
Swann Perarnau, Judicael A. Zounmevo, Matthieu Dreher, Brian C. Van Essen, Roberto Gioiosa, Kamil Iskra, Maya B. Gokhale, Kazutomo Yoshii, Peter H. Beckman
2017  Automatic Collapsing of Non-Rectangular Loops.
Philippe Clauss, Ervin Altintas, Matthieu Kuhn
2017  Automatic-Signal Monitors with Multi-object Synchronization.
Wei-Lun Hung, Vijay K. Garg
2017  Autonomic Resource Management for Program Orchestration in Large-Scale Data Analysis.
Masahiro Tanaka, Kenjiro Taura, Kentaro Torisawa
2017  Autotuning Stencil Computations with Structural Ordinal Regression Learning.
Biagio Cosenza, Juan José Durillo, Stefano Ermon, Ben H. H. Juurlink
2017  Bidiagonalization and R-Bidiagonalization: Parallel Tiled Algorithms, Critical Paths and Distributed-Memory Implementation.
Mathieu Faverge, Julien Langou, Yves Robert, Jack J. Dongarra
2017  Bounded Reordering Allows Efficient Reliable Message Transmission.
Keishla D. Ortiz-Lopez, Jennifer L. Welch
2017  Capability Models for Manycore Memory Systems: A Case-Study with Xeon Phi KNL.
Sabela Ramos, Torsten Hoefler
2017  Characterizing and Modeling Power and Energy for Extreme-Scale In-Situ Visualization.
Vignesh Adhinarayanan, Wu-chun Feng, David H. Rogers, James P. Ahrens, Scott Pakin
2017  Clustering Throughput Optimization on the GPU.
Michael G. Gowanlock, Cody M. Rude, David M. Blair, Justin D. Li, Victor Pankratius
2017  Co-Run Scheduling with Power Cap on Integrated CPU-GPU Systems.
Qi Zhu, Bo Wu, Xipeng Shen, Li Shen, Zhiying Wang
2017  Communication Optimization on GPU: A Case Study of Sequence Alignment Algorithms.
Jie Wang, Xinfeng Xie, Jason Cong
2017  Communication-Avoiding Parallel Algorithms for Solving Triangular Systems of Linear Equations.
Tobias Wicky, Edgar Solomonik, Torsten Hoefler
2017  Community Detection on the GPU.
Md. Naim, Fredrik Manne, Mahantesh Halappanavar, Antonino Tumeo
2017  Computational Challenges in Constructing the Tree of Life.
Tandy J. Warnow
2017  Container-Based Cloud Platform for Mobile Computation Offloading.
Song Wu, Chao Niu, Jia Rao, Hai Jin, Xiaohai Dai
2017  Content-Aware Non-Volatile Cache Replacement.
Qi Zeng, Jih-Kwon Peir
2017  Cooling-Aware Job Scheduling and Node Allocation for Overprovisioned HPC Systems.
Thang Cao, Wei Huang, Yuan He, Masaaki Kondo
2017  Corrected Gossip Algorithms for Fast Reliable Broadcast on Unreliable Systems.
Torsten Hoefler, Amnon Barak, Amnon Shiloh, Zvi Drezner
2017  DC
Jiyan Sun, Yan Zhang, Xin Wang, Shihan Xiao, Zhen Xu, Hongjing Wu, Xin Chen, Yanni Han
2017  DEFT-Cache: A Cost-Effective and Highly Reliable SSD Cache for RAID Storage.
Jiguang Wan, Wei Wu, Ling Zhan, Qing Yang, Xiaoyang Qu, Changsheng Xie
2017  DR-BW: Identifying Bandwidth Contention in NUMA Architectures with Supervised Learning.
Hao Xu, Shasha Wen, Alfredo Giménez, Todd Gamblin, Xu Liu
2017  Data Centric Performance Measurement Techniques for Chapel Programs.
Hui Zhang, Jeffrey K. Hollingsworth
2017  Design and Implementation of Papyrus: Parallel Aggregate Persistent Storage.
Jungwon Kim, Kittisak Sajjapongse, Seyong Lee, Jeffrey S. Vetter
2017  Directive-Based Partitioning and Pipelining for Graphics Processing Units.
Xuewen Cui, Thomas R. W. Scogland, Bronis R. de Supinski, Wu-chun Feng
2017  Distributed Vehicle Routing Approximation.
Akhil Krishnan, Mikhail Markov, Borzoo Bonakdarpour
2017  Dynamic Adaptation in Wireless Networks Under Comprehensive Interference via Carrier Sense.
Dongxiao Yu, Yuexuan Wang, Tigran Tonoyan, Magnús M. Halldórsson
2017  Dynamic Memory-Aware Task-Tree Scheduling.
Guillaume Aupy, Clement Brasseur, Loris Marchal
2017  E^2MC: Entropy Encoding Based Memory Compression for GPUs.
Sohan Lal, Jan Lucas, Ben H. H. Juurlink
2017  Efficient and Deterministic Scheduling for Parallel State Machine Replication.
Odorico Machado Mendizabal, Ruda S. T. De Moura, Fernando Luís Dotti, Fernando Pedone
2017  Elastic Consistent Hashing for Distributed Storage Systems.
Wei Xie, Yong Chen
2017  Elastic Data Compression with Improved Performance and Space Efficiency for Flash-Based Storage Systems.
Bo Mao, Hong Jiang, Suzhen Wu, Yaodong Yang, Zaifa Xi
2017  Elastic-Cache: GPU Cache Architecture for Efficient Fine- and Coarse-Grained Cache-Line Management.
Bingchao Li, Jizhou Sun, Murali Annavaram, Nam Sung Kim
2017  Eliminating Irregularities of Protein Sequence Search on Multicore Architectures.
Jing Zhang, Sanchit Misra, Hao Wang, Wu-chun Feng
2017  Enhancing Datacenter Resource Management through Temporal Logic Constraints.
Hao He, Jiang Hu, Dilma Da Silva
2017  Exploring DataVortex Systems for Irregular Applications.
Roberto Gioiosa, Antonino Tumeo, Jian Yin, Thomas Warfel, David J. Haglin, Santiago Betelú
2017  FFQ: A Fast Single-Producer/Multiple-Consumer Concurrent FIFO Queue.
Sergei Arnautov, Pascal Felber, Christof Fetzer, Bohdan Trach
2017  Fault-Tolerant Online Packet Scheduling on Parallel Channels.
Pawel Garncarek, Tomasz Jurdzinski, Krzysztof Lorys
2017  Fault-Tolerant Robot Gathering Problems on Graphs With Arbitrary Appearing Times.
Sergio Rajsbaum, Armando Castañeda, David Flores-Peñaloza, Manuel Alcantara
2017  FlexVC: Flexible Virtual Channel Management in Low-Diameter Networks.
Pablo Fuentes, Enrique Vallejo, Ramón Beivide, Cyriel Minkenberg, Mateo Valero
2017  Fly-Over: A Light-Weight Distributed Power-Gating Mechanism for Energy-Efficient Networks-on-Chip.
Rahul Boyapati, Jiayi Huang, Ningyuan Wang, Kyung Hoon Kim, Ki Hwan Yum, Eun Jung Kim
2017  General Purpose Task-Dependence Management Hardware for Task-Based Dataflow Programming Models.
Xubin Tan, Jaume Bosch, Miquel Vidal, Carlos Álvarez, Daniel Jiménez-González, Eduard Ayguadé, Mateo Valero
2017  Generating Families of Practical Fast Matrix Multiplication Algorithms.
Jianyu Huang, Leslie Rice, Devin A. Matthews, Robert A. van de Geijn
2017  Generating Performance Models for Irregular Applications.
Ryan D. Friese, Nathan R. Tallent, Abhinav Vishnu, Darren J. Kerbyson, Adolfy Hoisie
2017  HOMP: Automated Distribution of Parallel Loops and Data in Highly Parallel Accelerator-Based Systems.
Yonghong Yan, Jiawen Liu, Kirk W. Cameron, Mariam Umar
2017  High-Performance Virtual Machine Migration Framework for MPI Applications on SR-IOV Enabled InfiniBand Clusters.
Jie Zhang, Xiaoyi Lu, Dhabaleswar K. Panda
2017  Image-Domain Gridding on Graphics Processors.
Bram Veenboer, Matthias Petschow, John W. Romein
2017  Improving the Integration of Task Nesting and Dependencies in OpenMP.
Josep M. Pérez, Vicenç Beltran, Jesús Labarta, Eduard Ayguadé
2017  Language-Based Optimizations for Persistence on Nonvolatile Main Memory Systems.
Joel Edward Denny, Seyong Lee, Jeffrey S. Vetter
2017  Large Scale Manycore-Aware PIC Simulation with Efficient Particle Binning.
Hiroshi Nakashima, Yoshiki Summura, Keisuke Kikura, Yohei Miyake
2017  Leader Election in Asymmetric Labeled Unidirectional Rings.
Karine Altisen, Ajoy K. Datta, Stéphane Devismes, Anaïs Durand, Lawrence L. Larmore
2017  Leader Election in a Smartphone Peer-to-Peer Network.
Calvin Newport
2017  Localized Fault Recovery for Nested Fork-Join Programs.
Gokcen Kestor, Sriram Krishnamoorthy, Wenjing Ma
2017  MOCHA: Morphable Locality and Compression Aware Architecture for Convolutional Neural Networks.
Syed Mohammad Asad Hassan Jafri, Ahmed Hemani, Kolin Paul, Naeem Abbas
2017  MRapid: An Efficient Short Job Optimizer on Hadoop.
Hong Zhang, Hai Huang, Liqiang Wang
2017  Memory Compression Techniques for Network Address Management in MPI.
Yanfei Guo, Charles J. Archer, Michael Blocksome, Scott Parker, Wesley Bland, Ken Raffenetti, Pavan Balaji
2017  MetaKV: A Key-Value Store for Metadata Management of Distributed Burst Buffers.
Teng Wang, Adam Moody, Yue Zhu, Kathryn M. Mohror, Kento Sato, Tanzima Z. Islam, Weikuan Yu
2017  Mimir: Memory-Efficient and Scalable MapReduce for Large Supercomputing Systems.
Tao Gao, Yanfei Guo, Boyu Zhang, Pietro Cicotti, Yutong Lu, Pavan Balaji, Michela Taufer
2017  Model-Driven Sparse CP Decomposition for Higher-Order Tensors.
Jiajia Li, Jee Choi, Ioakeim Perros, Jimeng Sun, Richard W. Vuduc
2017  Monitoring Properties of Large, Distributed, Dynamic Graphs.
Gal Yehuda, Daniel Keren, Islam Akaria
2017  Multi-GPU Graph Analytics.
Yuechao Pan, Yangzihao Wang, Yuduo Wu, Carl Yang, John D. Owens
2017  Multigrain Parallelism: Bridging Coarse-Grain Parallel Programs and Fine-Grain Event-Driven Multithreading.
Jaime Arteaga Molina, Stéphane Zuckerman, Guang R. Gao
2017  NVIDIA Deep Learning Tutorial.
Julie Bernauer
2017  O(log N)-Time Complete Visibility for Asynchronous Robots with Lights.
Gokarna Sharma, Ramachandran Vaidyanathan, Jerry L. Trahan, Costas Busch, Suresh Rai
2017  On Optimizing Distributed Tucker Decomposition for Dense Tensors.
Venkatesan T. Chakaravarthy, Jee W. Choi, Douglas J. Joseph, Xing Liu, Prakash Murali, Yogish Sabharwal, Dheeraj Sreedhar
2017  One-Way Wave Equation Migration at Scale on GPUs Using Directive Based Programming.
Kshitij Mehta, Maxime R. Hugues, Oscar R. Hernandez, David E. Bernholdt, Henri Calandra
2017  Optimal Algorithms for a Mesh-Connected Computer with Limited Additional Global Bandwidth.
Yujie An, Quentin F. Stout
2017  Optimization and Parallelization of B-Spline Based Orbital Evaluations in QMC on Multi/Many-Core Shared Memory Processors.
Amrita Mathuriya, Ye Luo, Anouar Benali, Luke Shulenburger, Jeongnim Kim
2017  PUNAS: A Parallel Ungapped-Alignment-Featured Seed Verification Algorithm for Next-Generation Sequencing Read Alignment.
Yuandong Chan, Kai Xu, Haidong Lan, Weiguo Liu, Yongchao Liu, Bertil Schmidt
2017  PaPar: A Parallel Data Partitioning Framework for Big Data Applications.
Hao Wang, Jing Zhang, Da Zhang, Sarunya Pumma, Wu-chun Feng
2017  Parallel Construction of Suffix Trees and the All-Nearest-Smaller-Values Problem.
Patrick Flick, Srinivas Aluru
2017  Parallelism and Garbage Collection Aware I/O Scheduler with Improved SSD Performance.
Jiayang Guo, Yiming Hu, Bo Mao, Suzhen Wu
2017  Partitioning Low-Diameter Networks to Eliminate Inter-Job Interference.
Nikhil Jain, Abhinav Bhatele, Xiang Ni, Todd Gamblin, Laxmikant V. Kalé
2017  Partitioning Trillion-Edge Graphs in Minutes.
George M. Slota, Sivasankaran Rajamanickam, Karen D. Devine, Kamesh Madduri
2017  PhiOpenSSL: Using the Xeon Phi Coprocessor for Efficient Cryptographic Calculations.
Shun Yao, Dantong Yu
2017  Power Efficient Sharing-Aware GPU Data Management.
Abdulaziz Tabbakh, Murali Annavaram, Xuehai Qian
2017  Production Hardware Overprovisioning: Real-World Performance Optimization Using an Extensible Power-Aware Resource Management Framework.
Ryuichi Sakamoto, Thang Cao, Masaaki Kondo, Koji Inoue, Masatsugu Ueda, Tapasya Patki, Daniel A. Ellsworth, Barry Rountree, Martin Schulz
2017  Proximity-Aware Balanced Allocations in Cache Networks.
Ali Pourmiri, Mahdi Jafari Siavoshani, Seyed Pooya Shariatpanahi
2017  RCube: A Power Efficient and Highly Available Network for Data Centers.
Zhenhua Li, Yuanyuan Yang
2017  Rational Fair Consensus in the Gossip Model.
Andrea Clementi, Luciano Gualà, Guido Proietti, Giacomo Scornavacca
2017  Reducing Pagerank Communication via Propagation Blocking.
Scott Beamer, Krste Asanovic, David A. Patterson
2017  Relaxations for High-Performance Message Passing on Massively Parallel SIMT Processors.
Benjamin Klenk, Holger Fröning, Hans Eberle, Larry Dennison
2017  Respin: Rethinking Near-Threshold Multiprocessor Design with Non-volatile Memory.
Xiang Pan, Anys Bacha, Radu Teodorescu
2017  Runtime Aware Architectures.
Mateo Valero
2017  SWhybrid: A Hybrid-Parallel Framework for Large-Scale Protein Sequence Database Search.
Haidong Lan, Weiguo Liu, Yongchao Liu, Bertil Schmidt
2017  ScalaIOExtrap: Elastic I/O Tracing and Extrapolation.
Xiaoqing Luo, Frank Mueller, Philip H. Carns, Jonathan Jenkins, Robert Latham, Robert B. Ross, Shane Snyder
2017  Scalable Graph Traversal on Sunway TaihuLight with Ten Million Cores.
Heng Lin, Xiongchao Tang, Bowen Yu, Youwei Zhuo, Wenguang Chen, Jidong Zhai, Wanwang Yin, Weimin Zheng
2017  Scalable Lock-Free Vector with Combining.
Ivan Walulya, Philippas Tsigas
2017  Significantly Improving Lossy Compression for Scientific Data Sets Based on Multidimensional Prediction and Error-Controlled Quantization.
Dingwen Tao, Sheng Di, Zizhong Chen, Franck Cappello
2017  SimProf: A Sampling Framework for Data Analytic Workloads.
Jen-Cheng Huang, Lifeng Nai, Pranith Kumar, Hyojong Kim, Hyesoon Kim
2017  Similarity Search on Automata Processors.
Vincent T. Lee, Justin Kotalik, Carlo C. del Mundo, Armin Alaghi, Luis Ceze, Mark Oskin
2017  SlimSell: A Vectorizable Graph Representation for Breadth-First Search.
Maciej Besta, Florian Marending, Edgar Solomonik, Torsten Hoefler
2017  Sparse Tensor Factorization on Many-Core Processors with High-Bandwidth Memory.
Shaden Smith, Jongsoo Park, George Karypis
2017  The Reverse Cuthill-McKee Algorithm in Distributed-Memory.
Ariful Azad, Mathias Jacquelin, Aydin Buluç, Esmond G. Ng
2017  The SEPO Model of Computation to Enable Larger-Than-Memory Hash Tables for GPU-Accelerated Big Data Analytics.
Reza Mokhtari, Michael Stumm
2017  Tight Load Balancing Via Randomized Local Search.
Petra Berenbrink, Peter Kling, Christopher Liaw, Abbas Mehrabian
2017  Toucan - A Translator for Communication Tolerant MPI Applications.
Sergio M. Martin, Marsha J. Berger, Scott B. Baden
2017  Towards Highly scalable Ab Initio Molecular Dynamics (AIMD) Simulations on the Intel Knights Landing Manycore Processor.
Mathias Jacquelin, Wibe A. de Jong, Eric J. Bylaska
2017  Transparent Caching for RMA Systems.
Salvatore Di Girolamo, Flavio Vella, Torsten Hoefler
2017  When Neurons Fail.
El Mahdi El Mhamdi, Rachid Guerraoui
2017  swDNN: A Library for Accelerating Deep Learning Applications on Sunway TaihuLight.
Jiarui Fang, Haohuan Fu, Wenlai Zhao, Bingwei Chen, Weijie Zheng, Guangwen Yang