IPDPS A

110 papers

Year Title / Authors
2021 12 Ways to Fool the Masses with Irreproducible Results.
Lorena A. Barba
2021 35th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2021, Portland, OR, USA, May 17-21, 2021
2021 A Hybrid Scheduling Scheme for Parallel Loops.
Aaron Handleman, Arthur G. Rattew, I-Ting Angelina Lee, Tao B. Schardl
2021 A Multi-GPU Design for Large Size Cryo-EM 3D Reconstruction.
Zihao Wang, Xiaohua Wan, Zhiyong Liu, Qianshuo Fan, Fa Zhang, Guangming Tan
2021 A Tale of Two C's: Convergence and Composability.
Ilkay Altintas
2021 ARBALEST: Dynamic Detection of Data Mapping Issues in Heterogeneous OpenMP Applications.
Lechen Yu, Joachim Protze, Oscar R. Hernandez, Vivek Sarkar
2021 Accelerating Distributed-Memory Autotuning via Statistical Analysis of Execution Paths.
Edward Hutter, Edgar Solomonik
2021 Accelerating Multigrid-based Hierarchical Scientific Data Refactoring on GPUs.
Jieyang Chen, Lipeng Wan, Xin Liang, Ben Whitney, Qing Liu, David Pugmire, Nicholas Thompson, Jong Youl Choi, Matthew Wolf, Todd S. Munson, Ian T. Foster, Scott Klasky
2021 Accelerating non-power-of-2 size Fourier transforms with GPU Tensor Cores.
Louis Pisha, Lukasz Ligowski
2021 Adaptive Spatially Aware I/O for Multiresolution Particle Data Layouts.
Will Usher, Xuan Huang, Steve Petruzza, Sidharth Kumar, Stuart R. Slattery, Samuel Temple Reeve, Feng Wang, Chris R. Johnson, Valerio Pascucci
2021 AlphaR: Learning-Powered Resource Management for Irregular, Dynamic Microservice Graph.
Xiaofeng Hou, Chao Li, Jiacheng Liu, Lu Zhang, Shaolei Ren, Jingwen Leng, Quan Chen, Minyi Guo
2021 An In-Depth Analysis of Distributed Training of Deep Neural Networks.
Yun-Yong Ko, Kibong Choi, Jiwon Seo, Sang-Wook Kim
2021 Arbitration Policies for On-Demand User-Level I/O Forwarding on HPC Platforms.
Jean Luca Bez, Alberto Miranda, Ramon Nou, Francieli Zanon Boito, Toni Cortes, Philippe O. A. Navaux
2021 Argus: Efficient Job Scheduling in RDMA-assisted Big Data Processing.
Sijie Wu, Hanhua Chen, Yonghui Wang, Hai Jin
2021 Astra: Autonomous Serverless Analytics with Cost-Efficiency and QoS-Awareness.
Jananie Jarachanthan, Li Chen, Fei Xu, Bo Li
2021 AuTraScale: An Automated and Transfer Learning Solution for Streaming System Auto-Scaling.
Liang Zhang, Wenli Zheng, Chao Li, Yao Shen, Minyi Guo
2021 Automatic Graph Partitioning for Very Large-scale Deep Learning.
Masahiro Tanaka, Kenjiro Taura, Toshihiro Hanawa, Kentaro Torisawa
2021 BiPS: Hotness-aware Bi-tier Parameter Synchronization for Recommendation Models.
Qiming Zheng, Quan Chen, Kaihao Bai, Huifeng Guo, Yong Gao, Xiuqiang He, Minyi Guo
2021 Byzantine Agreement with Unknown Participants and Failures.
Pankaj Khanchandani, Roger Wattenhofer
2021 Byzantine Dispersion on Graphs.
Anisur Rahaman Molla, Kaushik Mondal, William K. Moses Jr.
2021 CAGC: A Content-aware Garbage Collection Scheme for Ultra-Low Latency Flash-based SSDs.
Suzhen Wu, Chunfeng Du, Haijun Li, Hong Jiang, Zhirong Shen, Bo Mao
2021 CBNet: Minimizing Adjustments in Concurrent Demand-Aware Tree Networks.
Otávio Augusto de Oliveira Souza, Olga Goussevskaia, Stefan Schmid
2021 CTXBack: Enabling Low Latency GPU Context Switching via Context Flashback.
Zhuoran Ji, Cho-Li Wang
2021 Characterizing Small-Scale Matrix Multiplications on ARMv8-based Many-Core Architectures.
Weiling Yang, Jianbin Fang, Dezun Dong
2021 Code Generation for Room Acoustics Simulations with Complex Boundary Conditions.
Larisa Stoltzfus, Brian Hamilton, Michel Steuwer, Lu Li, Christophe Dubach
2021 Combining XOR and Partner Checkpointing for Resilient Multilevel Checkpoint/Restart.
Masoud Gholami, Florian Schintke
2021 Communication-Avoiding and Memory-Constrained Sparse Matrix-Matrix Multiplication at Extreme Scale.
Md Taufique Hussain, Oguz Selvitopi, Aydin Buluç, Ariful Azad
2021 Consistent Lock-free Parallel Stochastic Gradient Descent for Fast and Stable Convergence.
Karl Bäckström, Ivan Walulya, Marina Papatriantafilou, Philippas Tsigas
2021 Cori: Dancing to the Right Beat of Periodic Data Movements over Hybrid Memory Systems.
Thaleia Dimitra Doudali, Daniel Zahka, Ada Gavrilovska
2021 Correlation-wise Smoothing: Lightweight Knowledge Extraction for HPC Monitoring Data.
Alessio Netti, Daniele Tafani, Michael Ott, Martin Schulz
2021 Covirt: Lightweight Fault Isolation and Resource Protection for Co-Kernels.
Nicholas Gordon, John R. Lange
2021 DAG-based Scheduling with Resource Sharing for Multi-task Applications in a Polyglot GPU Runtime.
Alberto Parravicini, Arnaud Delamare, Marco Arnaboldi, Marco D. Santambrogio
2021 DSXplore: Optimizing Convolutional Neural Networks via Sliding-Channel Convolutions.
Yuke Wang, Boyuan Feng, Yufei Ding
2021 DUET: A Compiler-Runtime Subgraph Scheduling Approach for Tensor Programs on a Coupled CPU-GPU Architecture.
Minjia Zhang, Zehua Hu, Mingqin Li
2021 Dancing in the Dark: Profiling for Tiered Memory.
Jinyoung Choi, Sergey Blagodurov, Hung-Wei Tseng
2021 Decentralized Low-Latency Task Scheduling for Ad-Hoc Computing.
Janick Edinger, Martin Breitbach, Niklas Gabrisch, Dominik Schäfer, Christian Becker, Amr Rizk
2021 Deep Reinforcement Agent for Scheduling in HPC.
Yuping Fan, Zhiling Lan, J. Taylor Childers, Paul Rich, William E. Allcock, Michael E. Papka
2021 Demystifying GPU Reliability: Comparing and Combining Beam Experiments, Fault Simulation, and Profiling.
Fernando Fernandes dos Santos, Siva Kumar Sastry Hari, Pedro Martins Basso, Luigi Carro, Paolo Rech
2021 Demystifying GPU UVM Cost with Deep Runtime and Workload Analysis.
Tyler N. Allen, Rong Ge
2021 Designing High-Performance MPI Libraries with On-the-fly Compression for Modern GPU Clusters.
Qinghua Zhou, C. Chu, N. S. Kumar, Pouya Kousha, Seyedeh Mahdieh Ghazimirsaeed, Hari Subramoni, Dhabaleswar K. Panda
2021 Detecting Malicious Model Updates from Federated Learning on Conditional Variational Autoencoder.
Zhipin Gu, Yuexiang Yang
2021 Distributed Training of Embeddings using Graph Analytics.
Gurbinder Gill, Roshan Dathathri, Saeed Maleki, Madan Musuvathi, Todd Mytkowicz, Olli Saarikivi
2021 Distributed-Memory k-mer Counting on GPUs.
Israt Nisa, Prashant Pandey, Marquita Ellis, Leonid Oliker, Aydin Buluç, Katherine A. Yelick
2021 Distributed-memory multi-GPU block-sparse tensor contraction for electronic structure.
Thomas Hérault, Yves Robert, George Bosilca, Robert J. Harrison, Cannada A. Lewis, Edward F. Valeev, Jack J. Dongarra
2021 EAGLE: Expedited Device Placement with Automatic Grouping for Large Models.
Hao Lan, Li Chen, Baochun Li
2021 Efficient Algorithms for Encrypted All-gather Operation.
Mehran Sadeghi Lahijani, Abu Naser, Cong Wu, Mohsen Gavahi, Viet Tung Hoang, Zhi Wang, Xin Yuan
2021 Efficient Distributed Algorithms in the k-machine model via PRAM Simulations.
John Augustine, Kishore Kothapalli, Gopal Pandurangan
2021 Efficient Video Captioning on Heterogeneous System Architectures.
Horng-Ruey Huang, Ding-Yong Hong, Jan-Jan Wu, Pangfeng Liu, Wei-Chung Hsu
2021 Efficient parallel CP decomposition with pairwise perturbation and multi-sweep dimension tree.
Linjian Ma, Edgar Solomonik
2021 Euler Meets GPU: Practical Graph Algorithms with Theoretical Guarantees.
Adam Polak, Adrian Siwiec, Michal Stobierski
2021 Extending Sparse Tensor Accelerators to Support Multiple Compression Formats.
Eric Qin, Geonhwa Jeong, William Won, Sheng-Chun Kao, Hyoukjun Kwon, Sudarshan Srinivasan, Dipankar Das, Gordon Euhyun Moon, Sivasankaran Rajamanickam, Tushar Krishna
2021 Extremely Fast and Energy Efficient One-way Wave Equation Migration on GPU-based heterogeneous architecture.
Long Qu, Loris Lucido, Marie Bonnasse-Gahot, Pascal Vezolle, Diego Klahr
2021 F-Write: Fast RDMA-supported Writes in Erasure-coded In-memory Clusters.
Bin Xu, Jianzhong Huang, Qiang Cao, Xiao Qin, Ping Xie
2021 Facilitating Data Discovery for Large-scale Science Facilities using Knowledge Networks.
Yubo Qin, Ivan Rodero, Manish Parashar
2021 Finer-LRU: A Scalable Page Management Scheme for HPC Manycore Architectures.
Jiwoo Bang, Chungyong Kim, Sunggon Kim, Qichen Chen, Cheongjun Lee, Eun-Kyu Byun, Jaehwan Lee, Hyeonsang Eom
2021 From Parallelization to Customization - Challenges and Opportunities.
Jason Cong
2021 FusedMM: A Unified SDDMM-SpMM Kernel for Graph Embedding and Graph Neural Networks.
Md. Khaledur Rahman, Majedul Haque Sujon, Ariful Azad
2021 High Performance Streaming Tensor Decomposition.
Yongseok Soh, Patrick Flick, Xing Liu, Shaden Smith, Fabio Checconi, Fabrizio Petrini, Jee W. Choi
2021 High-Level FPGA Accelerator Design for Structured-Mesh-Based Explicit Numerical Solvers.
Kamalakkannan Kamalavasan, Gihan R. Mudalige, István Z. Reguly, Suhaib A. Fahmy
2021 High-Level Synthesis of Parallel Specifications Coupling Static and Dynamic Controllers.
Vito Giovanni Castellana, Antonino Tumeo, Fabrizio Ferrandi
2021 High-Performance Spectral Element Methods on Field-Programmable Gate Arrays: Implementation, Evaluation, and Future Projection.
Martin Karp, Artur Podobas, Niclas Jansson, Tobias Kenter, Christian Plessl, Philipp Schlatter, Stefano Markidis
2021 Improving checkpointing intervals by considering individual job failure probabilities.
Alvaro Frank, Manuel Baumgartner, Reza Salkhordeh, André Brinkmann
2021 Interpreting Write Performance of Supercomputer I/O Systems with Regression Models.
Bing Xie, Zilong Tan, Philip H. Carns, Jeffrey S. Chase, Kevin Harms, Jay F. Lofstead, Sarp Oral, Sudharshan S. Vazhkudai, Feiyi Wang
2021 Introducing Application Awareness Into a Unified Power Management Stack.
Daniel C. Wilson, Siddhartha Jana, Aniruddha Marathe, Stephanie Brink, Christopher M. Cantalupo, Diana R. Guttman, Brad Geltz, Lowren H. Lawson, Asma H. Al-Rawi, Ali Mohammad, Fuat Keceli, Federico Ardanaz, Jonathan M. Eastep, Ayse K. Coskun
2021 Is Asymptotic Cost Analysis Useful in Developing Practical Parallel Algorithms.
Guy E. Blelloch
2021 Jigsaw: A Slice-and-Dice Approach to Non-uniform FFT Acceleration for MRI Image Reconstruction.
Brendan L. West, Jeffrey A. Fessler, Thomas F. Wenisch
2021 Leveraging PaRSEC Runtime Support to Tackle Challenging 3D Data-Sparse Matrix Problems.
Qinglei Cao, Yu Pei, Kadir Akbudak, George Bosilca, Hatem Ltaief, David E. Keyes, Jack J. Dongarra
2021 Lightweight Function Monitors for Fine-Grained Management in Large Scale Python Applications.
Tim Shaffer, Zhuozhao Li, Ben Tovar, Yadu N. Babuji, T. J. Dasso, Zoe Surma, Kyle Chard, Ian T. Foster, Douglas Thain
2021 Matrix Engines for High Performance Computing: A Paragon of Performance or Grasping at Straws?
Jens Domke, Emil Vatai, Aleksandr Drozd, Peng Chen, Yosuke Oyama, Lingqi Zhang, Shweta Salaria, Daichi Mukunoki, Artur Podobas, Mohamed Wahib, Satoshi Matsuoka
2021 Max-Stretch Minimization on an Edge-Cloud Platform.
Anne Benoit, Redouane Elghazi, Yves Robert
2021 MultiLogVC: Efficient Out-of-Core Graph Processing Framework for Flash Storage.
Kiran Kumar Matam, Hanieh Hashemi, Murali Annavaram
2021 Multiplicative Weights Algorithms for Parallel Automated Software Repair.
Joseph Renzullo, Westley Weimer, Stephanie Forrest
2021 NVMe-CR: A Scalable Ephemeral Storage Runtime for Checkpoint/Restart with NVMe-over-Fabrics.
Shashank Gugnani, Tianxi Li, Xiaoyi Lu
2021 Noise-Resilient Empirical Performance Modeling with Deep Neural Networks.
Marcus Ritter, Alexander Geiß, Johannes Wehrstein, Alexandru Calotoiu, Thorsten Reimann, Torsten Hoefler, Felix Wolf
2021 Nowa: A Wait-Free Continuation-Stealing Concurrency Platform.
Florian Schmaus, Nicolas Pfeiffer, Wolfgang Schröder-Preikschat, Timo Hönig, Jörg Nolte
2021 Optimal Task Assignment for Heterogeneous Federated Learning Devices.
Laércio Lima Pilla
2021 Optimizing Memory-Compute Colocation for Irregular Applications on a Migratory Thread Architecture.
Thomas B. Rolinger, Christopher D. Krieger, Alan Sussman
2021 Optimizing Performance for Open-Channel SSDs in Cloud Storage System.
Xiaoyi Zhang, Feng Zhu, Shu Li, Kun Wang, Wei Xu, Dengcai Xu
2021 PALM: Progress- and Locality-Aware Adaptive Task Migration for Efficient Thread Packing.
Jinsu Park, Seongbeom Park, Myeonggyun Han, Woongki Baek
2021 Parallel String Graph Construction and Transitive Reduction for De Novo Genome Assembly.
Giulia Guidi, Oguz Selvitopi, Marquita Ellis, Leonid Oliker, Katherine A. Yelick, Aydin Buluç
2021 Pase: Parallelization Strategies for Efficient DNN Training.
Venmugil Elango
2021 Performance Analysis of Scientific Computing Workloads on General Purpose TEEs.
Ayaz Akram, Anna Giannakou, Venkatesh Akella, Jason Lowe-Power, Sean Peisert
2021 Performance Evaluation of Adaptive Routing on Dragonfly-based Production Systems.
Sudheer Chunduri, Kevin Harms, Taylor L. Groves, Peter Mendygral, Justs Zarins, Michèle Weiland, Yasaman Ghadar
2021 Performance-Portable Graph Coarsening for Efficient Multilevel Graph Analysis.
Michael S. Gilbert, Seher Acer, Erik G. Boman, Kamesh Madduri, Sivasankaran Rajamanickam
2021 Plex: Scaling Parallel Lexing with Backtrack-Free Prescanning.
Le Li, Shigeyuki Sato, Qiheng Liu, Kenjiro Taura
2021 QPR: Quantizing PageRank with Coherent Shared Memory Accelerators.
Abdullah T. Mughrabi, Mohannad Ibrahim, Gregory T. Byrd
2021 QoS-Aware and Resource Efficient Microservice Deployment in Cloud-Edge Continuum.
Kaihua Fu, Wei Zhang, Quan Chen, Deze Zeng, Xin Peng, Wenli Zheng, Minyi Guo
2021 RVMA: Remote Virtual Memory Access.
Ryan E. Grant, Michael J. Levenhagen, Matthew G. F. Dosanjh, Patrick M. Widener
2021 Rack-Scaling: An efficient rack-based redistribution method to accelerate the scaling of cloud disk arrays.
Zhehan Lin, Hanchen Guo, Chentao Wu, Jie Li, Guangtao Xue, Minyi Guo
2021 Rank Position Forecasting in Car Racing.
Bo Peng, Jiayu Li, Selahattin Akkas, Takuya Araki, Ohno Yoshiyuki, Judy Qiu
2021 Redesigning Peridigm on SIMT Accelerators for High-performance Peridynamics Simulations.
Xinyuan Li, Huang Ye, Jian Zhang
2021 Revisiting Huffman Coding: Toward Extreme Performance on Modern GPU Architectures.
Jiannan Tian, Cody Rivera, Sheng Di, Jieyang Chen, Xin Liang, Dingwen Tao, Franck Cappello
2021 SNOW Revisited: Understanding When Ideal READ Transactions Are Possible.
Kishori M. Konwar, Wyatt Lloyd, Haonan Lu, Nancy A. Lynch
2021 SRNoC: A Statically-Scheduled Circuit-Switched Superconducting Race Logic NoC.
George Michelogiannakis, Darren Lyles, Patricia Gonzalez-Guerrero, Meriam Gay Bautista, Dilip Vasudevan, Anastasiia Butko
2021 SUPER: SUb-Graph Parallelism for TransformERs.
Arpan Jain, Tim Moon, Tom Benson, Hari Subramoni, Sam Adé Jacobs, Dhabaleswar K. Panda, Brian Van Essen
2021 SYMBIOSYS: A Methodology for Performance Analysis of Composable HPC Data Services.
Srinivasan Ramesh, Allen D. Malony, Philip H. Carns, Robert B. Ross, Matthieu Dorier, Jérome Soumagne, Shane Snyder
2021 Scalable Epidemiological Workflows to Support COVID-19 Planning and Response.
Dustin Machi, Parantapa Bhattacharya, Stefan Hoops, Jiangzhuo Chen, Henning S. Mortveit, Srinivasan Venkatramanan, Bryan L. Lewis, Mandy L. Wilson, Arindam Fadikar, Tom Maiden, Christopher L. Barrett, Madhav V. Marathe
2021 Scaling Out a Combinatorial Algorithm for Discovering Carcinogenic Gene Combinations to Thousands of GPUs.
Sajal Dash, Qais Al-Hajri, Wu-chun Feng, Harold R. Garner, Ramu Anandakrishnan
2021 Scaling Sparse Matrix Multiplication on CPU-GPU Nodes.
Yang Xia, Peng Jiang, Gagan Agrawal, Rajiv Ramnath
2021 Speculative Parallel Reverse Cuthill-McKee Reordering on Multi- and Many-core Architectures.
Daniel Mlakar, Martin Winter, Mathias Parger, Markus Steinberger
2021 Spray: Sparse Reductions of Arrays in OpenMP.
Jan Hückelheim, Johannes Doerfert
2021 Systemic Assessment of Node Failures in HPC Production Platforms.
Anwesha Das, Frank Mueller, Barry Rountree
2021 Temporal blocking of finite-difference stencil operators with sparse "off-the-grid" sources.
George Bisbas, Fabio Luporini, Mathias Louboutin, Rhodri Nelson, Gerard J. Gorman, Paul H. J. Kelly
2021 TileSpMV: A Tiled Algorithm for Sparse Matrix-Vector Multiplication on GPUs.
Yuyao Niu, Zhengyang Lu, Meichen Dong, Zhou Jin, Weifeng Liu, Guangming Tan
2021 Towards Internet-Scale Convolutional Root-Cause Analysis with DIAGNET.
Loïck Bonniot, Christoph Neumann, François Taïani
2021 Towards Practical Cloud Offloading for Low-cost Ground Vehicle Workloads.
Yuan Xu, Tianwei Zhang, Jimin Han, Sa Wang, Yungang Bao
2021 Transparent I/O-Aware GPU Virtualization for Efficient Resource Consolidation.
Nelson Mimura Gonzalez, Tonia Elengikal
2021 Virtual-Link: A Scalable Multi-Producer Multi-Consumer Message Queue Architecture for Cross-Core Communication.
Qinzhe Wu, Jonathan Beard, Ashen Ekanayake, Andreas Gerstlauer, Lizy K. John
2021 xBGAS: A Global Address Space Extension on RISC-V for High Performance Computing.
Xi Wang, John D. Leidel, Brody Williams, Alan Ehret, Miguel Mark, Michel A. Kinsy, Yong Chen
2021 zMesh: Exploring Application Characteristics to Improve Lossy Compression Ratio for Adaptive Mesh Refinement.
Huizhang Luo, Junqi Wang, Qing Liu, Jieyang Chen, Scott Klasky, Norbert Podhorszki