| 2018 | 2018 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2018, Vancouver, BC, Canada, May 21-25, 2018 |
| 2018 | A Communication-Avoiding 3D LU Factorization Algorithm for Sparse Matrices. Piyush Sao, Xiaoye Sherry Li, Richard W. Vuduc |
| 2018 | A Dynamic Hash Table for the GPU. Saman Ashkiani, Martin Farach-Colton, John D. Owens |
| 2018 | A Fast Scalable Implicit Solver with Concentrated Computation for Nonlinear Time-Evolution Problems on Low-Order Unstructured Finite Elements. Tsuyoshi Ichimura, Kohei Fujita, Masashi Horikoshi, Larry Meadows, Kengo Nakajima, Takuma Yamaguchi, Kentaro Koyama, Hikaru Inoue, Akira Naruse, Keisuke Katsushima, Muneo Hori, Lalith Maddegedara |
| 2018 | A Fast and Massively-Parallel Inverse Solver for Multiple-Scattering Tomographic Image Reconstruction. Mert Hidayetoglu, Carl Pearson, Izzat El Hajj, Levent Gürel, Weng Cho Chew, Wen-mei W. Hwu |
| 2018 | A Fill Estimation Algorithm for Sparse Matrices and Tensors in Blocked Formats. Willow Ahrens, Helen Xu, Nicholas Schiefer |
| 2018 | A Lightweight Communication Runtime for Distributed Graph Analytics. Hoang-Vu Dang, Roshan Dathathri, Gurbinder Gill, Alex Brooks, Nikoli Dryden, Andrew Lenharth, Loc Hoang, Keshav Pingali, Marc Snir |
| 2018 | A Migratory Heterogeneity-Aware Data Layout Scheme for Parallel File Systems. Shuibing He, Xian-He Sun, Yang Wang, Chengzhong Xu |
| 2018 | A New GPU Algorithm to Compute a Level Set-Based Analysis for the Parallel Solution of Sparse Triangular Systems. Ernesto Dufrechou, Pablo Ezzatti |
| 2018 | A Parallel Algorithm for Bayesian Network Inference Using Arithmetic Circuits. Md. Vasimuddin, Sriram P. Chockalingam, Srinivas Aluru |
| 2018 | A Set-Aware Key-Value Store on Shingled Magnetic Recording Drives with Dynamic Band. Ting Yao, Zhi-hu Tan, Jiguang Wan, Ping Huang, Yiwen Zhang, Changsheng Xie, Xubin He |
| 2018 | An Energy-Efficient Single-Source Shortest Path Algorithm. Sara Karamati, Jeffrey S. Young, Richard W. Vuduc |
| 2018 | Analyzing Resource Trade-offs in Hardware Overprovisioned Supercomputers. Ryuichi Sakamoto, Tapasya Patki, Thang Cao, Masaaki Kondo, Koji Inoue, Masatsugu Ueda, Daniel A. Ellsworth, Barry Rountree, Martin Schulz |
| 2018 | Application Codesign of Near-Data Processing for Similarity Search. Vincent T. Lee, Amrita Mazumdar, Carlo C. del Mundo, Armin Alaghi, Luis Ceze, Mark Oskin |
| 2018 | Architectural Support for Unlimited Memory Versioning and Renaming. Eran Gilad, Tehila Mayzels, Elazar Raab, Mark Oskin, Yoav Etsion |
| 2018 | Auto-tuning Streamed Applications on Intel Xeon Phi. Peng Zhang, Jianbin Fang, Tao Tang, Canqun Yang, Zheng Wang |
| 2018 | Avoiding Synchronization in First-Order Methods for Sparse Convex Optimization. Aditya Devarakonda, Kimon Fountoulakis, James Demmel, Michael W. Mahoney |
| 2018 | BabelFlow: An Embedded Domain Specific Language for Parallel Analysis and Visualization. Steve Petruzza, Sean Treichler, Valerio Pascucci, Peer-Timo Bremer |
| 2018 | Beyond Binary Search: Parallel In-Place Construction of Implicit Search Tree Layouts. Kyle Berney, Henri Casanova, Alyssa Higuchi, Ben Karsin, Nodari Sitchinava |
| 2018 | BitFlow: Exploiting Vector Parallelism for Binary Neural Networks on CPU. Yuwei Hu, Jidong Zhai, Dinghua Li, Yifan Gong, Yuhao Zhu, Wei Liu, Lei Su, Jiangming Jin |
| 2018 | Blocking Optimization Techniques for Sparse Tensor Computation. Jee W. Choi, Xing Liu, Shaden Smith, Tyler A. Simon |
| 2018 | CIAO: Cache Interference-Aware Throughput-Oriented Architecture and Scheduling for GPUs. Jie Zhang, Shuwen Gao, Nam Sung Kim, Myoungsoo Jung |
| 2018 | COMPI: Concolic Testing for MPI Applications. Hongbo Li, Sihuan Li, Zachary Benavides, Zizhong Chen, Rajiv Gupta |
| 2018 | CTA-Aware Prefetching and Scheduling for GPU. Gunjae Koo, Hyeran Jeon, Zhenhong Liu, Nam Sung Kim, Murali Annavaram |
| 2018 | Cataloging the Visible Universe Through Bayesian Inference at Petascale. Jeffrey Regier, Kiran Pamnany, Keno Fischer, Andreas Noack, Maximilian Lam, Jarrett Revels, Steve Howard, Ryan Giordano, David Schlegel, Jon McAuliffe, Rollin C. Thomas, Prabhat |
| 2018 | Chameleon: An Adaptive Wear Balancer for Flash Clusters. Nannan Zhao, Ali Anwar, Yue Cheng, Mohammed Salman, Daping Li, Jiguang Wan, Changsheng Xie, Xubin He, Feiyi Wang, Ali Raza Butt |
| 2018 | Chameleon: Online Clustering of MPI Program Traces. Amir Bahmani, Frank Mueller |
| 2018 | Characterizing Scheduling Delay for Low-Latency Data Analytics Workloads. Wei Chen, Aidi Pi, Shaoqi Wang, Xiaobo Zhou |
| 2018 | Communication Efficient Checking of Big Data Operations. Lorenz Hübschle-Schneider, Peter Sanders |
| 2018 | Communication Lower Bounds for Matricized Tensor Times Khatri-Rao Product. Grey Ballard, Nicholas Knight, Kathryn Rouse |
| 2018 | Communication-Free Massively Distributed Graph Generation. Daniel Funke, Sebastian Lamm, Peter Sanders, Christian Schulz, Darren Strash, Moritz von Looz |
| 2018 | Complete Visitability for Autonomous Robots on Graphs. Aisha Aljohani, Pavan Poudel, Gokarna Sharma |
| 2018 | Convergence Models and Surprising Results for the Asynchronous Jacobi Method. Jordi Wolfson-Pou, Edmond Chow |
| 2018 | CoolPIM: Thermal-Aware Source Throttling for Efficient PIM Instruction Offloading. Lifeng Nai, Ramyad Hadidi, He Xiao, Hyojong Kim, Jaewoong Sim, Hyesoon Kim |
| 2018 | Cudele: An API and Framework for Programmable Consistency and Durability in a Global Namespace. Michael A. Sevilla, Ivo Jimenez, Noah Watkins, Jeff LeFevre, Peter Alvaro, Shel Finkelstein, Patrick Donnelly, Carlos Maltzahn |
| 2018 | Designing Efficient Shared Address Space Reduction Collectives for Multi-/Many-cores. Jahanzeb Maqbool Hashmi, Sourav Chakraborty, Mohammadreza Bayatpour, Hari Subramoni, Dhabaleswar K. Panda |
| 2018 | Development and Application of a Hybrid Programming Environment on an ARM/DSP System for High Performance Computing. Gaurav Mitra, Jonathan Bohmann, Ian Lintault, Alistair P. Rendell |
| 2018 | Distributed Louvain Algorithm for Graph Community Detection. Sayan Ghosh, Mahantesh Halappanavar, Antonino Tumeo, Ananth Kalyanaraman, Hao Lu, Daniel G. Chavarría-Miranda, Arif Khan, Assefaw Hadish Gebremedhin |
| 2018 | Distributed Symmetry Breaking in Graphs with Bounded Diversity. Leonid Barenboim, Tzalik Maimon |
| 2018 | Do Developers Understand IEEE Floating Point? Peter A. Dinda, Conor Hetland |
| 2018 | Efficient Gradient Boosted Decision Tree Training on GPUs. Zeyi Wen, Bingsheng He, Ramamohanarao Kotagiri, Shengliang Lu, Jiashuai Shi |
| 2018 | Efficient Solving of Scan Primitive on Multi-GPU Systems. Adrián Pérez Diéguez, Margarita Amor, Ramon Doallo, Akira Nukada, Satoshi Matsuoka |
| 2018 | Efficient, Parallel At-scale Correlation Analysis for Atom Probe Tomography on Hybrid Architectures. Hao Lu, Sudip K. Seal, Gregory Muzyn, Wei Guo, Jonathan D. Poplawsky |
| 2018 | Empowering Flexible and Scalable High Performance Architectures with Embedded Photonics. Keren Bergman |
| 2018 | Evaluating Active Learning with Cost and Memory Awareness. Dmitry Duplyakin, Jed Brown, Donna Calhoun |
| 2018 | Evaluating the Performance and Cost of Accelerating Seismic Processing with CUDA, OpenCL, OpenACC, and OpenMP. Tiago Lobato Gimenes, Flavia Pisani, Edson Borin |
| 2018 | Experimental Design of Work Chunking for Graph Algorithms on High Bandwidth Memory Architectures. George M. Slota, Sivasankaran Rajamanickam |
| 2018 | GC-Aware Request Steering with Improved Performance and Reliability for SSD-Based RAIDs. Suzhen Wu, Weidong Zhu, Guixin Liu, Hong Jiang, Bo Mao |
| 2018 | GPU Data Access on Complex Geometries for D3Q19 Lattice Boltzmann Method. Gregory Herschlag, Seyong Lee, Jeffrey S. Vetter, Amanda Randles |
| 2018 | GPU LSM: A Dynamic Dictionary Data Structure for the GPU. Saman Ashkiani, Shengren Li, Martin Farach-Colton, Nina Amenta, John D. Owens |
| 2018 | GPU-Accelerated Large-Scale Genome Assembly. Sayan Goswami, Kisung Lee, Shayan Shams, Seung-Jong Park |
| 2018 | GreenSprint: Effective Computational Sprinting in Green Data Centers. Haoran Cai, Xu Zhou, Qiang Cao, Hong Jiang, Feng Sheng, Xiandong Qi, Jie Yao, Changsheng Xie, Liang Xiao, Liang Gu |
| 2018 | Hardware Transactional Memory Meets Memory Persistency. Daniel Castro, Paolo Romano, João Pedro Barreto |
| 2018 | Harnessing the Power of Many: Extensible Toolkit for Scalable Ensemble Applications. Vivek Balasubramanian, Matteo Turilli, Weiming Hu, Matthieu Lefebvre, Wenjie Lei, Ryan T. Modrak, Guido Cervone, Jeroen Tromp, Shantenu Jha |
| 2018 | Highly Efficient Compensation-Based Parallelism for Wavefront Loops on GPUs. Kaixi Hou, Hao Wang, Wu-chun Feng, Jeffrey S. Vetter, Seyong Lee |
| 2018 | HybridPass: Hybrid Scheduling for Mixed Flows in Datacenter Networks. Bo Peng, Jianguo Yao, Zhengwei Qi, Haibing Guan |
| 2018 | Implicit Decomposition for Write-Efficient Connectivity Algorithms. Naama Ben-David, Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, Yan Gu, Charles McGuffey, Julian Shun |
| 2018 | Improving Network Throughput with Global Communication Reordering. Wim Lavrijsen, Costin Iancu, Xing Pan |
| 2018 | Indigo: A Domain-Specific Language for Fast, Portable Image Reconstruction. Michael B. Driscoll, Benjamin Brock, Frank Ong, Jonathan I. Tamir, Hsiou-Yuan Liu, Michael Lustig, Armando Fox, Katherine A. Yelick |
| 2018 | Intra-Cluster Coalescing to Reduce GPU NoC Pressure. Lu Wang, Xia Zhao, David R. Kaeli, Zhiying Wang, Lieven Eeckhout |
| 2018 | Joint Server and Network Energy Saving in Data Centers for Latency-Sensitive Applications. Liang Zhou, Chih-Hsun Chou, Laxmi N. Bhuyan, K. K. Ramakrishnan, Daniel Wong |
| 2018 | LALCA: Locality-Aware Lock Contention Avoidance for NVMe-Based Scale-out Storage System. Myoungwon Oh, Sejin Park, Jugwan Eom, Seungmin Kim, Sangjae Kim, Kang-Won Lee, Heon Y. Yeom |
| 2018 | Large Bandwidth-Efficient FFTs on Multicore and Multi-socket Systems. Doru-Thom Popovici, Tze Meng Low, Franz Franchetti |
| 2018 | Lattice H-Matrices on Distributed-Memory Systems. Akihiro Ida |
| 2018 | Level-Spread: A New Job Allocation Policy for Dragonfly Networks. Yijia Zhang, Ozan Tuncer, Fulya Kaplan, Katzalin Olcoz, Vitus J. Leung, Ayse K. Coskun |
| 2018 | Lightweight MPI Communicators with Applications to Perfectly Balanced Quicksort. Michael Axtmann, Armin Wiebigke, Peter Sanders |
| 2018 | Local Mixing Time: Distributed Computation and Applications. Anisur Rahaman Molla, Gopal Pandurangan |
| 2018 | MIDAS: Multilinear Detection at Scale. Saliya Ekanayake, Jose Cadena, Udayanga Wickramasinghe, Anil Vullikanti |
| 2018 | MOCA: Memory Object Classification and Allocation in Heterogeneous Memory Systems. Aditya Narayan, Tiansheng Zhang, Shaizeen Aga, Satish Narayanasamy, Ayse K. Coskun |
| 2018 | Millipede: Die-Stacked Memory Optimizations for Big Data Machine Learning Analytics. Nitin, Mithuna Thottethodi, T. N. Vijaykumar |
| 2018 | Mitigating Traffic-Based Side Channel Attacks in Bandwidth-Efficient Cloud Storage. Pengfei Zuo, Yu Hua, Cong Wang, Wen Xia, Shunde Cao, Yukun Zhou, Yuanyuan Sun |
| 2018 | Online Tuning of Parallelism Degree in Parallel Nesting Transactional Memory. Jingna Zeng, Paolo Romano, João Pedro Barreto, Luís E. T. Rodrigues, Seif Haridi |
| 2018 | Optimizing Parallel Graph Connectivity Computation via Subgraph Sampling. Michael Sutton, Tal Ben-Nun, Amnon Barak |
| 2018 | Overhead-Conscious Format Selection for SpMV-Based Applications. Yue Zhao, Weijie Zhou, Xipeng Shen, Graham Yiu |
| 2018 | PADDLE: Performance Analysis Using a Data-Driven Learning Environment. Jayaraman J. Thiagarajan, Rushil Anirudh, Bhavya Kailkhura, Nikhil Jain, Tanzima Z. Islam, Abhinav Bhatele, Jae-Seung Yeom, Todd Gamblin |
| 2018 | Parallel Algorithms Through Approximation: B-Edge Cover. S. M. Ferdous, Arif M. Khan, Alex Pothen |
| 2018 | Parallel Scheduling of DAGs under Memory Constraints. Loris Marchal, Hanna Nagy, Bertrand Simon, Frédéric Vivien |
| 2018 | Performance Isolation of Data-Intensive Scale-out Applications in a Multi-tenant Cloud. Palden Lama, Shaoqi Wang, Xiaobo Zhou, Dazhao Cheng |
| 2018 | Performance and Accuracy Trade-offs of HPC Application Modeling and Simulation. Zhou Tong, Xin Yuan, Scott Pakin, Michael Lang |
| 2018 | Performance and Scalability of Lightweight Multi-kernel Based Operating Systems. Balazs Gerofi, Rolf Riesen, Masamichi Takagi, Taisuke Boku, Kengo Nakajima, Yutaka Ishikawa, Robert W. Wisniewski |
| 2018 | Performance of Hierarchical-matrix BiCGStab Solver on GPU Clusters. Ichitaro Yamazaki, Ahmad Abdelfattah, Akihiro Ida, Satoshi Ohshima, Stanimire Tomov, Rio Yokota, Jack J. Dongarra |
| 2018 | QoS Support for Scientific Workflows Using Software-Defined Storage Resource Enclaves. Suman Karki, Bao Nguyen, Xuechen Zhang |
| 2018 | Quantifying the Performance and Energy-Efficiency Impact of Hardware Transactional Memory on Scientific Applications on Large-Scale NUMA Systems. Jinsu Park, Woongki Baek |
| 2018 | Quotient Filters: Approximate Membership Queries on the GPU. Afton Geil, Martin Farach-Colton, John D. Owens |
| 2018 | Real-Time Massively Distributed Multi-object Adaptive Optics Simulations for the European Extremely Large Telescope. Hatem Ltaief, Ali Charara, Damien Gratadour, Nicolas Doucet, Bilel Hadri, Eric Gendron, Saber Feki, David E. Keyes |
| 2018 | Rethinking Large-Scale Economic Modeling for Efficiency: Optimizations for GPU and Xeon Phi Clusters. Simon Scheidegger, Dmitry Mikushin, Felix Kubler, Olaf Schenk |
| 2018 | Roofline Guided Design and Analysis of a Multi-stencil CFD Solver for Multicore Performance. Bahareh Mostafazadeh, Ferran Marti, Feng Liu, Aparna Chandramowlishwaran |
| 2018 | Runtime Scheduling Policies for Distributed Graph Algorithms. Jesun Sahariar Firoz, Marcin Zalewski, Andrew Lumsdaine, Martina Barnas |
| 2018 | SELECT: A Distributed Publish/Subscribe Notification System for Online Social Networks. Nuno Apolónia, Stefanos Antaris, Sarunas Girdzijauskas, George Pallis, Marios D. Dikaiakos |
| 2018 | SLIMFAST: Reducing Metadata Redundancy in Sound and Complete Dynamic Data Race Detection. Yuanfeng Peng, Christian DeLozier, Ariel Eizenberg, William Mansky, Joseph Devietti |
| 2018 | SWORD: A Bounded Memory-Overhead Detector of OpenMP Data Races in Production Runs. Simone Atzeni, Ganesh Gopalakrishnan, Zvonimir Rakamaric, Ignacio Laguna, Gregory L. Lee, Dong H. Ahn |
| 2018 | Scalable Breadth-First Search on a GPU Cluster. Yuechao Pan, Roger Pearce, John D. Owens |
| 2018 | Scalable Data Resilience for In-memory Data Staging. Shaohua Duan, Pradeep Subedi, Keita Teranishi, Philip E. Davis, Hemanth Kolla, Marc Gamell, Manish Parashar |
| 2018 | Scalable Power-Efficient Kilo-Core Photonic-Wireless NoC Architectures. Avinash Kodi, Kyle Shifflet, Savas Kaya, Soumyasanta Laha, Ahmed Louri |
| 2018 | Scheduling Monotone Moldable Jobs in Linear Time. Klaus Jansen, Felix Land |
| 2018 | Scheduling Parallel Tasks under Multiple Resources: List Scheduling vs. Pack Scheduling. Hongyang Sun, Redouane Elghazi, Ana Gainaru, Guillaume Aupy, Padma Raghavan |
| 2018 | Self-Stabilizing Supervised Publish-Subscribe Systems. Michael Feldmann, Christina Kolb, Christian Scheideler, Thim Strothmann |
| 2018 | Semantics-Preserving Parallelization of Stochastic Gradient Descent. Saeed Maleki, Madanlal Musuvathi, Todd Mytkowicz |
| 2018 | Skueue: A Scalable and Sequentially Consistent Distributed Queue. Michael Feldmann, Christian Scheideler, Alexander Setzer |
| 2018 | Software-Hardware Managed Last-level Cache Allocation Scheme for Large-Scale NVRAM-Based Multicores Executing Parallel Data Analytics Applications. Masab Ahmad, Halit Dogan, Fabio Checconi, Xinyu Que, Daniele Buono, Omer Khan |
| 2018 | Spartan: A Framework For Sparse Robust Addressable Networks. John Augustine, Sumathi Sivasubramaniam |
| 2018 | Swallow: Joint Online Scheduling and Coflow Compression in Datacenter Networks. Qihua Zhou, Peng Li, Kun Wang, Deze Zeng, Song Guo, Minyi Guo |
| 2018 | THOR: THermal-aware Optimizations for extending ReRAM Lifetime. Majed Valad Beigi, Gokhan Memik |
| 2018 | TTLG - An Efficient Tensor Transposition Library for GPUs. Jyothi Vedurada, Arjun Suresh, Aravind Sukumaran-Rajam, Jinsung Kim, Changwan Hong, Ajay Panyala, Sriram Krishnamoorthy, V. Krishna Nandivada, Rohit Kumar Srivastava, P. Sadayappan |
| 2018 | Taming the "Monster": Overcoming Program Optimization Challenges on SW26010 Through Precise Performance Modeling. Shizhen Xu, Yuanchao Xu, Wei Xue, Xipeng Shen, Fang Zheng, Xiaomeng Huang, Guangwen Yang |
| 2018 | The Algorithmics of Write Optimization. Michael A. Bender |
| 2018 | The Day After Tomorrow: The Looming Post-Exascale Crisis. Bruce Hendrickson |
| 2018 | The Power to Schedule a Parallel Program. Kunal Agrawal, Seth Gilbert |
| 2018 | Tiny Groups Tackle Byzantine Adversaries. Mercy O. Jaiyeola, Kyle Patron, Jared Saia, Maxwell Young, Qian M. Zhou |
| 2018 | Trade-Off Study of Localizing Communication and Balancing Network Traffic on a Dragonfly System. Xin Wang, Misbah Mubarak, Xu Yang, Robert B. Ross, Zhiling Lan |
| 2018 | UBIS: Utilization-Aware Cluster Scheduling. Karthik Kambatla, Vamsee Yarlagadda, Iñigo Goiri, Ananth Grama |
| 2018 | Understanding and Modeling Lossy Compression Schemes on HPC Scientific Data. Tao Lu, Qing Liu, Xubin He, Huizhang Luo, Eric Suchyta, Jong Choi, Norbert Podhorszki, Scott Klasky, Matthew Wolf, Tong Liu, Zhenbo Qiao |
| 2018 | Unobtrusive Asynchronous Exception Handling with Standard Java Try/Catch Blocks. Mostafa Mehrabi, Nasser Giacaman, Oliver Sinnen |
| 2018 | WarpDrive: Massively Parallel Hashing on Multi-GPU Nodes. Daniel Jünger, Christian Hundt, Bertil Schmidt |
| 2018 | What Size Should Your Buffers to Disks be? Guillaume Aupy, Olivier Beaumont, Lionel Eyraud-Dubois |
| 2018 | Work-Stealing, Locality-Aware Actor Scheduling. Saman Barghi, Martin Karsten |
| 2018 | sDPF-RSA: Utilizing Floating-point Computing Power of GPUs for Massive Digital Signature Computations. Jiankuo Dong, Fangyu Zheng, Niall Emmart, Jingqiang Lin, Charles C. Weems |