SC A

88 papers

Year, Title / Authors
2016 10M-core scalable fully-implicit solver for nonhydrostatic atmospheric dynamics.
Chao Yang, Wei Xue, Haohuan Fu, Hongtao You, Xinliang Wang, Yulong Ao, Fangfang Liu, Lin Gan, Ping Xu, Lanning Wang, Guangwen Yang, Weimin Zheng
2016A PCIe congestion-aware performance model for densely populated accelerator servers.
Maxime Martinasso, Grzegorz Kwasniewski, Sadaf R. Alam, Thomas C. Schulthess, Torsten Hoefler
2016A data driven scheduling approach for power management on HPC systems.
Sean Wallace, Xu Yang, Venkatram Vishwanath, William E. Allcock, Susan Coghlan, Michael E. Papka, Zhiling Lan
2016A domain-specific compiler for a parallel multiresolution adaptive numerical simulation environment.
Samyam Rajbhandari, Jinsung Kim, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, Robert J. Harrison, P. Sadayappan
2016A highly effective global surface wave numerical simulation with ultra-high resolution.
Fangli Qiao, Wei Zhao, Xunqiang Yin, Xiaomeng Huang, Xin Liu, Qi Shu, Guansuo Wang, Zhenya Song, Xinfang Li, Haixing Liu, Guangwen Yang, Yeli Yuan
2016A machine learning framework for performance coverage analysis of proxy applications.
Tanzima Z. Islam, Jayaraman J. Thiagarajan, Abhinav Bhatele, Martin Schulz, Todd Gamblin
2016A multi-faceted approach to job placement for improved performance on extreme-scale systems.
Christopher Zimmer, Saurabh Gupta, Scott Atchley, Sudharshan S. Vazhkudai, Carl Albing
2016A parallel algorithm for finding all pairs
Sriram P. Chockalingam, Sharma V. Thankachan, Srinivas Aluru
2016A parallel arbitrary-order accurate AMR algorithm for the scalar advection-diffusion equation.
Arash Bakhtiari, Dhairya Malhotra, Amir Raoofy, Miriam Mehl, Hans-Joachim Bungartz, George Biros
2016Accelerating lattice QCD multigrid on GPUs using fine-grained parallelization.
Michael A. Clark, Bálint Joó, Alexei Strelchenko, Michael Cheng, Arjun Singh Gambhir, Richard C. Brower
2016An efficient and scalable algorithmic method for generating large-scale random graphs.
Md. Maksudul Alam, Maleq Khan, Anil Vullikanti, Madhav V. Marathe
2016An ephemeral burst-buffer file system for scientific applications.
Teng Wang, Kathryn M. Mohror, Adam Moody, Kento Sato, Weikuan Yu
2016An exploration of optimization algorithms for high performance tensor completion.
Shaden Smith, Jongsoo Park, George Karypis
2016Automating wavefront parallelization for sparse matrix computations.
Anand Venkat, Mahdi Soltan Mohammadi, Jongsoo Park, Hongbo Rong, Rajkishore Barik, Michelle Mills Strout, Mary W. Hall
2016Block iterative methods and recycling for improved scalability of linear solvers.
Pierre Jolivet, Pierre-Henri Tournier
2016Caliper: performance introspection for HPC software stacks.
David Böhme, Todd Gamblin, David Beckingsale, Peer-Timo Bremer, Alfredo Giménez, Matthew P. LeGendre, Olga Pearce, Martin Schulz
2016Characterizing parallel scientific applications on commodity clusters: an empirical study of a tapered fat-tree.
Edgar A. León, Ian Karlin, Abhinav Bhatele, Steven H. Langer, Chris Chambreau, Louis H. Howell, Trent D'Hooge, Matthew L. Leininger
2016Compiler-directed lightweight checkpointing for fine-grained guaranteed soft error recovery.
Qingrui Liu, Changhee Jung, Dongyoon Lee, Devesh Tiwari
2016DAOS and friends: a proposal for an exascale storage system.
Jay F. Lofstead, Ivo Jimenez, Carlos Maltzahn, Quincey Koziol, John Bent, Eric Barton
2016DCA: a DRAM-cache-aware DRAM controller.
Cheng-Chieh Huang, Vijay Nagarajan, Arpit Joshi
2016Daino: a high-level framework for parallel and efficient AMR on GPUs.
Mohamed Wahib, Naoya Maruyama, Takayuki Aoki
2016Designing MPI library with on-demand paging (ODP) of infiniband: challenges and benefits.
Mingzhe Li, Khaled Hamidouche, Xiaoyi Lu, Hari Subramoni, Jie Zhang, Dhabaleswar K. Panda
2016Designing scalable
Arif M. Khan, Alex Pothen, Md. Mostofa Ali Patwary, Mahantesh Halappanavar, Nadathur Rajagopalan Satish, Narayanan Sundaram, Pradeep Dubey
2016Development effort estimation in HPC.
Sandra Wienke, Julian Miller, Martin Schulz, Matthias S. Müller
2016Distributed-memory large deformation diffeomorphic 3D image registration.
Andreas Mang, Amir Gholami, George Biros
2016Efficient delaunay tessellation through K-D tree decomposition.
Dmitriy Morozov, Tom Peterka
2016Elastic multi-resource fairness: balancing fairness and efficiency in coupled CPU-GPU architectures.
Shanjiang Tang, Bingsheng He, Shuhao Zhang, Zhaojie Niu
2016Enabling efficient preemption for SIMT architectures with lightweight context switching.
Zhen Lin, Lars Nyland, Huiyang Zhou
2016Enhanced MPSM3 for applications to quantum biological simulations.
A. Pozdneev, Valéry Weber, Teodoro Laino, Constantine Bekas, Alessandro Curioni
2016Enhancing infiniband with openflow-style SDN capability.
Jason Lee, Zhou Tong, Karthik Achalkar, Xin Yuan, Michael Lang
2016Evaluating HPC networks via simulation of parallel workloads.
Nikhil Jain, Abhinav Bhatele, Sam White, Todd Gamblin, Laxmikant V. Kalé
2016Evaluating and optimizing OpenCL kernels for high performance computing with FPGAs.
Hamid Reza Zohouri, Naoya Maruyama, Aaron Smith, Motohiko Matsuda, Satoshi Matsuoka
2016Exploring the potentials of parallel garbage collection in SSDs for enterprise storage systems.
Narges Shahidi, Mohammad Arjomand, Myoungsoo Jung, Mahmut T. Kandemir, Chita R. Das, Anand Sivasubramaniam
2016Extended task queuing: active messages for heterogeneous systems.
Michael LeBeane, Brandon Potter, Abhisek Pan, Alexandru Dutu, Vinay Agarwala, Wonchan Lee, Deepak Majeti, Bibek Ghimire, Eric Van Tassell, Samuel Wasmundt, Brad Benton, Maurício Breternitz, Michael L. Chu, Mithuna Thottethodi, Lizy K. John, Steven K. Reinhardt
2016Extreme scale plasma turbulence simulations on top supercomputers worldwide.
William M. Tang, Bei Wang, Stéphane Ethier, Grzegorz Kwasniewski, Torsten Hoefler, Khaled Z. Ibrahim, Kamesh Madduri, Samuel Williams, Leonid Oliker, Carlos Rosales-Fernandez, Timothy J. Williams
2016Extreme-scale phase field simulations of coarsening dynamics on the sunway taihulight supercomputer.
Jian Zhang, Chunbao Zhou, Yangang Wang, Lili Ju, Qiang Du, Xuebin Chi, Dongsheng Xu, Dexun Chen, Yong Liu, Zhao Liu
2016Failure detection and propagation in HPC systems.
George Bosilca, Aurélien Bouteiller, Amina Guermouche, Thomas Hérault, Yves Robert, Pierre Sens, Jack J. Dongarra
2016Flexfly: enabling a reconfigurable dragonfly through silicon photonics.
Ke Wen, Payman Samadi, Sébastien Rumley, Christine P. Chen, Yiwen Shen, Meisam Bahadori, Keren Bergman, Jeremiah J. Wilke
2016FlipBack: automatic targeted protection against silent data corruption.
Xiang Ni, Laxmikant V. Kalé
2016G-store: high-performance graph store for trillion-edge processing.
Pradeep Kumar, H. Howie Huang
2016Granularity and the cost of error recovery in resilient AMR scientific applications.
Anshu Dubey, Hajime Fujita, Daniel T. Graves, Andrew A. Chien, Devesh Tiwari
2016Graph colouring as a challenge problem for dynamic graph processing on distributed systems.
Scott Sallinen, Keita Iwabuchi, Suraj Poudel, Maya B. Gokhale, Matei Ripeanu, Roger A. Pearce
2016GreenLA: green linear algebra software for GPU-accelerated heterogeneous computing.
Jieyang Chen, Li Tan, Panruo Wu, Dingwen Tao, Hongbo Li, Xin Liang, Sihuan Li, Rong Ge, Laxmi N. Bhuyan, Zizhong Chen
2016HARP: predictive transfer optimization based on historical analysis and real-time probing.
Engin Arslan, Kemal Guner, Tevfik Kosar
2016High performance emulation of quantum circuits.
Thomas Häner, Damian S. Steiger, Mikhail Smelyanskiy, Matthias Troyer
2016High-frequency nonlinear earthquake simulations on petascale heterogeneous supercomputers.
Daniel Roten, Yifeng Cui, Kim B. Olsen, Steven M. Day, Kyle Withers, William H. Savran, Peng Wang, Dawei Mu
2016Improving application resilience to memory errors with lightweight compression.
Scott Levy, Kurt B. Ferreira, Patrick G. Bridges
2016Increasing molecular dynamics simulation rates with an 8-fold increase in electrical power efficiency.
W. Michael Brown, Andrey Semin, Michael Hebenstreit, Sergey Khvostov, Karthik Raman, Steven J. Plimpton
2016LIBXSMM: accelerating small matrix multiplications by runtime code generation.
Alexander Heinecke, Greg Henry, Maxwell Hutchinson, Hans Pabst
2016MUSA: a multi-level simulation approach for next-generation HPC machines.
Thomas Grass, César Allande, Adrià Armejach, Alejandro Rico, Eduard Ayguadé, Jesús Labarta, Mateo Valero, Marc Casas, Miquel Moretó
2016Measuring and understanding throughput of network topologies.
Sangeetha Abdu Jyothi, Ankit Singla, Brighten Godfrey, Alexandra Kolla
2016Merge-based parallel sparse matrix-vector multiplication.
Duane Merrill, Michael Garland
2016MetaMorph: a library framework for interoperable kernels on multi- and many-core clusters.
Ahmed E. Helal, Paul Sathre, Wu-chun Feng
2016Modeling dilute solutions using first-principles molecular dynamics: computing more than a million atoms with over a million cores.
Jean-Luc Fattebert, Daniel Osei-Kuffuor, Erik W. Draeger, Tadashi Ogitsu, William D. Krauss
2016Multi-resource fair sharing for datacenter jobs with placement constraints.
Wei Wang, Baochun Li, Ben Liang, Jun Li
2016Optimal execution of co-analysis for large-scale molecular dynamics simulations.
Preeti Malakar, Venkatram Vishwanath, Christopher Knight, Todd S. Munson, Michael E. Papka
2016Optimizing memory efficiency for deep convolutional neural networks on GPUs.
Chao Li, Yi Yang, Min Feng, Srimat T. Chakradhar, Huiyang Zhou
2016PFEAST: a high performance sparse eigenvalue solver using distributed-memory linear solvers.
James Kestyn, Vasileios Kalantzis, Eric Polizzi, Yousef Saad
2016PIPES: a language and compiler for task-based programming on distributed-memory clusters.
Martin Kong, Louis-Noël Pouchet, P. Sadayappan, Vivek Sarkar
2016Performance analysis, design considerations, and applications of extreme-scale
Utkarsh Ayachit, Andrew C. Bauer, Earl P. N. Duque, Greg Eisenhauer, Nicola J. Ferrier, Junmin Gu, Kenneth E. Jansen, Burlen Loring, Zarija Lukic, Suresh Menon, Dmitriy Morozov, Patrick O'Leary, Reetesh Ranjan, Michel E. Rasquin, Christopher P. Stone, Venkatram Vishwanath, Gunther H. Weber, Brad Whitlock, Matthew Wolf, K. John Wu, E. Wes Bethel
2016Performance modeling of in situ rendering.
Matthew Larsen, Cyrus Harrison, James Kress, David Pugmire, Jeremy S. Meredith, Hank Childs
2016Perilla: metadata-based optimizations of an asynchronous runtime for adaptive mesh refinement.
Tan Nguyen, Didem Unat, Weiqun Zhang, Ann S. Almgren, Muhammed Nufail Farooqi, John Shalf
2016Pinpointing scale-dependent integer overflow bugs in large-scale parallel applications.
Ignacio Laguna, Martin Schulz
2016Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2016, Salt Lake City, UT, USA, November 13-18, 2016
John West, Cherri M. Pancake
2016Real-time synthesis of compression algorithms for scientific data.
Martin Burtscher, Hari Mukka, Annie Yang, Farbod Hesaaraki
2016Refactoring and optimizing the community atmosphere model (CAM) on the sunway taihulight supercomputer.
Haohuan Fu, Junfeng Liao, Wei Xue, Lanning Wang, Dexun Chen, Long Gu, Jinxiu Xu, Nan Ding, Xinliang Wang, Conghui He, Shizhen Xu, Yishuang Liang, Jiarui Fang, Yuanchao Xu, Weijie Zheng, Jingheng Xu, Zhen Zheng, Wanjing Wei, Xu Ji, He Zhang, Bingwei Chen, Kaiwei Li, Xiaomeng Huang, Wenguang Chen, Guangwen Yang
2016Reliable and efficient performance monitoring in linux.
Maria Dimakopoulou, Stéphane Eranian, Nectarios Koziris, Nicholas Bambos
2016SERF: efficient scheduling for fast deep neural network serving via judicious parallelism.
Feng Yan, Yuxiong He, Olatunji Ruwase, Evgenia Smirni
2016Scalable non-blocking preconditioned conjugate gradient methods.
Paul R. Eller, William Gropp
2016Scalemine: scalable parallel frequent subgraph mining in a single large graph.
Ehab Abdelhamid, Ibrahim Abdelaziz, Panos Kalnis, Zuhair Khayyat, Fuad T. Jamour
2016Scheduling-aware routing for supercomputers.
Jens Domke, Torsten Hoefler
2016Server-side log data analytics for I/O workload characterization and coordination on large shared storage systems.
Yang Liu, Raghul Gunasekaran, Xiaosong Ma, Sudharshan S. Vazhkudai
2016Simulation and performance analysis of the ECMWF tape library system.
Markus Mäsker, Lars Nagel, Tim Süß, André Brinkmann, Lennart Sorth
2016Simulations of below-ground dynamics of fungi: 1.184 pflops attained by automated generation and autotuning of temporal blocking codes.
Takayuki Muranushi, Hideyuki Hotta, Junichiro Makino, Seiya Nishizawa, Hirofumi Tomita, Keigo Nitadori, Masaki Iwasawa, Natsuki Hosono, Yutaka Maruyama, Hikaru Inoue, Hisashi Yashiro, Yoshifumi Nakamura
2016Strassen's algorithm reloaded.
Jianyu Huang, Tyler M. Smith, Greg M. Henry, Robert A. van de Geijn
2016The mont-blanc prototype: an alternative approach for HPC systems.
Nikola Rajovic, Alejandro Rico, Filippo Mantovani, Daniel Ruiz, Josep Oriol Vilarrubi, Constantino Gómez, Luna Backes, Diego Nieto, Harald Servat, Xavier Martorell, Jesús Labarta, Eduard Ayguadé, Chris Adeniyi-Jones, Said Derradji, Hervé Gloaguen, Piero Lanucara, Nico Sanna, Jean-François Méhaut, Kevin Pouget, Brice Videau, Eric Boyer, Momme Allalen, Axel Auweter, David Brayford, Daniele Tafani, Volker Weinberg, Dirk Brömmel, René Halver, Jan H. Meinke, Ramón Beivide, Mariano Benito, Enrique Vallejo, Mateo Valero, Alex Ramírez
2016The vectorization of the tersoff multi-body potential: an exercise in performance portability.
Markus Höhnerbach, Ahmed E. Ismail, Paolo Bientinesi
2016Towards green aviation with python at petascale.
Peter E. Vincent, Freddie D. Witherden, Brian C. Vermeire, Jin Seok Park, Arvind Iyer
2016Transient guarantees: maximizing the value of idle cloud capacity.
Supreeth Shastri, Amr Rizk, David Irwin
2016Translating OpenMP device constructs to OpenCL using unnecessary data transfer elimination.
Junghyun Kim, Yong-Jun Lee, Jung-Ho Park, Jaejin Lee
2016Truenorth ecosystem for brain-inspired computing: scalable systems, software, and applications.
Jun Sawada, Filipp Akopyan, Andrew S. Cassidy, Brian Taba, Michael V. DeBole, Pallab Datta, Rodrigo Alvarez-Icaza, Arnon Amir, John V. Arthur, Alexander Andreopoulos, Rathinakumar Appuswamy, Heinz Baier, Davis Barch, David J. Berg, Carmelo di Nolfo, Steven K. Esser, Myron Flickner, Thomas A. Horvath, Bryan L. Jackson, Jeff Kusnitz, Scott Lekuch, Michael Mastro, Timothy Melano, Paul A. Merolla, Steven E. Millman, Tapan K. Nayak, Norm Pass, Hartmut E. Penner, William P. Risk, Kai Schleupen, Benjamin G. Shaw, Hayley Wu, Brian Giera, Adam T. Moody, T. Nathan Mundhenk, Brian Van Essen, Eric X. Wang, David P. Widemann, Qing Wu, William E. Murphy, Jamie K. Infantolino, James A. Ross, Dale R. Shires, Manuel M. Vindiola, Raju Namburu, Dharmendra S. Modha
2016Týr: blob storage meets built-in transactions.
Pierre Matri, Alexandru Costan, Gabriel Antoniu, Jesús Montes, María S. Pérez
2016Understanding error propagation in GPGPU applications.
Guanpeng Li, Karthik Pattabiraman, Chen-Yong Cher, Pradip Bose
2016Understanding performance interference in next-generation HPC systems.
Oscar H. Mondragon, Patrick G. Bridges, Scott Levy, Kurt B. Ferreira, Patrick M. Widener
2016Unprotected computing: a large-scale study of DRAM raw error rate on a supercomputer.
Leonardo Bautista-Gomez, Ferad Zyulkyarov, Osman S. Unsal, Simon McIntosh-Smith
2016Watch out for the bully!: job interference study on dragonfly network.
Xu Yang, John Jenkins, Misbah Mubarak, Robert B. Ross, Zhiling Lan
2016ZNN
Aleksandar Zlateski, Kisuk Lee, H. Sebastian Seung
2016dCUDA: hardware supported overlap of computation and communication.
Tobias Gysi, Jeremia Bär, Torsten Hoefler