PPoPP - RankMe

50 papers

Year	Title / Authors
2010	A distributed placement service for graph-structured and tree-structured data. Gregory Buehrer, Srinivasan Parthasarathy, Shirish Tatikonda
2010	A practical concurrent binary search tree. Nathan Grasso Bronson, Jared Casper, Hassan Chafi, Kunle Olukotun
2010	A symbolic verifier for CUDA programs. Guodong Li, Ganesh Gopalakrishnan, Robert M. Kirby, Daniel J. Quinlan
2010	An adaptive performance modeling tool for GPU architectures. Sara S. Baghsorkhi, Matthieu Delahaye, Sanjay J. Patel, William D. Gropp, Wen-mei W. Hwu
2010	An optimizing compiler for GPGPU programs with input-data sharing. Yi Yang, Ping Xiang, Jingfei Kong, Huiyang Zhou
2010	Analyzing lock contention in multithreaded applications. Nathan R. Tallent, John M. Mellor-Crummey, Allan Porterfield
2010	Application heartbeats for software performance and health. Henry Hoffmann, Jonathan Eastep, Marco D. Santambrogio, Jason E. Miller, Anant Agarwal
2010	Applying the concurrent collections programming model to asynchronous parallel dense linear algebra. Aparna Chandramowlishwaran, Kathleen Knobe, Richard W. Vuduc
2010	CUDAlign: using GPU to accelerate the comparison of megabase genomic sequences. Edans Flavius de Oliveira Sandes, Alba Cristina Magalhaes Alves de Melo
2010	Compiler aided selective lock assignment for improving the performance of software transactional memory. Sandya Mannarswamy, Dhruva R. Chakrabarti, Kaushik Rajan, Sujoy Saraswati
2010	Composable thread coloring. Dean F. Sutherland, William L. Scherlis
2010	Continuous speculative program parallelization in software. Chao Zhang, Chen Ding, Xiaoming Gu, Kirk Kelsey, Tongxin Bai, Xiaobing Feng
2010	Data transformations enabling loop vectorization on multithreaded data parallel architectures. Byunghyun Jang, Perhaad Mistry, Dana Schaa, Rodrigo Dominguez, David R. Kaeli
2010	Debugging programs that use atomic blocks and transactional memory. Ferad Zyulkyarov, Tim Harris, Osman S. Unsal, Adrián Cristal, Mateo Valero
2010	Does cache sharing on modern CMP matter to the performance of contemporary multithreaded programs? Eddy Z. Zhang, Yunlian Jiang, Xipeng Shen
2010	Effective communication and computation overlap with hybrid MPI/SMPSs. Vladimir Marjanovic, Jesús Labarta, Eduard Ayguadé, Mateo Valero
2010	Exascale computing: the challenges and opportunities in the next decade. Tilak Agerwala
2010	Extreme scale computing: challenges and opportunities. Josep Torrellas, Bill Gropp, Jaime H. Moreno, Kunle Olukotun, Vivek Sarkar
2010	Fast tridiagonal solvers on the GPU. Yao Zhang, Jonathan Cohen, John D. Owens
2010	Featherweight X10: a core calculus for async-finish parallelism. Jonathan K. Lee, Jens Palsberg
2010	GAMBIT: effective unit testing for concurrency libraries. Katherine E. Coons, Sebastian Burckhardt, Madanlal Musuvathi
2010	Helper locks for fork-join parallel programming. Kunal Agrawal, Charles E. Leiserson, Jim Sukha
2010	Improving parallelism and locality with asynchronous algorithms. Lixia Liu, Zhiyuan Li
2010	Input-driven dynamic execution prediction of streaming applications. Farhana Aleen, Monirul Sharif, Santosh Pande
2010	Intra-application shared cache partitioning for multithreaded applications. Sai Prashanth Muralidhara, Mahmut T. Kandemir, Padma Raghavan
2010	Is hardware innovation over? Arvind
2010	Is transactional programming actually easier? Christopher J. Rossbach, Owen S. Hofmann, Emmett Witchel
2010	KRASH: reproducible CPU load generation on many cores machines. Swann Perarnau, Guillaume Huard
2010	Lazy binary-splitting: a run-time adaptive work-stealing scheduler. Alexandros Tzannes, George C. Caragea, Rajeev Barua, Uzi Vishkin
2010	Leveraging parallel nesting in transactional memory. João Pedro Barreto, Aleksandar Dragojevic, Paulo Ferreira, Rachid Guerraoui, Michal Kapalka
2010	Load balancing on speed. Steven A. Hofmeyr, Costin Iancu, Filip Blagojevic
2010	Model-driven autotuning of sparse matrix-vector multiply on GPUs. Jeewhan Choi, Amik Singh, Richard W. Vuduc
2010	Modeling advanced collective communication algorithms on cell-based systems. Qasim Ali, Samuel P. Midkiff, Vijay S. Pai
2010	Modeling transactional memory workload performance. Donald E. Porter, Emmett Witchel
2010	NOrec: streamlining STM by abolishing ownership records. Luke Dalessandro, Michael F. Spear, Michael L. Scott
2010	New abstractions for effective performance analysis of STM programs. Dhruva R. Chakrabarti
2010	PHANTOM: predicting performance of parallel applications on large-scale parallel machines using a single node. Jidong Zhai, Wenguang Chen, Weimin Zheng
2010	Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2010, Bangalore, India, January 9-14, 2010. R. Govindarajan, David A. Padua, Mary W. Hall
2010	SLAW: a scalable locality-aware adaptive work-stealing scheduler for multi-core systems. Yi Guo, Yisheng Zhao, Vincent Cavé, Vivek Sarkar
2010	Scalable communication protocols for dynamic sparse data exchange. Torsten Hoefler, Christian Siebert, Andrew Lumsdaine
2010	Scaling LAPACK panel operations using parallel cache assignment. Anthony M. Castaldo, R. Clint Whaley
2010	Scheduling support for transactional memory contention management. Walther Maldonado, Patrick Marlier, Pascal Felber, Adi Suissa, Danny Hendler, Alexandra Fedorova, Julia L. Lawall, Gilles Muller
2010	Structure-driven optimizations for amorphous data-parallel programs. Mario Méndez-Lojo, Donald Nguyen, Dimitrios Prountzos, Xin Sui, Muhammad Amber Hassaan, Milind Kulkarni, Martin Burtscher, Keshav Pingali
2010	Supporting lock-free composition of concurrent data objects. Daniel Cederman, Philippas Tsigas
2010	Symbolic prefetching in transactional distributed shared memory. Alokika Dash, Brian Demsky
2010	The LOFAR correlator: implementation and performance analysis. John W. Romein, P. Chris Broekema, Jan David Mol, Rob van Nieuwpoort
2010	The pilot library for novice MPI programmers. John D. Carter, William B. Gardner, Gary Gréwal
2010	Thread to strand binding of parallel network applications in massive multi-threaded systems. Petar Radojkovic, Vladimir Cakarevic, Javier Verdú, Alex Pajuelo, Francisco J. Cazorla, Mario Nemirovsky, Mateo Valero
2010	Towards scalable and transparent parallelization of multiplayer games using transactional memory support. Daniel Lupei, Bogdan Simion, Don Pinto, Matthew Misler, Mihai Burcea, William Krick, Cristiana Amza
2010	Using data structure knowledge for efficient lock generation and strong atomicity. Gautam Upadhyaya, Samuel P. Midkiff, Vijay S. Pai