HPCA - RankMe

58 papers

Year	Title / Authors
2016	2016 IEEE International Symposium on High Performance Computer Architecture, HPCA 2016, Barcelona, Spain, March 12-16, 2016
2016	A case for toggle-aware compression for GPU systems. Gennady Pekhimenko, Evgeny Bolotin, Nandita Vijaykumar, Onur Mutlu, Todd C. Mowry, Stephen W. Keckler
2016	A complete key recovery timing attack on a GPU. Zhen Hang Jiang, Yunsi Fei, David R. Kaeli
2016	A large-scale study of soft-errors on GPUs in the field. Bin Nie, Devesh Tiwari, Saurabh Gupta, Evgenia Smirni, James H. Rogers
2016	A low power software-defined-radio baseband processor for the Internet of Things. Yajing Chen, Shengshuo Lu, Hun-Seok Kim, David T. Blaauw, Ronald G. Dreslinski, Trevor N. Mudge
2016	A low-power hybrid reconfigurable architecture for resistive random-access memories. Miguel Angel Lastras-Montaño, Amirali Ghofrani, Kwang-Ting Cheng
2016	A market approach for handling power emergencies in multi-tenant data center. Mohammad A. Islam, Xiaoqi Ren, Shaolei Ren, Adam Wierman, Xiaorui Wang
2016	A performance analysis framework for optimizing OpenCL applications on FPGAs. Zeke Wang, Bingsheng He, Wei Zhang, Shunning Jiang
2016	Amdahl's law for lifetime reliability scaling in heterogeneous multicore processors. William J. Song, Saibal Mukhopadhyay, Sudhakar Yalamanchili
2016	Approximating warps with intra-warp operand value similarity. Daniel Wong, Nam Sung Kim, Murali Annavaram
2016	Atomic persistence for SCM with a non-intrusive backend controller. Kshitij A. Doshi, Ellis Giles, Peter J. Varman
2016	Best-offset hardware prefetching. Pierre Michaud
2016	CATalyst: Defeating last-level cache side channel attacks in cloud computing. Fangfei Liu, Qian Ge, Yuval Yarom, Frank McKeen, Carlos V. Rozas, Gernot Heiser, Ruby B. Lee
2016	Cache QoS: From concept to reality in the Intel® Xeon® processor E5-2600 v3 product family. Andrew Herdrich, Edwin Verplanke, Priya Autee, Ramesh Illikkal, Chris Gianos, Ronak Singhal, Ravi R. Iyer
2016	ChargeCache: Reducing DRAM latency by exploiting row access locality. Hasan Hassan, Gennady Pekhimenko, Nandita Vijaykumar, Vivek Seshadri, Donghyuk Lee, Oguz Ergin, Onur Mutlu
2016	CompEx: Compression-expansion coding for energy, latency, and lifetime improvements in MLC/TLC NVM. Poovaiah M. Palangappa, Kartik Mohanram
2016	Core tunneling: Variation-aware voltage noise mitigation in GPUs. Renji Thomas, Kristin Barber, Naser Sedaghati, Li Zhou, Radu Teodorescu
2016	Cost effective physical register sharing. Arthur Perais, André Seznec
2016	DUANG: Fast and lightweight page migration in asymmetric memory systems. Hao Wang, Jie Zhang, Sharmila Shridhar, Gieseo Park, Myoungsoo Jung, Nam Sung Kim
2016	DVFS for NoCs in CMPs: A thread voting approach. Yuan Yao, Zhonghai Lu
2016	Design and implementation of a mobile storage leveraging the DRAM interface. Sungyong Seo, Youngjin Cho, Youngkwang Yoo, Otae Bae, Jaegeun Park, Heehyun Nam, Sunmi Lee, Yongmyung Lee, Seungdo Chae, MoonSang Kwon, Jin-Hyeok Choi, Sangyeun Cho, Jaeheon Jeong, Duckhyun Chang
2016	Efficient GPU hardware transactional memory through early conflict resolution. Sui Chen, Lu Peng
2016	Efficient footprint caching for Tagless DRAM Caches. Hakbeom Jang, Yongjun Lee, Jongwon Kim, Youngsok Kim, Jangwoo Kim, Jinkyu Jeong, Jae W. Lee
2016	Efficient synthetic traffic models for large, complex SoCs. Jieming Yin, Onur Kayiran, Matthew Poremba, Natalie D. Enright Jerger, Gabriel H. Loh
2016	Energy-efficient address translation. Vasileios Karakostas, Jayneel Gandhi, Adrián Cristal, Mark D. Hill, Kathryn S. McKinley, Mario Nemirovsky, Michael M. Swift, Osman S. Unsal
2016	HRL: Efficient and flexible reconfigurable logic for near-data processing. Mingyu Gao, Christos Kozyrakis
2016	Improving smartphone user experience by balancing performance and energy with probabilistic QoS guarantee. Benjamin Gaudette, Carole-Jean Wu, Sarma B. K. Vrudhula
2016	LASER: Light, Accurate Sharing dEtection and Repair. Liang Luo, Akshitha Sriraman, Brooke Fugate, Shiliang Hu, Gilles Pokam, Chris J. Newburn, Joseph Devietti
2016	Lattice priority scheduling: Low-overhead timing-channel protection for a shared memory controller. Andrew Ferraiuolo, Yao Wang, Danfeng Zhang, Andrew C. Myers, G. Edward Suh
2016	LiveSim: Going live with microarchitecture simulation. Sina Hassani, Gabriel Southern, Jose Renau
2016	Low-Cost Inter-Linked Subarrays (LISA): Enabling fast inter-subarray data movement in DRAM. Kevin K. Chang, Prashant J. Nair, Donghyuk Lee, Saugata Ghose, Moinuddin K. Qureshi, Onur Mutlu
2016	MaPU: A novel mathematical computing architecture. Donglin Wang, Xueliang Du, Leizu Yin, Chen Lin, Hong Ma, Weili Ren, Huijuan Wang, Xingang Wang, Shaolin Xie, Lei Wang, Zijun Liu, Tao Wang, Zhonghua Pu, Guangxin Ding, Mengchen Zhu, Lipeng Yang, Ruoshan Guo, Zhiwei Zhang, Xiao Lin, Jie Hao, Yongyong Yang, Wenqin Sun, Fabiao Zhou, NuoZhou Xiao, Qian Cui, Xiaoqin Wang
2016	McVerSi: A test generation framework for fast memory consistency verification in simulation. Marco Elver, Vijay Nagarajan
2016	Memristive Boltzmann machine: A hardware accelerator for combinatorial optimization and deep learning. Mahdi Nazm Bojnordi, Engin Ipek
2016	Minimal disturbance placement and promotion. Elvira Teran, Yingying Tian, Zhe Wang, Daniel A. Jiménez
2016	Mobile CPU's rise to power: Quantifying the impact of generational mobile CPU design trends on performance, energy, and user satisfaction. Matthew Halpern, Yuhao Zhu, Vijay Janapa Reddi
2016	Modeling cache performance beyond LRU. Nathan Beckmann, Daniel Sánchez
2016	Parity Helix: Efficient protection for single-dimensional faults in multi-dimensional memory systems. Xun Jian, Vilas Sridharan, Rakesh Kumar
2016	PleaseTM: Enabling transaction conflict management in requester-wins hardware transactional memory. Sunjae Park, Milos Prvulovic, Christopher J. Hughes
2016	Predicting the memory bandwidth and optimal core allocations for multi-threaded applications on large-scale NUMA machines. Wei Wang, Jack W. Davidson, Mary Lou Soffa
2016	Pushing the limits of accelerator efficiency while retaining programmability. Tony Nowatzki, Vinay Gangadhar, Karthikeyan Sankaralingam, Greg Wright
2016	RADAR: Runtime-assisted dead region management for last-level caches. Madhavan Manivannan, Vassilis Papaefstathiou, Miquel Pericàs, Per Stenström
2016	Restore truncation for performance improvement in future DRAM systems. Xianwei Zhang, Youtao Zhang, Bruce R. Childers, Jun Yang
2016	Revisiting virtual L1 caches: A practical design using dynamic synonym remapping. Hongil Yoon, Gurindar S. Sohi
2016	SCsafe: Logging sequential consistency violations continuously and precisely. Yuelu Duan, David A. Koufaty, Josep Torrellas
2016	SLaC: Stage laser control for a flattened butterfly network. Yigit Demir, Nikos Hardavellas
2016	ScalCore: Designing a core for voltage scalability. Bhargava Gopireddy, Choungki Song, Josep Torrellas, Nam Sung Kim, Aditya Agrawal, Asit K. Mishra
2016	Selective GPU caches to eliminate CPU-GPU HW cache coherence. Neha Agarwal, David W. Nellans, Eiman Ebrahimi, Thomas F. Wenisch, John Danskin, Stephen W. Keckler
2016	Simultaneous Multikernel GPU: Multi-tasking throughput processors via fine-grained sharing. Zhenning Wang, Jun Yang, Rami G. Melhem, Bruce R. Childers, Youtao Zhang, Minyi Guo
2016	SizeCap: Efficiently handling power surges in fuel cell powered data centers. Yang Li, Di Wang, Saugata Ghose, Jie Liu, Sriram Govindan, Sean James, Eric Peterson, John Siegler, Rachata Ausavarungnirun, Onur Mutlu
2016	Software transparent dynamic binary translation for coarse-grain reconfigurable architectures. Matthew A. Watkins, Tony Nowatzki, Anthony Carno
2016	Symbiotic job scheduling on the IBM POWER8. Josué Feliu, Stijn Eyerman, Julio Sahuquillo, Salvador Petit
2016	TABLA: A unified template-based framework for accelerating statistical machine learning. Divya Mahajan, Jongse Park, Emmanuel Amaro, Hardik Sharma, Amir Yazdanbakhsh, Joon Kyung Kim, Hadi Esmaeilzadeh
2016	The runahead network-on-chip. Zimo Li, Joshua San Miguel, Natalie D. Enright Jerger
2016	Towards high performance paged memory for GPUs. Tianhao Zheng, David W. Nellans, Arslan Zulfiqar, Mark Stephenson, Stephen W. Keckler
2016	Venice: Exploring server architectures for effective resource sharing. Jianbo Dong, Rui Hou, Michael C. Huang, Tao Jiang, Boyan Zhao, Sally A. McKee, Haibin Wang, Xiaosong Cui, Lixin Zhang
2016	Warped-preexecution: A GPU pre-execution approach for improving latency hiding. Keunsoo Kim, Sangpil Lee, Myung Kuk Yoon, Gunjae Koo, Won Woo Ro, Murali Annavaram
2016	iPAWS: Instruction-issue pattern-based adaptive warp scheduling for GPGPUs. Minseok Lee, Gwangsun Kim, John Kim, Woong Seo, Yeon-Gon Cho, Soojung Ryu