| 2016 | 49th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2016, Taipei, Taiwan, October 15-19, 2016 |
| 2016 | A cloud-scale acceleration architecture. Adrian M. Caulfield, Eric S. Chung, Andrew Putnam, Hari Angepat, Jeremy Fowers, Michael Haselman, Stephen Heil, Matt Humphrey, Puneet Kaur, Joo-Young Kim, Daniel Lo, Todd Massengill, Kalin Ovtcharov, Michael Papamichael, Lisa Woods, Sitaram Lanka, Derek Chiou, Doug Burger |
| 2016 | A patch memory system for image processing and computer vision. Jason Clemons, Chih-Chi Cheng, Iuri Frosio, Daniel R. Johnson, Stephen W. Keckler |
| 2016 | A unified memory network architecture for in-memory computing in commodity servers. Jia Zhan, Itir Akgun, Jishen Zhao, Al Davis, Paolo Faraboschi, Yuangang Wang, Yuan Xie |
| 2016 | An ultra low-power hardware accelerator for automatic speech recognition. Reza Yazdani, Albert Segura, José-María Arnau, Antonio González |
| 2016 | Approxilyzer: Towards a systematic framework for instruction-level approximate computing and its application to hardware resiliency. Radha Venkatagiri, Abdulrahman Mahmoud, Siva Kumar Sastry Hari, Sarita V. Adve |
| 2016 | Bridging the I/O performance gap for big data workloads: A new NVDIMM-based approach. Renhai Chen, Zili Shao, Tao Li |
| 2016 | C3D: Mitigating the NUMA bottleneck via coherent DRAM caches. Cheng-Chieh Huang, Rakesh Kumar, Marco Elver, Boris Grot, Vijay Nagarajan |
| 2016 | CANDY: Enabling coherent DRAM caches for multi-node systems. Chia-Chen Chou, Aamer Jaleel, Moinuddin K. Qureshi |
| 2016 | Cache-emulated register file: An integrated on-chip memory architecture for high performance GPGPUs. Naifeng Jing, Jianfei Wang, Fengfeng Fan, Wenkang Yu, Li Jiang, Chao Li, Xiaoyao Liang |
| 2016 | Cambricon-X: An accelerator for sparse neural networks. Shijin Zhang, Zidong Du, Lei Zhang, Huiying Lan, Shaoli Liu, Ling Li, Qi Guo, Tianshi Chen, Yunji Chen |
| 2016 | Chainsaw: Von-Neumann accelerators to leverage fused instruction chains. Amirali Sharifian, Snehasish Kumar, Apala Guha, Arrvindh Shriraman |
| 2016 | Chameleon: Versatile and practical near-DRAM acceleration architecture for large memory systems. Hadi Asghari Moghaddam, Young Hoon Son, Jung Ho Ahn, Nam Sung Kim |
| 2016 | Co-designing accelerators and SoC interfaces using gem5-Aladdin. Yakun Sophia Shao, Sam Likun Xi, Vijayalakshmi Srinivasan, Gu-Yeon Wei, David M. Brooks |
| 2016 | Concise loads and stores: The case for an asymmetric compute-memory architecture for approximation. Animesh Jain, Parker Hill, Shih-Chieh Lin, Muneeb Khan, Md. Enamul Haque, Michael A. Laurenzano, Scott A. Mahlke, Lingjia Tang, Jason Mars |
| 2016 | Contention-based congestion management in large-scale networks. Gwangsun Kim, Changhyun Kim, Jiyun Jeong, Mike Parker, John Kim |
| 2016 | Continuous runahead: Transparent hardware acceleration for memory intensive workloads. Milad Hashemi, Onur Mutlu, Yale N. Patt |
| 2016 | Continuous shape shifting: Enabling loop co-optimization via near-free dynamic code rewriting. Animesh Jain, Michael A. Laurenzano, Lingjia Tang, Jason Mars |
| 2016 | CrystalBall: Statically analyzing runtime behavior via deep sequence learning. Stephen A. Zekany, Daniel Rings, Nathan Harada, Michael A. Laurenzano, Lingjia Tang, Jason Mars |
| 2016 | Data-centric execution of speculative parallel programs. Mark C. Jeffrey, Suvinay Subramanian, Maleen Abeydeera, Joel S. Emer, Daniel Sánchez |
| 2016 | Delegated persist ordering. Aasheesh Kolli, Jeff Rosen, Stephan Diestelhorst, Ali G. Saidi, Steven Pelley, Sihang Liu, Peter M. Chen, Thomas F. Wenisch |
| 2016 | Dictionary sharing: An efficient cache compression scheme for compressed caches. Biswabandan Panda, André Seznec |
| 2016 | Dynamic error mitigation in NoCs using intelligent prediction techniques. Dominic DiTomaso, Travis Boraten, Avinash Kodi, Ahmed Louri |
| 2016 | Efficient data supply for hardware accelerators with prefetching and access/execute decoupling. Tao Chen, G. Edward Suh |
| 2016 | Efficient kernel synthesis for performance portable programming. Li-Wen Chang, Izzat El Hajj, Christopher I. Rodrigues, Juan Gómez-Luna, Wen-mei W. Hwu |
| 2016 | Evaluating programmable architectures for imaging and vision applications. Artem Vasilyev, Nikhil Bhagdikar, Ardavan Pedram, Stephen Richardson, Shahar Kvatinsky, Mark Horowitz |
| 2016 | Exploiting semantic commutativity in hardware speculation. Guowei Zhang, Virginia Chiu, Daniel Sánchez |
| 2016 | From high-level deep neural models to FPGAs. Hardik Sharma, Jongse Park, Divya Mahajan, Emmanuel Amaro, Joon Kyung Kim, Chenkai Shao, Asit Mishra, Hadi Esmaeilzadeh |
| 2016 | Fused-layer CNN accelerators. Manoj Alwani, Han Chen, Michael Ferdman, Peter A. Milder |
| 2016 | GRAPE: Minimizing energy for GPU applications with performance requirements. Muhammad Husni Santriaji, Henry Hoffmann |
| 2016 | Graphicionado: A high-performance and energy-efficient accelerator for graph analytics. Tae Jun Ham, Lisa Wu, Narayanan Sundaram, Nadathur Satish, Margaret Martonosi |
| 2016 | HARE: Hardware accelerator for regular expressions. Vaibhav Gogte, Aasheesh Kolli, Michael J. Cafarella, Loris D'Antoni, Thomas F. Wenisch |
| 2016 | Improving bank-level parallelism for irregular applications. Xulong Tang, Mahmut T. Kandemir, Praveen Yedlapalli, Jagadish Kotra |
| 2016 | Improving energy efficiency of DRAM by exploiting half page row access. Heonjae Ha, Ardavan Pedram, Stephen Richardson, Shahar Kvatinsky, Mark Horowitz |
| 2016 | Jump over ASLR: Attacking branch predictors to bypass ASLR. Dmitry Evtyushkin, Dmitry V. Ponomarev, Nael B. Abu-Ghazaleh |
| 2016 | KLAP: Kernel launch aggregation and promotion for optimizing dynamic parallelism. Izzat El Hajj, Juan Gómez-Luna, Cheng Li, Li-Wen Chang, Dejan S. Milojicic, Wen-mei W. Hwu |
| 2016 | Keynotes: Internet of Things: History and hype, technology and policy. Margaret Martonosi |
| 2016 | Lazy release consistency for GPUs. Johnathan Alsop, Marc S. Orr, Bradford M. Beckmann, David A. Wood |
| 2016 | Low-cost soft error resilience with unified data verification and fine-grained recovery for acoustic sensor based detection. Qingrui Liu, Changhee Jung, Dongyoon Lee, Devesh Tiwari |
| 2016 | MIMD synchronization on SIMT architectures. Ahmed ElTantawy, Tor M. Aamodt |
| 2016 | NEUTRAMS: Neural network transformation and co-design under neuromorphic hardware constraints. Yu Ji, Youhui Zhang, Shuangchen Li, Ping Chi, Cihang Jiang, Peng Qu, Yuan Xie, Wenguang Chen |
| 2016 | NeSC: Self-virtualizing nested storage controller. Yonatan Gottesman, Yoav Etsion |
| 2016 | OSCAR: Orchestrating STT-RAM cache traffic for heterogeneous CPU-GPU architectures. Jia Zhan, Onur Kayiran, Gabriel H. Loh, Chita R. Das, Yuan Xie |
| 2016 | Path confidence based lookahead prefetching. Jinchun Kim, Seth H. Pugsley, Paul V. Gratz, A. L. Narasimha Reddy, Chris Wilkerson, Zeshan Chishti |
| 2016 | Perceptron learning for reuse prediction. Elvira Teran, Zhe Wang, Daniel A. Jiménez |
| 2016 | PoisonIvy: Safe speculation for secure memory. Tamara Silbergleit Lehman, Andrew D. Hilton, Benjamin C. Lee |
| 2016 | Quantifying and improving the efficiency of hardware-based mobile malware detectors. Mikhail Kazdagli, Vijay Janapa Reddi, Mohit Tiwari |
| 2016 | Racer: TSO consistency via race detection. Alberto Ros, Stefanos Kaxiras |
| 2016 | Redefining QoS and customizing the power management policy to satisfy individual mobile users. Kaige Yan, Xingyao Zhang, Jingweijia Tan, Xin Fu |
| 2016 | Reducing data movement energy via online data clustering and encoding. Shibo Wang, Engin Ipek |
| 2016 | Register sharing for equality prediction. Arthur Perais, Fernando A. Endo, André Seznec |
| 2016 | ReplayConfusion: Detecting cache-based covert channel attacks using record and replay. Mengjia Yan, Yasser Shalabi, Josep Torrellas |
| 2016 | SABRes: Atomic object reads for in-memory rack-scale computing. Alexandros Daglis, Dmitrii Ustiugov, Stanko Novakovic, Edouard Bugnion, Babak Falsafi, Boris Grot |
| 2016 | Snatch: Opportunistically reassigning power allocation between processor and memory in 3D stacks. Dimitrios Skarlatos, Renji Thomas, Aditya Agrawal, Shibin Qin, Robert C. N. Pilawa-Podgurski, Ulya R. Karpuzcu, Radu Teodorescu, Nam Sung Kim, Josep Torrellas |
| 2016 | Spectral profiling: Observer-effect-free profiling by monitoring EM emanations. Nader Sehatbakhsh, Alireza Nazari, Alenka G. Zajic, Milos Prvulovic |
| 2016 | Stripes: Bit-serial deep neural network computing. Patrick Judd, Jorge Albericio, Tayler H. Hetherington, Tor M. Aamodt, Andreas Moshovos |
| 2016 | The Bunker Cache for spatio-value approximation. Joshua San Miguel, Jorge Albericio, Natalie D. Enright Jerger, Aamer Jaleel |
| 2016 | The microarchitecture of a real-time robot motion planning accelerator. Sean Murray, William Floyd-Jones, Ying Qi, George Dimitri Konidaris, Daniel J. Sorin |
| 2016 | Ti-states: Processor power management in the temperature inversion region. Yazhou Zu, Wei Huang, Indrani Paul, Vijay Janapa Reddi |
| 2016 | Towards efficient server architecture for virtualized network function deployment: Implications and implementations. Yang Hu, Tao Li |
| 2016 | Zorua: A holistic approach to resource virtualization in GPUs. Nandita Vijaykumar, Kevin Hsieh, Gennady Pekhimenko, Samira Manabi Khan, Ashish Shrestha, Saugata Ghose, Adwait Jog, Phillip B. Gibbons, Onur Mutlu |
| 2016 | pTask: A smart prefetching scheme for OS intensive applications. Prathmesh Kallurkar, Smruti R. Sarangi |
| 2016 | vDNN: Virtualized deep neural networks for scalable, memory-efficient neural network design. Minsoo Rhu, Natalia Gimelshein, Jason Clemons, Arslan Zulfiqar, Stephen W. Keckler |