| 1996 | A Persistent Rescheduled-page Cache for Low Overhead Object Code Compatibility in VLIW Architectures. Thomas M. Conte, Sumedh W. Sathaye, Sanjeev Banerjia |
| 1996 | Accurate and Practical Profile-driven Compilation Using the Profile Buffer. Thomas M. Conte, Kishore N. Menezes, Mary Ann Hirsch |
| 1996 | Analysis Techniques for Predicated Code. Richard Johnson, Michael S. Schlansker |
| 1996 | Assigning Confidence to Conditional Branch Predictions. Erik Jacobsen, Eric Rotenberg, James E. Smith |
| 1996 | Combining Loop Transformations Considering Caches and Scheduling. Michael E. Wolf, Dror E. Maydan, Ding-Kai Chen |
| 1996 | Compiler Synthesized Dynamic Branch Prediction. Scott A. Mahlke, Balas K. Natarajan |
| 1996 | Custom-fit Processors: Letting Applications Define Architectures. Joseph A. Fisher, Paolo Faraboschi, Giuseppe Desoli |
| 1996 | Design Decisions Influencing the UltraSPARC's Instruction Fetch Architecture. Robert Yung |
| 1996 | Efficient Path Profiling. Thomas Ball, James R. Larus |
| 1996 | Exceeding the Dataflow Limit via Value Prediction. Mikko H. Lipasti, John Paul Shen |
| 1996 | Global Predicate Analysis and Its Application to Register Allocation. David M. Gillies, Roy Dz-Ching Ju, Richard Johnson, Michael S. Schlansker |
| 1996 | Heuristics for Register-Constrained Software Pipelining. Josep Llosa, Mateo Valero, Eduard Ayguadé |
| 1996 | Hot Cold Optimization of Large Windows/NT Applications. Robert S. Cohn, P. Geoffrey Lowney |
| 1996 | Increasing the Instruction Fetch Rate via Block-structured Instruction Set Architectures. Eric Hao, Po-Yung Chang, Marius Evers, Yale N. Patt |
| 1996 | Instruction Fetch Mechanisms for VLIW Architectures with Compressed Encodings. Thomas M. Conte, Sanjeev Banerjia, Sergei Y. Larin, Kishore N. Menezes, Sumedh W. Sathaye |
| 1996 | Instruction Scheduling and Executable Editing. Eric Schnarr, James R. Larus |
| 1996 | Instruction Scheduling for the HP PA-8000. David A. Dunn, Wei-Chung Hsu |
| 1996 | Integrating a Misprediction Recovery Cache (MRC) into a Superscalar Pipeline. James O. Bondi, Ashwini K. Nanda, Simonjit Dutta |
| 1996 | Java Bytecode to Native Code Translation: The Caffeine Prototype and Preliminary Results. Cheng-Hsueh A. Hsieh, John C. Gyllenhaal, Wen-mei W. Hwu |
| 1996 | Meld Scheduling: Relaxing Scheduling Constraints Across Region Boundaries. Santosh G. Abraham, Vinod Kathail, Brian L. Deitrich |
| 1996 | Modulo Scheduling of Loops in Control-intensive Non-numeric Programs. Daniel M. Lavery, Wen-mei W. Hwu |
| 1996 | Optimization for a Superscalar Out-of-Order Machine. Anne M. Holler |
| 1996 | Optimization of Machine Descriptions for Efficient Use. John C. Gyllenhaal, Wen-mei W. Hwu, B. Ramakrishna Rau |
| 1996 | Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 29, Paris, France, December 2-4, 1996. Stephen W. Melvin, Steve Beaty |
| 1996 | Profile-driven Instruction Level Parallel Scheduling with Application to Super Blocks. Chandra Chekuri, Richard Johnson, Rajeev Motwani, B. Natarajan, B. Ramakrishna Rau, Michael S. Schlansker |
| 1996 | Software Pipelining Loops with Conditional Branches. Mark G. Stoodley, Corinna G. Lee |
| 1996 | Speculative Hedge: Regulating Compile-time Speculation Against Profile Variations. Brian L. Deitrich, Wen-mei W. Hwu |
| 1996 | Tango: A Hardware-Based Data Prefetching Technique for Superscalar Processors. Shlomit S. Pinter, Adi Yoaz |
| 1996 | The Performance Potential of Data Dependence Speculation & Collapsing. Yiannakis Sazeides, Stamatis Vassiliadis, James E. Smith |
| 1996 | Trace Cache: A Low Latency Approach to High Bandwidth Instruction Fetching. Eric Rotenberg, Steve Bennett, James E. Smith |
| 1996 | Wrong-path Instruction Prefetching. Jim Pierce, Trevor N. Mudge |