| 1995 | A limit study of local memory requirements using value reuse profiles. Andrew S. Huang, John Paul Shen |
| 1995 | A modified approach to data cache management. Gary S. Tyson, Matthew K. Farrens, John Matthews, Andrew R. Pleszkun |
| 1995 | A system level perspective on branch architecture performance. Brad Calder, Dirk Grunwald, Joel S. Emer |
| 1995 | Alternative implementations of hybrid branch predictors. Po-Yung Chang, Eric Hao, Yale N. Patt |
| 1995 | An effective programmable prefetch engine for on-chip caches. Tien-Fu Chen |
| 1995 | An experimental study of several cooperative register allocation and instruction scheduling strategies. Cindy Norris, Lori L. Pollock |
| 1995 | An investigation of the performance of various instruction-issue buffer topologies. Stéphan Jourdan, Pascal Sainrat, Daniel Litaize |
| 1995 | Cache miss heuristics and preloading techniques for general-purpose programs. Toshihiro Ozawa, Yasunori Kimura, Shin'ichiro Nishizaki |
| 1995 | Control flow prediction with tree-like subgraphs for superscalar processors. Simonjit Dutta, Manoj Franklin |
| 1995 | Critical path reduction for scalar programs. Michael S. Schlansker, Vinod Kathail |
| 1995 | Decoupling integer execution in superscalar processors. Subbarao Palacharla, James E. Smith |
| 1995 | Design of storage hierarchy in multithreaded architectures. Lucas Roh, Walid A. Najjar |
| 1995 | Disjoint eager execution: an optimal form of speculative execution. Augustus K. Uht, Vijay Sindagi, Kelley Hall |
| 1995 | Dynamic path-based branch correlation. Ravi Nair |
| 1995 | Dynamic rescheduling: a technique for object code compatibility in VLIW architectures. Thomas M. Conte, Sumedh W. Sathaye |
| 1995 | Efficient instruction scheduling using finite state automata. Vasanth Bala, Norman Rubin |
| 1995 | Exploiting short-lived variables in superscalar processors. Luis A. Lozano, Guang R. Gao |
| 1995 | Hypernode reduction modulo scheduling. Josep Llosa, Mateo Valero, Eduard Ayguadé, Antonio González |
| 1995 | Improving CISC instruction decoding performance using a fill unit. Mark Smotherman, Manoj Franklin |
| 1995 | Improving instruction-level parallelism by loop unrolling and dynamic memory disambiguation. Jack W. Davidson, Sanjay Jinturkar |
| 1995 | Modulo scheduling with multiple initiation intervals. Nancy J. Warter-Perez, Noubar Partamian |
| 1995 | Partial resolution in branch target buffers. Barry S. Fagin, Kathryn Russell |
| 1995 | Partitioned register file for TTAs. Johan Janssen, Henk Corporaal |
| 1995 | Performance issues in correlated branch prediction schemes. Nicholas C. Gloy, Michael D. Smith, Cliff Young |
| 1995 | Petri net versus modulo scheduling for software pipelining. Vicki H. Allan, U. R. Shah, K. M. Reddy |
| 1995 | Proceedings of the 28th Annual International Symposium on Microarchitecture, Ann Arbor, Michigan, USA, November 29 - December 1, 1995. Trevor N. Mudge, Kemal Ebcioglu |
| 1995 | Region-based compilation: an introduction and motivation. Richard E. Hank, Wen-mei W. Hwu, B. Ramakrishna Rau |
| 1995 | Register allocation for predicated code. Alexandre E. Eichenberger, Edward S. Davidson |
| 1995 | SPAID: software prefetching in pointer- and call-intensive environments. Mikko H. Lipasti, William J. Schmidt, Steven R. Kunkel, Robert R. Roediger |
| 1995 | Self-regulation of workload in the Manchester Data-Flow computer. John R. Gurd, David F. Snelling |
| 1995 | Spill-free parallel scheduling of basic blocks. B. Natarajan, Michael S. Schlansker |
| 1995 | Stage scheduling: a technique to reduce the register requirements of a modulo schedule. Alexandre E. Eichenberger, Edward S. Davidson |
| 1995 | The M-Machine multicomputer. Marco Fillo, Stephen W. Keckler, William J. Dally, Nicholas P. Carter, Andrew Chang, Yevgeny Gurevich, Whay Sing Lee |
| 1995 | The performance impact of incomplete bypassing in processor pipelines. Pritpal S. Ahuja, Douglas W. Clark, Anne Rogers |
| 1995 | The predictability of branches in libraries. Brad Calder, Dirk Grunwald, Amitabh Srivastava |
| 1995 | The role of adaptivity in two-level adaptive branch prediction. Stuart Sechrest, Chih-Chieh Lee, Trevor N. Mudge |
| 1995 | Unrolling-based optimizations for modulo scheduling. Daniel M. Lavery, Wen-mei W. Hwu |
| 1995 | Zero-cycle loads: microarchitecture support for reducing load latency. Todd M. Austin, Gurindar S. Sohi |