HP Text Chapter 1
(John L. Hennessy and David A. Patterson, Computer Architecture: A
Quantitative Approach, Third Edition, Morgan Kaufmann, 2003.)
G. M. Amdahl, ``Validity of the single-processor approach to achieving large scale computing capabilities,'' In AFIPS Conference Proceedings,
pp. 483--485, April 1967.
G. Radin, ``The 801 Minicomputer,'' In Proceedings of the International Symposium on Architectural Support for Programming Languages and Operating Systems, pp. 39--47, March 1982.
D. A. Patterson and D. R. Ditzel, ``The Case for the Reduced Instruction Set Computer,'' in ACM SIGARCH Computer Architecture News, vol. 8, pp. 25--33, October, 1980.
J. S. Emer and D. W. Clark, ``A Characterization of Processor Performance in the VAX--11/780,'' In Proceedings of the 11th International Symposium on Computer Architecture, pp. 301--310, June 1984.
J. S. Emer and D. W. Clark, ``Retrospective: A Characterization of Processor Performance in the VAX--11/780,'' In 25 Years of the International Symposia on Computer Architecture: Selected Papers, pp. 37--38, 1998.
D. Bhandarkar and D. W. Clark, ``Performance from Architecture:
Comparing a RISC and a CISC with Similar Hardware Organization,''
In Proceedings of the Fourth International Conference on Architectural Support for
Programming Languages and Operating Systems, pp. 310--319,
April, 1991.
R. M. Tomasulo, ``An Efficient Algorithm for Exploiting Multiple Arithmetic Units,'' In IBM Journal of Research and Development, Volume 11, pp. 25--33, January, 1967.
Subbarao Palacharla, Norman P. Jouppi, and James E. Smith.
Quantifying the Complexity of Superscalar Processors, University of
Wisconsin, CS-TR-1996-1328. [CiteSeer link]
Joseph A. Fisher, ``Very Long Instruction Word Architectures and the ELI-512,'' In The Tenth International Symposium on Computer Architecture, 1983.
Joseph A. Fisher, ``Retrospective: Very Long Instruction Word Architectures and the ELI-512,'' In 25 Years of the International Symposia on Computer Architecture: Selected Papers, pp. 34--36, 1998.
Joseph A. Fisher and Stefan M. Freudenberger, ``Predicting Conditional Branch Directions from Previous Runs of a Program,'' In Proceedings of the Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 85--95, 1992.
Erik R. Altman, David Kaeli, and Yaron Sheffer, ``Welcome to the Opportunities of Binary Translation,'' IEEE Computer, Volume 33, Number 3,
pp. 40--45, March, 2000.
Cindy Zheng and Carol Thompson, ``PA-RISC to IA-64: Transparent Execution, No Recompilation,'' IEEE Computer, Volume 33, Number 3,
pp. 47--52, March, 2000.
Michael Gschwind, Erik R. Altman, Sumedh Sathaye, Paul Ledak, and David Appenzeller, ``Dynamic and Transparent Binary Translation,'' IEEE Computer, Volume 33, Number 3, pp. 54--59, March, 2000.
Supplemental:
Richard L. Sites, Anton Chernoff, Matthew B. Kirk, Maurice P. Marks,
and Scott G. Robinson, ``Binary Translation,'' In Digital Technical Journal, Volume 4, Number 4, 1992.
Alexander Klaiber,
``The Technology Behind Crusoe(TM) Processors,'' January, 2000.
[page with link to PDF]
Peter Markstein, IA-64 and Elementary Functions, Hewlett-Packard, 2000. (read pp. 9--40 for class)
Johannes M. Mulder, Nhon T. Quach, and Michael J. Flynn, ``An Area Model for On-Chip Memories and its Application,'' IEEE Journal of Solid State Circuits, Volume 26, Number 2, pp. 98--106, February, 1991.
D. M. Tullsen, S. J. Eggers, J. S. Emer, H. M. Levy, J. L. Lo, and R. L. Stamm, ``Exploiting choice: Instruction fetch and issue on an implementable simultaneous multithreading processor,'' Proceedings of the 23rd Annual International Symposium on Computer Architecture, pp. 191--202, May 1996.
James Burns and Jean-Luc Gaudiot, ``Quantifying the SMT Layout Overhead---Does SMT Pull Its Weight?,'' Proceedings of the International Symposium on High-Performance Computer Architecture, pp. 109--120, 1999.
Mark S. Papamarcos and Janak H. Patel, ``A Low-Overhead Coherence Solution for Multiprocessors with Private Cache Memories,'' In Proceedings of the 11th International Symposium on Computer Architecture, 1984.
Janak H. Patel, ``Retrospective: A Low-Overhead Coherence Solution for Multiprocessors with Private Cache Memories,'' In 25 Years of the International Symposia on Computer Architecture: Selected Papers, pp. 39--41, 1998.
Michael Noakes, Deborah Wallach, and William J. Dally.
``The J-Machine Multicomputer: An Architectural Evaluation,''
In Proceedings of the 20th International Symposium on
Computer Architecture, pp. 224--235, May 1993.
William J. Dally, Andrew Chang, Andrew Chien, Stuart Fiske,
Waldemar Horwat, John Keen, Richard Lethin, Michael Noakes,
Peter Nuth, Ellen Spertus, Deborah Wallach, and D. Scott Wills,
``Retrospective: The J-Machine,''
In 25 Years of the International Symposia on Computer Architecture: Selected Papers, pp. 54--78, 1998.
John R. Hauser and John Wawrzynek. ``Garp: A MIPS Processor with a
Reconfigurable Coprocessor,'' in Proceedings of the IEEE Symposium on
Field-Programmable Custom Computing Machines (FCCM '97, April 16-18,
1997), pp. 24--33.
[Abstract and pointers] (N.B. earlier version with more
architecture details...can probably scan through parts of it after reading previous)
Supplemental:
Vincent Michael Bove, Jr. and John A. Watlington.
Cheops: A Reconfigurable Data-Flow System for Video Processing.
IEEE Transactions on Circuits and Systems for Video Technology,
5(2):140--149, April 1995. [HTML]
L. M. Ni and P. K. McKinley, ``A Survey of Wormhole Routing Techniques
in Direct Networks,'' IEEE Computer, 26(2):62--76, 1993.
Supplemental:
B. M. Maggs, ``Randomly Wired Multistage Networks,'' in Statistical
Science, 8(1):70--74, February, 1993.
[Abstract and links] -- good survey/starting point.
S. Arora, B. M. Maggs, and F. T. Leighton, ``On-line Algorithms for
Path Selection in a Nonblocking Network,''
in SIAM Journal on Computing, 25(3):600--625, June 1996.
[Abstract and links] -- when you want all the details and proofs.
Frederic Chong, Eran Egozy, and Andre DeHon.
Fault Tolerance and Performance of Multipath Multistage
Interconnection Networks.
In Thomas F. Knight Jr. and John Savage, editors, Advanced
Research in VLSI and Parallel Systems 1992, pages 227-242. MIT Press, March
1992.
[PDF][PS]
William J. Dally and Charles L. Seitz, ``The Torus Routing Chip,''
Distributed Computing 1:187--196, 1986.