Slide 2: Architecture & Organization
Architecture is those attributes visible to the programmer
Instruction set, number of bits used for data representation, I/O mechanisms, addressing techniques.
e.g. Is there a multiply instruction?
Organization is how features are implemented
Control signals, interfaces, memory technology.
e.g. Is there a hardware multiply unit or is it done by repeated addition?
![Architecture & Organization Architecture is those attributes visible to the programmer Instruction](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-1.jpg)
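The multiply example can be sketched in code: the same architectural operation (multiply) backed by two different organizations. This is a minimal sketch, assuming non-negative integer operands. Both function names are hypothetical, and Python's `*` operator merely stands in for a hardware multiply unit.

```python
def multiply_repeated_addition(a: int, b: int) -> int:
    """Multiply two non-negative integers using only addition."""
    if a < 0 or b < 0:
        raise ValueError("sketch handles non-negative operands only")
    result = 0
    for _ in range(b):  # add a to the running total, b times
        result += a
    return result


def multiply_hardware(a: int, b: int) -> int:
    """Stand-in for a hardware multiply unit (assumption: Python's * operator)."""
    return a * b


# Same result either way; only the cost (number of steps) differs.
print(multiply_repeated_addition(6, 7))
print(multiply_hardware(6, 7))
```

A program sees the same answer from both versions, which is why the choice belongs to organization rather than architecture.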
Slide 3: Architecture & Organization
All Intel x86 family share the same basic architecture
The IBM System/370 family share the same basic architecture
This gives code compatibility
At least backwards
Organization differs between different versions
![Architecture & Organization All Intel x86 family share the same basic architecture](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-2.jpg)
Slide 4: Structure & Function
Structure is the way in which components relate to each other
Function is the operation of individual components as part of the structure
![Structure & Function Structure is the way in which components relate to](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-3.jpg)
Slide 5: Function
All computer functions are:
Data processing
Data storage
Data movement and
Control
![Function All computer functions are: Data processing Data storage Data movement and Control](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-4.jpg)
Slide 7: Structure - Top Level
Computer: Central Processing Unit, Main Memory, Input/Output, Systems Interconnection
Outside the computer: Peripherals, Communication lines
![Structure - Top Level Computer Main Memory Input Output Systems Interconnection Peripherals](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-6.jpg)
Slide 8: Structure - The CPU
CPU: Arithmetic and Logic Unit, Control Unit, Registers, Internal CPU Interconnection
Context within the computer: CPU, Memory, I/O, System Bus
![Structure - The CPU Computer Arithmetic and Login Unit Control Unit Internal](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-7.jpg)
Slide 9: Structure - The Control Unit
Control Unit: Sequencing Logic, Control Unit Registers and Decoders, Control Memory
Context within the CPU: Control Unit, ALU, Registers, Internal Bus
![Structure - The Control Unit CPU Control Memory Control Unit Registers and](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-8.jpg)
Slide 10: ENIAC - background
Electronic Numerical Integrator And Computer
University of Pennsylvania
Trajectory tables for weapons
Started 1943 and finished 1946
Too late for war effort
Used until 1955
![ENIAC - background Electronic Numerical Integrator And Computer University of Pennsylvania Trajectory](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-9.jpg)
Slide 11: ENIAC - details
Decimal (not binary)
20 accumulators of 10 digits
Programmed manually by switches
18,000 vacuum tubes and 30 tons
15,000 sq. ft and 140 kW power consumption
5,000 additions per second
![ENIAC - details Decimal (not binary) 20 accumulators of 10 digits Programmed](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-10.jpg)
Slide 12: von Neumann/Turing
Stored Program concept
Main memory storing programs and data
ALU operating on binary data
Control unit interpreting instructions from memory and executing
Input and output equipment operated by control unit
IAS computer at the Princeton Institute for Advanced Studies, completed 1952
![von Neumann/Turing Stored Program concept (1952) Main memory storing programs and data](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-11.jpg)
Slide 13: Structure of von Neumann machine
![Structure of von Neumann machine](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-12.jpg)
Slide 14: Transistors
Replaced vacuum tubes
Smaller and cheaper
Less heat dissipation
Solid state device, made from silicon (sand)
Invented 1947 at Bell Labs
William Shockley et al.
![Transistors Replaced vacuum tubes Smaller and Cheaper Less heat dissipation Solid State](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-13.jpg)
Slide 15: Transistor Based Computers
Second generation machines
NCR & RCA produced small transistor machines
IBM 7000
DEC - 1957
Produced PDP-1
![Transistor Based Computers Second generation machines NCR & RCA produced small transistor](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-14.jpg)
Slide 16: Microelectronics
Literally - “small electronics”
A computer is made up of gates, memory cells and interconnections
These can be manufactured on a semiconductor
e.g. silicon wafer
![Microelectronics Literally - “small electronics” A computer is made up of gates,](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-15.jpg)
Slide 17: Generations of Computer
Vacuum tube - 1946-1957
Transistor - 1958-1964
Small scale integration - 1965 on
Up to 100 devices on a chip
Medium scale integration - to 1971
100 - 3,000 devices on a chip
Large scale integration - 1971-1977
3,000 - 100,000 devices on a chip
Very large scale integration - 1978 to date
100,000 - 100,000,000 devices on a chip
Ultra large scale integration
Over 100,000,000 devices on a chip
![Generations of Computer Vacuum tube - 1946-1957 Transistor - 1958-1964 Small scale](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-16.jpg)
Slide 19: CPU Structure
CPU must:
Fetch instructions
Interpret instructions
Fetch data
Process data
Write data
![CPU Structure CPU must: Fetch instructions Interpret instructions Fetch data Process data Write data](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-18.jpg)
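The five CPU duties listed on this slide can be sketched as a loop over a toy accumulator machine. This is only an illustration under stated assumptions: the four opcodes (`LOAD`, `ADD`, `STORE`, `HALT`), the tuple encoding, and the function name `run` are all hypothetical and do not correspond to any real instruction set.

```python
def run(memory):
    """Run a toy accumulator machine until HALT; returns the final memory."""
    pc = 0   # program counter
    acc = 0  # accumulator register
    while True:
        instruction = memory[pc]         # fetch instruction
        pc += 1
        opcode, address = instruction    # interpret (decode) instruction
        if opcode == "LOAD":
            acc = memory[address]        # fetch data
        elif opcode == "ADD":
            acc = acc + memory[address]  # fetch data, then process data
        elif opcode == "STORE":
            memory[address] = acc        # write data
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode: {opcode}")


# Program and data share one memory (hypothetical layout):
# cells 0-3 hold instructions, cells 4-6 hold data.
program = [
    ("LOAD", 4),
    ("ADD", 5),
    ("STORE", 6),
    ("HALT", 0),
    2, 3, 0,
]
result = run(program)
print(result[6])  # 5
```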
Slide 22: Registers
CPU must have some working space (temporary storage)
Called registers
Number and function vary between processor designs
One of the major design decisions
Top level of memory hierarchy
![Registers CPU must have some working space (temporary storage) Called registers Number](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-21.jpg)
Slide 23: User Visible Registers
General Purpose
Data
Address
Condition Codes
![User Visible Registers General Purpose Data Address Condition Codes](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-22.jpg)
Slide 24: General Purpose Registers (1)
May be true general purpose
May be restricted
May be used for data or addressing
Data
Accumulator
Addressing
Segment
![General Purpose Registers (1) May be true general purpose May be restricted](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-23.jpg)
Slide 25: General Purpose Registers (2)
Make them general purpose
Increase flexibility and programmer options
Increase instruction size & complexity
Make them specialized
Smaller (faster) instructions
Less flexibility
![General Purpose Registers (2) Make them general purpose Increase flexibility and programmer](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-24.jpg)
Slide 26: How Many GP Registers?
Between 8 and 32
Fewer = more memory references
RISC
![How Many GP Registers? Between 8 – 32 Fewer = more memory references RISC](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-25.jpg)
Slide 27: How big?
Large enough to hold full address
Large enough to hold full word
Often possible to combine two data registers
![How big? Large enough to hold full address Large enough to hold](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-26.jpg)
C programming
long long int a;
long int a;
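Combining two data registers to hold one wider value can be sketched as follows. This is a minimal sketch, assuming 32-bit registers. The names `combine`, `split` and `add_pair` are hypothetical helpers for illustration, not part of any real toolchain.

```python
MASK32 = 0xFFFFFFFF  # assumption: each register is 32 bits wide


def combine(high: int, low: int) -> int:
    """Treat two 32-bit registers as one 64-bit value."""
    return ((high & MASK32) << 32) | (low & MASK32)


def split(value: int) -> tuple:
    """Split a 64-bit value back into (high, low) 32-bit halves."""
    return (value >> 32) & MASK32, value & MASK32


def add_pair(a_high, a_low, b_high, b_low):
    """Add two register pairs, propagating the carry from low to high half."""
    low_sum = a_low + b_low
    carry = low_sum >> 32                      # 1 if the low halves overflowed
    high_sum = (a_high + b_high + carry) & MASK32
    return high_sum, low_sum & MASK32


print(hex(combine(0x1, 0x2)))             # 0x100000002
print(add_pair(0, 0xFFFFFFFF, 0, 1))      # (1, 0)
```

The carry passed from the low half to the high half is what lets a pair of narrow registers behave like a single wide one.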
Slide 28: Condition Code Registers
Sets of individual bits
e.g. result of last operation was zero
Can be read (implicitly) by programs
e.g. Jump if zero
Cannot (usually) be set by programs
![Condition Code Registers Sets of individual bits e.g. result of last operation](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-27.jpg)
Slide 29: Control & Status Registers
Program Counter
Instruction Decoding Register
Memory Address Register
Memory Buffer Register
![Control & Status Registers Program Counter Instruction Decoding Register Memory Address Register Memory Buffer Register](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-28.jpg)
Slide 30: Program Status Word
A set of bits
Includes Condition Codes
Sign of last result
Zero
Carry
Equal
Overflow
Interrupt enable/disable
Supervisor
![Program Status Word A set of bits Includes Condition Codes Sign of](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-29.jpg)
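How condition codes get set as a side effect of an operation can be sketched for an 8-bit addition. This is a sketch under assumptions: the 8-bit width, the flag names in the dictionary, and the function `add8_with_flags` are illustrative choices, and only four of the flags listed above (Sign, Zero, Carry, Overflow) are modelled.

```python
def add8_with_flags(a: int, b: int):
    """Add two 8-bit values; return (result, flags) like a simple ALU would."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF  # keep only the low 8 bits
    flags = {
        "zero": result == 0,
        "sign": bool(result & 0x80),   # top bit set: negative in two's complement
        "carry": total > 0xFF,         # unsigned result did not fit in 8 bits
        # signed overflow: operands share a sign that differs from the result's
        "overflow": bool(~(a ^ b) & (a ^ result) & 0x80),
    }
    return result, flags


result, flags = add8_with_flags(0x7F, 0x01)
print(hex(result), flags)

# "Jump if zero" reads the flag implicitly; here it is an ordinary test.
result, flags = add8_with_flags(0xFF, 0x01)
if flags["zero"]:
    print("result was zero, branch taken")
```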
Slide 32: Intel
1971 - 4004
First microprocessor
All CPU components on a single chip
4 bit
Followed in 1972 by 8008
8 bit
Both designed for specific applications
1974 - 8080
Intel’s first general purpose microprocessor
![Intel 1971 - 4004 First microprocessor All CPU components on a single](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-31.jpg)
Slide 33: Performance Mismatch
Processor speed increased
Memory capacity increased
Memory speed lags behind processor speed
![Performance Mismatch Processor speed increased Memory capacity increased Memory speed lags behind processor speed](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-32.jpg)
Slide 34: DRAM and Processor Characteristics
![DRAM and Processor Characteristics](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-33.jpg)
Slide 35: Solutions
Increase number of bits retrieved at one time
Make DRAM “wider” rather than “deeper”
Change DRAM interface
Cache
Reduce frequency of memory access
More complex cache and cache on chip
Increase interconnection bandwidth
High speed buses
![Solutions Increase number of bits retrieved at one time Make DRAM “wider”](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-34.jpg)
Slide 36: Pentium Evolution (1)
8080
first general purpose microprocessor
8 bit data path
Used in first personal computer – Altair
8086
much more powerful
16 bit
instruction cache, prefetch few instructions
8088 (8 bit external bus) used in first IBM PC
80286
16 Mbyte memory addressable
80386
32 bit
Support for multitasking
![Pentium Evolution (1) 8080 first general purpose microprocessor 8 bit data path](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-35.jpg)
Slide 37: Pentium Evolution (2)
80486
sophisticated powerful cache and instruction pipelining
built in math co-processor
Pentium
Superscalar
Multiple instructions executed in parallel
Pentium Pro
Increased superscalar organization
Aggressive register renaming
branch prediction
data flow analysis
speculative execution
![Pentium Evolution (2) 80486 sophisticated powerful cache and instruction pipelining built in](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-36.jpg)
Slide 38: Speeding it up
Pipelining
On board L1 & L2 cache
Branch prediction
Data flow analysis and speculative execution
![Speeding it up Pipelining On board L1 & L2 cache Branch prediction](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-37.jpg)
Slide 39: Cache
Small amount of fast memory
Sits between normal main memory and CPU
May be located on CPU chip or module
![Cache Small amount of fast memory Sits between normal main memory and](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-38.jpg)
Slide 42: Pentium Evolution (3)
Pentium II
MMX technology
graphics, video & audio processing
Pentium III
Additional floating point instructions for 3D graphics
Pentium 4
Note Arabic rather than Roman numerals
Further floating point and multimedia enhancements
Itanium
64 bit
![Pentium Evolution (3) Pentium II MMX technology graphics, video & audio processing](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-41.jpg)
Slide 43: Pentium 4 Cache
80386 – no on chip cache
80486 – 8k using 16 byte lines and four way set associative organization
Pentium (all versions) – two on chip L1 caches
Data & instructions
Pentium 4 – L1 caches
8k bytes
64 byte lines
four way set associative
L2 cache
Feeding both L1 caches
256k and 128 byte lines
8 way set associative
![Pentium 4 Cache 80386 – no on chip cache 80486 – 8k](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-42.jpg)
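The cache parameters on this slide determine how many lines and sets each cache has. A worked sketch follows, assuming "k" means 1024 bytes. The function name `cache_geometry` is hypothetical.

```python
def cache_geometry(size_bytes: int, line_bytes: int, ways: int):
    """Return (lines, sets, offset_bits, index_bits) for a set associative cache."""
    lines = size_bytes // line_bytes         # total number of cache lines
    sets = lines // ways                     # lines are grouped into sets of `ways`
    offset_bits = line_bytes.bit_length() - 1  # log2, valid for powers of two
    index_bits = sets.bit_length() - 1
    return lines, sets, offset_bits, index_bits


K = 1024  # assumption: 1k = 1024 bytes

print(cache_geometry(8 * K, 16, 4))     # 80486:        (512, 128, 4, 7)
print(cache_geometry(8 * K, 64, 4))     # Pentium 4 L1: (128, 32, 6, 5)
print(cache_geometry(256 * K, 128, 8))  # Pentium 4 L2: (2048, 256, 7, 8)
```

The offset bits select a byte within a line and the index bits select a set; the remaining address bits form the tag.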
Slide 45: Background to IA-64
Pentium 4 appears to be last in x86 line
Intel & Hewlett-Packard (HP) jointly developed
New architecture
64 bit architecture
Not extension of x86
Not adaptation of HP 64bit RISC architecture
Exploits vast circuitry and high speeds
Systematic use of parallelism
![Background to IA-64 Pentium 4 appears to be last in x86 line](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-44.jpg)
Slide 46: Motivation
Instruction level parallelism
Explicit in machine instruction
Not determined at run time by processor
Long or very long instruction words (LIW/VLIW)
Branch predication (not the same as branch prediction)
Speculative loading
Intel & HP call this Explicitly Parallel Instruction Computing (EPIC)
IA-64 is an instruction set architecture intended for implementation on EPIC
![Motivation Instruction level parallelism Implicit in machine instruction Not determined at run](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-45.jpg)
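The difference between branching and branch predication can be illustrated in a high-level sketch. This is not IA-64 code. It only mimics the idea that both arms are computed, each guarded by a predicate, and only the arm whose predicate is true contributes to the result. The function names are hypothetical.

```python
def abs_diff_branching(a: int, b: int) -> int:
    """Conventional version: a branch chooses which arm executes."""
    if a > b:
        return a - b
    return b - a


def abs_diff_predicated(a: int, b: int) -> int:
    """Predicated version: both arms execute, predicates select the result."""
    p = int(a > b)       # the compare sets predicate p (1 or 0)
    not_p = 1 - p        # and its complement
    then_value = a - b   # arm guarded by p
    else_value = b - a   # arm guarded by not_p
    # No branch: the arm with a false predicate contributes nothing.
    return p * then_value + not_p * else_value


print(abs_diff_branching(3, 10), abs_diff_predicated(3, 10))
```

Branch prediction, by contrast, guesses which arm will run and executes only that one, recovering if the guess was wrong.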
Slide 48: Why New Architecture?
Not hardware compatible with x86
Now have tens of millions of transistors available on chip
Could build bigger cache
Diminishing returns
Add more execution units
Increase superscaling
More units makes processor “wider”
More logic needed to orchestrate
Improved branch prediction required
Longer pipelines required
At most six instructions per cycle
![Why New Architecture? Not hardware compatible with x86 Now have tens of](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-47.jpg)
Slide 49
CLOSEST POINT OF APPROACH
TCAS
INTRUDER
CPA
B
A
![CLOSEST POINT OF APPROACH TCAS INTRUDER CPA B A](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-48.jpg)
Slide 55: INSTRUMENTATION IN AIRBUS A - 320
![INSTRUMENTATION IN AIRBUS A - 320](/_ipx/f_webp&q_80&fit_contain&s_1440x1080/imagesDir/jpg/842539/slide-54.jpg)