278500-Apresentações
[Figure residue from the preceding slide: flowchart labels "cache miss stall while read block", "TLB miss exception", "physical address".]
© 2004 Morgan Kaufmann Publishers
TLBs and Caches
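[Figure: TLB and cache lookup. The 32-bit virtual address is split into a 20-bit virtual page number and a 12-bit page offset. The TLB (valid, dirty, tag, physical page number) translates the virtual page number into a 20-bit physical page number and signals a TLB hit. The resulting physical address is divided into physical address tag (18 bits), cache index (8 bits), block offset (4 bits), and byte offset (2 bits); the cache (valid, tag, data) compares tags, signals a cache hit, and returns 32 bits of data.]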
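The address split in the TLB and cache figure can be sketched in a few lines. This is a minimal illustration, assuming the field widths that appear in the figure labels (20-bit page number, 12-bit page offset, 18-bit tag, 8-bit index, 4-bit block offset, 2-bit byte offset). The function names and the dictionary standing in for the TLB are hypothetical.

```python
PAGE_OFFSET_BITS = 12  # assumed from the figure: 12-bit page offset


def split_virtual(vaddr):
    """Split a 32-bit virtual address into (virtual page number, page offset)."""
    vpn = vaddr >> PAGE_OFFSET_BITS
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    return vpn, offset


def translate(vaddr, tlb):
    """Translate through a dict-based stand-in for the TLB (hypothetical)."""
    vpn, offset = split_virtual(vaddr)
    if vpn not in tlb:
        raise LookupError("TLB miss")  # real hardware raises an exception here
    return (tlb[vpn] << PAGE_OFFSET_BITS) | offset


def split_physical(paddr):
    """Split a physical address into (tag, index, block offset, byte offset)."""
    byte_offset = paddr & 0x3           # 2 bits
    block_offset = (paddr >> 2) & 0xF   # 4 bits
    index = (paddr >> 6) & 0xFF         # 8 bits
    tag = paddr >> 14                   # remaining 18 bits
    return tag, index, block_offset, byte_offset


tlb = {0x12345: 0x00ABC}                # made-up mapping for illustration
paddr = translate(0x12345678, tlb)
fields = split_physical(paddr)
print(hex(paddr), [hex(f) for f in fields])
```

The page offset passes through translation unchanged, which is why the cache index and offsets can be taken from the low-order bits of the physical address.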

Modern Systems
• Things are getting complicated!

Some Issues
• Processor speeds continue to increase very fast
  – much faster than either DRAM or disk access times
• Design challenge: dealing with this growing disparity
  – Prefetching? 3rd level caches and more? Memory design?
[Figure: CPU versus memory performance by year; vertical axis "Performance" on a log scale from 1 to 100,000.]
Chapter 8

Interfacing Processors and Peripherals
• I/O design affected by many factors (expandability, resilience)
• Performance:
  – access latency
  – throughput
  – connection between devices and the system
  – the memory hierarchy
  – the operating system
• A variety of different users (e.g., banks, supercomputers, engineers)
[Figure: typical I/O system. The processor (with cache) receives interrupts and connects through the memory-I/O bus to main memory and to I/O controllers for two disks, graphics output, and the network.]

I/O Devices
• Very diverse devices
  – behavior (i.e., input vs. output)
  – partner (who is at the other end?)
  – data rate

I/O Example: Disk Drives
• To access data:
  – seek: position head over the proper track (3 to 14 ms avg.)
  – rotational latency: wait for desired sector (on average half a rotation, i.e. 0.5 / RPM minutes)
  – transfer: grab the data (one or more sectors) at 30 to 80 MB/sec
[Figure: disk structure showing platters, tracks, and sectors.]

Example
• What is the average time to read or write on a disk with:
  – 10,000 RPM
  – Average seek time of 6 ms
  – Transfer rate of 50 MB/s
  – Controller overhead of 0.2 ms
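One way to work the disk example is to add up seek time, average rotational latency (half a rotation), transfer time, and controller overhead. The slide does not give the amount of data transferred, so the sketch below assumes a single 512-byte sector and takes 1 MB as 10^6 bytes. Both are assumptions, as is the function name.

```python
def avg_disk_access_ms(rpm, seek_ms, transfer_mb_per_s, overhead_ms,
                       sector_bytes=512):
    """Average access time in ms for one sector (sector size is an assumption)."""
    rotations_per_s = rpm / 60.0
    rotational_ms = 0.5 / rotations_per_s * 1000.0        # half a rotation
    transfer_ms = sector_bytes / (transfer_mb_per_s * 1e6) * 1000.0
    return seek_ms + rotational_ms + transfer_ms + overhead_ms


total = avg_disk_access_ms(rpm=10_000, seek_ms=6.0,
                           transfer_mb_per_s=50.0, overhead_ms=0.2)
print(f"{total:.2f} ms")  # 6.0 + 3.0 + about 0.01 + 0.2
```

Under these assumptions, seek time and rotational latency dominate; the transfer of one sector contributes only about 0.01 ms.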

Failures
• Permanent
  – Reliability
• Intermittent
  – Availability
• MTTF: Mean time to failure
• MTTR: Mean time to repair
• MTBF: Mean time between failures (= MTTF + MTTR)
• Hot swap: disks can be replaced without shutting down the system
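The MTTF and MTTR definitions combine into the usual availability measure, availability = MTTF / (MTTF + MTTR). The slide states only the MTBF relation; the availability formula is the standard definition, and the numbers below are made up for illustration.

```python
def mtbf(mttf_hours, mttr_hours):
    """Mean time between failures, as defined on the slide."""
    return mttf_hours + mttr_hours


def availability(mttf_hours, mttr_hours):
    """Fraction of time the system is in service (standard definition)."""
    return mttf_hours / mtbf(mttf_hours, mttr_hours)


# Hypothetical values: 1,000,000 hours to failure, 24 hours to repair.
between = mtbf(1_000_000, 24)
avail = availability(1_000_000, 24)
print(between, f"{avail:.6f}")
```

Shortening the repair time (for example through hot swap) raises availability even when MTTF stays the same.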

RAID - Redundant Arrays of Inexpensive Disks
• RAID 0
  – No redundancy, only striping of data across disks
• RAID 1
  – Full redundancy, disk mirroring
• RAID 2
  – Includes error detection and correction (has fallen out of use)
• RAID 3
  – One extra disk to store the parity of the data
• RAID 4
  – Similar to RAID 3, with parity organized in blocks on the disks
• RAID 5
  – Similar to RAID 4, with parity distributed across the disks
• RAID 6
  – Includes an extra disk to recover from a second failure
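The parity used by RAID 3 to 5 can be illustrated with bytewise XOR: the parity block is the XOR of the data blocks, so any single lost block is the XOR of all the survivors. This is a minimal sketch with made-up block contents and a hypothetical helper name, not a model of a real controller.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data "disks" holding one block each (contents are illustrative).
data = [b"ABCD", b"EFGH", b"IJKL"]
parity = xor_blocks(data)

# Suppose the disk holding data[1] fails: rebuild it from the rest plus parity.
survivors = [data[0], data[2], parity]
recovered = xor_blocks(survivors)
print(recovered)
```

A single parity block recovers only one failure, which is why RAID 6 adds a second redundant disk.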

Communicating with Devices
• Sending commands
  – Memory-mapped I/O
  – Dedicated I/O instructions
• Communication
  – Polling
  – Interrupts
• Priorities
  – DMA
• Virtual memory
• Cache
• Overhead
• Independence

I/O Example: Buses
• Shared communication link (one or more wires)
• Difficult design:
  – may be bottleneck
  – length of the bus
  – number of devices
  – tradeoffs (buffers for higher bandwidth increase latency)
  – support for many different devices
  – cost
• Types of buses:
  – processor-memory (short, high speed, custom design)
  – backplane (high speed, often standardized, e.g., PCI)
  – I/O (lengthy, different devices, e.g., USB, FireWire)
• Synchronous vs. Asynchronous
  – synchronous: use a clock and a synchronous protocol; fast and small, but every device must operate at the same rate and clock skew requires the bus to be short
  – asynchronous: don't use a clock and instead use handshaking

Organization
[Figure a: processor, memory, and I/O devices all attached to a single backplane bus.]
[Figure b: processor and memory on a processor-memory bus; three bus adapters each connect it to a separate I/O bus.]
[Figure c: processor and memory on a processor-memory bus; one bus adapter connects it to a backplane bus, from which two further bus adapters connect to I/O buses.]

Asynchronous Buses
[Figure: timing diagram of the ReadReq, Data, Ack, and DataRdy signals, with the handshake steps numbered 1 to 7.]

State Machine
[Figure: finite-state machines for the asynchronous handshake; overbars in the original diagram mark deasserted signals.]
I/O device:
  – New I/O request: put address on data lines; assert ReadReq
  – 2 (on Ack): release data lines; deassert ReadReq
  – 5 (on DataRdy): read memory data from data lines; assert Ack
  – 7 (on DataRdy deasserted): deassert Ack
Memory:
  – 1 (on ReadReq): record from data lines and assert Ack
  – 3, 4 (on ReadReq deasserted): drop Ack; put memory data on data lines; assert DataRdy
  – 6 (on Ack): release data lines and DataRdy
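The handshake in the state machine can be walked through step by step. The sketch below is a sequential simulation under simplifying assumptions: no real concurrency or timing, a dictionary standing in for memory, and a hypothetical function name. It follows the numbered steps in the diagram, using the ReadReq, Ack, and DataRdy signals.

```python
def async_read(memory, address):
    """Simulate one asynchronous read; returns (value, trace, signals)."""
    signals = {"ReadReq": False, "Ack": False, "DataRdy": False}
    trace = []

    # New I/O request: the device drives the address and asserts ReadReq.
    data_lines = address
    signals["ReadReq"] = True
    trace.append("device: put address on data lines, assert ReadReq")

    # Step 1: memory sees ReadReq, records the address, asserts Ack.
    latched_address = data_lines
    signals["Ack"] = True
    trace.append("1 memory: record address, assert Ack")

    # Step 2: device sees Ack, releases the data lines, deasserts ReadReq.
    data_lines = None
    signals["ReadReq"] = False
    trace.append("2 device: release data lines, deassert ReadReq")

    # Step 3: memory sees ReadReq low and drops Ack.
    signals["Ack"] = False
    trace.append("3 memory: drop Ack")

    # Step 4: memory puts the data on the data lines and asserts DataRdy.
    data_lines = memory[latched_address]
    signals["DataRdy"] = True
    trace.append("4 memory: put data on data lines, assert DataRdy")

    # Step 5: device sees DataRdy, reads the data, asserts Ack.
    value = data_lines
    signals["Ack"] = True
    trace.append("5 device: read data, assert Ack")

    # Step 6: memory sees Ack, releases the data lines and DataRdy.
    data_lines = None
    signals["DataRdy"] = False
    trace.append("6 memory: release data lines and DataRdy")

    # Step 7: device sees DataRdy low and deasserts Ack.
    signals["Ack"] = False
    trace.append("7 device: deassert Ack")
    return value, trace, signals


memory = {0x40: 0xBEEF}  # made-up contents
value, trace, signals = async_read(memory, 0x40)
print(hex(value))
```

Every signal returns to its deasserted state at the end, so the bus is ready for the next request.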

I/O Bus Standards
• Today we have two dominant bus standards:

Other important issues
• Bus Arbitration:
  – daisy chain arbitration (not very fair)
  – centralized arbitration (requires an arbiter), e.g., PCI
  – collision detection, e.g., Ethernet
• Performance Analysis techniques:
  – queuing theory
  – simulation
  – analysis, i.e., find the weakest link (see “I/O System Design”)
• Many new developments

Example: Impact of I/O
• The original benchmark takes 100 s, of which 90 s is processing and 10 s is I/O. If the processor gets 50% faster each year and I/O does not improve, what will the performance improvement be after 5 years?

Years  Processor  I/O   Total  % I/O  Improvement
0      90 s       10 s  100 s  10%    0%
1      60 s       10 s  70 s   14%    43%
2      40 s       10 s  50 s   20%    100%
3      27 s       10 s  37 s   27%    170%
4      18 s       10 s  28 s   36%    257%
5      12 s       10 s  22 s   45%    355%

Note: the processor portion alone speeds up by 90 s / 12 s = 7.5, but total time only drops from 100 s to 22 s.
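The table for this example can be regenerated with a short script. This is a sketch under the stated premises (90 s of processing, 10 s of I/O, processor time divided by 1.5 each year). The slide rounds times to whole seconds before computing percentages, so the unrounded figures here differ slightly; the function name is hypothetical.

```python
def io_impact(cpu_s=90.0, io_s=10.0, cpu_gain=1.5, years=5):
    """Rows of (year, cpu time, I/O time, total, I/O fraction, improvement)."""
    base_total = cpu_s + io_s
    rows = []
    for year in range(years + 1):
        cpu = cpu_s / cpu_gain ** year   # processor time shrinks every year
        total = cpu + io_s               # I/O time stays constant
        rows.append((year, cpu, io_s, total,
                     io_s / total, base_total / total - 1.0))
    return rows


rows = io_impact()
for year, cpu, io, total, io_frac, gain in rows:
    print(f"{year}  {cpu:5.1f}s  {io:4.1f}s  {total:5.1f}s  "
          f"{io_frac:4.0%}  {gain:4.0%}")
```

The I/O share of total time grows from 10% to roughly 45%, which is why the overall improvement falls well short of the processor's own speedup.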

Pentium 4
• I/O Options
[Figure: Pentium 4 I/O organization]
  – Pentium 4 processor to memory controller hub (north bridge, 82875P): system bus (800 MHz, 6.4 GB/sec)
  – North bridge connections: main memory DIMMs over two DDR 400 channels (3.2 GB/sec each); graphics output over AGP 8X (2.1 GB/sec); 1 Gbit Ethernet over CSA (0.266 GB/sec)
  – North bridge to I/O controller hub (south bridge, 82801EB): 266 MB/sec
  – South bridge connections: two disks over Serial ATA (150 MB/sec each); CD/DVD and tape over Parallel ATA (100 MB/sec each); stereo (surround-sound) over AC/97 (1 MB/sec); USB 2.0 (60 MB/sec); 10/100 Mbit Ethernet (20 MB/sec); PCI bus (132 MB/sec)

Fallacies and Pitfalls
• Fallacy: the rated mean time to failure of disks is 1,200,000 hours, so disks practically never fail.
• Fallacy: magnetic disk storage is on its last legs, will be replaced.
• Fallacy: A 100 MB/sec bus can transfer 100 MB/sec.
• Pitfall: Moving functions from the CPU to the I/O processor, expecting to improve performance without analysis.
Chapter 9

Multiprocessors
• Idea: create powerful computers by connecting