278500-Apresentações
Course: Computer Architecture and Organization
– binary: -0.11 = -1.1 × 2^-1
– floating point: exponent = -1 + 127 = 126 = 01111110
– IEEE single precision: 1 01111110 10000000000000000000000
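The encoding above can be checked by assembling the three fields by hand. This is a minimal sketch for single precision only. The helper name `encode_single` and its arguments are illustrative (not from the slides), and it assumes the value is already normalized as 1.f × 2^e.

```python
BIAS = 127  # single-precision exponent bias


def encode_single(sign, fraction_bits, exponent):
    """Return the 32-bit pattern for (-1)^sign x 1.fraction_bits x 2^exponent.

    fraction_bits is the string of bits after the binary point
    (a hypothetical helper argument); it is zero-padded to 23 bits.
    """
    biased = exponent + BIAS
    if not 1 <= biased <= 254:
        raise ValueError("exponent outside the normalized range")
    if len(fraction_bits) > 23 or set(fraction_bits) - {"0", "1"}:
        raise ValueError("fraction must be at most 23 binary digits")
    return f"{sign:01b}{biased:08b}{fraction_bits.ljust(23, '0')}"


# -0.11 (binary) = -1.1 x 2^-1: sign 1, fraction "1", exponent -1
bits = encode_single(1, "1", -1)
print(bits[0], bits[1:9], bits[9:])
```

The printed fields are sign, biased exponent, and fraction, in that order.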
© 2004 Morgan Kaufmann Publishers
Examples
• Compare in binary
– 0.75; -0.75; 4.25; 0.625; 19
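One way to check answers to this exercise is to let the machine produce the single-precision patterns. The sketch below uses Python's standard `struct` module. The function name `single_fields` is an illustrative choice, not part of the slides.

```python
import struct


def single_fields(value):
    """Split a number's IEEE 754 single-precision pattern into its fields."""
    (word,) = struct.unpack(">I", struct.pack(">f", float(value)))
    bits = f"{word:032b}"
    return bits[0], bits[1:9], bits[9:]  # sign, exponent, fraction


for value in (0.75, -0.75, 4.25, 0.625, 19):
    sign, exponent, fraction = single_fields(value)
    print(f"{value:>6}: {sign} {exponent} {fraction}")
```

All five values are exactly representable in single precision, so no rounding is involved in the conversion.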
Constants
Single precision        Double precision        Value
Exponent   Mantissa     Exponent   Mantissa
0          0            0          0            0
0          ≠ 0          0          ≠ 0          Denormalized number
1-254      Any          1-2046     Any          Floating-point number
255        0            2047       0            Infinity
255        ≠ 0          2047       ≠ 0          NaN (Not a Number)
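The single-precision half of the table can be read as a decision procedure on the exponent and mantissa fields. A minimal sketch follows. The function name `classify_single` and the returned labels are illustrative.

```python
def classify_single(word):
    """Classify a 32-bit pattern by its exponent and mantissa fields."""
    exponent = (word >> 23) & 0xFF   # 8-bit biased exponent
    mantissa = word & 0x7FFFFF       # 23-bit mantissa (fraction)
    if exponent == 0:
        return "zero" if mantissa == 0 else "denormalized"
    if exponent == 255:
        return "infinity" if mantissa == 0 else "NaN"
    return "floating-point number"   # exponent in 1-254, any mantissa


for pattern in (0x00000000, 0x00000001, 0xBF400000, 0x7F800000, 0x7FC00000):
    print(f"{pattern:#010x}: {classify_single(pattern)}")
```

The sign bit is ignored here because it does not affect the category. The double-precision case would use an 11-bit exponent (0, 1-2046, 2047) and a 52-bit mantissa.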
Floating point addition
Start
1. Compare the exponents of the two numbers. Shift the smaller number to the right until its exponent would match the larger exponent
2. Add the significands
3. Normalize the sum, either shifting right and incrementing the exponent or shifting left and decrementing the exponent
Overflow or underflow? Yes: Exception. No: go to step 4
4. Round the significand to the appropriate number of bits
Still normalized? No: go back to step 3. Yes: Done
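The addition steps can be sketched in software. The version below is an illustration under stated assumptions, not the hardware algorithm itself: it works in decimal with a 4-digit significand for readability, keeps the shifted operand exactly (as if there were unlimited guard digits), and uses a made-up exponent range. The names `fp_add`, `EXP_MIN`, and `EXP_MAX` are hypothetical.

```python
from fractions import Fraction

EXP_MIN, EXP_MAX = -99, 99  # hypothetical exponent range for this sketch


def fp_add(sig_a, exp_a, sig_b, exp_b, digits=4):
    """Add sig_a * 10**exp_a and sig_b * 10**exp_b, rounded to `digits` digits."""
    # Step 1: compare exponents; shift the number with the smaller exponent right
    if exp_a < exp_b:
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a
    sig_b = sig_b / Fraction(10) ** (exp_a - exp_b)
    exp = exp_a
    # Step 2: add the significands
    sig = sig_a + sig_b
    if sig == 0:
        return Fraction(0), 0
    while True:
        # Step 3: normalize so that 1 <= |sig| < 10
        while abs(sig) >= 10:
            sig, exp = sig / 10, exp + 1
        while abs(sig) < 1:
            sig, exp = sig * 10, exp - 1
        # Overflow or underflow? If so, raise an exception
        if not EXP_MIN <= exp <= EXP_MAX:
            raise OverflowError("exponent out of range")
        # Step 4: round the significand (round-half-even via Fraction rounding)
        scale = 10 ** (digits - 1)
        sig = Fraction(round(sig * scale), scale)
        # Still normalized? If not, go back to step 3
        if abs(sig) < 10:
            return sig, exp


# 9.999 x 10^1 + 1.610 x 10^-1
result = fp_add(Fraction("9.999"), 1, Fraction("1.610"), -1)
print(float(result[0]), result[1])
```

The loop back to step 3 matters when rounding carries out of the top digit, for example 9.9995 rounding up to 10.00, which must be renormalized to 1.000 with the exponent incremented.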
[Figure: floating-point addition hardware. Labels: sign, exponent, and fraction fields of both operands and of the result; small ALU (exponent difference); control; shift right; big ALU; increment or decrement; shift left or right; rounding hardware]
Multiplication
Start
1. Add the biased exponents of the two numbers, subtracting the bias from the sum to get the new biased exponent
2. Multiply the significands
3. Normalize the product if necessary, shifting it right and incrementing the exponent
Overflow or underflow? Yes: Exception. No: go to step 4
4. Round the significand to the appropriate number of bits
Still normalized? No: go back to step 3. Yes: go to step 5
5. Set the sign of the product to positive if the signs of the original operands are the same; if they differ make the sign negative
Done
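The multiplication steps can be sketched the same way. This is an illustration under assumptions: decimal significands with 4 digits for readability, the single-precision bias of 127 for the exponents, and nonzero normalized inputs (1 ≤ significand < 10). The name `fp_mul` and its argument layout are hypothetical.

```python
from fractions import Fraction

BIAS = 127  # single-precision exponent bias


def fp_mul(sign_a, sig_a, bexp_a, sign_b, sig_b, bexp_b, digits=4):
    """Multiply two numbers given as (sign bit, significand, biased exponent)."""
    # Step 1: add the biased exponents and subtract the bias once
    bexp = bexp_a + bexp_b - BIAS
    # Step 2: multiply the significands
    sig = sig_a * sig_b
    while True:
        # Step 3: normalize; the product is below 100, so only right shifts occur
        while sig >= 10:
            sig, bexp = sig / 10, bexp + 1
        # Overflow or underflow? If so, raise an exception
        if not 1 <= bexp <= 254:
            raise OverflowError("biased exponent out of range")
        # Step 4: round the significand (round-half-even via Fraction rounding)
        scale = 10 ** (digits - 1)
        sig = Fraction(round(sig * scale), scale)
        # Still normalized? If not, go back to step 3
        if sig < 10:
            break
    # Step 5: positive if the operand signs match, negative otherwise
    sign = 0 if sign_a == sign_b else 1
    return sign, sig, bexp


# (+1.110 x 10^10) x (-9.200 x 10^-5), exponents stored as 10 + 127 and -5 + 127
product = fp_mul(0, Fraction("1.110"), 137, 1, Fraction("9.200"), 122)
print(product[0], float(product[1]), product[2] - BIAS)
```

Subtracting the bias in step 1 is needed because each stored exponent already includes it, so a plain sum would count it twice.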
Precision
• X + 1 - X = 1?
• Internally, the processor stores floating-point numbers with at least 2 extra bits: guard and round
• A third bit, sticky, indicates whether any significant content was lost in rounding
• Rounding modes:
– Toward +infinity
– Toward -infinity
– Truncate
– To nearest even
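The question "X + 1 - X = 1?" and the four rounding modes can both be tried out directly. The sketch below uses Python doubles for the first part and the standard `decimal` module for the second. Note the assumption: `decimal` rounds decimal digits, so it illustrates the direction of each mode rather than the binary guard, round, and sticky bits. The names `MODES` and `round_all` are illustrative.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_FLOOR, ROUND_DOWN, ROUND_HALF_EVEN

# X + 1 - X: exact for small X, but not once X + 1 needs more than 53 bits
small = (1.0 + 1) - 1.0
x = 2.0 ** 53
large = (x + 1) - x  # x + 1 is not representable and rounds back to x
print(small, large)

MODES = {
    "toward +infinity": ROUND_CEILING,
    "toward -infinity": ROUND_FLOOR,
    "truncate": ROUND_DOWN,
    "to nearest even": ROUND_HALF_EVEN,
}


def round_all(text):
    """Round a decimal string to an integer under each of the four modes."""
    value = Decimal(text)
    return {
        name: int(value.quantize(Decimal("1"), rounding=mode))
        for name, mode in MODES.items()
    }


for sample in ("2.5", "-2.5", "3.5"):
    print(sample, round_all(sample))
```

Halfway cases such as 2.5 and 3.5 are where the modes differ most visibly, which is why they are used as samples.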
Floating Point Complexities
• Operations are somewhat more complicated (see text)
• In addition to overflow we can have “underflow”
• Accuracy can be a big problem
– IEEE 754 keeps two extra bits, guard and round
– four rounding modes
– positive divided by zero yields “infinity”
– zero divided by zero yields “not a number”
– other complexities
• Implementing the standard can be tricky
• Not using the standard can be even worse
– see text for description of 80x86 and Pentium bug!
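Overflow, underflow, and "not a number" can be observed with ordinary doubles. A small sketch follows, with one caveat: Python raises `ZeroDivisionError` for float division by zero instead of returning the IEEE 754 results listed above, so other operations are used here to produce infinity and NaN. The variable names are illustrative.

```python
import math

overflowed = 1e308 * 10             # exceeds the largest double: becomes infinity
underflowed = 5e-324 / 2            # below the smallest denormalized double: becomes 0
not_a_number = math.inf - math.inf  # no meaningful result: NaN

print(overflowed, underflowed, not_a_number)
print(not_a_number == not_a_number)  # NaN compares unequal even to itself
```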
Chapter Three Summary
• Computer arithmetic is constrained by limited precision
• Bit patterns have no inherent meaning but standards do exist
– two’s complement
– IEEE 754 floating point
• Computer instructions determine “meaning” of the bit patterns
• Performance and accuracy are important so there are many complexities in real machines
• Algorithm choice is important and may lead to hardware optimizations for both space and time (e.g., multiplication)
• You may want to look back (Section 3.10 is great reading!)
Chapter 4
Performance
• Measure, Report, and Summarize
• Make intelligent choices
• See through the marketing hype
• Key to understanding underlying organizational motivation
Why is some hardware better than others for different programs?
What factors of system performance are hardware related? (e.g., Do we need a new machine, or a new operating system?)
How does the machine's instruction set affect performance?
Which of these airplanes has the best performance?
Airplane            Passengers   Range (mi)   Speed (mph)
Boeing 737-100      101          630          598
Boeing 747          470          4150         610
BAC/Sud Concorde    132          4000         1350
Douglas DC-8-50     146          8720         544
• How much faster is the Concorde compared to the 747?
• How much bigger is the 747 than the Douglas DC-8?
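Both questions reduce to ratios of table entries. A minimal sketch follows, assuming "faster" means cruising speed and "bigger" means passenger capacity. That reading, and the names `airplanes` and `ratio`, are assumptions rather than something the slide states.

```python
airplanes = {
    "Boeing 737-100": {"passengers": 101, "range_mi": 630, "speed_mph": 598},
    "Boeing 747": {"passengers": 470, "range_mi": 4150, "speed_mph": 610},
    "BAC/Sud Concorde": {"passengers": 132, "range_mi": 4000, "speed_mph": 1350},
    "Douglas DC-8-50": {"passengers": 146, "range_mi": 8720, "speed_mph": 544},
}


def ratio(a, b, key):
    """Return how many times larger airplane a is than airplane b for `key`."""
    return airplanes[a][key] / airplanes[b][key]


speed_ratio = ratio("BAC/Sud Concorde", "Boeing 747", "speed_mph")
size_ratio = ratio("Boeing 747", "Douglas DC-8-50", "passengers")
print(f"Concorde vs 747 speed: {speed_ratio:.2f}x")
print(f"747 vs DC-8 capacity: {size_ratio:.2f}x")
```

Which airplane is "best" depends on the metric chosen: the Concorde leads on speed, the 747 on capacity, and the DC-8 on range.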
Computer Performance: TIME, TIME, TIME
• Response Time (latency)
– How long does it take for my job to run?
– How long does it take to execute a job?
– How long must I wait for the database query?
• Throughput
– How many jobs can the machine run at once?
– What is the average execution rate?
– How much work is getting done?
• If we upgrade a machine with a new processor what do we increase?
• If we add a new machine to the lab what do we increase?
Execution Time
• Elapsed Time
– counts everything (disk and memory accesses, I/O, etc.)
– a useful number, but often not good for comparison purposes
• CPU time
– doesn't count I/O or time spent running other programs
– can be broken up into system time and user time
• Our focus: user CPU time
– time spent executing the lines of code that are "in" our program
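The difference between elapsed time and CPU time can be observed from inside a program. The sketch below uses Python's standard `time.perf_counter` for elapsed time and `os.times` for user and system CPU time. The function name `measure` and the sample workload are illustrative, and timer resolution varies by operating system.

```python
import os
import time


def measure(fn):
    """Run fn once and report elapsed, user CPU, and system CPU seconds."""
    cpu_start = os.times()
    wall_start = time.perf_counter()
    fn()
    wall_end = time.perf_counter()
    cpu_end = os.times()
    return {
        "elapsed": wall_end - wall_start,            # counts everything
        "user": cpu_end.user - cpu_start.user,       # time in our own code
        "system": cpu_end.system - cpu_start.system  # time in the OS on our behalf
    }


timings = measure(lambda: sum(i * i for i in range(200_000)))
print(timings)
```

A workload that sleeps or waits on I/O would show elapsed time growing while user CPU time stays nearly flat.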
Book's Definition of Performance
• For some program running on machine X,
Performance_X = 1 / Execution time_X
• "X is n times faster than Y"
Performance_X / Performance_Y = n
• Problem:
– machine A runs a program in 20 seconds
– machine B runs the same program in 25 seconds
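Applying the two definitions to the problem's numbers gives the ratio directly. A minimal sketch, with illustrative function names:

```python
def performance(execution_time_s):
    """Performance is the reciprocal of execution time."""
    return 1 / execution_time_s


def times_faster(time_x_s, time_y_s):
    """Return n such that X is n times faster than Y."""
    return performance(time_x_s) / performance(time_y_s)


n = times_faster(20, 25)  # machine A: 20 s, machine B: 25 s
print(f"A is {n:.2f} times faster than B")
```

Because performance is a reciprocal, the ratio of performances equals the inverse ratio of execution times, 25 / 20.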
Clock Cycles
• Instead of reporting execution time in seconds, we often use cycles
$$\frac{\text{seconds}}{\text{program}} = \frac{\text{cycles}}{\text{program}} \times \frac{\text{seconds}}{\text{cycle}}$$
• Clock “ticks” indicate when to start activities (one abstraction):
[Figure: clock ticks along a time axis]
• cycle time = time between ticks = seconds per cycle
• clock rate (frequency) = cycles per second (1 Hz = 1 cycle/sec)
A 4 GHz clock has a cycle time of
$$\frac{1}{4 \times 10^{9}} \times 10^{12} = 250 \text{ picoseconds (ps)}$$
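The two relations on this slide translate directly into code. A minimal sketch follows. The function names and the cycle count used in the example are illustrative, not taken from the slides.

```python
def cycle_time_ps(clock_rate_hz):
    """Cycle time in picoseconds: (1 / clock rate) x 10^12."""
    return 1e12 / clock_rate_hz


def seconds_per_program(cycles_per_program, clock_rate_hz):
    """seconds/program = cycles/program x seconds/cycle."""
    seconds_per_cycle = 1 / clock_rate_hz
    return cycles_per_program * seconds_per_cycle


print(cycle_time_ps(4e9))               # 4 GHz clock
print(seconds_per_program(8e9, 4e9))    # hypothetical 8 billion-cycle program
```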
How to Improve Performance
$$\frac{\text{seconds}}{\text{program}} = \frac{\text{cycles}}{\text{program}} \times \frac{\text{seconds}}{\text{cycle}}$$
So, to improve performance (everything else being equal) you can either (increase or decrease?)
________ the # of required cycles for a program, or
________ the clock cycle time or, said another way,
________ the clock rate.
How many cycles are required for a program?
• Could assume that number of cycles equals number of instructions
[Figure: time axis divided into 1st instruction, 2nd instruction, 3rd instruction, 4th, 5th, 6th, ...]
This assumption is incorrect: different instructions take different amounts of time on different machines.
Why? Hint: remember that these are machine instructions, not lines of C code
Different numbers of cycles for different instructions
[Figure: time axis]
• Multiplication takes more time than addition
• Floating point operations take longer than integer ones
• Accessing memory takes more time than accessing registers
• Important point: changing the cycle time often changes the number of cycles required for various instructions (more later)
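To see why cycle count and instruction count differ, the cycles can be totaled per instruction class. In the sketch below, every cycle count and the instruction mix are made-up values chosen only to respect the orderings in the bullets (multiply slower than add, floating point slower than integer, memory slower than registers). They do not describe any real machine.

```python
# Hypothetical cycles per instruction class (illustrative values only)
CYCLES = {"add": 1, "multiply": 4, "fp_add": 3, "load": 5}


def total_cycles(instruction_counts):
    """Sum cycles over all instructions, weighting each class by its cost."""
    return sum(CYCLES[op] * count for op, count in instruction_counts.items())


# Hypothetical program: 100 instructions in total
mix = {"add": 50, "multiply": 10, "fp_add": 20, "load": 20}
instructions = sum(mix.values())
cycles = total_cycles(mix)
print(f"{instructions} instructions take {cycles} cycles")
```

With these assumed costs the program needs 2.5 times as many cycles as it has instructions, so equating the two would underestimate execution time.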
• Our favorite program runs in 10 seconds on computer A, which has a 
4 GHz clock. We are trying to help a computer designer build a new 
machine B, that will run this program in 6 seconds. The designer can use 
new (or perhaps more expensive) technology to substantially increase the 
clock rate, but has informed us that this increase will affect the rest of the 
CPU design,