3.1.3 Real Numbers and Normalised Floating-Point Numbers

COMPUTERSCIENCE/UPPER6/P3
Note: The leftmost 1 is in the 9th position, so in an 8 bit register the 9th bit is simply
discarded. You are then left with 00000010, which is 2 in denary.
For example, subtracting 5 from 15 is really adding -5 to 15, but this is hidden by the two's complement representation.
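The subtraction can be sketched in Python. This is only an illustration, assuming an 8 bit register; the helper names `to_twos_complement` and `add_in_register` are made up for this sketch and are not part of the notes.

```python
BITS = 8  # register width assumed for this sketch


def to_twos_complement(value, bits=BITS):
    """Return the two's complement bit pattern of a signed integer."""
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def add_in_register(a, b, bits=BITS):
    """Add two bit patterns and discard any carry beyond the register."""
    total = int(a, 2) + int(b, 2)
    return format(total & ((1 << bits) - 1), f"0{bits}b")


fifteen = to_twos_complement(15)     # 00001111
minus_five = to_twos_complement(-5)  # 11111011
result = add_in_register(fifteen, minus_five)
print(fifteen, "+", minus_five, "=", result)  # result is 00001010 (10 in denary)
```

The carry out of the top bit is thrown away by the mask, which is the same as removing the 9th bit in an 8 bit register.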
THE FORMAT OF BINARY FLOATING-POINT REAL NUMBERS
NOTE: SYLLABUS STARTS HERE
A binary floating point number may consist of 2, 3 or 4 bytes; however the only ones you need to
worry about are the 2 byte (16 bit) variety. The first 10 bits are the Mantissa; the last 6 bits are the
exponent.
Just like the denary floating point representation, a binary floating point number will have a mantissa
and an exponent, though as you are dealing with binary (base 2) the exponent is a power of 2 rather
than a power of 10.
NOTE: After the most significant bit, place the binary point, the "decimal point" of a binary number (in the mantissa only)
100.1 in two's complement flips to 011.1 (3.5 in denary).
The number is -3.5 in denary.
Convert binary floating-point real numbers into denary
There are several stages to take when working out a floating point number in binary. In fact it is
much like a disco dance routine, known on this page as the Noorgat Dance (you won't be tested on
the name, but it should help you to remember):
1. Sign - find the sign of the mantissa (make a note of this)
2. Slide - find the value of the exponent and whether it is positive or negative
3. Bounce - move the decimal point the distance the exponent asks, left for a negative
exponent, right for a positive
4. Flip - if the mantissa is negative, perform two's complement on it
5. Swim - starting at the decimal point, work out the values of the mantissa, going left,
then right. Now make sure you refer back to the sign you recorded on the sign move.
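The whole routine can be checked with a short Python sketch. It is not the dance itself: instead of moving the point and flipping, it reads both fields as two's complement integers and scales them, which gives the same denary value. The function names are made up for this sketch, and the 10 bit mantissa and 6 bit exponent are taken from the format described above.

```python
MANTISSA_BITS = 10
EXPONENT_BITS = 6


def twos_complement_value(bits):
    """Interpret a bit string as a two's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 1 << len(bits)
    return value


def floating_point_to_denary(mantissa, exponent):
    """Convert a 10 bit mantissa and 6 bit exponent (bit strings) to denary."""
    # The point sits after the first mantissa bit, so the integer value of
    # the mantissa is scaled down by 2^(MANTISSA_BITS - 1).
    m = twos_complement_value(mantissa) / 2 ** (MANTISSA_BITS - 1)
    e = twos_complement_value(exponent)
    return m * 2.0 ** e


print(floating_point_to_denary("0001101000", "000110"))  # 13.0
```

It is a handy way to check answers worked out by hand: pass the mantissa without its point, followed by the exponent.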
Example: binary floating point worked example

Let's try it out. We are given the following 16 bit floating point number, with 10 bits for the mantissa
and 6 bits for the exponent. Remember the decimal point is between the first and second most
significant bits.
The first action we need to perform is the sign: find out the sign of the mantissa.
It is 0 so the mantissa is positive.
The second step in the Noorgat dance is the slide: we need to find the value of the exponent, that is
the last 6 bits of the number.
So we know that the exponent is positive one and we will have to move the decimal point one place
to the right.
The third step in the Noorgat dance is the bounce that is moving the decimal point of the Mantissa
the number of positions specified by the slide, which was one position to the right. Like so:
The fourth step is the optional flip. Check back to the sign stage and see if the Mantissa is negative.
It isn't? Oh well you can skip past this stage then as we only flip the number if the mantissa is
negative.
The fifth and final step is the swim. Taking the mantissa on its own we can now work out the value of
the floating point number. Start at the decimal point and label each digit to the left 1, 2, 4, 8 and so
on. Then label each digit to the right 1/2, 1/4, 1/8 and so on.
Voila! the answer is 1
Work out the denary for the following, using 10 bits for the mantissa and 6 bits for the
exponent:
EX1) 0.001101000 000110
Answer :
1. Sign: the mantissa starts with a zero, therefore it is a positive number.
2. Slide: work out the value of the exponent
000110 = +6
3. Bounce: we need to move the decimal point in the mantissa. In this case the exponent was positive so
we need to move the decimal point 6 places to the right
0.001101000 -> 0001101.000
4. Flip: as the number isn't negative we don't need to do this
5. Swim: work out the value on the left hand side and right hand side of the decimal point
1+4+8 = +13 FINISHED!
EX2) 0 101000000 111111
Answer :
1. Sign: the mantissa starts with a zero, therefore it is a positive number.
2. Slide: work out the value of the exponent
111111 It starts with a one, therefore it is a negative number
Flipping it gives 000001, so the exponent is -1
3. Bounce: we need to move the decimal point in the mantissa. In this case the exponent
was negative so we need to move the decimal point 1 place to the left
0.101000000 -> 0.0101000000
4. Flip: as the mantissa number isn't negative we don't need to do this
5. Swim: work out the value on the left hand side and right hand side of the decimal point
1/4 + 1/16 = +0.3125 FINISHED!
EX3) 1 011111010 000101
Answer :
1. Sign: the mantissa starts with a one, therefore it is a negative number.
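2. Slide: work out the value of the exponent
000101 = +5
3. Bounce: we need to move the decimal point in the mantissa. In this case the exponent was positive so
we need to move the decimal point 5 places to the right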
1.011111010 -> 101111.1010
4. Flip: the mantissa is negative as noted in step one so we need to convert this number
101111.1010 -> 010000.0110
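5. Swim: work out the value on the left hand side and right hand side of the decimal point
16 + 1/4 + 1/8 = 16.375, and the mantissa was negative, so the answer is -16.375 FINISHED!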
EX4) 1 101000000 111101
Answer :
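1. Sign: the mantissa starts with a one, therefore it is a negative number.
2. Slide: work out the value of the exponent
111101 It starts with a one, therefore it is a negative number
Flipping it gives 000011, so the exponent is -3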
3. Bounce: we need to move the decimal point in the mantissa. In this case the exponent
was negative so we need to move the decimal point 3 places to the left. Watch carefully!
1.101000000 -> 1.111101000000
Note that we placed extra ones on the front of the number.
If the exponent were negative and the mantissa positive, we would add extra zeros on the front:
0.01 * 2^-3 = 0.00001
If both are negative, placing zeros in front of the mantissa would make it positive!
Therefore we need to add extra ones to keep the mantissa negative.
With the flip we'll lose these 'extra' ones.
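4. Flip: the mantissa is negative as noted in step one so we need to convert this number
1.111101000000 -> 0.000011000000
5. Swim: work out the value on the left hand side and right hand side of the decimal point
1/32 + 1/64 = 0.046875, so the answer is -0.046875. Remember the number was negative! FINISHED!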
EX5) 1 111111010 000011
Answer:
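1. Sign: the mantissa starts with a one, therefore it is a negative number.
2. Slide: work out the value of the exponent
000011 = +3
3. Bounce: we need to move the decimal point in the mantissa. In this case the exponent was positive so
we need to move the decimal point 3 places to the right.
1.111111010 -> 1111.111010
4. Flip: the mantissa is negative as noted in step one so we need to convert this number
1111.111010 -> 0000.000110
5. Swim: work out the value on the left hand side and right hand side of the decimal point
1/16 + 1/32 = 0.09375, so the answer is -0.09375. Remember the number was negative! FINISHED!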
CONVERTING DENARY INTO BINARY FLOATING-POINT

You might also be asked to convert a denary number into its binary floating point equivalent.
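1. work out the binary equivalent
2. work out how far to move the binary point to normalise the number (y places)
3. set the exponent to reverse that move: if you moved the binary point y places to the left, the
exponent is +y (and -y if you moved it to the right)
4. pad the number with extra bits to fill the mantissa and the exponent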
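These steps can be sketched in Python as a way to check hand-worked answers. The sketch makes several assumptions that are not stated in the notes: it leans on `math.frexp` to find the normalised mantissa rather than shifting digit by digit, it simply drops (floors) mantissa bits that do not fit, and the function names are invented for illustration.

```python
import math

MANTISSA_BITS = 10
EXPONENT_BITS = 6


def to_bits(value, width):
    """Two's complement bit string of a signed integer."""
    return format(value & ((1 << width) - 1), f"0{width}b")


def denary_to_floating_point(value):
    """Return (mantissa, exponent) bit strings for a non-zero denary value."""
    if value == 0:
        raise ValueError("zero has no normalised form")
    # frexp gives value = m * 2**e with 0.5 <= |m| < 1.
    m, e = math.frexp(value)
    # A mantissa of exactly -0.5 would be 1.100000000, which is not
    # normalised; rewrite it as -1.0 (1.000000000) with a smaller exponent.
    if m == -0.5:
        m, e = -1.0, e - 1
    # Scale to an integer and drop any bits that do not fit.
    scaled = math.floor(m * 2 ** (MANTISSA_BITS - 1))
    if not -(2 ** (EXPONENT_BITS - 1)) <= e < 2 ** (EXPONENT_BITS - 1):
        raise OverflowError("exponent does not fit in 6 bits")
    return to_bits(scaled, MANTISSA_BITS), to_bits(e, EXPONENT_BITS)


print(denary_to_floating_point(39.75))  # ('0100111110', '000110')
```

The first returned string is the mantissa with the point left out (read `0100111110` as 0.100111110), and the second is the exponent.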
Example: denary to binary floating point

If we are asked to convert the denary number 39.75 into binary floating point we first need to find out
the binary equivalent:
128  64  32  16   8   4   2   1  .  1/2  1/4
  0   0   1   0   0   1   1   1  .   1    1
How far do we need to move the binary point to the left so that the number is normalised?
0.10011111 (6 places to the left)
So to get our decimal point back to where it started, we need to move 6 places to the right. 6 now
becomes your exponent.
0.100111110 | 000110
If you want to check your answer, convert the number above into decimal. You get 39.75!
EXAMPLE 1
Work out the binary floating point for the following, using 10 bits for the mantissa and 6 bits for the
exponent:
67
Answer:
128  64  32  16   8   4   2   1
  0   1   0   0   0   0   1   1  .  0
0.1000011 (7 places to the left)
To get the front to be normalised we must move the decimal point 7 places. (Moving it 6 places would
give 1.000011, which starts with a one and so would have made the number negative!)
0.100001100 | 000111
EXAMPLE 2
23.25
Answer:
128  64  32  16   8   4   2   1  .  1/2  1/4
  0   0   0   1   0   1   1   1  .   0    1
0.1011101 (5 places to the left)
To get the front to be normalised we must move the decimal point 5 places. (Moving it 4 places would
give 1.011101, which starts with a one and so would have made the number negative!)
0.101110100 | 000101
EXAMPLE 3
123.80
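Answer:
128  64  32  16   8   4   2   1  .  1/2  1/4  1/8  1/16  ...
  0   1   1   1   1   0   1   1  .   1    1    0    0    ...
0.11110111100... (7 places to the left)
To get the front to be normalised we must move the decimal point 7 places.
0.1111011110... | 000111
But this is using more than 10 bits for the mantissa (0.80 has no exact binary form), so we have to
drop the extra bits, losing accuracy!
0.111101111 | 000111
The number actually stored is 1111011.11, which is 123.75.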
EXAMPLE 4
-513
Answer:
Start with +513:
1024  512  256  128   64   32   16    8    4    2    1
   0    1    0    0    0    0    0    0    0    0    1  .  0
Convert this into its negative form using the flipping rule:
1024  512  256  128   64   32   16    8    4    2    1
   1    0    1    1    1    1    1    1    1    1    1  .  0
How far do we need to move the binary point to the left so that the number is normalised?
1.0111111111 (10 places to the left)
To get the front to be normalised we must move the decimal point 10 places.
1.011111111 | 001010
Notice that we have had to drop the last one as this would not have fitted into 10 bits for the mantissa.
This means that the number shown is only:
10111111110.0
Converting this into denary (flip it, then remember the sign):
01000000010.0 = 514, so the number stored is -514
You'll look at errors using floating point numbers very soon
When you have a 16 bit number where the mantissa is 10 bits and the exponent is 6 bits:
the largest positive number will be:
Mantissa: 0.111111111
Exponent: 011111
the smallest positive number will be:
Mantissa: 0.000000001
Exponent: 100000
the largest negative number (the one furthest from zero) will be:
Mantissa: 1.000000000
Exponent: 011111
the smallest negative number (the one closest to zero) will be:
Mantissa: 1.111111111
Exponent: 100000
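To see what these four bit patterns are worth in denary, here is a small Python sketch. The `decode` helper is a name invented for this illustration; it reads the mantissa and exponent as two's complement values with the point after the first mantissa bit, as described in these notes.

```python
def decode(mantissa, exponent):
    """Denary value of a mantissa and exponent given as bit strings."""
    def signed(bits):
        n = int(bits, 2)
        return n - (1 << len(bits)) if bits[0] == "1" else n

    return signed(mantissa) / 2 ** (len(mantissa) - 1) * 2.0 ** signed(exponent)


limits = {
    "largest positive": decode("0111111111", "011111"),
    "smallest positive": decode("0000000001", "100000"),
    "largest negative": decode("1000000000", "011111"),
    "smallest negative": decode("1111111111", "100000"),
}
for name, value in limits.items():
    print(name, value)
```

The largest positive value is just under 2^31, the largest negative value is exactly -2^31, and the two smallest values are 2^-41 either side of zero.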
NORMALISATION OF FLOATING-POINT NUMBERS:

When storing numbers we need to use the space we are given in the most efficient way. We need
the most efficient representation we can. With a fixed number of bits, a normalized representation of
a number will display the number to the greatest accuracy possible. In summary normalized
numbers:
Give only one representation of a number
Save space
Give the most accurate representation of a number in a given number of bits
As a rule of thumb: when dealing with floating point numbers in binary you must make sure that the
first two bits are different. That is:
0.1
1.0
And most definitely NOT:
1.1
0.0
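The rule of thumb is simple enough to write as a one-line check. A minimal Python sketch, assuming the mantissa is given as a bit string with the point left out (the function name is made up for this illustration):

```python
def is_normalised(mantissa):
    """A mantissa is normalised when its first two bits are different."""
    return mantissa[0] != mantissa[1]


print(is_normalised("0111111000"))  # True: starts with 0.1
print(is_normalised("0010000000"))  # False: starts with 0.0
print(is_normalised("1100000010"))  # False: starts with 1.1
```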
Let's look at an example. Taking a binary floating point number:
We can see that the first two bits of the mantissa are the same. We need to make them different for
the number to be normalised. To do this we need to move the decimal point one position to the right,
and to retain the same number represented by the unnormalised number we need to change the
exponent accordingly. With a movement one place right to normalise the number we need to change
the exponent to move the decimal point one place left to compensate. Thus subtracting one from the
current exponent.
To make sure you have normalised it correctly, check that the normalised number has the same value
as the original.
Let's try a more complicated example:
To get the mantissa normalised we need to move the decimal point two places to the right. To
maintain the same value as the original floating point number we need to adjust the exponent to be
two smaller.
Now check that the new normalised value has the same value as the original.
NOTE: Make sure that normalising a number does not change the sign bit. e.g.
0.0001 should go to 0.100 and NOT 1.000
Summary: Normalising numbers

1. Normalise the left hand side (mantissa).
2. Record the number of bounces it has taken to normalise.
3. Work out the exponent of the normalised number by using: original exponent - bounce
Normalised numbers start with 2 bits that are different
Make sure that your normalisation does not change the sign of the mantissa
Normalisation provides the maximum precision for a given number of bits
Normalisation makes sure there is only one representation for each number
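The summary can be sketched in Python as follows. This is an illustration under stated assumptions: the mantissa and exponent are bit strings with the point left out, each bounce moves the point one place to the right, and the function name `normalise` is invented here.

```python
def normalise(mantissa, exponent):
    """Bounce the point right until the first two mantissa bits differ."""
    if set(mantissa) == {"0"}:
        raise ValueError("zero cannot be normalised")
    width = len(exponent)
    # Read the exponent as a two's complement integer.
    e = int(exponent, 2)
    if exponent[0] == "1":
        e -= 1 << width
    while mantissa[0] == mantissa[1]:
        # The first bit repeats the second, so dropping it keeps the sign.
        # Pad with a zero on the right to keep the mantissa the same length.
        mantissa = mantissa[1:] + "0"
        e -= 1  # original exponent - bounce
    if e < -(1 << (width - 1)):
        raise OverflowError("exponent no longer fits")
    return mantissa, format(e & ((1 << width) - 1), f"0{width}b")


print(normalise("0001101000", "000110"))  # ('0110100000', '000100')
```

Because only a repeated leading bit is ever dropped, the sign of the mantissa cannot change, which is the point made in the NOTE above.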
Exercise: Normalisation Questions

Are the following numbers normalised?
EXAMPLE1
0.010000000 111111
Answer: No, as it starts with 0.0
EXAMPLE2
0.111111000 111111
Answer: Yes, as it starts with 0.1
EXAMPLE3
1.100000010 111111
Answer: No, as it starts with 1.1
Normalise the following numbers:
EXAMPLE1
0 010000000 111111
Answer:
1. 0.010000000 111111 -> 00.10000000 111111
2. One place to the right
3. 111111 - 1 = -1 - 1 = -2. As +2 is 000010, -2 is 111110
00.10000000 111110 = 0.100000000 111110
EXAMPLE2
0 001101000 000110
Answer :
1. 0.001101000 000110 -> 000.1101000 000110
2. Two places to the right
3. 000110 - 2 = 6 - 2 = 4 = 000100 (+4)
000.1101000 000100 = 0.110100000 000100
EXAMPLE3
1 111111010 000011
Answer :
1. 1.111111010 000011 -> 1111111.010 000011
2. Six places to the right
3. 000011 - 6 = 3 - 6 = -3 = 111101 (-3)
1111111.010 111101 = 1.010000000 111101
REASONS FOR NORMALISATION
LIMITS OF FLOATING-POINT REPRESENTATION
(effects of changing allocation of bits to mantissa and exponent)
Precision
When using floating point numbers you have to balance the range and the precision of numbers.
That is, whether you want to have a very large range of values or a number that is very
precise down to a large number of decimal places. This means that you always have to weigh
up how many bits should be used for the mantissa and how many should be used for the
exponent. In summary:
If you want a very precise number, use more bits for the mantissa and fewer for the exponent,
as this will allow for more decimal places
If you want a large range of numbers, use more bits for the exponent and fewer for the
mantissa.
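The trade-off can be made concrete with a short Python sketch that compares different ways of splitting 16 bits. The splits other than 10 and 6 are chosen purely for illustration, and `format_limits` is a made-up helper name; it assumes two's complement for both fields with the point after the first mantissa bit.

```python
def format_limits(mantissa_bits, exponent_bits):
    """Largest positive value and smallest mantissa step for a given split."""
    largest_exponent = 2 ** (exponent_bits - 1) - 1
    # Value of the last mantissa bit: the finest step the mantissa can show.
    step = 2.0 ** -(mantissa_bits - 1)
    largest_value = (1 - step) * 2.0 ** largest_exponent
    return largest_value, step


for m_bits, e_bits in [(12, 4), (10, 6), (8, 8)]:
    largest, step = format_limits(m_bits, e_bits)
    print(f"{m_bits} bit mantissa, {e_bits} bit exponent: "
          f"largest value {largest}, mantissa step {step}")
```

Moving bits from the mantissa to the exponent makes the largest value grow very quickly, while the mantissa step gets coarser.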
OVERFLOW
When the result of a sum is too large to be represented by your number system you might run out of
space to represent it and end up storing a much smaller number
Try and show 99,999,999,999,999,999,999 in 12 bit FP
UNDERFLOW
When a number or the result of an equation is too small, you might not have enough digits in your
mantissa and exponent to show it. In the following example the number would register as 0
Try and show 0.0000000000000000000000000001 in 12 bit FP
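Overflow and underflow can both be spotted by looking at the exponent a number would need. The sketch below is an assumption-laden illustration: it uses the 16 bit format from earlier (6 bit exponent) rather than a 12 bit one, it only handles positive values, it assumes a normalised mantissa, and the `classify` name is invented here.

```python
import math

EXPONENT_BITS = 6
MAX_EXPONENT = 2 ** (EXPONENT_BITS - 1) - 1  # +31
MIN_EXPONENT = -(2 ** (EXPONENT_BITS - 1))   # -32


def classify(value):
    """Say whether a positive value fits a normalised 6 bit exponent format."""
    # frexp returns value = m * 2**e with 0.5 <= m < 1, which matches a
    # normalised positive mantissa of the form 0.1...
    _, e = math.frexp(value)
    if e > MAX_EXPONENT:
        return "overflow"
    if e < MIN_EXPONENT:
        return "underflow"
    return "representable range"


print(classify(99_999_999_999_999_999_999))           # overflow
print(classify(0.0000000000000000000000000001))       # underflow
print(classify(39.75))                                # representable range
```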
TRUNCATION
Why can computers not represent real numbers such as √2 exactly, but only an approximation?
Such numbers have infinitely many digits, but floating-point cannot represent so many digits and
truncation will occur.
ROUNDING ERRORS IN BINARY REPRESENTATIONS
When we try to represent some numbers sometimes we can't within the space we have been given,
for example trying to write down 1/3 = 0.33333333; you see what I mean? With floating point
numbers you can't always get perfect precision and sometimes we suffer errors.
Feed this equation into google:
999999999999999 - 999999999999998
The browser will perform a floating point calculation and give you the answer of 0!
So, recognising that we can have rounding errors with floating point numbers, we'll take a look at the
different errors that might be caused. Suppose we want to represent the number 23.27 in binary: in
the 16 bit format above, the closest we get is 23.25.
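The loss of accuracy can be reproduced with a short Python sketch. It assumes a 10 bit mantissa, positive values only, and that bits which do not fit are simply dropped; `store_and_read_back` is a name made up for this illustration.

```python
import math

MANTISSA_BITS = 10


def store_and_read_back(value):
    """Truncate a positive value to a 10 bit mantissa and return what is kept."""
    m, e = math.frexp(value)               # value = m * 2**e, 0.5 <= m < 1
    scale = 2 ** (MANTISSA_BITS - 1)       # 9 bits follow the point
    kept = math.floor(m * scale) / scale   # drop the bits that do not fit
    return kept * 2.0 ** e


print(store_and_read_back(23.27))   # 23.25
print(store_and_read_back(123.80))  # 123.75
print(store_and_read_back(23.25))   # 23.25 (stored exactly)
```

Numbers such as 23.25, whose binary form fits in the mantissa, come back unchanged; the others come back slightly smaller.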