How Binary is Used to Store Data
The Fundamentals of Binary Encoding
- As explored in the previous chapter, binary is a base-2 numeral system that uses only two digits: 0 and 1.
- Each digit is called a bit (short for binary digit).
- Everything in a computer is ultimately represented as 1s and 0s.
- This is because computers are built from transistors, which are either on (1) or off (0), so the hardware maps directly onto binary.
Binary is like a digital Morse code: only two symbols, but endless combinations.
Note: Binary is the natural representation for computers, as it directly aligns with their hardware, which uses transistors operating in two states.
Binary Encoding of Different Data Types
Integers
Unsigned Integers
- Use all bits to represent non-negative values.
- 8 bits can represent values from 0 to 255 (that is, $2^8 - 1$).
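The range above follows from the number of bit patterns available. A minimal Python sketch, where the helper name `unsigned_range` is purely illustrative:

```python
def unsigned_range(bits):
    """Return the (smallest, largest) value an unsigned integer of this width can hold."""
    return 0, 2 ** bits - 1

print(unsigned_range(8))      # (0, 255)
print(format(13, "08b"))      # 00001101  (13 written as an 8-bit pattern)
print(int("11111111", 2))     # 255       (the largest 8-bit pattern)
```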
Signed Integers
Two's Complement
Two's complement is a method for representing signed (positive, negative, and zero) binary numbers in computers. It's the most common way to represent integers in digital systems, enabling the use of the same hardware for both addition and subtraction operations.
- Use two's complement to represent negative values.
- The most significant bit (MSB) acts as the sign bit and carries a negative place value (e.g. -8 in a 4-bit number).
- An MSB of 1 indicates a negative value (e.g. 1011 = -8 + 2 + 1 = -5).
- An MSB of 0 indicates a non-negative value, as the negative place value contributes nothing (e.g. 0001 = -0 + 1 = 1).
- 8 bits can represent values from -128 to +127.
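The rules above can be sketched in Python as follows. The function names `to_twos_complement` and `from_twos_complement` are hypothetical helpers for illustration, not part of any library:

```python
def to_twos_complement(value, bits):
    """Encode a signed integer as a two's complement bit string of the given width."""
    lowest, highest = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking with 2^bits - 1 wraps negative values into the upper half of the range.
    return format(value & (2 ** bits - 1), f"0{bits}b")

def from_twos_complement(pattern):
    """Decode a two's complement bit string: the MSB has a negative place value."""
    value = int(pattern, 2)
    if pattern[0] == "1":
        value -= 2 ** len(pattern)
    return value

print(to_twos_complement(-5, 4))          # 1011
print(from_twos_complement("1011"))       # -5
print(to_twos_complement(-128, 8))        # 10000000
print(from_twos_complement("01111111"))   # 127
```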
Sign and Magnitude
A signed binary integer representation that uses the MSB to indicate the sign (0 for + and 1 for -), while the remaining bits give the magnitude of the number.
- The most significant bit indicates the sign
- a 1 indicates negative
- a 0 indicates positive
- The remaining place values are then used as normal to calculate the magnitude of the number
- Example 4-bit number: 1011
- MSB is 1, so the number is negative
- The remaining 3 bits, 011, give a magnitude of 3
- The number is therefore -3
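As a rough illustration of the decoding steps above, the following Python sketch splits a bit string into its sign bit and magnitude. The name `from_sign_magnitude` is an assumption made for this example:

```python
def from_sign_magnitude(pattern):
    """Decode a sign-and-magnitude bit string: first bit is the sign, the rest is the size."""
    sign = -1 if pattern[0] == "1" else 1
    magnitude = int(pattern[1:], 2)
    return sign * magnitude

print(from_sign_magnitude("1011"))   # -3
print(from_sign_magnitude("0011"))   # 3
```

Note that the same 4 bits, 1011, decode to -5 in two's complement but -3 in sign and magnitude, so the representation in use must always be known.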
Binary-coded decimal
- Uses a group of 4 bits to represent each decimal digit of a number.
- This avoids the rounding errors that can occur when decimal fractions are converted to binary, which is especially useful in financial systems.
- However, BCD values require more storage space and more complex processing to perform arithmetic.
- Example: 1001 0011 is 93
- 1001 = 9
- 0011 = 3
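The digit-by-digit encoding can be sketched in Python as below. The helpers `to_bcd` and `from_bcd` are illustrative names, and the sketch assumes non-negative whole numbers with the 4-bit groups separated by spaces:

```python
def to_bcd(number):
    """Encode each decimal digit of a non-negative integer as its own 4-bit group."""
    if number < 0:
        raise ValueError("this sketch only handles non-negative integers")
    return " ".join(format(int(digit), "04b") for digit in str(number))

def from_bcd(pattern):
    """Decode space-separated 4-bit groups back into a decimal integer."""
    return int("".join(str(int(group, 2)) for group in pattern.split()))

print(to_bcd(93))               # 1001 0011
print(from_bcd("1001 0011"))    # 93
```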
Fixed Point Numbers
- The binary point is placed in a predetermined position, e.g. after the first 4 bits.
- This is a simple and quick method for representing fractional values.
- However, it offers limited precision and range.
- Place values to the left of the point are increasing powers of 2, starting at $2^0$; place values to the right are negative powers of 2, starting at $2^{-1}$.
| Power of 2 | $2^3$ | $2^2$ | $2^1$ | $2^0$ | . | $2^{-1}$ | $2^{-2}$ | $2^{-3}$ | $2^{-4}$ |
|---|---|---|---|---|---|---|---|---|---|
| Place value | 8 | 4 | 2 | 1 | . | 0.5 | 0.25 | 0.125 | 0.0625 |
| Bit | 0 | 1 | 1 | 0 | . | 0 | 1 | 1 | 1 |
| Decimal value | 0 | 4 | 2 | 0 | . | 0 | 0.25 | 0.125 | 0.0625 |
Calculation: $(1 \times 4)+(1 \times 2)+(1 \times 0.25)+(1 \times 0.125)+(1 \times 0.0625) = 6.4375$
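The same calculation can be sketched in Python. The function name `fixed_point_to_decimal` is hypothetical, and the sketch assumes the bit string is written with an explicit point, as in 0110.0111:

```python
def fixed_point_to_decimal(pattern):
    """Convert an unsigned fixed point bit string such as '0110.0111' to a decimal value."""
    whole, _, fraction = pattern.partition(".")
    value = int(whole, 2) if whole else 0
    # Bits after the point have place values 2^-1, 2^-2, 2^-3, ...
    for position, bit in enumerate(fraction, start=1):
        value += int(bit) * 2 ** -position
    return value

print(fixed_point_to_decimal("0110.0111"))   # 6.4375
```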
Strings and Characters
ASCII
- Uses 7 bits to represent 128 characters.
- Extended ASCII uses 8 bits, giving 256 characters and allowing accented characters.
- Each character is assigned a number, and that number is stored in binary.
- ASCII only covers English-like characters because of the limited number of bits available.
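To illustrate the character-to-number-to-binary mapping, here is a small Python sketch using the built-in `ord` function. The helper name `ascii_bits` is an assumption for this example:

```python
def ascii_bits(text):
    """Return the 7-bit ASCII pattern for each character in the text."""
    patterns = []
    for character in text:
        code = ord(character)          # the number assigned to the character
        if code > 127:
            raise ValueError(f"{character!r} is not a 7-bit ASCII character")
        patterns.append(format(code, "07b"))
    return patterns

print(ord("H"))            # 72
print(ascii_bits("Hi!"))   # ['1001000', '1101001', '0100001']
```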
Unicode
- Uses variable-width encodings like UTF-8 to represent characters from all writing systems.
- Can represent over 100,000 characters.
- Includes emojis, accents, symbols, global scripts (e.g. 汉字, русский).
- Unicode encodings use between 8 and 32 bits per character (UTF-8 uses 1 to 4 bytes), so text can take more space than ASCII, but far more characters are supported.
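The variable-width behaviour of UTF-8 can be observed with Python's built-in `str.encode`. This is a minimal sketch; the helper name `utf8_byte_count` and the sample characters are chosen only for illustration:

```python
def utf8_byte_count(character):
    """Return how many bytes UTF-8 needs to store a single character."""
    return len(character.encode("utf-8"))

# "\u00e9" is the accented letter e (e with an acute accent).
samples = ["A", "\u00e9", "汉", "😀"]
for character in samples:
    print(character, utf8_byte_count(character))
# A 1
# é 2
# 汉 3
# 😀 4
```

Plain English letters still take a single byte, which is why UTF-8 text made up only of ASCII characters is no larger than the ASCII version.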
Images
- Uses the Red Green Blue (RGB) colour model.
- An image is split into pixels (picture elements), individual squares in a picture that each show a specific colour.
- The width x height of an image in pixels is referred to as the resolution.
- Each pixel shows a specific colour by mixing red, green and blue light.
- When the pixels are displayed together in a grid, much like a collage, they form an image.
- Colour depth refers to the number of bits allocated to each pixel to represent the different colours.
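Resolution and colour depth together determine how many bits a raw image needs. The sketch below assumes an uncompressed image with no metadata, and the function names are hypothetical:

```python
def colours_available(colour_depth):
    """Number of distinct colours a pixel can show with the given bits per pixel."""
    return 2 ** colour_depth

def image_size_bits(width, height, colour_depth):
    """Bits needed for an uncompressed image: one colour value per pixel."""
    return width * height * colour_depth

print(colours_available(24))                  # 16777216
print(image_size_bits(1920, 1080, 24))        # 49766400 bits
print(image_size_bits(1920, 1080, 24) // 8)   # 6220800 bytes
```

Real image files are usually smaller than this because formats such as PNG and JPEG apply compression.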