RevisionDojo

The Concept of Compression

Definition

Compression

Compression is the process of reducing the size of data files to save storage space and improve transmission efficiency.

It plays a crucial role in modern computing, enabling faster data transfer, reduced storage costs, and optimized performance.

Note

Compression is essential for managing large data sets, optimizing web content delivery, and enhancing application performance by minimizing load times and storage requirements.

Lossless vs. Lossy Compression

Definition

Lossless compression

Lossless compression applies an algorithm that can later be reversed to restore the data, so the original can be perfectly reconstructed from the compressed version.

It achieves this by identifying and eliminating redundant patterns without losing any information.
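A minimal sketch of the round trip, using Python's standard `zlib` module as one example of a lossless compressor. The sample data is made up for illustration and is deliberately repetitive so that it compresses well.

```python
import zlib

# Made-up, highly repetitive sample data (an assumption for illustration).
original = b"ABABABABABABABABABAB" * 50

compressed = zlib.compress(original)    # reversible compression step
restored = zlib.decompress(compressed)  # reverse the algorithm

print(len(original), "bytes before,", len(compressed), "bytes after")
print("Exact reconstruction:", restored == original)
```

The equality check is the defining property: a lossless method must give back every byte of the original.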

Definition

Lossy compression

Lossy compression reduces file size by discarding some data that is less noticeable or redundant, resulting in an approximation of the original content.

It achieves higher compression ratios than lossless methods by removing details that have minimal impact on perception, making it well suited to multimedia content such as images, audio, and video.
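One simple way to discard detail is quantization: rounding each value to a coarser grid. The sketch below is only an illustration of the idea, not how any particular image or audio format works; the function name `quantize`, the sample values, and the step size are all assumptions.

```python
def quantize(samples, step):
    """Round each non-negative integer sample to the nearest multiple of step.

    This is lossy: different inputs can map to the same output,
    so the original values cannot be recovered afterwards.
    """
    return [(s + step // 2) // step * step for s in samples]

# Hypothetical sample values, e.g. brightness or audio levels.
samples = [12, 13, 15, 200, 201, 198, 90, 91]
approx = quantize(samples, 8)

print(approx)  # [16, 16, 16, 200, 200, 200, 88, 88]
print("Distinct values:", len(set(samples)), "->", len(set(approx)))
```

The approximation has fewer distinct values and longer runs, which makes it easier to store compactly, but the exact originals are gone.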

Aspect	Lossless Compression	Lossy Compression
Data Integrity	Preserves original data	Discards some data
Compression Ratio	Lower	Higher
Applications	Backups, archival storage, text files	Images, audio, video
Reversibility	Reversible (exact reconstruction)	Irreversible (data loss is permanent)
Perceptual Redundancy	Does not exploit perceptual redundancy	Exploits perceptual redundancy to discard less noticeable data

Note

The choice between lossless and lossy compression depends on the specific application and the trade-off between file size and data integrity.

Note

Lossless compression requires an algorithm to decompress the file before the data can be accessed.
- This may be a problem for small embedded systems that do not have the resources to decompress the file.
With lossy compression the discarded data is gone for good, so there is nothing to restore; when the saving comes from simply storing less detail (for example, a lower sample rate or resolution), the file can be used as is.
- This suits small embedded systems that have limited processing power.
E.g. a kids' toy with sound has limited processing power and storage space, so lossy compression is the better fit.

Lossless Compression Methods

Huffman

Analyses the frequency of symbols (letters, numbers, etc.) in a file.
Assigns shorter binary codes to more common symbols, and longer codes to rare ones.
The codes are read off a binary tree and are prefix-free: no code is a prefix of another, so the bit stream can be decoded without separators.

The result: fewer bits used for common characters → smaller file.
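The steps above can be sketched as follows. This is a minimal illustration that builds the codes with a priority queue; the function name `huffman_codes` and the sample text are assumptions, and a real compressor would also need to store the code table alongside the encoded bits.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return a dict mapping each symbol in text to its Huffman bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, unique tie-breaker, {symbol: code so far}).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees into one.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]

text = "AAAAABBBCCD"  # made-up sample: A is common, D is rare
codes = huffman_codes(text)
encoded = "".join(codes[ch] for ch in text)

print(codes)
print(len(encoded), "bits, versus", 8 * len(text), "bits at 8 bits per character")
```

For this sample the most frequent symbol, A, receives a one-bit code, and the whole string encodes to 20 bits.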

Analogy

Like giving short nicknames to people you talk to often, and full names to rare contacts.

Run-Length Encoding (RLE)

Definition

Run-Length Encoding (RLE)

Run-Length Encoding (RLE) is a lossless data compression technique that replaces consecutive repeating occurrences of a symbol with a single instance of the symbol followed by a count of its repetitions.

How RLE Works

Identify Consecutive Values: RLE scans the data for consecutive occurrences of the same value.
Replace with Code: Each run is replaced with a code representing the value and its count.

Example

The sequence AAAAABCCC becomes 5A1B3C.
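A minimal sketch of this count-then-symbol scheme, with a matching decoder to show that it is lossless. The function names are illustrative, and the decoder assumes the original text contains no digit characters, since digits would be confused with the counts.

```python
def rle_encode(text):
    """Encode text as count+symbol pairs, e.g. 'AAAB' -> '3A1B'."""
    if not text:
        return ""
    pieces = []
    run_char, run_len = text[0], 1
    for ch in text[1:]:
        if ch == run_char:
            run_len += 1  # still inside the same run
        else:
            pieces.append(f"{run_len}{run_char}")  # close the finished run
            run_char, run_len = ch, 1
    pieces.append(f"{run_len}{run_char}")  # close the final run
    return "".join(pieces)

def rle_decode(encoded):
    """Reverse rle_encode. Assumes the original text had no digits."""
    pieces = []
    count = ""
    for ch in encoded:
        if ch.isdigit():
            count += ch  # counts may span several digits
        else:
            pieces.append(ch * int(count))
            count = ""
    return "".join(pieces)

print(rle_encode("AAAAABCCC"))  # 5A1B3C
print(rle_decode("5A1B3C"))     # AAAAABCCC
```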

Common Mistake

Assuming that RLE always makes content smaller.
If the data contains few repeated characters, the encoded version can actually be larger than the original.
- Example 1: AAAAABCCCC (10 characters) becomes 5A1B4C (6 characters)
- Example 2: HELLO_WORLD (11 characters) becomes 1H1E2L1O1_1W1O1R1L1D (20 characters)
Example 1 shrinks because it contains long runs of repeated characters, whereas Example 2 grows because almost every run has length 1 and each one still costs two characters.
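The two examples can be checked with a short script. This sketch uses a compact encoder built on `itertools.groupby` from the standard library; the function name is illustrative.

```python
from itertools import groupby

def rle_encode(text):
    """Encode each run of identical characters as count followed by symbol."""
    return "".join(f"{len(list(group))}{symbol}" for symbol, group in groupby(text))

for sample in ("AAAAABCCCC", "HELLO_WORLD"):
    encoded = rle_encode(sample)
    change = "smaller" if len(encoded) < len(sample) else "larger"
    print(f"{sample} ({len(sample)}) -> {encoded} ({len(encoded)}): {change}")
```

Comparing lengths before committing to the encoded form is a cheap way to decide whether RLE is worth applying to a given piece of data.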

Applications of RLE

Text Files: Compressing spaces or repeated characters.
Images: Efficient for images with large areas of uniform color, such as faxes or simple graphics.

Note

RLE is most effective for data with repetitive patterns but less efficient for data with high variability.

A1.1.8 Concept of Compression Notes
