Run-length encoding is a simple data compression technique that works by identifying repeated consecutive values in data and storing them as a single value and count. It is well-suited for compressing image, audio, and other data containing long runs of repeated values. While run-length encoding achieves lower compression ratios than more advanced methods, it is very fast and easy to implement, making it a practical choice for many applications. Variants include encoding in different directions or discarding least significant bits to further improve compression at the cost of lossiness.
Lecture 11 - Run-Length Encoding
Run-Length Encoding
By Fareed Ahmed Jokhio
• Data files frequently contain the same character repeated many times in a row.
• For example, text files use multiple spaces to separate sentences, indent paragraphs, and format tables and charts.
• Digitized signals can also have runs of the same value, indicating that the signal is not changing.
• For instance, an image of the nighttime sky would contain long runs of the character or characters representing the black background.
• Likewise, digitized music might have a long run of zeros between songs.
• Run-length encoding is a simple method of compressing these types of files.
• The figure on the next slide illustrates run-length encoding for a data sequence having frequent runs of zeros.
• Each time a zero is encountered in the input data, two values are written to the output file.
• The first of these values is a zero, a flag to indicate that run-length compression is beginning.
• The second value is the number of zeros in the run.
• If the average run length is longer than two, compression will take place.
• On the other hand, many single zeros in the data can make the encoded file larger than the original.
• Many different run-length schemes have been developed.
• For example, the input data can be treated as individual bytes, or groups of bytes that represent something more elaborate, such as floating point numbers.
• Run-length encoding can be used on only one of the characters (as with the zero above), several of the characters, or all of the characters.
• Run-length encoding is a data compression algorithm that is supported by most bitmap file formats, such as TIFF, BMP, and PCX.
• RLE is suited for compressing any type of data regardless of its information content, but the content of the data will affect the compression ratio achieved by RLE.
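The zero-flag scheme described in the slides above (a zero, followed by the number of zeros in the run) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the sample data are made up for this example, and the run count is assumed to be an unbounded integer rather than a value limited to one byte.

```python
def encode_zero_runs(data):
    """Replace every run of zeros with the pair (0, run_length)."""
    out = []
    i = 0
    while i < len(data):
        if data[i] == 0:
            run = 0
            # Count how many zeros follow each other.
            while i < len(data) and data[i] == 0:
                run += 1
                i += 1
            out.extend([0, run])  # flag value, then the run length
        else:
            out.append(data[i])   # non-zero values are copied unchanged
            i += 1
    return out


def decode_zero_runs(encoded):
    """Expand every (0, run_length) pair back into a run of zeros."""
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] == 0:
            out.extend([0] * encoded[i + 1])
            i += 2
        else:
            out.append(encoded[i])
            i += 1
    return out


# Hypothetical sample data with frequent runs of zeros.
sample = [17, 8, 54, 0, 0, 0, 97, 5, 16, 0, 45, 23, 0, 0, 0, 0, 0, 3]
encoded = encode_zero_runs(sample)
print(encoded)                    # 15 values instead of 18
print(decode_zero_runs(encoded) == sample)
```

Note that the single zero in the sample grows from one value to two, which is why many isolated zeros can make the output larger than the input.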
• Although most RLE algorithms cannot achieve the high compression ratios of the more advanced compression methods, RLE is both easy to implement and quick to execute, making it a good alternative to either using a complex compression algorithm or leaving your image data uncompressed.
• RLE works by reducing the physical size of a repeating string of characters. This repeating string, called a run, is typically encoded into two bytes. The first byte represents the number of characters in the run and is called the run count.
• In practice, an encoded run may contain 1 to 128 or 256 characters; the run count usually holds the number of characters minus one (a value in the range of 0 to 127 or 255).
• The second byte is the value of the character in the run, which is in the range of 0 to 255, and is called the run value.
• Uncompressed, a character run of 15 A characters would normally require 15 bytes to store:
• AAAAAAAAAAAAAAA
• After RLE encoding, this string becomes:
• 15A
• The 15A code generated to represent the character string is called an RLE packet.
• Here, the first byte, 15, is the run count and contains the number of repetitions.
• The second byte, A, is the run value and contains the actual repeated value in the run.
• A new packet is generated each time the run character changes, or each time the number of characters in the run exceeds the maximum count.
• Assume that our 15-character string now contains four different character runs:
• AAAAAAbbbXXXXXt
• Using run-length encoding this could be compressed as:
• 6A3b5X1t
• Thus, after run-length encoding, the string would require only eight bytes of data, as opposed to the original 15 bytes.
• In this case, run-length encoding yielded a compression ratio of almost 2 to 1.
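The packet scheme above can be sketched as follows. This is a minimal illustration, not a reference implementation: the names `rle_encode`, `rle_decode`, and `packets_to_string` are invented here, the count is stored directly (as in the 15A example) rather than as count minus one, and `max_run=255` is an assumed limit for a one-byte run count.

```python
def rle_encode(text, max_run=255):
    """Return a list of (run_count, run_value) packets for `text`."""
    packets = []
    i = 0
    while i < len(text):
        value = text[i]
        count = 1
        # Extend the run while the character repeats and the
        # assumed maximum count has not been reached.
        while (i + count < len(text)
               and text[i + count] == value
               and count < max_run):
            count += 1
        packets.append((count, value))
        i += count
    return packets


def rle_decode(packets):
    """Rebuild the original string from (run_count, run_value) packets."""
    return "".join(value * count for count, value in packets)


def packets_to_string(packets):
    """Render packets in the readable '6A3b5X1t' style used in the slides."""
    return "".join(f"{count}{value}" for count, value in packets)


packets = rle_encode("AAAAAAbbbXXXXXt")
print(packets_to_string(packets))   # 6A3b5X1t
print(len(packets) * 2)             # 8 bytes at two bytes per packet
print(rle_decode(packets))          # AAAAAAbbbXXXXXt
```

A run longer than `max_run` is split into several packets, which matches the rule that a new packet starts when the maximum count is exceeded.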
• Long runs are rare in certain types of data.
• For example, ASCII plaintext seldom contains long runs.
• In the previous example, the last run (containing the character t) was only a single character in length; a 1-character run is still a run.
• Both a run count and a run value must be written for every 2-character run.
• To encode a run in RLE requires a minimum of two characters worth of information; therefore, a run of single characters actually takes more space.
• For the same reasons, data consisting entirely of 2-character runs remains the same size after RLE encoding.
• In our example, encoding the single character at the end as two bytes did not noticeably hurt our compression ratio because there were so many long character runs in the rest of the data.
• But observe how RLE encoding doubles the size of the following 14-character string:
• Xtmprsqzntwlfb
• After RLE encoding, this string becomes:
• 1X1t1m1p1r1s1q1z1n1t1w1l1f1b
• RLE schemes are simple and fast, but their compression efficiency depends on the type of image data being encoded.
• A black-and-white image that is mostly white, such as the page of a book, will encode very well, due to the large amount of contiguous data that is all the same color.
• An image with many colors that is very busy in appearance, however, such as a photograph, will not encode very well.
• This is because the complexity of the image is expressed as a large number of different colors.
• And because of this complexity there will be relatively few runs of the same color.
Variants on Run-Length Encoding
• There are a number of variants of run-length encoding.
• Image data is normally run-length encoded in a sequential process that treats the image data as a 1D stream, rather than as a 2D map of data.
• In sequential processing, a bitmap is encoded starting at the upper left corner and proceeding from left to right across each scan line (the X axis) to the bottom right corner of the bitmap (shown in figure (a) on the next slide).
• But alternative RLE schemes can also be written to encode data down the length of a bitmap (the Y axis) along the columns (shown in figure (b)), to encode a bitmap into 2D tiles (shown in figure (c)), or even to encode pixels on a diagonal in a zig-zag fashion (shown in figure (d)).
• Odd RLE variants such as this last one might be used in highly specialized applications but are usually quite rare.
• Another seldom-encountered RLE variant is a lossy run-length encoding algorithm.
• RLE algorithms are normally lossless in their operation.
• However, discarding data during the encoding process, usually by zeroing out one or two least significant bits in each pixel, can increase compression ratios without adversely affecting the appearance of very complex images.
• This RLE variant works well only with real-world images that contain many subtle variations in pixel values.
• Make sure that your RLE encoder always stops at the end of each scan line of bitmap data that is being encoded.
• There are several benefits to doing so.
• Encoding only a single scan line at a time means that only a minimal buffer size is required.
• Encoding only a single scan line at a time also prevents a problem known as cross-coding.
• Cross-coding is the merging of scan lines that occurs when the encoding process loses the distinction between the original scan lines.
• If the data of the individual scan lines is merged by the RLE algorithm, the point where one scan line stopped and another began is lost or, at least, is very hard to detect quickly.
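To make the cross-coding problem concrete, the sketch below encodes a tiny bitmap two ways: one scan line at a time, and as a single flattened stream. The bitmap, the function names, and the use of Python's `itertools.groupby` to find runs are all choices made for this illustration only.

```python
from itertools import groupby


def encode_line(pixels):
    """Encode one sequence of pixels as (run_count, run_value) packets."""
    return [(len(list(group)), value) for value, group in groupby(pixels)]


def encode_bitmap(rows):
    """Encode each scan line separately, so no run crosses a line boundary."""
    return [encode_line(row) for row in rows]


def encode_flat(rows):
    """Encode the whole bitmap as one stream; runs may span scan lines."""
    flat = [pixel for row in rows for pixel in row]
    return encode_line(flat)


# Hypothetical 2-line bitmap: line 1 ends with a 1, line 2 starts with 1s.
bitmap = [
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]

print(encode_bitmap(bitmap))  # [[(3, 0), (1, 1)], [(2, 1), (2, 0)]]
print(encode_flat(bitmap))    # [(3, 0), (3, 1), (2, 0)]
```

In the flattened result, the packet (3, 1) merges the last pixel of the first scan line with the first two pixels of the second, so the line boundary can no longer be found without counting pixels. Encoding per line costs one extra packet here but keeps each scan line independently decodable.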
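The lossy variant described earlier (zeroing out one or two least significant bits before encoding) can be sketched as below. The pixel values are invented for the example, 8-bit pixels are assumed, and the helper names are hypothetical.

```python
from itertools import groupby


def drop_low_bits(pixels, bits=2):
    """Zero the `bits` least significant bits of each 8-bit pixel (lossy)."""
    mask = 0xFF & ~((1 << bits) - 1)   # bits=2 gives the mask 0b11111100
    return [pixel & mask for pixel in pixels]


def rle_packets(pixels):
    """Encode pixels as (run_count, run_value) packets."""
    return [(len(list(group)), value) for value, group in groupby(pixels)]


# Hypothetical pixels with small variations around two brightness levels.
pixels = [200, 201, 203, 202, 96, 97, 99, 98]

print(len(rle_packets(pixels)))             # 8 packets: no runs at all
print(rle_packets(drop_low_bits(pixels)))   # [(4, 200), (4, 96)]
```

The decoded image would contain 200 and 96 in place of the original values, so the subtle variations are lost; that is the price paid for the longer runs.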