
Lecture 11 - Run-Length Encoding

Run-length encoding is a simple data compression technique that works by identifying repeated consecutive values in data and storing them as a single value and count. It is well-suited for compressing image, audio, and other data containing long runs of repeated values. While run-length encoding achieves lower compression ratios than more advanced methods, it is very fast and easy to implement, making it a practical choice for many applications. Variants include encoding in different directions or discarding least significant bits to further improve compression at the cost of lossiness.

Run-Length Encoding

By Fareed Ahmed Jokhio


Run-Length Encoding
• Data files frequently contain the same
character repeated many times in a row.
• For example, text files use multiple spaces to
separate sentences, indent paragraphs, format
tables & charts, etc.
• Digitized signals can also have runs of the
same value, indicating that the signal is not
changing. 
Run-Length Encoding
• For instance, an image of the nighttime sky
would contain long runs of the character or
characters representing the black background.
• Likewise, digitized music might have a long run
of zeros between songs.
• Run-length encoding is a simple method of
compressing these types of files.
Run-Length Encoding
• The figure on the next slide illustrates run-length
encoding for a data sequence having frequent
runs of zeros.
• Each time a zero is encountered in the input data,
two values are written to the output file.
• The first of these values is a zero, a flag to indicate
that run-length compression is beginning.
• The second value is the number of zeros in the
run. 
Run-Length Encoding
• If the average run-length is longer than two,
compression will take place.
• On the other hand, many single zeros in the
data can make the encoded file larger than the
original.
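
The zero-flag scheme described above can be sketched as follows. This is a minimal illustration, not the lecture's own code: the function names and sample data are hypothetical, and it assumes the run count always fits in one output value (no maximum run length is enforced).

```python
def encode_zero_runs(values):
    """Replace each run of zeros with the pair (0, run_length); copy other values as-is."""
    out = []
    i = 0
    while i < len(values):
        if values[i] == 0:
            run = 0
            while i < len(values) and values[i] == 0:
                run += 1
                i += 1
            out.extend([0, run])  # flag value, then number of zeros in the run
        else:
            out.append(values[i])
            i += 1
    return out


def decode_zero_runs(encoded):
    """Invert encode_zero_runs: a 0 is always followed by the length of the zero run."""
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] == 0:
            out.extend([0] * encoded[i + 1])
            i += 2
        else:
            out.append(encoded[i])
            i += 1
    return out


# Hypothetical sample data: one sequence with long zero runs, one with isolated zeros.
long_runs = [17, 8, 54, 0, 0, 0, 97, 5, 16, 0, 45, 23, 0, 0, 0, 0, 0, 3]
single_zeros = [5, 0, 7, 0, 9, 0]

print(len(long_runs), "->", len(encode_zero_runs(long_runs)))        # 18 -> 15
print(len(single_zeros), "->", len(encode_zero_runs(single_zeros)))  # 6 -> 9
```

The second sample shows the failure case: every isolated zero costs two output values, so the encoded data grows.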
Run-Length Encoding
• Many different run-length schemes have been
developed.
• For example, the input data can be treated as
individual bytes, or groups of bytes that
represent something more elaborate, such as
floating point numbers.
• Run-length encoding can be used on only one of
the characters (as with the zero above), several of
the characters, or all of the characters.
Run-Length Encoding
• Run-length encoding is a data compression
algorithm that is supported by most bitmap
file formats, such as TIFF, BMP, and PCX.
• RLE is suited for compressing any type of data
regardless of its information content, but the
content of the data will affect the compression
ratio achieved by RLE.
Run-Length Encoding
• Although most RLE algorithms cannot achieve
the high compression ratios of the more
advanced compression methods, RLE is both
easy to implement and quick to execute,
making it a good alternative to either using a
complex compression algorithm or leaving
your image data uncompressed.
Run-Length Encoding
• RLE works by reducing the physical size of a
repeating string of characters. This repeating
string, called a run, is typically encoded into
two bytes. The first byte represents the
number of characters in the run and is called
the run count.
Run-Length Encoding
• In practice, an encoded run may contain 1 to
128 or 256 characters; the run count is usually
stored as the number of characters minus
one (a value in the range of 0 to 127 or 255).
• The second byte is the value of the character
in the run, which is in the range of 0 to 255,
and is called the run value.
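
A two-byte packet with the count stored as "length minus one" might look like the sketch below. It assumes the full-byte variant (runs of 1 to 256); the name rle_encode_bytes and the constant MAX_RUN are illustrative, not taken from any particular file format.

```python
MAX_RUN = 256  # assumption: the count byte holds (length - 1), so runs of 1..256 fit


def rle_encode_bytes(data: bytes) -> bytes:
    """Encode data as (run count, run value) byte pairs, count stored as length - 1."""
    out = bytearray()
    i = 0
    while i < len(data):
        value = data[i]
        run = 1
        # Extend the run while the value repeats and the packet is not yet full.
        while i + run < len(data) and data[i + run] == value and run < MAX_RUN:
            run += 1
        out.append(run - 1)  # first byte: run count minus one (0..255)
        out.append(value)    # second byte: run value (0..255)
        i += run
    return bytes(out)


print(list(rle_encode_bytes(b"A" * 15)))       # [14, 65]
print(list(rle_encode_bytes(b"\x00" * 300)))   # [255, 0, 43, 0]
```

The 300-byte run does not fit in one packet, so it is split into a full packet of 256 and a second packet of 44.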
Run-Length Encoding
• Uncompressed, a character run of 15 A
characters would normally require 15 bytes to
store:
• AAAAAAAAAAAAAAA
• After RLE encoding, this string becomes:
• 15A
Run-Length Encoding
• The 15A code generated to represent the
character string is called an RLE packet.
• Here, the first byte, 15, is the run count and
contains the number of repetitions.
• The second byte, A, is the run value and
contains the actual repeated value in the run.
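
Reading a packet back is the reverse step: repeat the run value as many times as the run count says. The sketch below assumes the textual notation used on these slides (a decimal count followed by one character) and input whose run values are never digits; rle_decode_text is a hypothetical name.

```python
def rle_decode_text(encoded: str) -> str:
    """Expand packets written as <decimal count><character>, e.g. '15A'."""
    out = []
    count = ""
    for ch in encoded:
        if "0" <= ch <= "9":
            count += ch                    # still reading the run count
        else:
            out.append(ch * int(count))    # ch is the run value
            count = ""
    return "".join(out)


print(rle_decode_text("15A"))  # AAAAAAAAAAAAAAA
```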
Run-Length Encoding
• A new packet is generated each time the run
character changes, or each time the number of
characters in the run exceeds the maximum count.
• Assume that our 15-character string now contains
four different character runs:
• AAAAAAbbbXXXXXt
• Using run-length encoding this could be
compressed as:
• 6A3b5X1t
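
One way to reproduce this example is with itertools.groupby from the Python standard library, which groups consecutive equal items. This is a sketch in the slides' textual notation (count, then character); rle_encode_text is a hypothetical name, and no maximum run count is enforced.

```python
from itertools import groupby


def rle_encode_text(text: str) -> str:
    """Write each run of identical characters as <count><character>."""
    return "".join(f"{len(list(group))}{char}" for char, group in groupby(text))


print(rle_encode_text("AAAAAAbbbXXXXXt"))  # 6A3b5X1t
```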
Run-Length Encoding
• Thus, after run-length encoding, the 15-byte
string would require only eight bytes of data
to represent the string, as opposed to the
original 15 bytes.
• In this case, run-length encoding yielded a
compression ratio of almost 2 to 1.
Run-Length Encoding
• Long runs are rare in certain types of data.
• For example, ASCII plaintext seldom contains
long runs.
• In the previous example, the last run (containing
the character t) was only a single character in
length; a 1-character run is still a run.
• Both a run count and a run value must be
written for every run, even a 1-character one.
Run-Length Encoding
• To encode a run in RLE requires a minimum of
two characters' worth of information;
therefore, a 1-character run actually
takes more space once encoded.
• For the same reasons, data consisting entirely
of 2-character runs remains the same size
after RLE encoding.
Run-Length Encoding
• In our example, encoding the single character
at the end as two bytes did not noticeably
hurt our compression ratio because there
were so many long character runs in the rest
of the data.
• But observe how RLE encoding doubles the
size of the following 14-character string:
Run-Length Encoding
• Xtmprsqzntwlfb
• After RLE encoding, this string becomes:
• 1X1t1m1p1r1s1q1z1n1t1w1l1f1b
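
The size effect in both examples can be checked with a few lines. This sketch assumes each run costs exactly two bytes (one count byte, one value byte) and that no run exceeds the maximum count; rle_size is a hypothetical helper.

```python
from itertools import groupby


def rle_size(text: str) -> int:
    """Encoded size in bytes, assuming one count byte plus one value byte per run."""
    number_of_runs = sum(1 for _ in groupby(text))
    return 2 * number_of_runs


for sample in ("AAAAAAbbbXXXXXt", "Xtmprsqzntwlfb"):
    print(f"{sample}: {len(sample)} bytes -> {rle_size(sample)} bytes")
```

The first string shrinks from 15 to 8 bytes; the second, which has no repeated neighbours, grows from 14 to 28 bytes.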
Run-Length Encoding
• RLE schemes are simple and fast, but their
compression efficiency depends on the type of
image data being encoded.
• A black-and-white image that is mostly white,
such as the page of a book, will encode very
well, due to the large amount of contiguous
data that is all the same color.
Run-Length Encoding
• An image with many colors that is very busy in
appearance, however, such as a photograph,
will not encode very well.
• This is because the complexity of the image is
expressed as a large number of different
colors.
• And because of this complexity there will be
relatively few runs of the same color.
Variants on Run-Length Encoding
• There are a number of variants of run-length
encoding.
• Image data is normally run-length encoded in a
sequential process that treats the image data as a 1D
stream, rather than as a 2D map of data.
• In sequential processing, a bitmap is encoded
starting at the upper left corner and proceeding from
left to right across each scan line (the X axis) to the
bottom right corner of the bitmap (shown in figure
(a) on next slide).
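Variants on Run-Length Encoding
[Figure: RLE encoding directions: (a) along scan lines, (b) along columns, (c) in 2D tiles, (d) diagonal zig-zag]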
Variants on Run-Length Encoding
• But alternative RLE schemes can also be written
to encode data down the length of a bitmap
(the Y axis) along the columns (shown in figure
(b)), to encode a bitmap into 2D tiles (shown in
figure (c)), or even to encode pixels on a
diagonal in a zig-zag fashion (shown in figure (d)).
• Odd RLE variants such as this last one might be
used in highly specialized applications but are
usually quite rare.
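
The effect of the encoding direction can be seen on a small bitmap. The sketch below compares only the row-wise and column-wise orders; the tiny bitmap and the function names are made up for illustration, and the image is treated as a single 1D stream in both cases.

```python
from itertools import groupby


def runs(pixels):
    """Return (run count, run value) pairs for a 1D stream of pixels."""
    return [(len(list(group)), value) for value, group in groupby(pixels)]


def rle_rows(bitmap):
    """Encode left to right along each scan line, top to bottom."""
    return runs(pixel for row in bitmap for pixel in row)


def rle_columns(bitmap):
    """Encode top to bottom along each column, left to right."""
    return runs(pixel for column in zip(*bitmap) for pixel in column)


# Hypothetical 3x4 bitmap with a vertical stripe.
bitmap = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]

print(len(rle_rows(bitmap)), "runs by rows")        # 7 runs by rows
print(len(rle_columns(bitmap)), "runs by columns")  # 3 runs by columns
```

For this vertically striped image the column order produces far fewer runs; a horizontally striped image would favour the row order instead.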
Variants on Run-Length Encoding
• Another seldom-encountered RLE variant is a lossy
run-length encoding algorithm.
• RLE algorithms are normally lossless in their operation.
• However, discarding data during the encoding process,
usually by zeroing out one or two least significant bits
in each pixel, can increase compression ratios without
adversely affecting the appearance of very complex
images.
• This RLE variant works well only with real-world images
that contain many subtle variations in pixel values.
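
A lossy variant of this kind can be sketched by masking the low bits before counting runs. The example assumes 8-bit pixel values; the pixel data and the names drop_low_bits and count_runs are hypothetical.

```python
from itertools import groupby


def count_runs(pixels):
    """Number of RLE packets needed (one per run of identical values)."""
    return sum(1 for _ in groupby(pixels))


def drop_low_bits(pixels, bits=2):
    """Zero out the given number of least significant bits of each 8-bit pixel."""
    mask = 0xFF & ~((1 << bits) - 1)  # bits=2 gives 0xFC
    return [p & mask for p in pixels]


# Hypothetical scan line with subtle variations around two brightness levels.
pixels = [200, 201, 203, 202, 200, 96, 97, 99, 98, 96]

print(count_runs(pixels))                 # 10
print(count_runs(drop_low_bits(pixels)))  # 2
```

The original values cannot be recovered after masking, which is what makes the scheme lossy.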
Variants on Run-Length Encoding
• Make sure that your RLE encoder always stops at
the end of each scan line of bitmap data that is
being encoded.
• There are several benefits to doing so.
• Encoding only a single scan line at a time means
that only a minimal buffer size is required.
• Encoding only a single scan line at a time also
prevents a problem known as cross-coding.
Variants on Run-Length Encoding
• Cross-coding is the merging of scan lines that
occurs when the encoding process loses the
distinction between the original scan lines.
• If the data of the individual scan lines is
merged by the RLE algorithm, the point where
one scan line stopped and another began is
lost or, at least, is very hard to detect quickly.
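
The difference between stopping at each scan line and letting runs cross line boundaries can be sketched as follows. The bitmap and function names are illustrative assumptions, not part of any specific format.

```python
from itertools import groupby


def runs(pixels):
    """Return (run count, run value) pairs for a 1D stream of pixels."""
    return [(len(list(group)), value) for value, group in groupby(pixels)]


def encode_scan_lines(bitmap):
    """Encode each scan line on its own, so no run crosses a line boundary."""
    return [runs(row) for row in bitmap]


def encode_cross_coded(bitmap):
    """Encode the whole bitmap as one stream; runs may span two scan lines."""
    return runs(pixel for row in bitmap for pixel in row)


# Hypothetical 2x4 bitmap where line 1 ends and line 2 starts with the same value.
bitmap = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
]

print(encode_scan_lines(bitmap))   # [[(2, 0), (2, 1)], [(2, 1), (2, 0)]]
print(encode_cross_coded(bitmap))  # [(2, 0), (4, 1), (2, 0)]
```

In the cross-coded output the run of four 1s hides where the first scan line ended; the per-line output keeps that boundary explicit at the cost of one extra packet.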
