Module 2
Module 2
Module 2
Dr Pushpavathi.K.P, BMSCE
Types of text
• Unformatted text – Plain text.
• Hypertext.
Dr Pushpavathi.K.P, BMSCE
Unformatted text
• ASCII character set.
• Printable characters- alphabetic, numeric, punctuation.
• Control characters:
1. Format control character- BS, SP, DEL, ESC.
2. Information separators - FS, RS.
3. Transmission control characters – SOH, STX, ETX, ACK, NAK,
SYN.
• Mosaic characters- supplementary version – to create relatively
simple graphical images.
Dr Pushpavathi.K.P, BMSCE
Character set to produce unformatted text
ASCII
character set
Dr Pushpavathi.K.P, BMSCE
Character set to produce unformatted text (cntd.)
Mosaic characters
Dr Pushpavathi.K.P, BMSCE
Formatted text
• Publishing sector.
• Characters of different styles and variable size.
• Structure a document into chapters, paragraph.
• Graphics and pictures inserted at appropriate points.
• Enter a specific command.
• Reserved format control character.
• Numeric or alphabetic character.
Dr Pushpavathi.K.P, BMSCE
Formatted text (cntd.)
a) formatted text string b) printed version of string
Dr Pushpavathi.K.P, BMSCE
Hyper Text
• Pages
• Defined linkage points-hyperlinks
• Electronic version of the documents.
• Browser.
• Home page
• URL
• HTML- presentation of the document.
• Directives – page formatting commands sandwiched between pair of
tags , eg: <p> , <B>text</B>
Dr Pushpavathi.K.P, BMSCE
Electronic document written using hypertext
Dr Pushpavathi.K.P, BMSCE
Dr Pushpavathi.K.P, BMSCE
Digital document
Dr Pushpavathi.K.P, BMSCE
Digital document (cntd.)
• Vertical resolution – 3.85 or 7.7 lines/mm or 100 or 200 lines/inch
• o/p of scanner – Resolution of apprx. 8 pels
• Bitonical images –black and white
• One binary digit –represent each pel
0 for white , 1 for black pel
• Typicla image – stream of 2 million bits - uncompressed
Dr Pushpavathi.K.P, BMSCE
Facsimile conversion codes
• ITU-T group 3 and group 4- standards
a) Termination codes – run length – 0 to 63
b) Make-up codes – run length – multiples of 64
• Table of code words –produced - based on relative frequency of
occurrence of no. of white and black pels in scanned lines.
• Code words are fixed
Dr Pushpavathi.K.P, BMSCE
Facsimile conversion codes (cntd.)
Dr Pushpavathi.K.P, BMSCE
Compression principles
Dr Pushpavathi.K.P, BMSCE
Source Encoders and Destination Decoders
Dr Pushpavathi.K.P, BMSCE
Lossless and lossy compression
Dr Pushpavathi.K.P, BMSCE
Entropy encoding:
• Lossless and independent.
• Information representation.
• Two examples:
1. Run length encoding.
2. Statistical encoding.
Dr Pushpavathi.K.P, BMSCE
Entropy encoding (cntd.)
• Entropy of the source – minimum average no of bits required to
transmit a particular source stream
• Qualitatively - defined as the useful information
• Ebits(C) = -log2(PC) , where Ebits(C) –entropy of character C, - in bits
PC – probability of character C in given data
Dr Pushpavathi.K.P, BMSCE
Statistical encoding:
• Variable length code word.
• Prefix property.
• Huffman encoding algorithm.
• Entropy.
entropy of source
• Efficiency of encoding scheme =
average number of bits per codeword
Dr Pushpavathi.K.P, BMSCE
Source encoding:
• Differential encoding.
• Transform encoding.
Dr Pushpavathi.K.P, BMSCE
Differential encoding:
• Amplitude of signal covers larger range, but difference in amplitude
between successive values is relatively small.
• Smaller set of codewords.
• Indicates difference in amplitude between current signal and
immediately preceding value.
• Can be lossless or lossy – depending on number of bits to encode
different value.
Dr Pushpavathi.K.P, BMSCE
Transform encoding
• Transforming the source information.
• No loss of information.
• Digitization of image produces two dimensional matrix.
• Rate of change in magnitude – Spatial frequency.
• Horizontal and vertical frequency components.
• Human eye is less sensitive to higher spatial frequency.
• DCT.
Dr Pushpavathi.K.P, BMSCE
Transform encoding: pixel pattern
Dr Pushpavathi.K.P, BMSCE
Transform encoding : DCT principle
Dr Pushpavathi.K.P, BMSCE
Text Compression
• Compression algorithm must be lossless.
• Entropy encoding – Statistical encoding.
• Two types of Statistical encoding:
1. Optimum set of codewords – Huffman coding, Arithmetic coding
2. Variable length characters – Lempel-Ziv (LZ) algoritm.
• Two types of coding :
1. Static coding.
2. Dynamic or adaptive coding.
Dr Pushpavathi.K.P, BMSCE
Static Huffman Coding
• Transmitted character string is analyzed.
• Determine character types and relative frequency.
• Coding operation involves creating an unbalanced tree.
• Huffman code tree – Binary tree.
• Branches are assigned the value 0 or 1.
• Root node (top), branch node , leaf node (termination point).
Dr Pushpavathi.K.P, BMSCE
Huffman code tree
Example - character string : AAAABBCD (Final tree with codes)
Dr Pushpavathi.K.P, BMSCE
Dr Pushpavathi.K.P, BMSCE
Decoding Algorithm
Dr Pushpavathi.K.P, BMSCE
Dr Pushpavathi.K.P, BMSCE
Dr Pushpavathi.K.P, BMSCE
Decoding (cntd.)
Dr Pushpavathi.K.P, BMSCE
Dynamic (adaptive) Huffman Coding
Dr Pushpavathi.K.P, BMSCE
Adaptive Huffman tree
Dr Pushpavathi.K.P, BMSCE
Ref: multimedia communication , components, Techniques, standards – Krishna kumar D N
Incrementing the count : if next symbol is ‘A’
Dr Pushpavathi.K.P, BMSCE
Ref: multimedia communication , components, Techniques, standards – Krishna kumar D N
Sibling property - swapping
One more increment in ‘A’
Dr Pushpavathi.K.P, BMSCE
Ref: multimedia communication , components, Techniques, standards – Krishna kumar D N
Swapping of internal nodes
Receiving two more ‘A’ s
Dr Pushpavathi.K.P, BMSCE
Ref: multimedia communication , components, Techniques, standards – Krishna kumar D N
Arithmetic Coding
Dr Pushpavathi.K.P, BMSCE
Arithmetic Coding (cntd.)
e=0.3, n=0.3, t=0.2, w=0.1, .=0.1 ‘WENT’
Dr Pushpavathi.K.P, BMSCE
Arithmetic Coding (decoding)
• Decoder knows:
• Set of characters present in the received encoded messages
• Segment to which each character has been assigned
• Its related range.
• Follow the same procedure as in encoder.
• No. of decimal digits increase linearly with no. of characters in string.
• Complex messages –first fragmented into multiple smaller strings –
encoded separately
• Resulting set of code words – sent as blocks of floating point numbers
(binary) in known formats.
Dr Pushpavathi.K.P, BMSCE
Lempel-Ziv Coding
• Uses strings of characters for the coding instead of single characters.
• Table holds character string of the text (all possible words).
• Encoder sends only the index of the word.
• Decoder use this index to access the word.
• Table is used as dictionary.
• Dictionary based compression algorithm.
• Word processing packages have dictionary-15 bits (25000 words).
• Shorter words – lower compression ratio, longer words – higher.
Dr Pushpavathi.K.P, BMSCE
Lempel-Ziv-Welsh Coding
Dr Pushpavathi.K.P, BMSCE
Lempel-Ziv-Welsh (LZW) compression algorithm
Basic operation
Dr Pushpavathi.K.P, BMSCE
Dr Pushpavathi.K.P, BMSCE
Reference
• Multimedia communications- Applications , networks, Protocols and
Standards – Fred Halsall
Dr Pushpavathi.K.P, BMSCE