The standard was designed for videophone, video conferencing, and other audiovisual
services over ISDN telephone lines. Initially, it was intended to support multiples (from 1 to
5) of 384 kbps channels. In the end, however, the video codec supports bitrates of p x 64
kbps, where p ranges from 1 to 30. Hence the standard was once known as p * 64,
pronounced "p star 64". The standard requires the video encoder's delay to be less than 150
msec, so that the video can be used for real-time, bidirectional video conferencing.
H.261 belongs to the following set of ITU recommendations for visual telephony systems:
H.221. Frame structure for an audiovisual channel supporting 64 to 1,920 kbps
H.230. Frame control signals for audiovisual systems
Table Video formats supported by H.261
The following figure illustrates a typical H.261 frame sequence. Two types of image frames
are defined: intra-frames (I-frames) and inter-frames (P-frames).
I-frames are treated as independent images. Basically, a transform coding method similar
to JPEG is applied within each I-frame, hence the name "intra".
P-frames are not independent. They are coded by a forward predictive coding method in
which current macroblocks are predicted from similar macroblocks in the preceding I- or
P-frame, and differences between the macroblocks are coded. Temporal
redundancy removal is hence included in P-frame coding, whereas I-frame coding
performs only spatial redundancy removal. It is important to remember that prediction from
a previous P-frame is allowed (not just from a previous I-frame).
The interval between pairs of I-frames is variable and is determined by the encoder.
Usually, an ordinary digital video has a couple of I-frames per second. Motion vectors in
H.261 are always measured in units of full pixels and have a limited range of ±15 pixels,
that is, p = 15.
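To make the ±15 full-pixel search range concrete, here is a minimal sketch of an exhaustive block-matching search. The function names (`sad`, `full_search`), the use of the sum of absolute differences as the matching cost, and the representation of frames as lists of rows are illustrative assumptions; the text does not prescribe how an encoder finds its motion vectors.

```python
def sad(cur, ref, cx, cy, rx, ry, n=16):
    """Sum of absolute differences between the n x n block at (cx, cy) in cur
    and the n x n block at (rx, ry) in ref. Frames are lists of rows."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, cx, cy, n=16, p=15):
    """Return (dx, dy, cost) of the best full-pixel match within +/-p pixels.

    (cx, cy) is the top-left corner of the current macroblock. SAD is an
    assumed cost measure, chosen only for illustration.
    """
    height, width = len(ref), len(ref[0])
    best = (0, 0, sad(cur, ref, cx, cy, cx, cy, n))
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue  # candidate block falls outside the reference frame
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if cost < best[2]:
                best = (dx, dy, cost)
    return best
```

With p = 15 the search visits at most 31 x 31 = 961 candidate positions per macroblock, which is why practical encoders often replace the exhaustive scan with faster search strategies.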
H.261 Frame sequence
I-frame coding
Intra-Frame (I-Frame) Coding
Macroblocks are of size 16 x 16 pixels for the Y frame of the original image. For Cb and Cr
frames, they correspond to areas of 8 x 8, since 4:2:0 chroma subsampling is employed.
Hence, a macroblock consists of four Y blocks, one Cb block, and one Cr block, each 8 x 8.
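The macroblock layout can be sketched as follows. The helper name `macroblock_blocks`, the plane representation (lists of rows), and the ordering of the four Y blocks (top-left, top-right, bottom-left, bottom-right) are assumptions made for this example.

```python
def macroblock_blocks(y_plane, cb_plane, cr_plane, mb_x, mb_y):
    """Return the six 8 x 8 blocks [Y0, Y1, Y2, Y3, Cb, Cr] of the macroblock
    in column mb_x, row mb_y. With 4:2:0 subsampling the chroma planes have
    half the luma resolution in each direction."""
    def block(plane, x0, y0):
        return [row[x0:x0 + 8] for row in plane[y0:y0 + 8]]

    lx, ly = 16 * mb_x, 16 * mb_y  # top-left corner in the Y plane
    cx, cy = 8 * mb_x, 8 * mb_y    # top-left corner in the Cb/Cr planes
    luma = [block(y_plane, lx + dx, ly + dy) for dy in (0, 8) for dx in (0, 8)]
    return luma + [block(cb_plane, cx, cy), block(cr_plane, cx, cy)]
```

Each macroblock therefore carries 4 x 64 luma samples and 2 x 64 chroma samples, 384 samples in total.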
For each 8 x 8 block, a DCT transform is applied. As in JPEG, the DCT coefficients go
through a quantization stage. Afterwards, they are zigzag-scanned and eventually
entropy-coded.
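As an illustration of the zigzag scan mentioned above, the sketch below generates the conventional JPEG-style zigzag order for an 8 x 8 block. The function names are hypothetical, and the sort-based construction is just one compact way to produce the order.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order.

    Positions are grouped by anti-diagonal (row + col). Odd diagonals are
    walked with the row increasing, even diagonals with the column increasing.
    """
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def zigzag_scan(block):
    """Flatten a square block (list of rows) into a zigzag-ordered list."""
    return [block[r][c] for r, c in zigzag_order(len(block))]
```

The scan places low-frequency coefficients first, so the zeros produced by quantization tend to cluster at the end of the list, where they can be coded compactly.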
H.261 P-frame coding based on motion compensation
Quantization in H.261
The quantization in H.261 does not use 8 x 8 quantization matrices, as in JPEG and MPEG.
Instead, it uses a constant, called stepsize, for all DCT coefficients within a macroblock.
According to the need (e.g., bitrate control of the video), stepsize can take on any one of the
31 even values from 2 to 62. One exception, however, is made for the DC coefficient in intra
mode, where a step size of 8 is always used. If we use DCT and QDCT to denote the DCT
coefficients before and after quantization, then for DC coefficients in intra mode, QDCT is
obtained by dividing DCT by this fixed step size of 8.
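The quantization rule can be sketched in a few lines. The code assumes that stepsize equals 2 x scale with scale in 1..31 (which yields the 31 even values from 2 to 62), that the intra DC coefficient is rounded to the nearest integer after division by 8, and that all other coefficients are truncated toward zero. The rounding conventions and the names `step_size` and `quantize` are assumptions of this sketch, not statements from the text.

```python
import math


def step_size(scale):
    """Map a scale in 1..31 to one of the 31 even step sizes 2, 4, ..., 62."""
    if not 1 <= scale <= 31:
        raise ValueError("scale must be in 1..31")
    return 2 * scale


def quantize(dct, scale, intra_dc=False):
    """Quantize one DCT coefficient.

    intra_dc=True uses the fixed step size 8 with round-to-nearest
    (assumed convention). All other coefficients use step_size(scale)
    with truncation toward zero (also an assumed convention).
    """
    if intra_dc:
        return math.floor(dct / 8 + 0.5)
    return int(dct / step_size(scale))
```

For example, `quantize(100, 5)` divides by a step size of 10, while `quantize(100, 5, intra_dc=True)` ignores the scale and divides by 8.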
Meanwhile, the quantized DCT coefficients for I are also sent to $Q^{-1}$ (inverse quantization)
and IDCT and hence appear at Point as the reconstructed frame $\tilde{I}$. Combined with a zero
input from Point, the data at Point remains as $\tilde{I}$, and this is stored in Frame Memory,
waiting to be used for Motion Estimation and Motion-Compensation-based Prediction for the
subsequent frame P1.
Quantization Control serves as feedback: when the Output Buffer is too full, the
quantization step size is increased, so as to reduce the size of the coded data. This is
known as an encoding rate control process.
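A buffer-driven feedback rule of this kind might look like the sketch below. The thresholds (`high`, `low`), the one-step adjustment, and the function name `next_step_size` are hypothetical. The text only states that the step size grows when the buffer is too full; the symmetric decrease when the buffer runs low is an added assumption.

```python
def next_step_size(step, buffer_bits, buffer_capacity, high=0.8, low=0.2):
    """Return the step size to use next, given the output buffer occupancy.

    The step size stays on the even values 2..62. The thresholds are
    illustrative and not taken from the standard.
    """
    fullness = buffer_bits / buffer_capacity
    if fullness > high:
        step += 2   # coarser quantization, fewer bits produced
    elif fullness < low:
        step -= 2   # finer quantization, better quality
    return max(2, min(62, step))
```

Called once per macroblock or group of blocks, such a rule keeps the buffer occupancy, and hence the output bitrate, within bounds at the cost of varying picture quality.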
When the subsequent Current Frame P1 arrives at Point 1, the Motion Estimation process is
invoked to find the motion vector for the best matching macroblock in frame I for each of
the macroblocks in P1. The estimated motion vector is sent to both Motion-Compensation-based
Prediction and Variable-Length Encoding (VLE). The MC-based Prediction yields
the best matching macroblock for each macroblock in P1. This is denoted as P'1, appearing at Point 2.
At Point, the "prediction error" is obtained, which is D1 = P1 - P'1. Now D1 undergoes DCT,
Quantization, and Entropy Coding, and the result is sent to the Output Buffer. As before, the
quantized DCT coefficients for D1 are also sent to $Q^{-1}$ and IDCT and appear at Point 4 as
the decoded prediction error $\tilde{D}_1$.
Adding this to P'1 at Point, we have the reconstructed frame $\tilde{P}_1 = P'_1 + \tilde{D}_1$
at Point 6. This is stored in Frame Memory,
waiting to be used for Motion Estimation and Motion-Compensation-based Prediction for
the subsequent frame P2. The steps for encoding P2 are similar to those for P1, except
that P2 will be the Current Frame and P1 becomes the Reference Frame.
For the decoder, the input code for frames will be decoded first by Entropy Decoding,
$Q^{-1}$, and IDCT. For Intra-frame mode, the first decoded frame appears at Point 1 and then
Point 4 as $\tilde{I}$. It is sent as the first output and at the same time stored in the Frame Memory.
Subsequently, the input code for Inter-frame P1 is decoded, and the prediction error $\tilde{D}_1$ is
received at Point. Since the motion vector for the current macroblock is also
entropy-decoded and sent to Motion-Compensation-based Prediction, the corresponding
predicted macroblock P'1 can be located in frame $\tilde{I}$ and will appear at Points.
Combined with $\tilde{D}_1$, we have $\tilde{P}_1 = P'_1 + \tilde{D}_1$ at Point, and it is sent out as the decoded
frame and also stored in the Frame Memory. Again, the steps for decoding P2 are similar to
those for P1.
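The encoder and decoder loops described above both add the decoded prediction error to the motion-compensated prediction, so both end up storing the same reference frame. The sketch below illustrates this on a single block. To stay short it skips the DCT/IDCT pair and quantizes the pixel differences directly, which is a deliberate simplification and not how H.261 operates. The function names and the truncating quantizer are assumptions.

```python
def encode_block(current, predicted, step):
    """Return (levels, reconstructed) for one block.

    levels are the quantized prediction errors that would be entropy-coded.
    reconstructed is what the encoder stores in its frame memory.
    Simplification: no DCT, the quantizer acts on pixel differences.
    """
    levels = [[int((c - p) / step) for c, p in zip(crow, prow)]
              for crow, prow in zip(current, predicted)]
    return levels, decode_block(levels, predicted, step)


def decode_block(levels, predicted, step):
    """Rebuild a block from quantized errors and the same prediction."""
    return [[p + lv * step for p, lv in zip(prow, lrow)]
            for prow, lrow in zip(predicted, levels)]
```

Because the encoder reconstructs from the quantized levels instead of reusing the original frame, its reference matches the decoder's exactly, and quantization errors do not accumulate from one P-frame to the next.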
Arrangement of GOBs in H.261 luminance images
GQuant indicates the quantizer to be used in the GOB, unless it is overridden by any
subsequent Macroblock Quantizer (MQuant). GQuant and MQuant are referred to
as scale. Each macroblock (MB) has its own Address, indicating its position within the GOB,
quantizer (MQuant), and six 8 x 8 image blocks (4 Y, 1 Cb, 1 Cr). Type denotes whether it is
an Intra- or Inter-, motion-compensated or non-motion-compensated macroblock. Motion
Vector Data (MVD) is obtained by taking the
difference between the motion vectors of the preceding and current macroblocks.
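The differential coding of motion vectors can be sketched as below. The sketch assumes the predictor starts at (0, 0) and is simply the previous macroblock's vector. The actual standard defines additional cases in which the predictor is reset, and these are not modeled here. The function names are hypothetical.

```python
def to_mvd(vectors):
    """Convert a sequence of motion vectors (dx, dy) into differences
    from the preceding macroblock's vector. The predictor starts at
    (0, 0), a simplifying assumption."""
    prev = (0, 0)
    out = []
    for mv in vectors:
        out.append((mv[0] - prev[0], mv[1] - prev[1]))
        prev = mv
    return out


def from_mvd(mvds):
    """Invert to_mvd by accumulating the differences."""
    prev = (0, 0)
    out = []
    for d in mvds:
        prev = (prev[0] + d[0], prev[1] + d[1])
        out.append(prev)
    return out
```

Neighboring macroblocks often move together, so the differences cluster around zero and can be given short variable-length codes.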
Moreover, since some blocks in the macroblocks match well and some match poorly in
Motion Estimation, a bitmask Coded Block Pattern (CBP) is used to indicate this
information. Only well-matched blocks will have their coefficients transmitted.
Block layer. For each 8 x 8 block, the bitstream starts with the DC value, followed by pairs
of the length of the zero-run (Run) and the subsequent nonzero value (Level) for ACs, and
finally the End of Block (EOB) code. The range of "Run" is [0, 63]. "Level" reflects quantized
values; its range is [-127, 127], and Level ≠ 0.
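The block-layer symbol stream (DC value, then Run/Level pairs, then EOB) can be illustrated with the following sketch. It produces symbolic tuples and leaves out the variable-length code tables that turn them into bits. The `EOB` marker value and the function names are assumptions for this example.

```python
EOB = "EOB"  # symbolic end-of-block marker (placeholder for the real code)


def run_level_encode(coeffs):
    """coeffs: quantized coefficients of one block in zigzag order, DC first.
    Returns [DC, (run, level), ..., EOB]. Trailing zeros are implied by EOB."""
    symbols = [coeffs[0]]
    run = 0
    for level in coeffs[1:]:
        if level == 0:
            run += 1
        else:
            symbols.append((run, level))
            run = 0
    symbols.append(EOB)
    return symbols


def run_level_decode(symbols, n=64):
    """Rebuild the n coefficients from the symbol list."""
    coeffs = [symbols[0]]
    for sym in symbols[1:]:
        if sym == EOB:
            break
        run, level = sym
        coeffs.extend([0] * run)
        coeffs.append(level)
    coeffs.extend([0] * (n - len(coeffs)))
    return coeffs
```

A block whose AC coefficients are mostly zero after quantization collapses to a handful of symbols, which is where most of the compression in this layer comes from.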