Secret encoders In addition to the two approaches described above, many other methods for selecting appropriate candidate motion vectors exist, including a wide variety of proprietary solutions. Most video compression standards specify only the format of the compressed video bit stream and the decoding steps and leave the encoding process undefined so that encoders can employ a variety of approaches to motion estimation. The approach to motion estimation is the largest differentiator among video encoder implementations that comply with a common standard. The choice of motion estimation technique significantly affects computational requirements and video quality; therefore, details of the approach to motion estimation in commercially available encoders are often closely guarded trade secrets. Motion compensation I n the video decoder, motion compensation uses the motion vectors encoded in the compressed bit stream to predict the pixels in each macro block. If the horizontal and vertical components of the motion vector are both integer values, then the predicted macro block is simply a copy of the 16-pixel by 16-pixel region of the reference frame. If either component of the motion vector has a non-integer value, interpolation is used to estimate the image at non-integer pixel locations. Next, the prediction error is decoded and added to the predicted macro block in order to reconstruct the actual macro block pixels. As mentioned earlier, the 16×16 macro block may be subdivided into smaller sections with independent motion vectors. Compared to motion estimation, motion compensation is much less computationally demanding. While motion estimation must perform SAD or SSD computation on a number of 16-pixel by 16-pixel regions in an attempt to find the best match for each macro block, motion compensation simply copies or interpolates one such region for each macro block. Still, motion compensation can consume as much as 40% of the processor cycles in a video decoder, though this number varies greatly depending on the content of a video sequence, the video compression standard, and the decoder implementation. For example, the motion compensation workload can comprise as little as 5% of the processor cycles spent in the decoder for a frame that makes little use of interpolation. Like motion estimation, motion compensation requires the video decoder to keep one or two reference frames in memory, often requiring external memory chips for this purpose. However, motion compensation makes fewer accesses to reference frame buffers than does motion estimation. Therefore, memory bandwidth requirements are less stringent for motion compensation compared to motion estimation, although high memory bandwidth is still desirable for best processor performance. Polishing the image: Deblocking & deranging Ideally, lossy image and video compression algorithms discard only perceptually insignificant information, so that the reconstructed image or video sequence appears identical to the original uncompressed image or video to the human eyes. In practice, however, some artifacts may be visible. This can happen due to a poor encoder implementation, video content that is particularly challenging to encode, or a selected bit rate that is too low for the video sequence, resolution and frame rate. The latter case is particularly common, since many applications trade off video quality for a reduction in storage and/ or bandwidth requirements. Two types of artifacts, ‘blocking’ and ‘ringing,’ are common in video compression applications. Blocking artifacts are due to the fact that compression algorithms divide each frame into 8×8 blocks. Each block is reconstructed with some small errors, and the errors at the edges of a block often contrast with the errors at the edges of neighboring blocks, making block boundaries visible. In contrast, ringing artifacts appear as distortions around the edges of image features. Ringing artifacts are due to the encoder discarding too much information in quantizing the high-frequency DCT coefficients. Video compression applications often employ filters following decompression to reduce blocking and ringing artifacts. These filtering steps are known as ‘deblocking’ and ‘deringing’ respectively. Both deblocking and deringing use low-pass FIR (finite impulse response) filters to hide these visible artifacts. Deblocking and deringing filters are fairly computationally demanding. Combined, these filters can easily consume more processor cycles than the video decoder itself. For example, an MPEG-4 simple-profile, level 1 (176×144 pixel, 15 fps) decoder optimized for the ARM9E general-purpose processor core requires that the processor be run at an instruction cycle rate of about 14 MHz when decoding a moderately complex video stream. If deblocking is added, the processor must be run at 33 MHz. If both deringing and deblocking are added, the processor must be run at about 39 MHz – nearly three times the clock rate required for the video decompression algorithm alone. Post-processing or in-line Deblocking and deringing filters can be applied to video frames as a separate post-processing step that is independent of video decompression. This approach provides system designers the flexibility to select the best deblocking and/or deringing filters for their application or to forego these filters entirely in order to reduce computational demands. With this approach, the video decoder uses each unfiltered reconstructed frame as a reference frame for decoding future video frames, and an additional frame buffer is required for the final filtered video output. Alternatively, deblocking and/ or deringing can be integrated into the video decompression algorithm. This approach, sometimes referred to as ‘loop filtering,’ uses the filtered reconstructed frame as the reference frame for decoding future video frames. This approach requires the video encoder to perform the same deblocking and/ or deringing filtering steps as the decoder in order to keep each reference frame used in encoding identical to that used in decoding. The need to perform filtering in the encoder increases processor performance requirements for encoding, but can improve image quality, especially for very low bit rates. In addition, the extra frame buffer that is required when deblocking and/ or deringing are implemented as a separate post-processing step is not needed when deblocking and deringing are integrated into the decompression algorithm. Figure 6 illustrates deblocking and deringing applied ‘in-loop’ or as a post-processing step. H.264, for example, includes an ‘in-loop’ deblocking filter, sometimes referred to as…