Video Coding Standards and Algorithms - Evolution & Patent Analysis
What is a Video Coding Format?
A video coding format is a format for storing and transmitting digital video content (such as in a data file or bitstream) from one system to another. It usually employs a standardized video compression technique, most commonly based on discrete cosine transform (DCT) coding and motion compensation. The technical details of these formats are laid out in specification documents. When such a specification is approved by a standardization body, such as the International Organization for Standardization (ISO) or the International Telecommunication Union (ITU), it is referred to as a video coding standard. Over the years, a number of video coding formats have been documented and standardized, including H.120, H.261, Motion JPEG (MJPEG), MPEG-1 Part 2, H.262 / MPEG-2 Part 2 (MPEG-2 Video), DV, H.263, MPEG-4 Part 2 (MPEG-4 Visual), Motion JPEG 2000 (MJ2), Advanced Video Coding (H.264 / MPEG-4 AVC), Theora, VC-1, and Apple ProRes. These formats can be classified into three key groups by their underlying algorithm: DPCM (differential pulse-code modulation), DCT (discrete cosine transform), and DWT (discrete wavelet transform).
Video Coding Standards: A brief summary
Since the 1980s, a number of video coding standards have been introduced. The table summarizes these standards along with key remarks.
Video Coding Standard
H.120
The first digital video compression standard, H.120, was published in 1984. Its first version included features such as conditional replenishment, DPCM, scalar quantization, and variable-length coding. An updated version, published in 1988, added motion compensation and background prediction. The most recent edition was released in 1993. H.120 operated at 1544 kbit/s (National Television System Committee (NTSC)) and 2048 kbit/s (Phase Alternating Line (PAL)).
H.261
The ITU-T video compression standard H.261 was first adopted in November 1988. It specifies the video coding and decoding methods for the moving-picture component of an audiovisual service at p × 64 kbit/s, where p is an integer between 1 and 30.
Motion JPEG (MJPEG)
Motion JPEG (M-JPEG or MJPEG) is a video compression technology that compresses each video frame or interlaced field of a digital video sequence as a JPEG picture.
MPEG-1 Part 2
MPEG-1 is a standard for lossy compression of video and audio. It is designed to compress VHS-quality raw digital video and CD audio down to about 1.5 Mbit/s (compression ratios of 26:1 and 6:1, respectively) without excessive quality loss, which made video CDs, digital cable/satellite TV, and digital audio broadcasting (DAB) practical.
H.262 / MPEG-2 Part 2 (MPEG-2 Video)
H.262, or MPEG-2 Part 2 (formally ITU-T Recommendation H.262 and ISO/IEC 13818-2, also known as MPEG-2 Video), is a video coding format used in over-the-air broadcasts in most of the world and, owing to its low processing overhead, in live TV production. It serves as a digital television transmission format for terrestrial (over-the-air), cable, and direct broadcast satellite TV systems, and it also specifies the format of movies and other programming delivered on DVD and comparable discs.
DV
DV is a family of digital video codecs and tape formats launched in 1995. It is often associated with MiniDV, the most common tape format that used a DV codec.
H.263
H.263 is a video compression standard designed as a low-bit-rate compressed format for videotelephony. It belongs to the H.26x family of video coding standards developed by the ITU-T.
MPEG-4 Part 2 (MPEG-4 Visual)
MPEG-4 Part 2, or MPEG-4 Visual (officially ISO/IEC 14496-2), is a video compression standard created by the Moving Picture Experts Group (MPEG). It employs block-wise motion compensation and a discrete cosine transform (DCT).
Motion JPEG 2000 (MJ2)
Motion JPEG 2000 (MJ2 or MJP2) is a file format based on the MP4 and QuickTime formats for motion sequences of JPEG 2000 pictures and related audio. According to RFC 3745, the filename extensions for Motion JPEG 2000 video files are .mj2 and .mjp2.
Advanced Video Coding (H.264 / MPEG-4 AVC)
Advanced Video Coding (AVC), also known as H.264 or MPEG-4 Part 10 (officially ISO/IEC 14496-10), is a video compression standard developed jointly by the ITU-T Video Coding Experts Group and the Moving Picture Experts Group (MPEG). It employs a more advanced form of block-wise motion compensation than MPEG-4 Visual, together with a DCT-based integer transform.
Theora
Theora is a lossy video compression format that is distributed alongside other open media projects, such as the Vorbis audio format and the Ogg container. Like H.264, Theora is a video coding format rather than a container; Theora video is typically stored in the Ogg container.
VC-1
SMPTE 421, known informally as VC-1, is a video coding standard. It was formally approved as an SMPTE standard after several revisions, including the addition of a new Advanced Profile, and is generally regarded as an alternative to the H.264/MPEG-4 AVC standard.
Apple ProRes
Apple ProRes is a high-quality, lossy video compression format developed by Apple Inc. for use in post-production, supporting video resolutions up to 8K. Like the H.26x and MPEG standards, the ProRes codec family employs compression methods based on the discrete cosine transform (DCT). ProRes is widely used as a final delivery format for HD broadcast files in commercials, feature films, Blu-ray, and streaming.
High Efficiency Video Coding (H.265 / MPEG-H HEVC)
High Efficiency Video Coding (HEVC), commonly known as H.265 and MPEG-H Part 2, is a video compression standard developed as part of the MPEG-H project to succeed the widely used Advanced Video Coding standard (AVC, H.264, or MPEG-4 Part 10). HEVC delivers 25% to 50% greater data compression at the same level of video quality as AVC, or significantly higher video quality at the same bit rate. It also supports resolutions up to 8K UHD.
AOMedia Video 1 (AV1)
AOMedia Video 1 (AV1) is a video coding standard designed for video transmission over the Internet. The AV1 reference encoder is commonly compared with libvpx-vp9, x264 High profile, and x264 Main profile in terms of compression ratio, processing time, and encoding quality.
Versatile Video Coding (VVC / H.266)
Versatile Video Coding (VVC), also known as H.266, ISO/IEC 23090-3, and MPEG-I Part 3, is a video compression standard that succeeds High Efficiency Video Coding (HEVC, also known as ITU-T H.265 and MPEG-H Part 2). Its improvements include enhanced compression performance and support for a wide variety of applications.
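Several of the standards summarized above rely on block-wise motion compensation. As a rough illustration only, the following Python sketch performs a full-search block match using the sum of absolute differences (SAD) as the cost. The function names, the 8x8 block size, and the ±4 pixel search range are assumptions chosen for the example; none of the standards mandates this particular search, and real encoders use far more elaborate strategies.

```python
import random

def sad(ref, cur, ry, rx, cy, cx, block):
    """Sum of absolute differences between a reference block and a current block."""
    return sum(
        abs(cur[cy + i][cx + j] - ref[ry + i][rx + j])
        for i in range(block)
        for j in range(block)
    )

def best_motion_vector(ref, cur, top, left, block=8, search=4):
    """Full search: return ((dy, dx), cost) of the best match in the reference frame."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + block > height or x + block > width:
                continue
            cost = sad(ref, cur, y, x, top, left, block)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost

def demo_motion():
    """Shift a random frame down 2 and right 3 pixels, then recover that motion."""
    rng = random.Random(0)
    size = 32
    ref = [[rng.randrange(256) for _ in range(size)] for _ in range(size)]
    cur = [[ref[(y - 2) % size][(x - 3) % size] for x in range(size)]
           for y in range(size)]
    return best_motion_vector(ref, cur, top=8, left=8)
```

In this toy setup the block at row 8, column 8 of the current frame is an exact copy of the reference block two rows up and three columns to the left, so the search returns the motion vector (-2, -3) with a cost of 0. An encoder would then transmit the vector plus the (here empty) residual instead of the raw block.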
Algorithms Used for Video Coding Standards
A number of video coding standards have been published over the years, including H.120, H.261, MPEG-1 Part 2, MPEG-2 Part 2 (MPEG-2 Video), H.263, Motion JPEG 2000 (MJ2), H.264 / MPEG-4 AVC, Theora, H.265, AV1, and Versatile Video Coding (VVC / H.266). Each standard is built around a particular core algorithm, so the standards can be grouped into categories accordingly.
Algorithm-wise classification of video coding standards
Algorithm: Video Coding Standard
DPCM: H.120
DCT: H.261, Motion JPEG (MJPEG), MPEG-1 Part 2, H.262 / MPEG-2 Part 2 (MPEG-2 Video), DV, H.263, and MPEG-4 Part 2 (MPEG-4 Visual)
DWT: Motion JPEG 2000 (MJ2)
DCT-based (including integer DCT variants): Advanced Video Coding (H.264 / MPEG-4 AVC), Theora, VC-1, Apple ProRes, High Efficiency Video Coding (H.265 / MPEG-H HEVC), AV1, and Versatile Video Coding (VVC / H.266)
1. DCT (Discrete Cosine Transform):
The source image is divided into 8x8 pixel blocks, and the DCT is applied to each block, working from left to right and top to bottom. The DCT returns a matrix of DCT coefficients that carries the block's information in the frequency domain. These coefficients are then quantized, that is, divided element by element by the entries of a quantization matrix and rounded, which reduces the storage space required. The resulting array of compressed blocks, which represents the image, occupies far less space than the original. The block size also affects both the quality and the compression ratio.
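The block transform and quantization steps described above can be sketched in plain Python as follows. This is a minimal illustration, not the procedure of any particular standard: it uses the orthonormal DCT-II, and the quantization matrix built by `example_quant_matrix` is a made-up example (coarser steps for higher frequencies), not a table taken from JPEG or MPEG.

```python
import math

N = 8  # block size assumed in the text above

def dct_matrix(n=N):
    """Orthonormal DCT-II basis matrix C, so that coeffs = C * block * C^T."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

C = dct_matrix()
CT = transpose(C)

def dct2(block):
    """2D DCT of one 8x8 block (rows and columns transformed)."""
    return matmul(matmul(C, block), CT)

def idct2(coeffs):
    """Inverse 2D DCT; C is orthonormal, so its inverse is its transpose."""
    return matmul(matmul(CT, coeffs), C)

def example_quant_matrix():
    """Hypothetical quantization matrix: step size grows with frequency."""
    return [[8 + 4 * (i + j) for j in range(N)] for i in range(N)]

def quantize(coeffs, q):
    """Divide each coefficient by its quantization step and round."""
    return [[round(coeffs[i][j] / q[i][j]) for j in range(N)] for i in range(N)]

def dequantize(levels, q):
    """Approximate the coefficients again by multiplying the steps back in."""
    return [[levels[i][j] * q[i][j] for j in range(N)] for i in range(N)]
```

For a flat block in which every pixel equals 100, `dct2` concentrates all the energy in the top-left (DC) coefficient, which equals 800 under this normalization, and every other coefficient quantizes to zero. That concentration of energy into a few coefficients is what makes the quantized block cheap to store.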
2. DWT (Discrete Wavelet Transform):
DWT is simple to implement, reduces computation time, and discards irrelevant source information. To compute a 2D wavelet transform, the image is first filtered horizontally, which divides it into a high-frequency part and a low-frequency part. The two resulting sub-images each contain both high- and low-frequency vertical information. Each sub-image is then convolved vertically with the wavelet and the scaling function, producing two further separations per sub-image. Thus, a single-stage wavelet transform consists of a filtering operation that decomposes a 2D signal into four frequency bands. The generalized block diagram for DWT image/video compression and decompression shows the source image divided into multiple frames, followed by DWT transformation, quantization, encoding, and output of the compressed frames.
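The horizontal-then-vertical filtering described above can be illustrated with the simplest wavelet, the Haar wavelet in its averaging and differencing form. This is a sketch under stated assumptions: the article does not say which wavelet is used (Motion JPEG 2000, for instance, uses different filters), the image is assumed to have even width and height, and the sub-band names LL, LH, HL, and HH follow one common convention.

```python
def haar_1d(signal):
    """One Haar step: pairwise averages (low-pass) and differences (high-pass)."""
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high

def filter_columns(sub):
    """Apply the Haar step down every column of a sub-image."""
    cols = [haar_1d(list(col)) for col in zip(*sub)]
    low = [list(row) for row in zip(*(c[0] for c in cols))]
    high = [list(row) for row in zip(*(c[1] for c in cols))]
    return low, high

def haar_2d(image):
    """Single-level 2D Haar DWT; returns the four sub-bands LL, LH, HL, HH."""
    # Horizontal pass: split every row into low- and high-frequency halves.
    row_low, row_high = [], []
    for row in image:
        low, high = haar_1d(row)
        row_low.append(low)
        row_high.append(high)
    # Vertical pass: split each of the two sub-images again.
    ll, lh = filter_columns(row_low)
    hl, hh = filter_columns(row_high)
    return ll, lh, hl, hh
```

Each sub-band has half the width and half the height of the input. For an image made of uniform 2x2 tiles, the three detail bands (LH, HL, HH) come out as all zeros and LL holds one value per tile, which shows why smooth regions compress well after the transform.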
Error (Mean Square Error - MSE)
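The mean square error named above, and the peak signal-to-noise ratio (PSNR) that is derived from it, are the usual ways of scoring how far a decoded frame deviates from the original. A minimal sketch, assuming 8-bit samples (peak value 255) and frames stored as equally sized lists of rows; the function names are illustrative:

```python
import math

def mse(original, decoded):
    """Mean of the squared sample differences between two equally sized frames."""
    total, count = 0, 0
    for row_o, row_d in zip(original, decoded):
        for a, b in zip(row_o, row_d):
            total += (a - b) ** 2
            count += 1
    return total / count

def psnr(original, decoded, peak=255):
    """PSNR in decibels: 10 * log10(peak^2 / MSE); infinite for identical frames."""
    error = mse(original, decoded)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)
```

A lower MSE, and therefore a higher PSNR, means the decoded frame is numerically closer to the original, although neither metric fully reflects perceived visual quality.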
The patent data in this article shows information related to video coding standards, including the patent filing trend across the globe and the top-rated assignees.
The filing trend covers the number of applications filed each year worldwide. Notably, filings jumped to a new level of more than 1000 applications in 2019-2021. Filings are expected to keep rising in the coming years, as research and development in this field is still ongoing. Apart from the top companies, many other companies are also engaged in this research, including Canon, Samsung, Nokia, Huawei, Qualcomm, and LG.
Among the top assignees in the field of video coding, Sony holds the largest share with a total of 2546 patents, followed by Canon and LG Electronics. Other top assignees contributing to this area of research include Qualcomm, Samsung, Huawei, Panasonic, Sharp, Intel, and Toshiba. The leading companies are working with modified DCT algorithms and advanced video encoders. These advances have driven technological development aimed at an enhanced user experience for existing video services and appropriate performance levels for new media services over 5G networks. Besides the major US companies, a number of China-based companies such as Alibaba, Kuaishou, and Hikvision, as well as the Korea-based Wilus Group, are also developing similar technology.
The future scope of video coding standards is promising in view of emerging technologies. These include deep-learning-based video coding, which targets goals such as lower compression complexity and lower power consumption, and thus higher efficiency and better output. Other key topics related to video coding standards include visual quality assessment, especially in terms of PSNR (peak signal-to-noise ratio), artificial-intelligence-based encoding algorithms, and hybrid encoding/decoding algorithms.
At the international level, the video encoder market is estimated to grow from USD 2.3 billion in 2022 to USD 3.3 billion by 2027, at a CAGR of 7.6%. Contributing factors include the introduction of high-efficiency video coding standards for video encoding, the simplicity of connecting analogue cameras to a network using video encoders, and the use of cloud services to store vast amounts of data.