Primary DTV standards.
This paper discusses the video compression standards used in digital television systems. Basic concepts of video compression and the principles of lossy and lossless compression are given. Video compression techniques (intraframe and interframe compression), frame types and the principles of bit rate reduction are discussed. The characteristics of standard-definition television (SDTV), high-definition television (HDTV) and ultra-high-definition television (UHDTV) are given. The principles of the MPEG-2, MPEG-4 and High Efficiency Video Coding (HEVC) compression standards are analyzed. An overview of the basic video compression standards is presented, together with the impact of compression on TV picture quality and on the number of TV channels in the multiplexes of terrestrial and satellite digital TV transmission. This work is divided into six sections.
Video compression is a technology that allows video to be recorded so that it takes up less storage space, at the cost of allowing the played-back video to differ slightly from the original. Data reduction (data compression) is possible because the image contains redundant (repeated) information. Compression is the process of reducing the number of bits used to encode individual picture elements.
In digital television, the parameters of the digital video signal, with and without compression, are given by Recommendation ITU-R BT.601. In broadcasting, transmission at a lower bit rate requires less bandwidth and a lower-power transmitter. Recording signals with compression reduces the required capacity of the storage media in direct proportion to the compression ratio. For archival purposes, this significantly reduces the required space and the cost of the archive.
Techniques for reducing video size fall mostly into two groups: compressing the individual frames, and recording only the changes and differences between frames. Video is usually composed of three types of frames: I-frames (intra-frames), P-frames (predicted frames) and B-frames (bidirectional frames). The frame types differ only in how they are written and read (interpreted). During playback, each frame is displayed as a normal image regardless of how it was recorded. Intraframe (spatial) compression is a technique for reducing the size of individual frames. Interframe (temporal) compression is a technique that reduces the size of a series of similar frames.
The development of digital telecommunications allows the use of high-definition television (HDTV) alongside standard-definition television (SDTV). HDTV is a technology that offers significantly higher picture and sound quality than traditional television technologies (analog PAL, NTSC and SECAM, and digital SDTV). Because the resolution is higher, the image is sharper and less blurry, and the content is closer to reality. HDTV offers smoother motion and more detailed, more vibrant colors, and very high-quality multichannel sound makes the viewing experience even better. Table 1 shows the basic characteristics of the primary digital TV standards.
|DTV||Resolution||Aspect ratio||Frame rate|
|HDTV||1920 × 1080||16:9||25p, 30p, 50i|
|HDTV||1280 × 720||16:9||25p, 30p, 50i|
|SDTV||720 × 576||16:9||25p, 30p, 50i|
|SDTV||720 × 576||4:3||25p, 30p, 50i|
HDTV offers two signal formats, denoted by the basic tags 720 and 1080, to which either the letter “i” or the letter “p” is appended to indicate how the image is drawn (i = interlaced, drawing every other line; p = progressive, drawing line by line). The image heights are 720 and 1080 lines, and the corresponding widths are 1280 and 1920 pixels. The number of images per second is specified after the tag; for example, 720p50 indicates a resolution of 1280 × 720, progressive scanning and 50 frames per second.
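As an illustration of this naming scheme, the following minimal Python sketch decodes a tag such as 720p50. The function name `parse_hd_tag` and the layout of the returned tuple are assumptions made for this example only; they are not part of any standard.

```python
import re

# Widths paired with the two HDTV picture heights named in the text.
WIDTH_FOR_HEIGHT = {720: 1280, 1080: 1920}
SCAN_MODES = {"i": "interlaced", "p": "progressive"}


def parse_hd_tag(tag):
    """Decode a tag such as '720p50' into (width, height, scan, rate)."""
    match = re.fullmatch(r"(720|1080)([ip])(\d+)", tag)
    if match is None:
        raise ValueError(f"not a recognised HDTV tag: {tag!r}")
    height = int(match.group(1))
    scan = SCAN_MODES[match.group(2)]
    rate = int(match.group(3))  # pictures per second, as written in the tag
    return WIDTH_FOR_HEIGHT[height], height, scan, rate


print(parse_hd_tag("720p50"))
```

The sketch deliberately accepts only the two heights discussed here; other formats would need extra entries in the lookup table.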
Without compression, a digital video signal would contain an enormous amount of data. For example, the standard digital video signal according to CCIR standards has 25 frames per second and a resolution of 720 × 576 pixels, and each pixel is represented by 24 bits (8 bits for each color component). Transmission of the uncompressed video signal requires a channel capacity of 216 Mb/s. A high-definition (HDTV) signal requires about six times greater channel capacity, about 1.5 Gb/s. In multimedia systems, problems also arise in storing the digital video signal. That is why different algorithms are used to compress video signals. The compression ratio differs depending on the algorithm used (MPEG-2, MPEG-4, etc.). The bit rate required to transmit an HD signal with the MPEG-2 standard is about 20 Mb/s, and for SDTV at a resolution of 720 × 576 it is about 4 Mb/s. With the MPEG-4 standard, about half the bit rate is required for the same quality. European broadcasters mainly use the MPEG-2 standard, although the MPEG-4 standard is increasingly used [5, 6].
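The arithmetic behind these figures can be sketched as follows. The sketch assumes that the 216 Mb/s figure refers to the complete sampled signal under BT.601 4:2:2 sampling (13.5 MHz luma, 6.75 MHz for each chroma component, 8 bits per sample), which includes blanking intervals; this interpretation and the function names are assumptions, not statements from the text. For comparison, the visible picture alone at 24 bits per pixel gives a somewhat different number.

```python
def active_picture_rate(width, height, fps, bits_per_pixel):
    """Bit rate in bit/s of the visible picture only."""
    return width * height * fps * bits_per_pixel


def sampled_signal_rate(luma_hz, chroma_hz, bits_per_sample):
    """Bit rate in bit/s of a sampled component signal (one luma, two chroma)."""
    return (luma_hz + 2 * chroma_hz) * bits_per_sample


# 720 x 576, 25 frames/s, 24 bits per pixel (8 bits per colour component).
sd_active = active_picture_rate(720, 576, 25, 24)

# Assumed BT.601 4:2:2 sampling: 13.5 MHz luma, 6.75 MHz per chroma, 8 bits.
sd_interface = sampled_signal_rate(13_500_000, 6_750_000, 8)

print(sd_active / 1e6, "Mb/s for the active picture")
print(sd_interface / 1e6, "Mb/s for the full sampled signal")

# Compression ratio implied by carrying SDTV at about 4 Mb/s with MPEG-2.
print(sd_interface / 4_000_000, "to 1")
```

Under these assumptions, an SDTV service at about 4 Mb/s corresponds to a compression ratio on the order of 50:1 relative to the uncompressed interface rate.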
Table 2 shows the bit rates of compressed television signals used in broadcast practice with the MPEG-2 and MPEG-4 standards.
|Standards for video compression||TV video resolution||Bit rate of compressed video signal (Mb/s)|
Ultra-high-definition television (UHDTV) includes 4K UHDTV (2160p) and 8K UHDTV (4320p), two digital video formats proposed by NHK Science & Technology Research Laboratories and approved by the International Telecommunication Union (ITU). The minimum resolution of this format is 3840 × 2160 pixels. A digital TV program consists of three components: video, audio and service data, as shown in Figure 1.
Service information, which contains additional information such as teletext and specific information of network, including an electronic program guide (EPG), is generated in digital form and does not require coding. Encoders compress the data by removing irrelevant or redundant parts of the image and sound signals and perform a reduction operation to produce separate video and audio packets of elementary streams.
In 1988, in response to the need for storage and playback of moving images and sound in digital format for multimedia applications on various platforms, ISO formed an expert group called the Moving Picture Experts Group (MPEG).
In order to enable the interconnection of equipment from different manufacturers, standards for compression and transmission of video signals are defined. Among them, the best known are H.261, H.263 and H.264 for videoconferencing transmission, videophone and distribution of digital material via the web, as well as the MPEG standards (MPEG-1, MPEG-2, MPEG-4, MPEG-7, MPEG-21), which are intended for standardization of multimedia systems and digital television.
2. MPEG-2 standard
2.1. The principles of the MPEG-2 standard
The second standard developed by the MPEG group is ISO/IEC IS 13818, Generic Coding of Moving Pictures and Associated Audio, known as the MPEG-2 standard. It is aimed at professional digital television [8, 9], was adopted in 1994 and was developed to address the shortcomings of the MPEG-1 standard. It is compatible with MPEG-1, using the same tools and adding some new ones.
The main innovations of the MPEG-2 standard are an increased bit rate, support for both interlaced and progressive picture formats, quality and temporal scalability, improved quantization and coding methods, etc. Since it is primarily designed for TV signal compression, the MPEG-2 standard allows both types of image scanning: progressive and interlaced. In the compression process, pictures can be coded as I, P or B pictures. A typical encoder output is a mixture of I, P and B frames in which an I frame appears every 10–15 frames and two B frames lie between adjacent anchor (I or P) frames. Usually, a group of pictures (GOP) has one I frame and several P and B frames.
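The frame pattern described above can be sketched in a few lines of Python. The parameter names `n` (GOP length) and `m` (spacing of anchor frames) follow common usage but are assumptions here, as is the choice of display order; a real encoder transmits frames in a different (coding) order.

```python
def gop_pattern(n, m):
    """Return the display-order frame types of one GOP.

    n: GOP length (distance between I frames).
    m: distance between anchor frames (I or P); m - 1 B frames lie between them.
    """
    types = []
    for i in range(n):
        if i == 0:
            types.append("I")
        elif i % m == 0:
            types.append("P")
        else:
            types.append("B")
    return "".join(types)


print(gop_pattern(12, 3))  # IBBPBBPBBPBB
```

With `n = 12` and `m = 3`, the result has one I frame per 12 frames and two B frames between anchors, matching the typical structure described in the text.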
2.2. Profiles and levels MPEG-2 standard
Since the complete syntax of the MPEG-2 standard is complex and difficult to implement in practice on a single silicon chip, the MPEG-2 standard defines five subsets of the full syntax, called profiles, which are designed for a variety of applications. These are the simple, main, signal-to-noise ratio (SNR) scalable, spatially scalable and high profiles. Later, another profile, the 4:2:2 profile, was created, and the definition of a further (multiview) profile is in progress.
Each profile is constrained by levels, which limit the choice of available parameters in hardware implementation. The levels determine the maximum bit rate, and hence the transmission rate of TV programs and the resolution of the system, which in turn is determined by the number of samples per line, the number of lines per picture and the number of frames per second. There are four levels: high level (HL), high-1440 level (H14L), main level (ML) and low level (LL). The parameter limits by level are shown in Table 3.
|Level||Maximum pixel number||Maximum line number||Maximum frames/s|
|H 1440 level||1440||1152||60|
|H 1920 level||1920||1080||60|
The simple profile is designed to simplify the transmitter encoder and receiver decoder, at a reduced bit rate and without bidirectional prediction (there are no B pictures); it supports only I and P pictures. As such, it is suitable only for low-resolution terrestrial television. The maximum bit rate is 15 Mb/s.
The main profile is the optimal compromise between compression ratio and cost. It supports all three picture types (I, P and B), which increases the complexity of the encoder and decoder. The main profile supports all four levels, with maximum bit rates of 4, 15, 60 and 80 Mb/s for the low, main, high-1440 and high (1920) levels, respectively. The majority of broadcast applications are intended to operate in the main profile. Terrestrial digital TV uses the main profile at main level (MP@ML). The SNR scalable profile supports only the low and main levels, with maximum bit rates of 4 and 15 Mb/s, respectively.
The spatially scalable profile supports only the high-1440 level, with a maximum bit rate of 60 Mb/s, of which 15 Mb/s belongs to the base layer. Scalable profiles allow a basic image quality, in terms of spatial resolution or quantization accuracy, to be transmitted together with supplementary information (the enhancement layer). This allows simultaneous broadcasting of a program in basic and higher resolution, so that under difficult reception conditions the lower-quality (lower-resolution) signal can be received instead of the higher one. These profiles are intended for extended-definition TV (EDTV).
The high profile (also known as the professional profile) is designed for later use with hierarchical coding in high-definition TV (HDTV) applications with 4:2:2 or 4:2:0 sampling. It supports the main, high-1440 and high (1920) levels, with maximum bit rates of 20, 80 and 100 Mb/s, respectively. The bit rate of the base layer is 4, 20 and 25 Mb/s, respectively.
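The bit-rate limits quoted in the profile descriptions can be collected into a small lookup table, sketched below. The dictionary keys, the function name `fits` and the assignment of the simple profile's 15 Mb/s limit to the main level are assumptions made for illustration; the numbers themselves are those given in the text.

```python
# Maximum bit rates in Mb/s, copied from the profile descriptions in the text.
MAX_BITRATE_MBPS = {
    ("simple", "main"): 15,      # level assumed; the text gives only the limit
    ("main", "low"): 4,
    ("main", "main"): 15,
    ("main", "high-1440"): 60,
    ("main", "high"): 80,
    ("snr", "low"): 4,
    ("snr", "main"): 15,
    ("spatial", "high-1440"): 60,
    ("high", "main"): 20,
    ("high", "high-1440"): 80,
    ("high", "high"): 100,
}


def fits(profile, level, bitrate_mbps):
    """True if the combination is listed and the bit rate is within its limit."""
    limit = MAX_BITRATE_MBPS.get((profile, level))
    return limit is not None and bitrate_mbps <= limit


print(fits("main", "main", 4))   # SDTV at about 4 Mb/s
print(fits("main", "main", 20))  # HD at about 20 Mb/s
print(fits("main", "high", 20))
```

Such a table makes it easy to see, for example, that an HD service at about 20 Mb/s does not fit the main profile at main level but does fit the main profile at high level.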
The 4:2:2 profile was introduced to allow working with color images in the 4:2:2 format, which is necessary for studio equipment. Although studio use was not taken into account during the development of the MPEG-2 standard, the standard turned out to be suitable for this purpose. The 4:2:2 profile allows the existing coding tools to be used in studio applications, which require a higher bit rate.
The multiview profile (MVP) was introduced to enable efficient coding of two video sequences from two cameras that record the same scene at a slight angle to each other (stereovision). This profile also uses existing coding tools, but for a new purpose. Decoders are backward compatible: a decoder for a higher level can still play streams of a lower level, whereas compatibility in the opposite direction is not possible. At the present stage of development, the combination in use is the main profile at main level. The maximum number of pixels per picture that can theoretically be transmitted by an MPEG-2 encoder is 16,383 × 16,383 = 268,402,689.
2.3. MPEG-2 transport stream
Video and audio encoders each produce an elementary stream. The raw, uncompressed parts of the audio and video signal, known as presentation units, are fed to the encoder to produce video and audio access units. A video access unit can be an I, P or B coded picture. Audio access units contain the encoded information for a sound window of a few milliseconds: 24 ms (layer II), and 24 or 8 ms in the case of layer III. The video and audio access units form their respective elementary streams. Each elementary stream (ES) is then divided into packets to form a video or audio packetized elementary stream (PES). Service and other data are similarly grouped into their own PES. PES packets are then divided into smaller 188-byte transport packets [2, 10].
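The last step, cutting a PES packet into fixed-size transport packets, can be sketched as follows. This is a simplified illustration under stated assumptions: each transport packet is taken to be 188 bytes with a 4-byte header (sync byte 0x47, a start flag, a 13-bit packet identifier and a 4-bit continuity counter), and a short final chunk is padded with an adaptation field of stuffing bytes. The function name and the PID value are hypothetical, and clock references and program tables are omitted entirely.

```python
TS_PACKET_SIZE = 188
TS_HEADER_SIZE = 4
SYNC_BYTE = 0x47


def packetize_pes(pes, pid):
    """Split one PES packet into 188-byte transport packets (simplified)."""
    payload_size = TS_PACKET_SIZE - TS_HEADER_SIZE  # 184 bytes
    packets = []
    for counter, offset in enumerate(range(0, len(pes), payload_size)):
        chunk = pes[offset:offset + payload_size]
        start_flag = 1 if offset == 0 else 0         # first packet of this PES
        stuffing = payload_size - len(chunk)
        field_control = 0b11 if stuffing else 0b01   # adaptation field present?
        header = bytes([
            SYNC_BYTE,
            (start_flag << 6) | ((pid >> 8) & 0x1F),
            pid & 0xFF,
            (field_control << 4) | (counter & 0x0F),
        ])
        if stuffing == 0:
            adaptation = b""
        elif stuffing == 1:
            adaptation = bytes([0])                  # length byte only
        else:
            # length byte, flags byte, then 0xFF stuffing bytes
            adaptation = bytes([stuffing - 1, 0x00]) + b"\xff" * (stuffing - 2)
        packets.append(header + adaptation + chunk)
    return packets


packets = packetize_pes(bytes(400), pid=0x100)  # 400 zero bytes as a dummy PES
print(len(packets), {len(p) for p in packets})
```

A 400-byte PES packet yields three transport packets, each exactly 188 bytes long, with the padding carried in the last one.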
To transmit an MPEG-2 signal, the data streams must be multiplexed. Multiplexing produces one of the following:
transport stream (TS): designed for transmitting signals over terrestrial, cable and satellite links,
program stream (PS): designed for storing digital data on DVD or other storage media.
Multiplexing of audio and video signals is necessary to enable their joint transmission and their correct decoding and display. The multiplexing hierarchy defined by the MPEG-2 standard can be divided into:
elementary stream (ES),
packetized elementary stream (PES),
transport stream (TS) or program stream (PS) (Figure 2).
The program stream obtained by multiplexing contains packets from one or more elementary streams belonging to a single program. It can contain one video stream and several audio streams.
Packets of the program stream have a variable length, which makes it difficult for the decoder to recognize the exact beginning and end of a packet. To make this possible, the packet header contains the length of the packet. A PES packet can be up to 64 KB long, while the typical length is about 2 KB. The part that follows the header contains access units as parts of the original elementary stream. There is no obligation to align the start of an access unit with the start of the payload. Accordingly, a new access unit can start at any point in the payload of a PES packet, and several small access units can be contained in one PES packet.
The most important components of the header are as follows:
start code prefix (3 bytes),
stream identifier (1 byte),
presentation time stamp (PTS; 33 bits),
decoding time stamp (DTS; 33 bits).
PTS and DTS need not be included in every PES packet, as long as they occur at least every 100 ms in the transport stream (DTV) or every 700 ms in the program stream (DVD). The DTS indicates the time at which an access unit is to be removed from the buffer and decoded. The header also includes other fields that carry important parameters, such as the length of the PES packet, the length of the header and flags indicating whether the PTS and DTS fields are present in the packet. In addition, there are several other optional fields, 25 in total, which can be used to carry additional information about the packetized elementary stream, such as relative priority and copyright information.
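To illustrate how a 33-bit time stamp is carried, the sketch below packs and unpacks a PTS or DTS value using the 5-byte layout commonly described for PES headers: a 4-bit prefix, then 3, 15 and 15 bits of the value, each group followed by a marker bit. The layout as coded here, the 90 kHz tick rate and the function names are assumptions of this example and should be checked against the standard before use.

```python
PTS_CLOCK_HZ = 90_000  # assumed tick rate of PTS/DTS values


def encode_timestamp(value, prefix=0b0010):
    """Pack a 33-bit time stamp into 5 bytes with marker bits set to 1."""
    value &= (1 << 33) - 1
    return bytes([
        (prefix << 4) | (((value >> 30) & 0x07) << 1) | 1,  # bits 32..30
        (value >> 22) & 0xFF,                               # bits 29..22
        (((value >> 15) & 0x7F) << 1) | 1,                  # bits 21..15
        (value >> 7) & 0xFF,                                # bits 14..7
        ((value & 0x7F) << 1) | 1,                          # bits 6..0
    ])


def decode_timestamp(field):
    """Recover the 33-bit time stamp from its 5-byte representation."""
    b0, b1, b2, b3, b4 = field
    return (
        (((b0 >> 1) & 0x07) << 30)
        | (b1 << 22)
        | ((b2 >> 1) << 15)
        | (b3 << 7)
        | (b4 >> 1)
    )


one_hour = 3600 * PTS_CLOCK_HZ
field = encode_timestamp(one_hour)
print(decode_timestamp(field) / PTS_CLOCK_HZ, "seconds")
```

With a 33-bit counter at the assumed 90 kHz rate, the time stamp wraps around after roughly 26.5 hours, which is why 33 bits are sufficient for continuous broadcasting.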
3. MPEG-4 standard
3.1. General characteristics of the MPEG-4 standard
MPEG-4 is a generic standard for coding audiovisual information, presented in 1998 under the designation ISO/IEC 14496. The standard is characterized by interactivity, a high degree of compression and universal access, and it offers a high level of flexibility and extensibility.
The algorithms implemented in the MPEG-4 standard represent a scene as a set of audiovisual objects with hierarchical relations in space and time. In all previous video compression standards, the image was treated as a unified whole. This standard introduces the concept of the video object and distinguishes two types of visual objects: natural and synthetic.
At the lowest hierarchical level are primitive media objects such as static images (e.g., a fixed scene background), visual objects (e.g., a speaking person without the background) and audio objects (e.g., the voice of the speaker). This approach increases the compression ratio, increases interactivity and enables the integration of objects of different nature, such as natural images or video, graphics, text and sound.
The MPEG-4 standard succeeded the MPEG-2 standard. Each MPEG standard consists of several parts, each covering a certain aspect and area of use. For example, MPEG-4 Part 2 is used for video coding (e.g., DivX and QuickTime 6), while MPEG-4 Part 10/H.264, Advanced Video Coding (AVC), is used for high-definition content such as HD broadcasting and storage on HD formats such as HD-DVD and Blu-ray discs. MPEG-4 Part 3, Advanced Audio Coding (AAC), covers high-quality audio coding.
The first successor to the MPEG-2 format was MPEG-4 Part 2, published by ISO in 1999. As with MPEG-2, coding efficiency depends strongly on the complexity of the source material and on the encoder implementation. MPEG-4 Part 2 was defined for multimedia applications at low bit rates but was later extended to broadcasting applications. Formal subjective evaluation has shown that the coding efficiency gain of MPEG-4 Part 2 over MPEG-2 is between 15 and 20%. For Digital Video Broadcasting (DVB) applications, this gain is not enough to justify the disruption of replacing the MPEG-2 codecs used by DVB systems, considering that MPEG-4 Part 2 is not compatible with MPEG-2.
3.2. Image formats in MPEG-4 standard
Like the MPEG-2 standard, the MPEG-4 standard supports both progressive and interlaced scanning. The spatial resolution of the luminance component can range from 8 × 8 pixels up to 2048 × 2048 pixels. To represent a color video signal, the standard uses the conventional YCbCr color coordinate system with 4:4:4, 4:2:2, 4:1:1 and 4:2:0 sampling. Each component is represented with 4–12 bits per pixel. Different temporal resolutions are supported, as well as a continuously variable number of frames per second.
As in the previous MPEG standards, the macroblock is the basic unit in which video data are transmitted. A macroblock contains coded information about the shape, motion and texture (color) of the pixels. The supported bit rates range from 5 Kb/s to 38.4 Mb/s, but the standard is optimized for three ranges: below 64 Kb/s, 64 Kb/s to 384 Kb/s and 384 Kb/s to 4 Mb/s. Both constant and variable bit rates are supported.
Each video object can be coded in one or more layers, which enables scalable (variable-resolution) coding. Each video object is also sampled in time, and each time sample is a video object plane (VOP) [2, 13, 14]. The time samples of a video object are grouped into a group of video object planes (GOV).
3.3. MPEG-4.10 (H.264/AVC) standard
Previous video coding standards such as MPEG-2 and MPEG-4 Part 2 are established and used in areas ranging from videoconferencing, mobile TV and broadcasting of TV content in standard and high definition to very high-quality applications such as professional digital video recorders and digital cinema (digital images on the big screen). However, with the spread of digital video and its use in new applications such as advanced mobile TV and HDTV broadcasting, the requirements for efficient representation of video have increased to the point where the previous video coding standards cannot keep pace.
The new MPEG-4 Part 10 (MPEG-4.10) video compression standard is the result of the work of the Joint Video Team (JVT), which includes members of the Video Coding Experts Group (VCEG) and the Moving Picture Experts Group (MPEG); this is why it has two names (H.264 and MPEG-4.10). The standard is also commonly referred to as H.264/Advanced Video Coding (AVC).
This standard, registered as ISO/IEC 14496-10, provides a significant increase in compression efficiency over MPEG-2 (a gain of at least 50%). This efficiency is of particular importance for high-definition television (HDTV), which with MPEG-2 requires a bit rate of at least 15–18 Mb/s.
H.264 brought a significant improvement in coding efficiency and in error resilience, as well as greater flexibility and a wider area of use than its predecessors. An amendment to MPEG-4.10 (H.264/AVC), the Fidelity Range Extensions (FRExt), further extended the area of use to mobile TV, internet broadcasting, distribution, and professional studio and postproduction work. Table 4 shows usage scenarios and the corresponding bit rates achieved with the H.264 codec, and Table 5 shows the characteristics of the H.264 levels.
|Use||Resolution and frame rates||Bit rate|
|Mobile content (3G)||176 × 144, 10–24 fps||70–180 Kb/s|
|Internet/standard definition||640 × 480, 24 fps||2–3 Mb/s|
|High definition (HDTV)||1280 × 720, 25p, 30p||7–8 Mb/s|
|Full high definition (full HDTV)||1920 × 1080, 25p, 30p||10–12 Mb/s|
H.264 consists of two layers: the video coding layer (VCL), designed for efficient representation of the video content, and the network abstraction layer (NAL), which converts the VCL representation into formats suitable for transmission over a variety of transport layers or for storage media.
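As a small illustration of the NAL, the sketch below splits the first byte of an H.264 NAL unit into its three fields (a 1-bit forbidden zero bit, a 2-bit reference indicator and a 5-bit unit type). The function name, the returned dictionary and the short list of type names are choices made for this example; only a few common unit types are listed.

```python
# A few common NAL unit types (not an exhaustive list).
NAL_TYPE_NAMES = {
    1: "coded slice (non-IDR)",
    5: "coded slice (IDR)",
    7: "sequence parameter set",
    8: "picture parameter set",
}


def parse_nal_header(first_byte):
    """Split the one-byte H.264 NAL unit header into its fields."""
    nal_unit_type = first_byte & 0x1F
    return {
        "forbidden_zero_bit": (first_byte >> 7) & 0x01,
        "nal_ref_idc": (first_byte >> 5) & 0x03,
        "nal_unit_type": nal_unit_type,
        "type_name": NAL_TYPE_NAMES.get(nal_unit_type, "other"),
    }


print(parse_nal_header(0x67))
print(parse_nal_header(0x65))
```

The unit type tells a transport layer whether a NAL unit carries VCL data (coded slices) or non-VCL data such as parameter sets, without having to decode the video itself.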
|H.264 level||Resolution||Frame rate||Maximum compressed bit rate (non-FRExt profiles)||Maximum number of reference frames|
|1.1||CIF or QCIF||7.5 (CIF)/30 (QCIF)||192 Kb/s||2 (CIF)/9 (QCIF)|
|2.1||HHR (480i or 576i)||30/25||4 Mb/s||6|
|3.1||1280 × 720p||30||14 Mb/s||5|
|3.2||1280 × 720p||60||20 Mb/s||4|
|4||HD formats (720p or 1080i)||60p/30i||20 Mb/s||4|
|4.1||HD formats (720p or 1080i)||60p/30i||50 Mb/s||4|
|4.2||1920 × 1080p||60p||50 Mb/s||4|
|5||2k × 1k||72||135 Mb/s||5|
|5.1||2k × 1k or 4k × 2k||120/30||240 Mb/s||5|
3.4. The concept of video coding layer (VCL)
A coded video sequence in H.264 consists of a series of coded pictures. As with the MPEG-2 codec, a coded picture may represent either an entire frame or a single field. In general, a video frame can be considered to comprise two fields: the top field and the bottom field. If the two fields of a given frame were captured at different instants, the frame is called an interlaced-scan frame; otherwise, it is called a progressive-scan frame.
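The relation between a frame and its two fields can be sketched as follows, assuming a frame is represented as a list of scan lines with the top field holding the even-numbered lines (0, 2, 4, ...) and the bottom field the odd-numbered ones. The function names are hypothetical, and `weave` assumes both fields contain the same number of lines.

```python
def split_fields(frame):
    """Return (top_field, bottom_field) of a frame given as a list of lines."""
    top = frame[0::2]     # lines 0, 2, 4, ...
    bottom = frame[1::2]  # lines 1, 3, 5, ...
    return top, bottom


def weave(top, bottom):
    """Interleave two equally sized fields back into one frame."""
    frame = []
    for top_line, bottom_line in zip(top, bottom):
        frame.append(top_line)
        frame.append(bottom_line)
    return frame


frame = ["line0", "line1", "line2", "line3"]
top, bottom = split_fields(frame)
print(top, bottom)
print(weave(top, bottom) == frame)
```

For a progressive frame, weaving the two fields restores the original picture exactly; for an interlaced frame the two fields belong to different instants, so moving objects appear displaced between alternate lines.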
4. H.265/HEVC standard
4.1. General principles of HEVC standard
The evolution of technology has made video at resolutions of 4K and higher a reality, and video coding has had to evolve to keep pace. High Efficiency Video Coding (HEVC/H.265) is the result of cooperation between two standardization bodies, the ISO/IEC Moving Picture Experts Group (MPEG) and the ITU-T Video Coding Experts Group (VCEG). It performs better than previous coding standards, including H.264/AVC; its biggest advantages are compression that is up to 50% more efficient than H.264 and support for 8K UHD resolution. This means that video of the same quality occupies roughly half the space when coded with HEVC rather than H.264/AVC, thanks to better algorithms and analysis of the video material.
The direct predecessor of this standard is H.264/MPEG-4 Advanced Video Coding (AVC). HEVC seeks to replace its predecessor by using a generic syntax that can be adapted to newly emerging applications. It aims to achieve several goals, such as coding efficiency, adaptability to different transport systems, error resilience and suitability for parallel processing on multiprocessor architectures.
4.2. The data structure in H.265/HEVC standard
H.265/HEVC is a block-based hybrid video coding algorithm. The basic coding algorithm is a hybrid of intra prediction, inter prediction and transform coding. To represent a color video signal, the H.265/HEVC standard uses the YCbCr color space in the 4:2:0 format. Each sample of each color component is represented with 8–10 bits, in both coding and decoding. The video picture is progressively scanned in a rectangular format of dimensions W × H, where W is the width and H the height of the picture for the luma component. For the 4:2:0 color format, the chrominance components are scanned in a rectangular format of dimensions W/2 × H/2 [17, 18].
The H.265/HEVC standard has kept the hybrid architecture of the previous MPEG video coding standards. A significant difference is that, whereas the previous H.26x video coding standards are based on macroblocks, H.265/HEVC uses an adaptive quadtree structure based on the coding tree unit (CTU). The quadtree structure is composed of various blocks and units.
A block is defined as a matrix of samples of a given size, while a unit comprises a luma block and the corresponding chrominance blocks together with the syntax needed to code them. Further subdivision of the structure yields coding units and coding blocks.
Decoding the quadtree structure does not add a significant burden, because it can easily be traversed as a hierarchical structure using a z-scan. Prediction modes for interframe-coded coding units (CU) use non-square prediction units (PU), which require additional logic in the decoder to perform multiple conversions between raster scan and z-scan. On the encoder side, a simple algorithm can analyze the tree structure to determine the optimal partitioning into blocks while preserving the bit rate. CTU sizes are 16 × 16, 32 × 32 and 64 × 64 pixels.
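The recursive splitting of a CTU and the resulting z-scan order can be sketched as below. The split decision is passed in as a function, because the criterion a real encoder uses (typically a rate-distortion analysis) is outside the scope of this sketch; the function names, the minimum block size of 8 and the example decision rule are all assumptions.

```python
def quadtree_partition(x, y, size, should_split, min_size=8):
    """Return the leaf blocks of a quadtree as (x, y, size) in z-scan order."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        blocks = []
        # z-scan: top-left, top-right, bottom-left, bottom-right
        for dy in (0, half):
            for dx in (0, half):
                blocks += quadtree_partition(
                    x + dx, y + dy, half, should_split, min_size
                )
        return blocks
    return [(x, y, size)]


def split_top_left(x, y, size):
    """Hypothetical rule: keep splitting the block that contains pixel (0, 0)."""
    return x == 0 and y == 0


leaves = quadtree_partition(0, 0, 64, split_top_left)
for leaf in leaves:
    print(leaf)
```

Starting from a 64 × 64 CTU, this rule produces four 8 × 8 blocks, three 16 × 16 blocks and three 32 × 32 blocks, listed in the order in which a z-scan visits them; together they cover the CTU exactly once.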
4.3. Profiles, levels and layers
A profile is defined by a set of coding tools or algorithms which, if used, ensure that the coded output bitstream is compatible with standard applications that belong to this profile or have similar functional requirements. A level refers to limits on the bitstream that define the memory and resource requirements of the decoder. These limits include the maximum number of samples, the maximum number of samples per second that can be decoded (sample rate), the maximum picture size, the maximum bit rate (how many bits the decoder can consume per second of video), the minimum compression ratio, the size of the buffer memory and so on. In the HEVC standard, to accommodate applications that differ in bit rate and in the buffer memory used to store coded pictures, two tiers (layers) were defined: a basic “Main” tier and a more demanding “High” tier. Currently, the HEVC draft defines a single profile, “Main”, and more profiles are expected to be defined. The goal is to keep the number of profiles small so as to maximize compatibility between devices; and since services such as TV broadcasting, mobile services and video on demand are sometimes separate, the goal is also convergence toward devices that support all of them.
Ultra-high-definition television (UHDTV or UHD) includes 4K UHDTV (2160p) and 8K UHDTV or Super Hi-Vision (4320p), two digital video formats proposed by NHK Science & Technology Research Laboratories and defined and approved by the International Telecommunication Union (ITU).
Full high definition (FHD) denotes an image with 1920 pixels per line and 1080 lines. In its basic version, UHD has twice the number of pixels per line and twice the number of lines; it can therefore be called Quad Full HD, because it has four times as many pixels as Full HD. There are two UHD resolutions, 3840 × 2160 and 7680 × 4320; for easier identification, the first is often called UHD-1 (4K) and the second UHD-2 (8K).
A UHD 4K picture has 3840 pixels per line, whereas Full HD has 1920. The “nK” nomenclature was taken from the formats used in theatrical distribution, while UHD came into use as a commercial term. A film in 2K resolution has 2048 pixels per line and 1080 lines, and a 4K projection has a resolution of 4096 × 2160 pixels.
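The pixel counts behind these names are easy to verify. The short sketch below uses the resolutions quoted in the text; the dictionary and function names are chosen for this example only.

```python
# Resolutions (width, height) quoted in the text.
FORMATS = {
    "Full HD": (1920, 1080),
    "UHD-1 (4K)": (3840, 2160),
    "UHD-2 (8K)": (7680, 4320),
    "Cinema 2K": (2048, 1080),
    "Cinema 4K": (4096, 2160),
}


def pixel_count(name):
    """Number of pixels in one picture of the named format."""
    width, height = FORMATS[name]
    return width * height


full_hd = pixel_count("Full HD")
for name in FORMATS:
    print(name, pixel_count(name), round(pixel_count(name) / full_hd, 2))
```

UHD-1 has exactly 4 times and UHD-2 exactly 16 times as many pixels as Full HD, while the cinema formats are slightly wider than their television counterparts.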
UHD brings many benefits but, like any new technology, also some disadvantages, especially in its early days. Higher resolution puts more information on the screen and therefore gives a more realistic picture, especially on screens with diagonals larger than 140 cm (55″), where full HD resolution no longer gives an impression of high sharpness. This is why manufacturers presented the first UHDTVs with 84″ (213 cm) diagonals, and they are now also available in smaller sizes of 140 cm (55″) and 165 cm (65″). UHD on smaller diagonals does not make much sense, because the pixel density is so high that, at average viewing distances, no more detail can be seen than with full HD content.
Many parameters affect the realism of an image, and resolution is not the most important of them. The number of pixels has a smaller impact on how we experience the image than other parameters such as increased dynamic range, color gamut and color depth, and the number of frames per second. UHD uses the Rec. 2020 color space, in contrast to the Rec. 709 color space used for HD. Rec. 2020 defines a bit depth of either 10 or 12 bits per sample and specifies the following frame rates: 120p, 119.88p, 100p, 60p, 59.94p, 50p, 30p, 29.97p, 25p, 24p and 23.976p. Table 6 provides an overview of the main picture characteristics of HDTV, 4K UHDTV and 8K UHDTV.
|Parameter||HDTV||4K UHDTV||8K UHDTV|
|Pixels × number of lines||1280 × 720 p; 1440 × 1080 i; 1920 × 1080 p(i)||3840 × 2160||7680 × 4320|
|Frame rate||25, 50, … fps||25, 50, … fps; 30, 60, 120 fps||25, 50, … fps; 30, 60, 120 fps|
|Bit depth||8 or 10 bits||10 or 12 bits||10 or 12 bits|
|Viewing distance||3 × H (30°)||1.5 × H (60°)||0.75 × H (100°)|
5. Comparison of compression standards for terrestrial SDTV and HDTV transmission
From May 15 to June 16, 2006, the Regional Radiocommunication Conference (RRC-06) was held in Geneva, organized by the International Telecommunication Union (ITU), with the aim of establishing a new international agreement and the associated frequency plan for digital broadcasting of radio and television programs. The conference adopted the Final Acts, which contain the new Geneva 2006 (GE06) agreement, enabling the introduction of fully digital terrestrial television broadcasting in the planning area. All European countries pledged to switch to digital broadcasting of radio and television signals and to perform the analog switch-off (ASO) no later than June 17, 2015. In many countries, ASO has already been carried out.
European countries have adopted the Digital Video Broadcasting-Terrestrial (DVB-T) and DVB-T2 standards. The first DVB-T concepts were adopted in 1993, and the first final version in 1997. DVB-T involves the transmission of digitized audio and video content by terrestrial broadcasting in the VHF and UHF bands, using a conventional system of transmitters and corresponding receivers.
DVB-T2 is an enhanced version of the DVB standard for terrestrial broadcasting. Compared with DVB-T, DVB-T2 is significantly less sensitive to noise and interference and provides a 30–50% higher data rate, which makes it particularly suitable for HDTV.
Table 7 shows the video compression standards used with DVB-T in different countries. The number of national multiplexes (MUX) is given; local and regional multiplexes are not included. The year in which digital terrestrial TV transmission started and the year in which ASO was completed are also given. Data were collected from the official websites of national regulatory agencies and providers of digital terrestrial transmission.
|Country||National MUX||Compression standard||Other compression use||DTT start||ASO|
|Austria||6||MPEG-2||MPEG-4 for pay TV and HD||2004||2010|
|Croatia||5||MPEG-2||MPEG-4 for pay TV||2002||2010|
|Czech Republic||3||MPEG-2||MPEG-4 for experimental HD||2000||2012|
|Denmark||6||MPEG-4||MPEG-4 for pay TV||2003||2009|
|Estonia||4||MPEG-4||MPEG-4 for HD||2004||2010|
|France||8||MPEG-2||MPEG-4 for pay TV and HD||2005||2011|
|Italy||22||MPEG-2, MPEG-4||MPEG-4 tests||1998||2012|
|Slovakia||4||MPEG-2, MPEG-4||MPEG-4 tests||2009||2012|
|United Kingdom||6||MPEG-2||MPEG-4 for HD||1998||2012|
|Albania||10||MPEG-2, MPEG-4||MPEG-4 for HD||2004||–|
|Belarus||3||MPEG-4||MPEG-4 for pay TV||2004||–|
Table 7 shows that countries that have moved completely to digital broadcasting mainly used the DVB-T standard, in some cases in parallel with DVB-T2, while countries that are still transitioning to digital transmission have opted for the DVB-T2 standard. A small number of countries using the DVB-T standard use MPEG-2 compression, mainly for free-to-air (FTA) channels. Because of its savings in capacity, MPEG-4 compression is mainly used for encrypted channels, i.e., pay TV, and for HDTV. An increasing number of countries that use the DVB-T standard plan to switch to the enhanced DVB-T2 standard in the near future.
6. Application of compression standards for UHDTV
6.1. 4K UHDTV via satellite
The amount of UHD content is not large, but it is growing rapidly. Many cameras can now record at resolutions above 4K, such as the RED Epic, which records at approximately 5K (5120 × 2700 pixels), and the Sony F65 8K camera, which records at a resolution of 8192 × 4320 pixels. The first 4K UHD content was available over broadband services (Netflix and YouTube) by 2013, and in 2014 the first experimental TV channels broadcasting 4K UHD content were launched. Sporting events in 2012, 2013 and 2014 were the first UHD content broadcast via satellite. Pioneers in the distribution of 4K UHDTV are the Japanese public broadcaster NHK and the Korean broadcaster KBS. Leading satellite operators, including Eutelsat, SES Astra, Measat and Hispasat, took part in the distribution of UHD. Although early tests were carried out with H.264/AVC video, HEVC is mainly used today. Table 8 provides an overview of the number of SDTV, HDTV and 4K UHDTV channels that can be received via satellite for various transmission parameters.
|Satellite transmission||Carrier data rate (Mbps)||Number of SDTV channels (p25/p30)||Number of HDTV channels (p25/p30)||Number of 4K UHDTV channels (p50/p60)|
|DVB-S, QPSK, FEC 3/4||38||4–5 in MPEG-2||4–5 in MPEG-4||–|
|DVB-S2, 8PSK, FEC 5/6||72||24 + in MPEG-4||7–9 in MPEG-4; 14–18 in HEVC||2–5 in HEVC|
|DVB-S2, 16APSK, FEC 2/3||79||–||7–9 in MPEG-4; 15–19 in HEVC||1 in MPEG-4; 3–5 in HEVC|
|DVB-S2X, 16APSK, FEC 3/4||83||–||8–10 in MPEG-4; 16–20 in HEVC||1 in MPEG-4; 3–5 in HEVC|
|DVB-S2X, 16APSK, FEC 135/180||99||–||9–12 in MPEG-4; 19–24 in HEVC||1 in MPEG-4; 3–6 in HEVC|
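The channel counts in Table 8 follow from dividing the carrier data rate by the bit rate allotted to each channel. The sketch below illustrates that arithmetic. The carrier rates are taken from Table 8, but the per-channel bit rates (8 Mbps for HDTV in MPEG-4, 4 Mbps for HDTV in HEVC, 18 Mbps for 4K UHDTV in HEVC) are assumptions chosen for illustration, not values from the paper. Statistical multiplexing and audio or service overhead are ignored.

```python
import math


def channels_per_carrier(carrier_mbps, channel_mbps):
    """Number of whole channels of a given bit rate that fit in the carrier."""
    return math.floor(carrier_mbps / channel_mbps)


# Carrier data rates in Mbps, taken from Table 8.
carriers = {
    "DVB-S, QPSK, FEC 3/4": 38,
    "DVB-S2, 8PSK, FEC 5/6": 72,
    "DVB-S2, 16APSK, FEC 2/3": 79,
    "DVB-S2X, 16APSK, FEC 3/4": 83,
    "DVB-S2X, 16APSK, FEC 135/180": 99,
}

# Assumed (hypothetical) per-channel bit rates in Mbps.
HD_MPEG4_MBPS = 8.0
HD_HEVC_MBPS = 4.0
UHD_HEVC_MBPS = 18.0

for name, rate in carriers.items():
    print(
        f"{name}: "
        f"{channels_per_carrier(rate, HD_MPEG4_MBPS)} HD in MPEG-4, "
        f"{channels_per_carrier(rate, HD_HEVC_MBPS)} HD in HEVC, "
        f"{channels_per_carrier(rate, UHD_HEVC_MBPS)} 4K UHD in HEVC"
    )
```

With these assumed bit rates the results fall inside the ranges given in Table 8. A different choice of per-channel bit rate, which depends on content and encoder quality, shifts the counts accordingly.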
6.2. 4K UHDTV in digital terrestrial television (DTT)
Initial tests of 4K UHDTV in digital terrestrial television systems were carried out in 2012 in Japan and South Korea by NHK and KBS, using HEVC video compression that had not yet been standardized. Later tests were done in other countries. Table 9 gives the basic test characteristics of 4K UHDTV in digital terrestrial television (DTT) networks in the world.
Technicolor has successfully conducted tests of terrestrial broadcasting of 4K UHDTV content. The broadcast used the American ATSC 3.0 standard, through a Sinclair Broadcast transmitter. Technically speaking, this was the world premiere of the use of scalable HEVC (SHVC) video coding, the MPEG-H compression standards, and the MPEG MMT A/V standards. The test was performed in the Sinclair Broadcast experimental facility. The new technology allows signals to be received via a conventional antenna, as well as on mobile and tablet devices.
Based on the technological profiles and the typology of the various countries, the report predicts that end-user demand and the transition to the new standards will take between 3 and 12 years. Markets that were among the first to adopt new technologies will likely take between three and six years to move from their current broadcasting capabilities to a combination of DVB-T2, MPEG-4/HEVC and SDTV/HDTV/UHDTV (4K). The third profile, which refers to HDTV/UHDTV (4K/8K), is expected to arrive between 2023 and 2030. The report says that the DTT platform is currently threatened by the scarcity of radio spectrum, as well as by plans for the reallocation of the 700 MHz band, which will reduce the available capacity by an average of 30%.
|Country||Multiplex capacity (Mbit/s)||Signal bit rate (Mbit/s)||Video encoding standard||Picture standard|
|Republic of Korea||<35.0||25–34||HEVC Main 10||3840 × 2160p, 8 bits or 10 bits/pixel|
|HEVC||3840 × 2160p|
|Spain||36.72||35||HEVC||3840 × 2160p|
|Sweden||31.7||24||HEVC||3840 × 2160p|
|United Kingdom||40.2||Variable (35)||HEVC||Mixture of 3840 × 2160p 50 frames/s and 3840 × 2160p; 8 or 10 bits/pixel|
|Czech Republic||–||–||HEVC||3840 × 2160p|
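The trial figures in Table 9 also show how much of a DTT multiplex a single 4K UHDTV service occupies. The following minimal sketch computes that share for the rows of Table 9 that give both a multiplex capacity and a single signal bit rate. The function name is illustrative, and for the United Kingdom the variable bit rate is represented by the value of 35 Mbit/s given in parentheses in the table.

```python
def multiplex_share(signal_mbps, mux_mbps):
    """Fraction of the multiplex capacity taken by one service."""
    return signal_mbps / mux_mbps


# (signal bit rate, multiplex capacity) in Mbit/s, from Table 9.
trials = {
    "Spain": (35.0, 36.72),
    "Sweden": (24.0, 31.7),
    "United Kingdom": (35.0, 40.2),
}

for country, (signal, mux) in trials.items():
    share = multiplex_share(signal, mux)
    print(f"{country}: {share:.0%} of the multiplex, "
          f"{mux - signal:.2f} Mbit/s left for other services")
```

In each of these trials a single 4K UHDTV service takes roughly three quarters or more of the multiplex, which is consistent with the concern about DTT capacity discussed above.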
In addition to 4K UHDTV channels broadcast over satellite and terrestrial digital networks, several UHDTV services were launched worldwide during 2015 in Internet Protocol Television (IPTV) and Over-The-Top (OTT) systems.
6.3. 8K UHDTV
Besides the 4K ultra HD format, there is also Super Hi-Vision 8K, whose development and promotion are led by the Japanese public broadcaster NHK. The Super Hi-Vision format supports 120 frames per second at a resolution of 7680 × 4320 pixels, which corresponds to about 33 megapixels. This format offers four times the pixel count of the 4K format and 16 times that of HD. Table 10 gives the basic test characteristics of 8K UHDTV in DTT networks in the world.
|Country||Multiplex capacity (Mbit/s)||Signal bit rate (Mbit/s)||Video encoding standard||Picture standard|
|7680 × 4320p|
|Republic of Korea||50.47||50.0||HEVC||–|
|Parameter||4K UHDTV, Phase 1||4K UHDTV, Phase 2||8K UHDTV|
|Resolution||3840 × 2160||3840 × 2160||7680 × 4320|
|Dynamic range||HDR preferred||HDR mandatory||HDR mandatory|
|Color space||Rec. 2020||Rec. 2020||Rec. 2020|
|Color sampling||4:2:0, 4:2:2||4:2:0, 4:2:2||4:2:0, 4:2:2, 4:4:4|
|Color bit depth||10 bits||10/12 bits||10/12 bits|
|Video encoding||HEVC Main 10||HEVC Main 10||HEVC Main 10|
|Audio format||5.1||Beyond 5.1||Object based|
|Audio codec||Open||TBD||Next-generation audio codec|
|Viewing distance||1.5 picture height||1.5 picture height||0.75 picture height|
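The pixel-count relations quoted for the 8K format in Section 6.3 can be checked with a few lines of arithmetic. This is a small sketch that takes HD to mean 1920 × 1080, as in Table 6; the variable names are illustrative.

```python
# Picture sizes in pixels (width, height).
resolutions = {
    "HD": (1920, 1080),
    "4K": (3840, 2160),
    "8K": (7680, 4320),
}

# Total number of pixels per frame for each format.
pixels = {name: w * h for name, (w, h) in resolutions.items()}

print(f"8K pixels per frame: {pixels['8K']:,}")
print(f"8K / 4K pixel ratio: {pixels['8K'] / pixels['4K']:.0f}")
print(f"8K / HD pixel ratio: {pixels['8K'] / pixels['HD']:.0f}")
```

An 8K frame contains 33,177,600 pixels, four times as many as a 4K frame and 16 times as many as an HD frame.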
Compression of the TV signal brings the following advantages: it reduces the bandwidth of the telecommunication channel that carries the TV signal; it reduces the memory capacity required for recording and storing images; it shortens data access time, because the material can be skipped through faster; it makes real-time data transfer possible; and it reduces the required RAM, so hardware becomes less expensive and the hardware in television sets can be miniaturized. Because fewer bits are transmitted, less power is required for broadcasting; for example, if transmitters of the same power broadcast an analog and a digital signal, a receiving antenna of smaller diameter is sufficient for the digital signal.
To ensure reliable communication between users whose equipment and software come from different manufacturers, compression methods have been standardized. Today, depending on the required quality and the application (television, multimedia services, videoconferencing, video telephony, etc.), several standards are in use (JPEG, MPEG-1, MPEG-2, MPEG-4, H.261, H.263, H.264, H.265, etc.).
UHD is a new technology that is only now gaining momentum toward global use, and it is the future of television. UHD offers the ultimate user experience and creates opportunities for the entire industry. 4K and 8K services will stimulate the growth of broadband, as well as the expansion of TV services in emerging markets. Consequently, the compression standard for TV in the near future will be HEVC/H.265.
This work was done within the Erasmus Plus Capacity-Building projects in the field of Higher Education: “Implementation of the Study Program—Digital Broadcasting and Broadband Technologies (Master Studies)”, Project No. 561688-EPP-1-2015-1-XK-EPPKA2-CBHE-JP.