Basic parameters of the test sequences.
This paper deals with the influence of chroma subsampling on video quality measured by objective metrics for the H.264/AVC and H.265/HEVC compression standards. The evaluation is done for eight types of sequences with Full HD and Ultra HD resolutions, depending on content. The experimental results showed that chroma subsampling has no impact on the video quality. According to the results, the H.265/HEVC codec also yields better compression efficiency than H.264/AVC, and the difference is more visible in UHD resolution. The difference in quality is larger at lower bitrates; with increasing bitrate, the quality of the H.264/AVC codec approaches that of the H.265/HEVC codec.
- chroma subsampling
- objective assessment
1. Introduction
Recently, interest in multimedia services has increased rapidly, which is also reflected in the use of higher resolutions. The trend is moving from Full HD to Ultra HD resolution. This results in the need for higher bandwidth as well as the need to develop new compression standards, and it also increases the demand for video quality assessment. In recent years, many new compression standards have become available, e.g. H.265/HEVC or VP9, and others, such as Daala or VP10, are still being developed.
2. State of the art
Papers [1–3] compared the coding efficiency of well-known and widely used compression standards such as H.264/AVC, H.265/HEVC and VP9 using objective metrics, but only for sequences in HD and Full HD resolutions; an objective quality assessment for Ultra HD resolution was missing. Papers [4–8] examined the objective quality of the newest compression standards, such as H.265/HEVC and VP9, but did not take into account H.264/AVC, the reference and still the most used compression standard. Papers [9–11] explored only the quality of multimedia services. None of the papers [1–11] analyzed the influence of chroma subsampling on video quality. Therefore, the aim of this paper is to analyze how chroma subsampling affects the video quality measured by selected objective metrics.
3. Chroma subsampling
Chroma subsampling is the practice of encoding images with lower resolution for chroma information than for luma information. This is possible because the human visual system is less sensitive to color than to luminance. The video system can therefore be optimized by devoting more bandwidth to the luma component (usually denoted Y, or Y′ after gamma correction) than to the color difference components Cb and Cr. The subsampling scheme is commonly expressed as a three-part ratio J:a:b (e.g. 4:2:2). The parts are, in order (Figure 1):
J: horizontal sampling reference (width of the conceptual region), usually 4.
a: number of chrominance samples (Cr, Cb) in the first row of J pixels.
b: number of changes of chrominance samples (Cr, Cb) between the first and second rows of J pixels.
The most used subsampling schemes are 4:4:4, 4:2:2, 4:2:0 and 4:1:1.
4:4:4: The Cb and Cr colors are sampled at the same rate as the luma (Y); thus, there is no chroma subsampling.
4:2:2: Both chroma components (Cb and Cr) are sampled at half the horizontal resolution of the luma (Y), while the vertical chroma resolution is unchanged. This reduces the bandwidth of an uncompressed video signal by one-third with little to no visual difference.
4:2:0: Both chroma components (Cb and Cr) are sampled at half the horizontal and half the vertical resolution of the luma (Y), so the bandwidth is halved compared to no chroma subsampling.
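The bandwidth figures quoted above follow directly from the J:a:b notation. The following minimal sketch counts the samples in a region J pixels wide and two rows high; the function names are illustrative and not part of any standard or tool.

```python
from fractions import Fraction


def parse_scheme(scheme):
    """Split a 'J:a:b' string such as '4:2:0' into three integers."""
    j, a, b = (int(part) for part in scheme.split(":"))
    return j, a, b


def relative_bandwidth(scheme):
    """Sample count of `scheme` relative to 4:4:4 over a J x 2 pixel region."""
    j, a, b = parse_scheme(scheme)
    luma = 2 * j              # two rows of J luma samples
    chroma = 2 * (a + b)      # Cb and Cr: a samples in row 1, b more in row 2
    full = 2 * j + 2 * (2 * j)  # 4:4:4 keeps every chroma sample
    return Fraction(luma + chroma, full)


for scheme in ("4:4:4", "4:2:2", "4:2:0", "4:1:1"):
    print(scheme, relative_bandwidth(scheme))
```

For 4:2:2 the ratio is 2/3 (a one-third saving) and for 4:2:0 it is 1/2, in agreement with the descriptions above. 4:1:1 also gives 1/2, although it distributes the chroma samples differently.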
4. H.264/AVC and H.265/HEVC compression standards
Although the H.264/AVC codec was developed in 2003, it is still one of the most used compression standards. It was designed for a wide range of applications, from video for mobile phones through web applications to TV broadcasting (HDTV). H.264/AVC also defines profiles and levels. The first version of the standard defined three profiles: Baseline, Main and Extended; later amendments added further profiles, such as High.
The first version of High Efficiency Video Coding, known as H.265/HEVC, was finalized in January 2013 by the Joint Collaborative Team on Video Coding (JCT-VC), a partnership formed by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) standardization organizations. It is the newest coding standard from the MPEG family of codecs. It contains many improvements, which make it more efficient than the previous standards.
5. Objective video quality assessment methods
In general, video quality assessment can be divided into objective and subjective methods. Subjective evaluation relies on people: assessors who rate the video quality. Although it is the most reliable way to determine the video quality, it has a big disadvantage: it is time-consuming, because a minimum of 15 observers is needed for each test (according to the ITU-R BT.500-13 recommendation). Because of this drawback, objective assessment is used in many cases. It relies on computational methods that score the video quality. The biggest advantage of this type of assessment is that the tests can be repeated easily. Many objective metrics exist today. The most widely used are the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM).
Although PSNR is the oldest and simplest objective metric, it is still used.
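As an illustration, PSNR is commonly defined as 10·log10(MAX² / MSE), where MAX is the largest possible sample value and MSE is the mean squared error between the reference and the distorted samples. The sketch below assumes this standard definition and flat lists of sample values; it is not the implementation used by the measurement tool in this paper, and the default bit depth of 10 is taken from the test sequences.

```python
import math


def mse(ref, dist):
    """Mean squared error between two equally long sample sequences."""
    if len(ref) != len(dist) or not ref:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - d) ** 2 for r, d in zip(ref, dist)) / len(ref)


def psnr(ref, dist, bit_depth=10):
    """PSNR in dB; infinite when the two signals are identical."""
    peak = (1 << bit_depth) - 1   # 1023 for 10-bit, 255 for 8-bit samples
    err = mse(ref, dist)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)


# Hypothetical 8-bit samples that differ by one level each.
print(round(psnr([100, 100], [101, 99], bit_depth=8), 2))
```

Higher values mean that the decoded video is closer to the reference.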
The SSIM metric measures three components (luminance, contrast and structural similarity) and combines them into one final value that determines the quality. It correlates very well with subjective perception. The results lie in the interval [0, 1], where 0 represents the worst and 1 the best quality.
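The way the three components are combined can be sketched with the widely used closed-form SSIM expression. The sketch below is a simplification: it evaluates the formula once over all samples, whereas practical implementations evaluate it in small local windows and average the results. The constants k1 = 0.01 and k2 = 0.03 are the commonly used defaults and are an assumption here, not values reported in this paper.

```python
def ssim_global(x, y, bit_depth=10, k1=0.01, k2=0.03):
    """Single-window SSIM over two equally long sample sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    peak = (1 << bit_depth) - 1
    c1 = (k1 * peak) ** 2          # stabilizes the luminance term
    c2 = (k2 * peak) ** 2          # stabilizes the contrast/structure term
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


reference = [100, 200, 300, 400]           # hypothetical 10-bit samples
print(ssim_global(reference, reference))   # identical signals score 1
print(ssim_global(reference, reference[::-1]) < 1)
```

The first factor of the numerator and denominator compares mean luminance, while the second compares contrast and structure through the variances and the covariance.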
6. Measurement procedure
In this experiment, eight video test sequences with different temporal and spatial information were used. All sequences are part of the same database. The sequences are briefly described below:
Bund nightscape: a night shot of a city. The scene is time-lapsed; the dynamic parts of the scene are moving cars and pedestrians on the sidewalk, while the static parts are urban buildings. The camera captures the scene from a static position (Figure 2a).
Campfire party: a night scene close to a fire. In the foreground is a flaming bonfire (fast changes of temporal and luminance information). In the background is a group of nearly static people. At the end of the sequence, the camera zooms in on the group of people (Figure 2b).
Construction field: a shot of a construction site, where the static background consists of buildings under construction and the dynamic objects are construction vehicles (an excavator) and walking workers. The scene has slow motion and is captured with a static camera (Figure 2c).
Fountains: a daytime shot of a city fountain. The foreground consists of spraying water (many edges in the picture); the static background is formed by trees and buildings. The camera is static and the scene has low motion dynamics (Figure 2d).
Marathon: a marathon race. The runners are multiple moving objects with moderate dynamics; the background is a static road. The camera is static and shoots from a high vantage point (Figure 2e).
Runners: a running race; in contrast to the "Marathon" scene, there are fewer runners. The camera is static, located in front of the runners and slightly angled to the side (higher spatial information). The scene is relatively dynamic (Figure 2f).
Tall buildings: a shot of a modern city. The static objects are skyscrapers, a river and the urban infrastructure; the slowly moving objects are city traffic. The camera moves slowly from left to right. The scene is characterized by changing spatial and temporal information (Figure 2g).
Wood: forest scenery. A shot of trees in a forest (the captured objects are static); the camera moves from left to right and its motion accelerates during the sequence. The spatial and temporal information values are relatively high (Figure 2h).
Generally, the compression difficulty is directly related to the spatial and temporal information of a sequence. The spatial information (SI) and temporal information (TI) of each sequence were calculated using the Mitsu tool. Based on the results, the spatial-temporal information plane was drawn (Figure 3).
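To make the two quantities concrete, the sketch below assumes the commonly used ITU-T P.910 definitions, which the text does not confirm: SI is the maximum over time of the spatial standard deviation of the Sobel-filtered luma frame, and TI is the maximum over time of the spatial standard deviation of the difference between consecutive frames. Frames are modeled as lists of rows of luma values; this is an illustration, not the algorithm of the Mitsu tool.

```python
import math
from statistics import pstdev


def sobel_magnitude(frame):
    """Sobel gradient magnitude for every interior pixel of a 2-D frame."""
    height, width = len(frame), len(frame[0])
    out = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (frame[y - 1][x + 1] + 2 * frame[y][x + 1] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y][x - 1] - frame[y + 1][x - 1])
            gy = (frame[y + 1][x - 1] + 2 * frame[y + 1][x] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y - 1][x] - frame[y - 1][x + 1])
            out.append(math.hypot(gx, gy))
    return out


def spatial_information(frames):
    """SI: largest per-frame standard deviation of the Sobel magnitude."""
    return max(pstdev(sobel_magnitude(frame)) for frame in frames)


def temporal_information(frames):
    """TI: largest standard deviation of consecutive frame differences."""
    if len(frames) < 2:
        raise ValueError("at least two frames are needed")
    deviations = []
    for prev, cur in zip(frames, frames[1:]):
        diff = [c - p for row_p, row_c in zip(prev, cur)
                for p, c in zip(row_p, row_c)]
        deviations.append(pstdev(diff))
    return max(deviations)


# Hypothetical 5x5 frames: a flat frame and one with a vertical edge.
flat = [[5] * 5 for _ in range(5)]
edge = [[0, 0, 0, 9, 9] for _ in range(5)]
print(spatial_information([flat, edge]), temporal_information([flat, edge]))
```

A sequence with many edges yields a high SI, and a sequence with fast motion or rapid luminance changes yields a high TI, which is how the sequences are placed in the SI/TI plane.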
The basic parameters of these sequences are shown in Table 1.
| Resolution | Chroma subsampling | Bit depth | Aspect ratio | Frame rate (fps) | Length (frames) | Length (seconds) |
|---|---|---|---|---|---|---|
| 3840×2160 (UHD) | 4:4:4 | 10 bits per channel | 16:9 | 30 | 300 | 10 |
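The parameters in Table 1 determine the data rate of the uncompressed source. The following sketch computes it for the listed UHD format; the helper name and the mapping of subsampling schemes to samples per pixel are illustrative.

```python
from fractions import Fraction

# Average number of samples (Y + Cb + Cr) stored per pixel.
SAMPLES_PER_PIXEL = {
    "4:4:4": Fraction(3),
    "4:2:2": Fraction(2),
    "4:2:0": Fraction(3, 2),
}


def raw_bitrate_bps(width, height, fps, bit_depth, scheme):
    """Uncompressed video bit rate in bits per second."""
    bits = width * height * fps * bit_depth * SAMPLES_PER_PIXEL[scheme]
    return int(bits)


rate = raw_bitrate_bps(3840, 2160, 30, 10, "4:4:4")
print(f"{rate / 1e9:.2f} Gbit/s")
```

For the UHD reference sequences this gives roughly 7.46 Gbit/s, which shows how strongly the material has to be compressed to reach bitrates in the Mbps range.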
The measurement process consists of four steps:
The reference sequences were downloaded from the database in uncompressed format (*.yuv).
All of them were then encoded with the H.264/AVC and H.265/HEVC compression standards using the FFmpeg tool. The target bitrates were set to 1, 3, 5, 10 and 15 Mbps. The GoP size was set to half of the frame rate, i.e. M = 3 and N = 15.
Subsequently, the sequences were decoded back to the *.yuv format using the same FFmpeg tool.
Finally, the quality of these sequences was compared with that of the reference (uncompressed) ones and evaluated. This was done using the MSU Measuring Tool Pro version 3.0. The PSNR and SSIM objective metrics were used for the measurement.
The whole procedure of measurement and evaluation is represented in Figure 4.
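The encoding and decoding steps could be scripted roughly as follows. The exact FFmpeg command lines are not given in the text, so everything below is an assumption: the file names, the use of the libx264 and libx265 encoders, the pixel formats, and the mapping of N = 15 to `-g 15` and of M = 3 to two B-frames (`-bf 2`). The sketch only builds the argument lists and does not run FFmpeg.

```python
import shlex

TARGET_BITRATES_MBPS = (1, 3, 5, 10, 15)
ENCODERS = {"h264": "libx264", "h265": "libx265"}  # assumed encoder choice


def encode_command(src, dst, codec, bitrate_mbps, width=3840, height=2160,
                   fps=30, src_pix_fmt="yuv444p10le",
                   dst_pix_fmt="yuv420p10le", gop=15, b_frames=2):
    """Argument list that encodes a raw .yuv file at a target bitrate."""
    return [
        "ffmpeg",
        # Raw video carries no header, so the input format is stated explicitly.
        "-f", "rawvideo", "-pix_fmt", src_pix_fmt,
        "-s", f"{width}x{height}", "-r", str(fps), "-i", src,
        "-c:v", ENCODERS[codec], "-b:v", f"{bitrate_mbps}M",
        "-g", str(gop), "-bf", str(b_frames),
        "-pix_fmt", dst_pix_fmt,   # chroma subsampling of the encoded stream
        dst,
    ]


def decode_command(src, dst, pix_fmt="yuv420p10le"):
    """Argument list that decodes a compressed file back to raw .yuv."""
    return ["ffmpeg", "-i", src, "-f", "rawvideo", "-pix_fmt", pix_fmt, dst]


for mbps in TARGET_BITRATES_MBPS:
    cmd = encode_command("reference.yuv", f"h265_{mbps}M.mp4", "h265", mbps)
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run` once the names and options have been adapted to the actual setup. Changing `dst_pix_fmt` to `yuv422p10le` or `yuv444p10le` would select the other subsampling schemes, provided the FFmpeg build supports them for the chosen encoder.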
7. Experimental results
Figure 5 shows the relationship between video quality (measured by the PSNR and SSIM metrics) and bitrate for both compression standards and all test sequences in FHD and UHD resolution. Figure 5 contains eight graphs, organized by codec, resolution and objective metric.
Figure 6 shows the impact of chroma subsampling on the video quality measured by the PSNR and SSIM metrics for both compression standards and all test sequences in FHD and UHD resolution. Figure 6 contains eight graphs, organized by codec, resolution and objective metric.
Figure 7 demonstrates the difference in coding efficiency between the H.264 and H.265 codecs as well as the difference between FHD and UHD resolutions. It covers five test sequences that have different SI and TI parameters and lie in different (opposite) positions in the SI/TI diagram (Figure 3). Figure 7 contains 10 graphs, organized by test sequence, resolution and objective metric.
According to the experimental results and graphs, several conclusions can be drawn:
The impact of chroma subsampling on the video quality is negligible: no difference was observed between non-subsampled (4:4:4) and subsampled (4:2:2, 4:2:0) sequences. It is therefore more reasonable to use chroma subsampling, which can save up to 50% of the bandwidth at the same quality.
As expected, the H.265/HEVC compression standard yields better compression efficiency than H.264/AVC: at the same resolution and bitrate, the quality achieved by H.265/HEVC is better. The quality of both compression standards rises logarithmically with increasing bitrate, i.e. it grows faster at low bitrates than at high bitrates.
Comparing both codecs, the difference in quality is generally larger at lower bitrates; with increasing bitrate, the quality of the H.264/AVC codec approaches that of the H.265/HEVC codec.
The coding efficiency gain of the H.265/HEVC standard is more visible in UHD resolution.
The effectiveness of compression depends on the type of test sequence. Based on the measurement results, sequences with smaller TI and SI, for instance "Bund nightscape" and "Construction field," yield better quality at low bitrates than other sequences. Conversely, sequences with high motion, such as "Runners" or "Marathon," yield worse quality at low bitrates. Sequences that contain many details, for instance "Fountains," form a separate category: the difference between their quality at low and high bitrates is not large, and they do not reach high quality even at higher bitrates.
8. Conclusion
This paper dealt with the influence of chroma subsampling on video quality measured by objective metrics for the H.264/AVC and H.265/HEVC compression standards. The evaluation was done for eight types of sequences with Full HD and Ultra HD resolutions, depending on content. The experimental results showed that chroma subsampling has no impact on the video quality: no difference was observed between non-subsampled (4:4:4) and subsampled (4:2:2, 4:2:0) sequences. According to the results, the H.265/HEVC codec also yields better compression efficiency than H.264/AVC, and the difference is more visible in UHD resolution. The difference in quality is larger at lower bitrates; with increasing bitrate, the quality of the H.264/AVC codec approaches that of the H.265/HEVC codec.
This paper is supported by the following project: University Science Park of the University of Zilina – II. phase (ITMS: 313011D13) supported by the Operational Programme Research and Innovation and funded by the European Regional Development Fund.