Comparing Gamma and Weibull as Frame Size Distributions for High Efficient Video Coding

Abstract: Digital video is one of the major traffic components in communication networks. Modelling the frame size of encoded video data is a preliminary step in the research and development of synthetic video data generators, which enable a thorough analysis of video architecture systems that is often difficult to perform with real digital video data. In this paper, a statistical analysis is performed on the frame sizes of High Efficiency Video Coding (HEVC) video generated at bit rates of interest for high quality Full HD video applications. The candidate distributions for modelling the HEVC frame size distribution are selected on the basis of earlier results on modelling the H.264 frame size distribution. Experimental results show that the Gamma distribution fits the HEVC frame size distribution better than the Weibull distribution.

video often employ synthetic video traffic generation for analysing the behaviour and performance of a given video service with respect to the coding process, the rate control process, and the multiplexing and transmission process.
A video traffic model should be strictly related to the video coding standard that will be used for coding digital video, capturing the characteristics of the video sequence.
Bitrate, GOP structure (the sequence order of I, B, P frames in a GOP), frame size, frame size distribution, frame size dependencies (short range dependencies, long range dependencies, autocorrelation), intra-GOP correlation, and inter-GOP correlation [7] are some of the characteristics of major interest.
A traffic model for MPEG-4 and H.264/MPEG-4 AVC video traces has been proposed in [7]. Inter-GOP and intra-GOP correlation in compressed VBR (variable bit rate) sequences have been addressed by incorporating wavelet-domain analysis into time-domain modelling. The proposed model has been developed at frame-size level, allowing for the analysis of the loss ratio for each frame type.
A statistical analysis of the H.264 video frame size distribution has been presented in [8]. Frame sizes of H.264/MPEG-4 AVC encoded video have been modelled with well-known statistical distributions. The main results have shown that the Gamma and Weibull distributions are appropriate models for the distribution of video frame sizes.
A video traffic modelling tool for simulation-based performance evaluation studies [9] considers different models for traffic correlations based on the M/G/∞ process, able to exhibit both short range dependence (SRD) and long range dependence (LRD).
A model for controlled VBR video traffic reflecting different properties of video traffic related to the content, encoder, and rate controller is proposed in [10]. The model is based on the interaction between the video encoder and the encoded bit stream by means of a rate controller. The model parameters depend on the encoding parameters, on the rate control parameters, and on the properties of the video content. The synthetic video traffic generated by this model exhibits both LRD and SRD, as well as some properties related to the content and the encoding parameters.
Traffic modelling of VBR video is necessary for the performance assessment of a network design and for the creation of synthetic loads to be used for benchmarking a network. A recent classification and survey of VBR video traffic models [11] provides a review of the state of the art and a classification and comparison of some representative video traffic models.
The objective of this paper is to analyse the statistical behaviour of frame sizes in HEVC video generated at bit rates of interest for high definition video applications to be used as a basis for simulation studies of digital television services.
The paper is organized as follows. Section II briefly reviews the video coding architecture. The statistical analysis is presented in Section III. Section IV concludes the paper.

II. HEVC VIDEO CODING
A digital video sequence is composed of video scenes. A video scene is composed of frames (pictures). Frames are encoded with different algorithms depending on their type.
The main frame types are named I, P, and B. I frames are encoded exploiting spatial redundancy for the compression.
The encoding of an I frame is self-contained, meaning that no other information is required for the reconstruction (decoding) of that frame. P frames are encoded exploiting both spatial and temporal correlations.
The encoding of P frames requires the knowledge of previously encoded (reference) frames, and the decoder must save all reference frames necessary for decoding the target P frame.
The encoding of B frames requires the knowledge of previous and future frames (which must be encoded in advance), and the decoder must save all past and future reference frames necessary for decoding the target B frame.
In general, in MPEG international standards a frame is partitioned into slices and the encoding is performed at slice level, hence the encoding refers to I-slices, B-slices, and P-slices.
Video encoding rate control is designed to satisfy requirements such as encoding at a given bitrate, encoding at a given target quality, encoding at the highest possible quality under the constraint of a target bitrate, or encoding at a given bitrate under the constraint of a maximum buffer size.
In constant bit rate (CBR) encoding, the encoder dynamically adjusts encoding parameters in order to minimize the difference between the actual bitrate and the target bitrate. Most of the algorithms are based on the adaptation of the quantization parameters at one or more of the following video levels: scene, GOP, frame, slice, macroblock, block.
The encoded bit stream, while satisfying the CBR constraint, presents a highly variable video quality. For this reason CBR is not the preferred choice in digital video entertainment applications.
VBR video encoding can be controlled or uncontrolled. In controlled encoding, the bit rate of the encoded bit stream is controlled by a VBR rate controller according to feedback signals from the encoding results and the coding complexity of the video source.
The rate controller imposes some constraints on the degree of variability allowed in the bit rate. In uncontrolled encoding, the video pictures are encoded with an almost constant quantization parameter (QP) to provide a relatively constant quality for the encoded video regardless of the coding complexity of the video source.
Novel services, the increased popularity of high-definition (HD) video, and the possible future adoption of video formats with resolutions higher than HD have fostered the development of a video coding standard that improves on the capabilities of current standards such as H.264/MPEG-4 AVC.
Video applications for mobile devices and video-on-demand services generate network traffic that is very demanding when high quality, high resolution video is delivered.
The recent High Efficiency Video Coding (HEVC) standard is an ITU-T and ISO/IEC joint project activity developed under a partnership named Joint Collaborative Team on Video Coding (JCT-VC).
HEVC is mainly focused on two key points: supporting higher video resolutions and increasing the use of parallel processing architectures. As in the previous ITU-T and ISO/IEC standards, HEVC standardizes only the bitstream structure and syntax of the decoder. The HEVC video coding layer makes use of the same hybrid approach employed by previous video standards since H.261.
Figure 1 shows a diagram of a hybrid encoder for creating a bitstream compliant with HEVC.
A coding algorithm for producing an HEVC bitstream operates as follows. Each image is decomposed into block units by a coding tree unit (CTU) process. Each block is transmitted to the encoder. The first image of a video sequence is encoded exploiting intra-picture prediction. The other images are encoded exploiting inter-picture prediction, which defines motion vectors (MVs) that capture the temporal redundancies between frames.
The signal difference between the original picture and the corresponding intra- or inter-prediction is transformed by means of a linear spatial transform.
Transformed coefficients are then scaled and quantized. The quantized coefficients are then encoded (entropy coding) by a context-adaptive binary arithmetic coding (CABAC) process. Finally, the encoded data is transmitted together with the other prediction information.

III. STATISTICAL ANALYSIS
Test video sequences have been chosen from [13], where several contributions of video content that are most relevant for determining the effectiveness of consumer video processing applications and quality measurement algorithms are provided.
The selection of test video sequences has been oriented by the following constraints: YUV 4:2:0 sampling, full HD 1920 × 1080 resolution, 60 Hz frame rate.
The cyclic GOP structure used repeatedly throughout the test video sequence is shown in Fig. 2. This coding structure is of size 8 (8 B slices).
This prediction structure is characterized by a structural delay of eight pictures and can provide an improved coding efficiency compared to IBBP coding [14].
Figure 3 shows the average PSNR at different bitrates. PSNR (Y) is the average for all the frames, PSNR (Y, only I frame) is the average for the I frames, PSNR (Y, only B frame) is the average for the B frames. I frames have a higher PSNR than B frames.
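For reference, the luma PSNR of a frame is conventionally computed from the mean squared error between original and decoded samples. The sketch below is illustrative only: it assumes 8-bit samples (peak value 255), frames given as flat lists of luma values, and an average taken as the arithmetic mean of per-frame PSNR values. None of these choices is stated in the text, and the function names are hypothetical.

```python
import math

def psnr_luma(original, decoded, peak=255):
    """PSNR in dB between two equal-length sequences of luma samples.

    Assumes 8-bit video by default (peak = 255); this is an assumption,
    not a detail given in the paper.
    """
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical frames: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)

def average_psnr(per_frame_psnr):
    """Arithmetic mean of per-frame PSNR values (assumed averaging rule)."""
    return sum(per_frame_psnr) / len(per_frame_psnr)
```

Averaging only the values that belong to I frames or to B frames would give the per-type curves described for the figure.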
Statistical analysis has been done for each prediction layer. Referring to Fig. 2: layer 0 (L0) is composed of frames with picture order count (POC) 0 and 8; layer 1 (L1) is composed of frames with POC 4; layer 2 (L2) is composed of frames with POC 2 and 6; layer 3 (L3) is composed of frames with POC 1, 3, 5, and 7.
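The layer assignment above can be expressed as a small lookup on the POC. The following is a minimal sketch, assuming that the size-8 structure repeats cyclically (so that POC modulo 8 identifies the layer) and that frame sizes are available as a list indexed by POC; the function names are illustrative and do not come from the paper.

```python
GOP_SIZE = 8  # size of the cyclic prediction structure described in the text

def prediction_layer(poc, gop_size=GOP_SIZE):
    """Return the prediction layer (0..3) of a picture from its POC."""
    offset = poc % gop_size
    if offset == 0:
        return 0  # POC 0 and 8
    if offset == 4:
        return 1  # POC 4
    if offset in (2, 6):
        return 2  # POC 2 and 6
    return 3      # POC 1, 3, 5 and 7

def split_by_layer(frame_sizes):
    """Group frame sizes (assumed to be indexed by POC) into one list per layer."""
    layers = {0: [], 1: [], 2: [], 3: []}
    for poc, size in enumerate(frame_sizes):
        layers[prediction_layer(poc)].append(size)
    return layers
```

Each of the four resulting lists can then be analysed separately, which matches the per-layer analysis described here.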
For each layer, frame size histograms have been analysed in order to empirically determine a set of statistical distributions to be used as a starting point for fitting to the data sets.
An example of the frame size, for 60 consecutive HEVC encoded frames, is shown in Fig. 4.
The Gamma and Weibull distributions have been selected for the fitting analysis, on the basis of the results obtained when modelling H.264 frame sizes [8].
The parameters of the Gamma and Weibull distributions have been computed using the maximum likelihood estimation (MLE) method. Figure 5 shows an example of an L3 frame size histogram, for a video sequence encoded by HEVC at 6 Mbps.
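As an illustration of the MLE step, the sketch below estimates the parameters of the two-parameter Gamma and Weibull distributions using only the Python standard library. It is not the authors' implementation. It assumes strictly positive frame sizes, no location parameter, and enough spread in the data for the root searches to bracket a solution. For the Gamma distribution the shape k solves ln k − ψ(k) = ln(mean(x)) − mean(ln x), with scale mean(x)/k; for the Weibull distribution the shape k solves Σ x^k ln x / Σ x^k − 1/k = mean(ln x), with scale (mean(x^k))^(1/k). All function names are hypothetical.

```python
import math

def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic expansion)."""
    result = 0.0
    while x < 6.0:  # shift x upward so that the expansion is accurate
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series

def bisect_root(f, lo, hi, iterations=100):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def fit_gamma_mle(sizes):
    """Return (shape, scale) maximizing the Gamma likelihood."""
    n = len(sizes)
    mean = sum(sizes) / n
    s = math.log(mean) - sum(math.log(x) for x in sizes) / n
    shape = bisect_root(lambda k: math.log(k) - digamma(k) - s, 1e-3, 1e4)
    return shape, mean / shape

def fit_weibull_mle(sizes):
    """Return (shape, scale) maximizing the two-parameter Weibull likelihood."""
    n = len(sizes)
    top = max(sizes)
    y = [x / top for x in sizes]  # rescale so that y**k cannot overflow
    log_y = [math.log(v) for v in y]
    mean_log = sum(log_y) / n

    def score(k):
        # Derivative of the profile log-likelihood with respect to the shape
        powers = [v ** k for v in y]
        weighted = sum(p * l for p, l in zip(powers, log_y))
        return weighted / sum(powers) - 1.0 / k - mean_log

    shape = bisect_root(score, 1e-2, 1e2)
    scale = top * (sum(v ** shape for v in y) / n) ** (1.0 / shape)
    return shape, scale
```

Bisection is used here for robustness rather than speed; a Newton iteration, or a statistics library that provides these fits, would be a reasonable alternative.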
The goodness of the fit has been tested using Pearson's chi-square test. The value of the test statistic is

$$\chi^2 = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i}$$

where $\chi^2$ is Pearson's cumulative test statistic, $o_i$ is an observed frequency, $e_i$ is an expected frequency, and $n$ is the size of the histogram domain.
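A minimal sketch of how this statistic could be evaluated on a frame size histogram is given below. It assumes that the expected frequency of each bin is the total number of frames multiplied by the probability mass that the fitted distribution assigns to that bin. The Weibull CDF is used as the example because it has a closed form; the helper names and the binning are illustrative, not taken from the paper.

```python
import math

def weibull_cdf(x, shape, scale):
    """CDF of the two-parameter Weibull distribution."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))

def expected_counts(bin_edges, cdf, total):
    """Expected frequency e_i of each histogram bin under a fitted CDF."""
    return [total * (cdf(hi) - cdf(lo))
            for lo, hi in zip(bin_edges[:-1], bin_edges[1:])]

def pearson_chi_square(observed, expected):
    """Sum over the bins of (o_i - e_i)^2 / e_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("every expected frequency must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

For the same binning, a smaller statistic indicates closer agreement between the histogram and the fitted distribution. Evaluating the Gamma fit in the same way requires the Gamma CDF (the regularized incomplete gamma function), which the Python standard library does not provide.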

IV. CONCLUSIONS
Defining the statistical properties of VBR video traffic and relying upon a video traffic model able to produce simulated traffic data having the desired statistical properties enables an accurate study of the performance of communication networks.
In this paper, the statistical properties of frame sizes of HD video sequences encoded by the HEVC standard have been analysed, and the Gamma and Weibull distributions have been evaluated and compared as fitting distributions. HEVC videos, selected from a dataset of sports video content-types, have been generated at bit rates of interest for high definition video applications. For a GOP structure of interest in digital video applications, the analysis has been conducted at different prediction layers. Experimental results have shown that the Gamma distribution is more appropriate than the Weibull distribution for modelling the statistical distribution of HEVC encoded frame sizes.
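To indicate how a per-layer Gamma model might feed a synthetic traffic generator, the sketch below draws one frame size per picture from a layer-specific Gamma distribution. The parameter values are hypothetical placeholders and are not results from this paper. The sketch also draws frame sizes independently, so it does not reproduce the short range and long range dependencies discussed in the introduction.

```python
import random

# Hypothetical (shape, scale in bytes) Gamma parameters per prediction layer.
# Placeholders for illustration only, NOT values fitted in the paper.
EXAMPLE_PARAMS = {
    0: (4.0, 30000.0),
    1: (3.0, 8000.0),
    2: (3.0, 4000.0),
    3: (2.5, 2000.0),
}

# Prediction layer of each POC modulo 8, following the layer definition
# given for the size-8 cyclic structure.
LAYER_OF_OFFSET = [0, 3, 2, 3, 1, 3, 2, 3]

def synthetic_frame_sizes(num_frames, params=EXAMPLE_PARAMS, seed=None):
    """Draw independent Gamma-distributed frame sizes, one per POC."""
    rng = random.Random(seed)
    sizes = []
    for poc in range(num_frames):
        shape, scale = params[LAYER_OF_OFFSET[poc % 8]]
        sizes.append(rng.gammavariate(shape, scale))
    return sizes
```

In practice the placeholder parameters would be replaced by values estimated for each layer and bit rate of interest.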

Fig. 2. Temporal prediction structure and the POC values, decoding order, and RPS content for each picture.

Fig. 3. Average PSNR at different bitrates. PSNR (Y) is the average for all the frames, PSNR (Y, only I frame) is the average for the I frames, PSNR (Y, only B frame) is the average for the B frames. I frames have a higher PSNR than B frames.