JEG Reproducible Video Quality Analysis Framework

From VQEG JEG Wiki


Main idea

The work on this page presents a framework to facilitate reproducibility of research in video quality evaluation. Its initial version is built around the JEG-Hybrid database of HEVC-coded video sequences. The framework is modular and organized as a pipeline of activities, ranging from the tools needed to generate the whole database from reference signals to the analysis of the video quality measures already present in the database. Researchers can rerun, modify, and extend any module, starting from any point in the pipeline, while still obtaining fully reproducible results. Because some analyses are too computationally intensive to run on the whole database, the modular structure also makes it possible to work on subsets of it. For this purpose, the framework includes a software module that computes interesting subsets of the whole database in terms of coding conditions.

Software Architecture

The software is organized as a pipeline in which each component communicates with the next through directories: a component writes its output files to a directory that the following component reads as input. Each component is explained below.


Large scale encoding environment

The large-scale encoding environment performs the Hypothetical Reference Circuit (HRC) processing. At present, this module consists of the HEVC standardization reference encoding package, accompanied by a set of configuration files and scripts for reproducing or extending the first version of the HEVC large-scale database. Whenever the framework needs to be extended with more compression standards, other compression parameters, or network impairment simulations, this is the only block that has to be extended.

The correct creation or download of the large-scale database can be verified by SHA512 checksums available at: (
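The checksum verification can be sketched as follows. This is a minimal illustration, not the project's own tooling: it assumes the checksum list uses the common `sha512sum` text layout (one hex digest followed by a relative path per line), and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksum_list(text):
    """Parse lines of the ASSUMED form '<hex digest>  <relative path>'."""
    expected = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        digest, name = line.split(maxsplit=1)
        # sha512sum marks binary mode with a leading '*' on the file name.
        expected[name.lstrip("*")] = digest.lower()
    return expected


def verify_tree(root, expected):
    """Return the relative paths that are missing or whose digest differs."""
    failures = []
    for name, digest in sorted(expected.items()):
        target = Path(root) / name
        if not target.is_file() or sha512_of_file(target) != digest:
            failures.append(name)
    return failures
```

An empty list returned by `verify_tree` would indicate that every listed file exists and matches its digest.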

Subset selection

Research work becomes easier when a suitable set of HRCs is selected to represent the large-scale database. This component therefore consists of two subset selection algorithms. The first is optimized to cover different ranges of quality and bitrate. The second is optimized to select HRCs that behave differently across source contents. Both algorithms are demonstrated in the accompanying DSP paper.

Quality measure

The quality measurement component consists of a collection of full-reference quality measures such as Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), Multi-Scale Structural Similarity (MS-SSIM), and Visual Information Fidelity in the pixel domain (VIFp). These measures are integrated into this Reproducible Video Quality Analysis software package using the Video Quality Measurement Tool (VQMT) from Ecole Polytechnique Fédérale de Lausanne (EPFL).

Quality measure analysis

The quality measure analysis extracts the relevant data from the full-reference quality measurement database and processes it for analysis. In this particular work, we extracted frame-level MSE and PSNR values to compare the effect of temporal pooling when averaging either MSE or PSNR. The variance of the frame-level PSNR is also computed and made available to the visualization block. The module can easily be customized to handle other measures, whether already present in the database or supplied through files in the same format. Other indicators (e.g., moving averages) can also be included in the output for the visualization block.
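The two temporal pooling strategies can give different sequence-level values for the same frames. The sketch below illustrates this under stated assumptions: 8-bit video (peak value 255) and hypothetical function names that are not part of the package.

```python
import math


def psnr_from_mse(mse, peak=255.0):
    """PSNR in dB for a given MSE; peak=255 assumes 8-bit samples."""
    if mse <= 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)


def pool_by_mean_psnr(frame_mse, peak=255.0):
    """Convert each frame to PSNR first, then average the PSNR values."""
    return sum(psnr_from_mse(m, peak) for m in frame_mse) / len(frame_mse)


def pool_by_mean_mse(frame_mse, peak=255.0):
    """Average the frame MSE values first, then convert the mean to PSNR."""
    return psnr_from_mse(sum(frame_mse) / len(frame_mse), peak)


# Illustrative values only: two good frames and one heavily distorted frame.
frame_mse = [10.0, 10.0, 160.0]
print(pool_by_mean_psnr(frame_mse), pool_by_mean_mse(frame_mse))
```

Because the logarithm is concave, the mean of the per-frame PSNR values is never lower than the PSNR of the mean MSE, and the gap grows as the frame quality fluctuates more.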


Visualization

The visualization block is currently a set of gnuplot command files for visualizing the data produced in the previous step. In particular, they can automatically generate scatter plots and, with the aid of some custom-developed external modules, compute interpolation parameters for better visualization.

Running the software

The package contains scripts written in several languages; they can be executed on both Linux and Windows platforms.

Step 0: Clone the git repository

Use git to clone the following repository:

$ git clone

The software package contains two directories, `DATA' and `SoftwareLibraries'. The `DATA' directory contains subdirectories that hold the source data and the outputs produced by running the software found in `SoftwareLibraries'.
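Since the pipeline stages exchange data through these directories, a quick way to see how far a run has progressed is to count the files in each one. The following is a minimal sketch; the subdirectory names are the ones mentioned in the steps on this page, while the function name and the stage labels are hypothetical.

```python
from pathlib import Path

# Subdirectories of DATA named in the steps on this page.
STAGE_DIRS = {
    "source content": "DATA/SRC",
    "encoded bitstreams": "DATA/ENC",
    "quality measures": "DATA/QUAL",
    "selected subsets": "DATA/SUB_ENC",
}


def stage_status(repo_root):
    """Map each stage to the number of files it holds (None if the directory is missing)."""
    status = {}
    for stage, relative in STAGE_DIRS.items():
        directory = Path(repo_root) / relative
        if directory.is_dir():
            status[stage] = sum(1 for p in directory.rglob("*") if p.is_file())
        else:
            status[stage] = None
    return status
```

A stage reporting `None` or `0` has not produced output yet, which suggests where the pipeline should be resumed.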

Step 1: Large Scale Encoding Environment


First, place the source content in (DATA/SRC). The current scripts handle three resolutions (960x544, 1280x720, and 1920x1080).

In the first module (large-scale encoding environment), the file has to be run (SoftwareLibraries/ENC_SRC/ to generate and encode all the coding conditions ( is a Windows script). The `.265' and `.txt' output files are placed in (DATA/ENC). The `.265' file is the bitstream and the `.txt' file contains the encoding information.

Step 2: Quality measure


The quality measure module calculates the objective quality measures by running `SoftwareLibraries/'. The software first decodes the bitstream file (`.265' file) and then uses (SoftwareLibraries/VQMT_Binaries/VQMT.exe) to calculate the PSNR, SSIM, and VIF quality measures. It saves the per-frame and average outputs in separate files in the (DATA/QUAL) folder.

The final step in this module is to run `SoftwareLibraries/' to aggregate all the quality measures into two files: one keeps sequence-level records, and the other keeps both sequence-level and frame-level records.

Step 3 [Optional]: Subset selection


In order to work on a subset of HRCs, the optional module (subset selection) has to be executed.

The first step is to aggregate the input data for the MATLAB functions `getBitrateQualityDrivenHRCs.m' and `getContentDrivenHRCs.m', which can be found in (SoftwareLibraries/SUB_ENC). The input data takes the form of two matrices: the first contains the PSNR measures and the second contains the calculated bitrates. Both matrices have size MxN, where M is the total number of HRCs and N is the number of source contents. They can be aggregated from the `.txt' output files of the first module. This optional module saves `.txt' and `.csv' files containing the selected HRCs in the (DATA/SUB_ENC) folder.
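The MxN layout can be illustrated with a short sketch. It assumes the per-sequence results have already been parsed into (HRC, source, PSNR, bitrate) tuples; that record layout, the function names, and the use of `NaN` for missing cells are assumptions made for illustration and are not taken from the package.

```python
import csv
import io


def build_hrc_matrices(records):
    """Arrange (hrc, src, psnr, bitrate) tuples into two M x N nested lists.

    Rows follow the sorted HRC names and columns the sorted source names.
    Cells without a record stay None.
    """
    records = list(records)
    hrcs = sorted({r[0] for r in records})
    srcs = sorted({r[1] for r in records})
    row = {h: i for i, h in enumerate(hrcs)}
    col = {s: j for j, s in enumerate(srcs)}
    psnr = [[None] * len(srcs) for _ in hrcs]
    rate = [[None] * len(srcs) for _ in hrcs]
    for hrc, src, p, b in records:
        psnr[row[hrc]][col[src]] = p
        rate[row[hrc]][col[src]] = b
    return hrcs, srcs, psnr, rate


def matrix_to_csv(matrix):
    """Serialize a matrix as plain CSV text, writing missing cells as NaN."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for line in matrix:
        writer.writerow(["NaN" if v is None else v for v in line])
    return buffer.getvalue()
```

Keeping the row and column labels alongside the matrices makes it possible to map the row indices selected by the subset algorithms back to HRC names.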

Step 4: Quality measure analysis


In the quality measure analysis, first sequence-level metrics are computed. The database already contains the PSNR for each frame, whereas the MSE can be computed by reversing the PSNR formula through `'. Next, `' processes each HRC independently, automatically matching all the values (e.g., PSNR and MSE) related to the same HRC even if stored in different files (e.g., one per measure type) and computing the sequence-level indicators including the variance of the PSNR.
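Reversing the PSNR formula means solving PSNR = 10 log10(peak^2 / MSE) for the MSE. A minimal sketch of that step and of the sequence-level indicators is given below; it assumes 8-bit video (peak value 255) and uses the population variance, and the function names are hypothetical rather than those of the package scripts.

```python
import statistics


def mse_from_psnr(psnr_db, peak=255.0):
    """Invert PSNR = 10*log10(peak^2 / MSE); peak=255 assumes 8-bit samples."""
    return peak * peak / (10.0 ** (psnr_db / 10.0))


def sequence_indicators(frame_psnr, peak=255.0):
    """Compute sequence-level indicators from the frame-level PSNR values of one sequence."""
    frame_mse = [mse_from_psnr(p, peak) for p in frame_psnr]
    return {
        "mean_psnr": statistics.fmean(frame_psnr),
        "mean_mse": statistics.fmean(frame_mse),
        # Population variance is an assumption; a sample variance
        # (statistics.variance) would be an equally plausible choice.
        "psnr_variance": statistics.pvariance(frame_psnr),
    }
```

Note that frames with infinite PSNR (zero MSE) would need separate handling before computing the mean and the variance.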

Other utility software can perform HRC filtering operations (`'), cumulative distribution function (CDF) computation while retaining all the original information (`'), computation of the linear interpolation functions (`script\_linear\') and of the similarity of two point cloud distributions (`script\'), by assigning points in one graph to the nearest one in the other, then computing statistics such as average number of assigned points and average distance.
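The nearest-point comparison of two point clouds can be sketched as follows. This is an illustrative brute-force version and not the package script: the function name is hypothetical, the distance is assumed to be Euclidean, and "average number of assigned points" is interpreted here as the average over those target points that received at least one assignment.

```python
import math
from collections import Counter


def nearest_assignment_stats(cloud_a, cloud_b):
    """Assign each (x, y) point of cloud_a to its nearest point in cloud_b."""
    if not cloud_a or not cloud_b:
        raise ValueError("both point clouds must be non-empty")
    counts = Counter()
    total_distance = 0.0
    for ax, ay in cloud_a:
        best_index, best_distance = 0, math.inf
        for j, (bx, by) in enumerate(cloud_b):
            distance = math.hypot(ax - bx, ay - by)
            if distance < best_distance:  # ties keep the earliest target
                best_index, best_distance = j, distance
        counts[best_index] += 1
        total_distance += best_distance
    return {
        "mean_distance": total_distance / len(cloud_a),
        "targets_used": len(counts),
        "mean_assigned_per_used_target": len(cloud_a) / len(counts),
    }
```

In a scatter plot the two axes often have very different ranges (for example bitrate and PSNR), so normalizing each axis before computing distances may be advisable.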

Step 5: Visualization


In the visualization module, gnuplot command files are provided to directly generate all the types of scatter plots shown in the accompanying DSP paper, including the interpolating lines, for immediate comparison with new research results.