WP3: Hybrid Model

From VQEG JEG Wiki
Revision as of 09:25, 15 October 2013 by Nicolas Staelens (Talk | contribs)


Based on the results obtained from WP1: HRC Generation and WP2: HMIX Creation, a new architecture for a No-Reference Hybrid Model will be developed.


Overview of the JEG work and inputs to the hybrid model

[Image: Overview hybrid.JPG]

Overview on existing/possible models

As a first step, an overview of existing quality models should be compiled. The focus should be on no-reference (NR) and reduced-reference (RR) models.

Status: in progress (Shelley, Werner, Iñigo)
* Modelling Papers.doc
* Metrics Tabelle.xls
* database

Some interesting papers

Suggestions are welcome! Here are some papers:

  • A versatile model for packet loss visibility and its application to packet prioritization. Trans. Img. Proc. 19, 3 (Mar. 2010), 722-735. DOI=http://dx.doi.org/10.1109/TIP.2009.2038834
    • Authors: Lin, T., Kanumuri, S., Zhi, Y., Poole, D., Cosman, P. C., and Reibman, A. R.
    • Abstract: In this paper, we propose a generalized linear model for video packet loss visibility that is applicable to different group-of-picture structures. We develop the model using three subjective experiment data sets that span various encoding standards (H.264 and MPEG-2), group-of-picture structures, and decoder error concealment choices. We consider factors not only within a packet, but also in its vicinity, to account for possible temporal and spatial masking effects. We discover that the factors of scene cuts, camera motion, and reference distance are highly significant to the packet loss visibility. We apply our visibility model to packet prioritization for a video stream; when the network gets congested at an intermediate router, the router is able to decide which packets to drop such that visual quality of the video is minimally impacted. To show the effectiveness of our visibility model and its corresponding packet prioritization method, experiments are done to compare our perceptual-quality-based packet prioritization approach with existing Drop-Tail and Hint-Track-inspired cumulative-MSE-based prioritization methods. The result shows that our prioritization method produces videos of higher perceptual quality for different network conditions and group-of-picture structures. Our model was developed using data from high encoding-rate videos, and designed for high-quality video transported over a mostly reliable network; however, the experiments show the model is applicable to different encoding rates.
  • No-reference video quality monitoring for H.264/AVC coded video. Trans. Multi. 11, 5 (Aug. 2009), 932-946. DOI= http://dx.doi.org/10.1109/TMM.2009.2021785
    • Authors: Naccari, M., Tagliasacchi, M., and Tubaro, S.
    • Abstract: When video is transmitted over a packet-switched network, the sequence reconstructed at the receiver side might suffer from impairments introduced by packet losses, which can only be partially healed by the action of error concealment techniques. In this context we propose NORM (NO-Reference video quality Monitoring), an algorithm to assess the quality degradation of H.264/AVC video affected by channel errors. NORM works at the receiver side where both the original and the uncorrupted video content is unavailable. We explicitly account for distortion introduced by spatial and temporal error concealment together with the effect of temporal motion-compensation. NORM provides an estimate of the mean square error distortion at the macroblock level, showing good linear correlation (correlation coefficient greater than 0.80) with the distortion computed in full-reference mode. In addition, the estimate at the macroblock level can be successfully exploited by forward quality monitoring systems that compute quality objective metrics to predict mean opinion score (MOS) values. As a proof of concept, we feed the output of NORM to a reduced-reference quality monitoring system that computes an estimate of the structural similarity metric (SSIM) score, which is known to be well correlated with perceptual quality.
  • Real-time monitoring of video quality in IP networks. IEEE/ACM Trans. Netw. 16, 5 (Oct. 2008), 1052-1065. DOI= http://dx.doi.org/10.1109/TNET.2007.910617
    • Authors: Tao, S., Apostolopoulos, J., and Guérin, R.
    • Abstract: This paper investigates the problem of assessing the quality of video transmitted over IP networks. Our goal is to develop a methodology that is both reasonably accurate and simple enough to support the large-scale deployments that the increasing use of video over IP are likely to demand. For that purpose, we focus on developing an approach that is capable of mapping network statistics, e.g., packet losses, available from simple measurements, to the quality of video sequences reconstructed by receivers. A first step in that direction is a loss-distortion model that accounts for the impact of network losses on video quality, as a function of application-specific parameters such as video codec, loss recovery technique, coded bit rate, packetization, video characteristics, etc. The model, although accurate, is poorly suited to large-scale, on-line monitoring, because of its dependency on parameters that are difficult to estimate in real-time. As a result, we introduce a "relative quality" metric (rPSNR) that bypasses this problem by measuring video quality against a quality benchmark that the network is expected to provide. The approach offers a lightweight video quality monitoring solution that is suitable for large-scale deployments. We assess its feasibility and accuracy through extensive simulations and experiments.
  • Quality monitoring of video over a packet network. IEEE Transactions on Multimedia 6(2): 327-334 (2004)
    • Authors: Amy R. Reibman, Vinay A. Vaishampayan, Yegnaswamy Sermadevi
    • Abstract: We consider monitoring the quality of compressed video transmitted over a packet network from the perspective of a network service provider. Our focus is on no-reference methods, which do not access the original signal, and on evaluating the impact of packet losses on quality. We present three methods to estimate Mean Squared Error (MSE) due to packet losses directly from the video bitstream. NoParse uses only network-level measurements (like packet loss rate), QuickParse extracts the spatio-temporal extent of the impact of the loss, and FullParse extracts sequence-specific information including spatio-temporal activity and the effects of error propagation. Our simulation results with MPEG-2 video subjected to Transport Packet losses illustrate the performance possible using the three methods.
  • Analysis of Freely Available Subjective Dataset for HDTV including Coding and Transmission Distortions, Fifth International Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM–10), Scottsdale, Arizona, January 13–15, 2010.
    • Authors: Marcus Barkowsky, Margaret Pinson, Romuald Pépion, Patrick Le Callet
    • Abstract: We present the design, preparation, and analysis of a subjective experiment on typical HDTV sequences and scenarios. This experiment follows the guidelines of ITU and VQEG in order to obtain reproducible results. The careful selection of content and distortions extend over a wide and realistic range of typical transmission scenarios. Detailed statistical analysis provides important insight into the relationship between technical parameters of encoding, transmission and decoding and subjectively perceived video quality.
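The first paper above models packet loss visibility with a generalized linear model over factors such as scene cuts, camera motion, and reference distance, and uses it for packet prioritization. The sketch below illustrates that idea in a minimal form; the feature set and all coefficient values here are placeholders for illustration, not the fitted model from the paper.

```python
import math

# Hypothetical coefficients for a logistic (generalized linear) model of
# packet loss visibility. The real model's weights are fitted against
# subjective experiment data; these values are illustrative only.
COEFFS = {
    "intercept": -2.0,
    "scene_cut": 1.5,           # loss coincides with a scene cut
    "camera_motion": 0.8,       # global motion magnitude in the lost area
    "reference_distance": 0.3,  # frames until the next reference frame
}

def loss_visibility(scene_cut, camera_motion, reference_distance):
    """Estimate the probability that a given packet loss is visible."""
    z = (COEFFS["intercept"]
         + COEFFS["scene_cut"] * scene_cut
         + COEFFS["camera_motion"] * camera_motion
         + COEFFS["reference_distance"] * reference_distance)
    return 1.0 / (1.0 + math.exp(-z))  # logistic link function

def prioritize(packets):
    """Sort packets so the least-visible losses come first; a congested
    router would drop from the front of this ordering."""
    return sorted(packets, key=lambda p: loss_visibility(*p["features"]))
```

A congested router using this ordering drops the packets whose loss the model predicts viewers are least likely to notice, rather than dropping from the tail of the queue.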

Model discussion

Status: in progress (Shelley, Werner, Iñigo)

Possible input parameters

[Image: Hybrid parameters.PNG]

Ideas on how to combine different parameters or published models

Suggestions are welcome!

  • Idea nº1:
    • Estimate the quality of the video (compression distortions) from the quantization parameter (QP).
    • Correlate the bitstream parameters with the decoded video.
    • Track the transmission distortions at the macroblock level using motion vectors, and locate the affected pixels.
    • Compute the effect of the affected pixels on quality using the decoded video (frame freezing, frame skipping, texture, motion, visual attention).
    • Estimate a MOS from the temporal distribution of the compression and transmission distortions.
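The steps above can be sketched as a small pipeline. Everything here is a hypothetical placeholder: the QP-to-quality mapping, the one-step error-propagation rule, and the loss penalty factor would all have to be fitted against subjective data.

```python
def qp_quality(qp):
    """Map an H.264 quantization parameter (0-51) onto a 1-5 quality
    scale. The linear relation is a placeholder, not a fitted model."""
    return max(1.0, 5.0 - 4.0 * qp / 51.0)

def propagate_loss(affected, references):
    """One step of simplified error-propagation tracking: a macroblock
    becomes affected when the macroblock its motion vector references
    is itself affected."""
    return affected | {mb for mb, ref in references.items() if ref in affected}

def estimate_mos(per_frame_qp, per_frame_loss_ratio):
    """Pool per-frame compression quality and the fraction of pixels hit
    by transmission errors into a single MOS estimate, using a
    hypothetical penalty factor of 0.5 for fully affected frames."""
    scores = [qp_quality(qp) * (1.0 - 0.5 * ratio)
              for qp, ratio in zip(per_frame_qp, per_frame_loss_ratio)]
    return sum(scores) / len(scores)
```

The simple temporal average in `estimate_mos` is the weakest assumption: perceived quality is known to weight severe, recent impairments more heavily, so a fitted pooling function would likely replace it.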

Current model

As a starting point, the first version of the model (which only considers the quantization parameter when it is constant) can be downloaded here:


The implementation is explained in detail in Hybrid Model v1.

An updated version was made available by Patrik Hummelbrunner on 17/12/2011. It runs under Python 3.2, processes compressed HMIX files (hmix.gz), and outputs results in CSV format. It can be found in Hybrid Model v3.
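A minimal sketch of that kind of processing, reading a gzip-compressed file and writing CSV with the Python standard library, could look as follows. The assumed whitespace-separated field layout and the column names are guesses for illustration; consult the Hybrid Model v3 page for the actual HMIX format.

```python
import csv
import gzip

def hmix_to_csv(hmix_path, csv_path):
    """Read a gzip-compressed, whitespace-separated HMIX file and write
    its first two columns to a CSV file. The field layout assumed here
    is hypothetical."""
    with gzip.open(hmix_path, "rt") as src, \
         open(csv_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(["frame", "value"])  # assumed column names
        for line in src:
            fields = line.split()
            if len(fields) >= 2:
                writer.writerow(fields[:2])
```

Opening the gzip stream in text mode (`"rt"`) lets the loop iterate over decoded lines directly, so arbitrarily large HMIX files can be converted without loading them into memory.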

External contributions to the model

External contributions are welcome (including patented technology). Please contact the Joint Effort Group if you are interested.

Proposed Model

Input parameters


Status: todo

Subjective data

Status: todo


Any aspect of this work package can be discussed in this section of the JEG forum.