[opendtv] Re: Precision

  • From: Craig Birkmaier <craig@xxxxxxxxx>
  • To: opendtv@xxxxxxxxxxxxx
  • Date: Thu, 31 May 2007 09:29:54 -0400

At 4:14 PM -0400 5/28/07, Tom Barry wrote:
I'd prefer to think of practical (lossy) compression as removing:

1) Redundancy, as you stated

2) Information we don't care about enough to encode, say very high frequency information, and

Careful here. You are treading on thin ice, and some of the 1080P zealots may become upset with your logic.

Very high frequency information may be important as the screen size increases. On smaller screens these details are too small to see, even if they make it through the emission channel. As we increase the screen size, however, we may need these details to deliver what appears as a sharp picture.

Admittedly, these details are the first thing that gets quantized away when we compress for emission, but this does not make them less important to "some" viewers. A viewer with a 27" screen probably does not even care about HDTV - they just want a clean sharp SDTV picture.

For the vast majority of HDTV owners, 720P is sufficient to deliver sharp pictures, especially when the high frequency information is DELIVERED, not quantized away.

But those who have gone to the expense of building a really BIG SCREEN home theater system with 1080P resolution want the extra detail, and they DON'T want no stinking compression artifacts.

And don't bother trying to tell them that MOST OF THE TIME this extra detail will be left on the "compression room" floor. They bought a 1080P display and they know that compression technology is improving, so don't waste your effort trying to explain how stuff "really" works.

3) Related, but not the same as 2), information we don't know about or don't trust. This is information that was captured, but not reliably due to sampling error, noise, whatever. There is a point of diminishing returns on how many bits we can afford to spend encoding unreliable samples or extra bit depth once these things become lost in the noise.

Yup. Entropy is a bitch.

As Mark pointed out, however, it is an integral part of the sampling process. Most of the extra information in 1080P versus 720P lies in that borderline area where noise starts to compromise sample integrity (not to mention that most of these details are captured with limited contrast due to MTF considerations - that is, they are already inaccurate, though hopefully just attenuated, not completely wrong).

Cameras are already designed to deal with some of this. The designers know the frequencies at which useful details can be captured, and those above which noise makes the samples highly unreliable. So they design the cameras and downstream processing gear to smoothly roll off the response in the region where some useful details are captured, and to cut off everything above a certain frequency. Without the benefits of oversampling, the frequency response of a 1920 x 1080 camera extends only to about 22-24 MHz before noise overwhelms the high frequency details. Most cameras start to roll off the response just above 20 MHz.
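For what it's worth, here is a minimal numpy sketch of that kind of smooth roll-off - a windowed-sinc low-pass evaluated against a nominal 74.25 MHz HD luma sample clock. The ~20 MHz corner and the filter length are chosen purely for illustration, not taken from any real camera design:

import numpy as np

# Sketch of a smooth low-pass roll-off like the one described above.
# Assumptions (illustrative only): 74.25 MHz luma sample clock, response
# held flat to roughly 20 MHz, then rolled off well before Nyquist.
fs = 74.25e6          # nominal HD luma sample rate, Hz
cutoff = 20.0e6       # start of the roll-off region, Hz (assumed)
numtaps = 63          # filter length (assumed)

# Windowed-sinc FIR low-pass (Hamming window), normalized to unity gain at DC.
n = np.arange(numtaps) - (numtaps - 1) / 2.0
h = np.sinc(2.0 * cutoff / fs * n) * np.hamming(numtaps)
h /= h.sum()

# Inspect the magnitude response: flat through the useful detail,
# smoothly attenuating the noisy region above the corner.
H = np.abs(np.fft.rfft(h, 2048))
freqs = np.fft.rfftfreq(2048, d=1.0 / fs)
for f_mhz in (10, 20, 25, 30):
    i = np.argmin(np.abs(freqs - f_mhz * 1e6))
    print("%2d MHz: %6.1f dB" % (f_mhz, 20.0 * np.log10(H[i])))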


The best way to minimize the impact of noise and sampling dither is to OVERSAMPLE. When we resample to a smaller raster we filter out much of the entropy, improving the precision of all of the higher frequency details that are left. We also improve the contrast, which contributes to the perception of a sharper picture.
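A minimal numpy sketch of the noise side of that argument, under some obvious simplifications (uncorrelated Gaussian noise, a plain 2x2 box filter, a 2:1 resample rather than the exact 1080-to-720 ratio):

import numpy as np

rng = np.random.default_rng(0)

# A clean raster plus uncorrelated sensor noise (sigma assumed = 4 code values).
clean = rng.uniform(16, 235, size=(1080, 1920))
noisy = clean + rng.normal(0.0, 4.0, size=clean.shape)

# Filter-and-decimate by 2 in each direction with a simple 2x2 box average.
# (A real downconverter would use a better low-pass kernel than a box.)
def box_downsample(img):
    return (img[0::2, 0::2] + img[1::2, 0::2] +
            img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

# Averaging N uncorrelated samples scales the noise by 1/sqrt(N).
print("noise std before:", np.std(noisy - clean))                                   # ~4.0
print("noise std after :", np.std(box_downsample(noisy) - box_downsample(clean)))   # ~2.0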


And then we need to acknowledge real world encoding practices, and the techniques that are used when the peak bit rate requirements exceed the channel bandwidth. We can:

Let the encoder do the best it can, replacing real image detail with quantization noise and, in the extreme, blocking artifacts.

OR

Pre-filter the source to reduce the amount of high frequency detail that is presented to the encoder. This is more insidious than resampling to a lower resolution for emission, as we reduce the information but keep all of the encoding overhead of the higher resolution format. I have heard experts say that for the 1080 line formats, up to half of the available bits can be consumed just by the transmission of motion vectors when the encoder is stressed.
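Here is a back-of-the-envelope sketch of the overhead point, assuming classic 16x16 macroblocks and at least one motion vector per coded block (H.264 can partition blocks further, so this understates the gap): a pre-filtered 1080 line frame still has to signal every block and vector of the full raster, while a 720 line downconvert signals far fewer.

# Back-of-the-envelope overhead comparison. Assumes classic 16x16
# macroblocks and at least one motion vector per coded block.
def macroblocks(width, height, block=16):
    # ceiling division: 1080 lines are coded as 68 block rows (1088 lines)
    return (-(-width // block)) * (-(-height // block))

mb_1080 = macroblocks(1920, 1080)   # 120 * 68 = 8160
mb_720  = macroblocks(1280, 720)    #  80 * 45 = 3600

print("macroblocks per 1080-line frame:", mb_1080)
print("macroblocks per  720-line frame:", mb_720)
print("pre-filtered 1080 still signals %.2fx the blocks and vectors of a 720 downconvert"
      % (mb_1080 / float(mb_720)))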

Once again, however, it is important to recognize that a small percentage of viewers have 1080P displays that need this extra detail. So we need to at least pretend that this stuff is making it through the emission channel to keep them satisfied. Who cares if the program breaks up into blocks on occasion and the delivered resolution modulates with the encoding stress...

At least we are delivering the very best HDTV possible...

;-(


For 2) and 3) however it may be best to not filter or discard them but instead allow the encoder to opportunistically choose whichever values happen to encode nicest. That's one of the reasons I think capturing at much higher bit depths and allowing encoders to quantize them away seems, in some tests, to work more efficiently than some might predict.


I've never seen such a test. Unfortunately, encoders are not smart enough to encode what looks nicest - even with H.264. They run algorithms that are VERY limited in terms of the decisions that can be made. H.264 introduced two features that help a little - one improves the decisions, the other masks the mistakes. The frequency domain transform can select from a range of choices based on the content of a transform block; it can be weighted for increased H or V detail and to deal with several types of gradients. The deblocking filter helps to mask the artifacts when a block is over-quantized.

The reality of how compression works is that there is a compression range where only the highest frequencies are quantized away; the actual samples are replaced with correlated noise. As long as we operate in this range, the pictures look very good. But there are several issues that cause severe problems.
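As a toy illustration of that operating range, here is an orthonormal 8x8 DCT in plain numpy; the block content and the frequency-dependent quantization ladder are made up for the example, and real codecs use integer transforms and standard matrices:

import numpy as np

def dct_basis(n=8):
    # Orthonormal DCT-II matrix: forward Y = C @ X @ C.T, inverse X = C.T @ Y @ C
    k, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2.0 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

C = dct_basis()
rng = np.random.default_rng(1)

# An 8x8 block of "easy" content: a gentle gradient plus a little fine texture.
x = np.linspace(100, 140, 8)[None, :] * np.ones((8, 1)) + rng.normal(0, 2, (8, 8))

# Quantization step that grows with spatial frequency (illustrative ladder only).
u, v = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
qstep = 2 + 3 * (u + v)

coeff = C @ x @ C.T
coeff_q = np.round(coeff / qstep) * qstep
x_hat = C.T @ coeff_q @ C

# The low-frequency structure survives; the error is small and noise-like.
print("coefficients zeroed:", int(np.sum((coeff_q == 0) & (coeff != 0))))
print("max error:", float(np.max(np.abs(x - x_hat))))
print("rms error:", float(np.sqrt(np.mean((x - x_hat) ** 2))))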

One is high frequency edge information, which we recently established as being critical to human visual perception. Unfortunately, when an encoding block contains this edge information, the transform typically produces a long run of significant coefficients to represent the edge. When we quantize these coefficients we introduce distortions in the edge, which are typically seen as ringing or noise around the edge. This is particularly bothersome for text overlays. We simply cannot quantize too much, or the distortions become a major problem - like black pixels adjacent to white pixels - which violates sampling theory and in turn makes the artifacts easier to detect.
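And the same toy transform applied to a block containing a hard edge shows the ringing problem; again the coarse quantization step is an illustrative assumption, not any codec's actual behavior:

import numpy as np

def dct_basis(n=8):
    # Same orthonormal DCT-II matrix as in the previous sketch.
    k, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2.0 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

C = dct_basis()

# An 8x8 block with a hard vertical edge: a white field meeting a dark stroke.
edge = np.full((8, 8), 235.0)
edge[:, 4:] = 16.0

# The edge spreads energy across many horizontal-frequency coefficients.
coeff = C @ edge @ C.T

# Quantize everything coarsely, as a stressed encoder would (step is assumed).
qstep = 40.0
rec = C.T @ (np.round(coeff / qstep) * qstep) @ C

# The reconstructed rows over- and undershoot around the edge: ringing.
print("original row:     ", edge[0].round(1))
print("reconstructed row:", rec[0].round(1))
print("overshoot above 235:", round(float(rec.max() - 235.0), 1))
print("undershoot below 16:", round(float(rec.min() - 16.0), 1))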

Adding bit depth allows us to limit the range of sampling dither. Adding resolution allows us to localize the impact of the frequency transform. But both of these things increase the overhead that must be dedicated to delivering the compressed pictures - i.e. more encoding blocks, more motion vectors, more bits per sample, etc.
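To put rough numbers on that overhead, here is some straight arithmetic on uncompressed 4:2:0 payloads; the 60 Hz progressive frame rate is an assumption for the comparison:

# Uncompressed 4:2:0 payload (luma plus two quarter-rate chroma planes = 1.5
# samples per pixel), as a rough proxy for how much work the encoder faces.
def raw_mbps(width, height, fps, bits_per_sample):
    samples_per_sec = width * height * 1.5 * fps
    return samples_per_sec * bits_per_sample / 1e6

for label, w, h, bits in (("720p/8-bit",   1280,  720,  8),
                          ("1080p/8-bit",  1920, 1080,  8),
                          ("1080p/10-bit", 1920, 1080, 10)):
    print("%-13s %6.0f Mb/s uncompressed" % (label, raw_mbps(w, h, 60, bits)))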

The BEST way to deal with this is to resample to a lower resolution raster. We get more accurate samples - less entropy - and there is less encoding overhead, which translates into higher quality samples delivered to the decoder.

Regards
Craig

