At 8:23 AM -0500 1/19/06, John Shutt wrote:
>Is there a Moore's Law regarding codec efficiency, or is there a theoretical
>limit? I mean it seems to be impossible to represent an entire 1920x1080
>frame with a single bit (unless the entire screen is monotone), so there
>must be a theoretical limit as to how much you can compress an image and
>still have it be a practical display.

An excellent question John. With your permission, I may incorporate it into my column on video compression for the March issue of BE. And I am looking forward to other responses: this could be an interesting thread!

Moore's Law is certainly a factor, as it provides some indication of what we can expect in terms of computational resources for video compression algorithms in the future. Equally important, it can help us predict the resources that will be available in low-cost consumer appliances in the future.

As a starting point, it is important to look at both ends of the problem: how does the ongoing progression in computational power affect the encoding of content, and how does it affect the decoding of content? The philosophy behind most compression codecs today is that the encoder can be very complex, but it must produce a bitstream that can be decoded by devices of much lower complexity.

It is helpful to look at the overall resources and complexity of modern set-top boxes and the image processing engines in integrated appliances. This extends well beyond the resources that are available for video decoding. For example, a few years ago most of these products had very limited support for local graphics and run-time engines for Java and other applications needed to deliver enhanced services. Today we are seeing the same graphics engines (GPUs) that are designed into PCs making their way into STBs and integrated receivers. The point I am trying to make is that the problem is much larger than just encoding audio and video streams.
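To put rough numbers on John's question, here is a back-of-the-envelope sketch (my own arithmetic, not from any spec; the 19.39 Mbps payload and 30 fps figures are the familiar ATSC numbers) of how hard a 1080p frame must be squeezed just to fit a broadcast channel, before any theoretical limit even enters the picture:

```python
# Back-of-the-envelope: how much must an encoder squeeze a raw 1080p frame?
width, height, bits_per_pixel = 1920, 1080, 24   # raw 8-bit RGB, no subsampling
raw_bits = width * height * bits_per_pixel        # ~49.8 Mbit per frame

channel_bps = 19_390_000   # full ATSC 8-VSB payload, ~19.39 Mbps
frame_rate = 30            # frames per second
budget_bits = channel_bps / frame_rate            # ~0.65 Mbit per frame

ratio = raw_bits / budget_bits
print(f"raw: {raw_bits / 1e6:.1f} Mbit, budget: {budget_bits / 1e6:.2f} Mbit, "
      f"ratio: {ratio:.0f}:1")
# prints: raw: 49.8 Mbit, budget: 0.65 Mbit, ratio: 77:1
```

Roughly 77:1 just to devote an entire channel to one program, and far more in a multi-program multiplex. Shannon puts a floor under this for a given fidelity, but as the rest of this post argues, better prediction keeps moving us toward that floor.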
We are moving into an era where the "receiver" will be used for localization and customization of the content that we view; thus decoder and local image processing complexity will increase significantly. As this happens it opens up new possibilities for the ways in which content is encoded.

MPEG-4 provides an excellent example. Not Part 2 (the original video codec), or Part 10 (AVC/H.264), but rather the entire specification. We have discussed recently the notion of picture elements not being included (e.g. the ball in a football match). But we have not spent much time talking about the fact that the MPEG-4 spec can achieve huge gains in compression efficiency by dealing with picture objects that are composited in the receiver. This aspect of MPEG-4 has not been exploited, in part because of the computational complexity for a receiver, and in part because of the complexity of extracting the objects from a "flattened composition," otherwise known as a finished linear video program.

But much of what is needed to exploit the object composition model for MPEG-4 already exists in the production systems we use today. If we keep track of all of the program elements (video, audio, graphics, 3D, etc.) and use the metadata created by the NLE/compositing systems, we have virtually everything needed to produce an MPEG-4 composition. This can be as simple as telling a receiver to fade the video stream to black, or to cross-dissolve to a new video stream; both of these simple production techniques wreak havoc with pixel-based video encoding systems.

Jeroen has provided many glimpses inside the work being done by Philips to enhance the presentation of video on modern display systems. I'm certain he could tell us many tales about Natural Motion, and the computational complexity behind frame rate conversions. So a good way to look at the problem of video encoding is to consider all of the potential paths that exist to predict what future frames will look like.
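A toy illustration of why even a simple fade is painful for a pixel-based codec but trivial in a composition model. Everything here is invented for the example (the 4x4 "frame", the gain values, the instruction dict); it is a sketch of the idea, not any real MPEG-4 syntax:

```python
# A tiny 4x4 "frame" of identical luma values, standing in for a video frame.
frame = [[200] * 4 for _ in range(4)]

# Pixel-based view: one step of a fade-to-black changes EVERY pixel, so a
# block-matching encoder finds no unchanged blocks and must code a dense
# residual for every block of every frame of the fade.
def fade_residual(frame, gain_prev, gain_next):
    return [[int(p * gain_next) - int(p * gain_prev) for p in row]
            for row in frame]

residual = fade_residual(frame, 1.0, 0.8)
changed = sum(1 for row in residual for d in row if d != 0)
print(changed)  # prints: 16 -- all 16 pixels differ between the two frames

# Composition-based view (MPEG-4 style): the receiver already holds the video
# object, so the encoder sends one parameter per frame -- "scale this object
# by gain g" -- instead of per-pixel residuals.
fade_instruction = {"object": "video0", "gain": 0.8}  # hypothetical format
```

One scalar per frame versus a residual for every block: that is the kind of gain the object composition model offers, at the cost of receiver complexity.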
Prediction is the biggest leverage we have in terms of gaining compression efficiency. Mark Schubin will tell you that even with virtually unlimited computational resources, we still have a difficult time building a "transparent" video standards converter, and de-interlacing algorithms still have a difficult time predicting what the information lost to interlaced acquisition looks like.

MPEG-2 and AVC are still VERY CRUDE in terms of the routines that are used for motion compensated prediction. The reality behind AVC is that it simply provides better granularity for many of the block matching tools used in MPEG-2. For example, we have more control over the positioning of blocks and the precision of motion vectors. We have better ways of representing and quantizing the information inside the blocks. And we have new tools to mask errors.

With few exceptions, we have not even begun to explore real motion compensated prediction in compression algorithms, and for good reason: computational complexity. MPEG-2 and MPEG-4 do not identify and track objects; they just try to find the most efficient block matches, which may (or may not) have any relationship to the actual objects and their motion.

Perhaps the next big step will be to do real motion compensated prediction, but this approach is incredibly complex. We capture images on a 2D image plane, but the objects exist in 3D space. Thus simple issues like an object moving closer to or farther from the camera make good motion compensated prediction more difficult. Now add plastic deformations and reflections into the mix and the calculations go through the roof. How do you predict what a running back looks like in 3D, or how he is deformed by a 250 pound linebacker? How do you deal with reflections from 3D objects and surfaces, when the information in the reflections is also changing?

In short, as with Moore's Law, we are nowhere near the theoretical limits for improvement.
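The "crude" block matching described above can be sketched in a few lines. This is a minimal full-search estimator using the sum of absolute differences (SAD); real encoders add sub-pel interpolation, variable block sizes, rate-distortion weighting and fast search patterns, but the core idea is just this exhaustive match (the 8x8 test image is invented for the example):

```python
# Sum of absolute differences between a block in the current frame and a
# candidate block displaced by (dx, dy) in the reference frame.
def sad(ref, cur, bx, by, dx, dy, bs):
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(bs) for x in range(bs))

# Exhaustive search over a +/- search window; returns the cheapest vector.
def best_vector(ref, cur, bx, by, bs, search):
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), sad(ref, cur, bx, by, 0, 0, bs)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if 0 <= by + dy and by + dy + bs <= h and \
               0 <= bx + dx and bx + dx + bs <= w:
                cost = sad(ref, cur, bx, by, dx, dy, bs)
                if cost < best_cost:
                    best, best_cost = (dx, dy), cost
    return best, best_cost

# 8x8 test frames: a bright 2x2 patch moves two pixels to the right.
ref = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
for y in (2, 3):
    for x in (2, 3):
        ref[y][x] = 255      # patch at columns 2-3 in the reference
    for x in (4, 5):
        cur[y][x] = 255      # patch at columns 4-5 in the current frame

mv, cost = best_vector(ref, cur, 4, 2, 2, 2)
print(mv, cost)  # prints: (-2, 0) 0 -- the vector points back to the match
```

Note that the search never asks what the patch *is* — it only asks where the cheapest pixel match lies, which is exactly why the vectors may or may not correspond to real object motion.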
The reality is that each step in the Moore's Law progression enables us to add refinements to compression algorithms, and to add more resources in the decoder that enable new ways to create and encode content. So the real issue is how to build evolution into the standards that we use to deliver digital services to the masses.

We are now seeing that many DTV deployments - based on standards that are now a decade or more old - are unable to deal with extensibility. If we add AVC to ATSC or DVB, the existing deployed receivers must be replaced to work with the new services. As we move to more programmable receivers, we may be able to extend their useful life by several years, but periodic upgrades are going to be a fact of life, which is probably the MOST COMPELLING argument for keeping the receiver/image processor separate from an expensive big-screen monitor. Obviously we will also have integrated products, and these products may have a very limited run before they are upgraded. Consider what Apple has done with the iPod - old iPods are not rendered obsolete, but new capabilities are constantly being added, providing consumers with an incentive to upgrade.

In the emerging digital world, it is not the standards that are the primary drivers - it is the services that can be delivered, and the perceived value of these services to consumers. In the early days of the PC revolution one could justify replacing the CPU every 18-24 months based on productivity alone. Now a PC can remain useful for 5 years or more; to motivate upgrades, new PCs must deliver new services - hence the interest in the family room.

>If so, then how far away from that theoretical limit is MPEG4/AVC? Is
>MPEG4/AVC to the point that it really could be a standard that could last
>for 20 years?

How long a standard lasts is not the real issue here. JPEG was standardized around 1989.
It has undergone several updates, and the JPEG 2000 standard has almost nothing in common with the original algorithm. But we will expect appliances to deal with the original JPEG standard for many decades, even as we use newer algorithms to encode still images in the future.

The real issue is extensibility - building upon what has come before. The problem comes when we deploy closed systems with no provisions for extensibility, as has been the case for virtually all of the first generation of DTV standards. This problem is mitigated in part by keeping the volatile components separate from the less volatile ones, and keeping them cheap. I doubt that many people in the U.K. will be upset about buying a new Freeview receiver when they buy an HDTV and want to view HD content. The old box will continue to function on the old TV, at least until the decision is made to stop using MPEG-2.

>Personally, I have no quarrel with Europe's "problem" about obsolete MPEG2
>receivers. They rolled out digital using very inexpensive boxes, and can
>slowly starve them of bits to make room for AVC simulcasts in HD. Just as
>DVB-T allows an almost continuous sliding scale of bitrates vs. robustness,
>there is an inherent sliding scale of SD quality vs. HD quality and/or
>number of HD services.

Yup! They have taken a pragmatic approach, even as they have made concessions to the broadcast community. Most broadcasters in Europe had upgraded to digital SD before the launch of DTV. They understood that the public would see a significant improvement in picture quality without having to force everyone to buy a new TV. And now they are prepared to take advantage of the cost reductions and improvements in technology as they launch HDTV services.

Unfortunately, in the U.S. the broadcasters attempted, for political reasons, to leapfrog a generation. The result has been a very slow start and a system that is already out of date. This was ENTIRELY predictable.
I think it is foolish for anyone in the television business to think in terms of locking down technology for extended periods of time. The right approach is to design in extensibility, and decide when the time has come to start over again. My guess is that we can expect no more than 10-15 years from a product before it will be more efficient (cheaper) to start over.

>Only those who need HD will have to replace their tuners, and in many cases
>the tuner will be built into the display or else what is another $300US on
>top of a $2,000US HD display?

Why $300? Bert tells us that we can build complete ATSC receivers with HD for less than $100. It should be even less for DVB receivers.

>Australia got the worst of it by allowing SD only boxes to be sold, but
>still demanding that MPEG2 HD also be used. Perhaps their HD penetration is
>so low that they could allow an HD switch to MPEG4 and compensate those few
>HD adopters. Then they would be back in harmony with the Old Country.

No, we got the worst of it by trying to deploy HDTV too soon. If you have a receiver or an HD-capable monitor that is more than a year old, it probably will not work (for HD) in a few years. But take heart, it will still deliver 480p quality.

Regards
Craig

----------------------------------------------------------------------
You can UNSUBSCRIBE from the OpenDTV list in two ways:

- Using the UNSUBSCRIBE command in your user configuration settings at FreeLists.org

- By sending a message to: opendtv-request@xxxxxxxxxxxxx with the word unsubscribe in the subject line.