Welcome to SeeMoreDigital.net

This site is dedicated to Audio and Video encoding, together with the formats used.


 

Why I don't follow the ITU-R BT.601 specification for MPEG-2 to MPEG-4 conversions

Square Pixelled Images

 

A typical 720/1080 high-definition 16:9 shaped image is made up of square pixels.

 

Meaning, a square pixelled 720 image contains 1280x720 (921,600 total) pixels. And a square pixelled 1080 image contains 1920x1080 (2,073,600 total) pixels.

 

Following the same logic, a square pixelled 576 (PAL) image should contain 1024x576 (589,824 total) pixels, and a square pixelled 480 (NTSC) image should contain 853.3333x480 (409,600 total) pixels.
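As a quick check of the figures above, here is a minimal Python sketch that derives the square-pixel width of a 16:9 frame from its height. The function name `square_pixel_width` is my own invention for illustration, not something from a standard or tool.

```python
from fractions import Fraction

def square_pixel_width(height, dar=Fraction(16, 9)):
    """Width, in square pixels, of a frame with this height and display aspect ratio."""
    return height * dar

# The four 16:9 frame heights discussed above.
for height in (1080, 720, 576, 480):
    width = square_pixel_width(height)
    total = width * height
    print(f"{height}: {float(width):.4f} x {height} = {float(total):,.0f} total pixels")
```

Using exact fractions shows that the 480-line case is the only one that does not land on a whole number of pixels (2560/3).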

Anamorphically Pixelled Images (with PAR signalling)

 

Way back in the 1980s and 1990s, when computer processing power was far more limited than it is today, creating standard definition images with square pixels was not possible. Hence, the non-square or "anamorphic" pixel image was born!

 

All standard definition MPEG-1/2 video streams, such as DVD, DVB-S/T/C and even VCD, are distributed using anamorphic pixels. So in order for these anamorphic images to look normal when viewed, a method was developed to convert non-square pixelled images to square pixelled images in real-time. This required embedding "Aspect Ratio Signalling" (ARS) information into the video bit-stream, which is analysed and decoded by the media player/receiver during playback.

 

MPEG-2 DVD and DVB anamorphic images are supposed to follow the ITU-R BT.601 standard. However, it would appear such calculations are based on 704 pixels and not 720 pixels.

 

To show you what I mean, please refer to the following:

 

For a 4:3 PAL image, the PAR value is 12/11, which is 1.0909090 in decimal.

Meaning: 1.0909090 x 704 = 767.99993. Which is effectively, 768 pixels.

And: 768/576 creates a perfect 1.33:1 aka 4:3 frame

 

For a 16:9 PAL image, the PAR value is 16/11, which is 1.4545454 in decimal.

Meaning: 1.4545454 x 704 = 1023.9999. Which is effectively, 1024 pixels.

And: 1024/576 creates a perfect 1.77:1 aka 16:9 frame.

 

For a 4:3 NTSC image, the PAR value is 10/11, which is 0.9090909 in decimal.

Meaning: 0.9090909 x 704 = 639.99999. Which is effectively, 640 pixels.

And: 640/480 creates a perfect 1.33:1 aka 4:3 frame.

 

For a 16:9 NTSC image, the PAR value is 40/33, which is 1.2121212 in decimal.

Meaning: 1.2121212 x 704 = 853.33332. Which is effectively, 853.3(r) pixels.

And: 853.3/480 creates a perfect 1.77:1 aka 16:9 frame.

 

Without doubt, the above calculations generate accurate "corrected" 4:3 and 16:9 frames during playback. But look what happens when you substitute 704 pixels with 720 pixels in the same calculation. Suddenly...

 

For a 4:3 PAL image, the PAR value is 12/11, which is 1.0909090 in decimal.

Meaning: 1.0909090 x 720 = 785.45448 pixels.

And: 785.45448/576 creates a 1.3636363:1 aka 15:11 frame!

 

For a 16:9 PAL image, the PAR value is 16/11, which is 1.4545454 in decimal.

Meaning: 1.4545454 x 720 = 1047.2727 pixels.

And: 1047.2727/576 creates a 1.8181818:1 aka 20:11 frame!

 

For a 4:3 NTSC image, the PAR value is 10/11, which is 0.9090909 in decimal.

Meaning: 0.9090909 x 720 = 654.54544 pixels.

And: 654.54544/480 creates a 1.3636363:1 aka 15:11 frame!

 

For a 16:9 NTSC image, the PAR value is 40/33, which is 1.2121212 in decimal.

Meaning: 1.2121212 x 720 = 872.72727 pixels.

And: 872.72727/480 creates a 1.8181818:1 aka 20:11 frame!
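The two sets of sums above can be reproduced exactly with Python's `fractions` module, which avoids the rounding in the decimal figures. This is only a sketch: the table `ITU_PAR` and the function `display_aspect` are names I made up for illustration, and 40/33 is used for 16:9 NTSC because it is the fraction whose decimal form is 1.2121212.

```python
from fractions import Fraction

# PAR values quoted above as exact fractions, paired with the frame height.
ITU_PAR = {
    "4:3 PAL": (Fraction(12, 11), 576),
    "16:9 PAL": (Fraction(16, 11), 576),
    "4:3 NTSC": (Fraction(10, 11), 480),
    "16:9 NTSC": (Fraction(40, 33), 480),
}

def display_aspect(par, stored_width, height):
    """Display aspect ratio after stretching stored_width pixels by par."""
    return par * stored_width / height

for name, (par, height) in ITU_PAR.items():
    for stored_width in (704, 720):
        dar = display_aspect(par, stored_width, height)
        print(f"{name}, {stored_width} wide: "
              f"{dar.numerator}:{dar.denominator} ({float(dar):.7f}:1)")
```

With 704 pixels every case reduces to 4:3 or 16:9, and with 720 pixels every case reduces to 15:11 or 20:11.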

 

As you can see, simply substituting 704 pixels with 720 pixels is not going to generate encodes that make any sense to your media player... I mean, who's ever heard of frames with aspect ratios of 15:11 or 20:11?

At this stage it may be worth pointing out that no software media player (that I know of) adheres to the above-mentioned specification.

The Logical PAR/DAR Approach to Anamorphically Pixelled Images

 

Yep, what we want are frames containing 720 pixels with aspect ratios of 4:3 and 16:9, just like we have with our DVDs! But don't despair, because this is why I and many other MPEG-4 encoders use the following "custom" ARS calculations:

 

For a 4:3 PAL image, the "custom" PAR value is 16/15, which is 1.0666666 in decimal.

Meaning: 1.0666666 x 720 = 767.99995. Which is effectively: 768 pixels.

And: 768/576 creates a perfect 1.33:1 aka 4:3 frame.

 

For a 16:9 PAL image, the "custom" PAR value is 64/45, which is 1.4222222 in decimal.

Meaning: 1.4222222 x 720 = 1023.9999. Which is effectively: 1024 pixels.

And: 1024/576 creates a perfect 1.77:1 aka 16:9 frame.

 

For a 4:3 NTSC image, the "custom" PAR value is 8/9, which is 0.8888888 in decimal.

Meaning: 0.8888888 x 720 = 639.99999. Which is effectively: 640 pixels.

And: 640/480 creates a perfect 1.33:1 aka 4:3 frame.

 

For a 16:9 NTSC image, the "custom" PAR value is 32/27, which is 1.1851851 in decimal.

Meaning: 1.1851851 x 720 = 853.33327. Which is effectively: 853.3(r) pixels.

And: 853.3/480 creates a perfect 1.77:1 aka 16:9 frame.
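The "custom" values are not arbitrary: each is simply the PAR that makes a full 720-pixel-wide frame display at exactly the target aspect ratio, i.e. PAR = DAR x height / width. The sketch below works them out as exact fractions. The function name `custom_par` is hypothetical, and the fractions it produces (16/15, 64/45, 8/9 and 32/27) are the ones matching the decimals 1.0666666, 1.4222222, 0.8888888 and 1.1851851 used above.

```python
from fractions import Fraction

def custom_par(dar, stored_width, height):
    """PAR that makes a stored_width x height frame display at exactly dar."""
    return dar * height / stored_width

CASES = (
    ("4:3 PAL", Fraction(4, 3), 576),
    ("16:9 PAL", Fraction(16, 9), 576),
    ("4:3 NTSC", Fraction(4, 3), 480),
    ("16:9 NTSC", Fraction(16, 9), 480),
)

for name, dar, height in CASES:
    par = custom_par(dar, 720, height)
    display_width = par * 720  # width once the pixels are stretched to square
    print(f"{name}: PAR {par} = {float(par):.7f}, "
          f"display width {float(display_width):.4f}")
```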

The ITU-R BT.601 standard die-hards

 

Sadly, we'll have to accept that the "ITU spec" die-hards will never accept the "logical" approach.

 

In reality, if a large circle were plonked right in the middle of an anamorphic frame and encoded using both approaches, both would appear perfectly round. This is because most, if not all, human brains are not capable of telling the difference!
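To put a number on how small that difference is, the sketch below divides each ITU-derived PAR by its "custom" counterpart. The names `PAR_PAIRS` and `width_ratio` are my own, for illustration only. In every case the ratio comes out as 45/44 (that is, 720/704), so the same 720-pixel frame is shown roughly 2.3% wider under the ITU values.

```python
from fractions import Fraction

# (ITU-derived PAR, "custom" 720-pixel PAR) for each format discussed above.
PAR_PAIRS = {
    "4:3 PAL": (Fraction(12, 11), Fraction(16, 15)),
    "16:9 PAL": (Fraction(16, 11), Fraction(64, 45)),
    "4:3 NTSC": (Fraction(10, 11), Fraction(8, 9)),
    "16:9 NTSC": (Fraction(40, 33), Fraction(32, 27)),
}

def width_ratio(itu_par, custom_par):
    """How much wider the same frame is displayed with the ITU PAR."""
    return itu_par / custom_par

for name, (itu_par, custom_par) in PAR_PAIRS.items():
    ratio = width_ratio(itu_par, custom_par)
    print(f"{name}: {ratio} ({float(ratio - 1):.2%} wider)")
```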


Last Updated

Saturday 25 July 2009