|
Welcome to SeeMoreDigital.net This site is dedicated to Audio and Video encoding, together with the formats used. |
|
Simple Profile Settings
A typical Mpeg4 encoded video consists of Intra-frames (I-frames) and Predicted-frames (P-frames). More recent Mpeg4 encoders also allow the user to encode using Bi-directional frames (B-frames).
I-frames
An I-frame is encoded using only information from within its own frame. It does not use any information from other frames (ie no temporal compression). An I-frame is similar in concept to a single frame encoded with, say, JPEG.
Technically speaking, an I-frame is one in which all of the macro-blocks are stored as images rather than as motion vectors. Encoding a collection of pixels in the form of a static block is the most expensive method in terms of storage and hence I-frames are the most expensive type of frame.
Fewer I-frames in the video generally translates to better compression, and the encoder normally inserts an I-frame only when too few blocks can be tracked from the reference frame by the motion search algorithm. I-frames nevertheless serve a very important purpose: because all of the blocks in an I-frame are stored as images, decoding an I-frame reveals a complete picture without any dependency on reference frames.
For this reason I-frames are also known as key-frames, and they are the only type of frame completely independent of all others.
P-frames
P-frames are forward predicted and may use either an I-frame or a P-frame as a point of reference. Each one is encoded from the frame that precedes it. In any video sequence a group of frames will share much of the same image content. For example, if you watch a news presenter, you will notice that the area (scene) behind the presenter stays almost identical in every frame. So instead of encoding each frame as a totally new frame (remembering that PAL video runs at 25 frames per second), you can exploit the redundancy between frames by using P-frames.
Essentially a P-frame records where each block of the previous frame has moved to in the current frame. So instead of spatially encoding the frame (as with JPEG) the P-frame just says: "Hey, the block in the previous frame has moved to location (X,Y)", which requires much less data than encoding each frame spatially. Only the differences between frames are stored, which is more efficient than storing another complete I-frame.
Technically speaking, a P-frame is one which can contain blocks that have been forward predicted via a motion vector from the previous frame by the motion search. It is normally unlikely that all of the blocks in a P-frame can be predicted, and where a block cannot be tracked from the previous frame, an intra-block is used in its place, similar to those found in an I-frame.
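As a rough illustration of the motion search and intra-block fallback described above, here is a minimal Python sketch. It is not taken from any real encoder: the function names, the 4x4 block size, the search range and the `max_sad` threshold are all assumptions chosen to keep the example small. Frames are plain lists of pixel rows, and the match cost is the sum of absolute differences (SAD).

```python
def sad(prev, cur, px, py, cx, cy, size):
    """Sum of absolute differences between the block at (px, py) in the
    previous frame and the block at (cx, cy) in the current frame."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(prev[py + dy][px + dx] - cur[cy + dy][cx + dx])
    return total


def predict_block(prev, cur, cx, cy, size=4, search=2, max_sad=8):
    """Search the previous frame around (cx, cy) for the best match.

    Returns ("inter", (mvx, mvy)) when the best match is good enough,
    where the vector points from the current block position to the
    matching block in the previous frame. Otherwise returns
    ("intra", None), meaning the block must be stored as an image.
    The threshold max_sad is a hypothetical tuning value.
    """
    height, width = len(prev), len(prev[0])
    best = None
    for my in range(-search, search + 1):
        for mx in range(-search, search + 1):
            px, py = cx + mx, cy + my
            # Skip candidates that fall outside the previous frame.
            if px < 0 or py < 0 or px + size > width or py + size > height:
                continue
            cost = sad(prev, cur, px, py, cx, cy, size)
            if best is None or cost < best[0]:
                best = (cost, (mx, my))
    if best is not None and best[0] <= max_sad:
        return ("inter", best[1])
    return ("intra", None)


def make_frame(width, height, sq_x, sq_y, sq_size=4, value=200):
    """Build a black test frame containing one bright square."""
    frame = [[0] * width for _ in range(height)]
    for y in range(sq_y, sq_y + sq_size):
        for x in range(sq_x, sq_x + sq_size):
            frame[y][x] = value
    return frame


previous = make_frame(8, 8, 1, 1)   # square at (1, 1)
current = make_frame(8, 8, 2, 1)    # same square, moved one pixel right
print(predict_block(previous, current, 2, 1))  # ('inter', (-1, 0))
```

A block that has no counterpart in the previous frame (for example a frame of completely new content) fails the threshold and comes back as `("intra", None)`, which is the case the paragraph above describes.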
Because P-frames reconstruct much of the frame by applying motion vectors to the previous frame (ie: motion compensation) they are far less expensive in terms of storage than I-frames. One or more P-frames may follow an I-frame.
Therefore a greater ratio of P-frames to I-frames leads to a higher compression ratio. |
|
|
Advanced Simple Profile (ASP) Settings
B-frames (Bi-directional encoding or B-VOP)
B-frames can also forward predict, but they do so by choosing the best prediction match between two reference frames. B-frames are coded not only from a preceding reference frame (forward prediction) but also from a following one (backward prediction). Each reference can be either an I-frame or a P-frame.
Using B-frames reduces the amount of data needed to code a frame and improves quality, particularly in areas where moving objects reveal previously hidden areas.
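To make the "best prediction match" idea concrete, the following Python sketch compares three candidate predictions for one block: the block from the past reference, the block from the future reference, and the average of the two. It is a simplification under stated assumptions: motion vectors are left out (the blocks are assumed to be already aligned), blocks are flat lists of pixel values, and the function names are made up for this example.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_b_prediction(block, past_block, future_block):
    """Pick the prediction mode with the lowest SAD for this block.

    Returns (mode, cost), where mode is "forward" (past reference),
    "backward" (future reference) or "bidirectional" (rounded average).
    """
    candidates = {
        "forward": past_block,
        "backward": future_block,
        "bidirectional": [(p + f + 1) // 2
                          for p, f in zip(past_block, future_block)],
    }
    mode = min(candidates, key=lambda m: sad(block, candidates[m]))
    return mode, sad(block, candidates[mode])


# A moving object (value 200) covered this area in the past frame, but
# has moved away, revealing background (value 50) now and in the future.
print(best_b_prediction([50] * 4, [200] * 4, [50] * 4))   # ('backward', 0)

# A gradual brightness change is matched best by averaging both references.
print(best_b_prediction([110] * 4, [100] * 4, [120] * 4))  # ('bidirectional', 0)
```

The first case is the "revealed hidden area" situation: a P-frame could only look backwards at the object, whereas the B-frame finds the background in the following reference frame.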
All the major Mpeg4 encoding companies are constantly upgrading their B-frame encoding techniques by improving how motion estimation is performed on these frames, and by improving precision and quantization modulation.
It's worth pointing out that some DVD/Mpeg4 stand-alone players (including the Sigma Xcard) can't handle more than one consecutive B-frame. The current version of DivX (5.1.x) generates one B-frame as standard. However, XviD generates two consecutive B-frames by default, which you may have to override and set to one.
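If you have a list of the frame types in an encode, the compatibility limit above is easy to check. The Python sketch below assumes the frame types are available as a simple string such as "IBBPBBP"; how you obtain that string depends on your tools and is not covered here, and the function name is invented for this example.

```python
def max_consecutive_b(frame_types):
    """Return the longest run of consecutive B-frames in a string of
    frame-type letters such as "IBBPBBP"."""
    longest = run = 0
    for frame_type in frame_types.upper():
        run = run + 1 if frame_type == "B" else 0
        longest = max(longest, run)
    return longest


print(max_consecutive_b("IBPBPBP"))   # 1
print(max_consecutive_b("IBBPBBP"))   # 2
```

A result above 1 suggests the file may not play on a player with the one B-frame limit.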
Global Motion Compensation (S-VOP)
Quarter Pel Motion Estimation (Qpel)
As explained in the B-frames summary, data is reduced when the difference between two frames (prediction error) is transmitted instead of the entire image being sent. The difference between successive frames is generally computed on a macro-block by macro-block basis (16x16 pels) or on a block by block basis (8x8 pels).
For example, a part of an image located in a block at grid location (1,1) may move to grid location (1,2) in the next frame. As you may realize, the image in a block will often need to move with more accuracy than a whole block at a time, or even than a whole integer pixel unit (1,1).
Typically Mpeg4 uses Half Pel (1.5, 1.5) prediction and encoding techniques. However, Quarter Pel (1.25, 1.75) prediction and encoding performs specific filtering on each block to produce a virtual block that represents how the original block should appear if moved by 1/4 of a pixel unit.
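The idea of a "virtual block" at a fractional position can be sketched in a few lines of Python. Note the assumption: this example uses plain bilinear interpolation, which is a stand-in chosen for simplicity and is not the filter an actual Mpeg4 codec applies for Qpel. The function names are invented for this example, and positions are assumed to be non-negative.

```python
def sample(frame, x, y):
    """Bilinearly interpolate a frame (list of pixel rows) at the
    fractional position (x, y), clamping at the right/bottom edges."""
    x0, y0 = int(x), int(y)
    x1 = min(x0 + 1, len(frame[0]) - 1)
    y1 = min(y0 + 1, len(frame) - 1)
    fx, fy = x - x0, y - y0
    top = frame[y0][x0] * (1 - fx) + frame[y0][x1] * fx
    bottom = frame[y1][x0] * (1 - fx) + frame[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def round_to_step(value, step):
    """Snap one motion vector component to the nearest multiple of
    step: 1 for full pel, 0.5 for half pel, 0.25 for quarter pel."""
    return round(value / step) * step


frame = [[0, 100],
         [100, 200]]
print(sample(frame, 0.5, 0))     # 50.0  (half pel)
print(sample(frame, 0.25, 0))    # 25.0  (quarter pel)

# A true movement of 1.3 pixels, as each precision would represent it:
print(round_to_step(1.3, 1))     # 1
print(round_to_step(1.3, 0.5))   # 1.5
print(round_to_step(1.3, 0.25))  # 1.25
```

The last three lines show the point of the finer grid: quarter pel lands closer to the true movement (an error of 0.05 pixels, against 0.2 for half pel and 0.3 for full pel).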
It's worth noting that Qpel generally decreases compressibility. Although it can find a better match, it usually doesn't 'correct' this match anyway, because the errors are too small. If there is no good match, Qpel doesn't help much either - the amount of texture data is still similar.
It was thought that using this feature would allow the user to produce the same high visual quality at about 20% less file size, but such claims have since been revised.
MPEG Matrices TBA |
|
|
Other Settings
Lumi masking - as used by XviD
Psychovisual Enhancement - as used by DivX
Pre-processing - as used by DivX
Generally speaking, noise is a big problem when it comes to compressing video, because a lot of data will be used to capture the video noise that really shouldn't be there in the first place.
De-interlacing
If you convert interlaced video into non-interlaced (ie progressive) video on a computer, you may find that when the two adjacent fields are put together to create one frame, the information in those fields does not quite line up, and you get a "smearing" or "tearing" effect that degrades visual quality. In the main, video that is created on computers is progressive. So it really is beneficial to convert 50/60 'field' interlaced material into progressive 25/30 'frames' material.
Codec manufacturers use special algorithms to minimize the smearing and tearing effects that could result. But beware, because de-interlacing tools can also be found and used in many external or front end encoding applications (such as MPEG Mediator, VirtualDubMod, Gordian Knot). And if both de-interlacing tools are activated at the same time, serious problems can and will occur.
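The tearing effect, and one very simple way of softening it, can be sketched in Python as follows. This is only an illustration under assumptions: the blend filter shown is a basic textbook approach, not the algorithm used by DivX, XviD or any of the applications named above, and the function names are invented for this example. Each field is a list of pixel rows.

```python
def weave(top_field, bottom_field):
    """Interleave two fields line by line into one full frame."""
    frame = []
    for top_line, bottom_line in zip(top_field, bottom_field):
        frame.append(list(top_line))
        frame.append(list(bottom_line))
    return frame


def blend(top_field, bottom_field):
    """Weave the fields, then average each line with the line below it
    (the last line has no neighbour below and is kept as it is)."""
    woven = weave(top_field, bottom_field)
    out = []
    for i, line in enumerate(woven):
        below = woven[min(i + 1, len(woven) - 1)]
        out.append([(a + b) // 2 for a, b in zip(line, below)])
    return out


# A thin vertical object (value 9) moves one pixel to the right between
# the moment the top field and the bottom field were captured.
top = [[0, 9, 0, 0], [0, 9, 0, 0]]
bottom = [[0, 0, 9, 0], [0, 0, 9, 0]]

for line in weave(top, bottom):
    print(line)
# [0, 9, 0, 0]
# [0, 0, 9, 0]
# [0, 9, 0, 0]
# [0, 0, 9, 0]
```

The woven frame shows the object zig-zagging from line to line, which is the tearing described above. `blend(top, bottom)` replaces the zig-zag with a softer, slightly smeared edge, which is the usual trade-off of this kind of filter.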
Packed Bitstream
Helps to permit B-frame decoding without delay.
When enabled, P-frames and B-frames are 'packed together' into one bitstream: [I][PB][B][Empty][PB][B][Empty][P]. Packed bitstream was first introduced into the encoding process with the launch of DivX5.0.1. XviD offers manual selection.
Closed GOV
Closes every 'group of pictures' before opening a new key-frame (I-frame). |
|
|
Note |
The above information has been compiled from a variety of sources, including DivX.com and Bond (Doom9.org) - Many thanks |
|
Last Updated |
Mon 09 Nov 04 @ 18:15 - Various changes |