We're talking about interlaced video here. In this mode, a video feed is a sequence of "fields". Each field is half a frame.
Displayed fast enough, this makes a good interlaced image.
A frame is split into 2 different fields: one carries the even scan lines, the other the odd ones.
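To make this concrete, here's a minimal Python sketch of the split (not from any real video API, just a frame modeled as a list of scan lines):

```python
def split_into_fields(frame):
    """Split a progressive frame into its two interlaced fields.

    The top field holds the even scan lines (0, 2, 4, ...),
    the bottom field holds the odd ones (1, 3, 5, ...).
    """
    top_field = frame[0::2]      # even scan lines
    bottom_field = frame[1::2]   # odd scan lines
    return top_field, bottom_field

# A tiny 4-line "frame" stands in for real image data.
frame = ["line0", "line1", "line2", "line3"]
top, bottom = split_into_fields(frame)
print(top)     # ['line0', 'line2']
print(bottom)  # ['line1', 'line3']
```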
Then comes the recording process. For a movie, each frame is fully on film, so the 2 fields extracted from a movie frame come from one and the same frame.
With video, the recording device records interlaced directly: the first field is captured at time "T", and the next field is captured at time "T" + 1/50th of a second (for PAL).
The big difference is here: in film mode, a pair of fields comes from a single image. In video mode, a pair of fields does not come from a single image!
With film mode, especially in PAL, where fields come in 2:2 sequences, it's not too difficult: find the 2 matching fields, rebuild a single frame from them, and it's fine because the 2 fields come from the same image.
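This is basically what weave does. A minimal sketch, again with fields as lists of scan lines:

```python
def weave(top_field, bottom_field):
    """Re-interleave two fields into one full frame.

    Safe in film mode: both fields were taken from the same image,
    so putting the scan lines back in order restores the original frame.
    """
    frame = []
    for top_line, bottom_line in zip(top_field, bottom_field):
        frame.append(top_line)
        frame.append(bottom_line)
    return frame
```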
With video mode, this is not so easy. If you use simple algorithms like bob or weave, you lose vertical resolution (bob) or you get combing artefacts on anything that moves (weave).
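Bob, for example, throws away one field and rebuilds a full frame from the other alone, which is why vertical resolution is halved. A toy line-doubling version (real implementations interpolate between lines rather than copying them):

```python
def bob(field):
    """Build a full frame from a single field by line doubling.

    Runs at the field rate, so motion stays smooth, but half the
    vertical detail is gone: each scan line simply appears twice.
    """
    frame = []
    for line in field:
        frame.append(line)
        frame.append(line)  # duplicate: this is the lost resolution
    return frame
```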
You need more complex algorithms that try to intelligently rebuild a frame from those 2 fields.
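A common core idea is motion adaptation: weave where the picture is static, interpolate where it moves. The toy sketch below is NOT DScaler's or anyone's shipping algorithm, just an illustration using grayscale pixel values and a made-up threshold parameter:

```python
def motion_adaptive(top, bottom, prev_bottom, threshold=10):
    """Toy per-pixel motion-adaptive deinterlacer (grayscale 0-255).

    Even output lines come straight from the top field. For each odd
    output pixel: if the bottom field barely changed since the previous
    bottom field, the scene is static there, so weave (keep the real
    pixel, full resolution); otherwise there is motion, so interpolate
    vertically from the top field (no combing, at half resolution).
    """
    frame = []
    for y in range(len(top) + len(bottom)):
        if y % 2 == 0:
            frame.append(list(top[y // 2]))  # top-field lines pass through
            continue
        i = y // 2  # index into the bottom field
        row = []
        for x, pixel in enumerate(bottom[i]):
            if abs(pixel - prev_bottom[i][x]) < threshold:
                row.append(pixel)  # static: weave the real pixel
            else:
                above = top[i][x]
                below = top[i + 1][x] if i + 1 < len(top) else above
                row.append((above + below) // 2)  # motion: interpolate
        frame.append(row)
    return frame
```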
That's one area where the difference between deinterlacing devices shows. Basic TV apps do it badly. Software DVD players do it poorly. DScaler does it not so badly (see Tom Barry's Greedy High Motion plugin). Faroudja leads the pack... they do it VERY well.
I hope the explanation is not too bad; this isn't easy to explain.
