AVS Forum banner

Status
Not open for further replies.
1 - 20 of 111 Posts

·
Registered
Joined
·
3,510 Posts
Discussion Starter · #1 ·
This one scene from I, Robot grabbed my attention. The frames are slightly off but the camera is perfectly still so you can see how the background changes with bicubic and lanczos resize.

thegame.zip (2mb)


Someone stated in the other ffdshow resize thread that "bicubic was better than 1-tap lanczos". As you can see, even at just 1-tap, lanczos is a tiny bit better, with more crisply defined squares. Bicubic really is a mess compared to 4 and 10-tap.


The difference between 4 and 10-tap is visible but not as significant. Even at 4-tap you are getting the majority of the improvement over bicubic. Those who have enough CPU should certainly use 10-tap.


(matched frames for the complainers)
 

·
Registered
Joined
·
2,378 Posts
Nice comparo! Always wondered how much I'm missing with 4-tap vs 10.
 

·
Registered
Joined
·
3,510 Posts
Discussion Starter · #6 ·
Fair enough, here are perfectly matched frames for 4 and 10-tap Lanczos and bicubic.


However, this time unlabelled :)

thegame.zip (1.8mb)


Be sure to list


A -

B -

C -


with what you think each is. Bicubic is easy, but what about 4 vs 10?
 

·
Registered
Joined
·
55 Posts
A. a bit blurrier than the others, so I guess bicubic.

B. sharper than A, but a bit greenish (I wonder how the unscaled original looks)

C. a tad lighter than B, and also less green. Can't see much difference in amount of detail between B and C..


If the green tint is an artifact that increases with the number of taps, then I'd say that C = 4-tap and B = 10-tap
 

·
Registered
Joined
·
723 Posts
A. = Bicubic

B. = Lanczos4

C. = Lanczos10


I thought about reversing the order to mess with heads but that's really the way I see it.


DFA
 

·
Registered
Joined
·
5,052 Posts
My guess:

A. = Bicubic

B. = Lanczos10

C. = Lanczos4


Jesse, what's the first prize? ..and where did you get this I, Robot copy from? It is not yet released. You are not by any chance "affiliated" with the "Academy of Motion Picture Arts and Sciences", are you?? :)

_____

Axel
 

·
Registered
Joined
·
1,005 Posts
I don't know that much about how FFDShow does things, but I do know a lot about scaling filters, having written multiple software scalers.


There really isn't any significant difference, mathematically speaking, between Catmull-Rom bicubic and Lanczos2 (which is not 2-tap, it's 2-lobe). I don't know which bicubic FFDShow is actually using, but if it's not Catmull-Rom, I'd recommend that they switch to Catmull-Rom. And Lanczos itself is not a magic function - it's a windowed sinc function. All windowed sinc functions (Mitchell, Hamming, Hann, Kaiser, Lanczos) have similar results, and the primary difference between them is the amplitude of the first negative lobe. Bigger first negative lobe = sharper.
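To put a number on the Catmull-Rom vs. Lanczos2 comparison, here is a minimal Python sketch of the two kernels using their standard textbook definitions. It is not FFDShow's code, and the helper `max_abs_difference` is just an illustrative name. It samples both kernels over their shared support and reports the largest gap between them.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def lanczos(x, lobes=2):
    """Lanczos kernel: sinc(x) windowed by sinc(x / lobes); zero for |x| >= lobes."""
    if abs(x) >= lobes:
        return 0.0
    return sinc(x) * sinc(x / lobes)

def catmull_rom(x):
    """Catmull-Rom cubic kernel; zero for |x| >= 2."""
    x = abs(x)
    if x < 1.0:
        return 1.5 * x**3 - 2.5 * x**2 + 1.0
    if x < 2.0:
        return -0.5 * x**3 + 2.5 * x**2 - 4.0 * x + 2.0
    return 0.0

def max_abs_difference(steps=400):
    """Largest gap between the two kernels, sampled over [-2, 2]."""
    xs = [-2.0 + 4.0 * i / steps for i in range(steps + 1)]
    return max(abs(lanczos(x, 2) - catmull_rom(x)) for x in xs)

if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 1.5):
        print(f"x={x:.1f}  lanczos2={lanczos(x):+.4f}  catmull_rom={catmull_rom(x):+.4f}")
    print(f"max |difference| = {max_abs_difference():.4f}")
```

Both kernels peak at 1, cross zero at x = 1, and dip slightly negative between 1 and 2. The gap between them stays small relative to the peak, with Lanczos2 having the slightly deeper negative lobe.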


Lanczos4 has 4 lobes, but the extra two outside lobes make no difference to the sharpness. The extra sharpness is from the deeper first negative lobe. Adding lobes to a windowed sinc is just a computationally expensive way of adding extra sharpness by increasing the ratio of negative to positive in the first two lobes.
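The deeper-lobe point can be checked numerically. The sketch below assumes the standard Lanczos definition (not FFDShow's actual implementation) and measures the most negative value of the kernel between x = 1 and x = 2 for several lobe counts. The function name `first_negative_lobe_depth` is made up for the illustration.

```python
import math

def lanczos(x, lobes):
    """Lanczos kernel sinc(x) * sinc(x / lobes), written out in closed form."""
    if x == 0.0:
        return 1.0
    if abs(x) >= lobes:
        return 0.0
    px = math.pi * x
    return lobes * math.sin(px) * math.sin(px / lobes) / (px * px)

def first_negative_lobe_depth(lobes, steps=1000):
    """Most negative kernel value on 1 < x < 2 (the first negative lobe)."""
    return min(lanczos(1.0 + i / steps, lobes) for i in range(1, steps))

if __name__ == "__main__":
    for lobes in (2, 3, 4, 10):
        print(f"Lanczos{lobes}: first negative lobe = {first_negative_lobe_depth(lobes):+.4f}")
```

The first negative lobe gets deeper as the lobe count goes up, because the wider window attenuates the underlying sinc less in that region.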


There is one other useful feature of really long (many lobe) filters, which is that they blur out fine background detail better when doing downsampling. However, they actually do it better than a real camera, or your eye. More than 2 lobes is just overkill.


The big, big downside to long filters is that they exhibit excessive ringing on sharp transitions. You can't see it in this particular sample, but it's pretty easy to see on suitably sharp material, and it varies in visibility depending on the scaling ratio. Many of the test patterns in Avia show it off quite well. Titles on video often show little halos around them when many-lobe filters are being used. I see it on the Alias discs and it bugs the heck out of me.
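Ringing is easy to reproduce on a synthetic edge. The following is a rough 1-D sketch, assuming a plain Lanczos upsampler with clamped borders and normalized weights (my own toy resampler, not what FFDShow or any particular scaler does). It upsamples a hard 0-to-1 step and measures how far the result overshoots 1.0.

```python
import math

def lanczos(x, lobes):
    """Lanczos kernel sinc(x) * sinc(x / lobes); zero for |x| >= lobes."""
    if x == 0.0:
        return 1.0
    if abs(x) >= lobes:
        return 0.0
    px = math.pi * x
    return lobes * math.sin(px) * math.sin(px / lobes) / (px * px)

def upsample(samples, factor, lobes):
    """Upsample a 1-D signal by an integer factor with a Lanczos filter."""
    out = []
    last = len(samples) - 1
    for i in range(len(samples) * factor):
        center = (i + 0.5) / factor - 0.5        # output position in source coordinates
        first = math.floor(center) - lobes + 1   # leftmost contributing source sample
        total = 0.0
        weight_sum = 0.0
        for k in range(first, first + 2 * lobes):
            w = lanczos(center - k, lobes)
            total += w * samples[min(max(k, 0), last)]  # clamp at the borders
            weight_sum += w
        out.append(total / weight_sum)           # normalize so flat areas stay flat
    return out

def overshoot(lobes, factor=4):
    """How far above 1.0 the upsampled step edge rises."""
    step = [0.0] * 16 + [1.0] * 16
    return max(upsample(step, factor, lobes)) - 1.0

if __name__ == "__main__":
    for lobes in (2, 4):
        print(f"Lanczos{lobes}: overshoot = {overshoot(lobes):.3f}")
```

On this test edge the 4-lobe filter overshoots more than the 2-lobe one, and its extra lobes spread the ripple over more pixels, which is what shows up as halos around titles.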


There are other ways of getting the extra sharpening effect from a 2-lobe filter, most notably by scaling the resampling function and subtracting a suitable gaussian function, which is essentially combining the resampling function with an unsharp mask function. In FFDShow, you could certainly try just doing Lanczos2 and an unsharp mask.
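One possible reading of "scale the resampling function and subtract a gaussian" is sketched below. Treat it as an interpretation, not a reference implementation: the base filter (Catmull-Rom), the `amount` and `sigma` defaults, and the function names are all assumptions chosen for illustration. The combined kernel is (1 + amount) * K(x) - amount * G(x), cut off at the base filter's support.

```python
import math

def catmull_rom(x):
    """Catmull-Rom cubic kernel; zero for |x| >= 2."""
    x = abs(x)
    if x < 1.0:
        return 1.5 * x**3 - 2.5 * x**2 + 1.0
    if x < 2.0:
        return -0.5 * x**3 + 2.5 * x**2 - 4.0 * x + 2.0
    return 0.0

def gaussian(x, sigma):
    """Unit-area Gaussian."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

def sharpened_kernel(x, amount=0.5, sigma=0.6):
    """(1 + amount) * base kernel minus amount * Gaussian, cut off at |x| >= 2.

    amount and sigma are illustrative values, not recommendations.
    """
    if abs(x) >= 2.0:
        return 0.0
    return (1.0 + amount) * catmull_rom(x) - amount * gaussian(x, sigma)

def lobe_depth(kernel, steps=1000):
    """Most negative kernel value on 1 <= x < 2."""
    return min(kernel(1.0 + i / steps) for i in range(steps))

if __name__ == "__main__":
    print(f"plain     lobe depth: {lobe_depth(catmull_rom):+.4f}")
    print(f"sharpened lobe depth: {lobe_depth(sharpened_kernel):+.4f}")
```

The sharpened kernel keeps the 2-lobe footprint (still 4 taps when upsampling) but has a deeper negative lobe. Its weights no longer sum exactly to 1, so they would need to be renormalized per output pixel when the kernel is actually applied.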


Don
 

·
Registered
Joined
·
3,510 Posts
Discussion Starter · #12 ·
So that nobody kills themself not knowing,


A- bicubic

B- 10-tap

C- 4-tap


More taps do make the image greener. Why?
 

·
Registered
Joined
·
124 Posts
dmunsil, et al -


Could you give me a couple of references to the mathematics of the various algorithms (at a level suitable for a programmer)?
 

·
Registered
Joined
·
1,638 Posts
Nice post dmunsil. Time to reevaluate my settings.
 

·
Registered
Joined
·
23,130 Posts
For the less obsessive among us, any chance of getting a VMR9/video card scaled screenshot in there? You know, to let us non ffdshowers know what we're missing:)
 

·
Registered
Joined
·
1,005 Posts
Quote:
Originally posted by lel4866
dmunsil, et al -


Could you give me a couple of references to the mathematics of the various algorithms (at a level suitable for a programmer)?
There really is no perfect reference. I just got Gonzalez and Woods's Digital Image Processing, which is AFAIK the newest image processing textbook (c. 2002), and it skates over image scaling in a paragraph or two, mentioning that bicubic is arguably better but saying, essentially, that bilinear is good enough. This is complete hogwash. They don't go into the mathematics at all. Their coverage of color theory is terrible as well, but that's another story.


There are some decent papers on image scaling you can find via Google. There are also some absolutely terrible papers, and papers that obviously were written to get something to publish and have no good or new ideas.


However, Ken Turkowski's survey of image scaling filters (which is a little old, but still worthwhile) is good:

Turkowski paper (PDF)


Ken makes the extremely common error of evaluating the frequency response of image scaling filters, which is completely without value. Frequency analysis suggests that very long sinc-based filters should be "perfect" and they aren't by a long shot. This is a result of engineers applying their hard-won knowledge of audio resampling to image resampling. Sadly, audio resampling and image resampling are very different.


The eye does not evaluate images in frequency space (as opposed to the ear, which does evaluate sound in frequency space), and thus the most frequency-accurate image scaling filters are not the most visually accurate. Amplitude locality, edge preservation, and smoothness are the three most important things to consider about an image scaling filter. Catmull-Rom is, IMO, the best overall compromise in those three areas. Many people prefer more sharpness (i.e. edge preservation) than Catmull-Rom produces by default, so one of many techniques for deepening the negative lobe can be used.


Catmull-Rom is one of a class of bicubic filters called Mitchell-Netravali filters. You can also play with the coefficients of Mitchell-Netravali and deepen the negative lobe that way. I find that approach pretty lame, myself. If what you want is more sharpening, think about what sharpening filters do and apply the same math to your scaling filters.
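For reference, the Mitchell-Netravali family is a two-parameter (B, C) piecewise cubic, and the sketch below writes it out in Python using the commonly published form of the equations. The function names are my own. Catmull-Rom is the B = 0, C = 0.5 member, and raising C with B = 0 is the "play with the coefficients" way of deepening the negative lobe.

```python
def mitchell_netravali(x, b, c):
    """General Mitchell-Netravali cubic kernel with parameters B and C."""
    x = abs(x)
    if x < 1.0:
        return ((12 - 9 * b - 6 * c) * x**3
                + (-18 + 12 * b + 6 * c) * x**2
                + (6 - 2 * b)) / 6.0
    if x < 2.0:
        return ((-b - 6 * c) * x**3
                + (6 * b + 30 * c) * x**2
                + (-12 * b - 48 * c) * x
                + (8 * b + 24 * c)) / 6.0
    return 0.0

def catmull_rom(x):
    """Catmull-Rom is the B = 0, C = 0.5 member of the family."""
    return mitchell_netravali(x, 0.0, 0.5)

if __name__ == "__main__":
    for c in (0.5, 0.75, 1.0):
        print(f"B=0, C={c}: value at x=1.5 is {mitchell_netravali(1.5, 0.0, c):+.4f}")
```

With B = 0 the kernel still passes through 1 at x = 0 and 0 at x = 1, so it remains interpolating, and the negative lobe scales linearly with C.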


If you Google search terms like "Mitchell-Netravali", "Catmull-Rom image scaling", and "image resampling filters", you'll find all sorts of references.


It's also very worthwhile to look at the source to Paul Heckbert's Zoom program:

Zoom


It's generally quite efficient, and it does a very nice job of implementing a variety of different filters. ImageMagick uses Zoom as its basis for its internal scaling functions.


One can do better than Zoom or ImageMagick, but it requires a pretty solid understanding of the underlying basis of image scaling, which frankly is something one needs to do for oneself as far as I can tell. The usual texts are pretty lousy. Hope that helps.


Oh, and I made a mistake in my first post - Mitchell is not a windowed sinc filter; it's shorthand for Mitchell-Netravali.


Don
 

·
Registered
Joined
·
2,164 Posts
@Jesse S

What was your method for doing the screencaps? Just curious. BTW, the bicubic resize implementation beyond a certain version of FFDShow has a bit of a blur cast on it (or, looked at another way, a reduction in sharpness). This was stated in the massive FFDShow thread by Andy himself. To imitate the bicubic that most people started out with, it was necessary to adjust a certain slider (can't remember which one).


Cheers...

Duy-Khang Hoang
 

·
Registered
Joined
·
1,005 Posts
Quote:
Originally posted by lel4866
Thanks -


I think a good introduction I found on Catmull-Rom is : http://www.mvps.org/directx/articles/catmull/


They're supported in the DirectX math library!
I would not use that reference as a basis for understanding the Catmull-Rom image resampling filter. The image resampling filter is, in fact, based on the Catmull-Rom spline, but when used as a resampling filter the math reduces considerably.
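To show how far the math reduces, here is a small Python sketch (function names are hypothetical). For upsampling, each output sample depends on just four neighboring source samples, and the four weights are cubic polynomials in the fractional position t. These polynomials are simply the Catmull-Rom kernel evaluated at distances 1 + t, t, 1 - t and 2 - t.

```python
def catmull_rom_weights(t):
    """Weights for the four source samples p0..p3 around an output position.

    t in [0, 1) is how far the output position lies past p1, toward p2.
    """
    t2 = t * t
    t3 = t2 * t
    return (
        0.5 * (-t3 + 2.0 * t2 - t),          # p0, at distance 1 + t
        0.5 * (3.0 * t3 - 5.0 * t2 + 2.0),   # p1, at distance t
        0.5 * (-3.0 * t3 + 4.0 * t2 + t),    # p2, at distance 1 - t
        0.5 * (t3 - t2),                     # p3, at distance 2 - t
    )

def interpolate(p0, p1, p2, p3, t):
    """Catmull-Rom interpolated value between p1 and p2."""
    w0, w1, w2, w3 = catmull_rom_weights(t)
    return w0 * p0 + w1 * p1 + w2 * p2 + w3 * p3

if __name__ == "__main__":
    print(catmull_rom_weights(0.5))
    print(interpolate(0.0, 1.0, 2.0, 3.0, 0.5))
```

The weights always sum to 1, and at t = 0 they collapse to (0, 1, 0, 0), so source pixels pass through unchanged. For a fixed scaling ratio the weights can be precomputed once per output column and row.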


Read what Ken Turkowski has to say about Lanczos, and keep in mind that a graph of the Catmull-Rom filter looks almost identical to the 2-lobe Lanczos filter.


Here's an intro to spline filters, with the equations, which includes the general-case Mitchell-Netravali and the specific coefficients that create the Catmull-Rom filter:

Spline image resampling filters


You'll learn more about how it's really done by looking at Zoom than reading the papers, but I recommend bouncing back and forth between the papers and net references and the source code.


When you can understand why you only need 4 taps for any upsampling operation, but 8 taps for 2X downsampling and 12 taps for 3X downsampling, you're most of the way there. :)
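The tap counts follow from the filter's footprint. A 2-lobe kernel spans 4 source pixels when upsampling. When downsampling, the kernel is stretched by the downsampling factor so that it still covers 4 destination-pixel widths, which is 4 x factor source pixels. A minimal sketch of that arithmetic, with a made-up function name and example sizes:

```python
def taps_needed(src_size, dst_size, lobes=2):
    """Source samples per output sample for a filter with the given lobe count.

    Upsampling: the kernel covers 2 * lobes source pixels.
    Downsampling: the kernel is widened by src_size / dst_size, so the count
    scales by the same ratio (rounded up).
    """
    base = 2 * lobes
    if dst_size >= src_size:
        return base
    return -(-base * src_size // dst_size)  # ceiling division on integers

if __name__ == "__main__":
    print(taps_needed(720, 1080))   # upsampling
    print(taps_needed(1920, 960))   # 2X downsampling
    print(taps_needed(1920, 640))   # 3X downsampling
```

The same reasoning explains why a many-lobe filter gets expensive quickly when downsampling: the cost is 2 x lobes x factor taps per output sample in each dimension.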


Don
 