Separable Filters and a Bauble - Computerphile

Просмотров 103 тыс.

% 3 341

How do image processing apps and realtime applications apply effects so quickly? Dr Mike Pound decides to blur his Christmas Tree...
Mike's code: GitHub.com/mikepound/convolve
EXTRA BITS: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-MHIS59Se66k.html
computerphile
computer_phile
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottscomputer
Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

Опубликовано:

21 дек 2018

Ссылка:

Скачать:

Готовим ссылку...

Добавить в:

Мой плейлист

Посмотреть позже

Комментарии : 166

@dougmanatt4317 5 лет назад

Why not just buy a blurred xmas tree to start with?

@primarypenguin 5 лет назад

Don't tell the CS majors that you can translate python to C

@SilverKnightVGMusic 5 лет назад

I need the Fast Fourier transform video in my life!

@WarrenGarabrandt 5 лет назад

I've seen this done with a discrete cosine transform (which I believe is a special case of a Fourier transform). It was actually pretty neat, and SUPER fast.

@calaphos 5 лет назад

AFAIK convolutions become multiplications in frequency domain. so you take the FT of the image and the FT of the kernel, multiply them together and then take the inverse fourier transform of the image

@alcesmir 5 лет назад

@@calaphos Essentially yes. You would need some padding and cropping to not get cyclic behavior (left side blurring onto the right side and vice versa, and same for up/down).

@TruthNerds 5 лет назад

I'm no expert for the FFT, but it's "just" an algorithm for the discrete Fourier transform, originally only for cases where you dealt with N=2^n values, that is more efficient than the classic approach. The classic DFT algorithm requires a number of steps proportional to N^2 whereas the FFT requires only a number proportional to N log N. This is the same proportional difference, for example, between the average timings of bubble sort and quick sort and does make a huge difference for large N. Also, the Fourier transform is incredibly useful, image processing has already been mentioned, but also e.g. digital audio compression, statistics, cryptography, number theory to name but a few.

@KaktitsMartins 5 лет назад

"It just looks like my camerawork" :D

@Cybeonix 5 лет назад

Self deprecation.. a fine form of humor ;)

@Anvilshock 5 лет назад

I just wish he'd actually do something about it.

@General12th 3 года назад

@@Anvilshock His camerawork is fine.

@Anvilshock 3 года назад

@@General12th No, it's idiotic, hyperactive, nauseating garbage.

@ignaspy 5 лет назад

I could listen to him pouring the knowledge on my face for hours

@JacksMacintosh 5 лет назад

A video with Mike breaking down some Machine Learning terms? "What's Machine Learning VS. Deep learning? AI? Etc."

@dougmanatt4317 5 лет назад

Please make the FFT convolution video also!

@FrankHarwald 5 лет назад

32 butterflies liked your comment! :)

@amirabudubai2279 5 лет назад

It would be a short video. After a FT, a convolution is just multiplying. Taking the DFT(discrete version of FT) takes just as long as a single filter the size of the full image, but you can then do as many convolutions(of any size) as you want. FFT is closely related to jpeg compression and you see the same types of artifacting, but it is fast and allows for unlimited conversations like a DFT. TLDR, that is more a topic for numberphile. The convolution is so easy in the frequency/spatial domain that it is trivial, but the reason why it works is very math heavy.

@WildEngineering 5 лет назад

I was about to ask about Fourier Transforms because a convolution in time is multiplication in frequency

@ManWithBeard1990 5 лет назад

I'd like to point out that gaussian blur still needs to sample all pixels in that horizontal and vertical window but box blur doesn't. Box blur can be simplified to a pair finite input response filters that only take 2 samples per pixel each. A common trick in image processing software is then to use a series of box blurs instead of a true gaussian. The result you get is very similar, however the distribution is not a gaussian one, but a binomial one. Once you get enough box blurs (say, three or more), this distribution is similar enough to a gaussian one that it's hard to tell the difference.

@thomasnn 5 лет назад

This is the sort of video I enjoy the most

@Bolt6265 5 лет назад

ahhh I see Mike also still uses Windows Photo Viewer on windows 10 instead of whatever garbage photo viewer they include nowadays.

@FriedOrange 5 лет назад

on my machine, the Windows 10 photo viewer takes up over 600MB of RAM just while viewing ordinary jpegs...

@jamie_ar 5 лет назад

Candy Crush Photo Viewer

@anthonyvays5786 5 лет назад

it's because the new photo viewer is a "Universal Windows Platform" app which really means it's written in JavaScript and has to load a web browser to view your photo LOL

@jarliskalaskinowski8230 3 года назад

take a look at the program folder, a huge amount of files and dlls, remember that it is necessary to configure the programs folder and give access to you (user) to be able to see the files contained, a lot of unnecessary. I'm almost done with my 15mb photo viewing program that supports the formats written by me .tga, .pcx, net pbm and some other .jpg, .png, .tiff, .hdr, .psd, giff libraries. ..

@Minon157 4 года назад

This helped me implement a separable sobel filter thank you.

@kheerlen 3 года назад

Absolutely beautiful!

@kc9scott 5 лет назад

For image processing, although separable filters offer a huge speed benefit, they typically also have something of a penalty in image quality. If you use horizontal and vertical separable filters to sharpen, any diagonal detail in the image will get sharpened twice. Likewise, for blurring, any diagonal detail in the image will get blurred twice. This tradeoff should be taken into consideration. Usually though, the speed benefit is more important.

@BlackHermit 5 лет назад

Excellent code, very clean.

@noxabellus 5 лет назад

"I wasn't really interested in how fast it was -" I assumed that was a given with Python.

@nescius2 5 лет назад

thats so insightful, you must be python expert!

@tengkuizdihar 5 лет назад

Just a friendly reminder that pypy exist.

@jackeown 5 лет назад

Anyone who complains about python being slow either has never used python or has tried a really naive implementation of a solution to an expensive problem in pure python instead of using standard python tools like numpy or pandas which are implemented in C. You can see in this video that he's using numpy. Python is fast if you use it correctly.

@jackeown 5 лет назад

@ebulating was your python implementation using numpy or pandas?

@gralha_ 5 лет назад

@@jackeown "Python is fast if you use tools that are not in Python"

@MrGencyExit64 5 лет назад

I thought your area of interest was mostly security. Nice to see you have knowledge of my area of work too :) Separable filters are drastically important for post-processing in games.

@peabrainiac6370 5 лет назад

7:58 missed opportunity to stop the counter at 3:14:159, it went up to 3:14:190 :(

@slpk 5 лет назад

Dude your camera work is fine

@maxmusterman3371 5 лет назад

really enjoyed this. please more high level stuff like video analysis, maybe also more ML.. love u

@lasersimonjohnson 5 лет назад

Mr pound could talk about paint drying and I would still like the video !

@mohammadfallah.rasoulnejad5379 5 лет назад

Is it possible to have a talk about text localization in images? dr.Pound is really great when its come to explain things.

@MultiformeIngegno 5 лет назад

Great video!

@wimjongman 5 лет назад

Love the print paper. Are you still using dot matrix printers or is the paper left over from a great deal that your purchase department did in '92?

@Someone-jf3mb 5 лет назад

I am a simple man. I see Mike, I click.

@Kenspectacle 2 года назад

is there any way to classify which filters are suitable for the singular value decomposition?

@SiddharthKothotya 5 лет назад

How do you decide for a standard deviation of x you have to use the kernel size of y, how to you decide y here ?

@ecicce6749 5 лет назад

Did you write your code cache friendly? Like going in x direction in the inner loop and y in the outer?

@himselfe 5 лет назад

Computerphile Christmas Drinking Game: take a shot every time Mike says "really"!

@madhursharma6627 5 лет назад

Can you suggest a book from where I can learn more about these things in detail?

@olixz 5 лет назад

Dr Mike Pound, not all heroes wear capes.

@TiananmenSquareMassacre1989 5 лет назад

Very useful. Thanks.

@morgan0 5 лет назад

i wonder if my numpy-based (using some cython code) convolver would be faster than the first one. i’ve spent a long time optimizing it.

@faiskies_ 5 лет назад

Why does increasing the standard deviation change the runtime? If the filter size is constant in both the cases, only the values inside the filters will change, why does it end up in increasing the runtime?

@TheLivirus 5 лет назад

Oh, I see! I've been doing it the slow way! Thanks!

@maxmusterman3371 5 лет назад

in what cases is it possible to seperate the kernel?

@passingthetorch5831 5 лет назад

You can approximate any filter with the outer product of the first singular vectors (multiplied by the first singular value if you don't want scaling). A better approximation can be made using just the first few singular vectors (and values). Maybe you can do a video on the singular value decomposition. Or one on filters in the frequency domain.

@GumRamm 5 лет назад

So what about the Fast Fourier transform method?

@Ceelvain 5 лет назад

3:40 I was about to post a comment about using the Fourier transform to perform the convolution. (I kinda have an obsession with them.)

@cmdlp4178 5 лет назад

Some GUIs contain transcluent & blurry interfaces, the blurry effect is done by scaling the background image down and up again. examples: Windows Aero(Windows Vista & 7), IOS, Ubuntu Unity, etc...

@GeeItSomeLaldy 5 лет назад

Did that clock stop at 3m 14s 15fr?

@stvafel1172 5 лет назад

At 5:28, Mike promised us a source code link in the description. I can't find it.

@Computerphile 5 лет назад

Ah sorry, bear with me >Sean

@exekiajq 5 лет назад

Waiting for that code too! Come on Mike >:(

@DroidFreak32 5 лет назад

Chill it's just 30 minutes since the video :P

@stvafel1172 5 лет назад

@@DroidFreak32 well, i assume it has been edited for some time between filming and posting. even if that's not the case, he had 5 minutes already

@Computerphile 5 лет назад

not Mike's fault I just forgot to paste the link in :) Sean

@mattk8440 5 лет назад

I think it would be interesting to go over the maths of why the filter can be separated into horizontal and vertical parts. Or if anyone has a paper on it to link please :)

@justinnine4940 5 лет назад

how to tell if a 2D filter is separable?

@frankynakamoto2308 5 лет назад

This is great, could be used as similar to checkerboard rendering for video game performance.

@RSDDL 5 лет назад

What about the pixels at the boundary, how does the convolution deal with the adjacent “pixels” outside of the image boundary?

@kaijulian 5 лет назад

It depends om what you want it to do. You can ignore those pixels and get a smaller output image, you can pad the edges with zeroes which will result in a dark edge around it, or you can pad the edges by repeating information in your input. This could be repeating the edge pixel, mirroring pixels around the edge or wrapping the opposite side of the image around.

@dimitrisproios1860 5 лет назад

please do the fast fourier transforms too!

@smilebagsYT 5 лет назад

I'd love a video on convulsions based on FFT!

@kierenyork6791 5 лет назад

These videos never fail to make me feel like a moron...

@patrickmullan8356 5 лет назад

Would be nice to provide (or explain; I can google it if I just want it) the Algorithm, how to split a 2D-Matrix intwo 2 1D-Vectors for doing these convolutions. And probalby explain which Matrices are seperable, and which are not :>

@WarrenGarabrandt 5 лет назад

Since we are talking about speed to run the code, wouldn't you also gain some speed between the first and second passes if you rotated the image 90 degrees and did two horizontal passes (one before and one after)? This would allow the CPU cache to better predict what data you need loaded, thus saving you a lot of cache miss RAM fetches (when reading previous and next rows of data).

@styleisaweapon 5 лет назад

image rotations arent free.

@alcesmir 5 лет назад

@@styleisaweapon In this case a rotation would be pretty much free compared to doing a convolution, especially if your kernel is decently sized.

@styleisaweapon 5 лет назад

@@alcesmir Saying it doesnt make it true. A rotation forces two different access patterns, one with short stride, one with long. Doesnt matter which one is the read and which one is the write. Stop trying to get a free lunch by hiding the work in an abstraction trick. The abstraction also has costs.

@styleisaweapon 5 лет назад

cache misses have costs no matter how you try to disguise them or hide them or pretend they arent there - the proof is that you are trying to solve the cache miss problem - by doubling the number of them

@alcesmir 5 лет назад

@@styleisaweapon My point is not about reducing cache misses. My point is that doing the rotation O(width*height) is cheap compared to the convolution step O(width*height*kernel size), especially for large kernels. I agree doing the rotation is pretty useless in this case, but the argument is that you're just introducing (potentially more) cache misses somewhere else, not that image rotation is massively expensive.

@Gonkers44 5 лет назад

The video won't even load for me. Interesting. Makes me wonder, with the other quality comments, if the video got corrupted on upload.

@Gonkers44 5 лет назад

I had to manually select 360p for the video to load.

@tommerchant7542 5 лет назад

There's also a good IIR filter for gaussian blurs.

@hojjat5000 5 лет назад

That number should be multiplied by 3, because RGB.

@remberto2008 5 лет назад

Can you do a video on QUADTREEs

@styleisaweapon 5 лет назад

Probably should be pointed out that the only circularly symmetric 2D filters that are separable into two 1D filters is the gaussian. This means that circularly symmetric filters like high pass filtering cannot be done by such decompositions.

@barrettstolzman4068 5 лет назад

Only 360p?

@deoxal7947 5 лет назад

And no speed options

@neumdeneuer1890 5 лет назад

watching it in 360 because I can't wait

@barrettstolzman4068 5 лет назад

@@sn0opyKS but the "Extra Bits" video has HD as an option already

@NuclearCraftMod 5 лет назад

Maybe a high-quality low-quality joke?

@spaqin 5 лет назад

sounds like my camerawork

@nNiceDreamsMadeTrue 5 лет назад

Love your channel! Could you do a video on how game protection works; what cracking a game looks like, tricks used by game studios and some historical highlights?

@MattyFez 5 лет назад

I see Mike uses a thinkpad

@aonoymousandy7467 5 лет назад

I own several, they are one of the most comfortable laptop keyboards and they look nice with their classic black monolithic design

@noahmccann4438 5 лет назад

“We need a computer” (at 4:55) - not if you really want to show the performance difference! Though I wouldn’t recommend trying to compute the 16 megapixel image with pen and paper...

@potatok123 5 лет назад

This video is blurred...

@DotcomL 5 лет назад

Wait a few hours EDIT: Oh I got the joke

@sebastianelytron8450 5 лет назад

Is this supposed to be funny?

@tehguitarque 5 лет назад

360p tho

@digitalairaire 5 лет назад

10:20

@FrankHarwald 5 лет назад

depending on how you view it, it's either interpolated or downscaled #nerdoff

@colinmaharaj 5 лет назад

Where is the actual convolution code in C++

@MrCOPYPASTE Год назад

Couldn't we create a copy of the original images rotated by 90º avoiding a vertical pass and use it also the horizontal filter? If I recall that would be cache friendly and be orders of magnitude faster(assuming that could be performed). I remember that for Doom wall rendering does that for source images.

@nO_d3N1AL 5 лет назад

A video on Fast Fourier Transform would be useful. I can't get my head around it

@zilberberghome736 3 года назад

O(n). n increased from 60(8sec) to 271 (×4.5). Gusses: 18-21 seconds -surely trolling :) time it took: 271/60*8sec=36sec. right on. Great video btw.

@samvente1261 5 лет назад

PLEASE DO THE FFT VIDEO

@Guyflyer12 5 лет назад

Does not seem that you guys posted the code!

@recklessroges 5 лет назад

Thank you for the github link... (just as I've migrated to gitlab.) Wasn't able to add an issue, so python3 run.py --no_separable_filters --sigma 3.0 ./christmas.jpg ./christmas.out.jpg Traceback (most recent call last): File "run.py", line 84, in main() File "run.py", line 61, in main compute.convolve(img, output, kernel_2d) File "compute.pyx", line 7, in compute.convolve (compute.c:1705) def convolve(float[:, :, ::1] image, float[:, :, ::1] output, float[:, ::1] kernel): ValueError: Buffer dtype mismatch, expected 'float' but got 'double' Makefile:10: recipe for target 'run-fast' failed make: *** [run-fast] Error 1

@brandoncarroll587 5 лет назад

WOW at 8:50 there are major spoilers on the screen. :D

@Wargon2013 5 лет назад

That camera on the tripod, is that a Sony?

@Computerphile 5 лет назад

Panasonic LX100 >Sean

@Wargon2013 5 лет назад

@@Computerphile Looks quite similar to the Sony Alpha 6000 series. Thanks.

@Jone952 4 года назад

bounds checks in C?

@maxmusterman3371 5 лет назад

instant braingasm

@DustinRodriguez1_0 5 лет назад

Python can be plenty fast, but writing fast Python usually isn't emphasized. There's a couple great articles from a guy at IBM online where he takes a straightforward Mandelbrot implementation that took several seconds to run and got it to running in nanoseconds. The code was still pretty readable, but you certainly have to do it intentionally. I'd expect first step for anything like this is you need to take advantage of numpys vectorized operations. Were you looping through the indices manually? You can use something called 'stride tricks' to do very fast sliding 2D windows over large matrices with numpy and keep operations vectorized which makes a big difference on modern CPUs.

@michaelpound9891 5 лет назад

Python is inded fast enough for 99.9% of use cases I've found. In this case, you need a bit of a workaround. It kind of depends on what you mean by python being fast here. The Numpy approach would be fine if you vectorised it all, but in some sense by fully vectorised what that means in practice is the heavy lifting is done almost entirely in C behind the scenes. This isn't a complaint about python, it's just how it's designed. I tried a partial vectorisation, but I left it at that because using a full vectorised approach (or indeed the convolve function in numpy) would have somewhat defeated the object of the demo. Similarly Opencv just uses an FFT, which again wouldn't show what I was trying to do!

@jasondoe2596 5 лет назад

Oh come on, Python is inherently *extremely* slow, and also constrained by its infamous GIL (global interpreter lock). There are indeed many things you can do to improve performance, from cython to pypy (I love pypy myself!) but pure Python is actually much, much slower than C, Java, Julia, Haskell etc. What helps is that part of its "batteries included" library (BLAS routines etc.) is actually written in very fast C, but if you have to write preformant code yourself you should be ready to use its FFI...

@jasondoe2596 5 лет назад

(and then it's not Python anymore)

@tehguitarque 5 лет назад

Irony with it being at 360p, but also bad because the slight blur is unseeable to me! 2n > n^2

@legotechnic27 5 лет назад

n=1?

@nichonifroa1 5 лет назад

Your server is called deadpool?

@Computerphile 5 лет назад

Maybe watch the "beast gpu cluster" video >Sean

@KuraIthys 5 лет назад

Good old bilinear filtering. The trick is in the name. XD

@jborgesz 5 лет назад

places Portuguese subtitles

@abstractgarden5822 5 лет назад

Paint .NET, wooo...

@WildEngineering 5 лет назад

photopea

@misterbonzoid5623 3 года назад

eggnog_latte is now my password

@VorpalGun 5 лет назад

Why only 360p?

@STDrepository 5 лет назад

what the heck is a bauble?

@typhoonf6 5 лет назад

My boy Mike went up a level in the last video with his favourite language being c# :-D

@mark- 5 лет назад

360p?

@vedient 5 лет назад

why we dont use the same method to do convolutions in CNN? It would make CNN faster.

@potatok123 5 лет назад

@Yupppi 3 года назад

It took pi minutes: seconds. Coincidence?

@csbruce 5 лет назад

Interpreted languages… for when you DO have all day!

@Ceelvain 5 лет назад

What *is* a scripting language? (That's a real question.) If you want to define that with respect to the way it's executed (with an interpreter, with a compiler to machine code, or with a compiler to byte code), that's actually incorrect. And actually, the notion of "compiled language" or "interpreted language" doesn't really make sens. There are C interpreters and and there are python compilers. And neither the language itself nor the way it's executed decide its performance. Compilers have all the time in the world (kind of) to analyze the code and generate the best possible machine code. Interpreters (and virtual machines) can collect statistics and perform the optimizations on the fly based on the actual distribution of values that your program runs on. There's no clear winner as of now.

@styleisaweapon 5 лет назад

@@Ceelvain obviously by "scripted" he actually means "interpreted." One of those situations where the person knows just enough to think themselves well versed while actually tricking them into being publicly misinformative.

@Tristoo 5 лет назад

This is the reason I don't like python. That wouldn't have been hard at all to do in C, but ofc he compiled the python or whatever instead of doing it.

@crynekproductions4700 5 лет назад

Look who lost some weight ?

@simonchapman9201 5 лет назад

You have fast, pretty, cheap options. Choose TWO. Or wait for all THREE options. And wait. Times up. New games are better games.

@golden9540 5 лет назад

First

@TimMeep 5 лет назад

360 for 2 hours, then suddenly 4K (odd)

@Ironypencil 5 лет назад

Not odd at all, very typical actually

@michaelcharlesthearchangel 5 лет назад

In a quantum computer We call a quantum network image "blur" a more APPropriate term: a "bLend". The 5D (and thus 5G) quantum super-encoders are called "bLenders". bLenders are fed into the quantum bRoadcasts of the Neuronet, the Matrix made up of all IOTs (Internet-of-things). A period of phases of IOTs is often called a nNOTt in the near future. What does "nNOTt" mean? ¿Neural-Net-of-things/thinks.