The Incredible Convenience of Mathematica Image Processing
This item was originally published on the Wolfram Research blog at http://blog.wolfram.com/2008/12/01/the-incredible-convenience-of-mathematica-image-processing/
It's been possible since Version 6 of Mathematica to embed images directly into lines of code, allowing such stupid code tricks as expanding a polynomial of plots.
But is this really good for anything?
As with many extremely nifty technologies, this feature of Mathematica had to wait a while before the killer app for it was discovered. And that killer app is image processing.
Mathematica 7 adds a suite of image processing functions ranging from trivial to highly sophisticated. To apply them to images, you don't need to use any form of import command or file name references. Just type the command you want to use, then drag and drop the image from your desktop or browser right into the input line. Here's an undersaturated mandrill, and a command to increase the contrast. When you surround an image with textual input, the image is automatically displayed at icon size, but the input line still contains the full data of the image. That means if you save the notebook containing the input, the image is saved with it: the input line is completely self-contained.
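In plain text, a contrast command of this kind might look like the sketch below. The pasted-in image cannot be reproduced here, so the standard mandrill test image from ExampleData stands in for it (an assumption, and it is fetched from Wolfram's servers on first use). The contrast value 1 is an arbitrary choice.

```wolfram
(* Stand-in for the image that would be dragged into the input line (assumption) *)
mandrill = ExampleData[{"TestImage", "Mandrill"}];

(* The second argument of ImageAdjust is a contrast adjustment:
   0 leaves the image as it is, larger values give more contrast *)
ImageAdjust[mandrill, 1]
```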
Is that too much contrast? An obvious thing to want is a slider that lets you adjust the contrast. The general-purpose Manipulate command lets you make just about anything interactive, including the contrast parameter in this example.
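A minimal sketch of such a slider, again using the ExampleData mandrill as a stand-in image. The variable name c and the range 0 to 3 are arbitrary choices.

```wolfram
mandrill = ExampleData[{"TestImage", "Mandrill"}];

(* c starts at 1 and can be dragged between 0 and 3;
   the image is re-adjusted every time the slider moves *)
Manipulate[
 ImageAdjust[mandrill, c],
 {{c, 1, "contrast"}, 0, 3}]
```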
OK, but you could just do that in Photoshop, right? Oh shush, let's do something you definitely can't.
Here's a clown fish, and a command that breaks it into 40-pixel squares.
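A command of this kind could be sketched as follows, with the mandrill test image standing in for the clown fish (an assumption). In current versions ImagePartition returns the patches as a list of rows, so Flatten is used to get one flat list.

```wolfram
img = ExampleData[{"TestImage", "Mandrill"}];

(* Cut the image into 40x40-pixel squares; any partial squares at the
   right and bottom edges are dropped *)
rows = ImagePartition[img, 40];

(* One flat list of image patches, row by row *)
patches = Flatten[rows];
Length[patches]
```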
By the way, we're seeing another neat thing about the integration of images with Mathematica's typeset input/output system. This result isn't an image, it's a list of images. Lists are general things in Mathematica, and lists of images are no exception. For example, here are the image patches in reverse order.
And here they are sorted by average pixel value (roughly by brightness):
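Because the patches are an ordinary list, ordinary list functions apply. A minimal sketch of the reversal and the brightness sort, assuming the mandrill test image as input and a hypothetical helper named brightness:

```wolfram
img = ExampleData[{"TestImage", "Mandrill"}];
patches = Flatten[ImagePartition[img, 40]];

(* Same patches, last one first *)
Reverse[patches]

(* Hypothetical helper: mean of all channel values of all pixels, between 0 and 1 *)
brightness[p_] := Mean[Flatten[ImageData[p]]];

(* Darkest patch first, brightest last *)
SortBy[patches, brightness]
```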
And here are the images sorted into a scatter plot where the x axis represents the red component, the y axis represents the green component, and the size represents the blue component. (Image processing commands are fully integrated and compatible with charting commands. This is, after all, Mathematica, where one comes to expect things to work with each other.)
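One way to build a plot like this by hand is to place each patch with Inset inside a Graphics. This is a sketch under assumptions, not necessarily the charting command used in the post: the mandrill test image stands in for the fish, meanRGB is a hypothetical helper, and the size formula is an arbitrary scaling.

```wolfram
img = ExampleData[{"TestImage", "Mandrill"}];
patches = Flatten[ImagePartition[img, 40]];

(* Hypothetical helper: average {red, green, blue} of a patch, each between 0 and 1 *)
meanRGB[p_] := Mean[Flatten[ImageData[ColorConvert[p, "RGB"]], 1]];

Graphics[
 Table[
  With[{c = meanRGB[p]},
   (* position = {red, green}; width grows with the blue component *)
   Inset[p, c[[{1, 2}]], Center, 0.02 + 0.1 c[[3]]]],
  {p, patches}],
 Frame -> True, FrameLabel -> {"red", "green"},
 PlotRange -> {{0, 1}, {0, 1}}, AspectRatio -> 1]
```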
See how all the green patches are smooth and all the red patches are high-contrast? It might be interesting to look at these patches in a similarity network rather than a scatter plot. Here's some code to do that.
This function identifies patches that are similar in color, then connects them into a network. The parameter says how many neighbors to look at before building the network.
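The idea can be sketched with Nearest and Graph. Everything here is an assumption rather than the post's actual code: similarity is taken to mean closeness of average color, similarityEdges is a hypothetical helper, and Graph requires a version newer than Mathematica 7.

```wolfram
img = ExampleData[{"TestImage", "Mandrill"}];
patches = Flatten[ImagePartition[img, 64]];
n = Length[patches];

(* Average {r, g, b} of each patch *)
colors = Mean[Flatten[ImageData[ColorConvert[#, "RGB"]], 1]] & /@ patches;

(* Hypothetical helper: connect every patch to its k nearest patches in color space *)
similarityEdges[cols_, k_] := Module[{nearest = Nearest[cols -> Automatic]},
   DeleteDuplicates[
    Flatten[
     Table[
      Sort[{i, j}],
      {i, Length[cols]},
      (* ask for k + 1 neighbors, then drop the patch itself *)
      {j, DeleteCases[nearest[cols[[i]], k + 1], i]}],
     1]]];

edges = similarityEdges[colors, 2];

(* Draw the network with each vertex shown as its patch *)
Graph[Range[n], UndirectedEdge @@@ edges,
 VertexShape -> Thread[Range[n] -> patches], VertexSize -> 0.9]
```

Sorting each pair and removing duplicates keeps the network undirected, so a pair of patches that are mutual neighbors is linked only once.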
Remember how we made a Manipulate to play with the contrast of an image? How about a Manipulate to play with the number of neighbors?
So there you go: an interactive slider you absolutely cannot get in Photoshop, or anywhere else for that matter.
Let's look at another application, edge detection. Here's a very pretty picture taken by Peter Overmann, who led image processing development in Mathematica:
Here's what it looks like processed through three image processing functions:
LaplacianFilter applies a form of edge detection, ImageAdjust tweaks the brightness and contrast, and then Dilation expands the bright areas to emphasize them.
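As a rough sketch, the three-step chain might read as follows. The mandrill test image stands in for the photograph, and the radii 2 and 1 are arbitrary choices, not the post's values.

```wolfram
img = ExampleData[{"TestImage", "Mandrill"}];

(* Step 1: Laplacian filter over a radius-2 neighborhood picks out edges *)
edges = LaplacianFilter[img, 2];

(* Step 2: rescale pixel values to cover the full range from 0 to 1 *)
bright = ImageAdjust[edges];

(* Step 3: grow the bright regions by one pixel in every direction *)
Dilation[bright, 1]
```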
The end result is neon jellyfish!
One of the main advantages of doing something like this in Mathematica is that you can trivially replicate operations for many images. Sure, you can write Photoshop scripts; I've done that many times, and every time I end up swearing at the thing wishing I could use a real language (and now that Mathematica 7 is out, I finally can). Beyond a certain level of complexity it's simply not efficient to mess around with dialog boxes and script editors: you want a real language with a real syntax and a real API.
A simple example: let's apply this image processing chain not to one image, but to a series of them. Just to show off, I'll download those images directly from the web using programmatically constructed URLs. Here's a command that constructs ten simple URLs for images from one of my personal websites and imports them all into Mathematica.
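Constructing the URLs is plain string manipulation. In this sketch the base address and file naming scheme are hypothetical placeholders, not the site used in the post, so the Import line is left commented out.

```wolfram
(* Hypothetical base URL and naming scheme *)
base = "http://example.com/photos/";

(* Ten URLs: .../image1.jpg through .../image10.jpg *)
urls = Table[base <> "image" <> ToString[i] <> ".jpg", {i, 10}];

(* With a real base URL, this would download all ten images into a list:
   images = Import /@ urls;
*)
First[urls]
```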
And now let's apply the same processing to all ten of those images.
Well, that's not ideal. Let's make a custom interactive application with sliders that let us play with the Laplacian filter radius and the dilation parameters:
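A sketch of how such an application could be put together. The first three ExampleData test images, scaled down to 200 pixels wide, stand in for the ten downloaded images (an assumption), and neon is a hypothetical name for the processing chain.

```wolfram
(* Stand-in images: the first three standard test images, shrunk for speed *)
images = ImageResize[ExampleData[#], 200] & /@ Take[ExampleData["TestImage"], 3];

(* Hypothetical helper: the edge-detect / adjust / dilate chain with both radii exposed *)
neon[img_, r_, d_] := Dilation[ImageAdjust[LaplacianFilter[img, r]], d];

(* Each slider move reprocesses every image in the list *)
Manipulate[
 neon[#, r, d] & /@ images,
 {{r, 2, "Laplacian radius"}, 1, 5, 1},
 {{d, 1, "dilation radius"}, 1, 4, 1}]
```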
Every time you move a slider, it's reprocessing ten images automatically. Now that's integration, real power that comes from taking core functionality and combining it with powerful, general-purpose tools like Manipulate and symbolic processing. Not to belabor the point, but this is really a fundamental advantage of Mathematica over any other system out there.
Of course the ultimate example of wanting to apply image processing to many images is video processing. To the goose footage!
Here's a frame from a video I want to process:
What I want to do is draw a blue line around each goose. Why? Don't ask, it's a demo, OK?
The function I need for this is MorphologicalComponents, which identifies objects in an image.
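To see what MorphologicalComponents does, it helps to try it on a tiny hand-made binary image. The 4x5 array below is made up for illustration. In current versions the result is a matrix of integer labels, with 0 for the background.

```wolfram
(* Two separate 2x2 white squares on a black background *)
binary = Image[{
    {1, 1, 0, 0, 0},
    {1, 1, 0, 0, 0},
    {0, 0, 0, 1, 1},
    {0, 0, 0, 1, 1}}];

(* Every connected white region gets its own positive integer label *)
labels = MorphologicalComponents[binary];
MatrixForm[labels]

(* The largest label is the number of objects found; the two squares
   do not touch, so this should report two objects *)
Max[labels]
```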
To turn this into a blue outline I apply a series of image processing steps. First I expand the dark area:
Then I take the perimeter (outline) of the areas:
Then thicken the outline:
Then turn the outline into a semi-transparent blue:
And finally composite this outline onto the original image:
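The five steps can be sketched end to end on a synthetic frame. Everything specific here is an assumption: two dark disks on a light background stand in for the geese, the objects are found with a simple threshold, the radii and the 0.6 opacity are arbitrary, and ConstantImage requires a version newer than Mathematica 7.

```wolfram
(* Synthetic frame: two dark disks (gray 0.2) on a light background (gray 0.9) *)
blob = ArrayPad[DiskMatrix[15], 20];
frame = Image[0.9 - 0.7 Join[blob, blob, 2]];

(* White wherever the frame is dark, i.e. where the objects are *)
objects = Binarize[ColorNegate[frame], 0.5];

(* Expand the object area a little *)
grown = Dilation[objects, 2];

(* Keep only the perimeter of each area *)
outline = MorphologicalPerimeter[grown];

(* Thicken the outline *)
thick = Dilation[outline, 1];

(* A solid blue image whose alpha channel is the outline at 60% opacity *)
overlay = SetAlphaChannel[
   ConstantImage[Blue, ImageDimensions[frame]],
   ImageMultiply[Image[thick, "Real"], 0.6]];

(* Composite the semi-transparent outline onto the original frame *)
ImageCompose[ColorConvert[frame, "RGB"], overlay]
```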
There you go, blue-outlined geese! Now I just need to do it for the other 424 frames in the movie. The movie is easily split into separate frames using QuickTime Pro. I put all the frames into a directory called "GeeseIn" in the same directory as this notebook. First I count the number of frames:
This command loads one frame of the movie:
The frames load pretty fast, so I can make a perfectly usable little movie viewer just using Manipulate. Put into animation mode and adjusted to the right speed, this actually plays the movie quite smoothly, despite the fact that it's reimporting each frame from disk at every step.
To process the movie automatically I put all the image processing steps together into one function:
This function is fast enough that I can make another little viewer that lets me flip through the processed movie in pretty close to real time:
To output a full movie I need to convert and save all the frames. This function writes one frame out to a directory:
And this Do command processes all the frames:
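The whole load, process, and save loop might be sketched as below. To keep it runnable without the goose footage, it writes three synthetic frames into a temporary directory first. The file names, the helper names (loadFrame, processFrame, writeFrame), and the processing parameters are all hypothetical, and temporary directories stand in for "GeeseIn" and an output directory.

```wolfram
(* Temporary stand-ins for the input and output frame directories *)
inDir = CreateDirectory[];
outDir = CreateDirectory[];

(* Three synthetic frames: a dark disk drifting right on a light background *)
Do[
  Export[
   FileNameJoin[{inDir, "frame" <> IntegerString[i, 10, 3] <> ".png"}],
   Image[0.9 - 0.7 ArrayPad[DiskMatrix[10], {{15, 15}, {5 i, 30 - 5 i}}]]],
  {i, 3}];

(* Count the frames *)
frameFiles = Sort[FileNames["*.png", inDir]];
frameCount = Length[frameFiles]

(* Load one frame by number *)
loadFrame[i_] := Import[frameFiles[[i]]];

(* All processing steps in one function: find dark objects, outline them in blue *)
processFrame[img_] := Module[{outline, overlay},
   outline = Dilation[
     MorphologicalPerimeter[Dilation[Binarize[ColorNegate[img], 0.5], 2]], 1];
   overlay = SetAlphaChannel[
     ConstantImage[Blue, ImageDimensions[img]],
     ImageMultiply[Image[outline, "Real"], 0.6]];
   ImageCompose[ColorConvert[img, "RGB"], overlay]];

(* Write one processed frame to the output directory under the same file name *)
writeFrame[i_] := Export[
   FileNameJoin[{outDir, FileNameTake[frameFiles[[i]]]}],
   processFrame[loadFrame[i]]];

(* Process every frame *)
Do[writeFrame[i], {i, frameCount}];

(* A small viewer that flips through the processed frames *)
Manipulate[processFrame[loadFrame[i]], {i, 1, frameCount, 1}]
```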
Finally, it's a simple matter to assemble the movie using QuickTime Pro, resulting in this final product:
This is of course just the tip of the iceberg. For example, I'm not using the fact that MorphologicalComponents actually separately identifies and numbers each object in the image. It returns an image where each pixel is assigned the index number of the object occupying it. In other words, I can color each goose differently!
This constructs a table of different colors for each object index number, and black for the background.
This command identifies the objects, then colors each one according to the given rules.
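A sketch of the coloring step on the same kind of synthetic frame. The evenly spaced hues and the names palette and rules are assumptions for illustration, and the code treats the output of MorphologicalComponents as an integer matrix, as it is in current versions.

```wolfram
(* Synthetic frame: two dark disks on a light background *)
blob = ArrayPad[DiskMatrix[15], 20];
frame = Image[0.9 - 0.7 Join[blob, blob, 2]];

(* Label the objects: 0 is background, 1, 2, ... are the objects *)
labels = MorphologicalComponents[Binarize[ColorNegate[frame], 0.5]];
n = Max[labels];

(* One {r, g, b} color per object index, spread evenly around the hue circle *)
palette = Table[i -> List @@ ColorConvert[Hue[(i - 1)/n], "RGB"], {i, n}];

(* Black for the background *)
rules = Prepend[palette, 0 -> {0, 0, 0}];

(* Replace every label in the matrix by its color and show the result as an image *)
Image[Replace[labels, rules, {2}], ColorSpace -> "RGB"]
```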
And this command processes all the frames.
The mod (or is that rad?) result is this movie:
The flickering of colors, well, that's because the index numbering of the objects changes from one frame to the next. Avoiding that, in effect doing persistent object detection across multiple frames, is a whole other can of worms, and beyond the scope of this blog post.
Though if you were wanting to develop such an algorithm, Mathematica might not be a bad place to start.
Thank you to David Eisenman for the suggestion.