

The Incredible Convenience of Mathematica Image Processing

This item was originally published on the Wolfram Research blog.

Mathematica allows you to embed images directly into lines of code

It's been possible since Version 6 of Mathematica to embed images directly into lines of code, allowing such stupid code tricks as expanding a polynomial of plots.

But is this really good for anything?

As with many extremely nifty technologies, this feature of Mathematica had to wait a while before the killer app for it was discovered. And that killer app is image processing.

Mathematica 7 adds a suite of image processing functions, from trivial to highly sophisticated. To apply them to images, you don't need any form of import command or file name reference. Just type the command you want to use, then drag and drop the image from your desktop or browser right into the input line. Here's an undersaturated mandrill, and a command to increase the contrast. When you surround an image with textual input, the image is automatically displayed at icon size, but the input line still contains the full data of the image. That means if you save the notebook containing the input, the image is saved with it: the input line is completely self-contained.

Image of an undersaturated mandrill, and a command to increase the contrast
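The input itself was a screenshot and isn't reproduced in this text. As a minimal sketch of what such a command might look like, the following assumes the built-in mandrill test image as a stand-in for the dragged-in picture (in a notebook the image would sit inline where `img` appears) and an arbitrary contrast value of 1:

```mathematica
(* Stand-in for the pasted image: the built-in mandrill test image (an assumption) *)
img = ExampleData[{"TestImage", "Mandrill"}];

(* The second argument of ImageAdjust is the contrast adjustment; 1 is an arbitrary choice *)
adjusted = ImageAdjust[img, 1]
```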

Is that too much contrast? An obvious thing to want is a slider that lets you adjust the contrast. The general-purpose Manipulate command lets you make just about anything interactive, including the contrast parameter in this example.

Using the Manipulate command to make the contrast parameter interactive
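A sketch of how the slider version might be written, again with the built-in mandrill test image standing in for the pasted picture. The slider range of 0 to 3 and the "contrast" label are assumptions chosen for illustration:

```mathematica
(* Stand-in image (an assumption); in a notebook the picture would be pasted inline *)
img = ExampleData[{"TestImage", "Mandrill"}];

(* Manipulate re-evaluates ImageAdjust every time the slider for c moves *)
panel = Manipulate[ImageAdjust[img, c], {{c, 0, "contrast"}, 0, 3}]
```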

OK, but you could just do that in Photoshop, right? Oh shush, let's do something you definitely can't.

Here's a clown fish, and a command that breaks it into 40-pixel squares.

Fish image broken into 40-pixel squares
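Splitting an image into square patches can be sketched with ImagePartition. The clown fish photo isn't available here, so this example assumes a synthetic 200 by 120 color gradient as a stand-in:

```mathematica
(* Stand-in for the clown fish photo: a synthetic 200x120 RGB gradient (an assumption) *)
img = Image[Table[{x/200., y/120., (x + y)/320.}, {y, 120}, {x, 200}]];

(* ImagePartition returns rows of patches; Flatten turns them into one flat list *)
patches = Flatten[ImagePartition[img, 40]];

Length[patches]  (* 15 for this stand-in: 5 columns by 3 rows *)
```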

By the way, we're seeing another neat thing about the integration of images with Mathematica's typeset input/output system. This result isn't an image, it's a list of images. Lists are general things in Mathematica, and lists of images are no exception. For example, here are the image patches in reverse order.

Fish image's patches in reverse order

And here they are sorted by average pixel value (roughly by brightness):

Fish image's patches sorted by average pixel value
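Because the patches are an ordinary list, the two rearrangements above need nothing image-specific beyond a way to measure brightness. A sketch, assuming the same synthetic gradient stand-in and a hypothetical helper named `meanValue`:

```mathematica
(* Stand-in image and its 40-pixel patches (the image is an assumption) *)
img = Image[Table[{x/200., y/120., (x + y)/320.}, {y, 120}, {x, 200}]];
patches = Flatten[ImagePartition[img, 40]];

(* Reverse works on a list of images like on any other list *)
reversed = Reverse[patches];

(* Average over every channel value of every pixel: a rough brightness measure *)
meanValue[p_] := Mean[Flatten[ImageData[p]]];

(* SortBy orders the patches by that number, darkest first *)
sorted = SortBy[patches, meanValue];
```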

And here are the images sorted into a scatter plot where the x axis represents the red component, the y axis represents the green component, and the size represents the blue component. (Image processing commands are fully integrated and compatible with charting commands. This is, after all, Mathematica, where one comes to expect things to work with each other.)

Image patches in a scatter plot
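One way such a plot might be built is with Inset, which places any expression, including an image, at a coordinate inside Graphics. This is a sketch rather than the code from the screenshot: the stand-in image, the `meanColor` helper and the size formula are all assumptions.

```mathematica
(* Stand-in image and its 40-pixel patches (the image is an assumption) *)
img = Image[Table[{x/200., y/120., (x + y)/320.}, {y, 120}, {x, 200}]];
patches = Flatten[ImagePartition[img, 40]];

(* Average {r, g, b} of a patch: flatten to a list of pixels, then take the mean *)
meanColor[p_] := Mean[Flatten[ImageData[p], 1]];

(* x = mean red, y = mean green, inset width grows with mean blue *)
scatter = Graphics[
  Table[
   With[{c = meanColor[p]},
    Inset[p, {c[[1]], c[[2]]}, Center, 0.03 + 0.1 c[[3]]]],
   {p, patches}],
  Axes -> True, AxesLabel -> {"red", "green"},
  PlotRange -> {{0, 1}, {0, 1}}, AspectRatio -> 1]
```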

See how all the green patches are smooth and all the red patches are high-contrast? It might be interesting to look at these patches in a similarity network rather than a scatter plot. Here's some code to do that.

Coding a function to make similarity graphs

This function identifies patches that are similar in color, then connects them into a network. The parameter says how many neighbors to look at before building the network.

Building a three-neighbor similarity network
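The function in the screenshot isn't reproduced here. As one possible approach, the sketch below links each patch to its k nearest neighbors in average color using Nearest and Graph. The name `similarityGraph`, the `meanColor` helper and the stand-in image are assumptions, and the original may well have been written differently.

```mathematica
(* Stand-in image and its 40-pixel patches (the image is an assumption) *)
img = Image[Table[{x/200., y/120., (x + y)/320.}, {y, 120}, {x, 200}]];
patches = Flatten[ImagePartition[img, 40]];

meanColor[p_] := Mean[Flatten[ImageData[p], 1]];

similarityGraph[patchList_List, k_Integer] :=
 Module[{colors, n, nearest, pairs},
  colors = meanColor /@ patchList;
  n = Length[patchList];
  (* NearestFunction that returns patch indices instead of colors *)
  nearest = Nearest[colors -> Range[n]];
  (* ask for k + 1 neighbors because each patch is nearest to itself, then drop it *)
  pairs = Flatten[
    Table[Sort[{i, j}],
     {i, n}, {j, DeleteCases[nearest[colors[[i]], k + 1], i]}], 1];
  (* Union removes pairs found from both ends; each vertex is drawn as its patch *)
  Graph[Range[n], UndirectedEdge @@@ Union[pairs],
   VertexShape -> Thread[Range[n] -> patchList], VertexSize -> 0.8]]

network = similarityGraph[patches, 3]
```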

Remember how we made a Manipulate to play with the contrast of an image? How about a Manipulate to play with the number of neighbors?

Manipulating the number of neighbors in our similarity graph

So there you go: an interactive slider you absolutely cannot get in Photoshop, or anywhere else for that matter.

Let's look at another application, edge detection. Here's a very pretty picture taken by Peter Overmann, who led image processing development in Mathematica:

Unprocessed image of jellyfish

Here's what the jellyfish look like after passing through three image processing functions:

Processed image of jellyfish

LaplacianFilter applies a form of edge detection, ImageAdjust tweaks the brightness and contrast, and then Dilation expands the bright areas to emphasize them.
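The three-step chain can be sketched as a single nested expression. The jellyfish photo and the parameters actually used are not in this text, so the stand-in image, the filter radius of 2 and the dilation radius of 1 are assumptions:

```mathematica
(* Stand-in for the jellyfish photo: the built-in mandrill test image (an assumption) *)
img = ExampleData[{"TestImage", "Mandrill"}];

(* Innermost first: find edges, stretch brightness and contrast, then grow the bright areas *)
neon = Dilation[ImageAdjust[LaplacianFilter[img, 2]], 1]
```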

The end result is neon jellyfish!

One of the main advantages of doing something like this in Mathematica is that you can trivially replicate operations for many images. Sure, you can write Photoshop scripts; I've done that many times, and every time I end up swearing at the thing wishing I could use a real language (and now that Mathematica 7 is out, I finally can). Beyond a certain level of complexity it's simply not efficient to mess around with dialog boxes and script editors: you want a real language with a real syntax and a real API.

A simple example: let's apply this image processing chain not to one image, but to a series of them. Just to show off, I'll download those images directly from the web using programmatically constructed URLs. Here's a command that constructs ten simple URLs for images from one of my personal websites and imports them all into Mathematica.

Downloading images from the web
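The actual website address isn't given in this text, so the sketch below uses a clearly hypothetical base URL and file naming scheme. It shows only the general pattern: build the URL strings with Table, then map Import over them.

```mathematica
(* Hypothetical base URL and naming scheme; substitute a real image location *)
base = "http://example.com/photos/image";

(* Ten URLs: .../image1.jpg through .../image10.jpg *)
urls = Table[base <> ToString[i] <> ".jpg", {i, 10}];

(* Import fetches and decodes a URL, giving an Image on success and $Failed otherwise *)
importAll[urlList_List] := Import /@ urlList
```

Evaluating `images = importAll[urls]` with a real base URL would then give the list of ten images in one step.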

And now let's apply the same processing to all ten of those images.

Processing all ten images

Well that's not ideal. Let's make a custom interactive application with a slider that lets us play with the Laplacian filter radius and the dilation parameters:

Manipulating filter radius and dilation parameters
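A sketch of what such a two-slider application might look like. The downloaded photos aren't available here, so three variants of the built-in mandrill test image stand in for them. The helper name `neonFilter` and the slider ranges are assumptions.

```mathematica
(* Stand-ins for the downloaded photos: three variants of one test image (an assumption) *)
mandrill = ExampleData[{"TestImage", "Mandrill"}];
images = {mandrill, ImageReflect[mandrill], ColorNegate[mandrill]};

(* The same chain as before, with both radii exposed as parameters *)
neonFilter[img_, r_, d_] := Dilation[ImageAdjust[LaplacianFilter[img, r]], d];

(* Moving either slider reprocesses every image in the list *)
panel = Manipulate[
  neonFilter[#, r, d] & /@ images,
  {{r, 2, "Laplacian radius"}, 1, 5, 1},
  {{d, 1, "dilation radius"}, 1, 5, 1}]
```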

Every time you move a slider, it's reprocessing ten images automatically. Now that's integration, real power that comes from taking core functionality and combining it with powerful, general-purpose tools like Manipulate and symbolic processing. Not to belabor the point, but this is really a fundamental advantage of Mathematica over any other system out there.

Of course the ultimate example of wanting to apply image processing to many images is video processing. To the goose footage!

Here's a frame from a video I want to process (click the image to see the whole video):

Movie frame--click to view entire movie

What I want to do is draw a blue line around each goose. Why? Don't ask, it's a demo, OK?

The function I need for this is MorphologicalComponents, which identifies objects in an image.

Using MorphologicalComponents

To turn this into a blue outline I apply a series of image processing steps. First I expand the dark area:

Expanding the dark areas with the Erosion function

Then I take the perimeter (outline) of the areas:

Finding the components' perimeters

Then thicken the outline:

Thickening the outline with the Dilation command

Then turn the outline into a semi-transparent blue:

Changing the ColorSpace of the outlines

And finally composite this outline onto the original image:

Composite the outline onto the original image
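The steps above can be sketched end to end on a synthetic frame. This is not the code from the screenshots. It assumes two dark blobs on a light background in place of the geese, uses Binarize to get the object mask rather than the output of MorphologicalComponents, and picks the radii and the 60% opacity arbitrarily.

```mathematica
(* Stand-in frame: two dark blobs on a light background (an assumption) *)
frame = Image[Table[
    If[(x - 40)^2 + (y - 30)^2 < 200 || (x - 100)^2 + (y - 50)^2 < 150, 0.2, 0.9],
    {y, 80}, {x, 140}]];

objects = Binarize[frame, 0.5];   (* dark objects become black, background white *)
grown = Erosion[objects, 2];      (* eroding the white background expands the dark areas *)

(* Negate so the objects are the white foreground, then keep only their boundary *)
edges = MorphologicalPerimeter[ColorNegate[grown]];
thick = Dilation[edges, 1];       (* thicken the outline *)

(* A solid blue image whose alpha channel is 0.6 on the outline and 0 elsewhere *)
blue = Image[ConstantArray[{0., 0., 1.}, Reverse[ImageDimensions[frame]]]];
overlay = SetAlphaChannel[blue, ImageMultiply[Image[thick, "Real"], 0.6]];

(* Composite the semi-transparent outline onto the original frame *)
result = ImageCompose[ColorConvert[frame, "RGB"], overlay]
```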

There you go, blue-outlined geese! Now I just need to do it for the other 424 frames in the movie. The movie is easily split into separate frames using QuickTime Pro. I put all the frames into a directory called "GeeseIn" in the same directory as this notebook. First I count the number of frames:

Counting the frames from the movie

This command loads one frame of the movie:

Loading a single frame

The frames load pretty fast, so I can make a perfectly usable little movie viewer just using Manipulate. Put into animation mode and adjusted to the right speed, this actually plays the movie quite smoothly, despite the fact that it's reimporting each frame from disk at every step.

Using Manipulate as a movie viewer
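Counting, loading and viewing frames might look roughly like this. To keep the sketch self-contained it writes five synthetic frames into a temporary directory. In the setup described above, `dir` would instead point at the "GeeseIn" folder next to the notebook, for example `FileNameJoin[{NotebookDirectory[], "GeeseIn"}]`. The PNG format, the file names and the `loadFrame` helper are assumptions.

```mathematica
(* Stand-in for the "GeeseIn" folder: a temporary directory with five synthetic frames *)
dir = CreateDirectory[];
Do[
  Export[FileNameJoin[{dir, "frame" <> IntegerString[i, 10, 3] <> ".png"}],
   Image[ConstantArray[i/5., {40, 60}]]],
  {i, 5}];

(* FileNames returns the matching files in sorted order, one per frame *)
files = FileNames["*.png", dir];
frameCount = Length[files]

(* Load a single frame from disk by its position in the list *)
loadFrame[i_Integer] := Import[files[[i]]]

(* A minimal viewer: the slider picks the frame, which is reimported each time *)
viewer = Manipulate[loadFrame[i], {i, 1, frameCount, 1}]
```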

To process the movie automatically I put all the image processing steps together into one function:

Combining all the image processing into a single function

This function is fast enough that I can make another little viewer that lets me flip through the processed movie in pretty close to real time:

Viewing the processed movie with Manipulate

To output a full movie I need to convert and save all the frames. This function writes one frame out to a directory:

Writing a single frame to a directory

And this Do command processes all the frames:

Processing all the frames
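A sketch of the write-one-frame function and the Do loop that drives it. It uses temporary input and output directories with synthetic frames so that it stands alone. `processFrame` is only a placeholder for the outlining function, and the names `writeFrame`, `inDir` and `outDir` are assumptions.

```mathematica
(* Temporary stand-ins for the input and output folders, with five synthetic frames *)
inDir = CreateDirectory[];
outDir = CreateDirectory[];
Do[
  Export[FileNameJoin[{inDir, "frame" <> IntegerString[i, 10, 3] <> ".png"}],
   Image[ConstantArray[i/5., {40, 60}]]],
  {i, 5}];
files = FileNames["*.png", inDir];

(* Placeholder processing step; the real one would draw the blue outlines *)
processFrame[img_] := ColorNegate[img];

(* Read frame i, process it, and write it under the same file name in outDir *)
writeFrame[i_Integer] :=
  Export[FileNameJoin[{outDir, FileNameTake[files[[i]]]}],
   processFrame[Import[files[[i]]]]];

(* Process every frame in order *)
Do[writeFrame[i], {i, Length[files]}]
```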

Finally, it's a simple matter to assemble the movie using QuickTime Pro, resulting in this final product:

The finished movie--click to view

This is of course just the tip of the iceberg. For example, I'm not using the fact that MorphologicalComponents actually separately identifies and numbers each object in the image. It returns an image where each pixel is assigned the index number of the object occupying it. In other words, I can color each goose differently!

This constructs a table of different colors for each object index number, and black for the background.

Establishing a color for each object image number

This command identifies the objects, then colors each one according to the given rules.

Identifying and coloring each object
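The color table and the recoloring step can be sketched as follows, on the assumption that MorphologicalComponents returns an integer matrix (as it does in recent versions) with 0 for the background. The synthetic two-blob frame, the use of Hue to pick the colors, and the variable names are all assumptions.

```mathematica
(* Stand-in frame: two dark blobs on a light background (an assumption) *)
frame = Image[Table[
    If[(x - 40)^2 + (y - 30)^2 < 200 || (x - 100)^2 + (y - 50)^2 < 150, 0.2, 0.9],
    {y, 80}, {x, 140}]];

(* Objects must be the white foreground, so negate before thresholding *)
mask = Binarize[ColorNegate[frame], 0.5];

(* Each connected object gets its own index 1..n; background pixels get 0 *)
comps = MorphologicalComponents[mask];
n = Max[comps];

(* One RGB triple per object index, plus black for the background *)
colorRules = Append[
   Table[i -> (List @@ ColorConvert[Hue[N[i/n]], "RGB"]), {i, n}],
   0 -> {0., 0., 0.}];

(* Replace every index in the matrix by its color and view the result as an image *)
colored = Image[Replace[comps, colorRules, {2}]]
```

In recent versions the built-in `Colorize[comps]` does a similar job with automatically chosen colors.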

And this command processes all the frames.

Applying the separate colors to all frames

The mod (or is that rad?) result is this movie:

Colored geese--click to view movie

The flickering of colors, well, that's because the index numbering of the objects changes from one frame to the next. Avoiding that, in effect doing persistent object detection across multiple frames, is a whole other can of worms, and beyond the scope of this blog post.

Though if you wanted to develop such an algorithm, Mathematica might not be a bad place to start.

Thank you to David Eisenman for the suggestion.