Making Photo Mosaics
This item was originally posted on the Wolfram Research blog at http://blog.wolfram.com/2008/05/02/making-photo-mosaics/
You've probably seen examples of photo mosaics where each "tile" in the mosaic is a tiny photograph, selected so the overall brightness and color of the tiny photo averages out to the brightness and color needed for its position in the overall mosaic.
Following a suggestion by Ed Pegg, I suddenly found it impossible to imagine life without a photo mosaic of Dmitri Mendeleev, the principal inventor of the periodic table, made out of photographs of the elements.
It was convenient in this regard that I possess the world's largest stock library of photographs of the chemical elements---about 2000 photographs of roughly 1550 different physical samples of the pure and applied elements---along with a photograph of Mendeleev and a bit of software called Mathematica. (You can see this library at periodictable.com; don't forget to order a copy of my photo periodic table poster.)
You might think that creating photo mosaics is a standard task for which software, probably even free software, is available. And for all I know it is. But upon brief reflection I decided it would probably be faster and easier for me to write code to do this from scratch in Mathematica than it would be to find something to download and then figure out how to use it.
It turns out you can do a first pass at it with three lines of input. We need two things to start with. First, a directory full of image files: I made one called "Pool" filled with a couple thousand small (32-pixel square) JPEG files. Second, a master image to form the mosaic out of: I made one called MendeleevIcon.tif, which contains a 20x20 pixel image of Dmitri Mendeleev's face. (You don't want too large of a master image; otherwise the tiled composite images will be too small to see, and the effect is actually more interesting for relatively coarse mosaics because up close they look like nothing, while from a distance the face jumps out at you.)
With those prerequisites, here's all it takes to create a credible mosaic image:
If it doesn't look like a face to you, step back about ten feet from your monitor and look again.
Now, before we get into all the things wrong with this image, let's look at how the code works. The first line imports all the images and computes the average color of each one:
Import[] returns an expression of the form Graphics[Raster[matrix, ...], ...] where matrix is a two-dimensional array of {r, g, b} color values. To compute the average color, we extract the matrix, flatten it into a list of RGB values, and calculate the average (Mean[]) of those values (each value is a triple of numbers, but Mean is happy to work with lists as well as individual numbers, so it computes the average triple).
The result is a list of pairs, {image, averagecolor}, as we can see by looking at the first few elements:
Note that the actual image, not just a reference to the file, is part of the imagePool variable. This is convenient for a relatively small list of images; if you had 50,000 instead of 2000 you might choose to store just the file name and reimport the image when needed. As a point of reference, importing these 2000 images and computing the average colors takes about eight seconds on my laptop and uses about 24MB of memory in Mathematica.
The second line is the meat of the algorithm, if you can call it that:
The function closeMatch sorts all the images according to the root-mean-square distance of their average rgb color from the target rgb color passed in c. (Norm computes the length of a vector, which happens to be the same as computing the root-mean-square of the coordinate values if you don't care about a constant factor of two.) Then it randomly picks one of the closest 20 images. As you can see, this results in a good bit of repetition of images, though not nearly as many as if we just took the absolutely closest one each time. (And yes, it's incredibly wasteful to actually sort all 2000 of the images just to pick from the best 20, but hey, until computers get rights there's no law against abusing them. My time is much more valuable than my laptop's, and in this example it doesn't take that long to do the sorting anyway. You could use the function Ordering with a slightly more complicated sorting function to avoid this wasteful over-sorting, but as we will see there are better algorithms for picking the images anyway.)
The third line creates the output by mapping the closeMatch function onto the matrix of rgb values that comes from importing the master image, then wrapping that in Grid to lay the mini-images out in a matrix. (Without the Reverse the image would be upside down.)
And that's that. If you want to write the resulting image into a file, you can wrap the last line in Export["MendeleevMosaic.tif", ...] or use the Copy As menu to copy a bitmap image of the output cell to paste into your favorite image-editing program.
Now that we've explained the code, we can acknowledge some of the problems. The most glaring issue is, of course, the repetition: the way the photo mosaic game is played, it's generally considered poor form to ever repeat an image, let alone have several repeats in close proximity. The first draft was a useful test to develop some of the core concepts, and get a nice-ish result quickly, but with a bit more code we can do much better.
To avoid repetition we're going to write a function that selects the unique closest-possible match to a given color value, then drops that image from the pool of available images. The function uses Ordering to return the position of the minimum in the list of color distances, which is much more efficient than sorting the whole list. We select that image from the working pool of available images, then drop it from the working pool (the global variable workingPool needs to be reset to the whole list each time you make a new image).
We can now simply map closestMatchAndDrop onto the pixels in the master image to get a version without repetition:
(If you do see what you think is repetition, it's only because some samples are very similar. I know there are actually no identical images, because in selecting the files to put in the pool I did a global comparison of bits between all images and eliminated duplicates.)
Of course, there is an obvious problem with this version of the code: we are applying closestMatchAndDrop to the pixels in order from bottom left to top right, and the pool is getting smaller each time. We are in effect giving priority to the pixels lower down in the image to get the best pictures; those at the top have to make do with the dregs.
We can see how big a problem this is by filling the pixels in the opposite order (by moving the Reverse command inside the Map). Notice how in the first version the area above his chin, which was filled first, is fairly well defined, while in the second version his forehead is better but his chin is more washed out.
We can do better. I think a fairly good choice might be to assume that pixels near the midrange of brightness are more important than those that are very dark or very light. So we should fill the pixels not in any special order, but in order of distance from 50% brightness.
First we create a list of all the pixels represented as pairs of the form {color, position}, where color is the color of the pixel and position is the {row, column} position of that pixel in the original image. Then we sort that list by the distance of each pixel from 50% gray:
Then we reset the working pool of images and make a new list where each pixel is represented by a pair of the form {image, position}:
Now we just need to sort that list by its second elements (the positions), break it into rows, and use Grid to arrange all the images:
I think this one looks better than either of the others, and it has no obvious bias toward being better on one side or another. While picking in order from medium to more extreme color values is by no means a perfect algorithm, it's not bad, and let's pause here to consider how even this fairly sophisticated version of making image mosaics is still just a few lines of code.
The mosaic can be further refined, at the risk of being called a cheat, by post-processing the selected images to make their brightness and color balance exactly match the target pixel. In other words, we first pick the closest available picture, then tweak it to make it a perfect match, rather than just use the best available match.
These three functions take any image and skew the red, green, and blue color channels in one direction or the other to make their average exactly match a target color. The farther the target color is from the initial average color, the more obviously modified the resulting image will be.
Now we can repeat the last stage of the code above, but wrap colorShiftImage around the selected images:
This image looks better than any of the others from a distance, but up close many of the pictures are very washed out from attempting to get the light tones of the original image, which are lighter than any images in my pool. We can make improvements by acknowledging that the bright pixels are the hardest ones to fill, and changing the order in which the pixels are filled to give highest priority to the brightest ones:
I have lots and lots of images with black backgrounds, so the dark pixels look quite natural. The algorithm found images that were already very close to their target brightness.
By the way, I should mention that this code all works perfectly with color images, but Mendeleev exists only in black and white. This is fitting because almost all the elements, and therefore almost all the photographs I have to work with, are metals of one shade of gray or another. So given a database of photos of the elements, it's actually a good thing to work with black and white master images.
Let's try a new tack with a simple stochastic (random) algorithm. This one works by randomly selecting pairs of slots in the mosaic and the pool and seeing if the result can be improved by swapping those two colors. If so, it does the swap; if not, it does nothing and tries a new pair of slots.
I won't explain the code in detail other than to say that for convenience the image has been turned into a flat list of colors: the first 400 elements of currentColorsPool represent the 400 pixels of the mosaic.
To get started we initialize the master image and current color choices for each pixel in it:
The function doPairSwitches[n] tries n random swaps to see if they improve things. One fun thing about this algorithm is that you can watch what's going on by simply dynamically watching the value of currentColorsPool. The following command sets up an image that will always show the current state of the optimization. At first, it looks completely random, because it's just the first 400 colors in the pool.
But when we run the following command, the face takes shape quite rapidly:
(In the blog we've inserted a movie that shows roughly what this looks like at about 1/5 real time. On my laptop it takes about three seconds to do 100,000 cycles, while the movie runs for about 15 seconds here so you can better see what it's doing. In Mathematica itself, the output above would just change in place.)
For convenience and efficiency we've been working with just the colors, not the actual mini-images. To turn the list of colors back into images, we use the colorsToImagesRules list set up above (there are no identical average colors in the pool, so the color uniquely identifies its source image). Here's what the image looks like after 1,000,000 cycles (about 30 seconds on my laptop):
And just for fun, here's a movie showing the progress through 2,000,000 cycles (first every 10 cycles, then every 40 cycles, then every 2000 cycles):
Of course, generating this movie takes a good bit longer than a minute because it has to generate and export 3000 full image frames: it takes about an hour on my laptop. Here is the code that generates each frame of the movie:
So, this is pretty good, and it's a neat movie done with (not to sound like a broken record) not a whole lot of code.
But of course, there is a gold standard here. There is one uniquely most-perfect possible choice and arrangement of images, the one that gives the globally minimal error in color and brightness. Setting up this calculation is beyond my mathematical knowledge, so I turned to numerics expert Daniel Lichtblau, who kindly sent me the code below.
Daniel's solution is to use linear programming starting from a 400 by 2000 (approximately) matrix that gives the color distances between every pair of target values and available pool colors. Each of these 800,000 values is then multiplied by its own unique variable, and those values are added up; that becomes the target function to be minimized. But each slot in the master image can get only one image, not a linear combination of many of them, so Daniel introduces a set of 800,000 constraints that say each variable must be between 0 and 1. There are 2400 further constraints, which say the sum of the variable values in each row must equal 1, and columns must sum to no more than 1. These constraints will only be satisfied if each variable is exactly 0 or 1, and there's a single 1 in each row (some columns will have only 0s). Okay, this fact relies on use of random perturbations, a bit of subtle theory, and very probably some outright magic.
So my image problem is reduced to a linear programming problem with 800,000 variables and 802,400 constraints. Child's play! OK, it does take about four minutes on my laptop, using an interior point method, I am told.
Here is the setup, constructing the matrix of color distances, the target function, and the constraints. (The random fuzz added to each color ensures that every pixel has a different color, even though there are actually a fair number of duplicates. With duplicates there is no single unique best result, which upsets the algorithm.)
Here is the magic, a call to the built-in function NMinimize (this line takes about four minutes on my laptop):
And here we pick the results back out of the return value of NMinimize:
So there it is, the best possible image of Mendeleev made from my pool of images.
To judge how good the earlier results are, we can look at the value of the target function (the sum of the deviations from the target color at each pixel). For the globally optimal result, the value is 9087.82 (the absolute number doesn't mean very much; what counts is how it compares to the results from other algorithms).
In the order we tried them, here are the error sums for each algorithm. The very first one, which allowed repetition of the same image, has a much lower error even than the optimal result, but, of course, that's because it can cheat by using the same image multiple times, which makes the problem trivial and the resulting image uninteresting.
Interestingly, the relatively simple algorithm of picking the best match for each pixel in order of how close that pixel's value is to 50% brightness gave a better result than anything other than the globally optimal result. The random-swapping algorithm never gets much better than about 9500, even if you let it run for many more cycles.
This is a useful thing to know, because even with Mathematica you probably don't want to try computing the true global optimum for a much larger problem than this. Two thousand images is actually a fairly small pool to start with; 50,000 would be a better number, and with that many, computing a global optimum would be out of the question.
By finding a global optimum for this relatively small problem we have validated the simpler and far more efficient algorithm, and can use it with greater confidence when the pool is larger. (And in fact, the more images you have, the less difference you would expect between the simple algorithm's result and the optimal.)
While the random-swapping algorithm doesn't seem to give quite as good a result in the end, it does have the great advantage of producing an interesting sequence of intermediate images you can make into a movie.
One of the really nice things about Mathematica is how you can just start noodling around in a very quick, nimble environment with a lot of support infrastructure. The first three lines I wrote to get some kind of mosaic image are really quite remarkable in their compactness and simplicity. That's not an aberration---it's typical of work in Mathematica. Now, it wasn't there yet, but that code let me establish a framework in which to explore, leading, in relatively short order, to several new algorithms of increasing complexity and even a cute movie I haven't seen done before.
There is no other system in the world you can do this kind of stuff in this fast. None.
And by "this kind of stuff," of course, I don't mean specifically creating mosaic tile movies illustrating the progress of an optimization algorithm. I mean any sort of open-ended computational problem where you need to develop some original ideas and make them work.
(You can download this notebook, minus the output cells, and you can also get my "Pool" directory of images if you want to play with them (3.4MB zip file). Please note that all these images are copyrighted by me (Theodore Gray), and I'm giving you a dump of my entire library. You're welcome to play with them all you like, but if you post anything based on these images on a website, please give credit to periodictable.com as the source of the images. If you want to publish anything commercially, you'll need permission from me.)