- Index
- ImageMagick Examples Preface and Index
- Methods of Comparing Images -- what is different?
- Comparison Statistics -- how different?
- Finding Duplicate Images -- finding two images that are the same
- Sorting Images by Type -- image classifications for comparing
- Handling Specific Image Types
- Image Metrics -- fingerprinting images
- Web Cameras -- Finding what has changed in fixed cameras
- Locate Sub-Images -- Locate a known sub-image on a larger image
The ability to compare two or more images, or find duplicate images in a
large collection, is a very tricky matter. In these examples we look at
comparing images to determine how similar they are, and where they
differ.
This may involve classifying or grouping images into various types for better
handling, discovering some metric to simplify and group similar images, and
clustering similar images together based on such metrics.
However such comparisons and studies, while difficult, can be rewarding,
providing the ability to find image duplicates and copies, and even remove
'spam' or other text or notices from images.
Methods of Comparing Images
Compare Program
The "compare" program is provided to give you an easy way to
compare two similar images, and determine just how 'different' the images are.
For example, here I have two frames of an animated 'bag', which I then gave to
"compare" to highlight the areas where it changed.
compare bag_frame1.gif bag_frame2.gif compare.gif
As you can see you get a white and red image, which has a 'shadow' of the
second image in it. It clearly shows that three areas changed between the two
images.
Rather than saving the 'compare' image, you can of course view it directly,
which I find more convenient, by outputting to the special "x:" output
format, or using the "display" program. For example...
compare bag_frame1.gif bag_frame2.gif x:
compare bag_frame1.gif bag_frame2.gif miff:- | display
As of IM v6.4 you can change the color of the differences from red
to some other more interesting color...
compare bag_frame1.gif bag_frame2.gif \
-highlight-color SeaGreen compare_color.gif
As of IM v6.4.2-8 you can specify the other color as well.
compare bag_frame1.gif bag_frame2.gif \
-highlight-color SeaGreen -lowlight-color PaleGreen \
compare_colors.gif
If you don't want that 'shadow' of the second image, from IM v6.4.2-8 you can
add a "
-compose src
" to the options to remove it.
compare bag_frame1.gif bag_frame2.gif \
-compose Src compare_src.gif
By using all three extra settings we can generate a gray-scale mask of the
changed pixels...
compare bag_frame1.gif bag_frame2.gif \
-compose Src -highlight-color White -lowlight-color Black \
compare_mask.gif
Note however that this mask is of ANY difference, even the smallest
difference. For example you can see all the minor differences that saving an
image to the
Lossy JPEG Format produces...
convert bag_frame1.gif bag_frame1.jpg
compare bag_frame1.gif bag_frame1.jpg compare_lossy_jpeg.gif
As you can see, even though you can't really see any difference between the
GIF and JPEG versions of the image, "compare" reports a lot of
differences.
By using a small
Fuzz Factor you can ask IM to
ignore these minor differences between the two images.
compare -metric AE -fuzz 5% \
bag_frame1.gif bag_frame1.jpg compare_fuzz.gif
Which shows that most of the actual differences
are only minor.
The special "-metric" setting of 'AE' (short for "Absolute Error" count)
will report (to standard error) a count of the actual number of pixels
that were masked, at the current fuzz factor.
Difference Images
To get a better idea of exactly how different the images are, you are probably
better off generating a more exact 'difference' composition image....
composite bag_frame1.gif bag_frame1.jpg \
-compose difference difference_jpeg.gif
As you can see, while "compare" showed that JPEG created a lot of
differences between the images, a 'difference' composition is quite dark,
indicating that all the differences were relatively minor.
If the resulting image looks too black to see the differences, you may like to
Contrast Stretch the image so as to
enhance the results.
convert difference_jpeg.gif -contrast-stretch 0 difference_norm.gif
This still shows that most of the differences are very minor, with the
largest differences occurring along the sharp edges of the image, something
the JPEG file format does not handle very well.
On the other hand getting a difference image of the two original frames shows
very marked differences between the two images, without any enhancement.
composite bag_frame1.gif bag_frame2.gif \
-compose difference difference_frames.gif
Note that as the 'difference' compose method is associative, the order of the
two images in the above examples does not matter, although unlike
"compare", you can compare different sized images, with the
destination image determining the final size of the difference image.
The compose method is even more useful when used with the
"convert" program, as you can process the resulting image further
before saving or displaying the results. For example, we can gray-scale the
result, so as to get a better comparison image than the colorful one above.
convert bag_frame1.gif bag_frame2.gif -compose difference -composite \
-colorspace gray difference_gray.gif
However that produces a simple average of the RGB distances. As a result a
single bit color difference may not actually register as a difference! A
better method is to add the separate color channels of the difference image,
to ensure you capture ALL the differences, including the most minor
difference.
convert bag_frame1.gif bag_frame2.gif -compose difference -composite \
-separate -background black -compose plus -flatten \
difference_plus.gif
Now unlike "compare", the difference image shows a mixture of
both images combined in the final result. For example, look at the weird
'talisman' that seems to appear in the forehead of the cat (originally the
handle of the bag from the first image). This merger can make it confusing as
to exactly what differences you are seeing, as you see both the parts that
were added to an image as well as those subtracted.
Because of this confusion of details, the "compare" output is usually the
better way for us humans to view differences, while the 'difference' image is
the better method for further processing.
You can continue this process by thresholding the result to generate a mask of
every pixel that changed between the two images.
convert bag_frame1.gif bag_frame2.gif -compose difference -composite \
-separate -background black -compose plus -flatten \
-threshold 0 difference_mask.gif
This is basically what the "compare" program does, but with
more control over the color and output style.
See also
Transparency Masking, which internally uses difference images like these to
perform background removal. You may also like to look at this external page
on Change Detection as a practical example of its use.
Other Image comparison techniques...
-layers CompareAny
-layers CompareClear
-layers CompareOverlay
-layers OptimizeTransparency (uses 'ChangeMask')
Flicker Compare
An alternative to the "compare" program for seeing differences
between images is to do a flicker comparison between the similar images at a
reasonably fast rate.
convert -delay 50 bag_frame1.gif bag_frame2.gif -loop 0 flicker_cmp.gif
To make this easier I wrote a script called "flicker_cmp" which displays an
animation of the two given images, flipping between them just like the above
example. It also adds a label at the bottom of the displayed image detailing
which image you are seeing at any particular moment.
Comparing Animations
Comparing the differences in two coalesced animations using the 'film strip'
technique. See a similar 'append' technique in
Side by Side Appending.
convert \( anim1.gif -coalesce -append \) \
\( anim2.gif -coalesce -append \) miff:- | \
compare - miff:- |\
convert - -crop 160x120 +repage anim_compare.gif
This coalesces and appends the two (time synced) animations to form
one image for each animation. The two images are then compared, and
a new animation is created by splitting up the animation into separate
frames again.
The result is an animation of the 'compare' masked images.
Note that for this to work the "-crop" size must match the size of the
animation. Also the animation will lose any variable time delays it may have
had, using a constant time delay based on the first frame of the original
animation.
Another image comparison technique useful for animations is used to locate
all the areas in which an animation changes, so as to divide the animation
into separate images. See
Splitting up an
Animation.
Comparison Statistics
Just how different are two images?
Under Construction
Statistics from difference image...
convert image1 image2 -compose Difference -composite miff:- |\
identify -verbose - |\
sed -n '/statistics:/,/^ [^ ]/ p'
The numbers in parenthesis (if present) are normalized values between
zero and one, so that it is independent of the Q level of your IM.
If you don't have these numbers, you should think of upgrading your IM.
Reduce it to an average percentage above zero
convert image1 image2 \
-compose difference -composite -colorspace gray miff:- |\
identify -verbose - |\
sed -n '/^.*Mean: */{s//scale=2;/;s/(.*)//;s/$/*100\/32768/;p;q;}' | bc
With the same images I used for "compare" I had a result of...
.55
As you can see it is a VERY low number, so the images are very similar.
For a non-percentage value you can also use...
identify -format '%[mean]'
Compare Program Statistics...
You can get an actual average difference value using the -metric setting.
compare -metric MAE image1 image2 null: 2>&1
Adding -verbose will provide more specific information about each separate
channel.
compare -verbose -metric MAE rose.jpg reconstruct.jpg null: 2>&1
Image: rose.jpg
Channel distortion: MAE
red: 2282.91 (0.034835)
green: 1853.99 (0.0282901)
blue: 2008.67 (0.0306503)
all: 1536.39 (0.0234439)
There are a number of different metrics to choose from.
With the same set of test images (mostly the same)
Number of pixels
AE ...... Absolute Error count of the number of different pixels (0=equal)
This value can be thresholded using a -fuzz setting to
only count pixels that have a difference larger than the threshold.
As of IM v6.4.3 the -metric AE count is -fuzz affected,
so you can discount 'minor' differences from this count.
compare -metric AE -fuzz 10% image1.png image2.png null:
Which pixels are different can be seen using the output
image (ignored in the above command).
This is the ONLY metric which is 'fuzz' affected.
Maximum Error (of any one pixel)
PAE ..... Peak Absolute Error (within a channel, for 3D color space)
PSNR .... Peak Signal to Noise Ratio (used in image compression papers)
The ratio of the maximum possible mean square difference to the
actual mean square difference between the two images, expressed
as a decibel value. The higher the PSNR the closer the images
are, with a maximum difference (such as black vs white) giving
a PSNR of 0.
A PSNR of 20 means the mean square difference is 1/100 of maximum.
Average Error (over all pixels)
MAE ..... Mean absolute error (average channel error distance)
MSE ..... Mean squared error (averaged squared error distance)
RMSE .... (sq)root mean squared error -- IE: sqrt(MSE)
More info
MEPP .... Normalized Mean Error AND Normalized Maximum Error
These should be directly related to the '-fuzz' factor,
for images without transparency.
With transparency this becomes difficult: the mask should
affect the number of pixels compared, and thus the 'mean',
but this is currently not done.
I produced the following results on my test images...
_metric_|__low_Q_jpeg__|__black_vs_white__
PSNR | 29.6504 | 0
PAE | 63479 | 65535
MAE | 137.478 | 65535
MSE | 4.65489e+06 | 4.29484e+09
RMSE | 2157.52 | 65535
The first column of numbers is a compare of images with low-quality JPEG
differences, while the second, "black vs white", is a compare of a solid
black versus a solid white image.
The e+06 is scientific notation, indicating how many places to shift the
decimal point. EG: 4.65489e+06 --> 4,654,890.0
That is, it is equal to about 4.65 million, and is the square of 2157.52.
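These relationships can be checked with a little awk arithmetic. The values
are hard-coded from the table above, and 65535 is the Q16 maximum value:

```shell
# Cross-check the table: MSE should be the square of RMSE, and PSNR
# follows from RMSE and the Q16 maximum value of 65535.
awk 'BEGIN {
  rmse = 2157.52                                  # RMSE from the table
  printf "MSE  = %.5e\n", rmse * rmse             # matches the MSE row
  printf "PSNR = %.2f dB\n", 20 * log(65535 / rmse) / log(10)
}'
```

This prints an MSE of about 4.65489e+06 and a PSNR of about 29.65 dB, in
agreement with the first column of the table.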
WARNING: These numbers are dependent on the IM Quality (Q) level set at
compile time. The higher the quality, the larger the numbers. Only PSNR
should be unaffected by this.
I have NOT figured out any of the existing "-define" options to the
"compare" function.
For more info, see my very old raw text notes...
Image Comparing, Tower of Computational Sorcery
Finding Duplicate Images
Identical files
Are the files binary identical? That is, are they exactly the same file, and
probably just exact copies of each other? No ImageMagick required.
Don't discount this. You can compare lots of files very very quickly
in this way. The best method I've found is by using MD5 check sums.
md5sum * | sort | awk '{print $2 " " $1}' | uniq -Df 1
And that will list the md5's of images that are identical.
I have programs that can generate and compare md5sum lists of files
returning the files that are md5 identical.
After this things get a lot more difficult. And some type of human
verification is probably needed to validate the results.
Direct Comparison
You can directly compare two images (using the "compare" program)
if they are the same size, to see how well they match. (See above.)
This is very slow, and in my experience not very useful for searching a large
collection for exactly that reason.
Image Classification
In my attempts to compare images I have found that Color, Cartoon-like, and
Sketches all compare very differently to each other.
Line drawings and gray-scale images especially tend to have smaller
differences than color images, with just about every comparison method.
Basically, as the colors all lie along a line, any color metric tends to
place such images 3 times closer together (one channel versus three) than an
equivalent color image.
Basically this means that separating your images into at least these two
groups, can be a very important first step in any serious attempt at finding
duplicate or very similar images.
Other major classifications of image types can also make comparing images
easier, just by reducing the number of images you are comparing against.
See Image Classification below.
Thumbnail Compares
You have a program create (in memory) lots of small thumbnails (say 64x64
pixels) of the images to compare, looking for duplicates by direct
comparison.
It is typically the first thing that people (myself included) attempt, and in
fact this is the technique most image comparing programs (such as photo
handling software) use.
In fact this works well, and does find images that exactly match. Also, with
a little blur and a loosening of the difference threshold, it can even find
images that have been slightly cropped or resized.
However attempting to store in memory 10,000 such thumbnails will often cause
a normal computer to start thrashing, becoming very slow. Alternatively
storing all those thumbnails (unless the program does this for user viewing
reasons) uses a lot of disk space.
One method of improving the disk thrashing problem, is to only have a smaller
number of images in memory. That is by comparing images in groups, rather
than one image to all other images. A natural grouping is by directory, and
comparing each directory of images with other directories of images.
In fact this is rather good, as images tend to be grouped together, and this
group of images will often match a similar group. Outputting matching images
by directory pairs, is thus a bonus.
Also how acceptably similar two images are depends on their image type.
Comparing two line drawings needs a very small 'threshold' to discount
images that differ, while comparing images with large areas of color often
needs a much larger threshold to catch similar images that were cropped.
Real world images have a bigger problem, in that a texture can produce a very
serious additive difference between images that have a very slight offset.
Because of this you may need to simplify such images into general areas of
color, either by using median filters, blurring, color reduction, or color
segmentation. After such a process a real world image can generally be
compared in a similar way to cartoons.
Image Metrics
Creating a small metric for each image is a linear ordered O(n) operation,
while comparing all images with all other images is a squared ordered O(n^2)
operation.
Remember, a metric's job is not to actually find matching images, but to
group similar images, on which you can then do a more intensive comparison.
As such any metric comparison should be lenient, but not so lenient as to
include too many miss-matches.
In the next section (Metrics) are a number of different
IM generated metrics I have experimented with, or theorized about, including:
average color, predominate color, foreground/background color, edge colors,
matrix of colors, etc.
Günter Bachelier, has also reported the possibilities of using more exotic
metrics for image comparison, such as: Fourier descriptors, fractal
dimensions, convex areas, major/minor axis length and angles, roundness,
convexity, curl, solidity, shape variances, direction, Euler numbers, boundary
descriptors, curvature, bending energy, total absolute curvature, areas,
geometric centrum, center of mass, compactness, eccentricity, moments about
center, etc, etc.
My current effort is in generating and using a simple 3x3 matrix of color
averages to represent the image (see Color Matrix Metric below). As these are
generated (or requested) the metric is cached (with other file info) into
special files in each directory. This way I only need to re-generate a
particular metric when no cached metric is available, or the image has
changed.
Similarity or Distance
The metrics of two image (or the actual images) can be compared using a number
of different methods, generally producing a single distance measure or
'similarity metric' that can be threshold limited or cluster 'similar' images
together.
- Direct Threshold, or Maximum Difference, (Chebyshev Distance)
Just compare images by the largest difference in any one metric.
The threshold will produce a hyper-cube of similar images in the
multi-dimensional metric space. Of course the image difference is only
based on one metric and not over all metrics.
- Average Difference (Mean Distance, Averaged Manhattan Distance)
Sum all the differences and optionally divide by the number of metrics.
This is also known as the Manhattan Distance between two metrics, as
it is equivalent to the distance you need to cover to travel in a city
grid. All metrics contribute equally, resulting in things appearing
'closer' than you expect. In space a threshold of this metric will
produce a diamond like shape.
- Euclidean (Pythagorean) Difference
Or the direct vector distance between the metrics in metric space.
The value tends to be larger when more metrics are involved. However,
one metric producing a big difference, tends to contribute more than the
other metrics. A threshold produces a spherical volume in metric space.
- Mathematical Error/Data Fit (or Moment of Inertia?)
Sum the squares of all the differences, then take the square root.
This is more typically used to calculate how closely a mathematical curve
fits a specific set of data, but can be used to compare image metrics too.
This seems to provide the best non-vector distance measure.
- Vector Angle
Find the angle between the two lines from the center of the metric space
to each image's metric. This should remove any effect of contrast or
image enhancement that may have been applied to the two images.
Yet to be tested
- Vector Distance
For images that are line drawings or greyscale images, where all the
individual color vectors in a metric are in the same direction, it may be
the relative distances of the metrics from the average color of the image
that are important. Normalizing the distances relative to the largest
distance may reduce the effect of contrast.
That is, this is a comparison method for line drawing images.
Yet to be tested
- Cluster Analysis
All the metrics are plotted and grouped into similar clusters within the
multi-dimensional space. A good clustering package may even be able to
discover and discount metrics that produce no clustering.
Yet to be tested
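The first four distance measures above can be sketched with a few lines of
awk. The two metric vectors here are hard-coded example values standing in
for real image metrics:

```shell
# Compute Chebyshev, Manhattan, Euclidean and RMS ("mathematical error")
# distances between two metric vectors, using only awk.  The two vectors
# are hard-coded example values.
awk 'BEGIN {
  n = split("10 20 30", a); split("13 24 30", b)
  for (i = 1; i <= n; i++) {
    d = a[i] - b[i]; if (d < 0) d = -d     # per-metric absolute difference
    if (d > cheb) cheb = d                 # Chebyshev: largest difference
    man += d                               # Manhattan: sum of differences
    sq  += d * d                           # for Euclidean and RMS
  }
  printf "chebyshev=%g manhattan=%g euclidean=%g rms=%g\n",
         cheb, man, sqrt(sq), sqrt(sq / n)
}'
```

For these vectors the per-metric differences are 3, 4 and 0, giving a
Chebyshev distance of 4, a Manhattan distance of 7, a Euclidean distance of
5, and an RMS distance of sqrt(25/3).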
At the moment I am finding that the "Mathematical Error" technique seems to
work well for both gray-scale and color metrics, using a simple 3x3 averaged
"Color Matrix Metric" (see below).
Human Verification
After the computer has finished with its attempts to find matching images,
it is then up to the user to actually verify that the images match.
Presenting matches to the user can also be a difficult task, as they will
probably want the ability to...
- See the images side-by-side
- Flick very very quickly between two images, at their original size,
and optionally a common 'scaled' size.
- Flick between, or overlay, differently scaled and translated images.
to try to match up the images.
- See other images in the same directory (same source) as the matching
image, as the user may like to deal with a whole group rather than
individual images.
- Rename, Move, Replace, Delete, Copy the Images,
between the two (or more) directories.
- and so on...
Currently I group matches into sets and use a combination of programs to
handle them under the user's control. These programs include IM's
"display" and "montage", as well as the image viewers
"XV" and "GQview".
However I am open to other suggestions of programs that can open two or more
directories simultaneously, and display collections or image groups from
multiple directories. Remote control by other programs or scripts can be
vital, as it allows the image groups to be set up and presented in the best
way for the user to look at and handle.
No program has yet met my needs.
For example "gqview" has collections and a single directory view, but does
not allow multiple directory views, or remote / command line control of the
presentation. The collections also do not show what directory each image is
from, nor can the single directory view be flipped to some other directory.
On the other hand the very old "xv" does allow multiple directory
views (its 'visual schnauzer'), and a collection list in its control window,
but only one image can be viewed at a time, and only one directory can be
opened and positioned from its command line. Of course it also has no remote
control.
These are the best human verification programs I have found; I use a script
to set up and launch them for each image group, matching pair, or set of
matched images. But none are very satisfactory.
Cross-type Image Comparison
One of the harder things I would like to do is find images that were created
from another image. For example, I would like to find line drawings that
have been colored to produce cartoon like images, or one cartoon image that
has been recolored with different colors, or had some type of background
image added.
These things are very difficult and my experiments with edge detection
techniques have so far been inconclusive.
Finding the right metric in this is the key, as humans can make the
'similarity' connection much better, but you still have to find possible
matches to present to the user.
Summary of Finding Duplicate Images
In summary, my current procedure for finding and handling duplicate images is
a pipeline of programs to find and sort out 'similar' images.
Generate/Cache Image Types and Metrics
-> Compare metrics
-> compare full image
-> regroup into sets of matching images
-> human verification
As you can see I am looking at a highly staged approach.
Mail me your ideas!!!
Sorting Images by Type
Determining the type of an image is important, as most methods of comparing
images only work for a specific type of image. It is no good comparing an
image of text against an artist's sketch, for example, or using a color image
comparison method on an image which is almost pure white (a sketch).
Usually the first thing to do when comparing images is to determine what type
of image, or 'colorspace' the image uses. Basic classifications of images
can include...
- Black and white line drawing or text image.
- Gray-scale artist's sketch.
- Cartoon like color image with large areas of solid colors.
- A real life image with areas of shaded colors
- Very dark, almost all black images.
- Very light, almost all white images.
- Images consisting of two basic colors (other than black and white).
- Image contains some annotated text or logo overlay.
After you have basic categories you can also attempt to sort images, using
various image metrics, such as...
- Average color of the whole image
- Predominate color in image
- Foreground/Background color of image.
What is worse is that JPEG or resized images are often also color distorted,
making such classifications much more difficult, as colors will not be quite
as they should be. Greys will not be pure greys, and lines may not be sharp
and clear.
Gray-scale vs Color
Basically if you do a comparison difference of the image, against a
gray-scaled version of itself, it should produce very little difference if it
is gray-scale, and quite a lot of differences if it is colorful.
Examples... A color cartoon, and an artist's sketch.
Solutions:
* Compare image against a gray-scale version to see if any color is
present.
* use a slow -fx 'saturation' test.
* Or threshold that test with -fx 'saturation>.01' to
check if any pixel at all is 'colorful' rather than averaging it.
For example Color Difference maximum...
convert image.jpg \( +clone -colorspace gray \) \
-compose difference -composite -colorspace gray \
-format '%[maximum]' info:
which shows a 4% difference in colors from its pure grey-scale equivalent.
Alternative
convert image.jpg \( +clone -colorspace gray \) miff:- |
compare -metric PAE - null:
A "rose:" image produced 41120 (0.627451), or 62% error (not gray),
while a "granite:" image produced 2056 (0.0313725), or 3% error (gray).
The difference between a color average and a peak pixel difference
could be used to separate gray-scale from gray-scale with a small patch of
color, and even isolate or mask the area that has some color.
PROBLEM: The above does not work for a Sketch on say a yellow 'paper'
background. That is the image has a linear color gradient, but it is
not purely a black and white gradient.
Is Image Linear Color
The better technique is to do a direct 'best fit' of a 3 dimensional line to
all the colors (or a simplified Color Matrix of metrics) in the image. The
error of the fit (generally average of the squares of the errors) gives you
a very good indication about how well the image fits to that line.
The fitting of a line to the 3 dimensional image generally involves some
vector mathematics. The result will not only tell you if the image uses a
near 'linear' set of colors, but works for ANY scale of colors, not just
light to dark, but also off-grey lines on yellow paper.
The result can also be used to convert the image into a simpler 'grey
scale' image, (or just convert a set of color metrics to grey-scale metrics)
for simpler comparisons, and better match finding.
My trial test program does not even use the full image to do this
determination, but works using a simple Color Matrix Metric (see below) of
9 colors (27 values) to represent the image.
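One cheap way to approximate such a linearity test is to check how strongly
the color channels are correlated: for colors that lie on a line in RGB
space, each pair of channels is affinely related, so the correlation
coefficient is close to +/-1. A sketch in awk, with a hard-coded reddish-grey
ramp standing in for the 9 metric colors (with ImageMagick you could extract
real values with something like convert image.jpg -resize '3x3!' -depth 8
txt:-):

```shell
# Rough linearity test: for colors that lie on a line in RGB space, each
# pair of channels is affinely related, so |correlation| is close to 1.
# The "R G B" triples here are a hard-coded reddish-grey ramp.
printf '10 10 10\n60 50 40\n110 90 70\n160 130 100\n' |
awk '{ r[NR] = $1; g[NR] = $2 } END {
  n = NR
  for (i = 1; i <= n; i++) {
    sr += r[i]; sg += g[i]
    srr += r[i] * r[i]; sgg += g[i] * g[i]; srg += r[i] * g[i]
  }
  corr = (n*srg - sr*sg) / sqrt((n*srr - sr*sr) * (n*sgg - sg*sg))
  printf "r-g correlation = %.3f\n", corr
}'
```

A result near 1.000 (as here, since the ramp is exactly linear) suggests a
near-linear color set; a proper 3-dimensional line fit, as described above,
is more robust but harder to write in a few lines of shell.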
Mail me if interested, and let me know what you have tried.
Pure Black and White images
To see if an image is a near pure black and white image, with little in the
way of extra colors or greys (due to anti-aliasing), we can make novel use of
the "-solarize" option (see the IM example on Solarize).
Applying this operation to any image results in any bright colors becoming
dark colors (being negated). As such, any near-white colors will become
near-black colors. From such an image a simple statistical analysis will
determine if the image is mostly black and white.
For example...
convert wmark_dragon.jpg -solarize 50% -colorspace Gray wmark_bw_test.png
identify -verbose -alpha off wmark_bw_test.png | \
sed -n '/Histogram/q; /Colormap/q; /statistics:/,$ p' > wmark_stats.txt
If you look at the statistics above you will see that the color 'mean' is
very close to pure black ('0'), while the 'standard deviation' is also very
small, though larger than the 'mean'. Thus this image must be mostly pure
black and white, with very few colors or mid-tone greys.
For general gray-scale and color images, the 'mean' will be much larger, and
generally the 'standard deviation' smaller than the mean. When that happens
it means the solarized image has very little near-pure black in it; that is,
very few pure black or white colors are present.
Let's repeat this test using the built-in granite image.
convert granite: granite.jpg
convert granite.jpg -solarize 50% -colorspace Gray granite_bw_test.png
identify -verbose -alpha off granite_bw_test.png | \
sed -n '/Histogram/q; /Colormap/q; /statistics:/,$ p' > granite_stats.txt
Note how the 'mean' is now much larger, toward the middle of the color range,
with a 'standard deviation' that is much smaller than the 'mean'.
As of IM v6.4.8-3 you will also see two other statistic values that can be
helpful in determining the type of image. Both 'Kurtosis' and 'Skewness' are
relatively large (and positive) in the first black and white image, which
also reflects the fact that very few greys are involved, compared to a
gray-scale image. However 'mean' vs 'standard deviation' is still probably
the better indicator for comparison purposes.
Note that this comparison does not differentiate between 'black on white' and
'white on black', but once you know it isn't a gray-scale image, a simple
check will tell you what the background color really is.
Text vs Line Drawing
If you have an almost pure black and white image then you can try to see
if the image contents could be classified as either text, or a line drawing.
Text will have lots of small disconnected objects, generally grouped into
horizontal lines. On the other hand line drawings should have everything
mostly connected together as a whole, and involving many different angles.
Does anyone have a suggestion on how you can test for this? Email me.
Note that cartoon like color images could also be turned into a line drawing
for simpler image comparing, so a line drawing comparison method would be a
useful thing to have. Anyone?
A line drawing image may also be better for matching images with shifted
positions, and or rotations. Anyone?
Real Life vs Cartoon Like
Basically cartoons have very specific blocks of color with sharp bordered
regions, often made sharper by using a separating black line. They also
usually have a minimal gradient or shading effects. Real life images however
have lots of soft edging effects, color gradients, and textures, and use lots
of different colors.
This is of course not always true. A real life image could have a very
cartoon like quality about it, especially if very high contrast is used, and
some modern cartoons are so life-like that it can be difficult to classify
them as cartoons or animation.
Generally the major difference between a real life image and a cartoon is
textures and gradients. As such, to determine what type of image it is
requires you to compare the image to the same image with the fine scale
texture removed. A large difference means the image is more 'realistic' and
'real world' like, rather than 'cartoonish' or 'flat'.
Also remember a line drawing, artist's sketch, or text can also be very
cartoon like in style, but have such fine texture and detail that the above
test could classify the image as real world. As such line drawings and
sketches should be separated out beforehand.
Does anyone have a suggestion on how to do this? Email me.
Jim Van Zandt offers this solution...
- write out the color of every pixel
- sort by color
- write out the pixel count for every color
- sort by pixel count
- Work your way through the list until you have accounted for half the
pixels in the image.
- If #pixels >>> #colors then it's cartoon like.
The initial steps essentially generate a histogram. See the
"histogram:" examples.
If you have created some sort of image classification scheme.. Even if
only roughly, please let us know your results, so others (including myself)
can benefit.
Handling Specific Image Types
Here are notes and information on more specific image determination
techniques.
Bad Scan or Printouts
In the real world, things never work quite as perfectly as you would like.
Scanners have broken sensors and printer drums have scratches. Both of these
problems generally result in scans and printouts containing long vertical
lines. Determining if an image has these vertical lines is however fairly
easy.
The idea is to average the pixels of all the rows in an image together. Any
'fault' will appear as a sharp blip in the final pixel row, the number of
which you can count using a 'threshold histogram' of that row.
FUTURE -- image example needed for testing
convert bad_printout.png -crop 0x1+0+0 -average \
-threshold 50% -format %c histogram:info:-
When you have determined and removed such 'bad lines' from a fax, printout, or
scan, you can then continue with your other tests without needing to worry
about this sort of real world fault.
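As a rough sketch of the same idea, here is the row-averaging test in stdlib Python on a grid of grayscale values (the function name and threshold default are illustrative assumptions):

```python
def find_vertical_faults(rows, threshold=128):
    """Average all rows together into a single pixel row (what the
    "-crop 0x1 -average" above does), then flag any column whose
    average falls below the threshold -- a dark streak down the page."""
    width = len(rows[0])
    height = len(rows)
    column_avg = [sum(row[x] for row in rows) / height for x in range(width)]
    return [x for x, value in enumerate(column_avg) if value < threshold]

# A white 5x8 "page" with a scanner fault down column 3
page = [[255] * 8 for _ in range(5)]
for row in page:
    row[3] = 0
```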
Blank Fax
First you will need to "
-shave
" off any headers and footers that a fax may have added to a
page. You can then either take a 'threshold histogram' (see previous)
to see how many individual black pixels there are.
FUTURE -- image example needed for testing
convert blank_fax.png -threshold 50% -format %c histogram:info:-
Or you can do a
Noisy Trim to see if the
image actually contains any more solid area or objects worthy of your
attention.
FUTURE -- image example needed for testing
Spammed Images
A spammed image will generally show a predominate pure color spike in the
image's color histogram. A check on where that color appears in the image
will usually show it to be in one of the corners of the image.
EMail Spam Images
These are images designed to get past the various spam filters. Basically the
text of the ad is hidden in an image using various colors, with extra 'dirt'
and other noise added to make it harder to detect. And while these are
difficult to distinguish from, say, a logo in a company email header, they are
usually also much larger than the typical email logo.
One discovery technique is to use a median filter on the image. EMail spam
text will generally disappear, while a logo or image will still remain very
colorful.
Image Metrics, quickly finding images to compare
A metric represents a type of 'finger print' to represent an image, in a very
small amount of memory. Similar images should result in a similar metric.
Note however that a metric is not designed to actually find matching images,
but to discount images that are definitely not a match. That is, a
good metric will let you disregard most images from further comparisons, thus
reducing the amount of time needed to search all the images.
Average Color of an image
You can use -scale to get an average color of an image, however I also suggest
you remove the outside borders of the image to reduce the effect of
any 'fluff' that may have been added around the image.
convert image.png -gravity center -crop 70x70%+0+0 \
-scale 1x1\! -depth 8 txt:-
Alternatively, to get a 'weighted centroid' color based on color clustering,
rather than an average, you can use -colors
convert rose: -colors 1 -crop 1x1+0+0 -depth 8 -format '%[pixel:s]' info:-
rgb(146,89,80)
This will generally match images that have been resized, lightly cropped,
rotated, or translated. But it will also match a lot of images that are
not closely related.
The biggest problem is that this metric will generally disregard images that
have been brightened, dimmed, or had their overall hue changed.
Also while it is a great metric for color and real-world images, it is
completely useless for images which are greyscale. All such images generally
get lumped together without any further clustering within the type.
This in turn shows why some initial classification of image types can be vital
to good image sorting and matching.
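For illustration, the average-color metric above can be sketched in plain Python on a grid of RGB tuples, including the center crop to discount border 'fluff' (the function name is my own; the 70% crop default mirrors the command above, but the code is only an assumed sketch):

```python
def average_color(grid, crop=0.7):
    """Average the RGB values of the central `crop` fraction of the
    image, mimicking "-gravity center -crop 70x70% ... -scale 1x1!"."""
    h, w = len(grid), len(grid[0])
    y0 = int(h * (1 - crop) / 2)              # border thickness to skip
    x0 = int(w * (1 - crop) / 2)
    region = [px for row in grid[y0:h - y0] for px in row[x0:w - x0]]
    n = len(region)
    return tuple(sum(px[c] for px in region) // n for c in range(3))

# A 10x10 image: red subject on a white border
img = [[(255, 255, 255)] * 10 for _ in range(10)]
for y in range(2, 8):
    for x in range(2, 8):
        img[y][x] = (255, 0, 0)
```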
Predominate Color of an image
The predominate color of an image is a little different, instead of the
average which merges the background colors with the foreground, you want to
find the most common foreground color, and perhaps a percentage of how much of
the image consists of that predominate color.
As such you can not just take a histogram of an image, as the image may use a
lot of individual shades of color rather than a specific color.
This can be done using the low level quantization function -segment, then
taking a histogram. This has an advantage over direct use of -colors as it
does not attempt to merge distant (color-wise) clusters of colors, though the
results may be harder to determine.
FUTURE example
After which, a histogram will give you the amount of each of the predominate
colors.
However usually the predominate color of a cartoon or line drawing is the
background color of the image. So it is only really useful for real-life
images.
On the other hand, you may be able to use this to discover if an image has a
true background, by comparing it to the image's average border color.
Please note that a picture's predominate color is more likely to be
strongly influenced by the background color of the image, rather than by the
object of interest, which is usually in or near the center of the image.
Border Colors
By repeatedly cropping off each of the four edges (2 to 3 pixels at most) of
an image, and calculating each border's average color, you can determine if an
image is framed, and how deep the frame is; whether there is a definite
background to the image; or if there is some type of sky/land or
close-up/distant color separation to the overall image.
By comparing the averaged side colors to the average central color of the
image you can discover if the image is uniform without a central theme or
subject, such as a photo of an empty landscape.
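A minimal sketch of this border-versus-center test, in plain Python on a grid of RGB tuples (the function name and the 2-pixel edge default are my own assumptions):

```python
def border_vs_center(grid, edge=2):
    """Compare the average color of the outer `edge` pixels with that
    of the central region.  A large difference suggests a frame or a
    definite subject; a small one, a uniform scene like an empty
    landscape.  Returns the largest per-channel difference."""
    h, w = len(grid), len(grid[0])
    border, center = [], []
    for y, row in enumerate(grid):
        for x, px in enumerate(row):
            outer = y < edge or y >= h - edge or x < edge or x >= w - edge
            (border if outer else center).append(px)
    avg = lambda ps: [sum(p[c] for p in ps) / len(ps) for c in range(3)]
    return max(abs(b - c) for b, c in zip(avg(border), avg(center)))

# A red subject inside a white border, versus a flat grey scene
framed = [[(255, 255, 255)] * 10 for _ in range(10)]
for y in range(2, 8):
    for x in range(2, 8):
        framed[y][x] = (255, 0, 0)
flat = [[(128, 128, 128)] * 10 for _ in range(10)]
```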
Foreground/background Color Separation
Using -colors you can attempt to separate the image into foreground and
background parts, by reducing the image to just two colors.
Using a -median filter first will remove the effect of minor details, and line
edges and noise that may be in the image. Of course that is not very good for
mostly white sketch-like images.
convert rose: -median 5 +dither -colors 2 \
-depth 8 -format %c histogram:info:-
This shows a red and a grey color as the predominate colors in the image.
A trim/crop into the center of the image should then determine what is
foreground and what is background.
convert rose: -median 5 +dither -colors 2 \
-trim +repage -gravity center -crop 50% \
-depth 8 -format %c histogram:info:-
Which shows the red 'rose' color is the predominate foreground color.
Note that a landscape image may separate differently in that you get a lower
ground color and an upper sky color. As such a rough look at how the colors
were separated could be very useful for image type determination.
Also a picture with some text 'spam' will often show a blob of color in one
corner that is far more prominent than the rest of the image. If found, redo
with 3 colors, then erase that area with the most common 'background' color
found, before doing your final test.
This technique is probably a good way of separating images into classes like
'skin tone' 'greenery' 'landscape' etc.
Average Color Matrix
A three by three matrix color scheme ("
-scale 3x3\!
") is a
reasonable color classification scheme. It will separate, and group similar
images together very well. For example sketches (all near white), gray-scale,
landscapes, seascapes, rooms, faces, etc. will all be separated into basic and
similar groups (in theory).
This is also a reasonable metric to use for indexing images for generating
Photo Mosaics.
The output of the NetPBM image format is particularly suited to generating
such a metric, as it can output just the pixel values as text numbers.
Remember this would produce a 27 dimensional result (3x3 colors of 3 values
each), so a multi-dimensional clustering algorithm may be needed.
Do you know of a good 3d clustering program/algorithm?
For example, here is the 3 x 3 RGB colors (at depth 8) for the IM logo.
convert logo: -scale 3x3\! -compress none -depth 8 ppm:- |\
sed '/^#/d' | tail -n +4
251 241 240 245 234 231 229 233 236 254 254 254
192 196 204 231 231 231 255 255 255 211 221 231
188 196 210
The above can be improved by using 16 bit values, and possibly cropping
10% of the borders to remove logo and framing junk that may have been added...
convert logo: -gravity center -crop 80% -scale 3x3\! \
-compress none -depth 16 ppm:- | sed '/^#/d' | tail -n +4
63999 59442 58776 62326 58785 58178 51740 54203 54965 65277 65262 65166
45674 47023 49782 56375 55648 55601 65535 65535 65535 52406 55842 58941
44635 48423 52881
Of course like the previous average color metric, this will also have problems
matching up images that have been color modified, such as hue, or brightness
changes. (See next section)
Also this metric can separate line drawings within its grouping, though only
in a very general way. Such drawings will still be grouped more by the color of
the background 'paper' than by content, and generally need a smaller
'threshold' of similarity than color images.
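The matrix metric and its comparison can be sketched in plain Python: reduce the image to a 27-value vector, then use the Euclidean distance between vectors as the similarity measure for clustering (the function names and the choice of distance are illustrative assumptions):

```python
import math

def color_matrix_metric(grid):
    """Reduce the image to a 3x3 grid of average colors and flatten it
    to a 27-value vector -- the "-scale 3x3!" metric in list form."""
    h, w = len(grid), len(grid[0])
    vec = []
    for gy in range(3):
        for gx in range(3):
            cell = [grid[y][x]
                    for y in range(gy * h // 3, (gy + 1) * h // 3)
                    for x in range(gx * w // 3, (gx + 1) * w // 3)]
            for c in range(3):                # r, g, b averages per cell
                vec.append(sum(px[c] for px in cell) / len(cell))
    return vec

def metric_distance(a, b):
    """Euclidean distance in the 27-dimensional metric space; only
    images below some threshold need a closer comparison."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

red = [[(255, 0, 0)] * 6 for _ in range(6)]
blue = [[(0, 0, 255)] * 6 for _ in range(6)]
```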
Color Difference Matrix
The biggest problem with using the colors directly as a metric, is that you
tie the image to a particular general color. This means any image that has
been brightened or darkened, or its hue was changed, will not be grouped
together.
One solution to this is to somehow subtract the predominate or average color
of the image from the metric, and using a matrix of colors makes this
possible.
Here for example I subtract the middle or center average color from all the
surrounding colors in the matrix.
convert logo: -gravity center -crop 80% -scale 3x3\! -fx '.5+u-p{1,1}' \
-compress none -depth 16 ppm:- | sed '/^#/d' | tail -n +4
51093 45187 41761 49419 44529 41163 38834 39947 37950 52371 51007 48152
32767 32767 32767 43469 41393 38587 52629 51279 48521 39500 41587 41926
31729 34168 35867
Note that I add .5 to the difference as you can not save a negative color value
in an image. Also the use of the slow "
-fx
" operator is acceptable as it only 9 pixels are processed.
Note that the center pixel ("32767 32767 32767" at the start of the second
line in the above) will not change much (any change is only due to slight
rounding errors), and could be removed from the result, reducing the metric
to 24 dimensions (values).
Alternatively, you can subtract the average color of the image from all 9
color values.
convert logo: -scale 3x3\! \( +clone -scale 1x1 \) -fx '.5+u-v.p{0,0}' \
-compress none ppm:- | sed '/^#/d' | tail -n +4
38604 35917 34642 37011 33949 32441 32839 33841 33649 39447 39259 38369
23358 24377 25436 33538 33174 32426 39612 39434 38605 28225 30576 32319
22271 24381 27021
This also could be done by the metric comparator, rather than the metric
generator.
The metric still separates and clusters color images very well, placing
similar images very closely together, regardless of any general color or
brightness changes. It is still sensitive to contrast changes though.
This metric modification could in fact be done during the comparison process
so a raw
Color Matrix Metric can still be
used as a standard image metric to be collected, cached and compared. This is
what I myself am now doing for large scale image comparisons.
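That compare-time normalization can be sketched in plain Python on a raw 27-value metric, assuming the interleaved r,g,b layout of the previous sketch: subtract each channel's own mean, so a uniform brightness or hue shift cancels out of the comparison (the function name is an illustrative assumption):

```python
def normalize_metric(vec):
    """Subtract each channel's mean from its nine cell values, so a
    uniform brightening or hue shift of the whole image cancels out
    before any distance comparison is made."""
    out = list(vec)
    for c in range(3):                        # r, g, b channels
        channel = vec[c::3]                   # nine values per channel
        mean = sum(channel) / len(channel)
        for i in range(c, len(vec), 3):
            out[i] = vec[i] - mean
    return out

# A uniformly dimmed copy (every value reduced by 40) normalizes
# to the same vector as the original.
original = [200, 100, 50, 180, 90, 40, 220, 110, 60] * 3
dimmed = [v - 40 for v in original]
```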
Unlike a straight color average, you can use this metric to differentiate
between different line drawing images. However as line drawings use a linear
color scale (all the colors fall in a line in the metric space), the
differences between such images are roughly 1/3 that of color images. As such
a very different threshold is needed when comparing line drawings. It is thus
still better to separate line drawings and grayscale images from color images.
In other words this is one of the best metrics I have yet found for color
images. Just be sure to determine which images are line drawings first, and
compare them separately using a much lower threshold. Lucky for us the
metric itself can be used to do the separation of images into greyscale, or
linear color, images.
Suggestions welcome.
Matching images better
Miscellaneous notes and techniques, that I have either not tried or that did
not work very well, for comparing larger images for more exact image matching.
Segmentation Color
As you can see, many of the above metrics use a blur/median filter followed
by a color reduction technique, as a basic attempt to simplify images to
better allow them to be classified. However the Color Quantization Operator
is not really designed for this purpose. Its job is to reduce colors so as
to highlight the important details of the image.
For image comparison however we don't really want to highlight these features,
but highlight areas of comparative interest. This is the job of a related
color technique known as segmentation...
ASIDE: from Leptonica: Image segmentation is the division of the image into
regions that have different properties.
This operator blocks out areas of similar colors, removing the detail from
those areas. That way, when you compare the two images, you are comparing
areas, rather than low level details of the image.
IM implements a segmentation algorithm, "
-segment
", for its implementation
details see
SegmentImage().
Example:
convert logo: -median 10 -segment 1x1 \
+dither -scale 100x100\! segment_image.gif
One problem is that -segment is VERY slow, and it only seems to work for
larger images. Small images (like a rose: or a 100x100 scaled logo:) seem to
result in just a single color being produced. This may be a bug.
Of course you can still scale the image after segmenting it, as we did above.
That way you can store a larger number of images in memory to compare with
each other.
Also the resulting segmentation does not seem to work very well, when compared
to the image segmentation algorithm that Leptonica provides. See
Leptonica: Color
Segmentation.
However an alternative to the IM segmentation is to mis-use the color
quantization function to find areas of similar color. Example:
convert logo: -scale 100x100\! -median 3 \
-quantize YIQ +dither -colors 3 segment_image.gif
The disadvantage is that -colors limits the number of color areas that may be
present in an image, whereas segment tries to preserve similar areas, regardless
of how many areas are really present in the image (or at least that is what it
should do).
Colorless Edge Comparison
Image color is notoriously unreliable, particularly for cartoon-like images.
Different users could quite easily recolor such images, add different colored
backgrounds, or even take a sketch and color it in.
One way to match up such images is to do some basic color reduction, as per
the method above, but then rather than comparing images based on the resulting
colors, you perform an edge detection, and further processing, so that only the
outlines of the most important color changes are used for the metrics and
comparison of the images.
For example...
convert logo: -scale 100x100\! -median 3 \
-quantize YIQ +dither -colors 3 -edge 1 \
-colorspace gray -blur 0x1 outline_image.gif
An alternative may be to use the -lat (Local Area threshold) for edge
detection, which may give you some better control...
convert logo: -scale 100x100\! -median 3 \
-quantize YIQ +dither -colors 3 \
-lat 3x3-5% -negate \
-colorspace gray -blur 0x1 outline_image.gif
Of course for comparing you would use a line drawing comparison method.
??? how would you compare line drawings in a workable way ???
Multiply the images together and see if the resulting image added or reduced
the intensity of the lines. Mis-matching lines will become black.
Web Cameras
What has changed in fixed cameras
Under Construction
Walter Perry <gatorus13_AT_earthlink.net> reports...
The project I am working on involves processing groups of 20 images sent from a
surveillance camera in response to the camera sensing motion. These cameras
are at remote locations and once they detect motion the images are sent to a
local server. Once at the local server, I want to be able to "filter" out
those images that do not contain what caused the event.
I use PerlMagick to compare the first image in the series (which will never
contain anything other than the normal background) with the rest of the
images. I am getting an "average" difference for all the images and then if
the individual difference is greater than the average difference I keep the
image as it has something in it.
This approach works great no matter day or night or what the lighting
conditions. I originally was trying to use just a percentage difference above
the first image, but that was not too reliable and really depended on the
lighting conditions. Based on this comparison, I will then determine which
images have "content" and which are empty of any motion, so that I obtain
only those images that contain "motion".
A script was also provided.
Sub-image Locating
Locating a known sub-image in a larger image
Under Construction
Peter Valdemar Mørch asked on 5 July 2006...
If I have a tiny crop of a part of an image, how would I find out if
that crop is present in the larger image, and if so, the (x,y)
coordinates in the larger image, where the crop is to be found?
Cristy wrote...
We've already sketched out a method to locate an image within another image.
We're trying to decide whether to use fuzzy color matching or a DB metric
(as used in the CompareImages() method).
Gabe Schaffer replied...
To do this simply, just write a 2-dimensional string search. Search
for the first pixel that has the same color as the first pixel in the
image you're looking for.
Then compare pixels until you either find a
match or a pixel doesn't match. If the images don't match, start
searching again at the pixel after the first one you found. This may
be slow in Perl, unless you dump the image into a Perl array and
search it that way. Even then it won't be very fast.
To do this quickly in a general purpose way may be a bit of work, but
for images with few colors, a 2D variant of Boyer-Moore string
searching would be fast and efficient.
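Gabe Schaffer's naive 2-D search can be sketched in plain Python over grids of pixel values (names are illustrative; a real implementation would add the Boyer-Moore style skipping he mentions):

```python
def locate_subimage(big, small):
    """Naive 2-D search as described: find a pixel matching the first
    pixel of the sub-image, verify the full sub-image there, and on a
    mismatch resume scanning at the next pixel."""
    H, W = len(big), len(big[0])
    h, w = len(small), len(small[0])
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            if big[y][x] != small[0][0]:
                continue                      # cheap first-pixel test
            if all(big[y + dy][x + dx] == small[dy][dx]
                   for dy in range(h) for dx in range(w)):
                return (x, y)                 # top-left corner of match
    return None

# A 6x6 image with a 2x2 patch hidden at (x, y) = (3, 2)
big = [[0] * 6 for _ in range(6)]
small = [[1, 2], [3, 4]]
big[2][3], big[2][4], big[3][3], big[3][4] = 1, 2, 3, 4
```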
Paraphrasing Peter Valdemar Mørch's response...
So I could start by finding the pixels in the sub-image that are least likely
to occur in the main image, then try to match another unlikely pixel at some
offset that should also match. When these two match, I could then try to fit
the whole sub-image at that location.
Matching color transitions are also another thing that could be looked for.
That is, rule out false matches quickly.
Anthony Thyssen added...
A Boyer-Moore search would certainly speed things up, but it starts to
fall down when you are 'fuzzy' matching images, say to handle JPEG image
distortions. Because of the fuzzy matching you can't skip pixels so easily.
It would also remove the possibility of finding the 'fuzzy matching for the
maximum number of pixels'. That is the sub-image may not match exactly
on the larger image, but be slightly obscured, so you want the position
with the maximum number of matching pixels.
For example:
You have a main image with a background tile, and you have another image which
you KNOW is the background tile used on the main image, but you want to find
the offset of that tile on the main image.
Because the foreground image could obscure parts of the background, you may
not find a perfect, unobscured background tile to match against, so the best
match, with an extra report on the percentage of pixels that do fuzzy match,
may be the better algorithm.
I also don't want to specialize the function to the point where
it is only really useful to a single basic application.
This is why a more general sub-image locating function also needs to be
able to handle
* Fuzzy matching (colors nearby but not exactly the same, caused
by JPEG distortions, shading, or color changes)
* Perhaps only matching a high percentage of pixels
EG: a small part of the sub-image on the main image was obscured
say by other image overlays, copyright notices, etc.
* sub-image has transparent areas that are not to be matched against
the main image (EG: a shaped image)
* The possibility to find multiple matching locations
All of these are of course optional, but the initial algorithm should at
least be able to be expanded to handle these things.
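A minimal sketch, in plain Python on grayscale grids, of a locator handling the options above: fuzzy color matching, a best-match percentage for partly obscured sub-images, and transparent (never compared) sub-image pixels (all names and defaults are my own assumptions, not an IM function):

```python
def fuzzy_locate(big, small, fuzz=10, min_match=0.8):
    """Return (x, y, fraction) for the offset where the largest
    fraction of sub-image pixels fuzzy-match the main image, or None
    if even the best offset matches less than `min_match` of them.
    A None in `small` marks a transparent pixel, never compared."""
    H, W, h, w = len(big), len(big[0]), len(small), len(small[0])
    best = None
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            hits = total = 0
            for dy in range(h):
                for dx in range(w):
                    want = small[dy][dx]
                    if want is None:          # transparent: skip
                        continue
                    total += 1
                    if abs(big[y + dy][x + dx] - want) <= fuzz:
                        hits += 1
            frac = hits / total if total else 0.0
            if best is None or frac > best[2]:
                best = (x, y, frac)
    return best if best and best[2] >= min_match else None

# Grayscale 5x5 scene; a slightly "JPEG-shifted" tile sits at (1, 1),
# with one of its pixels masked out as transparent
big = [[0] * 5 for _ in range(5)]
big[1][1], big[1][2], big[2][1], big[2][2] = 105, 95, 100, 42
small = [[100, 100], [100, None]]
```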