Psych221/EE362 Project Report
Mar.15, 2001
Sangoh Jeong <sojeong@stanford.edu>
Abstract
In this project, six histogram-based search methods are investigated in two different color spaces, RGB and HSV. Histogram search characterizes an image by its color distribution, or histogram but the drawback of a global histogram representation is that information about object location, shape, and texture is discarded. Thus this project showed that images retrieved by using the global color histogram may not be semantically related even though they share similar color distribution in some results. An image retrieval demo system was built to make it easy to test the retrieval performance and to expedite further algorithm investigation. And six histogram-based image retrieval methods in two color spaces were exhaustively compared by providing precision vs. recall graphs for each image class and for all test images. In general, histogram-based retrievals in HSV color space showed better performance than in RGB color space. The hitogram Intersection-based image retrieval in HSV color space was found to be most desirable among six retrieval methods.
Table of Contents
1. Introduction
2. Color Space
3. Histogram-Based Image Search
4. Retrieval Effectiveness
5. Implemented Method
6. Results
7. Conclusion
8. References
9. Appendix
1. Introduction
In the past decade, many image
retrieval systems have been developed, such as the IBM QBIC system developed
at the IBM Almaden Research Center, the VIRAGE System developed by the
Virage Incorporation, the Photobook System developed by the MIT Media Lab,
the VisualSeek system developed at Columbia University, the WBIIS System
developed at Stanford University, and the Blobworld System developed at U.C.
Berkeley[1]. The common ground for them is to extract a signature for every
image based on its pixel values, and to define a rule for comparing images.
The signature can be shape, texture, color or any other information with
which two images could be compared. However, in this project, only the color
signature is used as a signature to retrieve images.
Existing color-based
general-purpose image retrieval systems roughly fall into three categories
depending on the signature extraction approach used: histogram, color
layout, and region-based search. And, in this project, histogram-based
search methods are investigated in two different color spaces. A color space
is defined as a model for representing color in terms of intensity values.
Typically, a color space defines a one- to four- dimensional space. A color
component, or a color channel, is one of the dimensions. Color spaces are
related to each other by mathematical formulas. In this project, only two
three-dimensional color spaces, RGB and HSV, are investigated.
Histogram search characterizes an
image by its color distribution, or histogram. Many histogram distances have
been used to define the similarity of two color histogram representations.
Euclidean distance and its variations are the most commonly used. The
drawback of a global histogram representation is that information about
object location, shape, and texture is discarded. The global color histogram
indexing method, which is used in this project, correlates to the image
semantics well. But, images retrieved by using the global color histogram
may not be semantically related even though they share similar color
distribution.
In the experiment, an image
retrieval demo system was built to make it easy to test the retrieval
performance and to expedite further algorithm investigation. And six
histogram-based image retrieval methods in two color spaces are exhaustively
compared by providing precision vs. recall graphs for each image class and
for all test images.
2. Color Space
A color space is defined as a model for representing color in terms of intensity values[1]. Typically, a color space defines a one- to four-dimensional space. A color component, or a color channel, is one of the dimensions. A color dimensional space (i.e. one dimension per pixel) represents the gray-scale space. The following two models are commonly used in color image retrieval system.
2.1 RGB color model
The RGB color model is composed of the primary colors Red, Green, and Blue. This system defines the color model that is used in most color CRT monitors and color raster graphics. They are considered the "additive primaries" since the colors are added together to produce the desired color. The RGB model uses the cartesian coordinate system as shown in Figure 1. (a). Notice the diagonal from (0,0,0) black to (1,1,1) white which represents the grey-scale. Figure 1. (b) is a view of the RGB color model looking down from "White" to origin.
Figure 1. (a) RGB coordinates system (b) RGB color model
2.2 HSV Color Model
The
HSV stands for the Hue, Saturation, and Value based on the artists (Tint,
Shade, and Tone). The coordinate system
in a hexacone in Figure 2. (a). And Figure 2.(b) a view of the HSV color
model. The Value represents intensity of a color, which is decoupled
from the color information in the represented image. The hue and
saturation components are intimately related to the way human eye perceives
color resulting in image processing algorithms with physiological basis.
As hue varies from 0 to 1.0, the corresponding colors vary from
red, through yellow, green, cyan, blue, and magenta, back to red, so that
there are actually red values both at 0 and 1.0. As saturation varies from 0
to 1.0, the corresponding colors (hues) vary from unsaturated (shades of
gray) to fully saturated (no white component). As value, or brightness,
varies from 0 to 1.0, the corresponding colors become increasingly brighter.
Figure 2. (a) HSV coordinates system (b) HSV color model
2.3 Color conversion
In order to use a good color space for a specific application, color conversion is needed between color spaces. The good color space for image retrieval system should preserve the perceived color differences. In other words, the numerical Euclidean difference should approximate the human perceived difference.
2.3.1 RGB to HSV conversion
In Figure 3., the obtainable HSV colors lie within a triangle whose vertices are defined by the three primary colors in RGB space:
.
Figure 3. Obtainable HSV color from RGB color space
The hue of the point P is the measured angle between the line connecting P to the triangle center and line connecting RED point to the triangle center. The saturation of the point P is the distance between P and triangle center. The value (intensity) of the point P is represented as height on a line perpendicular to the triangle and passing through its center. The grayscale points are situated onto the same line. And the conversion formula is as follows :
,
,
(1)
2.3.1 HSV to RGB conversion
Conversion from HSV space to RGB space is more complex. And, given to the nature of the hue information, we will have a different formula for each sector of the color triangle.
Red-Green Sector:
Green-Blue Sector:
Blue-Red Sector:
(2)
3. Histogram-Based Image Search
The color histogram for an
image is constructed by counting the number of pixels of each color.
Retrieval from image databases using color histograms has been investigated
in [tools, fully, automated]. In
these studies the developments of the extraction algorithms follow a similar
progression: (1) selection of a color space, (2) quantization of the color
space, (3) computation of histograms, (4) derivation of the histogram
distance function, (5) identification of indexing shortcuts. Each of these
steps may be crucial towards developing a successful algorithm.
There are several difficulties
with histogram based retrieval. The first of these is the high
dimensionality of the color histograms. Even with drastic quantization of
the color space, the image histogram feature spaces can occupy over 100
dimensions in real valued space. This high dimensionality ensures that
methods of feature reduction, pre-filtering and hierarchical indexing must
be implemented. The large dimensionality also increases the complexity and
computation of the distance function. It particularly complicates `cross'
distance functions that include the perceptual distance between histogram
bins [2].
3.1 Color histogram definition
An image histogram refers to the probability mass function of the image intensities. This is extended for color images to capture the joint probabilities of the intensities of the three color channels. More formally, the color histogram is defined by,
(3)
where A , B
and C represent the three color
channels (R,G,B or H,S,V) and N is the number of
pixels in the image. Computationally, the color histogram is formed by
discretizing the colors within an image and counting the number of pixels of
each color.
Since the typical computer represents color
images with up to 224 colors, this process generally requires substantial
quantization of the color space. The main issues regarding the use of color
histograms for indexing involve the choice of color space and quantization
of the color space. When a perceptually uniform color space is chosen
uniform quantization may be appropriate. If a non-uniform color space is
chosen, then non-uniform quantization may be needed. Often practical
considerations, such as to be compatible with the workstation display,
encourage the selections of uniform quantization and RGB color space. The
color histogram can be thought of as a set of vectors. For gray-scale images
these are two dimensional vectors. One dimension gives the value of the
gray-level and the other the count of pixels at the gray-level. For color
images the color histograms are composed of 4-D vectors. This makes color
histograms very difficult to visualize. There are several lossy approaches
for viewing color histograms, one of the easiest is to view separately the
histograms of the color channels. This type of visualization does illustrate
some of the salient features of the color histogram [2].
3.2 Color uniformity
The RGB color space is far from being perceptually uniform. To obtain a good color
representation of the image by uniformly sampling the RGB space it is necessary to select the quantization step sizes to be fine
enough such that distinct colors are not assigned to the same bin. The
drawback is that oversampling at the same time produces a larger set of
colors than may be needed. The increase in the number of bins in the
histogram impacts performance of database retrieval. Large sized histograms
become computationally unwieldy, especially when distance functions are
computed for many items in the database. Furthermore, as we shall see in the
next section, to have finer but not perceptually uniform sampling of colors
negatively impacts retrieval effectiveness.
However, the HSV color space mentioned earlier offers
improved perceptual uniformity. It represents with equal emphasis the three
color variants that characterize color: Hue, Saturation and Value
(Intensity). This separation is attractive because color image processing
performed independently on the color channels does not introduce false
colors. Furthermore, it is easier to compensate for many artifacts and color
distortions. For example, lighting and shading artifacts are typically be
isolated to the lightness channel. But this color space is often
inconvenient due to the non-linearity in forward and reverse transformation
with RGB space [2].
3.3 Color Histogram Discrimination
There are several distance formulas for measuring the similarity of color histograms. In general, the techniques for comparing probability distributions, such as the kolmogoroff-smirnov test are not appropriate for color histograms. This is because visual perception determines similarity rather than closeness of the probability distributions. Essentially, the color distance formulas arrive at a measure of similarity between images based on the perception of color content. Three distance formulas that have been used for image retrieval including histogram euclidean distance, histogram intersection and histogram quadratic (cross) distance [2, 3].
3.3.1 Histogram euclidean distance
Let h and g represent two color histograms. The euclidean distance between the color histograms h and g can be computed as:
(4)
In this distance formula, there is only comparison between the identical bins in the respective histograms. Two different bins may represent perceptually similar colors but are not compared cross-wise. All bins contribute equally to the distance.
3.3.2 Histogram intersection distance
The color histogram intersection was proposed for color image retrieval in [4]. The intersection of histograms h and g is given by:
(5)
where |h| and |g| gives the magnitude of each histogram, which is equal to the number of samples. Colors not present in the user's query image do not contribute to the intersection distance. This reduces the contribution of background colors. The sum is normalized by the histogram with fewest samples.
3.3.3 Histogram quadratic (cross) distance
The color histogram quadratic distance was used by the QBIC system introduced in [1, 6]. The cross distance formula is given by:
(6)
The cross distance formula considers the cross-correlation between histogram bins based on the perceptual similarity of the colors represented by the bins. And the set of all cross-correlation values are represented by a matrix A, which is called a similarity matrix. And a (i,j)th element in the similarity matrix A is given by : for RGB space,
(7)
where
is the
distance between the color i and j in the RGB space. In the case that quantization of the color space is not
perceptually uniform the cross term contributes to the perceptual distance
between color bins.
For HSV space it is given in [5] by:
(8)
which corresponds to the proximity in the HSV color space[fully].
กก
4. Retrieval Effectiveness
Two metrics for retrieval effectiveness are recall and precision. Recall signifies the relevant images in the database that are retrieved in response to a query. Precision is the proportion of the retrieved images that are relevant to the query. More precisely, let A be the set of relevant items, let B the set of retrieved items and a,b,c and d are given in Figure 4.

Figure 4. Sets for explaining retrieval effectiveness
In the picture, a stands for 'retrieved relevant' images, b for 'retrieved irrelevant' images, c for 'unretrieved relevant' images and d for 'unretrieved irrelevant' images. Then recall and precision are defined as the following conditional probabilities [3].

(9)
With these conditions, image retrieval is said to be more effective if precision values are higher at the same recall values.
5. Implemented Method
5.1 Overall scheme
The flow of the experiment in Figure 5. follows the procedure of the general image retrieval system .
Figure 5. Flow of implemented image retrieval system
In the figure above, once a query image and a retrieval method are chosen by users, the rest of whose process is done automatically. However, the histogram data for all images in database are computed and saved in DB (Database) in advance so that only the image indexes and histogram data can be used to compare the query image with images in DB. The six retrieval methods used in this experiment are all based on histogram distance measures introduced in equations (4)~(8). The three histogram distance measures are used in RGB color space and in HSV color space separately, which makes up the six retrieval methods. However, each color space requires different processing to the query image and images in DB. All processes were realized using MATLAB. The following sections explain the experiment in detail.
5.2 Generation of image DB
The image data used in this experiment were downloaded from a web site (http://wang1.ist.psu.edu/). Those were the same images used in SIMPLIcity[Wang]. However, in order to reduce the computation time of the whole process, only the first half out of 1000 images were used and their sizes were reduced to 64x64 pixels. Those 500 images have numbers as their names from 0 to 499 with JPEG-compressed for convenience and for saving storage. The original 1000 images has 10 categories (Africans & villages, Beach, Buildings, Buses, Dinosaurs, Elephants, Flowers, Horses, Mountains & glaciers, Food). And this experiment uses only first 5 categories which have 100 images each. The category information is used in calculating precision and recall for a retrieval method.
5.3 Quantization
In order to reduce computation time, a 256x256x256 = 16777216 color image is quantized into a 8 x 8 x 8 = 512 color image in RGB color space. Otherwise, it will take really long time to compute and compare 16777216 bins of a histogram with others. This is a trade-off between performance and time. Since R,G and B have same distance in its color space, they are quantized into same levels. However, in order for the HSV color space to be used, a RGB color space image should first be transformed to a HSV color space image. In HSV color space, quantization of hue requires the most attention. The hue circle consists of the primaries red, green and blue separated by 120 degrees. A circular quantization at 20 degree steps sufficiently separates the hues such that the three primaries and yellow, magenta and cyan are represented each with three sub-divisions. Saturation and value are each quantized to three levels yielding greater perceptual tolerance along these dimensions. Thus, H is quantized to 18 levels and S & V is quantized to 3 levels. The quantized HSV space has 18x3x3 = 162 histogram bins.
5.4 Computation and comparison of histogram
Computation of histogram is simple. All we have to do is just to count number of pixels that correspond to a specific color in quantized color space whether it is the RGB color space or the HSV space. In order to compare histograms of two images, we first need to generate specific codes for all histogram bins. In this experiment, (r : 0~7, g: 0~7, b: 0~7 ) codes were generated for 512 RGB histogram bins and (h: 0~17, s: 0~3, v: 0~3) codes were generated for 162 HSV histogram bins. Then, histogram distances introduced in (4)~(8) were computed in each quantized color space. Since the number of bins of qunatized HSV color space to be compared is less than a third, it was expected that the comparison in HSV color space would be done faster than in RGB color space.
6.1 Image Retrieval Demo System
In the experiment, a image retrieval demo system in Figure 6. was built to test image retrieval efficiently. If the user selects an image in the listbox on the upper left side and chooses a retrieval method on the lower right listbox, by pressing 'Apply' button, the retrieval system finds the best two candidate images excluding the same image as the query image. Internally, it has information about all sorted indexes of images in DB in the order of increasing histogram distance. But, it shows only two best candidates. This system has also a function that show a precision vs. recall graph for an image regarding 6 histogram-based retrieval methods (3 distance measures in RGB and in HSV color spaces).

Figure 6. Image retrieval demo system
6.2 Retrieval results for several images
Figure 7. shows retrieval results for several images. Each query image was chosen from each category stated in 5.2 and the following abbreviations represent distance measures: HE-hsv distance -> Histogram Euclidean distance in HSV color space, HI-hsv distance -> Histogram Intersection distance in HSV color space, HQ-hsv distance -> Histogram Quadratic distance in HSV color space. For RGB color space, HE, HI and HQ have the same meaning as in HSV space with rgb instead of hsv. For a query image, all six results were shown in the figure.
[ Query image 1] [ Retrieved images - Distance measures : HE-hsv distance, HI-hsv distance, HQ-hsv distance (from left) ]

[ Query image 1] [ Retrieved images - Distance measures : HE-rgb distance, HI-rgb distance, HQ-rgb distance (from left) ]

[ Query image 2] [ Retrieved images - Distance measures : HE-hsv distance, HI-hsv distance, HQ-hsv distance (from left) ]

[ Query image 2] [ Retrieved images - Distance measures : HE-rgb distance, HI-rgb distance, HQ-rgb distance (from left) ]

[ Query image 3] [ Retrieved images - Distance measures : HE-hsv distance, HI-hsv distance, HQ-hsv distance (from left) ]

[ Query image 3] [ Retrieved images - Distance measures : HE-rgb distance, HI-rgb distance, HQ-rgb distance (from left) ]

[ Query image 4] [ Retrieved images - Distance measures : HE-hsv distance, HI-hsv distance, HQ-hsv distance (from left) ]

[ Query image 4] [ Retrieved images - Distance measures : HE-rgb distance, HI-rgb distance, HQ-rgb distance (from left) ]

[ Query image 5] [ Retrieved images - Distance measures : HE-hsv distance, HI-hsv distance, HQ-hsv distance (from left) ]

[ Query image 5] [ Retrieved images - Distance measures : HE-rgb distance, HI-rgb distance, HQ-rgb distance (from left) ]

Figure 7. Retrieval results for several images
For query images 2,3 and 5, it's hard to judge in which color space the histogram-base retrieval showed better performance. For query image 1, it seems to be sure that histogram-based image retrieval in RGB color space showed better performance than in HSV color space. However, for query image 4, histogram-based image retrieval in HSV color space showed better performance than in RGB color space. Since these images are only one case in 100 images in each category and each image in the same category has quite different characteristics, it's hard to find a general trend by the subjective test. It depends on the user's inclination for the most part. Therefore, it is required to test the results on the objective basis. The following section shows those objective results.
6.3 Retrieval effectiveness for histogram-based image retrieval
In order to measure retrieval effectiveness for a image retrieval system, precision vs. recall graphs are used. Although this effectiveness measure is useful to test a retrieval system objectively, it cannot exclude subjectivity inherent in image retrieval system. It's because the category itself was classified by humans. However, it is still most popular to test an information retrieval system. In the experiment, since the precision vs. recall graph for a query image regarding six histogram-based retrieval methods cannot show the generality of a specific retrieval methods, averages were taken for each retrieval method. The results are shown in Figure 8.~ Figure 13.


Figure 8. Histogram-based Retrieval Effectiveness (Africans) Figure 9. Histogram-based Retrieval Effectiveness (Beach)


Figure 10. Histogram-based Retrieval Effectiveness (Buildings) Figure 11. Histogram-based Retrieval Effectiveness (Buses)


Figure 12. Histogram-based Retrieval Effectiveness (Dinosaurs) Figure 13. Histogram-based Retrieval Effectiveness (500 images)
As we can see from the result from Figure 8. ~ Figure 11., except for 'Africans' category images, histogram-based retrievals in HSV color space showed better performance than in RGB color space since it showed higher precision values for the same recall values. In Figure 12., all three histogram-based image retrieval methods in RGB color space showed better performances than those in HSV color space. It seems to come from the characteristics of 'Dinosaurs' images, since they are not real photographs but composite graphic images. They are created to the characteristics of the RGB color space faithfully from the start. Thus, although the histogram intersection-based retrieval in RGB space showed good performance in average in Figure 13, it's because 'Africans' images and 'Dinosaurs' images affected dominantly. And we can find that the histogram euclidean distance and histogram intersection distance in HSV color space are most useful among six histogram distance measures in the average sense. In a viewpoint of computation time, using HSV color space is faster than using RGB color space and using Intersection method is faster than using Euclidean or Quadratic method. Thus, we can conclude that the Histogram Intersection-based image retrieval in HSV color space is most desirable among six retrieval methods mentioned in considering both computation time and retrieval effectiveness. From the graphs, we can also find that the quadratic distance measure is not efficient though it is computationally burdensome.
7. Conclusion
In this project, six histogram-based search methods were investigated in two different color spaces, RGB and HSV. Histogram search characterizes an image by its color distribution, or histogram but the drawback of a global histogram representation is that information about object location, shape, and texture is discarded. Thus this project showed that images retrieved by using the global color histogram may not be semantically related even though they share similar color distribution in some results. An image retrieval demo system was built to make it easy to test the retrieval performance and to expedite further algorithm investigation. And six histogram-based image retrieval methods in two color spaces were exhaustively compared by providing precision vs. recall graphs for each image class and for all test images.
In general, histogram-based retrievals in HSV color space showed better performance than in RGB color space since it showed higher precision values for the same recall values. For graphic images, 3., all three histogram-based image retrieval methods in RGB color space showed better performance than those in HSV color space. And it is found that the histogram euclidean distance and histogram intersection distance in HSV color space are most useful among six histogram distance measures in the average sense. In a viewpoint of computation time, using HSV color space is faster than using RGB color space and using Intersection method is faster than using Euclidean or Quadratic method. Thus, in conclusion, the Histogram Intersection-based image retrieval in HSV color space is most desirable among six retrieval methods mentioned in considering both computation time and retrieval effectiveness. From the results, the quadratic distance measure is not efficient though it is computationally burdensome.
For future work, it is possible to use the histogram intersection distance
measure with other features such as shape and texture to improve the overall
image retrieval performance. Finding good parameters for quadratic distance
measure could be another subject to improve histogram-based search
algorithm. However, even in that case, it would be difficult to reduce the
computation time for quadratic measure significantly.
8. References
[1] James Z. Wang, "Integrated Region-Based Image Retrieval",
Boston, Kluwer Academic Publishers, 2001
[2] J. R. Smith and S.-F. Chang. " Automated image retrieval using
color and texture", Technical Report CU/CTR 408-95-14, Columbia
University, July 1995.
[3] J. R. Smith and S.-F. Chang. " Tools and techniques for color
image retrieval", In Symposium on Electronic Imaging: Science and
Technology - Storage & Retrieval for Image and Video Databases IV,
volume 2670, San Jose, CA, February 1996. IS&T/SPIE.
[4] M. J. Swain and D. H. Ballard. "Color indexing",
International Journal of Computer Vision, 7:1 1991.
[5] J. R. Smith and S.-F. Chang. "VisualSEEk: a fully automated
content-based image query system.", ACM Multimedia '96,
November, 1996.
[6] James Hafner, Harpreet S.Sawhney, Will Equits, Myron Flickner and Wayne
Niblack, "Efficient Color Histogram Indexing for Quadratic Form
Distance Functions", IEEE Trans. on Pattern Analysis and Machine
Intelligence, Vol. 17, No. 7, July 1995
9. Appendixกก