Perceptual Color Image Segmentation
Project Proposal - CS223B & Psych221
Angela Chau + Jeff Walters
The problem of segmenting images into coherent regions has been a major subject of research in the field of computer vision. Most of the literature written on the topic has concentrated on using texture information to perform the segmentation. Some researchers, however, have started looking at how color and texture information can be used in combination during the segmentation process, thus introducing the more specific problem of color image segmentation.
Current methods for segmenting "color textures" (the term often used for the input images of color segmentation algorithms) include clustering in the individual color bands, clustering in RGB space and then merging clusters in Luv space, combining Gabor filters with low-pass color filters, and using Markov Random Field models.
We would like to investigate the specific problem of color image segmentation. Moreover, we would like to incorporate into our solution findings from studies of the human visual system, specifically its responses to color and to the spatial frequencies of color, to see how they can aid or improve the segmentation process.
The ultimate judge of computer segmentation results is human perception, and a successful segmentation algorithm will therefore have a strong perceptual modeling component. We propose to implement the color texture segmentation algorithm outlined by Mirmehdi and Petrou. The algorithm incorporates two important perceptual concepts: color-dependent perceptual smoothing and a multi-scale framework.
The first step is to convert the image to an opponent color space, as described by Wandell and Zhang. The color space is defined by three color planes: O1 (luminance), O2 (red-green), and O3 (blue-yellow). Once the image is projected onto these planes, the variation of the visual system's spatial sensitivity with color can be modeled by applying a separate convolution kernel to each color plane. The sharpest kernel is applied to the luminance channel, where spatial acuity is highest.
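As a rough sketch of this first step (assuming linear sRGB input; the opponent-transform coefficients below are the ones commonly quoted for the S-CIELAB reference implementation and should be verified against the Wandell-Zhang paper):

import numpy as np

# Linear sRGB -> CIE XYZ (D65); standard sRGB matrix.
RGB_TO_XYZ = np.array([[0.4124, 0.3576, 0.1805],
                       [0.2126, 0.7152, 0.0722],
                       [0.0193, 0.1192, 0.9505]])

# XYZ -> opponent planes O1 (luminance), O2 (red-green), O3 (blue-yellow).
# Coefficients as quoted for the S-CIELAB reference code; check them
# against Wandell and Zhang's paper before relying on them.
XYZ_TO_OPP = np.array([[ 0.279,  0.720, -0.107],
                       [-0.449,  0.290, -0.077],
                       [ 0.086, -0.590,  0.501]])

def rgb_to_opponent(rgb):
    """Convert an (H, W, 3) linear-RGB image to the opponent color space."""
    return rgb @ (XYZ_TO_OPP @ RGB_TO_XYZ).T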
Convolution kernels are defined in cycles per degree, which implies a viewing distance from the image. Mirmehdi and Petrou view segmentation as a process in which decisions at fine scales are made using prior information from decisions made at coarser scales. The move to finer scales can be made by foveating to the area of interest or by moving physically closer to the image (the two are equivalent). This process is modeled in the algorithm by creating a causal, multi-scale tower of images. The image set is described as a tower rather than a pyramid because no sub-sampling takes place between levels: the images at every scale have the same number of pixels.
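A minimal sketch of the tower construction, using per-channel Gaussian widths as a stand-in for the true perceptual kernels (whose widths are specified in cycles per degree and therefore depend on an assumed viewing distance); n_levels, base_sigmas, and step are illustrative parameters, not values from the paper:

import numpy as np
from scipy.ndimage import gaussian_filter

def build_tower(opp, n_levels=4, base_sigmas=(1.0, 2.0, 2.0), step=2.0):
    """Multi-scale tower: every level keeps the full pixel grid; only the
    amount of smoothing changes. base_sigmas gives per-channel widths at
    the finest level, with the sharpest kernel on the luminance plane O1."""
    tower = []
    for level in range(n_levels):
        scale = step ** level
        smoothed = np.stack(
            [gaussian_filter(opp[..., ch], sigma=s * scale)
             for ch, s in enumerate(base_sigmas)], axis=-1)
        tower.append(smoothed)
    return tower  # tower[0] is the finest level, tower[-1] the coarsest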
Segmentation is initiated at the coarsest level using K-means clustering. At this early stage, a large number of clusters is used, and adjacent clusters are then merged based on Euclidean distances across segment boundaries in Luv space. Segmentation then descends the image tower, using the segmentation result at the previous level as prior information. This method, called perceptual probabilistic relaxation, first incorporates information at long distances from the pixel of interest and then relies on information from the local region around the pixel as the algorithm progresses. Classical probabilistic relaxation, on the other hand, relies on local information first and then incorporates information further from the pixel of interest.
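The coarse-level initialization might look like the following sketch. Note two simplifications: clusters are merged by the Luv distance between their centers, rather than across segment boundaries as in the paper, and opponent_to_luv is an assumed helper for the color conversion:

import numpy as np
from sklearn.cluster import KMeans

def coarse_segmentation(coarsest, n_clusters=20, merge_threshold=10.0):
    """Initial over-segmentation at the coarsest tower level, followed by
    greedy merging of clusters whose centers are close in Luv space."""
    h, w, c = coarsest.shape
    features = coarsest.reshape(-1, c)
    km = KMeans(n_clusters=n_clusters, n_init=4).fit(features)
    labels, centers = km.labels_, km.cluster_centers_

    # opponent_to_luv is an assumed helper converting opponent-space colors
    # to Luv; merging by center distance simplifies the paper's
    # boundary-based merging criterion.
    centers_luv = opponent_to_luv(centers)

    # Greedily merge any pair of clusters closer than the threshold.
    mapping = np.arange(n_clusters)
    for i in range(n_clusters):
        for j in range(i + 1, n_clusters):
            if np.linalg.norm(centers_luv[i] - centers_luv[j]) < merge_threshold:
                mapping[mapping == mapping[j]] = mapping[i]
    return mapping[labels].reshape(h, w)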
As mentioned above, and as noted in the Mirmehdi-Petrou paper, the criteria most often used to judge the performance of image segmentation algorithms are subjective, because we usually want the software's segmentation of an image to match the one performed by our own visual systems. It therefore makes sense to use this "measure" to characterize the performance of this project: if the algorithm achieves a segmentation similar to how we would segment the image ourselves, then the algorithm performs well.
To test the algorithm, we will first use a set of generated color textures with very well-defined regions, i.e. clear boundaries between regions, and then a set of natural color images similar to those used in the Mirmehdi-Petrou paper. We plan to find appropriate test images on the web and/or create test images of our own using our digital cameras and Photoshop.
In addition to using the subjective criteria mentioned above to judge the segmentation algorithm, we would also like to compare this algorithm's results against those obtained using a few other schemes.
First, we would like to run the segmentation code described above on images that have not been perceptually filtered. In other words, the multiscale segmentation will be performed on a set of images filtered using simple Gaussian kernels in either the RGB space or the opponent color space (since the opponent color space and the RGB space are both linearly related to the XYZ space, it should not matter which space we run the filters in). This comparison will show us the improvement gained by using Zhang and Wandell's perceptual filters.
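A minimal sketch of this baseline, mirroring the tower construction above but with a single Gaussian width shared by all three channels (the parameter values are illustrative):

import numpy as np
from scipy.ndimage import gaussian_filter

def build_baseline_tower(img, n_levels=4, sigma=1.0, step=2.0):
    """Non-perceptual baseline: the same isotropic Gaussian is applied to
    all three channels at each level, in place of the channel-dependent
    perceptual kernels. Running the rest of the pipeline unchanged on this
    tower isolates the contribution of the perceptual filtering."""
    return [np.stack([gaussian_filter(img[..., ch], sigma=sigma * step ** lev)
                      for ch in range(3)], axis=-1)
            for lev in range(n_levels)]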
Second, we would like to compare our segmentation results against those obtained using a completely different color segmentation method called EdgeFlow, based on the work of Wei Ma and B.S. Manjunath. The EdgeFlow algorithm should provide an interesting comparison because it uses very different concepts, such as Gabor texture feature extraction and segmentation using local texture gradients, and has also been tested on color images. Given the time constraints of the project, we intend to use the existing implementation of the algorithm, available on the EdgeFlow website. If time permits, we may also run the EdgeFlow algorithm on images filtered with the perceptual filters to see what results that yields.
The following is a list of papers containing the algorithm we intend to implement:
Furthermore, we have found the following set of related publications and may use these as additional references during the project:
Papers on perceptual filters:
Papers on multiscale segmentation: