Of the many concerns facing the design of a digital camera, the problem of color balancing is of primary importance. One might naively believe that the best way to resolve this issue would be to create sensors which sample the spectral power distribution with extreme precision, in order to create an image which most accurately represents the color properties of the world. However, this is not the case when one is making cameras for human observers. The major problem lies in the fact that the human visual system does not "see" colors by simply calculating the precise wavelengths absorbed by the three photoreceptors. Rather, the brain sees colors in terms of the relative spectral distribution of the entire scene. This explains why a piece of paper looks white whether it is viewed under a tungsten, halogen, or fluorescent lamp or whether it is viewed outdoors. Each of these conditions varies greatly in terms of the spectral power distribution ultimately reflected to the eye, and yet the paper looks white, the letters look black, and the colors do not seem to change. There are of course limits, under low lighting and during sunsets. That same paper might appear orange in the summer when the sun is a large red disk on the horizon, but take on a bluish cast when the sun is setting under cloud cover. Yet despite these extremes the color balancing properties of the eye are quite remarkable, and the exact mechanism still remains a mystery.
So, how does one create a camera which does indeed correct for the lighting situation, as one will invariably use that camera both indoors and out? Well, one must first clarify the problem further.
The basic process of picture taking is as follows:

    E_i = ∫ S_i(λ) · Reflectance(λ) · SPD(λ) dλ      (for each of the three sensor types i)

where the E's are the relative responses of the light sensors (to be converted to the RGB values which the camera outputs), the S's are the relative sensitivities of these sensors to the various wavelengths of light, the reflectance is a property of the object being viewed, and the SPD is the spectral power distribution of the light source. The product of reflectance and SPD gives the actual spectral power distribution which hits the eyes or is absorbed by the camera's sensors. The problem is that there are many possible ways of attaining the same product, so how is the unique reflectance of an object (which determines its perceived color) actually known?
There have been two major theories which provide a mechanism. One is the Gray World theory, which assumes that the colors in a given image should average out to some shade of gray; one can therefore scale the responses with respect to how much of that shade is present at the various sampled points in the visual field. One, in essence, factors out the average, which is assumed to represent the SPD of the light source, in order to obtain the relative reflectances of the objects as if they were viewed under a white light. This approach is not exact, of course, but it manages to scale the responses in terms of the surrounding conditions and is therefore a good approximation for color balancing. Another approach which also tries to approximate the lighting conditions is known as the Perfect Reflector (or Uniform Perfect Reflector) hypothesis, which assumes that the brightest object in an image reflects the best approximation of the SPD of the light source. Therefore, the RGB values of the brightest or most reflective spot are used to scale the RGBs of all other objects in the image in order to obtain a relative reading for the entire image. Both of these theories are tested here with respect to the Sony digital camera. Some comparisons will also be made with the Olympus digital camera.
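To make the Gray World idea concrete, here is a minimal MatLab sketch of the scaling it implies (this is only an illustration of the principle, not the processing either camera actually performs; the file name is a placeholder):

    % Gray World sketch: assume the scene should average to gray, so divide
    % each channel by its own mean and rescale to a common target intensity.
    img = double(imread('scene.jpg')) / 255;        % hypothetical input image
    means = zeros(1, 3);
    for c = 1:3
        means(c) = mean(mean(img(:, :, c)));        % per-channel average
    end
    target = mean(means);                           % gray level the scene is assumed to average to
    balanced = img;
    for c = 1:3
        balanced(:, :, c) = img(:, :, c) * (target / means(c));  % factor out the estimated illuminant
    end
    balanced = min(max(balanced, 0), 1);            % clip to the displayable range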
The experiment was divided into two parts, one involving the analysis of the Gray World assumption and the other the analysis of the Perfect Reflector assumption. Images were created in MatLab using the functions makegray44.m and testpattern.m (written by Kuan Yong), which generate a 6 by 8 checkered pattern that averages out to a gray value specified by the user. Four of these checkers can be given an RGB value different from the rest of the checkers, but the average of the whole image still remains the specified gray. These four squares can be placed either in the center of the image or dispersed in the corners.
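The actual makegray44.m and testpattern.m are not reproduced here, but a simplified sketch of the idea -- a gray board with a few deviating checkers whose deviations are cancelled by complementary checkers -- might look like the following (the checker positions, the checker size in pixels, and the uniform gray background are assumptions for illustration only):

    % Sketch of a 6-by-8 checker pattern that averages to a specified gray.
    % Two checkers get a user-chosen RGB; two others get the complementary
    % deviation so the overall mean stays at the target gray.
    grayLevel  = 120;                      % target average (0-255)
    specialRGB = [140 140 140];            % light gray test checkers
    specialPos = [1 8; 6 1];               % upper right and lower left (row, col) -- assumed
    compPos    = [1 1; 6 8];               % opposite corners take the compensating value
    rows = 6; cols = 8; blockPix = 40;     % assumed checker size in pixels

    board = repmat(grayLevel, [rows cols 3]);            % background checkers left at gray here
    delta = reshape(specialRGB - grayLevel, [1 1 3]);
    for k = 1:size(specialPos, 1)
        board(specialPos(k,1), specialPos(k,2), :) = grayLevel + delta;   % light patches
        board(compPos(k,1),    compPos(k,2),    :) = grayLevel - delta;   % dark compensation
    end

    pattern = zeros(rows*blockPix, cols*blockPix, 3);
    for c = 1:3
        pattern(:, :, c) = kron(board(:, :, c), ones(blockPix));          % expand checkers to pixel blocks
    end
    image(uint8(pattern)); axis image off                                 % display the test pattern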
For the GrayWorld test, three different images were created:
1) A control image with an average RGB value of (120, 120, 120).
2) One yellowish image where the R and G values were both scaled by 1.3.
3) One bluish image where the B value was scaled by 1.3.
In each of these images, two light gray patches with RGB values of (140, 140, 140) were placed such that one lay in the upper right hand corner and the other in the lower left hand corner. These light patches were compensated for by two dark gray patches in the opposite corners so that the overall gray value would remain the same.
For the Perfect Reflector test, three different images were also created:
1) A control image with the same gray patches used in the GrayWorld test but placed in the center.
2) One image with two bright yellow patches in the center (with compensation by blue patches to keep the overall average the same).
3) One image the same as (2) except that the two bright central patches were blue instead of yellow (these too were compensated by complementary yellow patches to keep the overall gray level at 120, as in all of the experiments).
Most of the analysis revolves around the RGB values that the camera reports for the various checker colors at the different positions in the image. Of special interest are the checkers that were specified ahead of time, to see how they varied under the different surround conditions.
However, before the analysis could proceed, the raw RGB data collected from the camera had to be inverse gamma corrected in order to get at the actual RGB values reported by the sensors. One first had to calculate the gamma function that the camera uses. This was done by taking raw RGB measurements from pictures of the Gray Series in the MacBeth Color Checker. These measurements were then scaled to match the actual intensities of the series (which are known). A least-squared-error fit was then done to a log or exponential function; the value of the exponent is called the gamma. The gamma for the Sony was computed using the MatLab function evalgamma.m. It is plotted below.
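A sketch of how such a least-squares fit can be done (this is not the actual evalgamma.m, and the gray-series numbers below are placeholders rather than measured values): fitting a straight line in log-log coordinates gives the exponent of the power law relating camera values to scene intensities.

    % Sketch of the gamma fit: fit a power law intensity = value^gamma by a
    % least-squares straight line in log-log space; the slope is the exponent.
    knownIntensity = [0.03 0.09 0.20 0.36 0.59 0.90];        % placeholder gray-series reflectances
    cameraValue    = [0.16 0.29 0.44 0.59 0.76 0.95];        % placeholder normalized camera responses
    p = polyfit(log(cameraValue), log(knownIntensity), 1);   % straight-line fit in log-log coordinates
    gammaFit = p(1);                                          % slope = the exponent ("gamma")
    fprintf('fitted gamma: %.2f\n', gammaFit);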
The fitted gamma value is 1.93 (this is the value reported in the graph). In order to recover the pre-gamma-corrected RGBs from the camera output, each of the normalized R, G, and B values is raised to this power, undoing the camera's compressive encoding. The resulting image should be darker than the original, which is what was found.
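Applying the correction is then a single elementwise power (a sketch with a placeholder file name, following the procedure just described):

    % Inverse gamma correction sketch: raise the normalized camera RGBs to the
    % fitted exponent, producing the darker, pre-gamma image used below.
    g = 1.93;                                                     % exponent reported for the Sony
    raw = double(imread('sony_grayworld_control.jpg')) / 255;     % hypothetical captured image
    linearRGB = raw .^ g;                                         % undo the camera's compressive encoding
    imwrite(linearRGB, 'sony_grayworld_control_linear.png');      % save for the checker sampling step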
The gamma corrected images are shown below. The top row shows the three Gray World images and the bottom row shows the three Perfect Reflector images as they were taken by the camera.

Now, in order to process the data, the central areas of each of the checkers in all of the images were sampled using the function processLabData.m (also see Li-Yi's web page). The RGB values within each sampled area were averaged, and the image was recreated from these averages. However, only the central 20 checkers were used in the analysis, since the camera was focused on the center and some of the bordering checkers were cut off in the image.
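A sketch of what this sampling step amounts to (this is not the actual processLabData.m; the grid geometry, window size, and file name are assumptions):

    % Checker sampling sketch: average a small window at the center of each
    % checker and collect the averages into a 6-by-8-by-3 array.
    img = double(imread('sony_grayworld_control_linear.png')) / 255;   % hypothetical corrected image
    rows = 6; cols = 8; win = 10;                       % checker grid and half-width of the sample window
    [h, w, ~] = size(img);
    checkerRGB = zeros(rows, cols, 3);
    for r = 1:rows
        for c = 1:cols
            cy = round((r - 0.5) * h / rows);           % pixel center of this checker
            cx = round((c - 0.5) * w / cols);
            patch = img(cy-win:cy+win, cx-win:cx+win, :);
            checkerRGB(r, c, :) = mean(mean(patch, 1), 2);   % average RGB of the sampled window
        end
    end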
The following graphs analyze the Gray World hypothesis as applied to the Sony camera. The RGB values for all of the checkers were averaged into a single RGB for each of the three images: control, yellowish, and bluish. The yellow and blue averages were then divided by the control average. If no color correction took place, the resulting scale should be 1.3 for red and green in the yellow image (since this was the scale used to generate the test pattern) and 1.3 for blue in the blue image. If the camera took this information into account, it would correct the image (so that it more closely resembled the control) and the scale would be closer to 1. The following graph shows the R and B scales under the yellow condition.
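In code, the test reduces to a pair of elementwise ratios (a sketch; the checker-average variables below are hypothetical placeholders standing in for the output of the sampling step, and which 20 checkers count as "central" is assumed):

    % Gray World test sketch: average the central checkers of each image into
    % one RGB and divide the test-image averages by the control average.
    checkerRGB_control = rand(6, 8, 3);     % placeholders for the sampled checker averages
    checkerRGB_yellow  = rand(6, 8, 3);
    checkerRGB_blue    = rand(6, 8, 3);
    central = @(A) A(2:5, 2:6, :);          % assumed central 20 checkers of the 6-by-8 grid
    avgControl = squeeze(mean(mean(central(checkerRGB_control), 1), 2));
    avgYellow  = squeeze(mean(mean(central(checkerRGB_yellow),  1), 2));
    avgBlue    = squeeze(mean(mean(central(checkerRGB_blue),    1), 2));
    yellowScale = avgYellow ./ avgControl   % expect ~[1.3 1.3 1.0] if no balancing occurred
    blueScale   = avgBlue   ./ avgControl   % expect ~[1.0 1.0 1.3] if no balancing occurred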
As can be seen, little correction took place, especially when compared with the Olympus camera (see Kuan's page). To analyze the blue condition, first the red and then the green value was set to one (to control for each other's effects); the resulting blue scales are shown below.
As can be seen here as well, little correction took place (again, compare to the Olympus analysis on Kuan's page).
However, one can see these effects much better by comparing the checkers that did not change across the three tests -- the light gray checkers placed in the corners. Below is an image which shows the light gray checkers from the upper left and the lower right (arranged left to right, respectively, and placed next to one another to enhance their differences). The left hand column shows data from the Sony camera and the right hand column shows data from the Olympus camera. Besides the overall differences in intensity observed in the checkers (which were created identically), there are much greater differences in the Olympus grays from picture to picture. The variance between the light grays in the two different corners might have been due to exposure differences, but the variance in the Olympus grays seems to be illustrative of color balancing.

To see why this is so, we need to examine the RGB values directly. (The two corners were averaged in the following analysis.)
Sony:

                R        G        B
    Control     0.3191   0.3248   0.3097
    Yellow      0.2966   0.3033   0.2836
    Blue        0.2989   0.3073   0.2979

Olympus:

                R        G        B
    Control     0.3201   0.3118   0.3004
    Yellow      0.3171   0.3224   0.3914
    Blue        0.3831   0.3601   0.3132
Looking at the RGBs for the Sony, one sees that there is little variation across all of the picture taking conditions. The Olympus RGBs, on the other hand, vary inversely with the conditions. This makes sense because under yellow lighting conditions the compensation would be to enhance the blue, and under blue lighting conditions the compensation would be to enhance the yellow (i.e., increase the R and the G proportionately).
The basis for this analysis is very similar to that of the Gray World test, but the approach is slightly different. Whereas the Gray World assumption approximates the ambient light conditions by averaging all the RGB values in the image, the Perfect Reflector hypothesis makes this approximation by looking at the brightest, most reflective object in the image. In order to test whether the camera was indeed performing this type of calculation, the three test patterns with the bright gray, yellow, and blue centers were used, and the question was asked whether the camera takes any of this information into account in changing the RGB values of the other checkers in the image.
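For comparison with the Gray World sketch above, here is a minimal MatLab illustration of the Perfect Reflector (white-patch) scaling being tested (again, only an illustration of the hypothesis, with a placeholder file name, not the camera's actual processing):

    % Perfect Reflector sketch: take the brightest point as the estimate of the
    % illuminant and scale every channel so that this point maps to white.
    img = double(imread('scene.jpg')) / 255;        % hypothetical input image
    lum = sum(img, 3);                              % simple brightness measure per pixel
    [~, idx] = max(lum(:));                         % location of the brightest pixel
    [py, px] = ind2sub(size(lum), idx);
    illuminant = squeeze(img(py, px, :));           % RGB of the brightest point
    balanced = img;
    for c = 1:3
        balanced(:, :, c) = img(:, :, c) / illuminant(c);   % scale so the brightest point becomes white
    end
    balanced = min(balanced, 1);                    % clip values that exceed the display range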
As can be seen, there is little or no change in the RGB values of the checkers around the central pattern, no matter what color is made brightest in the center. Tests were done only in the center of view since the manual stated that color correction was centrally weighted. It is interesting to observe, however, that the central bright checker patterns and their complements (those colors generated to keep the overall average the same) were interpreted differently by the camera. Let's take a closer look at the outputs of the cameras as compared to the central patterns.

At first glance the matches seem relatively good, with the Sony actually beating out the Olympus on the bright yellow test pattern. But let us look once again at the relative RGBs for the top left and bottom right checkers (which represent the brightest colors) and also the top right and bottom left checkers (which represent the complements of those colors).
As can be seen, the same kind of slight variation found when looking at the light gray checkers in the Gray World experiments is also found here in the camera RGB values for the brightest center colors. Here, however, it is more interesting because the complements of these colors (shown in rows 2 and 4 above) did not vary between their two locations in the image. Why might this be so? One can guess that although the two cameras do not seem to perform any color balancing based on the brightest color in the image (at least under these test conditions), they do seem to recognize the brightest color, because the complement of that color was very precisely set in the camera output RGBs. This implies some opponent-color balancing, which would be highly analogous to the human visual system. However, more tests are necessary in which the intensity of the bright color is varied, so that the reliability of the complement RGBs can be measured and the color-opponent balancing effects defined more precisely. This question might have more to do with color constancy than color balancing, however.
As a final illustration of how the cameras scaled the RGBs from bright yellow to complementary yellow and from bright blue to complementary blue, the graph below shows these values relative to the central gray patterns. The values at sample indices 1 and 4 are the relative RGB values for the bright central colors, and the values at sample indices 2 and 3 are the relative RGB values for the complementary central colors. The solid lines show the bright yellow condition and the dashed lines show the bright blue condition. One can see that the two cameras had different scaling properties, especially in the complementary blue domain, which results in the differences in color appearance, most notably in the picture with the bright yellow center.
The limited analysis conducted in this project has nonetheless given a great deal of insight into the color balancing properties of digital cameras. It is most likely that the gray world algorithm is used rather than the perfect reflector, but the information conveyed by the brightest portion of the image might be used in opponent color processing rather than in color balancing per se. The test images were designed to be somewhat confusing for the camera. The use of a computer monitor allowed for great precision in the specification of RGB values at various locations in the image, but the lighting situation was far from ideal for a controlled study. First of all, the flash on the cameras had to be disabled so that there would be no reflections off the monitor. This meant that the shutter was open for a longer time, and the refresh rate of the monitor was hopefully high enough that the overall average colors recorded by the camera were not too biased by these extraneous factors. The images were also confusing in the sense that certain elements were kept constant while other aspects of color varied -- we hoped to find inconsistencies, and we did. The cameras, of course, have nowhere near the capabilities of the human eye when it comes to color balancing. The main advantage of the eye is not sensitivity but computation. The eye continually reevaluates the scene every time it moves from one location to the next; much feedback goes back to the neurons which encode the very basic elements of the image, so that the higher levels emphasize what to look for and direct resources where appropriate. The digital cameras have far to go.