Demosaicing in the Kodak DC210 Digital Camera
Cleve Cheng
Psych 221 / EE 362
Winter Quarter, 1997-8
Abstract
This paper presents an attempt to reverse engineer the mosaic layout and
interpolation that make up the demosaicing algorithm in a Kodak DC210
digital camera. Various test patterns were generated and captured with the
camera in an attempt to produce chromatic aliasing artifacts that are clues
to this demosaicing algorithm. Quantitative results were obtained by
processing these images in Matlab. The end result is a prediction of the
mosaic pattern and interpolation algorithm used to produce the final images
stored in the camera's non-volatile memory.
Introduction
A consumer-grade digital camera acquires images with a charge-coupled device
(CCD) array, or a CMOS array. In either case, each light sensing element
integrates the incident light over the whole spectrum; each is essentially
monochromatic. In order to capture color images, these elements must be
grouped into several types, typically red, green and blue. To make the red
elements sense red light, red filter material (green- and blue-absorbing)
is placed over the element. Similarly with the green and blue elements.
These several types of elements are arranged in some pattern, usually regular
and usually rectangular, to make up the sensor array. Since any given element
only senses one band of wavelengths, the raw image collected from the array
is a mosaic of red, green and blue pixels, for instance. To form the
pixels in the "final" image (there is still color balancing and compression to
be done), the camera (or software) must interpolate pixels in order to fill
in the missing values. For instance, the pixel which corresponds to a
green CCD element will need to infer from neighboring elements what the red
and blue values are at that point. Such an interpolation is typically carried
out by applying a weighting matrix (kernel) to the neighborhood around a
missing value. The size of the neighborhood and the shape of the kernel
(i.e., the values of the weights) is left up to the camera engineer. These,
along with the mosaic pattern, are what I attempt to recover through this
experiment.
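As a purely illustrative sketch of this kind of kernel interpolation, the following fragment fills in the missing green values of a Bayer-style checkerboard using a bilinear kernel. The layout, kernel, and weights here are assumptions for illustration; recovering the camera's actual choices is the goal of the experiment.

```python
import numpy as np

def convolve2d_same(img, k):
    """Tiny 'same'-size 2-D convolution (no SciPy dependency)."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=float)
    for dy in range(kh):
        for dx in range(kw):
            out += k[dy, dx] * padded[dy:dy + img.shape[0],
                                      dx:dx + img.shape[1]]
    return out

def interpolate_green(raw):
    """Hypothetical example: bilinear interpolation of the green channel.

    Green samples are assumed to sit on a checkerboard; missing values
    are averaged from the four nearest green neighbors.
    """
    h, w = raw.shape
    yy, xx = np.mgrid[0:h, 0:w]
    mask = ((yy + xx) % 2 == 0).astype(float)  # assumed green positions
    sparse = raw * mask                        # zero out non-green sites
    kernel = np.array([[0, 1, 0],
                       [1, 4, 1],
                       [0, 1, 0]]) / 4.0       # bilinear weights
    # Normalize by the kernel applied to the mask so edges stay correct.
    return convolve2d_same(sparse, kernel) / convolve2d_same(mask, kernel)

# Sanity check: a flat green field should be reconstructed exactly.
flat = np.full((6, 6), 10.0)
recon = interpolate_green(flat)
```

The normalization by the mask response is one common way to handle image borders; a camera engineer could just as well choose a larger neighborhood or unequal weights, which is exactly what this experiment tries to detect.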
Examples of Mosaic Layouts
This demosaicing algorithm is only one intermediate step along the whole
image acquisition pipeline. An outline of this pipeline, with important
factors associated with each stage, is given below. The point here is that
processing both before and after the demosaicing stage has a great effect on
the actual values being processed in that stage, and in particular, the
accuracy of an experiment where only data at the ends of the pipeline is
available. The attempt is to determine internal properties buried deep within
a black box by observing the overall system response to special inputs.
- Expose CCDs
- Properties of target
- Misalignment of phosphors/LCD pixels
- Red, green, and blue are not perfectly lined up
- Pointspread/linespread of pixel
- Distortions from optics
- Barrel distortion
- Image intensity falloff from center
- Chromatic aberration
- Inseparable from color balancing
- Focus
- Auto-focus cameras make this just approximate.
- Auto-exposure
- Various algorithms possible here.
- Saturation
- Spectral properties of RGB filters.
Not crucial for demosaicing, as long as the RGB values of the
test targets reasonably isolate the appropriate filters.
- Download CCDs to RAM
- Dark noise
- Sensor crosstalk (noise)
- Electrical noise
- Demosaicing (interpolation)
- Layout of Color Filter Array (CFA) [2]
- Bayer (most common)
- Diagonal stripes
- 3G/2G
- Sparse checkerboard
- Horizontal stripes
- Vertical stripes
- Interpolation kernel
- Resampling
- Size of output image vs. # of sensors
- Color Balancing
- Different lighting conditions trigger different color balancing
algorithms
- Compression (JPEG)
- Quality factor unknown
- Is it the standard algorithm?
Demosaicing, Resampling, and Color Balancing can be done in any order. The
order affects my results, since the ordering determines what inputs the
demosaicing algorithm sees.
Camera
The camera used was a Kodak
DC210 Zoom
digital camera. It has a maximum
picture resolution of 1152 x 864, with a 1160 x 872 CCD sensor array.
It stores 24-bit color images in JPEG or FlashPix format. All images are
compressed. Its zoom lens covers a 35mm-equivalent range of 29mm to 58mm.
The camera is autofocus, and autoexposure, although the exposure
can be manually increased or decreased by up to 2 f-stops relative to the
automatic exposure. Its focus range is from 19.8" to infinity.
Its shutter speed is from 1/2 to 1/362 seconds. Its aperture range (for
wide angle mode) is f/4.0 to f/13.5. Its ISO Equivalent is 140.
I performed all experiments with default exposure settings, highest quality
compression (least compression), high resolution mode, and no flash.
Isolating Factors
In order to get accurate measurements, we need to factor out all of the
parts of the pipeline that are not part of the demosaicing algorithm. Given
the black box approach to this experiment, some of these are impossible to
single out. Others can be ignored, given an appropriate testing situation.
Thus,
- Distortion due to optics
- is minimized if all targets are in the center of the camera,
in controlled lighting conditions, and the camera is far enough away
from the target.
- Imprecision of target
- could be a problem, but turns out not to be. Discussion below.
- Saturation
- is avoided by keeping target intensities low
- Dark Noise
- is factored out by taking a picture of the screen blank, and finding
the mean and standard deviation in the target area
- Resampling
- is probably not done, since the sensor array size so neatly matches
the output image size. (Note: this would not be the case if images were
taken in low-resolution mode)
- Color Balancing
- is simply assumed to be a global shift in RGB values, and hence is not
important for tests that are looking for repeating patterns of chromatic
aliasing. Local shifts in
color balancing are beyond the scope of this project to account for.
- Compression
- is also beyond the scope of this project to compensate for.
A more detailed explanation follows.
- Smoothing
- happens at various stages in the pipeline. A discussion follows.
Smoothing
The difficulty in determining the interpolation algorithm is the blurring
that occurs at various stages in the pipeline. At the beginning of the
pipeline, there
is a pointspread and linespread function for the optics of the camera. This
may be asymmetric, and it may not be constant over the area of the lens.
At the end of the pipeline, there is resampling and compression. In this
case it is JPEG compression, which blurs more in the chromaticity channels
than in the luminance, exactly the channels we need accurate information about.
I did not have access to intermediate values in the pipeline, so these are not
independently measurable. There is only a final image which is blurred.
If the blurring occurred primarily before the demosaicing, then the
demosaicing interpolation would contribute few artifacts. If, however, the
blurring is primarily due to the later stages, the demosaicing would have a
noticeable effect on the image. Given this uncertainty, I have only been able
to measure the size of the composite kernel formed by the combination of these
four blurring stages (optics, demosaicing, resampling, and compression). This
is what I present in the results.
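The composition of these blurring steps can be modeled as a convolution of the per-stage kernels. The one-dimensional sketch below uses invented kernels purely to show how the widths accumulate; none of these taps are measured values from the camera.

```python
import numpy as np

# Sketch: the measured linespread is the composite of several blurs
# applied in sequence, i.e. the convolution of the per-stage kernels.
# All kernels below are invented for illustration only.
optics    = np.array([0.25, 0.5, 0.25])   # lens pointspread (assumed)
demosaic  = np.array([0.5, 0.5])          # interpolation kernel (assumed)
resample  = np.array([1.0])               # identity: no resampling
jpeg_blur = np.array([0.2, 0.6, 0.2])     # compression smoothing (assumed)

composite = optics
for k in (demosaic, resample, jpeg_blur):
    composite = np.convolve(composite, k)

# Each stage widens the composite: tap counts add as (n - 1) terms,
# so 3 + 2 + 1 + 3 three-stage convolutions give a 6-tap kernel.
width = len(composite)
```

Because only the composite is observable from the final images, any width measured at the end of the pipeline is an upper bound on the width of the demosaicing kernel alone.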
Compression
Perhaps the greatest difficulties were due to the automatic JPEG compression
by the camera. JPEG compression produces "ringing" at sharp contrast edges.
This ringing is due to quantization error and blocking artifacts, and has
different magnitudes for chromaticity versus luminance channels
[4]. Therefore, the most noticeable
distortions occur at well-defined edges, exactly the things that are needed
to accurately determine the interpolation and fine-grain mosaic pattern.
To put bounds on the magnitude of this error, I would need to have the
quality factor used to compress each image. Assuming that it is possible to
determine this from the JPEG file headers, it is still difficult to determine
the precise character of the distortion at a given real-world edge. Therefore,
in order to concentrate on more precisely determinable aspects of the pipeline,
I chose to not compensate for this compression.
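To see why coarse quantization of DCT coefficients produces ringing, one can quantize an ideal step edge, as in this one-dimensional sketch. The block size matches JPEG's 8-point DCT, but the quantization step is invented; the camera's actual quantization tables are unknown.

```python
import numpy as np

# Take a 1-D step edge, transform with an 8-point DCT (as JPEG does per
# block), coarsely quantize the coefficients, and invert. The result
# over- and undershoots near the edge, i.e. it "rings".
N = 8
n = np.arange(N)
# Orthonormal DCT-II basis matrix (column k is basis vector k).
C = np.sqrt(2.0 / N) * np.cos(np.pi * (n[:, None] + 0.5) * n[None, :] / N)
C[:, 0] /= np.sqrt(2.0)

step = np.array([0, 0, 0, 0, 100, 100, 100, 100], dtype=float)
coeffs = C.T @ step                    # forward DCT
q = 40.0                               # coarse quantization step (assumed)
coeffs_q = np.round(coeffs / q) * q    # quantize and dequantize
recon = C @ coeffs_q                   # inverse DCT

overshoot = recon.max() - step.max()   # ringing past the original levels
```

In the camera the same thing happens in two dimensions and, with chroma handled more coarsely than luma, the ringing acquires the colored fringes that confound edge measurements.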
Imprecision of Target
I initially was concerned with the fact that both monitors had pixels that
were composed of separate, non-spatially-aligned red, green, and blue elements.
Since most of the targets were intended to represent ideal white lines or
points, any chromatic aliasing was not necessarily due to the sensor mosaic,
but could also be due to this target mosaic.
This obstacle is easily overcome. A careful examination of the phosphors
and LCD element layout under a magnifying glass shows that in both cases the
colors are offset horizontally (along the scanline). That is, as we proceed
across a scanline, we encounter a mosaic of red, green, blue, red, green, blue,
etc. There is no visible vertical offset between color elements for a given
white pixel. Therefore, any aliasing that is due to the target mosaic will
only show up in the horizontal direction. However, I detected chromatic
aliasing patterns in the same target oriented both vertically and
horizontally. I therefore posit that the blurring which occurs before the
demosaicing is sufficient to erase artifacts due to this target mosaic. Any
chromatic aliasing which does appear must then be attributed to demosaicing.
Test Targets
I developed a variety of test targets to give the best chance of producing
meaningful chromatic aliasing. These included:
LCD Panel
- Horizontally-moving pixel
- Stripe patterns of varying spatial frequencies at varying angles
(see [3])
- Rotating single scanline
- Moving diagonal scanline
- Solid red, green, blue
- Black
CRT
- Rotating single scanline
- Gradients (two-color, three-color, linear, non-linear)
- Solid red, green, blue
- Black
The LCD was an Apple PowerBook 5300CE active matrix display with 800x600
resolution, 16-bit color, and Active Matrix Gamma Correction turned on.
The CRT was a 20" Digital PCXAV-BZ operating at 1280x1024 resolution with
24-bit color.
The gradient tests are
most effective on the CRT, since its greater color depth reduces banding.
The single line tests are also more effective on the CRT, because it has
greater contrast. The rest of the tests depend very strongly on precise
pixel location, and for this, I deemed the LCD best, with its square,
sharp pixels.
The key technique is to use test patterns with spatial frequencies
close to the Nyquist limit of the CCD array (i.e., on the order of a single
pixel). Knowing the proportions of the screens, the pixel patterns of the
targets, and the distance of the camera from the screen, I determined the
spatial frequency of the images. By using images with sharp black & white
edges, I attempted to capture chromatic aliasing artifacts of the
demosaicing process.
Once some of these are detected, the images are read
into Matlab, and analyzed to determine the spatial frequency, orientation, and
magnitude of the aliasing artifacts. This is then compared with predicted
aliasing patterns caused only by each class of mosaic. If these match, then
we can be reasonably certain that this is the CCD layout being used;
otherwise, we compare it with predictions from another mosaic layout. Once
the layout is determined, we then go on to make some estimates of the
interpolation kernel size used in demosaicing.
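The analysis step (done in Matlab in the original experiment) can be sketched as follows in Python: isolate the chromatic fluctuation as the red-blue difference along a scanline and find its dominant spatial frequency with an FFT. The scanline here is synthetic; in the experiment it would come from a captured image, and the alias period is an assumed value.

```python
import numpy as np

# Synthetic scanline with an assumed chromatic alias of period 6 pixels.
n = 240                                   # multiple of the alias period
x = np.arange(n)
alias_period = 6
red  = 100 + 20 * np.cos(2 * np.pi * x / alias_period)
blue = 100 - 20 * np.cos(2 * np.pi * x / alias_period)

diff = red - blue                         # chromatic component only
spectrum = np.abs(np.fft.rfft(diff - diff.mean()))
peak_bin = int(np.argmax(spectrum))       # dominant frequency bin
peak_period = n / peak_bin                # alias period in pixels
```

The magnitude at the peak bin, relative to the measured noise floor, gives the magnitude of the aliasing artifact; running the same analysis along rows and columns gives its orientation.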
Targets
As it turned out, the only interesting results were obtained from the second
set of LCD images and the single scanline on the CRT. The greater precision
of placement with the LCD helped produce chromatic aliasing, while the
brightness of the CRT scanline made it possible to measure the linespread.
Noise
Two noise measurements were made, one of a white screen and one of a black
screen. The white-screen measurement tells us how much variation to expect in
the bright pixels of a pattern, while the black-screen measurement does the
same for the dimmer pixels.
              White Screen           Black Screen
              Mean    Std. Dev.      Mean    Std. Dev.
  Red         124     9.86           0.629   0.530
  Green       122     9.11           0.609   0.523
  Blue        125     8.77           1.664   1.233
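These statistics amount to a per-channel mean and standard deviation over the target region. The fragment below sketches the computation on a synthetic flat-field frame standing in for an actual capture; the region bounds and noise parameters are placeholders, not measured properties of the camera.

```python
import numpy as np

# Synthetic "white screen" frame: H x W x 3, with per-channel noise
# levels borrowed from the table above purely for illustration.
rng = np.random.default_rng(0)
frame = rng.normal(loc=[124, 122, 125], scale=[9.86, 9.11, 8.77],
                   size=(100, 100, 3))

region = frame[20:80, 20:80]              # hypothetical target area
means = region.reshape(-1, 3).mean(axis=0)
stds  = region.reshape(-1, 3).std(axis=0)
```

Any chromatic fluctuation in a test image smaller than these standard deviations cannot be attributed to demosaicing with any confidence.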
Mosaic Layout
From just three of the approximately 400 images taken, I was able to rule
out all but one class of mosaics.
First, let's look at vertical lines 1 pixel in width and spaced 4 pixels apart.
(see third column of horizontal line target below)
For each dot in the output image, calculations show there are 1.48 dots in the
target image. Thus, two lines spaced 4 pixels apart corresponds to two columns
in the sensor array spaced about 2.7 pixels apart. If there were pixels for
every color in every column (as would be the case for a diagonal pattern like
the one shown above), this would be significantly above the Nyquist frequency
of the mosaic, and we should not get appreciable aliasing. On the other hand,
if a given colored filter was only available every other column, this would
be well within the Nyquist frequency, and we would expect aliasing. This
suggests that the mosaic is either vertical stripes, the Bayer pattern, or a
sparse checkerboard (see [2] for a discussion why 3G patterns
are not typically used in commercial systems). By taking a closer look at
the nature of the chromatic aliasing, we see some evidence for the Bayer or
vertical stripe hypothesis. A plot along one scanline reveals that the
green value stays relatively stable in the center, while the red and blue
values fluctuate wildly. In fact, the red and blue values are exactly 180
degrees out of phase with each other. This seems to suggest that the red and
blue filters occupy alternate columns. Only the vertical stripe layout and the
Bayer layout conform to this
pattern (actually, some 3G patterns do as well, but they are difficult to
manufacture--see [2]).
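A toy simulation supports this reading: sampling ideal white lines at a near-Nyquist spacing with red and blue filters on alternate columns, then interpolating linearly, yields red and blue reconstructions that fluctuate in opposite phase. The layout and interpolation here are assumptions for illustration, not the camera's verified internals.

```python
import numpy as np

# White vertical lines every 3 sensor pixels (close to the ~2.7-pixel
# spacing computed above), sampled on alternate columns per channel.
n = 60                                   # multiple of the 6-pixel cycle
x = np.arange(n)
scene = (x % 3 == 0).astype(float)       # ideal achromatic line pattern

def reconstruct(known_parity):
    """Keep columns of one parity; average the two neighbors elsewhere."""
    out = scene.copy()
    miss = (x % 2) != known_parity       # columns to interpolate
    out[miss] = 0.5 * (np.roll(scene, 1)[miss] + np.roll(scene, -1)[miss])
    return out

red  = reconstruct(0)                    # red filters on even columns (assumed)
blue = reconstruct(1)                    # blue filters on odd columns (assumed)
phase = np.corrcoef(red, blue)[0, 1]     # strongly negative: out of phase
```

The strongly negative correlation is the one-dimensional analogue of the 180-degree phase relationship seen in the scanline plot.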
Vertical line aliasing
Horizontal line aliasing
Next, we look at the same pattern turned 90 degrees. Since the Bayer
pattern is the same when you rotate it 90 degrees, we should witness
comparable chromatic aliasing. I did measure some chromatic aliasing, but
its character is different than that for vertical lines. Looking at the
graph, we see that all three filter classes are more or less in phase. So
we become a little suspicious of the Bayer pattern hypothesis.
Since there is still aliasing, we can safely rule out the vertically-striped
pattern for the CFA, since horizontally, all three colors are sampled on
every scanline. This leaves the Sparse Checkerboard.
Sparse Checkerboard
G G G G G G G
G R G B G R G
G G G G G G G
G B G R G B G
G G G G G G G


Finally, we look at the same pattern turned approximately 45 degrees. Here
we again
see aliasing, but it is not clearly chromatic in nature. That is, there are
intensity variations where none existed in the target, but there is a strong
correlation between red, green, and blue elements in this alias. Therefore,
it is safe to say that the demosaicing is not the cause of this
aliasing. If we examine the Sparse Checkerboard pattern closely, we will
notice that along diagonals, red and blue are sampled rather coarsely. Thus,
at this frequency, chromatic aliasing should appear along the diagonals. That
it does not is proof that the mosaic pattern is not a Sparse Checkerboard.
So we are left with the Bayer pattern. It is likely that the different
character of the aliasing in the horizontal direction is due to factors not
controlled in this experiment, for instance compression or color balancing.
Interpolation Kernel


Looking at a picture of a single scanline of a CRT, we see that this
line is noticeably blurred. The actual scanline in the target was 1024 pixels
wide, but the image of the line (of which you see just a part) is 731
pixels across, meaning each pixel in the output image corresponds to 1.40
pixels in the target. Since the target scanline was one pixel wide, it should
span roughly 0.714 of a sensor pixel in the camera. However, we
see that the pixel continues to affect the intensity of pixels up to two
pixels away. This is roughly the "width" of the linespread. As I have
already explained, the fact that chromatic aliasing is faint or non-existent
in most of the test images taken is good evidence that much of this
blurring occurs before the demosaicing stage. Interpolation with a kernel of
this small size during the demosaicing stage would produce noticeable aliasing
of these single scanlines, but as we can see, the line remains white within
the bounds of the measured noise.
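The magnification arithmetic above, plus a rough width estimate, can be written out as follows. The intensity profile values are invented stand-ins for a measured cross-section of the imaged line; only the 1024 and 731 pixel counts come from the experiment.

```python
# Magnification from the scanline measurement.
target_px = 1024          # width of the CRT scanline in target pixels
image_px  = 731           # width of its image in output pixels
scale = target_px / image_px       # ~1.40 target pixels per output pixel
sensor_span = 1 / scale            # a 1-px line covers ~0.714 sensor pixels

# Hypothetical cross-section of the imaged line, one value per output
# pixel; the real profile would come from a row of the captured image.
profile = [2, 10, 60, 100, 55, 12, 3]
noise_floor = 10                                  # assumed, from noise table
above_noise = [v for v in profile if v > noise_floor]
spread_px = len(above_noise)       # pixels visibly affected by the line
```

Comparing `spread_px` with `sensor_span` quantifies the claim above: a sub-pixel line spreads its energy over several output pixels, so the composite linespread is several pixels wide.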
Conclusion
There were many factors which could not be dealt with within the scope of this
project, and which have probably had an effect on the results presented here.
To get more quantitative results, one would need to gather the rest of this data,
for instance the color balancing and resampling algorithms, or get data
closer to the source, to remove factors such as compression. In order to get
accurate measurements of the interpolation kernel's size and shape, we would
also need the pointspread/linespread of the part of the pipeline which precedes
demosaicing.
From examination of the output images obtained in this experiment, I am of
the opinion that the JPEG compression causes the greatest problems, since it
distorts most exactly those properties I have tried to measure. As a minimum,
this would need to be removed to improve on these results.
In short, I have derived some information about the internal structure of
the image acquisition pipeline for the Kodak DC210. Much information about
demosaicing remains obscured because demosaicing is not independent of other
processes in the pipeline. In order to get a truly clear picture of the
demosaicing algorithm, we need a clear picture of these other algorithms.
Acknowledgments
Many of the initial ideas for good test targets and factors to consider in
the image acquisition pipeline came out of discussions with Robert Erdmann,
Jeremy Johnson, Hareesh Kesavan, and Christa Worley.
References
[1]
Brainard, D., "Bayesian Method for Reconstructing Color Images from
Trichromatic Samples", IS&T, 47th Annual Conference, pp. 375-380, 1994.
[2]
Cok, D., "Reconstruction of CCD Images Using Template Matching", IS&T,
47th Annual Conference, pp. 380-385, 1994.
[3]
Gann, R., Reviewing and Testing Desktop Scanners. Hewlett-Packard Company,
1994.
[4]
Wandell, Brian A., Foundations of Vision. Sinauer, Sunderland, Mass.,
ch.2-4,8, 1995.
Last modified: Fri Mar 13 05:53:24 PST 1998