Demosaicing in the Kodak DC210 Digital Camera
Cleve Cheng
Psych 221 / EE 362
Winter Quarter, 1997-8
Abstract
This paper presents an attempt to reverse engineer the mosaic layout and
interpolation that make up the demosaicing algorithm in a Kodak DC210
digital camera. Various test patterns were generated and captured with the
camera in an attempt to produce chromatic aliasing artifacts that are clues
to this demosaicing algorithm. Quantitative results were obtained by
processing these images in Matlab. The end result is a prediction of the
mosaic pattern and interpolation algorithm used to produce the final images
stored in the camera's non-volatile memory.
Introduction
A consumer-grade digital camera acquires images with a charge-coupled device
(CCD) array, or a CMOS array. In either case, each light sensing element
integrates the incident light over the whole spectrum; each is essentially
monochromatic. In order to capture color images, these elements must be
grouped into several types, typically red, green and blue. To make the red
elements sense red light, red filter material (green- and blue-absorbing)
is placed over the element. Similarly with the green and blue elements.
These several types of elements are arranged in some pattern, usually regular
and usually rectangular, to make up the sensor array. Since any given element
only senses one band of wavelengths, the raw image collected from the array
is a mosaic of red, green and blue pixels, for instance. To form the
pixels in the "final" image (there is still color balancing and compression to
be done), the camera (or software) must interpolate pixels in order to fill
in the missing values. For instance, the pixel which corresponds to a
green CCD element will need to infer from neighboring elements what the red
and blue values are at that point. Such an interpolation is typically carried
out by applying a weighting matrix (kernel) to the neighborhood around a
missing value. The size of the neighborhood and the shape of the kernel
(i.e., the values of the weights) is left up to the camera engineer. These,
along with the mosaic pattern, are what I attempt to recover through this
experiment.
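As a purely illustrative sketch of this kind of kernel interpolation, the following fragment fills in the missing green values of a Bayer-style checkerboard using a bilinear kernel. The layout, kernel, and weights here are assumptions for illustration; recovering the camera's actual choices is the goal of the experiment.

```python
import numpy as np

def convolve2d_same(img, k):
    """Tiny 'same'-size 2-D convolution (no SciPy dependency)."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=float)
    for dy in range(kh):
        for dx in range(kw):
            out += k[dy, dx] * padded[dy:dy + img.shape[0],
                                      dx:dx + img.shape[1]]
    return out

def interpolate_green(raw):
    """Hypothetical example: bilinear interpolation of the green channel.

    Green samples are assumed to sit on a checkerboard; missing values
    are averaged from the four nearest green neighbors.
    """
    h, w = raw.shape
    yy, xx = np.mgrid[0:h, 0:w]
    mask = ((yy + xx) % 2 == 0).astype(float)  # assumed green positions
    sparse = raw * mask                        # zero out non-green sites
    kernel = np.array([[0, 1, 0],
                       [1, 4, 1],
                       [0, 1, 0]]) / 4.0       # bilinear weights
    # Normalize by the kernel applied to the mask so edges stay correct.
    return convolve2d_same(sparse, kernel) / convolve2d_same(mask, kernel)

# Sanity check: a flat green field should be reconstructed exactly.
flat = np.full((6, 6), 10.0)
recon = interpolate_green(flat)
```

The normalization by the mask response is one common way to handle image borders; a camera engineer could just as well choose a larger neighborhood or unequal weights, which is exactly what this experiment tries to detect.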
Examples of Mosaic Layouts
This demosaicing algorithm is only one intermediate step along the whole
image acquisition pipeline. An outline of this pipeline, with important
factors associated with each stage, is given below. The point here is that
processing both before and after the demosaicing stage has a great effect on
the actual values being processed in that stage, and in particular, the
accuracy of an experiment where only data at the ends of the pipeline is
available. The attempt is to determine internal properties buried deep within
a black box by observing the overall system response to special inputs.
- Expose CCDs
- Properties of target
- Misalignment of phosphors/LCD pixels
- Red, green, and blue are not perfectly lined up
- Pointspread/linespread of pixel
- Distortions from optics
- Barrel distortion
- Image intensity falloff from center
- Chromatic aberration
- Inseparable from color balancing
- Focus
- Auto-focus cameras make this just approximate.
- Auto-exposure
- Various algorithms possible here.
- Saturation
- Spectral properties of RGB filters.
Not crucial for demosaicing, as long as the RGB values of the
test targets reasonably isolate the appropriate filters.
- Download CCDs to RAM
- Dark noise
- Sensor crosstalk (noise)
- Electrical noise
- Demosaicing (interpolation)
- Layout of Color Filter Array (CFA) [2]
- Bayer (most common)
- Diagonal stripes
- 3G/2G
- Sparse checkerboard
- Horizontal stripes
- Vertical stripes
- Interpolation kernel
- Resampling
- Size of output image vs. # of sensors
- Color Balancing
- Different lighting conditions trigger different color balancing
algorithms
- Compression (JPEG)
- Quality factor unknown
- Is it the standard algorithm?
Demosaicing, Resampling, and Color Balancing can be done in any order. The
order affects my results, since the ordering determines what inputs the
demosaicing algorithm sees.
Camera
The camera used was a Kodak
DC210 Zoom
digital camera. It has a maximum
picture resolution of 1152 x 864, with a 1160 x 872 CCD sensor array.
It stores 24-bit color images in JPEG or FlashPix format. All images are
compressed. Its zoom lens covers a 35mm-equivalent range of 29mm to 58mm.
The camera is autofocus, and autoexposure, although the exposure
can be manually increased or decreased by up to 2 f-stops relative to the
automatic exposure. Its focus range is from 19.8" to infinity.
Its shutter speed is from 1/2 to 1/362 seconds. Its aperture range (for
wide angle mode) is f/4.0 to f/13.5. Its ISO Equivalent is 140.
I performed all experiments with default exposure settings, highest quality
compression (least compression), high resolution mode, and no flash.
Isolating Factors
In order to get accurate measurements, we need to factor out all of the
parts of the pipeline that are not part of the demosaicing algorithm. Given
the black box approach to this experiment, some of these are impossible to
single out. Others can be ignored, given an appropriate testing situation.
Thus,
- Distortion due to optics
- is minimized if all targets are in the center of the camera,
in controlled lighting conditions, and the camera is far enough away
from the target.
- Imprecision of target
- could be a problem, but turns out not to be. Discussion below.
- Saturation
- is avoided by keeping target intensities low
- Dark Noise
- is factored out by taking a picture of the screen blank, and finding
the mean and standard deviation in the target area
- Resampling
- is probably not done, since the sensor array size so neatly matches
the output image size. (Note: this would not be the case if images were
taken in low-resolution mode)
- Color Balancing
- is simply assumed to be a global shift in RGB values, and hence is not
important for tests that are looking for repeating patterns of chromatic
aliasing. Local shifts in
color balancing are beyond the scope of this project to account for.
- Compression
- is also beyond the scope of this project to compensate for.
A more detailed explanation follows.
- Smoothing
- happens at various stages in the pipeline. A discussion follows.
Smoothing
The difficulty in determining the interpolation algorithm is the blurring
that occurs at various stages in the pipeline. At the beginning of the
pipeline, there
is a pointspread and linespread function for the optics of the camera. This
may be asymmetric, and it may not be constant over the area of the lens.
At the end of the pipeline, there is resampling and compression. In this
case it is JPEG compression, which blurs more in the chromaticity channels
than in the luminance, exactly the channels we need accurate information about.
I did not have access to intermediate values in the pipeline, so these are not
independently measurable. There is only a final image which is blurred.
If the blurring occurred primarily before the demosaicing, then the
demosaicing interpolation would contribute few artifacts. If, however, the
blurring is primarily due to the later stages, the demosaicing would have a
noticeable effect on the image. Given this uncertainty, I have only been able
to measure the size of the composite kernel formed by the combination of these
four blurring stages (optics, demosaicing, resampling, and compression). This
is what I present in the results.
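The composition of these blurring steps can be modeled as a convolution of the per-stage kernels. The one-dimensional sketch below uses invented kernels purely to show how the widths accumulate; none of these taps are measured values from the camera.

```python
import numpy as np

# Sketch: the measured linespread is the composite of several blurs
# applied in sequence, i.e. the convolution of the per-stage kernels.
# All kernels below are invented for illustration only.
optics    = np.array([0.25, 0.5, 0.25])   # lens pointspread (assumed)
demosaic  = np.array([0.5, 0.5])          # interpolation kernel (assumed)
resample  = np.array([1.0])               # identity: no resampling
jpeg_blur = np.array([0.2, 0.6, 0.2])     # compression smoothing (assumed)

composite = optics
for k in (demosaic, resample, jpeg_blur):
    composite = np.convolve(composite, k)

# Each stage widens the composite: tap counts add as (n - 1) terms,
# so 3 + 2 + 1 + 3 three-stage convolutions give a 6-tap kernel.
width = len(composite)
```

Because only the composite is observable from the final images, any width measured at the end of the pipeline is an upper bound on the width of the demosaicing kernel alone.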
Compression
Perhaps the greatest difficulties were due to the automatic JPEG compression
by the camera. JPEG compression produces "ringing" at sharp contrast edges.
This ringing is due to quantization error and blocking artifacts, and has
different magnitudes for chromaticity versus luminance channels
[4]. Therefore, the most noticeable
distortions occur at well-defined edges, exactly the things that are needed
to accurately determine the interpolation and fine-grain mosaic pattern.
To put bounds on the magnitude of this error, I would need to have the
quality factor used to compress each image. Assuming that it is possible to
determine this from the JPEG file headers, it is still difficult to determine
the precise character of the distortion at a given real-world edge. Therefore,
in order to concentrate on more precisely determinable aspects of the pipeline,
I chose to not compensate for this compression.
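To see why coarse quantization of DCT coefficients produces ringing, one can quantize an ideal step edge, as in this one-dimensional sketch. The block size matches JPEG's 8-point DCT, but the quantization step is invented; the camera's actual quantization tables are unknown.

```python
import numpy as np

# Take a 1-D step edge, transform with an 8-point DCT (as JPEG does per
# block), coarsely quantize the coefficients, and invert. The result
# over- and undershoots near the edge, i.e. it "rings".
N = 8
n = np.arange(N)
# Orthonormal DCT-II basis matrix (column k is basis vector k).
C = np.sqrt(2.0 / N) * np.cos(np.pi * (n[:, None] + 0.5) * n[None, :] / N)
C[:, 0] /= np.sqrt(2.0)

step = np.array([0, 0, 0, 0, 100, 100, 100, 100], dtype=float)
coeffs = C.T @ step                    # forward DCT
q = 40.0                               # coarse quantization step (assumed)
coeffs_q = np.round(coeffs / q) * q    # quantize and dequantize
recon = C @ coeffs_q                   # inverse DCT

overshoot = recon.max() - step.max()   # ringing past the original levels
```

In the camera the same thing happens in two dimensions and, with chroma handled more coarsely than luma, the ringing acquires the colored fringes that confound edge measurements.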
Imprecision of Target
I initially was concerned with the fact that both monitors had pixels that
were composed of separate, non-spatially-aligned red, green, and blue elements.
Since most of the targets were intended to represent ideal white lines or
points, any chromatic aliasing was not necessarily due to the sensor mosaic,
but could also be due to this target mosaic.
This obstacle is easily overcome. A careful examination of the phosphors
and LCD element layout under a magnifying glass shows that in both cases the
colors are offset horizontally (along the scanline). That is, as we proceed
across a scanline, we encounter a mosaic of red, green, blue, red, green, blue,
etc. There is no visible vertical offset between color elements for a given
white pixel. Therefore, any aliasing that is due to the target mosaic will
only show up in the horizontal direction. However, I detected chromatic
aliasing patterns in the same target oriented both vertically and
horizontally. I therefore posit that the blurring which occurs before the
demosaicing is sufficient to erase artifacts due to this target mosaic. Any
chromatic aliasing which does appear must then be attributed to demosaicing.
Test Targets
I developed a variety of test targets to give the best chance of producing
meaningful chromatic aliasing. These included:
LCD Panel
- Horizontally-moving pixel
- Stripe patterns of varying spatial frequencies at varying angles
(see [3])
- Rotating single scanline
- Moving diagonal scanline
- Solid red, green, blue
- Black
CRT
- Rotating single scanline
- Gradients (two-color, three-color, linear, non-linear)
- Solid red, green, blue
- Black
The LCD was an Apple PowerBook 5300CE active matrix display with 800x600
resolution, 16-bit color, and Active Matrix Gamma Correction turned on.
The CRT was a 20" Digital PCXAV-BZ operating at 1280x1024 resolution with
24-bit color.
The gradient tests are
most effective on the CRT, since its greater color depth reduces banding.
The single line tests are also more effective on the CRT, because it has
greater contrast. The rest of the tests depend very strongly on precise
pixel location, and for this, I deemed the LCD best, with its square,
sharp pixels.
The key technique is to use test patterns with spatial frequencies
close to the Nyquist limit of the CCD array (i.e., on the order of a single
pixel). Knowing the proportions of the screens, the pixel patterns of the
targets, and the distance of the camera from the screen, I determined the
spatial frequency of the images. By using images with sharp black & white
edges, I attempted to capture chromatic aliasing artifacts of the
demosaicing process.
Once some of these are detected, the images are read
into Matlab, and analyzed to determine the spatial frequency, orientation, and
magnitude of the aliasing artifacts. This is then compared with predicted
aliasing patterns caused only by each class of mosaic. If these match, then
we can be reasonably certain that this is the CCD layout being used;
otherwise, we compare it with predictions from another mosaic layout. Once
the layout is determined, we then go on to make some estimates of the
interpolation kernel size used in demosaicing.
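The analysis step (done in Matlab in the original experiment) can be sketched as follows in Python: isolate the chromatic fluctuation as the red-blue difference along a scanline and find its dominant spatial frequency with an FFT. The scanline here is synthetic; in the experiment it would come from a captured image, and the alias period is an assumed value.

```python
import numpy as np

# Synthetic scanline with an assumed chromatic alias of period 6 pixels.
n = 240                                   # multiple of the alias period
x = np.arange(n)
alias_period = 6
red  = 100 + 20 * np.cos(2 * np.pi * x / alias_period)
blue = 100 - 20 * np.cos(2 * np.pi * x / alias_period)

diff = red - blue                         # chromatic component only
spectrum = np.abs(np.fft.rfft(diff - diff.mean()))
peak_bin = int(np.argmax(spectrum))       # dominant frequency bin
peak_period = n / peak_bin                # alias period in pixels
```

The magnitude at the peak bin, relative to the measured noise floor, gives the magnitude of the aliasing artifact; running the same analysis along rows and columns gives its orientation.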
Targets
As it turned out, the only interesting results were obtained from the second
set of LCD images and the single scanline on the CRT. The greater precision
of placement with the LCD helped produce chromatic aliasing, while the
brightness of the CRT scanline made it possible to measure the linespread.
Noise
Two noise measurements were made, one of a white screen and one of a black
screen. The white-screen measurement tells us how much variation to expect in
the bright pixels of a pattern, while the black-screen measurement does the
same for the dimmer pixels.
              White Screen           Black Screen
              Mean    Std. Dev.      Mean    Std. Dev.
  Red         124     9.86           0.629   0.530
  Green       122     9.11           0.609   0.523
  Blue        125     8.77           1.664   1.233
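These statistics amount to a per-channel mean and standard deviation over the target region. The fragment below sketches the computation on a synthetic flat-field frame standing in for an actual capture; the region bounds and noise parameters are placeholders, not measured properties of the camera.

```python
import numpy as np

# Synthetic "white screen" frame: H x W x 3, with per-channel noise
# levels borrowed from the table above purely for illustration.
rng = np.random.default_rng(0)
frame = rng.normal(loc=[124, 122, 125], scale=[9.86, 9.11, 8.77],
                   size=(100, 100, 3))

region = frame[20:80, 20:80]              # hypothetical target area
means = region.reshape(-1, 3).mean(axis=0)
stds  = region.reshape(-1, 3).std(axis=0)
```

Any chromatic fluctuation in a test image smaller than these standard deviations cannot be attributed to demosaicing with any confidence.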
Mosaic Layout
From just three of the approximately 400 images taken, I was able to rule
out all but one class of mosaics.
First, let's look at vertical lines 1 pixel in width and spaced 4 pixels apart.
(see third column of horizontal line target below)
For each dot in the output image, calculations show there are 1.48 dots in the
target image. Thus, two lines spaced 4 pixels apart corresponds to two columns
in the sensor array spaced about 2.7 pixels apart. If there were pixels for
every color in every column (as would be the case for a diagonal pattern like
the one shown above), this would be significantly above the Nyquist frequency
of the mosaic, and we should not get appreciable aliasing. On the other hand,
if a given colored filter was only available every other column, this would
be well within the Nyquist frequency, and we would expect aliasing. This
suggests that the mosaic is either vertical stripes, the Bayer pattern, or a
sparse checkerboard (see [2] for a discussion why 3G patterns
are not typically used in commercial systems). By taking a closer look at
the nature of the chromatic aliasing, we see some evidence for the Bayer or
vertical stripe hypothesis. A plot along one scanline reveals that the
green value stays relatively stable in the center, while the red and blue
values fluctuate wildly. In fact, the red and blue values are exactly 180
degrees out of phase with each other. This seems to suggest that the red and
blue filters occupy alternate columns. Only the vertical stripe layout and the
Bayer layout conform to this
pattern (actually, some 3G patterns do as well, but they are difficult to
manufacture--see [2]).
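A toy simulation supports this reading: sampling ideal white lines at a near-Nyquist spacing with red and blue filters on alternate columns, then interpolating linearly, yields red and blue reconstructions that fluctuate in opposite phase. The layout and interpolation here are assumptions for illustration, not the camera's verified internals.

```python
import numpy as np

# White vertical lines every 3 sensor pixels (close to the ~2.7-pixel
# spacing computed above), sampled on alternate columns per channel.
n = 60                                   # multiple of the 6-pixel cycle
x = np.arange(n)
scene = (x % 3 == 0).astype(float)       # ideal achromatic line pattern

def reconstruct(known_parity):
    """Keep columns of one parity; average the two neighbors elsewhere."""
    out = scene.copy()
    miss = (x % 2) != known_parity       # columns to interpolate
    out[miss] = 0.5 * (np.roll(scene, 1)[miss] + np.roll(scene, -1)[miss])
    return out

red  = reconstruct(0)                    # red filters on even columns (assumed)
blue = reconstruct(1)                    # blue filters on odd columns (assumed)
phase = np.corrcoef(red, blue)[0, 1]     # strongly negative: out of phase
```

The strongly negative correlation is the one-dimensional analogue of the 180-degree phase relationship seen in the scanline plot.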
Vertical line aliasing
Horizontal line aliasing
Next, we look at the same pattern turned 90 degrees. Since the Bayer
pattern is the same when you rotate it 90 degrees, we should witness
comparable chromatic aliasing. I did measure some chromatic aliasing, but
its character is different than that for vertical lines. Looking at the
graph, we see that all three filter classes are more or less in phase. So
we become a little suspicious of the Bayer pattern hypothesis.
Since there is still aliasing, we can safely rule out the vertically-striped
pattern for the CFA, since horizontally, all three colors are sampled on
every scanline. This leaves the Sparse Checkerboard.
Sparse Checkerboard
G G G G G G G
G R G B G R G
G G G G G G G
G B G R G B G
G G G G G G G


Finally, we look at the same pattern turned approximately 45 degrees. Here
we again
see aliasing, but it is not clearly chromatic in nature. That is, there are
intensity variations where none existed in the target, but there is a strong
correlation between red, green, and blue elements in this alias. Therefore,
it is safe to say that the demosaicing is not the cause of this
aliasing. If we examine the Sparse Checkerboard pattern closely, we will
notice that along diagonals, red and blue are sampled rather coarsely. Thus,
at this frequency, chromatic aliasing should appear along the diagonals. That
it does not is proof that the mosaic pattern is not a Sparse Checkerboard.
So we are left with the Bayer pattern. It is likely that the different
character of the aliasing in the horizontal direction is due to factors not
controlled in this experiment, for instance compression or color balancing.
Interpolation Kernel


Looking at a picture of a single scanline of a CRT, we see that this
line is noticeably blurred. The actual scanline in the target was 1024 pixels
wide, but the image of the line (of which you see just a part) is 731
pixels across, meaning each pixel in the output image corresponds to 1.40
pixels in the target. Since the target scanline was one pixel wide, it should
span roughly 0.714 of a sensor pixel in the camera. However, we
see that the pixel continues to affect the intensity of pixels up to two
pixels away. This is roughly the "width" of the linespread. As I have
already explained, the fact that chromatic aliasing is faint or non-existent
in most of the test images taken is good evidence that much of this
blurring occurs before the demosaicing stage. Interpolation with a kernel of
this small size during the demosaicing stage would produce noticeable aliasing
of these single scanlines, but as we can see, the line remains white within
the bounds of the measured noise.
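The magnification arithmetic above, plus a rough width estimate, can be written out as follows. The intensity profile values are invented stand-ins for a measured cross-section of the imaged line; only the 1024 and 731 pixel counts come from the experiment.

```python
# Magnification from the scanline measurement.
target_px = 1024          # width of the CRT scanline in target pixels
image_px  = 731           # width of its image in output pixels
scale = target_px / image_px       # ~1.40 target pixels per output pixel
sensor_span = 1 / scale            # a 1-px line covers ~0.714 sensor pixels

# Hypothetical cross-section of the imaged line, one value per output
# pixel; the real profile would come from a row of the captured image.
profile = [2, 10, 60, 100, 55, 12, 3]
noise_floor = 10                                  # assumed, from noise table
above_noise = [v for v in profile if v > noise_floor]
spread_px = len(above_noise)       # pixels visibly affected by the line
```

Comparing `spread_px` with `sensor_span` quantifies the claim above: a sub-pixel line spreads its energy over several output pixels, so the composite linespread is several pixels wide.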
Conclusion
There were many factors which could not be dealt with within the scope of this
project, and which have probably had an effect on the results presented here.
To get more quantitative results, one would need to gather the rest of this data,
for instance the color balancing and resampling algorithms, or get data
closer to the source, to remove factors such as compression. In order to get
accurate measurements of the interpolation kernel's size and shape, we would
also need the pointspread/linespread of the part of the pipeline which precedes
demosaicing.
From examination of the output images obtained in this experiment, I am of
the opinion that the JPEG compression causes the greatest problems, since it
distorts most exactly those properties I have tried to measure. As a minimum,
this would need to be removed to improve on these results.
In short, I have derived some information about the internal structure of
the image acquisition pipeline for the Kodak DC210. Much information about
demosaicing remains obscured because demosaicing is not independent of other
processes in the pipeline. In order to get a truly clear picture of the
demosaicing algorithm, we need a clear picture of these other algorithms.
Acknowledgments
Many of the initial ideas for good test targets and factors to consider in
the image acquisition pipeline came out of discussions with Robert Erdmann,
Jeremy Johnson, Hareesh Kesavan, and Christa Worley.
References
[1]
Brainard, D., "Bayesian Method for Reconstructing Color Images from
Trichromatic Samples", IS&T, 47th Annual Conference, pp. 375-380, 1994.
[2]
Cok, D., "Reconstruction of CCD Images Using Template Matching", IS&T,
47th Annual Conference, pp. 380-385, 1994.
[3]
Gann, R., Reviewing and Testing Desktop Scanners. Hewlett-Packard Company,
1994.
[4]
Wandell, Brian A., Foundations of Vision. Sinauer, Sunderland, Mass.,
ch.2-4,8, 1995.
Last modified: Fri Mar 13 05:53:24 PST 1998