Psych 221 Final Project

Optimizing Perceptual Quality in JPEG Coded Images

by Brad Johanson
March 13, 1998


Introduction

Typically, the Peak Signal to Noise Ratio (PSNR) or Mean Square Error (MSE) is used to evaluate the quality of a compressed image, which makes sense from a signal processing perspective.  Since most images are ultimately viewed by the human eye, however, these measures do not necessarily reflect quality as perceived in real world situations.  Specifically, the human eye tends to be less sensitive to high spatial frequencies and to certain wavelengths of light.  One place where high frequency information is important to humans, however, is at edges in images, which are thought to be crucial in scene segmentation.

The goal of this project was to look at ways of modifying the JPEG compression standard to create images that look better to the human eye, while maintaining a similar bit rate.  Standard JPEG and two modifications to it were examined.  The first modification is a process developed at NASA Ames known as DCTune which attempts to optimize the quantization coefficients in JPEG to achieve a certain perceptual quality.  The second was developed for this project and involves sending higher quality information for edge areas of the image, which hopefully will make images more intelligible.

For four standard images, each of the three types of compression was applied, and both PSNR and S-CIELAB error were examined.  S-CIELAB is a system developed by Wandell and Zhang which attempts to take into account image blurring in the eye in determining how well color values in two images agree.  For one of the images, the error measures were examined across differing levels of compression.  High resolution versions of all images are embedded in this page so that viewers can determine for themselves which are the most pleasing.
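As a rough illustration of the two signal processing measures mentioned above, MSE and PSNR can be sketched in Python as follows.  This is not the code used in this project; it assumes 8-bit samples (peak value 255) stored as flat lists, and the sample values are made up.

```python
import math

def mse(original, compressed):
    """Mean squared error between two equal-length sequences of samples."""
    if len(original) != len(compressed):
        raise ValueError("images must have the same number of samples")
    return sum((a - b) ** 2 for a, b in zip(original, compressed)) / len(original)

def psnr(original, compressed, peak=255):
    """Peak signal to noise ratio in dB (infinite for identical images)."""
    err = mse(original, compressed)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)

# Toy 2x2 "image" flattened to a list (illustrative values only).
orig = [52, 55, 61, 59]
comp = [54, 55, 60, 57]
print(mse(orig, comp), round(psnr(orig, comp), 2))
```

Higher PSNR means a smaller squared error, which is why it says nothing about where in the image the error falls or how visible it is.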


Basic JPEG Compression

Paint Shop Pro 4.10 was used to compress the images with the unmodified JPEG scheme.  Paint Shop Pro uses a "Quality" parameter to control compression which can be set from 1 to 99, where 1 is the highest quality and 99 is the lowest.

Paint Shop Pro is available for Windows 3.1, Windows 95 and Windows NT from: http://www.jasc.com


DCTune

DCTune is a program written at the NASA Ames Vision Research Center.  It has a built-in perceptual image quality measure, and it tries to optimize the DCT quantization coefficients to achieve a given perceptual quality at the best possible compression ratio.  The perceptual measure, P, ranges from 0 to infinity, with 1 being perceptually lossless and larger values indicating a greater perceptual difference.  Another mode of the program optimizes perceptual quality at a given compression ratio.
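To make "quantization coefficients" concrete, here is a minimal sketch of the JPEG quantization step that such a table controls.  It is not DCTune's optimizer, and the coefficient and table values are made up; real JPEG uses 8x8 blocks, and a 2x2 corner is used here only for brevity.

```python
def quantize_block(coeffs, qtable):
    """Divide each DCT coefficient by its table entry and round to an integer level."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, qtable)]

def dequantize_block(levels, qtable):
    """Decoder side: multiply each level by its table entry."""
    return [[l * q for l, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, qtable)]

# Hypothetical low-frequency corner of a DCT block and two hypothetical tables.
coeffs = [[-415, 30], [-22, 7]]
fine_table = [[16, 11], [12, 12]]
coarse_table = [[32, 22], [24, 24]]

fine_levels = quantize_block(coeffs, fine_table)
coarse_levels = quantize_block(coeffs, coarse_table)
print(fine_levels, coarse_levels)
```

Larger table entries push more coefficients to zero, which shrinks the file but increases the error.  A perceptual optimizer chooses where in the table that error is least visible.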

The DCTune website can be found here: http://vision.arc.nasa.gov/dctune/dctune1.1.html


S-CIELAB

The CIELAB system predicts how well colors match each other perceptually.  S-CIELAB is an extension to the system by Wandell and Zhang which takes into account the spatial sensitivity of the eye.  By specifying the number of pixels per degree of arc, the system takes into account blurring in the eye to determine what colors are perceived over regions of the image.  This is then compared to the original image to determine the amount of CIELAB perceptual error between the color in the original and compressed image.  In this project the average error across the whole image is used.
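The final averaging step can be sketched as below.  This is only an illustration, not the S-CIELAB Matlab code: it assumes both images have already been spatially filtered and converted to CIELAB (the part that makes S-CIELAB different from plain CIELAB is omitted), and it assumes the CIE 1976 color difference, which is the Euclidean distance in L*a*b* space.

```python
import math

def delta_e(lab1, lab2):
    """CIE 1976 color difference between two (L*, a*, b*) triples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))

def mean_delta_e(image1, image2):
    """Average color difference over two equal-length lists of Lab pixels."""
    if len(image1) != len(image2):
        raise ValueError("images must have the same number of pixels")
    return sum(delta_e(p, q) for p, q in zip(image1, image2)) / len(image1)

# Two made-up two-pixel "images" in Lab coordinates.
original = [(50.0, 0.0, 0.0), (60.0, 10.0, -10.0)]
compressed = [(53.0, 4.0, 0.0), (60.0, 10.0, -10.0)]
print(mean_delta_e(original, compressed))
```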

Matlab source for S-CIELAB can be found here:  ftp://white.stanford.edu/scielab/scielab-1.1.tar.gz


Edge Enhancement Method

Although the human eye is less sensitive to high spatial frequencies, edges are thought to be important in aiding the human brain to segment images.  My idea was to compress the image more, eliminating higher spatial frequencies, and then send a separate edge mask with the image information at edges in the image.  This way, high frequency background texture which is less important is traded for more high frequency information at the edges of the image.

The compression algorithm is as follows:

  1. Extract the Y color plane from the uncompressed image
  2. Use a Prewitt Edge Detector to find the edges in the image.
  3. Dilate the edge mask by some amount (4 was used in all images for this project)
  4. Let JSize, the target JPEG size = the target image size, TSize
  5. JPEG Compress the image to size JSize
  6. Extract the Y plane of the compressed image
  7. Subtract the Y plane of the compressed image from the uncompressed Y plane to create an error image
  8. Quantize the error image to some number of levels (16 for this project)
  9. Mask the error image using the mask from step 3
  10. RLE and entropy code the masked error image
  11. Let MSize be the size of the compressed masked error image
  12. If MSize+JSize > TSize, set JSize=TSize-MSize, goto step 5
  13. Done: Compressed masked error image and compressed JPEG are the Edge Enhanced JPEG
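
The size search in steps 4, 5, 11 and 12 can be sketched as a small loop.  This is an illustration under stated assumptions, not the project code: `coded_error_size` is a hypothetical callback standing in for steps 5 to 11 (compress to the given size, then return the size of the coded masked error image), the JPEG coder is assumed to hit its byte budget exactly, and `min_jpeg_size` and `max_rounds` are hypothetical guards against the case where no fit exists.

```python
def fit_to_target(target_size, coded_error_size, min_jpeg_size, max_rounds=20):
    """Return (jpeg_size, mask_size) meeting the target, or None if unattainable."""
    jpeg_size = target_size                        # step 4
    for _ in range(max_rounds):
        if jpeg_size < min_jpeg_size:
            return None                            # JPEG cannot be made this small
        mask_size = coded_error_size(jpeg_size)    # steps 5 to 11
        if jpeg_size + mask_size <= target_size:   # step 12 test
            return jpeg_size, mask_size
        jpeg_size = target_size - mask_size        # step 12 update, back to step 5
    return None

def toy_error_size(jpeg_size):
    """Made-up model: the coded error grows as the JPEG part shrinks."""
    return 2000 + (15000 - jpeg_size) // 5

result = fit_to_target(15000, toy_error_size, min_jpeg_size=4000)
unattainable = fit_to_target(15000, toy_error_size, min_jpeg_size=14000)
print(result, unattainable)
```

The second call shows the failure case: when the error image leaves less room than the smallest possible JPEG, no edge-enhanced file of the target size exists.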

For decompression:

  1. Decompress the JPEG part and split into Y, Cr, and Cb planes
  2. Decompress masked error image and add into Y plane
  3. Combine Y, Cr, and Cb planes to create R, G, B values
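
Step 3 of the decompression can be sketched as follows.  The project does not state which YCbCr variant was used; this sketch assumes the full-range conversion defined for JFIF files, with Cb and Cr centered on 128.

```python
def clamp(value):
    """Round to the nearest integer and limit to the 8-bit range."""
    return max(0, min(255, int(round(value))))

def ycbcr_to_rgb(y, cb, cr):
    """Convert one full-range (JFIF-style) YCbCr pixel to 8-bit RGB."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp(r), clamp(g), clamp(b)

print(ycbcr_to_rgb(128, 128, 128))   # neutral chroma gives a gray pixel
print(ycbcr_to_rgb(76, 85, 255))     # strongly red pixel
```

Because only Y is corrected by the masked error image, the chroma planes keep whatever distortion the JPEG part introduced.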
The rest of this section walks through the process for the 'hats' image with a target size of 15KB.
 
 

The Original Hats Image and Y Plane

First, the Y plane is extracted from the original image.  An edge mask is created and dilated.  Then a compressed JPEG is created, and the Y plane extracted from that:
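
The edge mask construction can be sketched in plain Python as below.  This is not the project code: the gradient threshold is a hypothetical parameter, and the dilation is assumed to grow the mask by a square neighborhood of the given radius, since the exact meaning of the dilation amount is not specified.

```python
def prewitt_edges(y, threshold):
    """Binary edge mask from the Prewitt gradient magnitude of a 2-D luma list.

    Border pixels are left unmarked because the 3x3 kernels do not fit there.
    """
    h, w = len(y), len(y[0])
    mask = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = sum(y[r + d][c + 1] - y[r + d][c - 1] for d in (-1, 0, 1))
            # Vertical gradient: bottom row minus top row.
            gy = sum(y[r + 1][c + d] - y[r - 1][c + d] for d in (-1, 0, 1))
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                mask[r][c] = 1
    return mask

def dilate(mask, radius):
    """Grow every marked pixel into a (2*radius+1)-wide square."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if mask[r][c]:
                for rr in range(max(0, r - radius), min(h, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(w, c + radius + 1)):
                        out[rr][cc] = 1
    return out

# Toy luma plane with a vertical step edge between columns 3 and 4.
luma = [[0, 0, 0, 0, 255, 255, 255, 255] for _ in range(5)]
edges = prewitt_edges(luma, threshold=100)
grown = dilate(edges, radius=1)
```

A larger dilation radius protects more pixels around each edge, at the cost of a larger masked error image.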
 

The Compressed JPEG Image and Y Plane

 Now an error image is calculated and masked using the edge mask created earlier:
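
A minimal sketch of the error, quantization and masking steps follows.  The quantizer used in the project is not described beyond the number of levels, so this assumes a uniform quantizer with a hypothetical step size of 8 and 16 signed levels (indices -8 to 7); errors outside that range are clipped.

```python
import math

def masked_error(orig_y, comp_y, mask, step=8, levels=16):
    """Quantized (original - compressed) luma error, zeroed outside the mask."""
    half = levels // 2
    out = []
    for o_row, c_row, m_row in zip(orig_y, comp_y, mask):
        row = []
        for o, c, m in zip(o_row, c_row, m_row):
            if not m:
                row.append(0)          # outside the edge mask: nothing is sent
                continue
            idx = math.floor((o - c) / step + 0.5)   # round to nearest level
            row.append(max(-half, min(half - 1, idx)))
        out.append(row)
    return out

# Made-up 2x2 luma planes and mask.
orig_y = [[100, 120], [130, 90]]
comp_y = [[97, 131], [110, 200]]
mask = [[1, 1], [0, 1]]
error = masked_error(orig_y, comp_y, mask)
print(error)
```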
 

The Edge Mask and Masked Error Image

The size of the quantized masked error image after compression is computed, and the past two steps are repeated until the size of the JPEG and compressed mask together meet the target size.
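
The size of the coded error image can be estimated along these lines.  This sketch is an assumption about how "RLE and entropy code" might be realized, not the project's coder: it run-length encodes the flattened error image and then uses the zeroth-order entropy of the run symbols as a lower bound on the bits an ideal entropy coder would need.

```python
import math
from collections import Counter
from itertools import groupby

def rle_encode(values):
    """Collapse a sequence into (value, run_length) pairs."""
    return [(v, len(list(group))) for v, group in groupby(values)]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

def entropy_bits(symbols):
    """Total bits for the symbols under an ideal zeroth-order entropy coder."""
    counts = Counter(symbols)
    total = len(symbols)
    return sum(-n * math.log2(n / total) for n in counts.values())

# Flattened masked error image (made-up values); most entries are zero.
flat = [0, 0, 0, -1, -1, 0, 3]
runs = rle_encode(flat)
print(runs, entropy_bits(runs))
```

Because everything outside the mask is zero, long zero runs dominate, and the coded size is driven mainly by how much of the image the dilated edge mask covers.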

Finally, after storage or transmission, the image is reconstructed by adding the masked error image into the compressed Y Plane, and the final image is created.
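
The reconstruction of the Y plane can be sketched as follows, again assuming a uniform quantizer with a hypothetical step size of 8, so that each stored index is scaled back by the step before being added.

```python
def add_error(comp_y, error_idx, step=8):
    """Add the dequantized masked error to the decoded luma and clip to 0..255."""
    return [[max(0, min(255, y + e * step)) for y, e in zip(y_row, e_row)]
            for y_row, e_row in zip(comp_y, error_idx)]

# Made-up decoded luma plane and masked error indices.
comp_y = [[97, 131], [110, 250]]
error_idx = [[0, -1], [0, 7]]
restored = add_error(comp_y, error_idx)
print(restored)
```

Pixels outside the mask carry a zero index and are therefore left exactly as the JPEG decoder produced them.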
 
 

The Error Compensated Compressed Y Plane and the Final Edge Enhanced Image

The code written to perform this process is available in the following gzipped tar file:  edge-jpeg.tar.gz

The files are:


Results

 This section contains results.  Four different images were compressed using the various techniques, and the Average S-CIELAB errors and PSNRs were calculated.  All images are 384x256 and are taken from 24-bit color Photo CD originals.  All images are copyright (c) by Eastman Kodak Company, 1991, and are used according to the access terms outlined on the original photo disc.   The credits for the images are as follows:
 
Image        Photographer      Location
Parrots      Steve Kelly       Maui, Hawaii, USA
Island       Don Cochran       Bahamas
Hats         Don Cochran       Bahamas
Buildings    Alfons Rudolph    Essligen, GERMANY

Photographer Information for the Four Test Images

The first series shows the compression of the 'parrots' image.  It was compressed using DCTune, the standard compression in Paint Shop Pro, and my edge-enhanced method for each of four file sizes.  The file sizes were determined by using a Q factor of 25, 50, 75 and 99 in Paint Shop Pro, then forcing the other compressors to create files of equal sizes.  Recall that lower Q factors imply less compression, and therefore higher quality.  For each file size, the original is shown, along with the output of each compressor, and the PSNR and S-CIELAB errors (assuming a 72 dpi screen viewed at 18 inches).
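
S-CIELAB needs the viewing condition expressed as pixels per degree of visual angle.  A minimal sketch of that conversion for the stated 72 dpi screen at 18 inches is given below; the function name is hypothetical and this is plain trigonometry rather than code from the S-CIELAB distribution.

```python
import math

def pixels_per_degree(dpi, distance_inches):
    """Pixels covered by one degree of visual angle at the given viewing distance."""
    inches_per_degree = distance_inches * math.tan(math.radians(1))
    return dpi * inches_per_degree

ppd = pixels_per_degree(72, 18)
print(round(ppd, 1))
```

For these conditions the result is roughly 22.6 pixels per degree.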
 
Original Parrot Image
 
 
 
DCTune Optimized (P=0.72)
S-CIELAB Error: 2.2
PSNR: 29.0
 
Standard JPEG Compressed (Q=25)
S-CIELAB Error: 1.5 
PSNR: 31.1
Edge Enhanced JPEG
S-CIELAB Error: 2.0 
PSNR: 28.4

 13 Kbyte Parrot Images, 1.1 Bits/Pixel, 21.9x Compression

 At 13Kbytes, or 21.9x compression there is very little quality loss in any of the images.
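
The bits/pixel and compression figures in the captions follow from the file size and the 384x256, 24-bit image dimensions.  A small sketch of the arithmetic, assuming 1 Kbyte = 1024 bytes, is shown here for the 6 Kbyte case used later in this section.

```python
def bits_per_pixel(file_bytes, width, height):
    """Average number of coded bits spent on each pixel."""
    return file_bytes * 8 / (width * height)

def compression_ratio(file_bytes, width, height, bytes_per_pixel=3):
    """Uncompressed size (24-bit color by default) divided by compressed size."""
    return width * height * bytes_per_pixel / file_bytes

size = 6 * 1024   # 6 Kbyte file
print(bits_per_pixel(size, 384, 256), compression_ratio(size, 384, 256))
```

The captions quote file sizes rounded to whole Kbytes, so recomputing from the rounded size gives slightly different values for some of the other cases (for example about 22.2x rather than 21.9x at 13 Kbytes).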
 
Original Parrot Image
 
 
 
DCTune Optimized (P=1.29)
S-CIELAB Error: 3.2
PSNR: 26.8
Standard JPEG Compressed (Q=50)
S-CIELAB Error: 2.1 
PSNR: 28.7
Edge Enhanced JPEG
S-CIELAB Error: 2.6 
PSNR: 27.6

9 Kbyte Parrot Images, 0.7 Bits/Pixel, 34.3x Compression 

 At 34.3x compression, some loss is noticeable, mainly as mosquito-noise around the edges of objects.  It is still difficult to tell the difference between the three compression types.
 
Original Parrot Image
 
 
 
DCTune Optimized (P=1.77)
S-CIELAB Error: 5.5
PSNR: 23.6
Standard JPEG Compressed (Q=75)
S-CIELAB Error: 3.2 
PSNR: 26.4
Edge Enhanced JPEG
S-CIELAB Error: 5.3
PSNR: 24.7

6 Kbyte Parrot Images, 0.5 Bits/Pixel, 48x Compression

At 48x compression, all of the compressed images are starting to look degraded.  While the other two are looking blurrier all over, the edge-enhanced image shows a severely degraded background.  Good definition remains around the edges of the heads and in the eye areas of the parrots for the edge-enhanced image, however.
 
 
DCTune Optimized (P=13.38)
S-CIELAB Error: 14.1
PSNR: 17.8
Standard JPEG Compressed (Q=99)
S-CIELAB Error: 14.3
PSNR: 17.6

2 Kbyte Parrot Images, 0.2 Bits/Pixel, 120x Compression

At 120x, or maximum compression, the images are horribly distorted.  An edge-enhanced image could not be created, since the compressed JPEG part of the image already takes up the full size required for the image, and there is no space for the compressed edge masked error image.
 
(Blue=DCTune, Green=Standard JPEG (PSP), Red=Edge Enhanced JPEG)

The Error Rates for the Parrot Pictures

The above graph shows how the images fared in PSNR and S-CIELAB error for the four compression amounts used for the 'parrots' image and illustrated in the preceding figure series.  It is interesting to note that the JPEG compression used by Paint Shop Pro does the best on both measures, even though it is supposed to be the standard compression scheme.  I suspect that Paint Shop Pro is actually doing some optimization of its own; the other possibility is that DCTune really is not able to do any better than standard JPEG.  The other interesting thing is that the edge-enhanced JPEG technique does better than DCTune on the S-CIELAB measure at every compression level, and does better on PSNR at the two lower bit rates.  This is a little suspicious, however, since at a bit rate of 0.5 bits/pixel the edge-enhanced image is visually worse than the other two.  This gives further evidence of how difficult it is to come up with a reasonable measure of perceptual quality.

More Images

The rest of this section shows the result of compressing the other three images using the various techniques at the size resulting from a quality factor of 75.  This setting was chosen since it creates images which are compressed to the level where loss is perceptible, but not overwhelming.
 
Original Buildings Image
 
 
DCTune Optimized (P=3.32)
S-CIELAB Error: 7.4
PSNR: 22.2
Standard JPEG Compressed (Q=75)
S-CIELAB Error: 4.3
PSNR: 22.6
Edge Enhanced JPEG
S-CIELAB Error: 8.9
PSNR: 20.4

13 Kbyte Building Images, 1.1 Bits/Pixel, 21.9x Compression

The 'buildings' image is a tough one to compress, since there is a lot of high frequency information in the lines on the buildings.  There is a large discolored region in the upper left of the DCTune image.  For the edge-enhanced image, the edge masked error image was so large (because of the many edges) that the JPEG part had to be compressed at near maximum level, causing severe distortion, especially in color, which is almost completely washed out in many areas such as the green building.  Interestingly, the Paint Shop Pro standard JPEG compression looks pretty good, with only a little mosquito noise.
 
 
Original Hats Image
DCTune Optimized (P=1.77)
S-CIELAB Error: 6.2
PSNR: 22.8
Standard JPEG Compressed (Q=75)
S-CIELAB Error: 3.1
PSNR: 25.2
Edge Enhanced JPEG
S-CIELAB Error: 8.3
PSNR:  20.9

6 Kbyte Hat Images, 0.5 Bits/Pixel, 48x Compression

A similar situation happens with the hats.  Notice the distorted colors in the upper right of the DCTune image.  The edge-enhanced JPEG is once again very distorted in regions without the edge error added back in.  The benefit of the edge-enhancement technique is clear, though, if you look at the text on the yellow hat, which is clearest in the edge-enhanced image.
 
Original Island Image
 
 
DCTune Optimized (P=2.93)
S-CIELAB Error: 7.4
PSNR: 24.0
Standard JPEG Compressed (Q=75)
S-CIELAB Error: 2.8
PSNR: 26.5
Edge Enhanced JPEG (7 Kbyte; 6 Kbyte not attainable)
S-CIELAB Error: 7.4
PSNR: 22.7

6 Kbyte Island Images, 0.5 Bits/Pixel, 48x Compression

A similar story is seen again for the 'island' image.  Notice how DCTune has caused a big glitch in the light patch below the island in the center right of the image.  Paint Shop Pro again does a remarkably good job with the colors and with keeping transitions from being too blocky.  The edge-enhanced image was not able to get to a 6 Kbyte file size, as it would have required the JPEG part of the image to be compressed to smaller than the lowest quality allows in order to leave space for the masked error image.  The effect of the masked error image is clear, though, as the palm trees, cloud, and patches of the water are well defined.  Unfortunately, the distortion in the rest of the image makes the perceptual quality unacceptable.
 
 


Conclusion

This project produced some interesting results.  First, it was interesting to find that the supposedly standard JPEG compression included in Paint Shop Pro was able to do a better job on the images than either of the two optimized schemes.  My hypothesis is that Paint Shop Pro is using some sort of optimization of its own.  The second interesting result was that DCTune, which is supposed to optimize perceptual quality, caused significant patches of color distortion compared to Paint Shop Pro and, in some cases, the edge-enhanced images.  My guess is that DCTune optimizes for spatial frequencies but not for color matching, an important oversight on the part of its designers.

Finally, the edge-enhanced JPEG technique developed shows promise, but did not perform as well as expected.  At high bit rates, the additional information coded in the edge masked error image was not noticeable, and at lower bit rates, the masked error image took up so much space that the rest of the JPEG had to be coded at a very degraded quality.  Still, at these low bit rates the edge masked error image did significantly benefit some critical areas of the image, such as the text on the hat in the 'hats' image and the palm trees in the 'island' image.  Currently, lossless coding is used for the masked error image.  It may be possible to improve the scheme by using lossy coding at a higher quality on the masked error image and a lower quality for the rest of the JPEG.  This might allow an apparently higher quality image to be constructed at the same bit rate as a standard compressed JPEG.



Page by bjohanso@stanford.edu