Shuo-yen Choo and Gregory Chew
Motivation:
Part 2. We'll present experimental data comparing the performance of DCT with the performance of various wavelet-based transforms.
Part 1. JPEG 2000: Under the Hood
Each Component can have different sizes and bit depths, and have different alignments relative to each other. [2]
Figure 1. Position of the image component relative to the reference grid. [2]
Each image component is further broken down into Tiles. Tile sizes are variable, and can differ from component to component. Tiles are similar to blocks in JPEG, but more flexible.
Figure 2. Image tiles [3].
For JPEG2000, the wavelet transform is applied to the image on a tile by tile basis. We will look at wavelet compression in more detail in the second part of this presentation.
But for now, an overview of how it works:
In the one dimensional case, the signal is broken into subbands by passing it through a low pass filter and a high pass filter, and both subbands are downsampled by 2. The same procedure can then be applied iteratively to the low frequency subband, and repeated for as many levels of decomposition as desired. If the filters used satisfy certain properties, the original signal can be reconstructed by reversing the procedure.
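As a minimal sketch of the procedure just described, the following uses the two-tap Haar filter pair (equivalent to Daubechies 1). The choice of Haar and the function names are our assumptions for illustration; JPEG2000 itself uses longer filters. The low pass and high pass outputs are each downsampled by 2, the split is repeated on the low band, and reversing the steps recovers the signal.

```python
import math

S = 1 / math.sqrt(2)  # normalisation of the orthonormal Haar filter pair

def analyze(signal):
    """One level: low pass and high pass filtering, each downsampled by 2."""
    low = [(signal[i] + signal[i + 1]) * S for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) * S for i in range(0, len(signal), 2)]
    return low, high

def synthesize(low, high):
    """Inverse of analyze: interleave the reconstructed sample pairs."""
    out = []
    for a, d in zip(low, high):
        out.append((a + d) * S)
        out.append((a - d) * S)
    return out

def decompose(signal, levels):
    """Apply analyze repeatedly to the low band.

    Returns the final low band and the high bands, finest level first.
    The signal length must be divisible by 2**levels.
    """
    low, highs = list(signal), []
    for _ in range(levels):
        low, high = analyze(low)
        highs.append(high)
    return low, highs

def reconstruct(low, highs):
    """Undo decompose by synthesizing from the coarsest level outwards."""
    for high in reversed(highs):
        low = synthesize(low, high)
    return low

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
low, highs = decompose(x, 2)
y = reconstruct(low, highs)  # equals x up to floating point rounding
```

With 2 levels, the 8 samples become a 2-sample low band plus high bands of 4 and 2 samples, so the total number of coefficients is unchanged.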
Figure 3. 2 level subband decomposition and synthesis. [4]
In the 2 dimensional case, the decomposition is applied separably in the horizontal and vertical directions. This leads to a 2 dimensional signal being broken down into four subbands, known as "LL", "LH", "HL", and "HH". Conceptually, for a particular image, they correspond respectively to a low-frequency approximation of the original, primarily horizontal edges, primarily vertical edges, and diagonal edges. (Other decompositions are supported; see [10].)
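A small sketch of one level of the separable decomposition, again using the Haar pair in its averaging form. The filter choice, the function names, and the naming convention (first letter for the horizontal filter, second for the vertical filter) are our assumptions; other texts swap LH and HL.

```python
def split_1d(samples):
    """Haar low/high pair in averaging form, downsampled by 2."""
    low = [(samples[i] + samples[i + 1]) / 2 for i in range(0, len(samples), 2)]
    high = [(samples[i] - samples[i + 1]) / 2 for i in range(0, len(samples), 2)]
    return low, high

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def split_rows(matrix):
    """Filter along each row (the horizontal direction)."""
    pairs = [split_1d(row) for row in matrix]
    return [p[0] for p in pairs], [p[1] for p in pairs]

def split_columns(matrix):
    """Filter along each column (the vertical direction)."""
    low, high = split_rows(transpose(matrix))
    return transpose(low), transpose(high)

def decompose_2d(image):
    """One level of separable decomposition into four subbands."""
    h_low, h_high = split_rows(image)   # horizontal pass
    ll, lh = split_columns(h_low)       # vertical pass on the low half
    hl, hh = split_columns(h_high)      # vertical pass on the high half
    return {"LL": ll, "LH": lh, "HL": hl, "HH": hh}

# A 4 by 4 image with a single horizontal edge between rows 0 and 1.
image = [[0, 0, 0, 0],
         [10, 10, 10, 10],
         [10, 10, 10, 10],
         [10, 10, 10, 10]]
bands = decompose_2d(image)
```

For this test image only the LL and LH subbands are non-zero, which matches the description of LH as carrying primarily horizontal edges.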
Figure 4. 2 Level Decomposition.
A quantization matrix is then applied to the decomposed
image. Uniform quantization is performed within each subband, with different
levels of quantization for each subband. Generally, we want to quantize
the higher frequency subbands more coarsely, since humans have lower contrast
sensitivity to high frequency information.
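The idea of a different uniform quantizer per subband can be sketched as follows. The step sizes in STEPS are hypothetical values chosen only to show coarser quantization at higher frequencies, and the function names are ours; this is not the quantizer defined by the standard.

```python
import math

def quantize(coeff, step):
    """Uniform quantizer: index of the multiple of step nearest to coeff."""
    sign = -1 if coeff < 0 else 1
    return sign * int(math.floor(abs(coeff) / step + 0.5))

def dequantize(index, step):
    """Reconstruct the coefficient value represented by a quantizer index."""
    return index * step

# Hypothetical step sizes: coarser steps for the higher frequency subbands.
STEPS = {"LL": 4, "LH": 8, "HL": 8, "HH": 16}

def quantize_bands(bands, steps):
    """Quantize every coefficient of each subband with that subband's step."""
    return {
        name: [[quantize(c, steps[name]) for c in row] for row in band]
        for name, band in bands.items()
    }

bands = {"LL": [[13.0, -13.0]], "HH": [[13.0, -13.0]]}
indices = quantize_bands(bands, STEPS)
```

The same coefficient value 13 is kept as index 3 in LL (reconstructed as 12) but as index 1 in HH (reconstructed as 16), so the reconstruction error is larger where the step is coarser.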
Figure 5. 2 level decomposition of baboon with Daubechies 4 wavelet (with false color for visibility)
It has been found that wavelet representations of an image generally perform better than DCT representations for lossy image compression, as there is less perceptual loss for the same bit rate. This is the case even when both are applied to the same block size.
It is believed that multi-resolution wavelet representations give better performance because:
Region of Interest (ROI) coding
In ROI coding, portions of an image are stored at higher quality than the rest of the image. This is useful because we may care more about detail in some portions of an image than in others. For example, we have all experienced the unreadable text in the Stanford online videos. Similar applications exist in medical imaging, etc.
Figure 6. An example of ROI coding with a rectangular ROI mask [7].
ROI coding is easy to do when the image is stored compressed in a multi-resolution format.
1. We first start with a ROI mask, which marks out a region of the image we wish to store at higher quality.
Figure 7. ROI mask [8]
2. The wavelet coefficients corresponding to the transform of the mask have to be stored at higher quality (quantized less coarsely). We can do this by applying the transform to the mask, and looking at which coefficients fall in the mask.
Figure 8. Transformed ROI mask [8]
Note that mask information is not needed at the decoder.
There can be more than one ROI in the image. Note that disjoint masks can overlap in the wavelet domain due to filtering. When this occurs, the region of overlap can be stored at the quality of the mask with the highest quality [9].
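To illustrate how a mask is carried into the wavelet domain, here is a one dimensional sketch that assumes a two-tap filter, so that coefficient k at the next level depends only on samples 2k and 2k+1. This assumption and the function names are ours; with longer filters each coefficient depends on more samples, and the footprint of the mask grows accordingly, which is how disjoint masks can come to overlap.

```python
def transform_mask(mask):
    """One level: a coefficient is in the ROI if any sample it depends on is."""
    return [mask[i] or mask[i + 1] for i in range(0, len(mask), 2)]

def mask_levels(mask, levels):
    """ROI mask for the coefficients of each level, finest level first."""
    out, current = [], list(mask)
    for _ in range(levels):
        current = transform_mask(current)
        out.append(current)
    return out

F, T = False, True
mask = [F, F, F, T, T, F, F, F]   # ROI covers samples 3 and 4
per_level = mask_levels(mask, 2)
# per_level[0] marks the level 1 coefficients, per_level[1] the level 2 ones
```

Coefficients marked True would then be quantized less coarsely than the rest.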
Progressive transmission
Lower resolution coefficients of the multi-resolution decomposition can be transmitted first. This allows for progressive transmission and display of the image.
The decoder can display the image progressively by resolution (the image gets larger as more information is received), or progressively by quality.

Figure 9. Progressively by resolution [11].

Figure 10. Progressively by quality. L: 0.0625 bpp, R: 0.5 bpp [12].
Because of image tiling, coefficients from different tiles have to be gathered, so that the lower resolution coefficients are sent first.
Other forms of progression are possible, such as progression by image channel. [13]
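The gathering of coefficients across tiles for progression by resolution can be sketched as a simple reordering. The data layout (a dictionary from tile number to a list of per-resolution chunks, coarsest first) and the function name are hypothetical, and this is not the actual JPEG2000 codestream syntax.

```python
def resolution_progressive(tiles):
    """Order chunks so that every tile's coarsest data is sent first.

    tiles maps a tile id to its list of chunks, coarsest resolution first.
    Returns a list of (resolution, tile id, chunk) in transmission order.
    """
    depth = max(len(levels) for levels in tiles.values())
    stream = []
    for resolution in range(depth):
        for tile_id in sorted(tiles):
            levels = tiles[tile_id]
            if resolution < len(levels):
                stream.append((resolution, tile_id, levels[resolution]))
    return stream

tiles = {
    0: ["tile0-LL", "tile0-detail1", "tile0-detail2"],
    1: ["tile1-LL", "tile1-detail1"],
}
order = resolution_progressive(tiles)
```

A decoder that stops reading after the first two entries of order already has a low resolution version of every tile.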
Compressed domain image manipulation
Basic geometrical transformations can be applied (easily) on the compressed representation of the image. This eliminates the need to decompress and recompress the image for transformation. Examples include vertical and horizontal flipping, and rotation by multiples of 90 degrees.
Figure 11. Vertical Flipping [14].
Figure 12. Rotation [15]
Part 2. Wavelet Compression
We compared the quality of JPEG compressed images against the quality of images compressed with a variety of wavelet filters, in terms of the SNR and the subjective image quality.
We looked at 3 important classes of images: 4 natural images, 3 synthetic images and 4 textual images. The images were all 256 by 256 in size.
Natural images


Figure 1. L to R, Top to bottom: Lena, Barbara, Baboon, Einstein
Synthetic images


Figure 2. L to R, Top to bottom: Sinusoid 1 (1 cycle every 100 pixels), Sinusoid 2 (5 cycles every 100 pixels), Checker pattern, Square.
Text:


Figure 3. L to R, Top to bottom: Text 2, Text 3, Text 4
The filters used were Daubechies 1, 2, 4, 5, 8 and Symlets 2, 4, 5, 8. The Daubechies filters are popular in image processing. Symlets are less frequently seen, but perform surprisingly well (see the results below).
For purpose of comparison with JPEG, the wavelet filters were applied on 8 by 8 blocks.
Procedure:
1. A JPEG Quality factor is selected, and the bit rate is calculated from the quantization matrix returned. The JPEG Quality factors used were 10, 15, 20, 25 and 30. Low quality factors were chosen because artifacts are easily perceptible only at low quality levels.
2. An appropriate quantization matrix with the same bit rate is selected, and applied to the wavelet transform coefficients.
3. The inverse transforms are performed. SNR and subjective image quality are compared.
Quantization
JPEG2000 does not specify the use of particular quantization matrices. A way of calculating a quantization matrix for a particular filter is suggested, but the formula given seems fairly arbitrary (rather than being based on empirical experimental data) [16].
In "Visibility of wavelet quantization noise" [4], the authors perform experiments with human subjects to determine the perceptually lossless quantization matrix for the 9/7 biorthogonal filter.
However, in general, there does not seem to be much data on what the perceptually optimal quantization matrix for a given filter is. Therefore, for simplicity, we have elected to use a single quantization matrix for all the wavelet filters in this experiment (although this limits the validity of the comparison somewhat).
The Quantization matrix used for all the wavelet filters was
 8  7  8  8 34 34 34 34
 7  7  8  8 34 34 34 34
 8  8 12 12 34 34 34 34
 8  8 12 12 34 34 34 34
34 34 34 34 55 55 55 55
34 34 34 34 55 55 55 55
34 34 34 34 55 55 55 55
34 34 34 34 55 55 55 55
scaled to obtain a particular bitrate. This was the quantization matrix used in [17].
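One possible way to scale a quantization matrix to a target bit rate is sketched below. It is not taken from our Matlab code: the use of the first-order entropy of the quantizer indices as the bit rate estimate, the geometric bisection, and all function names are assumptions made for illustration. The matrix and the coefficients are passed as flat lists of equal length.

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """First-order entropy in bits per symbol (assumed bit rate estimate)."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def quantize_block(coeffs, qmatrix, scale):
    """Quantize each coefficient with its matrix entry multiplied by scale."""
    return [
        (1 if c >= 0 else -1) * int(math.floor(abs(c) / (q * scale) + 0.5))
        for c, q in zip(coeffs, qmatrix)
    ]

def find_scale(coeffs, qmatrix, target_bpp, lo=0.01, hi=100.0, iters=40):
    """Search for a scale whose estimated rate does not exceed target_bpp.

    Assumes the rate at the initial hi is already at or below the target.
    """
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # bisect on a logarithmic axis
        rate = entropy_bits(quantize_block(coeffs, qmatrix, mid))
        if rate > target_bpp:
            lo = mid  # too many bits: quantize more coarsely
        else:
            hi = mid  # within budget: try a finer quantizer
    return hi

coeffs = [float(v) for v in range(-32, 32)]  # stand-in for 64 coefficients
qmatrix = [8] * 64                            # stand-in for a flattened matrix
scale = find_scale(coeffs, qmatrix, target_bpp=2.0)
```

Because the estimated rate changes in discrete jumps as the scale varies, the search returns a scale at or just under the target rather than an exact match.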
Results
Detailed results are available in the Matlab .mat files and as image files. Please refer to the Appendix.
Plots of the average SNR against the bit rate are listed in the Appendix. (It may be hard to read the plots from the GIFs. Load the .mat file and use makeplots.m to plot the SNR data.)
We present a summary of the results obtained here:
Sample images
Figure 4. Barbara JPEG. SNR 48.9, 1.76 bpp (Quality 30)
Figure 5. Barbara. db4. SNR 69.4, 1.76 bpp
Figure 6. Barbara. Symlet 5. SNR 79.4, 1.76 bpp
Figures 4 to 6 show the Barbara image at 1.76 bpp. Symlet 5 is the best performing wavelet transform (highest SNR) for this image, and db4 is the worst performing wavelet transform (lowest SNR). Both look better than the JPEG compressed image, and Symlet 5 looks better than db4.
The artifacts in wavelet compressed natural images tend to appear as fine lines with some gradation of color within each block (even at low bit rates), as opposed to the severe blocking in JPEG compressed images.
Figure 7. Barbara. Symlet 5. SNR 44.9, 0.67 bpp
Figure 8. Barbara. db4. SNR 44.7, 0.91 bpp
Figures 7 and 8 show the Barbara image compressed by the symlet 5 and db4 filters. The SNR was made (nearly) equal in both the images, but the Symlet 5 image looks better. This shows that there's a perceptual difference between different wavelet filters, which cannot be characterized by the SNR alone.
Figure 9. Text4. JPEG. SNR 63.2. 1.76 bpp
Figure 10. Text 4. db 5. SNR 235.5. 1.76 bpp
Figure 11. Text 4. db 1. SNR 333.5. 1.76 bpp
Figures 9 to 11 show the Text4 image at 1.76 bpp. db1 is the best performing wavelet transform (highest SNR) for this image, and db5 is the worst performing wavelet transform (lowest SNR). The db1 compressed image looks very similar to the original (slightly blurred), while for the db5 compressed image, ripples at the text boundaries from the Gibbs windowing effect can be seen. In the JPEG compressed image, the rippling and blurring of the text is even more severe.
Conclusion and further comments.
Although our results are not conclusive (because we used the same quantization matrix for all the wavelet filters), it is clear that using wavelet transforms provides significant SNR and perceptual image quality gains over the traditional DCT used in JPEG, especially at low bit rates. This agrees well with published results in the literature (such as in [17]). Adopting wavelet transforms for compression in JPEG2000 should result in a significant improvement in the image quality per bit.
Besides image quality improvements from moving to wavelet transforms, JPEG 2000 also offers increased flexibility that should make it more widely applicable than JPEG, and has other interesting features such as ROI coding and progressive transmission. Our description of JPEG2000 is by no means complete. Visit http://www.jpeg.org for more information.
Further comments:
1. We saw in our experiment that different wavelet filters produce results of different quality for different classes of images. For a particular image class, some filters generally perform better than others.
However, it does not seem well understood what the best wavelet transform to use for a given image type is. And for a given wavelet transform, what is the optimal quantization matrix?
In "Visibility of wavelet quantization noise", Watson et al. [4] develop a perceptually lossless quantization matrix for the 9/7 biorthogonal filter. The reasons they give for selecting this particular filter are that the filter is "i) linear phase ii) symmetrical iii) argued to have mathematical properties attractive for image compression iv) used by FBI for compression of fingerprint images."
These reasons seem less than compelling to us. What seems to be missing is an extensive body of experimental work with human subjects, describing the optimal (or at least well chosen) quantization matrices for different filters, and the subjective performance of the filters on a large bank of images of different types.
From our reading of the literature, the process of selecting the wavelet filter and quantization matrix to use seems very arbitrary at present. More work will have to be done to try to determine the best wavelet filter and quantization matrix to use for a given class of images, and this is a possible future direction for this project.
2. An obvious extension would be to look at color as well as black and white images. We looked at b&w images only in this project for simplicity.
3. JPEG 2000 is a very ambitious image standard which should provide much higher image quality and flexibility than JPEG. However, this comes at a cost of increased computational complexity. Wavelet transforms generally take longer to perform than the DCT. Implementation is also likely to be more complex.
We think it's an open question whether JPEG 2000 can become widely accepted, especially as a standard for Web imaging, since inertia from existing image formats is great. For example, PNG has not yet caught on.
References
[1] JPEG2000 Tutorial. Christopoulos, Charilaos. Lecture given at IEEE Int. Conference on Image Processing (ICIP 99), in Kobe, Japan, 24-28 Oct 99. Page 18. http://etro.vub.ac.be/~chchrist/jpeg2000_contributions.htm
[2] JPEG2000 Committee draft version 1.0, 9th December 1999. Page 81. http://www.jpeg.org/CD15444-1.htm
[3] ibid. Page 82.
[4] Visibility of wavelet quantization noise. Watson, A.B.; Yang, G.Y.; Solomon, J.A.; Villasenor, J. IEEE Transactions on Image Processing, Vol. 6, No. 8, August 1997, Pages 1164 to 1175.
[5] Foundations of Vision. Wandell, Brian. Published by Sinauer Associates, 1995.
[6] Image compression using Wavelets. Yeung E. IEEE 1997 Canadian Conference on Electrical and Computer Engineering, 1997. Engineering Innovation: Voyage of Discovery, Vol. 1, Pages 241 - 244.
[7] JPEG2000 Tutorial. Page 103.
[8] JPEG2000 Tutorial. Page 92
[9] Region of interest coding in JPEG2000 for interactive client/server applications. Cruz, D.S.; Ebrahimi, T.; Larsson, M.; Askelof, J.; Christopoulos, C. Multimedia Signal Processing, 1999 IEEE 3rd Workshop. Pages 389 - 394.
[10] JPEG2000 Tutorial. Page 52.
[11] JPEG2000 Tutorial. Pages 63 and 64.
[12] JPEG2000 Tutorial. Pages 80 to 85.
[13] JPEG2000 Tutorial. Page 83.
[14] JPEG2000 Tutorial. Page 141.
[15] JPEG2000 Tutorial. Page 142
[16] JPEG2000 Committee draft version 1.0, 9th December 1999. Page 95.
[17] Wavelet transforms in a JPEG-like image decoder. de Queiroz, R.; Choi, C.K.; Huh, Y.; Rao, K.R. IEEE Transactions on Circuits and Systems for Video Technology, Vol 7, No. 2, April 1997, Pages 419 to 424.
The IEEE papers cited here are available at the IEEE online library at http://iel.ihs.com/
http://www.jpeg.org is a good starting point to learn more about JPEG.
An informative lecture on JPEG2000 given by Majid Rabbani, but not cited directly in our report, is available at http://foulard.ee.cornell.edu/hemami/Cornell_JPEG2K.PDF
Appendix
1. Plots of SNR vs bitrate.
2. Matlab .mat data files. These contain the data needed to generate the SNR vs bitrate plot. Load the appropriate .mat file, and run makeplots.m to generate the plot.
3. BMP Image files produced from the experiments in Part 2. The image files are named in this manner:
originalimagefilename_filtertype_jpegqualityfactor_SNR_bpp.bmp
filtertype is "j" for JPEG; db1, db2, db4, db5, db8 for the Daubechies filters; and sym2, sym4, sym5, sym8 for the Symlets.
4. Matlab code used to perform the experiments. begin.m is the topmost file.