MULTUM IN PARVO:
TOWARD A GENERIC COMPRESSION METHOD FOR
BINARY IMAGES

by

Arber Borici
B.S., State University of New York, Empire State College, 2006

THESIS SUBMITTED IN PARTIAL FULFILLMENT OF
THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
IN
MATHEMATICAL, COMPUTER, AND PHYSICAL SCIENCES
(COMPUTER SCIENCE)

THE UNIVERSITY OF NORTHERN BRITISH COLUMBIA
November 2010

© Arber Borici, 2010

1*1

Library and Archives
Canada

Bibliotheque et
Archives Canada

Published Heritage
Branch

Direction du
Patrimoine de I'edition

395 Wellington Street
OttawaONK1A0N4
Canada

395, rue Wellington
OttawaONK1A0N4
Canada
Your file Votre reference
ISBN: 978-0-494-75135-0
Our file Notre inference
ISBN: 978-0-494-75135-0

NOTICE:

AVIS:

The author has granted a nonexclusive license allowing Library and
Archives Canada to reproduce,
publish, archive, preserve, conserve,
communicate to the public by
telecommunication or on the Internet,
loan, distribute and sell theses
worldwide, for commercial or noncommercial purposes, in microform,
paper, electronic and/or any other
formats.

L'auteur a accorde une licence non exclusive
permettant a la Bibliotheque et Archives
Canada de reproduire, publier, archiver,
sauvegarder, conserver, transmettre au public
par telecommunication ou par I'lnternet, preter,
distribuer et vendre des theses partout dans le
monde, a des fins commerciales ou autres, sur
support microforme, papier, electronique et/ou
autres formats.

The author retains copyright
ownership and moral rights in this
thesis. Neither the thesis nor
substantial extracts from it may be
printed or otherwise reproduced
without the author's permission.

L'auteur conserve la propriete du droit d'auteur
et des droits moraux qui protege cette these. Ni
la these ni des extraits substantiels de celle-ci
ne doivent etre im primes ou autrement
reproduits sans son autorisation.

In compliance with the Canadian
Privacy Act some supporting forms
may have been removed from this
thesis.

Conformement a la loi canadienne sur la
protection de la vie privee, quelques
formulaires secondaires ont ete enleves de
cette these.

While these forms may be included
in the document page count, their
removal does not represent any loss
of content from the thesis.

Bien que ces formulaires aient inclus dans
la pagination, il n'y aura aucun contenu
manquant.

1+1

Canada

Abstract
Data compression is an active field of research as the requirements to efficiently
store and retrieve d a t a at minimum time and cost persist to date. Lossless or lossy
compression of bi-level data, such as binary images, has an equally crucial factor of
importance. In this work, we explore a generic, application-independent method for
lossless binary image compression.
The first component of the proposed algorithm is a predetermined fixed-size codebook comprising 8 x 8-bit blocks of binary images along with the corresponding codes
of shorter lengths. The two variations of the codebook—Huffman codes and Arithmetic codes—have yielded considerable compression ratios for various binary images.
In order to attain higher compression, we introduce a second component—the rowcolumn reduction coding—which removes additional redundancy.
The proposed method is tested on two major areas involving bi-level data. The
first area of application consists of binary images. Empirical results suggest that our
algorithm outperforms the standard JBIG2 by at least 5% on average. The second
area involves images consisting of a predetermined number of discrete colors, such as
digital maps and graphs. By separating such images into binary layers, we employed
our algorithm and attained efficient compression down to 0.035 bits per pixel.

Table of Contents
Abstract

ii

Table of C o n t e n t s

iii

List of Tables

v

List of Figures

vii

Abbreviations

viii

Acknowledgements

ix

Dedication

x

1

Introduction

1

1 1
1 2
1 3
1 4

2
4
5
6

2

Motivation
Contributions
Mechanism of the Pioposcd Appioach
Thesis Ovcivicw

The Proposed Method
2 1 Background
2 1 1 Compression Methods
2 1 2 The Modeling and Coding Paiadigm
2 1 3 Entropy The Coding Terminus
2 1 4 Huffman Coding
2 1 5 Anthmetic Coding
2 2 Definitions
2 3 Toward a Umveisal Codebook
2 3 1 The Sample of Binary Images
2 3 2 Constructing the Codebook
2 3 3 Distribution of Blocks and Huffman Codes
2 3 4 Employing the Codebook
2 4 The Row-Column Reduction Coding
2 4 1 The RCRC Algoiithm
2 4 2 An Example
m

8
8
9
10
11
14
15
16
19
20
23
29
32
33
34
37

2 4 3 A Word on RCRC Compression Probability
2 5 Computational Complexity
2 6 The Coding Scheme
2 7 Alternative Coding Scheme
2 8 Sensitivity Analysis on the Coding Schemes
2 9 The Codebook Model for Arithmetic Coding
2 10 Protagonists and Antagonists

40
44
45
48
53
68
68

3

Applications
3 1 Binary Images
3 2 Discrete-Coloi Images
3 3 Discussion
3 4 Huffman Coding Is Not Dead

71
72
80
83
83

4

R e l a t e d Work
4 1 Binary Image Compression Techniques
4 1 1 JBIG2
4 2 Discrete-Color Image Compiession Techniques

88
88
89
90

5

Conclusions and Future Work
5 1 Concluding Remaiks
5 2 Future Woik

91
91
92

Published Material

94

Bibliography

95

A M o d e l s and D e r i v a t i o n s

100

A 1 Waiting Piobabihties
A 2 Asymptotic Expansion
A 2 1 Gcneial Asymptotic Analysis
A 2 2 Ei 101 Analysis foi the Consiiucted Codebook

100
102
102
105

B Test Images
B 1 Binary Images
B 2 Selected Discrete-Color Images

111
111
120

Index

123

IV

List of Tables
2 1 Effect of block dimensions on entropy
2 2 Some statistics for the constructed codebook
2 3 Hypotheses testing for the distribution of 8 x 8 blocks
2 4 Effect of block dimensions on RCRC
2 5 The coding scheme
2 6 The alternative coding scheme
2 7 Encoding of blocks in Figure 2 17
2 8 Simulation results for P = 0 5
2 9 Simulation results for P = 0 6
2 10 Simulation results for P = 0 7
2 11 Simulation results for P = 0 8
2 12 Simulation results for P = 0 9
2 13 Simulation results for P = 0 98

26
28
32
37
46
51
51
56
56
57
57
58
58

3 1
32

73

Compression ratios for solid binary images
Percentage of blocks compressed by the codebook, RCRC, and incompressible blocks
3 3 Compression ratios for irregular bmary rmagcs
3 4 Percentage of blocks compressed by the codebook, RCRC, and incompressible blocks
3 5 Compression latios foi other binary images
3 6 Percentage of blocks compiessed by the codebook, RCRC, and rncomprcssible blocks
3 7 Compression ratros foi bmary images with line boundaries
3 8 Compiessron ratros for mvertcd bmary rmages wrth lme boundaries
3 9 Peiccntage of blocks compiessed by the codebook, RCRC, and incompressible blocks
3 10 Percentage of blocks compressed by the codebook, RCRC and incompressible blocks mveited counterparts
3 11 Descirptron of selected topographic map rmages
3 12 Compression lcsults for map rmages
3 13 Compression results for charts and graphs

79
81
82
82

B 1 Sohd test rmages
B 2 Test images with boundaiy lines
B 3 Topogiaphrc maps

112
119
120
v

74
75
76
77
77
77
78
79

B 4 Charts and graphs

121

vr

List of Figures
2 1 The two phases of compression methods
2 2 A sample binary image
2 3 Salt-and-pepper noise
2 4 Trimming a binary image
2 5 Distribution of the first 20 8 x 8 blocks
2 6 Cumulative probability of the 6952 blocks
2 7 Sample codebook entries
2 8 The RCRC algorithm
2 9 A binary matrix of 8 x 8 blocks
2 10 The row-ieductron operation
2 11 The column-reduction operatron
2 12 The column reconstruction
2 13 The row reconstruction
2 14 A plot of P{R) as a function of p for 8 x 8 blocks
2 15 A plot of P(R) as a function of p for various block dimensions
2 16 Huffman tree for string LILIANA
2 17 Example of erght input 8 x 8 blocks
2 18 Compressed bit stieam of blocks m Figure 2 17
2 19 Simulation lcsults for P = 50%
2 20 Simulation lesults foi P = 60%
2 21 Simulation lesults for P = 70%
2 22 Simulation results for P = 80%
2 23 Simulation results for P = 90%
2 24 Simulation lesults for P = 98%
2 25 Visualization of the first 100 8 x 8 blocks
2 26 Randomly generated images
2 27 An incompressible geometric primitive
2 28 An incompressible 8 x 8 block

9
17
20
22
30
31
33
36
37
38
38
39
39
43
44
49
52
52
62
63
64
65
66
67
69
69
70
70

3 1
32
33
34

72
78
78
81

Generic diagram of the pioposed compression scheme
Binary image with line boundaiies
Binary image with line boundaiies inverted counterpart
An example of color scparatron

vrr

Abbreviations
AC

Arithmetic coding

AD

Andeison-Darling test

BCC

Break-codebook-coding signal

bpp

bits per pixel

CRV

Column reierence vector

CR

Compression ratio

EOF

End-of-file signal

GIF

Graphics Interchange Format

ICB

Incompressible-block signal

JBIG

Joint Bi-Level Image Expeits Group

JPEG

Joint Photographic Expeits Group

KS

Kolmogoiov-Smimov test

MPEG

Moving Picture Expeits Gioup

PNG

Portable Network Giaphics

RB

Reduced block

RCRC

Row-Column Reduction Coding

RRV

Row reference vector

TIFF

Tagged Image File Format

VMR

Variancc-to-mcan ratio

via

Acknowledgements
I am obliged to my supervisor, Saif Zahir, who exposed a tolerant degree of patience toward the successful completion of my thesis work, despite brief moments
of asymmetric coordinations between my pro-attitudes and objectives. I thank my
committee members, Charles Brown, Michael Rutherford, and Iliya Bluskov for their
insightful comments and questions that lead me to dig up certain important components for my work. I have learned from Charles Brown that analytical thinking
attiicd with proper methodology leads to appreciable answers for many theoretical
and practical problems. Michael Rutherford serenely examined my work and rendered
attention-drawing questions from the perspective of a non-Computer Scientist. Iliya
Bluskov provided necessary suggestions for rigorously improving the mathematical
elegance of some theoretical results of my work.
Special thanks go to Jennifer Hyndman for her savvy advice on technological
considerations related to the final appearance of this document. Samuel Watson was
kind enough to point out several analytical elements regarding asymptotic analyses
in my work, despite the incompleteness of my questions. Warm acknowledgments go
to Dcsa and Jernej Polajnar for useful discussions and comments about my work in
its final state.
Above all, I am sincerely grateful to my mother and biother for substantiating my
decisions and foi bcaiing with my whims and foi having stood by me heart and soul
Throughout my hie, the deep-rooted intellect and will of my paicnts have encouraged
me to preserve the eagerness for pursuing knowledge. My veneration towards such a
wonderful family deserves more than two lines, but that is a story that belongs solely
to me.

IX

For Viktoria

There is a pleasure in the pathless woods,
There is a rapture on the lonely shore,
There is society, where none intrudes,
By the deep sea, and music in its roar:
I love not man the less, but Nature more,
From these our interviews, in which I steal
From all I may be, or have been before,
To mingle with the Universe, and feel
What I can ne'er express, yet cannot all conceal.
-

X

BYRON

Chapter 1
Introduction
I do not pretend to start with precise questions I do not think
you can start with anything precise You have to achieve such
precision as you can, as you go along
- BbRrRAND R U S S E L L

Data compression is generally defined as the task of transforming or representing
data with a smaller amount of units of information than the original size Compression algorithms aie used to transform an initial amount of information to a reduced
amount, thus representing information m a compact form Instances of data include,
but aie not limited to, text, black and white (also lcfciicd to as binary, two-color, or
bi-lcvcl) images, color p u t m e s , scanned documents, sound, videos, and other digital
signals [1]
Data compression is ubiquitous despite the paiadoxical fact that storage and transmission costs keep decicasing as technological advances increase Web page images,
video streams, digital TVs, cellular communications and many other technologies exploit compression, these technologies would otheiwisc lose clarity 01 practicality in
performing their servrces [2] Famous examples of standard compression methods aie
JBIG2 for binary and half-tone images, J P E G , PNG, and GIF foi images m general,
and MPEG foi videos

In this work, we illustrate the thcoietical development and

cmpnical results of a novel compression method foi bmaiy images
1

Binary image

compression, too, is an active area of research, as shall be seen in Chapter 4.
The purpose of Section 1.1 is to expose the motivation that incited us to develop
a new yet conceptually distinct approach to compressing binary images and, to some
extent, other bi-level data, such as circuit test vectors. In Section 1.2, a succinct list
of the major contributions is posited. Chapters 2 and 3 attempt to fill in the gaps in
that list. Finally, an overview of the thesis is provided in Section 1.4.

1.1

Motivation

Consider devising a generic compression method for some predetermined set of
data such as binary images. On one side, having a unified theoretical approach could
be advantageous in focusing research in one main stream to ameliorate the generic
method. On the other side, empirical results should ascertain an appreciable degree
of compression efficiency in order for the method to survive.
Imagine a dynamical system comprising an information source, which assembles
uniformly sized chunks of binary images, and a channel that outputs them. Think
of these chunks as being analogous to the lctteis of some huge alphabet and consider
the images as being analogous to woids oi even sentences of some not necessaiily
meaningful language, in the sense that one would think of meaning and language.
Or, imagine these chunks are mosaic tiles, which, when assembled in some way, will
rcpioduce and give meaning to the image conveyed by the mosaic. In light of that
metaphor, suppose that the procedure of generating those image chunks obeys a stationary stochastic process. For every time shift, the distribution of such chunks should
remain the same. Consequently, the probabilities may be used to determine the compression terminus for each and eveiy possible binaiy image that the fictitious source
can yield. An appiopiiatc theoretical compression method could then be devised

2

And this approach would lead to a much aspired universal yet simple method for
compressing binary images
In truth, such a presupposed mfoimation source implies the involvement of an
infinitude of binary images and one should be very well aware of this However, the
Law of Large Numbers proves to be a strong mathematical aegis which enables one
to examine a relatively large sample of binary image chunks and deduce, to some
theoretical approximation, a quasi-universal compression method

The idea behind

the proposed method m this work lies m between these strains
For reasons which will be laid out m Chaptei 2, we specified the aforementioned
chunks as 8 x 8-bit blocks

We specify theoretical as well as empirical reasons for

choosing non-overlapping 8 x 8 blocks In order to constiuct a large sample of binary
images, we considered collecting and paititionmg binary images into 8 x 8 blocks to
examine the oveiall system entiopy

The latter provides a useful guide in learning

the theoretical uppei-bound of compiessmg binary images using a yet-to-be-devised
application-independent method
The existence of entiopy codeis such as Huffman and Aiithmctic coding, adds to
the idea of developing a umvcisal dictionaiy (oi codebook) comprising pans of 8 x 8
blocks and then Huffman codes oi cumulative piobabilities for the case of Aiithmctic
codes

In oidci to achieve such a dictionary, we constructed a system of binary

images randomly collected fiom different sources and weie as diveise as possible m
then pixel lcpiesentations

Thereaftei, we eliminated images that contained salt-

and-pepper noise This preprocessing piovcd to be useful m constructing an unbiased
codebook

Finally, we studied the lelativc probabilities of all blocks m the sample

and, thus, we calculated the entiopy of binary images based on the ldatively large
sample

Wo used the piobabihty distirbutron of 8 x 8 blocks to construct extended

Huffman codes for all blocks with absolute hcquoncy gieatei than 1, as shall be seen

3

m a later chapter
All m all, we are aware that a stochastic discrete dynamic system may assemble an
infinitude of binaiy images for a precise entropy value to be determined A hypothetical machine (or source) such as the one described here may not produce impiobable
sequences of blocks, any more than an equivalent souice may not produce sequences of
incomprehensible, say, English words As we shall state subsequently, such sequences
are merely unlikely We focus on the theoretical and empirical aveiage measures pertaining to blocks of binary images

In light of that, the sample we constructed is

representative of the most frequently occurring binaiy image blocks

Furthermore,

based on our literatuie review, this is the first attempt to model a generic codebook
for compressing binary images using Huffman 01 Arithmetic codes

1.2

Contributions

As stated in Section 1 1, a universal codebook could be constructed, in principle, by considering chunks of all possible binaiy images assembled by some dynamic
system In light of that the following list piovicles the caidinal contributions of this
leseaich
• T h e c o m p r e s s i o n apparatus. The pioposed method compnscs a codebook
and the row-column leduction coding, an algorithm which removes additional
redundancy in an 8 x 8 block This method may be viewed as an applicationindependent compression appaiatus, since the codebook component attempts
to endoise a generic coding scheme
• Efficient c o m p r e s s i o n of binary i m a g e s . The pioposed method achieves, on
aveiage, higher compression than the standard JBIG2 on bmary images which
do oi do not favoi the latter

In addition, the method can efficiently compicss
4

textual images, such as scanned documents, books, and so forth.

Naturally,

in order to attain even higher compression, the codebook must be extended to
include empirical distributions and Huffman codes of 8 x 8 blocks that have not
appeared in our data sample.
• Efficient c o m p r e s s i o n of discrete-color i m a g e s .

The method has been

observed to efficiently compress discrete-color images through color separation.
The latter procedure slices a color image into binary layers and compresses
each layer individually. Examples of discrete-color images include, but are not
limited to, maps, graphs, charts, and the like.
The first item in the list will be explored in detail in Chapter 2, whereas the
remaining contributions will be clarified in Chapter 3.

1.3

Mechanism of the Proposed Approach

The general mechanism of the pioposed method may be succinctly dcsciibcd as
follows The lossless compression algoiithm consists of two components- (l) a predetermined codebook; (n) an additional coding algoiithm—the low-column reduction
coding (RCRC)—designed to fuither compress data. Details of these two components
aic exposed in Chapter 2. As pei the pioposcd scheme, a binaiy image is partitioned
into non-ovcilapping 8 x 8 blocks and each block is compiessed individually.
In order to construct the codebook, we landomly collected 120 binary data samples, such as binary images, textual and document images, and so forth. The samples
are of different dimensions and were gathered from various applications. The dimensions vary between 149 x 96 and 900 x 611 bits, whereas then representations vary in
complexity and redundancy

5

The proposed method works as follows. For each 8 x 8 block, the codebook is
searched to check if the block is found. If so, the block of size 64 bits is replaced by
the corresponding code in the codebook. The latter code has a shorter length. The
minimum and maximum lengths of such corresponding codes are 1 bit and 17 bits,
respectively. If the block is not found in the codebook, we resort to an additional
coding procedure, the row-column reduction coding (RCRC), to compress the block.
If the size of the RCRC-compressed block is smaller than its original size (that is,
smaller than 64 bits), we use the compressed bit string. Otherwise, we do not compress
and represent the block with its original bits.
In general, blocks may be classified as compressible by the codebook, compressible
by the row-column reduction coding, or incompressible if the first two attempts fail.
Based on empirical results, the portion of incompressible blocks is, on average, less
than 6% of the total number of distinct blocks in a given binary image.

1.4

Thesis Overview

The remainder of the thesis is organized as follows. In Chaptci 2, we expose the
pioposed compicssion method in detail. We provide a basic thcoictical background
and lay some definitions pertaining to this woik. Then, we exhibit the construction of
a codebook per the motivation described above and the row-column reduction coding,
an algoiithm that removes additional redundancy in 8 x 8 blocks.
Empirical results are exposed in Chapter 3, categorized according to the related
areas of applications. The proposed method achieves efficient compression in various
classes of binary images. In addition, we observed that images comprising discrete
colois, which can be separated into binary layers, are highly compressible via the
proposed method. Finally, with a slight modification to the second component, the

6

proposed algorithm attains good compression for integrated circuit test vectors.
Chapter 4 provides a summary of recent and mainstream compression techniques
related to binary images and discrete-color images. Conclusions and Future Work
follow in Chapter 5.

7

Chapter 2
The Proposed Method
You have your way. I have my way. As for the right way, the
correct way, and the only way, it does not exist.
-

FRIEDRICH NIETZSCHE

The purpose of this chapter is to expose the analytical details and components of
the proposed compression method. The foundational armor of the method consists
of an admixture of theoretical and empirical arguments. It is the objective of each
subsequent section to provide the reader with these aiguments as well as with the
thcoictical context pcitaining to the proposed lossless compression algorithm.

2.1

Background

In this section, we provide a basic overview of compression methods and the
modeling and coding paradigm. We also cover the definitions of entropy and joint
entropy and an information-theoretic result that has been of central importance to
the development of efficient entropy coders. We conclude with a general exposition of
two famous entropy coders—Huffman Coding and Arithmetic Coding—that will be
encountered throughout the remainder of the chapter.

8

2.1.1

Compression Methods

Compression methods generally operate in two phases. The first phase consists of
the compression algorithm, which takes input (or source) data, denoted by 3Q, and
transforms them into 3c, which is expected to contain fewer bits of information. The
second phase is the inverse operation of the first phase: the decompression algorithm—
also referred to as reconstruction or decoding—takes the compressed data 3c and
reconstructs the original d a t a 3Q. In what follows, decompression, reconstruction and
decoding refer to the same process and may be used interchangeably.

Figure 2.1

illustrates a general exhibit of the two compression phases.

Compression
Algorithm

V

Compressed
Data 3 C

Initial Data 3 Q

Decompression
«t
Algorithm

Figure 2.1: The two phases of compression methods

Compression methods are classified into two major categories: lossless

compres-

sion schemes, in which case the compressed data must be recovered exactly, and lossy
compression schemes, where compressed data is allowed to be different from the original data to some predetermined extent. This research focuses on lossless compression
methods.
Lossless compression methods involve the exact reconstruction of the original data
from the compressed data. This implies that the compression technique applied on
the input data 30 to generate the compressed data 3c should be such that the decompression method applied on 3c reconstructs 3$ with no loss of information.
2.1 may be viewed as a schematic representation of lossless compression.

9

Figure

Lossless compression methods have a wide realm of applications, particularly when
the integrity of d a t a must be preserved. Instances of such applications include text
compression, where the exact reconstruction of a particular text message is required
[2]. For instance, a bank statement containing important information such as "Credit
card balance due April 15, 2010" and "Credit card balance due April 5, 2010" convey perceptually almost the same data, but completely diverging information if not
reconstructed exactly. Also, binary, grayscale, or color images, such as medical MRI
or similar graphics must be reconstructed exactly, otherwise the nearly, yet not completely, reconstructed information may lead to a completely different, and plausibly
erroneous, interpretation of the data.

Other examples include scientific databases

and images arising in remote sensing applications [3]. Last, but not least, lossless
compression is crucial for cryptographic data, in which case data are compressed for
added security and must be precisely reconstructed in order to preserve cryptographic
keys.

2.1.2

The Modeling and Coding Paradigm

Archetypal to lossless compression methods is the Modeling-Coding paradigm [2,
3]. Based on this paradigm, the set of central components of any compression method
comprises a mathematical model and a coder. The model is generally a stochastic
model describing the distribution of the source data symbols, 3Q. that are to be
compressed. For example, if the coder is intended to compress English text, then the
stochastic model could be a second-order Markov chain describing the distribution
of English trigrams. Given the distribution description for each symbol, the coderattempts to represent the symbol into codewords of shorter length. The coder output
will be a concatenated string of codes, 3C) along with additional information for
updating the stochastic model, which may require prior knowledge of symbols. As
10

this paradigm implies, compression may be referred to as coding or encoding, while
decompression is synonymous to decoding. A codeword is simply defined as a sequence
of binary digits. Huffman and Arithmetic Coding are two famous coding schemes,
and will be discussed subsequently.

2.1.3

Entropy: The Coding Terminus

Data compression may be considered as a branch of Information Theory, the
purpose of which is the study of efficient coding or quantification of information.
From the information theoretic standpoint, the data being compressed are referred to
as the message. A central question that is addressed to compression methods is how
efficient they are. In his seminal paper, 1 Shannon introduced the concept of entropy
in Information Theory as an attempt to answer that question.
Entropy, analogous to its counterpart in Thermodynamics, is a quantitative measure of the uncertainty contained in a particular message or, in general, a system
[4]. The more random or disordered a system is, the more information is contained in
that system and the highci its entropy becomes—that is, the predictability of the next
object of the system given a pievious object of that system depends on the system
entropy. This implies that the predictability of the next object given the previous
object increases by reducing the entropy of the system. Note that an object may be
a letter of the alphabet if the system under observation is a natural language.
Entropy is also referred to as the Shannon entropy, to distinguish between the
concept in Physics, or more accurately as the first-order entropy, and is defined as
follows.
Definition 2 . 1 . First-Order

Entropy

Consider a dynamical system (fi, F, P,T),
l

where 0 is the sample space, F is a a-

Sec [4].
11

algebra, P is the probability operator, and T is a time shift operator

Let X

be a discrete random variable that has a finite alphabet A = {cui, O^J

, a||.A||}>

\\A\\ is the size of the alphabet, and let {Xn, n G Z + } be a stationary
process defined on the probability space (f2, F, P)

discrete

ft —> O
w

here

stochastic

Then, the entropy of X is defined

as
H{X)

= H (jpx) = ~Y,Px

Inpx = -Ep

where px = P(X = a) and E[ ] is the expectation operator
on the continuity

(2 1)

Note that OlnO = 0 based

argument that hm.c_o+ x l o g x = 0 Also, 0 < H(X)

a concave function
equals

[lnpx],

< ln(\\A\\)

If the outcomes are equiprobable, the entropy is at maximum

is
and

ln(\\A\\)

Entropy is expressed in bits per object, where an object is any member of a predefined system See [5, 6] for a rigorous treatment of the subject
Consider the following examples
E x a m p l e 2.1. Consider the message Ai = "aaaabbaaccccaaaa", containing three
symbols (oi objects)
b' and c'

'a', b', and 'c' Lettei a occurs moie frequently than letters

In other words, it seems normal to expect letter 'a' to appear moie

frequently should the message be shifted m time The piobabihty distnbution foi the
three letters rs p(a) = 10/16, p{b) = 2/16, and p(c) = 4/16 Based on equation (2 1),
the entropy of the given message is H(M)

= 1 3 bits pei letter

on average 1 3 brts to encode each letter m message M

T h a t rs, wc need

Here, the message may be

consrdered as a system whose objects are Englrsh letters
E x a m p l e 2.2. The message M. = "vbdkfawrptlhksaq' is apparently more disoideied
than the message m example 2 1 In othei words, the entropy of tins message is higher,
given the highly lanclom drstnbution of rts letters Here, H(A4) = 4 13 bits per lettei
Entropy provrcles a theoretical lower bound for coding and serves as a compression
12

target. If the entropy of a particular message is H, the highest compression ratio that
can be achieved for that message is (S — H)/S,

where S is the size (in bits) of the

message [7]. This implies that the smaller the entropy, the higher the compression
ratio, and conversely. First-order entropy can be extended to define a vector X of
random variables. T h e entropy in a dynamical system of two or more discrete random
variables is referred to as the joint entropy and is defined as follows.
Definition 2.2. Joint

Entropy

Let X be a vector of k random variables Xi,X2,...

,Xk.

Then, the joint entropy is

given by:

H(X)

= H(X1,Xi,...,Xk)

where P(Xi, X2,...,

= -Ep[hyP(X1,X2,...,Xk)],

Xk) is the joint probability distribution

variables in X . Note that joint entropy is non-negative
property: H(Xi,X2,...

,Xk)

< H(Xi) + H(X2)-\-..

k random variables are independent
thatH{X,X)

=

(2.2)

of the k discrete

and satisfies the

. + H(Xk)

random

subadditwity

with equality only if the

in the sense of probability theory. Also,

observe

H{X).

In order to lay out the lational foundation of the proposed method in this woik,
it is useful to define the entropy function for systems based on binary alphabets.
Definition 2.3. Binary

Entropy

Let A = {0,1}, p(0) = P{X

= 0) = p, and p ( l ) = P(X

entropy function

= 1) = 1 - p. The binary

is defined as:

Hb(X)

= -plog2p

- (1 - p ) l o g 2 ( l -p).

(2.3)

Equation (2.3) is easily derived from the general model of first-order entropy given
in (2.1). Definition 2.3 posits the following theorem.
13

T h e o r e m 2.1. Let A{n) = { 0 , 1 } n be the extended alphabet of A = {0,1}
is, the members of A^

are all the binaiy n-tuples, where | | ^ 4 ^ | | = 2n

That

We refer to

these groups of symbols as block symbols or, simply, blocks Then,

H [X ( n ) ] = nHb(X)

The proof follows from equation (2 2)

(2 4)

See [2] for the proof of the weak case for

extended alphabets
Theoiem 2 1 is important for the following two mam reasons

First, it expresses

the entiopy of random functions defined over an extended alphabet in tcims of the
entiopy of the same functions defined over the basic alphabet

Second, the entropy

of longer groupings of symbols guarantees a rate closer to the system entropy—that
is, higher compression can be attained by considering blocks of symbols rather than
single symbols

In general, encoding blocks of symbols defined over an extended

alphabet guaiantees an average codeword length upper bound closer to the entiopy
late This observation is fiuthei claimed when the proposed method is posited

2.1.4

Huffman Coding

Huffman coding is a popular entropy encoding algorithm that can generate optimal
piefix codes

The basic pimciple behind this method is to optimally assign shortei

codes to symbols that appear more fiequcntly in a given message Thercfoie, souice
statistics aie supposed to be available m advance

Foi instance, in the string in

Example 2 1, letter 'a' will be assigned the shortest Huffman code because it has the
highest relative probability
If Huffman coding is used to encode binary messages, wheie symbols aie eithci 0
oi 1, then based on equation (2 3), whatevci the probability distribution and entropy,

14

the binary symbols will still require one bit to be encoded. Therefore, no compression
can be achieved. However, according to Theorem 2.1, if binary symbols are grouped
together to form blocks of symbols, then Huffman codes will guarantee compression.
When Huffman codes are applied on extended alphabets, they are referred to as
extended Huffman codes [2].

2.1.5

Arithmetic Coding

In cases when the probability distribution of symbols is skewed and when symbol
probabilities cannot be redefined, Huffman coding may be inefficient to employ [1,
2, 8]. A competitive alternative is Arithmetic Coding, which is a core component of
standard compression schemes, such as JBIG [9], J P E G , and M P E G . This method
does not encode symbols with specific codes; rather, it encodes an entire sequence of
symbols with a real number C, 0 < C < 1. This mapping is accomplished through a
simple bounding function.
Arithmetic coding has a higher complexity than Huffman coding, but achieves
better lcsults in practice for small alphabet sizes. However, when the alphabet size is
very laige and the probability distribution of symbols is not too skewed, the efficiency
of the two methods is comparable.

If used on a very large alphabet, Arithmetic

coding may become inefficient in terms of complexity relative to Huffman coding [1].
In addition, Arithmetic coding is affected by inaccurate probabilities more often than
Huffman coding [8]. All in all, Huffman codes are fast and efficient and are preferable
for most applications.

15

2.2

Definitions

It is necessary to provide definitions of certain concepts that will prove useful in
understanding details of the proposed method, and then construct the denotational
aspect of this work.
Definition 2.4. Compression

Ratio

Image compression ratio, CR, is measured by an index defined as follows:

where w and h represent the width and height of the image, B is the number of bits
required to represent each pixel in the image, and \\CF\\ is the size, in bits, of the
compressed data.
In the case of binary images, B = 1 bit/pixel, abbreviated as bpp. For images
containing k discrete colors, there are k — 1 binary layers, where one of the colors
represents the background color which is common to all layers. Let CF%, i = 1,... ,k —
1 denote the layer size in bits Then the compression ratio lor discictc-color images
is given by:
CR = f-';,1
I .
hw(k - 1)

(2.6)
'

v

Compression ratio measures the average number of bits required to encode one pixel
and may also be expressed as the percentage decrease in input file size. We use these
measures interchangeably.
A binaiy image may be defined topologically such as in [10]. Foi simplicity, we
define a binary image as follows.
Definition 2.5. Binary

Image

Also rejeiied to as bi-level image, a binary image is a collection of picture elements
16

(pixels), each of which conveys either the color black or white. By convention,

we use

symbol '0' to denote a white pixel, and symbol '1' to denote a black pixel.
Figure 2.2(a) shows an enlarged 16 x 16 binary image (letter 'A' in 12 pt Old
English font face). The size of this image is 256 bits. Partitioning the image into
8 x 8 blocks will yield four such blocks as depicted in Figure 2.2(b). For computational
simplicity, a binary image is represented as a binary matrix d a t a structure.
ooo 0 0 0
ooo O i l

000000
001100
111000
110000
110000
110000
110000
110000
110000
110000
110000
110000
110000
111100
111000
110000

001 1 1 1
001 1 1 0

on 1 0 1
on O i l
on 0 0 0
on 0 0 1
Oil 0 0 1
000 111
000 110
000 1 1 0
00 1 1 0 0
00 1 1 1 1
Oil O i l
Oil 0 0 0

(b)
Figure 2.2: (a) An enlarged 16 x 16 binary image; and (b) the corresponding bit matrix

The central component of the proposed method is a codebook comprising p a h s
of fixed-length (8 x 8) blocks and variable-length Huffman codes. Such a codebook
is referred to as a fixed-to-variable dictionary. A block is any member of alphabet
= {0, l } n , which is the extended alphabet of A(l)

A^

Hence, an 8 x 8 block is a member of A^m\

= {0, 1} (see Theorem 2.1).

with a cardinality of 2 64 symbols.

An encoding scheme C, defined as a mapping C: A^

—> A, where A = ( J i = i ^S%\

is a function that maps an 8 x 8 block (64 bits) to a bit string of shoiter length. T h a t
is, an 8 x 8 block can be encoded as a sequence of 1 bit (A^),
(A^),

a sequence of 2 bits

. . . , up to a sequence of 64 bits, in which case the compression ratio would

equal 0. A Huffman encoding scheme complies with such a definition, since in general

17

no Huffman code can be longer than the alphabet size less one. 2
There exist schemes that may encode some low-redundancy data into longer bit
strings than the original data length. In such cases, the original d a t a are preserved
rather than compressed.

The row-column reduction coding may encounter such

cases as we shall see in Section 2.4. Such an encoding scheme may be defined as
CRCRC-A^^AX

A.

Define function L: A —> N + , where A = \Jl=1A^\

Function L gives the length

(in bits) of the encoded data.
The inverse procedure of encoding is referred to as decoding. A lossless compression algorithm should be able to recover the original data exactly. The codebook
component of the proposed method may be viewed as a partial injective mapping
[12], wherein encoding and decoding are well-defined. The same applies to the second
component, the row-column ieduction coding.
An input image 3 with dimensions h x w is defined as a multiset of 8 x 8 blocks,
since a block may appear at least once. Suppose that 8\h A 8\w, then the cardinality
ol 3 is ||2f|| = }~,wh
D e f i n i t i o n 2.6. Let B? be the set of blocks in image 3 arid let V denote the codebook,
defined as a set of paws (b,C(b)).
(i) B]i = {b\b £ Bj A b E V}.

Then, define the following

sets:

Set BH contains all blocks of image 3 that are in

the codebook, i.e. that can be compressed with Huffman codes.
(ii) BR = {b\b EB0Ab(£VA

CRCRC{b) / 0 A L(CRCUC{b))

< 64}.

Set BR con-

tains all blocks that are not in the codebook, but that can be compressed by
2

In [11], it is shown that the maximum length of Huffman codes is:
mm

log*

whcie n is the numbci of tiec levels, $ = L+2

$ +1

«-l

, and pi and %>2 aic the two smallest piobabilities.

18

the row-column reduction coding m less than 64 bits
(in) Bv = {b\b eB0Ab(£VA

[CRCRC(b)

= 0 V L(CRCRc{b))

> 64]}

Set Bv con-

tains all incompressible blocks
It should be noted that sets BH, BR, and Bu are partitions of set B?
Wc will lecur to these definitions and notations whenever it is deemed necessary
and appropriate throughout the detailed explanation of the proposed method

2.3

Toward a Universal Codebook

The proposed method operates on a fixed-to-vanable codebook, wherein the fixed
part consists of 8 x 8 blocks and the variable part comprises Huffman codes corresponding to the blocks

In oidei to devise an efficient and piactical codebook, we

conducted a frequency analysis on a sample of more than a quarter million 8 x 8
blocks obtained by partitioning 120 landomly chosen binary data samples By studying the natuial occunence of 8 x 8 blocks m a relatively laige binaiy data sample,
the Law of Laigc Numbcis motivates us to devise a geneial (empirical) piobabihty
distribution of such blocks

In pimciplc, this could be used to construct a universal

codebook based on extended Huffman codes, which can be employed for compressing
efficiently (on aveiage) all sorts of bi-levcl data In thrs section, we piovide details on
the data sample wc constructed to generate a codebook and how we constructed the
codebook Thereafter, we expose how the codebook rs employed to compress brnary
images

19

2.3.1

The Sample of Binary Images

In order to perform a frequency analysis on 8 x 8 blocks, a sample of more than
300 binary images of various dimensions and compositions was compiled. The images
varied from complex topological shapes, such as fingerprints and natural sceneries,
to bounded curves and filled regions. The candidates were extracted from different
sources, mainly randomly browsed web pages and public domain databases. Because
these representative images are widely available, it is reasonable to deduce that they
are more likely to be considered for compression. Also, the main criterion in constructing an unbiased data sample was that images should convey the clear meaning
they were constructed to convey without unintentional salt-and-pepper noise. Such
a noisy image is illustrated in Figure 2.3(a) along with the corresponding "noiseless"
counterpart in Figure 2.3(b). Perceptually, the images in Figure 2.3 may be regarded
as conveying the same meaning. However, we assume that the observer's perception
is strictly defined. 3

(a) Salt-and-pepper noise

(b) No noise

F i g u r e 2.3: An image containing salt-and-pepper noise and its noiseless counterpart

Having removed noisy binary images, the initial data sample reduced to 120 images
with dimensions varying from 149 x 96 to 900 x 611 bits yielding approximately 250000
8 x 8 blocks. Before proceeding with the frequency analysis, we preprocesscd binary
^Consider, for instance, a machine that cannot distinguish between the two images in Figme 2.3.

20

images m two steps The first step consisted of trimming the margins (or the image
frames) m order to avoid biasing distribution of 0-valucd or 1-valued 8 x 8 blocks
In the second step, we modified image dimensions to make them divisible by 8 for
attaining an integral number of 8 x 8 blocks
Trimming images m order to remove redundant background frame is important for
the first preprocessing step Preserving such frames increases the relative probability
of 8 x 8 blocks consisting of zeios or ones if the background is white or black, respectively Consequently, the probability of such blocks creates a skewed distribution of
blocks in the codebook It has been leported that Huffman codes do not perform well
with such a distribution of symbols [2, 8] 4
Consider the binary image shown m Figure 2 4(a) Prior to trimming the margins,
which comprise 8 x 8 blocks filled with zeros, it is necessary to determine the four
extreme points depicted with the lines tangent to the closed curve

Otherwise, one

might clip portions of the image that contribute to the overall meaning the image
conveys In addition, it is impoitant that the distance between the tangent point and
the actual trimming point is divisible by 8, as depicted m Figuie 2 4(b) The reason
for this is to avoid biasing the content of an 8 x 8 block, which would otherwise add
to the ovuall redundancy of the image

The latter would posrtrvcly, but urrlarrly,

influence the compression ratio of the proposed method

For mstancc, usmg thrs

trimming procedure, the first row of 8 x 8 blocks wrll be filled wrth 0-valucd blocks
The second such row will compose blocks that start to represent portions horn the
fully-trimmed rmage, as deprcted in Figuie 2 4(c)
The second pieprocessrng step consrsted of making the image height and width
divisible by 8 Let w and h denote the width and height of an image We convert w
and h to w* and h* such that 8|to* A 8\h* as follows
4

It should be stated, howevei, that the distribution of blocks in the constiucted codebook is
dominated by 0-valued 8 x 8 blocks followed by 1 valued 8 x 8 blocks

21

-^8 bits
(a)

(b)

BLOCK BLOCK
1
2

<-8 bits^-

t
. . .

8 bits

J|
1

w
F i g u r e 2.4: Dcteimining the extieme points piior to tiimniing t h e binary image

• If h mod 8 ^ 0 , then h* = h + 8 - (h mod 8);
• If w mod 8 ^ 0 , then w* = w + 8 — (w mod 8).
Thus, the new image dimensions aie h* x w*. For instance, a 100 x 100 image will
be padded to become a 104 x 104 image using the two steps above. The newly
padded vector entiies are filled with the image background bit. For instance, if the
background color m the binary image is white (represented conventionally with 0),
the padded entiies will be filled with 0 bits.
22

Having gone through the two preprocessing steps, we conducted a frequency analysis on the 250000 8 x 8 blocks and we used these relative probabilities to construct
the codebook This is the topic of the next subsection

2.3.2

Constructing the Codebook

From an information theoretic standpoint, we consider the images m the d a t a
sample described m Section 2 3 1 to have been geneiated by the hypothetical source
described m Section 1 1 As such, the set of 8 x 8 blocks may be characterized as a
discrete stochastic process 5 defined over a veiy laige discrete alphabet of size equal to
2 64 symbols that represent all possible patterns of zeros and ones m an 8 x 8 block
Essentially, one can study the distribution of 8 x 8 blocks for a relatively large data
sample, such as the sample descubed m the piecedmg subsection

It is, however,

not possible to estimate empirical probabilities foi all 2 64 8 x 8 blocks, and it is
ceitamly not feasible oi time efficient to construct a codebook containing all possible
blocks and their Huffman codes Therefore one should consider devising a codebook
compusmg the most ficquently occumng blocks In gencial, the moic one mcieases
block dimensions the smaller the waiting probability of observing all possible blocks
becomes because the size of the alphabet mcieases exponentially

Thus, it would

be reasonable to have an expected value of the numbei of t u a l samples lequned to
obseive all the possible 8 x 8 blocks
The latter pioblem of determining the waiting probability of obseivmg a particular numbei of blocks and the expected numbei of samples needed may be viewed
as an instance of the moic general Coupon Collector's Pioblem which is elegantly
posed m [14] This pioblem is lllustiated m Appendix A 1 In our case, we considei
5

Inioimation Theory was developed by lelymg on the assumptions of eigochcity and stationaiity
[n] Thus, such a landom piocess should be ehaiactenzcd as an cigodie and stationary discicte
stochastic process

23

"coupons" to be the 8 x 8 blocks for a total of 2 6 4 symbols. Hence, we are interested
in determining the number of blocks we must collect from the dynamical system in
order to have observed all possible blocks. Thereafter, we may deduce an estimate of
the expected number of trials required.
Answering these two questions per the Coupon Collector's Problem gives analytical insight on the size of the d a t a sample required to estimate probabilities for observing all 8 x 8 blocks in the sample. The probability of waiting exactly n trials in order
to observe all 2 64 8 x 8 blocks is equal to P{T = n) = P{T > n - 1) - P{T > n),
where
2

/ 9 6 4 \ / 964 - 7 \ n
1

r~ 1

p(r>n)= £ ( - ! ) • + ' ( ^ ( V ) •

(2 7)

'

The probability in formula (2.7) is difficult to compute for all possible 8 x 8 blocks.
However, we may resort to an asymptotic approximation of the expected number of
trials required to observe all blocks using the following formula:

E[T] « 2 64 In 2 64 + 7 2 6 4 = 8.29 x 10 20 ,

(2.8)

where 7 ~ 0.5772 is the Euler-Mascheioni constant. The result in (2.8) implies that
we need to compile a piactically huge numbei of samples in older to attain a complete
set of 8 x 8 blocks and to estimate, in turn, all relative probabilities.
Based on the latter lemark, the only way to 1 educe the number of samples needed
would obviously be to reduce the block dimensions from 8 x 8 to, say, 4 x 4, so that
H^ll = 2 1 6 . We did experiment with blocks of smaller dimensions in order to decrease
the alphabet size. However, the efficiency of extended Huffman codes for smaller block
dimensions decreased. This is an expected result in Information Theory, as succinctly
stated in Theorem 2.1. For instance, for 2 x 2 blocks, ||^4|| = 16, and the expected
number of samples needed to observe all possible blocks is approximately equal to 55.

24

Also, the waiting probability given in formula (2 7) tends to an arbitrarily small value
for larger values of the number of samples needed For example, P(T = 100) = 0 0017,
which implies that all 2 x 2 blocks will certainly be observed Foi 4 x 4 blocks, on the
other hand, the expected number of samples needed would be approximately equal
to 764647 Yet, even this number imposes difficult computations m determining both
relative probabilities and extended Huffman codes of 4 x 4 blocks
The fact that decreasing block dimensions decreases the maximum compression
ratio can be explained by the following observation In general, suppose the entropy
for n x n blocks is H

Then, the compression ratio upper bound (m bpp) is ^ , I e

the number of bits required to encode an n x n block is inversely proportional to
the square of block dimensions Thus, compression ratio per block increases as longer
block dimensions aie considered and decreases otherwise For example, the entropy of
the system compiismg all 65536 4 x 4 blocks was observed to be equal to 2 12 bits per
block, yielding a satisfactory compression bound of 86 74% However, this would be
the case if Huffman codes could be derived for all such blocks Constiuctmg Huffman
codes foi such a laige cardinality is piactically inefficient

Therefoie, one needs to

consider an empnical balancing between block dimensions, entropy, and extended
Huffman codes

Based on such considerations, we decided to study the empnical

distribution of 8 x 8 blocks
Table 2 1 shows various candidate block dimensions along with the lesultmg entropy values 6 and the expected sample sizes

An increase in entiopy is expected as

block dimensions increase because the alphabet size becomes laigcr, thus increasing the number of possible states

The theoictical maximum compression ratio m

percentage, howevei, mcieases proportionally to the squaie of block dimensions, as
6

It should be noted that the entiopy values given in the table—and, hence the maximum com
piession latios, CRma%—are extrapolated from the analysis we earned out on 4 x 4 and 8 x 8 blocks
These appioximativc values should suffice to give a gcneial idea ol how entiopy and compression
idtio vaiy with block dimensions

25

stated above

At this point, one may conclude that using 8 x 8 blocks consists a

better choice than considering 2 x 2 or 4 x 4 blocks, or blocks with smaller dimensions
than 8 x 8

After all, even for 4 x 4 blocks the number of samples required to observe

all such blocks and to deduce a more reliable empirical probability distribution is
practically unattainable and offers no promising compression ratio
On the other side, the expected samples needed to observe all blocks increases
exponentially, as can be noticed from the last column of Table 2 1

Despite the

increase m entropy, the alphabet size imposes an empirical limit m selecting blocks
laiger than 8 x 8

Also, the probability that blocks not in the codebook will be

compressed by the row-column reduction coding dccieases with an increase m vectoi
size, as shall be observed m Section 2 4 In all, oui choice of 8 x 8 blocks is based on
these strains of remarks and would probably be no different than selecting 7 x 7 or
9 x 9 blocks, except for some decrease or increase in the theoietical compression ratio
and the feasibility m handling Huffman codes

Table 2.1: Effect of block dimensions on entropy and the expected sample size
Block

|j,4||

Entropy

CRmar

E[T]

2x2
4x4
8x8
12 x 12
16 x 16

A

1 36
2 12
4 09
7 89
9 72

66 00%
86 74%
93 60%
94 52%
96 20%

55
764647
8 29 x 1020
2 24 x 1045
2 06 x 1079

2

216
264
2144
2256

In the d a t a sample of 120 images, we identified a total of 65534 distinct blocks
Fiom this total, we selected the 6952 blocks that occiuicd 2 times or more and clrscarded all other blocks with an absolute frequency equal to 1 Thus the cardinality
of the fixed-to-variable codebook equals 6952 entries

This is a small number com-

pared to the total numbci of blocks equaling 2 6 1 Since the codebook contains such
a small fraction of 8 x 8 blocks and since we do not know the theoretical piobabihty
26

distribution of blocks, it is reasonable to provide an estimate for the error between
the theoretical and the empirical average code lengths (or entropies).
The observed average code length, L, is given by:
N

1

(2.9)

i=i

where q% arc the empirical probabilities of blocks and N is the number of blocks.
Similarly, we define the theoretical average code length for the theoretical probabilities
P.:
N

1
iy = ^ P t l o g 2Pi- .

(2.10)

i=i

Then, we examine the error model:
N

(

1

E = L- H = ^2lqt log2
l=1

(2.11)

p, log 2 —
Qi

V

1\
Pi/

Let e% = qt—Pi be the discrepancy between empirical and theoretical probabilities,
\/i = 1, 2,.. . , N. Then, a second-order asymptotic expansion on e, yields the following
error approximation:
N

E

E i^(

ei

+

( max 1|e ? 2}})
V*e{i, ,N}
J

!;.)+e*log^

as

e, -> 0 . (2.12)

The derivation of formula (2.12) is given in Appendix A.2.1.
The asymptotic appioximation in formula (2.12) implies that discrepancies between the theoretical and empirical average code lengths arc negligible as e, —> 0.
However, in our case we included only 6952 blocks in the codebook. If we let N = 6952
in equation (2.12), we have to add an additional error term for all other possible
blocks not included in the codebook. Practically, we consider qt = 0,Vz > 6952.
The additional enoi tcim to be added to cciuation (2.11) would thus be equal to
27

~ Si=6953 A 1°S2 Pi

in our

view

) these theoretical probabilities aie very small and

have a minor effect on the code length erior, as will be seen in the next paragraph
This fact is, however, one of the motivations that incited us to develop the additional
coding module—the low-column reduction coding—as will be illustrated m Section
2 4 See Appendix A 2 2 for a detailed discussion on the additional error term
As stated earlier, the constructed codebook is a set of pairs V =
where b is an 8 x 8 block and C(b) is the Huffman code of b

{(b,C(b))},

The Huffman code

length L (C(6)) varies from 1 bit to 17 bits, while the observed average code length
is 4 094 bits, which is greater than the codebook entiopy value of 4 084 bits pci
block

The difference between the observed average code length and the entropy

value (defined in formula (2 11)) is equal to 0 01
redundancy

This difference is referred to as

In percentage, the redundancy is found to be 0 252% of the entropy

This means that, on average, we need 0 252% more bits than the minimum required
m order to code the sequence of blocks m the codebook

In compliance with the

asymptotic expansion of the error given in (2 12), this value, too, exposes a minoi
excess in codeword lengths and accounts for a near-optimal encoding of blocks by
means of the constructed codebook

Table 2 2 summarizes some statistics foi the

codebook

Table 2.2: Some statistics for the constructed codebook
Entnes
6952

Mm length
1 bit

Max length
17 bits

Mean length
4 094

Vanance
1695

Entropy
4 084

Ei i or
0 252%

In the context of the modeling and coding paiadigm piesented m Section 2 1 2 , the
constructed codebook acts as the static modeling part of the proposed compression
method

In static modeling, statistics aie collected for most or all alphabet symbols

in older to construct representative codes

28

While static modeling reduces the com-

plexity of the decoder [15, 16], it is not widely used m practice because sample d a t a
may not always be representative of all d a t a [17] Albeit m this section, we tackled a
way to construct efficient and representative codes for the most frequent 8 x 8 blocks
of binary images

The analysis on the constructed codebook suggests a small lower

bound and a negligible asymptotic upper bound on the discrepancy between theoretical and empirical code lengths

Moreover, having established a compression model

based on a fixed-to-variable codebook, we have selected Huffman and Arithmetic coding to implement the coder The former has been presented m this section, whereas
the latter will be exposed subsequently
As a final note to this section, to calculate the frequencies of all distinct 8 x 8
blocks observed in the data sample, the program we designed executed for approximately 500 hours on an Intel Dual Core machine at 1 6 GHz per processor and 2 4
GB of RAM

2.3.3

Distribution of Blocks and Huffman Codes

A noimahzed measuie to study the chspersron of the blocks in the constructed
codebook could be the vairancc-to-mcan latio (VMR) for the block counts

Such

a measuie can provide insight on the theoietical distiibution ot 8 x 8 blocks
VMR

= 1, the data can be modeled by a Poisson process If VMR

II

> 1 the data are

over-drspcrsed, m the sense that they aie spatially concentrated, and if VMR

< 1

the d a t a are said to be under-drspersed In oui case, the mean occurrence of blocks rs
x = 62232 88, the variance is s2 = 1723497 76, and VMR

= 27 69 Because VMR

=

27 69 > 1, the blocks aie over-drspersed and do not follow a Poisson distribution
This result also suggests a relatrvely hrgh degree of randomness m the distnbutron of
the 6952 8 x 8 blocks
Frgure 2 5 illustrates the distiibution of the 20 8 x 8 blocks with the largest piob29

abilities. Observe that the blocks with the largest probabilities—namely P(k = 1) =
50.4% and P{k = 2) = 26.3%—are, respectively, filled only with zeros and only with
ones.

In addition, Figure 2.6 shows the cumulative probability of the 6952 8 x 8

blocks in the codebook.

06

05

04

?

03

02

01

0
0

5

10
k

15

20

Figure 2.5: Distiibution of the first 20 8 x 8 blocks

At this point, a goodness-of-fit test is useful in ascertaining whether the sampled
8 x 8 blocks follow any of the known discrete distributions. We test the following
hypotheses at the 5% significance level using the Kolmogorov-Smirnov and AndersonDarling tests:
7io : The 8 x 8 blocks follow distribution V.
TCA- The 8 x 8 blocks do not follow distribution V,
where V denotes any of the following disciete distributions [14]:
(i) Log-series, with probability mass function (pmf) P(n;9) = - ]~n'lQ^ i
30

6000

Figure 2.6: Cumulative probability of the 6952 blocks
(n) Geometric, with pmf P(n,p)

= p{\ — p)n, 0 < p < 1,

(m) Hypcigeometnc, with pmf P(k, TO, n, N) =

(?)

whereTOdenotes the total

numbci of successes and N — m denotes the total number of failuics foi n chaws
(IV) Negative binomial, with pmf P(k, ? ,p) = ( +'_~ ) (1 — p)rpk, where r denotes the
number of failmes until the process is stopped,
(v) Poisson with pmf P(n, A) = A'e
Test lcsults aie given in Table 2 3 The last two columns display the KolmogoiovSmnnov (KS) and Andci son-Dai ling (AD) statistics, which aie compaied with the
respective ciitical values equal to 0 019 and 2 5 at the significance level a = 0 05 In
all fitted cases, the null hypothesis is lcjectcd m favoi of the alternative hypothesis
The log-scnes (discicte logaiithmic) distribution is, however, lauked fiist based on

31

the fact that it has the smallest test statistic

T a b l e 2.3: Hypotheses testing for the distribution of 8 x 8 blocks

Rank
1
2
3
4
5

Distribution V
Log-series
Geometric
Poisson
Hypei geometric
Negative binomial

KS Statistic
0 324
0 619
0 948
No fit
No fit

A D Statistic
1089 5
4390
101910
No fit
No fit

Parameter
9 = 0 995
p = 0 026
A = 37 737
—
—

An index of dispersion measure can be given for the distribution of the constructed
Huffman code lengths as well Among other statistics, Table 2 2 shows the mean and
the variance of the constructed Huffman code lengths The index of dispersion for this
case is VMR

= 0 41, which implies that data aic under-dispersed

In othei woids,

the constructed Huffman code length values aie more regular than the randomness
associated with Poisson-distnbuted data

2.3.4

Employing the Codebook

Let V be the binaiy matrix lepiesentmg some input image J that is to be compiessccl

F u s t we pad m a t n x V to make its dimensions divisible by 8, as shown m

Section 2 3 1 Then, V is partitioned into 8 x 8 blocks, fev Foi each block 6y, the
codebook V is scaichcd foi a match 6© If a match is detected, the input block 6v is
encoded by the Huffman code of block bp We denote this operation as bv <— C(bx>)
This pioccduie iterates until all blocks in matrix V have been piocessed
Decoding a compressed bit stream is simple The Huffman code is searched m the
codebook and the corresponding 8 x 8 block is then retrieved

For fastci sequential

seaich, the codebook entnes aie soitecl m descending ordei based on the probabilities
of the 8 x 8 blocks 7 Figure 2 7 shows the hist thiee entnes of the codebook
7

Scqueutial scaich is expected to mn ptactically fast because the most ficqucntly occumng

32

8 x 8 block
0 0
0
0 0
0
0 0
1 1
1 1

0
1
1

1
1

1
1

1
1

0 0

0

0 0

0

Huffman

Code

0

10

11101101

F i g u r e 2.7: Sample codebook entries

It can be observed fiom Figure 2 7 that the 0-valued 8 x 8 block has the shortest
Huffman code length, equal to 1 bit, followed by the 1-valued 8 x 8 block In terms
of the probability distiibution, these two blocks alone compiise approximately 75%
of the 6952 blocks m the codebook This is an expected result as, in general, binary
images have a white background and regions filled with black
On the othei hand, if no codebook match for block &v is found, RCRC attempts
to compress £>v The RCRC algonthm is explained in the next section

2.4

The Row-Column Reduction Coding

The codebook component of the pioposed method is efficacious m compressing
the 6952 blocks it contains

These blocks, as seen in the pievious section, aie the

most frequently occumng symbols as pei the empnical distribution

Compaiccl to

the alphabet size of 264, the caidmahty of the codebook is veiy small Hence, theie
will be blocks from input images that cannot be compiesscd via the codebook Foi
blocks appeal at the beginning of the codebook In theoiy, howevci, the limning time is constant
since the block dimensions as well as the size of the codebook aio fixed

33

that purpose, we designed the row-column reduction coding (RCRC) to compress
8 x 8 blocks of a binary matrix, V , that are not in the codebook, T> In this section,
we illustrate how the algorithm works

2.4.1

The R C R C Algorithm

RCRC is an iterative algorithm that removes redundancy between row vectors
and column vectors of a block and functions as follows

For each 8 x 8 block b ,

b G V , b ^ V, RCRC generates a row reference vector (RRV), denoted as r and a
column reference vector (CRV), denoted as c

Vectors r and c may be viewed as

8-tuples which can acquire values r, = {0,1}, c, = {0,1}, for i = 1, 2,

,8

These

vectors are iteratively constructed by comparing pairs of row or column vectois from
the block b

If rows or columns are identical m a given pair, then the first vector

m the pair eliminates the second vector, thus reducing the block If the two vectois
are not identical, then they are both pieserved The eliminations or preservations of
rows and columns are stoied m RRV and CRV, respectively, which are constructed
m a similar way The iterative construction procedure is exposed m what follows foi
the case of RRV, while noting that the same proccdme applies to constructing the
CRV
Let hl3 denote the ? th low of block b, for j = 1,2,

,8

RCRC compaies rows

m p a n s starting with the first two row vectors m the block, (b ] j ,b2 : ,)

If bi-, = b2-,,

i"i = 1 r 2 = 0, and low b 2 j is eliminated fiom block b

Next, bij rs compared wrth

b 3 j and, rf they are equal, a value of 0 rs stored m r 3

If, however, bv, ^ ba-,, then

a value of 1 rs assrgncd to r 3 , implying that the t h u d low has been pieserved, and
the RCRC will create the new pair ( b ^ b ^ ) to compare as above

Thrs procedure

rterates until RCRC compaies rows m the parr that contains the last row of block b
By convention, rz = 1 means that the iih row of the block has been prcscived while
34

i"j = 0 marks an eliminated row. Clearly, r! and Ci will always take on a value of 1.
The result of these RCRC operations will be a row-reduced block. Next, RCRC
constructs the column reference vector based on the row-reduced block. There will
be 7 pairs of column vectors to compare and at most 7 pairs of entries to compare
depending on the number of eliminated rows. CRV is constructed in a similar way as
RRV through the procedure illustrated above. The end result will be a row-columnreduced block (RB). The block is encoded as a sequence of bits, where the first 8 bits
represent RRV, the second 8 bits represent CRV, and the remaining bits represent
RB. The minimum size RB can assume is 1 bit. Thus, the maximum compression
ratio attainable by RCRC is (64 - 17)/64 = 73.44%. Figure 2.8 illustrates the RCRC
algorithm for some input vector v.
The RCRC decoding process is straightforward. The number of l's in RRV and
CRV indicates the number of rows and columns in the reduced block, respectively. If
r% = 1 and r l + i = 0, then the row in block b having index i + 1 will be reproduced
exactly by the row with index i. Also, if vt = 1 and the k consecutive entries are all
equal to 0, then the decoding procedure will reproduce k copies of the Ith row of block
b to construct that paiticular portion of the block. Having reconstructed lows, the
decoding of columns proceeds in similar ways.
In Table 2.1 of Section 2.3.2, we illustrated how entropy and expected sample size
vary with different block dimensions. Having exposed the details of RCRC, Table
2.4 illustrates how RCRC compression changes with varying block dimensions. T h e
last column shows the probability that any two vectors match. Here, we consider
pixels to take on values independently. This fact is, however, not realistic because
for binaiy images a pixel is dependent on its neighboring pixels. For simplicity, let
F(vj = 0) = p and P(vj = 1) = 1 — p denote, respectively, the probability that the
ith vector entry has a value of 0 and 1. Then, the probability that two vectors v and

35

Row-column reduction coding
{Input: Vector v, ||v|| = 8}
{ O u t p u t : [RRV, CRV, RB]}
i = 1
while i < 7 do
r, = l
j = t+ l
while bjfc = bjk a n d j < 8, k = 1, 2 , . . . , 8 do
b = b \ b jfc
J = J + 1
end while
i = 3
end while

Figure 2.8: The RCRC algorithm
u of same size n match is F ( v = u) = (2p 2 — 2p + l ) n . 8
It can be observed from Table 2.4 that, for 2 x 2 blocks, the maximum compression
ratio RCRC can achieve is —25%. That is, RCRC fails to compress 2 x 2 blocks;
instead, it adds 25% more bits to the compressed data stream.

The compression

increases as a function of block dimensions, whereas the probability that any two
vectors match decreases exponentially.

3

Assume the random events {vj = u;} are independent. Then, foi i = 1 , . . . ,n, we have.
P(v = u) = P | / \ [(vi = 0 A u, = 0) V (v; = 1 A U i = 1)] I
n

= n [p(vi=°)p(ui=°)+p(v'=i)p(ui=:)i
W + (l~pY]

= ( 2 J / - 2p + 1)" .

36

Table 2.4: Effect of block dimensions on RCRC
Block
2x2
3x3
4x4
5x5
6x6
7x7
8x8
12 x 12
16 x 16

2.4.2

RRV
2
3
4
5
6
7
8
12
16

CRV
2
3
4
5
6
7
8
12
16

•Ti'-t'rmn

^ J^max

1
1
1
1
1
1
1
1
1

-25.00%
22.22%
43.75%
56.00%
63.89%
69.39%
73.44%
82.64%
87.11%

P ( v = u)
(2p2-2p+l)2
(2p2 - 2p + l ) 3
(2p2 -2p + l ) 4
{2p2 - 2p + l ) 5
( 2 p 2 - 2 p + l)6
(2p2-2p+l)7
(2p2
-2p+l)8
(2p2-2p+l)12
(2p2
-2p+l)16

An Example

Figure 2.9 shows a binary image partitioned into regions along with the binary
matrix representing a portion of the partition depicted by the extended lines. This
binary matrix contains eight 8 x 8 blocks. We use some of these blocks to illustrate
how the RCRC algorithm works.
Block 1

H
A binary image

Block 2

Block 3

0 0 111111 1 1 1110
0 0 0
0 0 1 1 1 1 1 1 111 1110
0 0 0
111 1111
01
0 0 0
111 1111
01 11
0 0 0
1 1 11 1 1 1
01 11
0 0 0
111 1111
0 0 1 11
0 0 0
0 0 0 1111 111 1111
10 0
1 1 0
0 0 0 1 111 111 1111
1 1 1
1 1 1
0 0 0 1
1 1 1
111
0 00 1
1 1 11 1 1
0 0 11
111
111
0 0 11 1 1 1
11111
1 1 1
0 111
11111
11111111
0 111
11111
11111111
11111
1111
11111
1111

Block 5

Block 6

Block 7

Block 4
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0

Block 8

Matrix corresponding to the upper-left part of the
depicted section in the image: eight 8x8 blocks

Figure 2.9: Portion of a binary image and its corresponding 8 x 8 blocks

In Figure 2.10, the row reduction operation is applied on Block 2 of the binary
matrix in Figure 2.9. The row reference vector (RRV) is shown on the left of the block.
In this case, the first row is identical to the second row, which is removed from the
block. Therefore, a value of 1 is placed in the first location of RRV (for the first row),
37

and a value of 0 is stored in the second location of RRV for the second (eliminated)
row. Next, row 1 is compared with row 3, but the two rows are not identical. Hence,
a value of 1 is placed for row 3 and the pair comparison proceeds between row 3 and
rows 4, 5 , . . . , 8. Finally, a value of 0 is placed for the corresponding RRV locations
of rows 4 to 8, which are eliminated since they are identical to row 3.

RRV

1
0
1
0
0
0
0
0

1 1 1 1 110 0
1 1 1 1
1 0 0
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1

1 1 1 1 1 1 0 0
1 1 1 1 1 1 1 1

Row — reduced block

Figure 2.10: The row-reduction operation applied on a block

The column-reduction operation is applied on the row-reduced block, as depicted
in Figure 2.11. The column-reference vector (CRV) is shown on top of the block. In
this case, the first column is identical to and eliminates columns 2 to 6. Also, column
7 eliminates column 8. This yields the reduced block, RB, shown on the right of the
column-reduced block. For this example, the output of RCRC is a concatenated string
composed of the RRV (the first group of 8 bits), CRV (the second group of 8 bits), and
RB (the last 4 bits), all displayed as one vector: 10100000 10000010 1011, for a total
RRV

CRV

RB

of 20 bits. The compression ratio achieved for this block is (64 — 20)/64 = 68.75/
CRV

10

0 0 0 0 1 0

1 1 1 1 1 1 0

0

1 1 1 1 1 1 1

1

1 0
1 1

The reduced block
Figure 2.11: The column-reduction operation applied on the row-reduced block in Figure
2.10

To clarify the decoding process, we consider the row-column reduced block of the
preceding example. The number of l's in RRV and CRV shows the number of rows
38

and columns of the reduced block, respectively. The output 10100000 10000010 1011
RRV

CRV

RB

contains two ones in the first group of 8 bits (the RRV), and two ones in the second
group of 8 bits (the CRV). This means that there are 2 rows and 2 columns in the
reduced block. T h a t is, the first two bits of the reduced block, '10', represent the first
reduced row, and the second two bits, '11', represent the second reduced row. Then,
given the l's and 0's in the reference vectors, we construct the rows and columns
of the original block.

Figure 2.12 shows the column reconstruction based on the

column-reference vector.
CRV - - ^ 1 0 0 0 0 0 1 0
•i

i-

1 0
1 1

I

^

\

/

1 1 1 1 1 1 0 0
1 1 1 1 1 1 1 1

The reconstructed colun

The re duce<i block

Figure 2.12: Column reconstruction based on the column-reference vector (CRV)

In Figure 2.12, CRV informs the decoder that columns 2 to 6 are exact copies of
column 1, and column 8 is an exact copy of column 7. The block on the right depicts
this operation. Figure 2.13 shows the row reconstruction process, which terminates
RCRC decoding and we obtain the original block of Figure 2.10.
<

RRV

0 0
0 0

1111110 0
11111111

Row - reduced block

The decoded block
Figure 2.13: Row reconstruction based on the row-reference vector (RRV)

39

2.4.3

A Word on R C R C Compression Probability

One should likely ponder about the probability of an 8 x 8 block being compressed
by RCRC In Section 2 4 1, we established that the probability of two vectors of length
8 being identical is given by (2p 2 — 2p+ l ) 8 , where p denotes the probability that the
pixel has a value of 0 This expression holds for zero-order Markov chains T h a t is to
say, the probability of the curient pixel being 0 or 1 does not depend on the values of
neighboring pixels This assumption is strong, since pixel values m binary images do
depend on neighboring pixel values For simplicity, however, this assumption should
suffice to provide a general idea of the RCRC compression probability
The output of RCRC is a bit stream comprising RRV, CRV, and RB The sizes
of RRV and CRV aie fixed to 8 bits each The size of RB may vaiy from 1 bit to 64
bits

A block is considered compressible by RCRC if the total length of the RCRC

output is less than 64 bits Thus, let R denote the random event that an 8 x 8 block
is compressible by RCRC The objective of this section is find an expression foi the
probability P(R)

Let R' denote the complement of event R Then, P(R) =

We focus on determining P(R'),

l-P(R')

as it is simpler to considci the cases when RCRC

fails to compicss a 8 x 8 block
The length oi the RCRC output is 8 + 8 + L{RB)
bits Therefore, L(RB)

and it should be less than 64

< 48 The random event B! thus denotes R!

L(RB)

> 48

The size of the leduced block, RB, is gieater than 48 bits in the following four cases
(1) Only one row and only one column has been eliminated

T h a t is, L(RB)

= 49

bits
(2) Only one row and no column has been eliminated T h a t is, L(RB)

= 56 bits

(3) No low and only one column has been eliminated T h a t is, L(RB)

— 56 bits

(4) No low and no column has been eliminated T h a t is, L(RB)
40

= 64 bits

Each random event of each case may be viewed as a success/failure event. Therefore,
a binomial distribution is suitable to study their probabilities. In the end, the sum of
probabilities of these four cases will give the value P{R')- Let us now consider these
cases. In what follows, we let (2p2 — 2p + l) 8 denote the probability that two vectors
match and q = (2p2 — 2p + 1).
(1) Let C\ denote the random event "Only one row and only one column has been
eliminated", E\ denote the random event "One row has been eliminated", and E^
denote the random event "One column has been eliminated". Events E\ and E<i
are independent, in the sense of Probability Theory. Thus, P{C\) = P{E\)P{E'i).
In total, there are only 7 pairs of consecutive rows (or columns) to compare and
we require only one pair out of 7 to match. However, when a row is eliminated,
there are only 7 entries per column pair to compare. Then,

P(El)=Qq8(l-q8f

(2.13)

W=Q?7(1-?7)G

(2-14)

and

Finally, from (2.13) and (2.14) we have:

P(C1)=49qls{l-<f)6(l-q7)*.

(2.15)

(2) Let C'2 denote the random event "Only one row and no column has been eliminated" , Ex denote the random event "One row has been eliminated", and E-i
denote the random event "No column has been eliminated". Events E\ and Ei
are independent; thus, / 3 (C 2 ) = P(Ei)P(E2).

Similar to the previous case, there

are only 7 pairs of consecutive rows to compare and we require only one pair out
41

of 7 to match, and no pairs of columns. Once again, a row is eliminated and there
are only 7 entries per column pair to compare. Then,

[7i)q*(i-qr

p(E1) =

(2.16)

and
P(E2) = [7QWni

- q7)7 = {I - q7)7

(2.17)

Finally, from (2.16) and (2.17) we have:

7\7

p(c2)=[i)q»(i-qy(i-q7)

(2.18)

(3) This case is similar to Case (2). Let C3 denote the random event "No row and
only one column has been eliminated". Then,

P(C3

,8/i

„8

(i - <?T(i - q7)7 •

(2.19)

(4) For this case, no row or column is eliminated. Let C4 denote the random event
"No row and no column has been eliminated". Then, the probability of this event
is:

P{CA

8\7

(98)0(l-O

:w

\14

(2.20)

The random events Ci, C?, C3, and C\ are mutually exclusive and, therefore, P(R')
is the sum of the probabilities of these events. From algebraic manipulations of the
expressions above, P(R') may be succinctly expressed as:

P(R') = (1 - g8)6 [49g15(l - q7f + 14^(1 - q7)7 + (1 - cf)8]

42

(2.21)

and the required probability P{R) of an 8 x 8 block being compressed by RCRC is,
therefore
P(R) = 1 - (1 - q8f [A9q15(l - q7)6 + 14g8(l - q7)7 + (1 - g8)8] ,

(2 22)

where q = (2p2 — 2p + 1)

1

08

06

o,
04

02

0

0

02

04

06

08

1

P

Figure 2.14: A plot of P(/?) as a function of p for 8 x 8 blockb
The plot m Figuie 2 14 illustrates the probability m formula (2 22) as a function
of p Recall that p denotes the probability of a pixel assuming a value equal to 0
From the giaph, we can observe that RCRC performs well if the probability value p
is relatively small oi relatively large, and docs not do well if O's and l's are uniformly
distributed

A fust-derivative analysis shows that the minimum of the function is

leached at p = 0 5 when pixel values aie unifoimly chstrrbutcd That is, if p = 0 5,
P(R) is practically small A umfoim distribution of pixel values among binary rmages
is not realrstically the case because of the high dcgicc of inhcicnt coirelatron and
43

4x4
7x7
8x8
9x9

16x16

0

02

04

06

08

1

P

F i g u r e 2.15: A plot of P(R) as a function of p foi various block dimensions
redundancy among pixels [18]. For example, if one is sketching a human eye, then the
black and white pixels cannot be uniformly distributed, since the eye has a particular
topological shape and the prior sequence of pixel values will determine the cm rent
pixel value.
Figuic 2.15 plots P(R) as a function of p foi various block dimensions. Obscive
that the probability P(R) for 7 x 7 and 9 x 9 blocks is close to the probability for
8 x 8 blocks. As an example, if p = 0.1, then P(R)

= 0.28 for 16 x 16 blocks, but

P{R) = 0.75 foi 8 x 8 blocks.

2.5

Computational Complexity

Here, we give an analytical time complexity analysis for the proposed method.
Let h and w be the dimensions of an input binaiy image matrix. Assume, without

44

loss of generality, that the image dimensions are divisible by 8 For each 8 x 8 block,
the algorithm searches the codebook for a matching block

If a match is detected,

the block is compressed and the next 8 x 8 block is processed

The codebook has

a fixed size of 6952 entries, therefore, it has 0 ( 1 ) running time

If a match is not

found in the codebook, RCRC attempts to compress the block In the context of the
proposed method, the RCRC input is of fixed size and has 0 ( 1 ) running time, too
The codebook search and RCRC are executed for at most ^wh

8 x 8 blocks Thus,

the total complexity of the proposed method is Q(hw)
In Section 2 3 2, we noted that codebook entnes are sorted in descending order
based on their empirical probabilities

While this fact does not contribute to the

analytical time complexity, we noticed it had some positive impact on the empirical
complexity metric of the proposed method

2.6

The Coding Scheme

The encoding piocess of the proposed method is simple and stiaightfoiwaid

In

oidci to distinguish between blocks compicsscd by the codebook, blocks compiessccl
by RCRC, oi uncompiesscd blocks, wo considoi thice cases which aic s u m m a n / c d
m Tabic 2 5

Based on these cases, we constiuct a general model foi the expected

compression ratio attainable by the encoding scheme of the pioposed method
C a s e 1 If a block is found m the codebook, we use the corresponding Huffman code
This pioviclcs for two sub-cases
Case

la

If the coiiespondmg Huffman code is the shortest in the codebook,

I c 1 bit m the case of the constructed codebook, then assign bits 1 1 to encode that block The leason we use two ovcihcad bits foi the block that has
the shoitest code is based on the empnical fact that this cntiy compuscs
45

about 57%-70% of the total blocks found in the codebook, depending on
the type of bi-level data.
Case

lb If the corresponding Huffman code has a length L{C(b)) > 1 bit, then
assign bits 0 0 to encode the block. As stated in Section 2.3.2, the length
of Huffman codes in the codebook varies from 1 bit to 17 bits. Thus, after
0 0, use 5 additional bits to encode the length of the codeword that follows.
Lastly, add the Huffman code to the bit stream. For instance, if a block b
that is found in the codebook has a code of length 7, then the block will
be encoded as 0 0 + 00111 + C(b). In this case, the second group of 5 bits
(00111) tells the decoder that C(b) has a length equal to 7 bits; thus, the
decoder will read the subsequent 7 bits.

C a s e 2 If the block is compressed by RCRC, then use overhead bits 0 1. Following
these two bits is the bit stream RCRC produces. The decoding process for
RCRC is explained in Section 2.4.1.
C a s e 3 If the block is neither found in the codebook, nor compressed by RCRC, we
use the two overhead bits 1 0, after which the 64 brts of the incompressible block
arc appended.

Table 2.5: The coding scheme
Case
la
lb
2
3

Coding Bits
11
0 0 + 5 bits + C(b)
0 1 + CRCRc(b)
10 + 64 bits

Description
For the block with the shortest code in
the codebook
For other blocks found in the codebook
For blocks compressed by RCRC
For uncompressed blocks

The decoding process is straightforward.

If the decoder encounters 1 1, then it

recognizes the symbol as the codebook entry with the shortest code. If the decoder
46

reads 0 0, it identifies the codebook entry whose code length, L(C(b)), is given by the
next 5 bits. Then, the decoder reads the next L bits to determine the codewords
which leads to the corresponding block in the codebook. If the decoder encounters
0 1, then the RCRC decoding process follows. Finally, if the decoder encounters 1 0,
then it reads the subsequent 64 bits.
The details of the proposed method exposed in the previous sections and the cases
illustrated in Table 2.5 motivate the following general encoding model. Let BH,

BR,

and By be the partitions of set By (see Definition 2.6 in Section 2.2). Recall that
L(C(b)) and L(CRcRc{b))

denote, respectively, the length in bits of the Huffman code

of block b and the length in bits of the RCRC output. Let £ be a random variable
denoting the random event £ = 63, by G Bj and let P(£ = 63) = p(b^) denote the
probability of £. Based on an empirical approach, one can evaluate the probabilities
P(Z = bH) = p{bH), bH e BH] P(£ = bR) = p{bR), bR G BR] and P(£ = bv) =

p(bv),

by G By. Then, the expected compression size in bits is given by the following model:

Ev\i\=

2P({bTI\L(C(bH))

= 1})

+ T,bHeBlI,L(c(bII))^P(h^)[7
+ EbReBRP(bR)

Case l a
+ L(C(bH))]

[L(CRCRC(bn))}

Case l b
Case 2

+ 6 6 E 6 c / e /i [ / P(M •

Case 3

Here, Ep[-} denotes the expectation operator. Formula (2.23) provides a general model
for the compression size in bits. The expected compression latio, by foimula (2.6) for
k = 2. is equal to:
=

EpK] • W 6 4 _
hw

EM
64

where h and w are the image dimensions, and hiv/M

v

;

is the number of 8 x 8 blocks

in the image. In general, the expected compression ratio depends on the distribution
of 8 x 8 blocks of an input binary image.

47

One may speculate on the overhead amount of bits this scheme uses for encoding
blocks. Specifically, if the 8 x 8 block with a Huffman code length of 1 bit occurs, say,
50% of the time and it will be encoded with two bits (Case la), then the expected
compression size for that block will double. Also, 7 overhead bits are used for the
remaining Huffman codes in the codebook (Case lb), whereas ideally Huffman codes
should solely be employed as per their purpose. While this encoding per se is correct,
it seems reasonable to look for a more efficacious coding scheme for the proposed
method. This is the objective of the next section.

2.7

Alternative Coding Scheme

The reason why the coding scheme introduced in Section 2.6 incurs a considerable
amount of overhead encoding information lays on the fact that RCRC interferes with
codebook coding. Consequently, the compressed bit stream contains an admixture
of strings repiesenting Huffman codes and strings representing the RCRC output per
block. Thus, a distinction between such encoded bits needs to be made explicitly for
the decoder to function coriectly. The cases covered in Table 2.5 are sufficient and
nccessaiy to reconstruct the original binaiy image exactly. This issue being stated,
in this section we look at an alternative coding scheme and we conduct a sensitivity
analysis between the two schemes to study under what conditions one outperforms
the other.
In oider to understand the mechanism of the alternative coding scheme, it is
impoitant to illustrate with a simple example how Huffman decoding works. Consider
the string LILIANA of length 7 and relative probabilities of letters: P(L) = P(I)
P{A) = 2/7 and P(N)

=

= 1/7. The Huffman algorithm will yield the following codes

for the four letters: C(L) = 00, C(I) = 01, C(A) = 10, and C(N) = 11. Figure 2.16

48

is an exhibit of this particular Huffman tree, where the labels on the edges denote
the codes employed for encoding and the nodes represent the letters and their parent
nodes. The same tree has to be supplied to the decoder, which decodes a given bit
string if a leaf node is encountered in the tree.

Q

F i g u r e 2.16: Huffman tree for string LILIANA

Suppose the decoder receives the string S = 00010000011011. It reads the fiist
bit, S[l] = 0, and starts to traverse the tree in Figure 2.16 from the root to the node
on the left, since the label on the left edge is 0 and equals S[l]. However, the node is
not a leaf node and the decoder reads the next bit in the sequence, which is S[2] = 0.
Finally, leaf node L is reached and the decoder outputs letter 'L'. This procedure
continues until the end of the received string is encountered. It is easy to check that
the decoded string will be LILLIAN.
Technically, the decoding process terminates when the decoder encounters a special signal called the end-of-file (EOF) signal. In practice, given an alphabet A of
cardinality ||*A||, an additional EOF symbol is added to the alphabet with a very
small probability value. This symbol is treated the same way as the other members
of A, and will thus be included in the Huffman tree. The E O F signal will have its
own binaiy code. Since it is assigned a very small probability value (because it occurs
only once at the end of the string), then its binary code is usually the longest. The
effect of such a code is practically negligible [8].
49

In light of the aforementioned, we introduce two 'flag' signals for the alternative
coding scheme of the proposed method. The first signal is the
(BCC) signal, and the second is the incompressible-block

break-codebook-coding

(ICB) signal. These two flags

are considered as members of the alphabet of 8 x 8 blocks and will be added to the
constructed codebook with probabilities PBCC and PICB, and the Huffman algorithm
will assign binary codes to both flags. The purpose of the BCC block is to mark the
interruption of codebook encoding for an input 8 x 8 block and the commencement
of the RCRC encoding for that block. If RCRC encodes the block in less than 64
bits, then the compressed bit stream for that block will consist of the Huffman code
of BCC, C(BCC),

and the RCRC output, CRcRc(b).

If the block is incompressible,

then the 64 bits are preceded by the Huffman code of ICB,

C(ICB).

Decoding is straightforward. If the decoder encounters bits C(BCC),

it will tra-

verse the tree to decode flag block BCC. T h a t block calls for an interruption of
Huffman tree decoding and the decoder turns to RCRC decoding. If bits C(ICB)

are

encountered, then the flag block ICB informs the decoder to read the subsequent 64
bits in the compressed bit stream.
The piobabilitics PBCC

an

d PICB determine the Huffman codes for the two flag

blocks. These values aie assigned emphically based on the aveiagc late of RCRC
compiession and the average percentage of incompressible blocks for large data sets.
For a large variety of binary images, we obseived that, on average, 93% of the blocks
aie compressed by the codebook. Out of the remaining 7% of blocks, 5% are compressed by RCRC and 2% lemain uncompressed. The constructed codebook for this
scheme has 6954 entries. The Huffman codes foi flag signals BCC and ICB arc 1110
and 110001, respectively.
The alternative coding scheme has some apparent advantages ovci the coding
scheme described in Section 2.6. First, it eliminates Case l a in Table 2.5 as it does not

50

require two overhead bits to encode the codebook block with the shortest Huffman
code.

Second, it eliminates the 7 overhead bits of Case l b that are used for the

remaining Huffman codes. We observed that the empirical occurrence of the blocks
covered by Cases l a and l b was larger than the blocks compressed by RCRC and
the incompressible blocks, on average. Therefore, the alternative coding scheme is
expected to yield better compression ratios for binary images. Table 2.6 summarizes
the three cases of the alternative coding scheme.

Table 2.6: The alternative coding scheme
Case
1
2
3

Coding Bits
C(b)
C(BCC) + CRCRc(b)
C(ICB) + 64 bits

Description
For blocks in the codebook
For blocks compressed by RCRC
For uncompressed blocks

As an example, consider the 8 x 8 blocks in Figure 2.17. Table 2.7 shows the blocks
compressed by the codebook, the blocks compressed by RCRC, and one incompressible
block. The resulting encoded bit stream is illustrated in Figure 2.18. In this example,
operator || denotes string concatenation. Flag block BCC informs the decoder that
the subsequent 8 + 8 + L(RB)

bits will be decoded using the RCRC algorithm, whereas

flag block ICB tells the decoder to merely scan the subsequent 64 bits.

Table 2.7: Encoding of blocks in Figure 2.17
Block b
1
2
3
4
5
6
7

b GV
YES
YES

L(CRCR.c(t>)) < 64

Incompressible

Final C o d i n g
_ _

YES
YES
YES
YES
YES
YES

51

C(b2)
C(BCC)\\CRCRc(b3)
C(b4)
C(h)
C(b6)
C{BCC)\\CRCRC{b7)
C(ICB)\\bi8

Block 1

Block 2

Block 3

Block 4

0 0 111111 1 1 1 1 1 1 0 0 oooooooo oooooooo
0 0 111111 1 1 1 1 1 1 0 0 oooooooo oooooooo
0 0 111111 1 1 1 1 1 1 1 1 oooooooo oooooooo
0 0 11111 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 oooooooo
0 0 0 11111 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 oooooooo
0 0 0 11111 1 1 1 1 1 1 1 1 1110 0 0 0 0 oooooooo
0 0 0 11111 1 1 1 1 1 1 1 1 11110 0 0 0 oooooooo
0 0 0 11111 1 1 1 1 1 1 1 1 111110 0 0 oooooooo
0 0 0 11111 1 1 1 1 1 1 1 1 111110 0 0 1 0 0 0 0 0 0 0
0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 10 0 0 10 0
0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 0 0 0 01
0 0 1 1 1 1 1 1 • 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 10 0 10 0 0 0
0 1111111 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 0
0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 10 0 10 0
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 1 0
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 01

Block 5

Block 7

Block 6

Block 8

Figure 2.17: Example of eight input 8 x 8 blocks
C(61)||C(62)||C(BCC)CflcJRc(b3)||C(64)l|C(65)||C(66)||C(SCC)CflCfic(67)l|C(/CB)68
Figure 2.18: Compressed bit stream of blocks in Figure 2.17
As for the previous encoding scheme, we provide the following general model
for the alternative scheme. Let BH, BR, and By be the partitions of set By (see
Definition 2.6 in Section 2.2). Let £ be a random variable denoting the random event
£ = by, by G By and let P ( £ = by) = p{by) denote the probability of £. Based on an
empirical approach, one can evaluate the probabilities P ( £ = bn) = p{bn), bjj G BH;
P(i = bR) = p(bR),

bR G BR; and P(£ = bv) = p(bu), bv G Bv.

Then, the expected

compression size in bits for the alternative coding scheme is given by the following
model:

E;[£] =

EbHeBH,bH^{BCC,iCB}P(bn)L(C(bH))

Case 1

b
+E
J2bR
[L{C{BCC]) + L(CRCRC(bR))}
6 ef Bi M
e Bn)s KM[^(^c))u(c

Case 2

[L{C(ICB))+ 64] E M E ^ P M ,
where, E*{-\ denotes the expectation operator.

52

(2.25)

Case 3

Formula (2.25) provides a general

model for the compression size in bits. The expected compression ratio, by formula
(2.6) for k = 2, is equal to:
E*[Z}-hw/64
El[£]
CR* = - ^ '— = - ^ ,
hw
64 '

y

2.26J

where h and w are the image dimensions, and hw/64 is the number of 8 x 8 blocks in
the image. We denote the expected compression size of the alternative coding scheme
as E* to distinguish it from its counterpart Ep given in formula (2.23).
One may notice that the overhead bits of the alternative coding scheme incurred
for blocks compressed by RCRC and for incompressible blocks are larger than the
amount given for Cases 2 and 3 in Table 2.5. Specifically, L(C(BCC))
L(C(ICB))

= 4 and

= 6, which are greater than the 2 overhead bits used for the same cases

in Table 2.5. For that reason, we aim at determining which scheme outperforms the
other on average and under which conditions. Thus, we perform a sensitivity analysis
using a Monte Carlo simulation between the models in formulas (2.23) and (2.25) to
observe how the two proposed coding schemes perform for various input 8 x 8 blocks
and their distributions. From this point onward, we denote as C0 the coding scheme
introduced in Section 2.6 and as CA the alternative coding scheme introduced in this
section.

2.8

Sensitivity Analysis on the Coding Schemes

Based on the results in Table 2.3, we geneiate 8 x 8 block samples from a log-seiics
distribution with parameter 9 — 0.995 and probability mass function P(n) =

^ " ^ l

with support n = {1, 2 , . . .}. By convention, n = 1 denotes the 0-valucd block, n — 2
denotes the 1-valued block, and the other ordinal values denote the remaining 8 x 8
blocks in the coclcbook, which have been initially soitcd in descending older pei the
53

coiresponding empnical probabilities (see Section 2 3 2)
The procedure we constructed for the Monte Carlo simulation is as follows
1 Generate 100 1024 x 1024 binary images, for a total sample size equal to 25600
8 x 8 blocks

In order to obtain results with an error less than 2%, approxi-

mately 24000 blocks per sample are required 9
The binary images are generated by retrieving blocks from the codebook with
probability D and constructing the remaining 8 x 8 blocks with probability
(1 — D)

The probability of generating a white pixel for each of the remaining

blocks is denoted as P

For instance, D = 0 8 and P = 0 5 imply that blocks

are generated from the codebook 80% of the time and aie constructed uniformly
(I e the probability of a white or black pixel is 0 5) 20% of the time

Parameter

D vanes from 0 to 1, whereas parameter P vanes from 0 5 to 1

Sensitivity

analysis is earned out on both parameters

Note that P < 0 5 implies that

the probability of constructing a black pixel is 1 — P > 0 5, which is technically
coveied by the range 0 5 to 1 by considering the inverted pixel This observation
reduces the total numbei of samples required to conduct the simulation and
sensrtivity analysis Also, P = 1 implies the generation of all-white blocks and
these blocks aie generated from the codebook

Therefoie, wc consider a value

equal to 0 98 as the maximum langing value for paiameter P
In this experiment we let D = {0,10, 20,

, 100} and P = {0 5, 0 6,

,0 98}

For each parameter value, we generated 100 bmaiy images as desenbed above to
evaluate, on average, the compression ratios yielded by the two coding schemes
q

The en or e m Monte Cailo analysis is given by the expiession e = 4 = , wheie a is the standard
deviation and N is the sample size Heie, a is estimated by the population's standaid deviation
between the minimum and maximum compicssion ratios pei block, which aie —10 9375% (with
occurience only when all blocks aie lncompicssiblc) and 0 9362 (imposed by the cntiopy), respectively The eiior is given by the avciage of the minimum and maximum compicssion ratio pei block
multiplied by 2% Substituting these values in the eiior expiession above and solving for N, vields
N w 24000

54

2. For each value of parameter P and for every D, evaluate Ep and E* averaged
over 100 samples.
The results for each value of P are illustrated in Tables 2.8 to 2.13. From the
results, we can observe how the two coding schemes perform under various values
of parameters D (% of blocks found in the codebook) and P (the probability of a
white pixel). The farther away D gets from the break-even point, the more does the
discrepancy between Co and CA increase, wherein a smaller D favors CQ and a larger
D favors CA- For small values of D (typically D < 10%) and for 0.5 < P < 0.8, the
coding schemes do not compress: the negative ratios imply an overhead coding size
larger than the original image size.
It can be noticed that parameter P does not have any major effect on the relative
average performance of the two coding schemes. In all graphs, the break-even point
is between 45% and 60% and the two schemes perform almost similarly in this range.
Therefore, it may be conjectured that the alternative coding scheme, CA, is preferable
over scheme CQ if the percentage of blocks found in the codebook is greater than 60%.
Moreover, the results in Table 2.13 show that for large values of P, CA attains better
compression ratios than Co for all values of D.
In Section 2.4.3, we illustiatcd theoretically the probability, P{R), of RCRC compressing a block. We established that P(R) depends on the probability P of white
and black pixels in the block. We concluded that for relatively small or relatively
laige values of P, the chances RCRC compresses arc high. In light of that, the parameter value P affects the probability P(R).

Obseive the results in Table 2.13 for

P = 0.98. Based on equation (2.22), we have P(R) — 0.9999. Here, we may speculate
that no block is incompressible: thus, CA should be the pieferred coding scheme, as
also veiified graphically.

55

Table 2.8: Simulation results for P = 0.5
D (%)
0
10
20
30
40
50
60
70
80
90
100

Er
-0.0311
0.0434
0.1502
0.2149
0.2888
0.3821
0.4812
0.5528
0.6593
0.7585
0.8754

E!
-0.0935
-0.0082
0.1115
0.1859
0.2696
0.3748
0.4886
0.5725
0.6934
0.806
0.9366

Table 2.9: Simulation results for P = 0.6

D (%)
0
10
20
30
40
50
60
70
80
90
100

RP
-0.0312
0.0529
0.1273
0.2177
0.2952
0.3732
0.4756
0.5404
0.6593
0.7779
0.8799

V*

Ep
-0.0936
0.0019
0.086
0.1899
0.2788
0.3692
0.481
0.5575
0.6892
0.8216
0.9404

56

Table 2.10: Simulation results for P = 0.7

D(%)
EP
EP
0
-0.0309 -0.093
10
0.0515 0.0011
0.0942
0.134
20
30
0.2158 0.1869
0.2912
40
0.308
50
60

70
80
90
100

0.3867
0.4766
0.5427
0.6562
0.7665
0.8748

0.3815
0.4832
0.5587
0.6908
0.8133
0.9381

1

1 1
09 •
OS •

i
o
j

w£

c

1 I
1

.
I

o

03 -

'X^

02s'jr

0 1-

s'lr

o •
• ' /

0 1 -K

10

20

30

40

50

60

D(K)

Table 2.11: Simulation results for P = 0.8

D(%)
0
10
20
30
40
50
60
70
80
90
100

EP

p

-0.019
0.06
0.1437
0.2387
0.319
0.4008
0.4552
0.5618
0.6627
0.762
0.8762

-0.0777
0.0116
0.1074
0.214
0.3066
0.397
0.4616
0.5814
0.6954
0.8098
0.9381

57

7

Table 2.12: Simulation results for P = 0.9
11

D(%)
EP
EP
0.116 0.0751
0
0.1792 0.1471
10
20
30
40
50
60
70

80
90
100

0.2402
0.3298
0.3999
0.4713
0.5475
0.6187
0.6814
0.7892
0.8745

0.216
0.3187
0.3965
0.4782
0.5656
0.6473
0.7171
0.8387
0.9366

09 08 -1
07 -

g
(0

c
o

06 •

j

^

'

s

Q. 04 •
E
o
u 03 -

'-^S

02 -

C

10

20

30

40

50

60

70

80

90

100

D(%)

Table 2.13: Simulation results for P = 0.98
1

r>(ttA)
0

10
20
30
40
50
60
70
80
90
100

-I

OS

0.6745
0.7032
0.7105
0.7298
0.7436
0.7723
0.7818
0.801
0.8254
0.8496
0.8749

0.7057
0.7369
0.7434
0.7678
0.7816
0.8175
0.8285
0.8517
0.8797
0.9076
0.9379

08
07
06
05
04

E;-03
02 -

01

40

50

D(%)

58

60

In addition to parameters D and P, we conduct sensitivity analysis on the probabilities pw and pt, of white and black 8 x 8 blocks, respectively There are two reasons
we consider these parameteis Fust, pw affects the peifoimance of Co, as discussed m
Section 2 6 Also, CA was designed precisely to reduce the overhead that white blocks
impose on Co Hence, it is important to observe how Co and CA behave under different
values of pw

Second, white and black blocks tend to have the highest frequencies of

occurrence m relatively large samples of binary images The empirical probabilities illustrated m Section 2 3 3 suggest that black and white blocks comprise approximately
73% of blocks Then, reducing the probability of white and black blocks for the sensitivity analysis brings about an increase m the frequency of occuirence of other blocks
with longer code lengths Under such circumstances, we want to observe whether the
two flags of scheme CA mcui more coding overhead than the straightforward coding
scheme Co
The ranges we selected for the probability of white and black blocks are, respectively, pw e {0 1, 0 15, 0 2,

,0 5,0 55} and pb e {0,0 05,0 1,0 15,0 16,

,0 25}

The lowei bound for pw is based on the cmpnical judgment that binary images arc
expected to have a ceitam amount of white background, wheieas the lowei bound foi
Pi IS based on the obseivation that bmaiy images need not ncccssaiily compiise black
8 x 8 blocks Foi instance, bmaiy textual images containing text with thm font faces
and small font sizes (such as Anal, 9pt) do not generally yield black 8 x 8 blocks
Similar to the simulation for paiameteis D and P, we genciate 25600 blocks foi
each value of pw and pb White blocks aie generated 100p w % of the time, black blocks
100pb% of the time, the remaining 1 — (pw + pb) of the blocks are generated fiom the
codebook following a log-senes distribution

This piocedure is repeated for vanous

values of D and P, as lllustiated in the piecedmg sensitivity analysis

59

Figures 2 19 to 2 24 exhibit results for

P=

{50,60,70,80,90,98%},

£ = {0,30,60,90%},
pw = {0 1,0 15, 0 2, 0 3,0 4, 0 5, 0 55},
pb = {0, 0 1, 0 15, 0 17,0 2, 0 22, 0 23, 0 25}

For each of the 56 p a n s (pw,Pb), we graph the average compression ratios yielded by
C0 and CA for 100 1024 x 1024 binaiy images used per pair
Consider the case when D = 0%, l e no blocks are generated from the codebook
The only component left to compress blocks is RCRC with probability P(R)
noted m Section 2 4 3, P(R)

depends on the probability P

As

the higher the value of

P is, the higher the chances RCRC compresses a block become

It can be observed

from Figures 2 19 to 2 22 that for D = 0% and P = {50, 60, 70, 80} RCRC fails
to compress, and most blocks remain incompressible

Notice that for small values

of P the resulting compiession trend is almost flat

For D = 0% and P = 90%

(Figme 2 23), there is compression but at insignificant latcs, whereas foi D = 0%
and P = 98% RCRC compi esses significantly most blocks

Moicovci, foi all P, C0

peifoims better than CA because CA incuis more overhead brts with flags BCC

and

ICD (see Table 2 6) than the 4 overhead bits incurred by C0 (see Table 2 5)
Foi D = 30% and higher, CA outperforms C0 and the compressron discrepancies
tend to increase as D increases Both coding schemes expose increasing compression
trends as the probability pw changes fiom 0 1 to 0 55

It can also be noticed that

the compressron rates yreldcd by CQ fluctuate moic than those yielded by CA FOI
example, consider cases (b), (c) and (d) m Figuie 2 23 In these cases, for every value
of pw, Ep decreases as pb increases from 0 to 0 25 whereas E* is always increasing

60

The reason for this fluctuation lays on the coding of black blocks: Co incurs 7 overhead
bits, whereas CA employs solely the Huffman code of black blocks. Hence, for small
values of pi,, Co will yield higher average compression ratios, but CA is still superior.
This fluctuation lessens when both pw and pb are large, as observed in the figures.
In the preceding simulation results, we stated that scheme CA should be chosen
over C0 for D > 60%. Based on the sensitivity analysis on pw and p^, it can be
conjectured that for some value D* between 0% and 30%, Co outperforms CA for
all D < D*.

Hence, we may conclude here that for all D > D* (or, specifically,

D > 30%), the preferred coding scheme should be CA- For a more solid conclusion,
one needs to conduct sensitivity analyses on all the probability parameters of the two
coding schemes. In practice, however, such simulations incur expensive computational
costs. In all, the results presented here suffice to conclude that the alternative coding
scheme illustrated in Section 2.7 should be the preferred scheme for compressing the
average binary image.

61

0
16.

0 01 0

24

32

a-o%,

P = 50%

40

4.8

56

0 02
0 03

•

0 04

•

0 05
0 06
0 07

•

0 08
0 09
01

fa) D = 0%

(b) D = 30%

D=60%, P = 50%

0 = 90%, P = 50%

(c) D = 60%

(d) D = 90%

1
0 95
09
0 85
08
0 75

12

F i g u r e 2.19: Siixmlation results for P = 50%

62

10

18

0
3

0 01 0
0 02

—

-0 03

•__.

0 04

•

•

0 05
0 06
00?

24

16

,-- ~ 4 ~
•
— ._ _

.-

40

,

<~ -—- _ .
_,. ......
.

48

< _. — _

5.6

_

E,
E

•

'

I

0 08 •
-0 09

32

....

- -

_. .

16

24

32

40

01
D = 3 0 % , P = 60%

0 = 0%, P = 60%

(a) D = 0%

(b) D = 30%

0 95
o

09

"2
c
o
3

085

08

16

24

32

40

48

D = 60%, P = 60%

(c) D = 60%

(d) D = 90%

F i g u r e 2.20: Simulation results for P = 60%

63

0
16

0 01 0

24

32

40

48

0 02
0 03 • .
0 04 •
-0 05

EX-

0 06
-0 07 *
-0 08
-0 09
-0 1

D=0%, P=70%

0 = 30%, P = 70%

(a) D = 0%

(b) D = 30%
1

0 98
o

0 96

.

DC

co 0 94 •* /
Of

9

a

If)

24

3~>

10

1R

r

£

=

fi

0 92
09

<

'^ \

16

?\

M

-ID

D = 60%, P = 70%

0 = 9 0 % , P = 70%

(c) D = 60%

(d) D = 90%

F i g u r e 2 . 2 1 : Simulation results for P = 70%

64

48

S6

001
0 02
0 03
0 04
0 05
0 06
0 07
0 08
0 09
D = 0 % , P = 80%

(a) D = 0%

(b) D = 30%

0 - 6 0 % , P = 80%

D = 90%, P = 80%

(r) D = 60%

(d) D = 90%

0 95
09
0 85

07
0 65

F i g u r e 2.22: Simulation lesults for P = 80%

65

16

24

32

40

16

D = 0 % , P = 90%

(a) D = 0%

16

24

37

24

32

40

D = 3 0 % , P = 90%

(b) D = 30%

40

D= 60%, P = 90%

D = 90%, P - 90%

(c) D = 60%

(d) D = 90%

F i g u r e 2 . 2 3 : Simulation results for P = 90%

66

48

0 575
0 57

i-^

0565 •

w ~

\

0 56
0 555 '
0 55
0 545 *
0 54 ~
0535

„<, .

,•
.— -x '" ~"•
___
—•- ....
—
.......

........
~...... — ,. _

„r

__

r

_

„

-

. ... !_ ......
—r-^\-- - -v—*
\

0

8

0 53

f»

E

" —

1

0 525
16

24

32

D=0% P

40

48

16

56

!8%

24

32

40

D = 30%, P = 98%

(a) D = 0%

(b) D = 30%

D=60%, P = 9 8 %

D= 90%, P = 98%

(c) D = 60%

(d) D = 90%

F i g u r e 2.24: Simulation lesults for P ~ 98°

67

48

2.9

The Codebook Model for Arithmetic Coding

To this point, the illustrated codebook model has been used m conjunction with
Huffman codes

Two coding schemes were developed based on that model

The al-

ternative coding scheme exposed in Section 2 7 motivated us to use the codebook
model along with the empirical piobabihties of 8 x 8 blocks to compress via Arithmetic coding In this case, the codebook compiiscs 6954 blocks (including the two flag
symbols discussed in Section 2 7) along with the lower and upper probability values
of each block

Technically, we implemented an integer-based arithmetic codei [19]

Encoding and decoding for arithmetic coding work the same way as the alternative
coding scheme, CA, illustrated m Section 2 7

2.10

Protagonists and Antagonists

The 100 most frequently occurring blocks are shown m Figuie 2 25 10 The blocks
in cells a l and a2 depict the 0-valued and the 1-valued blocks, respectively Observe
that the most frequently occurung blocks represent geometric pirmitivcs, such as
ponrts (cells a4-a7), lmcs (cells a3 a9, e20, etc ), triangles (cells c4, clO e l 5 , etc ),
lectanglcs (cells b l - b 6 , c9, e t c ) , or a combination thcicof

Moreover, there cxrst

blocks that are inverted veisions of each-other For instance, the block m cell d l is the
inverted counterpart of the triangle m cell c7 This is because the data sample fiom
whrch the codebook was constructed contained a combination of binary rmages with
white and black backgrounds In general, the codebook comprises regulai geometric
constructs
As stated m Section 2 2, the set of all binaiy rmages can be partrtionccl into
images compressible by the codebook, images compressible by RCRC but not by
10

Tho digits in boldface icpic^ent numbcis 10-20

68

1 2 •J 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0
a
1
1
I 1•
l l
™• •
^^
,
•
b 1
•r
•
1
l • , •
n
1
II ^ l
l A
c
1
1F
1
1
'
i
I
m r"
1 k.
L
—
1 1
™
l J "• 1 J1

•

M

mm

i

•• •
i• • •

••••
•
•
••
•
•
•

j

• • " "

L

•• 1
••
-1

T

k

Figure 2.25: Visualization of the first 100 8 x 8 blocks
the codebook, and incompressible images. Images belonging to the first class expose
primitive-geometric construct, given the nature of 8 x 8 blocks in the codebook. We
randomly generated three 64 x 64 such images using the 6952 8 x 8 blocks per their
probabilities. These images are shown in Figure 2.26. Notice the dominance of basic
geometric constructs, which resemble some of the blocks in Figure 2.25. Such binary
images are efficiently compressed by the codebook component of the proposed method,
but such images can also be compressed using RCRC alone. However, as discussed in
Section 2.3.2, the maximum Huffman code length of codebook blocks is 17 bits while
blocks compressed with RCRC take on at least 17 bits. In piactice, binary images
exposing the regularity depicted in Figure 2.26 have a low (empirical) probability of
occurrence.

tf»
(a)

ft MM
(b)

(c)

Figure 2.26: Binary images randomly generated using codebook blocks only

Blocks not in the codebook, but compressible by RCRC, may also evince regular
geometric constructs

Foi instance, the codebook docs not contain all 8 x 8 blocks
69

consisting of 63 O's and a 1 or 63 l's and a 0 Such blocks are efficiently compressed by
RCRC However, RCRC compresses triangles less efficiently
in an 8 x 8 block, the less efficient RCRC becomes

the larger the triangle

Figuie 2 27 illustrates an 8 x 8

block evincing a triangle This block cannot be compressed by RCRC

Figure 2.27: An incompressible geometric primitive
In addition, RCRC does not perform efficiently for 8 x 8 sparse matrices and
matrices where the l's are aligned m a non-hneai fashion, such as diagonally, as
depicted in Figure 2 28 As an instance, RCRC fails to compress 8 x 8 peimutation
matrices, I e matrices t h a t have exactly one entry equal to 1 m each low and each
column and 0 elsewhere

F i g u r e 2.28: An mcompicssible 8 x 8 block

Furthermore, there exist blocks that cannot be compiessed by the proposed method
One way to ensuic plausibly higher compression latcs could be to lesoit to additional
conventional coding techniques, such as Run-Length Encoding

However, such ap-

proaches are not efficient foi two reasons Fust, the enipnical complexity of the proposed scheme would mciease Second, the coding schemes would have to be extended
to accommodate the new add-ons, thus adding moie overhead bits to compression In
general, it is inconclusive whether adding more schemes could increase compressron
rates, but it is almost ccitain that such add-ons would mciease the complexity
70

Chapter 3
Applications
There is nothing so agonizing to the fine skin of vanity as the
application of a rough truth.
-

EDWARD

BULWER-LYTTON

In this chapter, we report empirical results of the proposed compression method on
binary and discrete-color images in comparison with JBIG2. The main reason why we
compare results for binary image compression only with JBIG2—despite the fact that
both methods arc lossless—lays on that the standard JBIG2 is viewed as a generic
compression scheme in much the same way as we claim the proposed method to be.
Nevertheless, we note that for specific classes of binary and discictc-coloi images,
ad-hoc compression methods have been proposed and successfully implemented, as
exposed in Chapter 4.
T h e schematic diagram shown in Figure 3.f illustrates the generic operation of the
proposed compression scheme. The input image is appended in both dimensions to
become divisible by 8. Then, layers are extracted through color separation yrclcling
a set of bi-level matrices. Note that if the input image rs bi-level, such as binary
images, then it represents one layer by default. Next, each layei rs partrtioncd into
8 x 8 blocks. Each 8 x 8 block of the original data is searched m the codebook. If it
is found, the corresponding Huffman or Airthmctrc code is selected and added to the

71

compressed data stream. If it is not found, the row-column reduction coding attempts
to compress the block. If the output of RCRC is smaller than 64 bits, the reduced
block is appended to the compressed d a t a stream. Otherwise, the original block is
preserved. An example of color separation is illustrated in Section 3.2.
No

Use the original
block Bk

t?

Append
test data

Use code
of block Bk

Block Bk

Figure 3.1: Generic diagram of the proposed compression scheme.

3.1

Binary Images

We tested the proposed method on a variety of more than 200 binary images collected from different sources. The sample we compiled comprises varying topological
shapes ranging from solid objects to less regular, complex geometries. The empirical
results presented here are classified in three categories: solid binary images, irregular
geometries, and images JBIG2 compresses more efficiently than the proposed method.
In all three cases, a selected set of binary images is given along with compression ratios of the proposed method using Huffman and Arithmetic codes, and JBIG2. A set
of 112 labeled images and their compression ratios is exhibited in Appendix B.l.
Table 3.1 displays the compression ratios for 15 solid binary images. In the case
of Huffman codes, the alternative encoding scheme, C,\, exposed in Section 2.7 per72

forms better t h a n the other coding scheme, Co, given in Section 2.6. On average, the
proposed method outperforms the standard JBIG2 by approximately 3.06% in the
case of Huffman codes when CA is employed, and 3.07% when Arithmetic coding is
employed. As stated in Section 2.7, the coding scheme CA is more efficacious than
scheme C0. The sensitivity analysis on the stochastic parameters of the coding models
exposed in Section 2.8 provides a useful reference to apprehend the performance of
the two coding schemes. In all cases, the alternative coding scheme, CA, outperforms
the other coding scheme, Co. Finally, Table 3.2 shows the percentage of blocks compressed by the dictionary, the percentage of blocks compressed by RCRC, and the
portion of incompressible blocks. Notice that the percentage of the latter is relatively
small. This observation complies with the theoretical analyses on the codebook error
as well as the sensitivity analysis for D close to 98% (see, for instance, Table 2.13 in
Section 2.8).
Table 3 . 1 : Empirical results for 15 selected binary images: solid shapes.

mage
059
071
074
075
076
077
079
080
081
082
083
085
086
087
090

Dimensions
200 X 329
545 X 393
203 X 247
790 X 480
245 X 226
450 X 295
245 X 158
491 X 449
245 X 248
491 X 526
354 X 260
167 X 405
335 X 500
447 X 459
350 X 357
Average

Proposed Method
Ev
AC
E;
88.99 93.72 94.58
90.72 93.75 94.22
86.9 92.66 93.16
92.78 96.5 96.25
86.24 92.75 93.41
88.47 95.65 95.38
85.35 91.42 91.96
91.86 95.71 95.86
89.2 92.84 93.3
92.21 96.33 96.08
88.93 95.29 95.48
86.9 92.55 93.62
91.12 95.88 95.62
95.73
89.89 96.2
95.02
94.8
86.68
90.36 95.26 95.27

JBIG2
88.58
89.82
87.68
94.8
86.82
94.83
84.91
92.17
86.69
94.31
92.24
87.7
94.97
93.86
92.18
92.43

Table 3.3 shows empirical results for 15 less regular binary images. These images
73

Table 3.2: Percentage of blocks compressed by the codebook, RCRC, and mcompiessible
blocks
Image

059
071
074
075
076
077
079
080
081
082
083
085
086
087
090

Codebook
96 7
95 48
95 04
98 53
95 44
98 62
94 19
98 73
94 96
98 9
98 52
95 42
98 49
99 38
98 08

RCRC

Incompressible

33

0
02
05

4 32
4 47
1 46
4 56
1 23
5 65
1 25
4 84

103
1 28
4 39

14
0 52

187

0 02

0
0 14
0 16
0 03

02
0 07

02
0 19
0 11
0 09
0 05

are not as solid geometries as the images given m Table 3 1 On average, the pioposed
method perfoimed better than JBIG2 by 4 32% when Huffman coding with scheme
CA IS used and 5 62% when Arithmetic coding is employed

Moreover, Table 3 4

shows the peicentage of blocks compressed by the dictionary, the percentage of blocks
compicsscd by RCRC, and the portion of mcompiessible blocks
Table 3 5 piovicles the 8 bmaiy images whcicm JBIG2 outpcifoims either the Huff
man coding, oi the Arithmetic coding, oi both components of the pioposed method
On avciage, JBIG2 scoics 0 5 1 % higher compared to the Huffman coding and 0 92%
highei than the Anthmetic coding aspect of the pioposed method

Moicovei, Table

3 6 shows the percentage of blocks compressed by the dictionary, the peicentage of
blocks compressed by RCRC, and the poition of mcompiessible blocks
In addition, Table 3 7 exposes cmpnical lcsults foi 6 bmaiy images comprising
contour lines lather than filled regions A sample image is shown in Figure 3 2, while
the six images aic given in Appendix B 1 Pei the discussion m Section 2 10, 8 x 8

74

Table 3.3: Empirical results for 15 selected binary images irregular shapes
Proposed Method
mage

004
005
009
012
014
016
018
019
021
029
034
042
056
057
084

Dimensions
512 x 800
1024 x 768
1061 x 1049
575 x 426
498 x 395
400 x 400
483 x 464
791 x 663
360 x 441
196 x 390
372 x 217
490 x 481
450 x 360
180 x 210
240 x 394
Average

Ep

E;

AC

JBIG2
94 69
93 25 95 99 95 73
89 06
88 35 92 89 93 92
88 44
87 9 92 31 94 05
92 83
90 85 94 45 95 29
88 4 92 74 93 69
87 58
74
77 45 819 86 17
89 67 93 73 94 78
89 8
9181
9168 94 84 95 3
85 94
88 49 90 39 90 98
84 86 89 17 91 62
8159
81 71 85 56 87 16
80 1
89
92
86 04
90 96
84 99
85 44 91 16 92 69 86 86
83 56 89 31 91 35
80 17
87 85 92 52 92 63 89 32
88 44 92 45 93 6
88 62

blocks extracted from such images may be classified as antagonists to the proposed
method because of their topological nregularity In addition, JIBG2 has been reported
to compress efficiently topological objects enclosed by contour lines On average, the
proposed method perfoims better than JBIG2 by 2 42% for Huffman codes and 2 48%
for Anthmetic coding
Foi empirical puiposes, wc conducted the following experiment

We mveited the

bit values in the six bmaiy images discussed above, as illustrated in Figure 3 3 Bit
inversion causes the white image background to become black When partitioned into
8 x 8 blocks, 1-valued blocks will dominate the set of blocks Based on the constructed
codebook, 1-valued 8 x 8 blocks have a Huffman codewoid length of 2 bits

Hence,

all else equal, it is expected that, on aveiage, the proposed method will peifoim
worse on the mveited images, but no change should be expected fiom JBIG2 since
it is a context-based modeling scheme Next, we employed the pioposed method and
IBIG2, and the lesults aie exposed m Table 3 8
75

Obseive that, on aveiage, JBIG2

Table 3.4: Percentage of blocks compressed by the codebook, RCRC, and incompressible
blocks.
Image

004
005
009
012
014
016
018
019
021
029
034
042
056
057
084

Codebook
97.38
95.92
95.82
96.35
95.65
83.89
97.03
97.42
91.5
92.98
85.03
90.67
93.59
92.59
93.74

RCRC
2.54
3.55
3.89
3.65
4.03
15.3
2.92
2.52
7.26
6.37
14.36
8.99
6.14
7.09
6.19

Incompressible
0.08
0.54
0.29

0
0.32
0.81
0.06
0.06
1.24
0.65
0.61
0.34
0.27
0.32
0.06

has gained a relative additional compression of 0.52% from bit inversion, while the
individual coding schemes of the proposed method have decreased by 9.97% for C0,
1.72% for CA, and 2.2% for the Arithmetic coding. In the case of bit inversion, JBIG2
outperforms Arithmetic coding by a relative difference of 0.3%, but scheme CA still
outperforms JBIG2 by a relative difference of 0.15%. All in all, in these cases, the
proposed method and JBIG2 score relatively close compression ratios with no major
cost difference.
Tables 3.9 and 3.10 display the percentage of blocks compressed by the dictionary,
the percentage of blocks compressed by RCRC, and the portion of incompressible
blocks for the 6 original and inverted binary images, respectively.

76

Table 3.5: Empirical results for 8 selected binary images: JBIG2 more efficient.

Image

Dimensions

Proposed Method
Ep
E*v
AC

JBIG2

008
022
028
064
088
094
096
100

2400 x 3000
315 x 394
2400 x 1920
640 x 439
1203 x 1200
1018 x 486
516 x 687
765 x 486
Average

92.56
84.38
94.37
94.59
91.12
92.43
93.6
95.54
93.05

98.34
91.94
98
96.84
95.94
96.57
97.9
97.71
97.83

97.59
89.66
97.73
96.77
95.88
96.42
97.08
97.54
97.33

96.98
93.32
97.47
96.93
95.3
96.32
97.06
97.59
96.93

Table 3.6: Percentage of blocks compressed by the codebook, RCRC, and incompressible
blocks.
Image

Codebook

RCRC

Incompressible

008
022
028
064
088
094
096
100

99.81
93.05
99.65
98.14
97.49
98.49
99.21
99.18

0.18
6.85
0.34
1.77
2.34
1.45
0.79
0.8

0.01
0.1
0.01
0.09
0.17
0.06
0
0.02

Table 3.7: Empirical results for 6 selected binary images: line boundaries.

nage
101
102
103
104
105
106

Dimensions
512 x 512
514 x 514
512 x 512
512 x 512
512 x 512
512 x 512
Average

Proposed Method
rp*
Ep
AC
Ep
87.47 89.09 89.04
93.01 94.91 94.82
85.56 87.4 87.99
85.85 87.51 87.87
89.66 91.41 91.11
88.89 90.55 90.29
88.41 90.14 90.19

77

JBIG2
87.68
94.7
83.92
84.82
88.63
88.3
88.01

Table 3.8: Empirical results for the 6 inverted binary images: line boundaries.

Image
101
102
103
104
105
106
Average

Proposed Method
Ep
AC
E;
78.89 87.64 87.11
83.09 93.36 92.71
77.51 85.66 85.97
77.42 85.93 85.64
80.66 89.91 89.21
80.02 89.1 88.44
79.6
88.6 88.18

JBIG2
88.08
95
84.45
85.26
88.83
89.18
88.47

Figure 3.2: Binary image with line boundaries

Figure 3.3: Binary image with line boundaries: inverted counterpart

78

T a b l e 3.9: Percentage of blocks compressed by the codebook, R C R C , and incompressible
blocks

Image
101
102
103
104
105
106
Average

Codebook
87 86
95 91
84 9
87 01
91 22
90 04
89 49

RCRC
10 51
3 88
13 78
10 6
7 53
8 62
9 15

Incompressible
163
0 21
1 33
2 39
1 25
135
135

T a b l e 3.10: Poicentage of blocks compicssccl by the codebook, R C R C , and incompicssible
blocks inverted countciparts

Image
101
102
103
104
105
106
Average

Codebook
87 01
95 41
82 98
85 59
90 56
89 28
88 47

RCRC
1136
4 38
15 55
12
8 14
9 37
10 13

79

Incompressible
163
0 21
1 47
2 41
13
1 35
135

3.2

Discrete-Color Images

In addition to binary images, we tested the proposed scheme on two sets of discretecolor images. The first set consists of a sample of topographic map images comprising
four semantic layers that were obtained from the GIS lab at the University of Northern
British Columbia [20]. Semantic layers include contour lines, lakes, rivers, and roads,
all colored differently.

Figure 3.4 illustrates layer separation of a topographic map

from our test sample. In this case, the map image has four discrete colors (light blue,
dark blue, red, and olive), and thus four layers are extracted. Each discrete color
is coupled with the background color (in this case being white) to form the bi-level
layers.
The second set of discrete-color images consists of graphs and charts, most of
which were generated with spreadsheet software. Graphs and charts consist of discrete colors and are practically limited to no more than 60 colors. Such data are
extensively used in business reports, which are in turn published over the web or
stored in internal organizational databases.

As the size of such reports increases,

lossless compression becomes imperative in the sense that it is cost-effective for both
storage and transmission.
Table 3.11 provides a summarized description of three map images used in this
process. Table 3.12 gives the compression results in bits per pixel (bpp) of the proposed method for the three map images shown in Table 3.11. Formula (2.5) has been
used to calculate compression ratios for both the proposed method and JBIG2. From
the results we may conclude that the proposed method achieves high compression on
the selected set of map images. On average, the proposed method achieves a compression of 0.035 bpp (96.5%) on the selected data sample, whereas JBIG2 yields a
compression rate of 0.053 bpp (94.7%). These results are higher than or comparable
to those results reported in [21]. We note that while compression rates yielded by

80

the proposed method outperform JBIG2 by an average 2%, JBIG2 has been reported
to generally compress at 0.22 to 0.18 bits per pixel [22]. Finally, results for charts
and graphs are given in Table 3.13, wherein the proposed method and JBIG2 compress, respectively, at 0.03 bpp and 0.087 bpp. In this case, the proposed method
outperforms JBIG2 by 6.24%, on average. The maps, graphs and charts used in these
experiments are exhibited in Appendix B.2.

Layer 1: Contour Lines

Layer 2; Lakes

4 different colors

Layer 4: Roads

Layer 3: Rivers

F i g u r e 3.4: Example of color separation: a topographic map of a p a r t of British Columbia
containing 4 different colors excluding the white background.

T a b l e 3 . 1 1 : Description of selected topographic map images, scale 1: 20000 [20]

Map

Dimensions

Size (KB)

1
2
3

2200 x 1700
5776 x 13056
5112 x 11600

10960
220979
173769
406708

Total Size

81

Table 3.12: Compression results for map images using the proposed method vs. JBIG2

Map
1
2
3
Total

Compressed
Size (KB)
210.77
7489.27
6626. 27
14326.42

Compression
Ratio (bpp)
0.019
0.034
0.038
0.035

JBIG2
0.029
0.052
0.055
0.053

Table 3.13: Compression results for charts and graphs using the proposed method vs.
JBIG2
Map

Compressed

Compression

JBIG2

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
Total

Size (KB)
1281.33
899.28
865.97
863.74
378.33
607.67
590.07
1039.81
590.07
590.07
773.49
839.71
590.07
590.07
590.07
590.07
590.07
367.01
590.07
590.07
1050.83
588.87
309.45
607.44
16373.62

R a t i o (bpp)
0.014
0.065
0.065
0.045
0.057
0.021
0.021
0.018
0.033
0.031
0.033
0.023
0.022
0.032
0.032
0.018
0.028
0.017
0.034
0.07
0.017
0.017
0.02
0.022
0.03

0.082
0.063
0.135
0.077
0.143
0.088
0.077
0.053
0.099
0.089
0.089
0.079
0.08
0.088
0.087
0.055
0.059
0.126
0.167
0.103
0.072
0.065
0.074
0.12
0.087

82

3.3

Discussion

The compression results shown m Table 3 1 reveal that the proposed method
outperforms JBIG2 m 92% of the cases

We note that the 100 binary images are

mostly solid topological shapes, which intentionally favor the JBIG2 compression
algorithm

Moreovei, the alternative coding scheme, CA, modeled m equation (2 25)

outperforms the other coding scheme, Co, modeled m equation (2 23) for all the binary
images exhibited m Table B 1 of Appendix B 1 The reason for this is that 98 42%
of the blocks are found m the codebook and can be compressed with less overhead
bits if the alternative coding scheme is used

This observation is also confirmed by

examining the simulation lesults m Section 2 8 for large values of D, pw and pi, It may
also be concluded that the piobabihty P should be relatively large (more than 90%)
because the portion of incompressible blocks is 0 08%, on average, and the portion
RCRC compresses is 99 92% of the blocks not m the codebook
In the case of the six binary images shown in Table B 2 of Appendix B 1, 89 29%
of the blocks weic found in the codebook, 9 16% were compressed wrth RCRC and
1 35% were mcompicssible For the mveitccl rounteipaits, 88 47% of the blocks were
compiesscd with the codebook, 10 18% with RCRC and 1 35% lemamcd mcompicssible

We may conclude that RCRC has proved to be an efficient auxiliary coding

module

Also, we may surmise as above that the probability paiametci P is quite

large, thus complying with the simulation results of Section 2 8
Similai conclusions can be drawn for the coding of discrete-color images

3.4

Huffman Coding Is N o t D e a d

The advent of Anthmetic Coding along with the giavity of many lesults lepoitcd
m htciatuie over the last two decades has biought about an ovcishadowmg of the
83

power and simplicity of Huffman Coding, but not without principal reasons [1, 8].
Arithmetic codes
• attain compression rates closer to the source entropy than Huffman codes;
• outperform Huffman codes by pi + 0.086, where pi is the probability of the most
frequently occurring symbol; 1
• are adjustable to adaptive models.
These advantages, however, come at a cost. Arithmetic codes do not generally
perform better than Huffman codes if inconect probabilities are fed to the coder.
For instance, if the coder encodes according to a probability model M. while the
tiue probabilities are described by model M*, then Arithmetic coding is expected
to perform worse than Huffman coding.

A hypothetical example illustrating this

observation for a small number of symbols is given in [8]. Consider the following
extracts from [13]:
[GJottlob Buimann, a Geiman poet who lived from 1737 to 1805, wrote 130
poems, including a total of 20000 words without once using the letter R. [... ]
In 1939, Ernest Vincent Wright published a 267-page novel, Gadsby, in which
no use is made of the lettci E (Souicc: [13], p. 48)
If a probability model foi compiessmg Geiman text is constiucted using Buiniann's
poems as the body of data, then chances are that the inconect probability of R will
reduce the efficiency of Arithmetic coding because the occurrence of R in other German
texts will reveal a conspicuous disci epancy between the theoretical and cmpincal
probabilities.

To better comprehend the inefficiency of arithmetic codes resulting

from erroneous probabilities, we illustrate the following analysis taken from [8] for
the first 10028 words of Wright's novel, Gadsby:
If one uses this novel to estimate the chaiacter frequencies in English, the
Huffman codeword assigned to E would be 14 bits long [ . . ] , instead of just 3
bits on regulai English text. For aiithmetic codes, each E would add 15.4 bits.
1

In [23], it is shown that the nxlundancy of Huffman codes is bounded hom above by pi+0 086
84

In reality, the true model M* describing English language requires an average of
4.19 bits per letter for Huffman codes and 4.16 for arithmetic codes. In the case of
the erroneous probability model Ai described above, the average length of Huffman
codewords would increase from 4.19 to 5.46, whereas for arithmetic codes from 4.19
to 5.60. Therefore, one has to carefully choose correct probability models in order to
strictly avoid such errors propagating throughout the entire coding procedure.
Empirical results for the solid images exposed in Appendix B.l show that Huffman coding (via scheme CA) slightly outperforms Arithmetic coding on average. The
rationale for this is that the probability model we constructed for the codebook may
have predicted slightly lower or slightly higher relative frequencies for certain 8 x 8
blocks which appeared with slightly higher or slightly lower probabilities in specific
test images. To clarify this point, consider the six binary images with white background (Table 3.7) and the six inverted counterparts (Table 3.8). In the case of white
background, Arithmetic coding slightly outperforms Huffman coding per scheme CA
by 0.044%. This implies that the empirical probability model describing these six
images is almost consistent with the theoretical model, probably because of the dominance of white blocks both in the codebook and in the set oi the six images. However,
when the images are invcitcd, white blocks aic lcplaccd by black blocks, which have
a codebook probability equal to 0.263. In this case, Huffman coding outperforms
Arithmetic coding by 0.15% (Tabic 3.8). It may be surmised that the empirical probability model which describes the mveited images is not as particularly consistent
with the theoretical probability model that describes the constructed codebook as
the probability model which dcsciibes the images with white background. Therefore,
Arithmetic coding performs less efficiently than Huffman coding, as expected theoretically. The case presented here posits a complex situation involving an alphabet
of 2 64 symbols and a less contiguous body of data, such as is the case of binary im-

85

ages We may, at least in principle, conclude that a more rigorous codebook model
needs to be constructed in order to accommodate the strict modeling requirements of
Arithmetic coding

One way to approach a more correct model would be to enlarge

the data sample used to construct the codebook

This is, nevertheless, a daunting

computational task, as illustrated by the time (500 hours) it took to generate the
codebook based on only 120 binary images
In addition to being susceptible to incorrect probabilities, Arithmetic coding is
generally slower than Huffman coding in terms of execution time for encoding and
decoding [24, 25] Improvements for increasing the speed of Arithmetic coding have
brought about sacrifices for coding optimahty [8] In this work, we implemented an
integer-based arithmetic coder based on the guidelines m [19]

For binary images

we used for testing, for instance, the overall execution times for scheme CA and the
arithmetic coder were, respectively, 261 and 287 seconds
In terms of optimahty, the Huffman codes we constructed for the codebook blocks
have an absolute redundancy equal to 0 01, which implies that 0 252% more bits
than the cntiopy cncumsciibes are required to code blocks in the codebook (sec
Section 2 3 2)
Pi = 0 503
optimal

The uppei bound toi the redundancy is px + 0 086 in our case

Thus, we may conclude that the constructed Huffman codes aic near-

And given the pionc-to-mconcct-probabilities facet of Arithmetic coding

as well as the high compression lesults of the proposed method, we conclude that
for practical applications the pioposcd Huffman coding scheme, CA, should be the
preferred compression choice
All m all, the whole purpoit of the thcorctrcal and empirical observations posited
m [1, 8] is that Huffman coding is generally moie lobust than Arithmetic coding,
which can expose emphatic advantage m lare cases The emprncal results shown in
this work suggest that the proposed codebook model works efficrently with Huffman

86

codes, but models slightly discrepant probabilities for the arithmetic coder to function
effectively.

87

Chapter 4
Related Work
Mankind is not a circle with a single center but an ellipse with
two focal points of which facts are one and ideas the other
-

VICTOR HUGO

In this chapter, we present the mainstream research pertaining to lossless compiession of binary and discrete-color images m the context of block coding

4.1

Binary Image Compression Techniques

Central to the pioposcd method is the idea of paititionmg a binary image 01 the
bi-level layers of discrete-color images into non-overlapping 8 x 8 blocks

Paititionmg

and encoding binary images into blocks, referred to as block coding, has been summarized m [26], wherein images arc divided into blocks of totally white (0-valued) pixels
and non-white pixels The former arc coded by one single bit equal to 0, whereas the
lattei aie coded with a bit value equal to 1 followed by the content of the block in
a row-wise oidei similai to RCRC coding

Moieovei, the hierarchical variant of the

block coding lays on dividing a binaiy image into 6 x 6 blocks (typically, 16 x 16),
which are then icpresentcd m a quad-tree stiuctme

In this case, a 0-valued 6 x 6

block is encoded using bit 0, whereas othei blocks are coded with bit 1 followed by

88

recursively encoded blocks of pixels with the base case being one single pixel.

In

[18, 27], it is suggested that block coding can be improved by resorting to Huffman
coding or by employing context-based models within larger blocks.
A generalized approach to block coding is illustrated in [28], wherein it is argued
that such a method achieves near-optimal encoding of sparse binary images, especially
when source statistics are not available.
In [29], a hybrid compression method based on hierarchical blocks coding is proposed. Here, predictive modeling has been employed to construct an error image as
a result of the difference between the predicted and original pixel values. Then, the
error image is compressed using Huffman coding of bit patterns at the lowest hierarchical level. This work builds upon the ideas presented in [30, 31], wherein block
coding with Arithmetic coding has been employed.
Closely related to the idea of (non-)overlapping blocks is rectangular partitioning,
wherein 1-valued (black) regions in the input binary image are partitioned into rectangles [32]. In this case, the top-left and bottom-right coordinates of a given rectangle
are encoded, whereas a different code is used for isolated pixels.

4.1.1

JBIG2

JBIG2, the successor of JBIG1, is a platform-independent lossless and lossy coding
standard primaiily designed for compressing bi-level images, but is also capable of
encoding layers of multiple-bit pixels, such as halftone images [9, 33]. The underlying
method is based on adaptive coding, in which case current information about an image
pixel is adapted contextually to preceding pixels. In light of that, JBIG2 uses adaptive
arithmetic coding to predict future pixel codes based on previously encountered pixel
data.
JBIG2 operates by segmenting an input image into regions, such as text and
89

images, and encodes each region using different methods embodied m the standard
If X is the current pixel to be predicted, than JBIG2 resorts to a set of adjacent
pixels, referied to as the context, to code X

The context includes adaptive pixels

as well All in all, it has been observed that JBIG2 compi esses at rates higher than
other known standards or generic methods

4.2

Discrete-Color Image Compression Techniques

In the context of discrete-color images, lossless compiession methods aie generally
classified into two categories (1) methods applied directly on the image, such as the
graphics mteichange format (GIF), the portable network graphics (PNG), or lossless
J P E G (JPEG-LS), (n) and methods applied on eveiy layer extiacted (or sepaiated)
from the image, such as TIFF-G4 and JBIG In this work, we focused on the second categoiy

Previous work m literature amounts to several lossless compression

methods for map images based on layer separation

The standard JBIG2, which is

specifically designed to compi ess bi-level data, employs context-based modeling along
with Arithmetic coding to compi ess bmaiy layeis In [21], a lossless compression technique based on semantic binary layeis is proposed

Each binary layer is compressed

using context-based statistical modeling and arithmetic coding, which is slightly different from the standard JBIG2 In [34], a method that utilizes mteilayer correlation
between coloi separated layers rs proposed

Context-based modeling and arithmetic

coding are used to compress each layei An extension of this method applied on layeis
separated into bit planes is given m [35]

90

Chapter 5
Conclusions and Future Work
I was born not knowing and have had only a little time to
change t h a t here and there.
-

5.1

RICHARD

FEYNMAN

Concluding Remarks

This thesis exposed the details of a novel lossless compression method for binary
and discrete-color images. The core of the method lies in generic block coding and
operates per the empirical distribution of the most, but not all, frequently occurring
8 x 8 blocks. Asymptotic analyses suggest that the en or incurred from trimming the
codebook to a particular number of blocks is small to negligible. The clistiibution of
blocks was employed to construct Huffman and Aiithmctic codes. The latter coding
algorithms led to the development of two coding schemes for the proposed method.
To attain higher compression, an additional coding helper module-the row-column
reduction coding-was introduced. The proposed method was tested on various binary
and discrete-color images. Results were compared to JBIG2, the standard coder for bilevel layers. Empirical results suggest that the proposed method outperforms JBIG2
compression rates in most, if not all, of the cases. The proposed method is efficient in
terms of the memory required for the codebook as well as the arralytical and empirical
complexities.
91

5.2

Future Work

In Section 2 10, we illustrated protagonist and antagonist blocks with respect to
the proposed lossless compression method However, there exist seemingly protagonist
blocks with basic geometric constructs which do not appear in the codebook and are,
theiefore, considered antagonists

For example, not all 8 x 8 blocks containing 63

zeros and a 1 are m the codebook In such cases RCRC will certainly compress at a
maximum rate equal to 73 4% Moreover, such cases instigate premises to extend the
proposed lossless compression method into a lossy method For example, substituting
the 1 with 0 m the case discussed here yields a white block, which can then be
efficiently compressed via the codebook using only 1 bit

Extending the proposed

method to lossy coding is a plausible future aiea of lesearch
In light of the latter extension, the pioposed method can be employed for interactively reconstructing broken regions in binary images If certain pixels aie missing
m some input image, then the region comprising the missing bits is referred to as
a broken region

If one consideis the missing bits as "don't care" bits, then the

RCRC algorithm can be modified to accept such bits to determine the best 8 x 8
block that would reconstruct a partrculai portion of the biokcn legion In aclclrtron, if
the RCRC-decocled block strll contains 'don't care' bits, the codebook model can be
used to seaich foi the best match of the block Notice that the best codebook match,
if it exists, has the highest probability as the codebook entries are sorted in that fashion Hence, it may be surmised that chances are that the reconstructed block for the
particular portion of the missing regron is the most piobable block that exists to fill
that region

This process may reconstruct blocks which aie not nccessanly suitable

foi the missing legion The latter observation brings about the mteiactive facet of the
reconstruction, in whrch case a usci can accept oi reject a suggested reconstruction, or
can modify the preconditions per the perception of how a fully reconstructed binary

92

image would look like.
In addition, with a slight modification to the row-column reduction coding algorithm, the proposed method can be applied to compress Very-Large-Scale Integration
(VLSI) circuitry test data. VLSI data consists of 0-1 matrices as well as "don't care"
bits. If we view don't care bits as "wild card" bits, then the codebook may be search to
find the match with the shortest code. On the other hand, RCRC may be modified to
deal with "don't care bits" based on the following observation. Three row (or column)
vectors, Vi, v 2 , v 3 do not satisfy the Euclidean relation, i.e. Vii?v2 AviZ?v3 -» v 2 i?v 3 .
Thus, "don't care" bits should be replaced with care bits in a way that maximizes
the number of eliminated rows (or columns).

93

Published Material
Portions of Chapters 2 and 3 appeared in conference proceedings [36] and [37].

94

Bibliography
[1] D Salomon, Variable-Length

Codes for Data Compression

[2] K Sayood, Introduction to Data Compression
Kaufmann Publishers Inc , 3 ed , 2005

Springer, 2007

San Francisco, CA, USA Morgan

[3] A Moffat, T C Bell, and I H Witten, "Lossless Compression for Text and
Images," International Journal of High Speed Electronics and Systems, vol 8,
no 1, pp 179-231, 1997
[4] C E Shannon, "A Mathematical Theory of Communication," The Bell
Technical Journal, vol 27, pp 379-423, 1948
[5] R M Gray, Entropy and Information
[6] R M Gray, Probability, Random
2 ed , 2009

Theory

Systems

Springer, 1990

Processes, and Ergodic Properties

Springer,

[7] K J Balaknshnan and N A Touba, 'Relationship Between Entiopy and Test
Data Compiession," IEEE Transactions on Computer-Aided Design, vol 23,
no 4, pp 386-395, 2007
[8] A Bookstcm and S T Klein, 'Is Huffman Coding Dead 7 ,' Computing, vol 50
no 4, pp 279-296, 1993
[9] P G Howard, F Kossentim, B Mai tins, S Foichhammei, S - R Forchhammei,
W J Ruckhdge, and F Ono, The Emeigmg JBIG2 Standard,' IEEE Trans
Circuits and Systems for Video Technology, vol 8, pp 838-848, 1998
[10] S B Gray, "Local Pioperties of Binary Images m Two Dimensions,' IEEE
actions on Computers, vol C-20, no 5, pp 551-561, 1971
[11] M Brno, ' On the Maximum Length of Huffman Codes,' Infoi mation
Letters, vol 45, pp 219-223, 1993

Trans-

Piocessmg

[12] G Hansel, D Pciim, and I Simon, Compiession and Entiopy," in STAGS '92
Proceedings of the 9th Annual Symposium on Theoretical Aspects of Computer
Science, (London, UK) pp 515-528, Spnngci-Vcrlag, 1992

95

[13] J. R. Pierce, An Introduction to Information
Dover Publications, Inc., 2 ed., 1980.

Theory: Symbols, Signals and Noise.

[14] W. Feller, An Introduction
Wiley, 3 ed., 1968.

Theory and Its Applications,

to Probability

vol. 1.

[15] P. G. Howard and J. S. Vitter, "Fast and Efficient Lossless Image Compression,"
in Data Compression Conference, pp. 351-360, IEEE, 1993.
[16] A. Moffat, J. A. Storer, and J. H. Reif, "Two-Level Context Based Compression
of Binary Images," in Data Compression Conference, pp. 382-391, IEEE, 1991.
[17] J. Zobel and A. Moffat, "Adding Compression to a Full-Text Retrieval System,"
Software - Practice and Experience, vol. 25, no. 8, pp. 891-903, 1995.
[18] J. L. Mitchell, "Facsimile Image Coding," in Conf
(Anaheim, CA, USA), pp. 423-426, 1980.

Proc. National

Computer,

[19] P. G. Howard and J. S. Vitter, "Practical Implementations of Arithmetic Coding," in Image and Text Compression (J. A. Storer, ed.), pp. 85-112, Kluwer
Academic Publishers, 1992.
[20] University of Northern British Columbia, Geographic Information Systems Lab,
"Topographic Map of British Columbia," 2009.
[21] P. Franti, E. Ageenko, P. Kopylov, S. Giohn, and F. Berger, "Map Image Compression for Real-Time Applications," in Proc. Spatial Data Handling 2002 Symposium SDH 2002 (part of 2002 Joint International Symposium on Geospatial
Theory, Processing and Applications, 2002.
[22] P. Franti, P. Kopylov, and E. Ageenko, "Evaluation of Compression Methods for
Digital Map Images," in Proc. Automation, Control, arid Information
Technology
- 2002 (M. H. Hamza, O. I. Potaturkin, and Y. I. Shokm, eds.), (Novosibirsk,
Russia), June 10-13 2002.
[23] R. G. Gallagher, "Vaiiations on a Theme by Huffman," IEEE Trans, on Information Theory, vol. 24, no. 6, pp. 668-674, 1978.
[24] I. H. Witten, R. M. Neal, and J. G. Cleaiy, "Arithmetic Coding for Data Compression," Communications of the ACM, vol. 30, pp. 520-540, 1987.
[25] A. Moffat and J. Zobel, "Coding for Compiession in Full-Text Retrieval Systems," in Proceedings of the Data Compression Conference DCC'92, 1992.
[26] M. Kunt and O. Johnson, "Block Coding of Giaphics: A Tutorial Review,"
Proceedings of the IEEE, vol. 68, no. 7, pp. 770-786, 1980.
[27] X. VVu and N. Mcmon, "Context-Based, Adaptive, Lossless Image Coding," IEEE
Trans. Communications, vol. 45, no. 4, pp. 437-444, 1997.
96

[28] B Y Kavaleichik, "Generalized Block Coding of Black and White Images,"
IEEE Trans Image Processing, vol 1, pp 518-520, 1992
[29] P Franti and O Nevalamen, "Compression of Binary Images by Composite
Methods Based on Block Coding," Journal of Visual Communication and Image
Representation, vol 6, no 4, pp 366-377, 1995
[30] P Fianti and O Nevalamen, "A Two-Stage Modeling Method for Compressing
Binary Images by Arithmetic Coding," The Computer Journal, vol 36, no 7,
pp 615-622, 1993
[31] P Franti, "A Fast and Efficient Compression Method for Binary Images," Signal
Processing Image Communication (Elsevier Science), vol 6, pp 69-76, 1994
[32] A Quddus and M M Fahmy, "Binary Text Image Compression Using Overlapping Rectangular Partitioning," Pattern Recognition Letters, vol 20, no 1,
pp 81-88, 1999
[33] ISO/IEC J T C 1 / S C 2 9 JBIG Committee, "JBIG2 Working Draft WD14492,"
Tech Rep ISO/IEC JTC1/SC29, International Organization for Standardization, August 1998
[34] P Kopylov and P Fianti, "Compiession of Map Images by Multilayer Context
Tree Modeling," IEEE Trans on Image Processing, vol 14, no 1, pp 1 11,2005
[35] A Podlasov and P Fianti, "Lossless Image Compression via Bit-Plane Separation
and Multilayei Context Tree Modeling," Journal of Electronic Imaging, vol 15,
no 4, p 043009, 2006
[36] S Zahii and A Bonci, 'A Fast Lossless Compression Scheme foi Digital Map
Images Using Coloi Sepaiation," in Proceedings of IEEE International
Confei
ence on Acoustics, Speech, and Signal Processing (ICASSP '10), (Dallas, Texas,
USA), pp 1318-1321, IEEE Computei Society, 14 19 Maich 2010
[37] S alZahir and A Bonci, "Lossless Compression of Maps, Chaits, and Giaphs
via Coloi Separation," m Proceedings of the Data Compression
Conference
(DCC '10), (Snowbnd, Utah, USA) p 518, IEEE Computei Society, 24 26
Maich 2010
[38] I Adler, S Orcn, and S M Ross, "The Coupon-Collcctoi's Problem Revisited,'
Journal of Applied Probability, vol 40, pp 513 518, 2003
[39] E Agecnko, P Kopylov, and P Fianti, "On The Size and Shape of Multi-Level
Context Templates foi Compression of Map Images,' m Proceedings International Confei ence on Image Processing, vol 3, (Thcssalomki, Greece), pp 458
461, 2001

97

[40] A. Akimov, A. Kolesnikov, and P. Franti, "Lossless Compression of Map Contours by Context Tree Modeling of Chain Codes," Pattern Recognition, vol. 40,
pp. 944-952, 2007.
[41] S. Alcaraz-Corona, R. A. Neri-Calderon, and R. M. Rodriguez-Dagnino, "Efficient Bilevel Image Compression by Grouping Symbols of Chain Coding Techniques," Optical Engineering, vol. 48, no. 3, p. 037001, 2009.
[42] R. Arnold and T. Bell, "A Corpus for the Evaluation of Lossless Compression
Algorithms," in DCC '97: Proceedings of the Conference on Data Compression,
p. 201, IEEE Computer Society, 1997.
[43] A. L. Berger, S. A. Delia Pietra, and V. J. Delia Pietra, "A Maximum Entropy
Approach to Natural Language Processing," Computational Linguistics, vol. 22,
no. 1, pp. 39-71, 1996.
[44] A. Bookstein and S. T. Klein, "Models of Bitmap Generation,"
Processing & Management, vol. 28, pp. 795-806, 1992.

Information

[45] P. A. Bromiley, N. Thacker, and E. Bouhova-Thacker, "Shannon Entropy, Renyi
Entropy, and Information," Internal Memo 2004-004, School of Cancer and Imaging Sciences, The University of Manchester, U.K., 2004.
[46] T. M. Cover and J. A. Thomas, Elements of Information Theory. Wiley Series
in Telecommunications and Signal Processing, Wiley-Interscience, 2 ed., 2006.
[47] A. H. El-Maleh, E. Khan, and S. alZahir, "A Geometric-Primitives-Basecl Compression Scheme for Testing Systems-on-a-Chip," in VTS '01: Proceedings of the
19th IEEE VLSI Test Symposium, (Maiina del Rey, CA, USA), p. 54, IEEE
Computer Society, 2001.
[48] R. Estrada and R. P. Kanwal, Asymptotic
Boston, MA: Birkhauser, 1994.

Analysis:

A Distributional

Approach.

[49] S. T. Klein and D. Shapira, "Huffman Coding with Non-sorted Frequencies,"
in DCC '08: Proceedings of the Data Compression Conference, p. 526, IEEE
Computer Society, 2008.
[50] H. S. Malvar, "Fast Adaptive Encoder for Bi-Level Images," in DCC '01: Proceedings of the Data Compression Conference, p. 253, IEEE Computer Society,
2001.
[51] P. D. Miller, Applied Asymptotic Analysis, vol. 75 of Graduate Studies in Mathematics. American Mathematical Society, 2006.
[52] S. Zahir and R. Chowdhury, "A Hybrid 2-3-3 Bits-Based Image Encoding
Scheme," in IEEE International Symposium on Signal Processing and Information Technology (ISSPIT '06), (Vancouvei, BC, Canada), pp. 781 786, IEEE
Computer Society, 27-30 August 2006.
98

[53] L. Zhou and S. Zahir, "A New Efficient Algorithm for Lossless Binary Image
Compression," in IEEE Canadian Conference on Electrical and Computer Engineering CCECE'06, pp. 1427-1431, 2006.

99

Appendix A
Models and Derivations
Everyone engaged in research must have had the experience of
working with feverish and prolonged intensity to write a paper
which no one else will read or to solve a problem which no one
else thinks important and which will bring no conceivable
reward—which may only confirm a general opinion that the
researcher is wasting his time on irrelevancies.
- NOAM

A.l

CHOMSKY

Waiting Probabilities

Let S denote the set of coupons (or items, in general) having cardinality ||<S|| =
N.

Coupons arc collected with replacement from S. Then, the problem poses the

following two questions:
(i) What is the probability of waiting more than n trials in order to observe all N
coupons?
(ii) What is the expected number of such trials?
Let M. denote the set of observed coupons and T be a random variable. More
accurately, Ai is a multiset

of coupons because coupons are drawn from S with

replacement. To answer the first question, it is more convenient to consider the probability of collecting more than n coupons, i.e. P{T > n). The required probability

100

P(T = n) is easily denved as P(T = n) = P(T > n - 1) - P ( T > n) The following
model gives P(T > n) [14]
N-\

^>«)-E(-i)- +1 C)(V)"

<A1>

From formula (A 1), we have

P{T = n) = P{T > n - 1) - P(T > n)

(A 2)

Formula (A 2) gives the probability of waiting for n samples before observing N
coupons and that answers the first question posed above
To determine the expected numbei of trials, E[T] required to collect n coupons
we use the following model
E[T] = nHn ,

(A 3)

wheie Hn is the harmonic number Hn = ^ " = 1 - Based on the asymptotic expansion
of haimonic numbers, one may derive the following asymptotic approximation for the
expectation given m (A 3)

E[T] = nlnn

+ 7/7, + - + o(l) ,

as

n —^ 00 ,

(A 4)

where 7 ~ 0 5772 is the Eulci-Mascheiom constant If we have n = 2 64 coupons, then
the expected number of trials bounded by (A 4) is equal to 8 29 x 10 20 , which implies
a practically unattainable numbei of trials
Since T takes nonncgative values, we can employ foimula (A 3) to bound the
probability given m (A 1) using Markov's Inequality for a > 0

P(T > a) < ^ E
101

(A 5)

See [14, 38] for more details on the Coupon-Collector's Problem.

A.2

Asymptotic Expansion on the
Entropy Discrepancy

A.2.1

General Asymptotic Analysis

Let N be the number of 8 x 8 blocks and pu \/i = 1, 2 , . . . , N be the theoretical
probability distribution of the N blocks.
Define the discrete function L: L(qt) to be the observed average code length given
by:
N

1

L = ^qt\og2-,

(A.6)

where qz denotes the empirical probability distribution of the N 8 x 8 blocks. Similarly, define H: H(pz)

to be the theoretical average code length function for the

theoretical probabilities pz:
N

1

H = 5>,log2-.

(A.7)

Function H is the entiopy of the theoretical distribution p,. We want to asymptotically study the eiroi between the obseived and the theoictical cntiopies given
respectively by (A.6) and (A.7). For that purpose, we examine the absolute error
model:
N

/

I

i\

E = L-H = J2 U.log 2 --p.log 2 t=1

V

Qi

•

(A.8)

Pi J

Define the discrepancy ely V? = 1, 2,. . . , N, between the empirical and theoretical
probabilities as:
e* = <7i - Pi •

(A.9)

Then, we derive an asymptotic expansion on e7 in order to observe how E in

102

formula (A.8) behaves asymptotically.
First, observe that based on (A.9), L(qz) = H(pt + el) and equation (A.8) may be
expressed as E = H(pz + et) — H(pt), \/i = 1 , 2 , . . . , N. The first derivative of function
H(pt) = - p , l o g 2 p , is:
H'(pt) = - ^ - l o g 2 P l .

(A.10)

Then, we look for an expression such as the following:

[H{pl + el)-H(pt)]~H'(pl)el

as

et -* 0 ,

(A.ll)

where the left-hand side of the relation represents the error function E given in (A.8).
Note that this expression is the discrete version of the first-order Taylor series defined
as follows:
Definition A . l . Taylor
For a function
Expansion)

f{x)

Series

defined in a set D the corresponding

of the function

Taylor Series (or Taylor

about a point XQ G © is given by the following

/w = ^ ( , - . „ r ,

formula:

(A.i2)

n=0

where /<-n')(.Xo) is the nih derivative of function

f at x = x0.

If we let / = H, x = Pi + €i, and x0 = p% in (A. 12), we get the following expression:

N

Hln)
J2
wc«
H{p + e.) = Y,
%

(A.13)

nl

n=0

The dominant term in the series (A.13) is function H(pt).

103

Subtiacting this from

H(pi + e8) yields the following series:
N

H(Pl + e.) - H(pt) = J2
i=i

E

#H(A)

n

(A.14)

ra!

The left-hand side of equation (A.14) is the error function given in (A.8). Hence,
the Taylor series serves as a suitable approximation to the absolute error function E.
That is,
oo

N

E ~ 2 J z_] — •
=1

n=\

n!

e

™ as e'

0.

(A.15)

e, —> 0

(A.16)

We may write equation (A.15) as follows:
N

oo

* = EE

^(n)(ft

n!

i=i

< + o(£)

as

For most purposes, a second-order approximation is appropriate to provide useful
insight into errors. That is, if we consider the first and second derivatives of function
H, we may write (A.16) as follows:
N
,=1

I

( 1

"\

L

\

/

2 In 2 p, +o(a

as

e, —> 0

(A.17)

Real ranging the terms in the summation and by the properties of asymptotic
estimates, we write (A.17) in the following form:
iV

* = -EIn 2
i=i

2p,

o |

e» l o &2 Vx

max

\e?\

See [51] for a ligorous treatment of asymptotic analysis.

104

as

e,

0 . (A.18)

A.2.2

Error Analysis for t h e Constructed Codebook

The asymptotic expansion exposed in Section A 2 1 applies to the empirical and
theoretical average code lengths given, respectively, in formulas (A 6) and (A 7),
wherein the summation bound is equal to the number of blocks, N

However, as

stated m Section 2 3 2, the cardinality of the constructed codebook is equal to 6952
entries Theiefore, the probability terms in formula (A 6) are summed up to 6952 and
the sum is thereafter considered to equal zero This affects the discrepancy m formula
(A 8) m that an additional error term equal to — X)i-6953 A io §2Pi 1S to be added
Since the theoretical probabilities pz are fixed, this additional enor term may be
added to the model m foimula (A 18) Here, we piovide a more elaborate discussion
on the additional error m c u n e d on the constructed codebook of 6952 entries

We

first consider the theoretical implication of the Principle of Maximum Entropy when
applied on unknown distributions Then, we establish a bound on the additional error
term
Let N denote the total numbci of blocks and M the number of blocks included
in the constructed codebook, M < N

The Principle of Maximum Entropy (PME)

mstiucts one to assume a uniform probability distribution over all symbols that have
not been obscivcd m a given data sample, but which aic part ol some alphabet [43]
In oui case, we consider qt = 0, foi i = M + 1, M + 2,

,N

Based on PME,

we should considei smoothing the empirical probability distribution and consider all
unobserved empirical probabilities as umfoimly distributed That is, q% = q*, foi i —
M + 1, M + 2,

, N and for some fixed probability value q* > 0 In order to achreve

a uniform distribution, we need to smoothen the observed probability values ql =
0, for i = 1 2,

,M

Because the number N rs very large, any smoothing method

cannot be practrcally applied as it would disrupt rrrf or matron aboirt observed blocks
For example, the probability of f valued 8 x 8 blocks in the constructed codebook

105

is equal to 26%. A smoothing method, such as Laplacian smoothing, would replace
this observed probability by a very small value. Then, the constructed Huffman
code would not realistically represent the observed distribution of the 1-valued 8 x 8
block. For the purpose of determining a bound, however, it is theoretically possible
to consider the implications of applying PME on the models described above.
The error model in formula (A.8) may be rewritten as follows:
M

(
1
1\
E = £Ulog2--paog2N

(

+ E

(A.19)

1\

(0-Alog2-J .

(A.20)

The asymptotic approximation we derived in formula (A. 18) applies to formula (A.19).
We focus on determining a bound for foimula (A.20), denoted hereafter as E2.
Based on the Principle of Maximum Entropy, the following inequality holds:
N

E

(

N

1\
1O

( A &2 - ) ^

E

(

1\

lo

M

1

lo

W g 2 -, = (* - )p* g 2 -* >

(A- 2 1 )

where p* is some unifoirn probability value. Multiplying both sides of inequality
(A.21) by — 1, we have:

E2 = ~ Yl

U l o g 2 ~ )>-(N-M)p*log2-.

i=M+l \

P1/

(A.22)
P

In other words, E2 > —(N — M)p* log2 ~. Inequality (A.22) establishes a lower bound
for the additional error term E2Now, suppose that the cmphical probabilities q,, % = M + 1, • .. , N, are uniformly
distributed with a probability value q*. In this case, the emphical entropy value is

106

maximized and we would have

E2<(N-M)q*log2\q

If we assume that p% = p*, i = M + 1,

f^ (pxlog2-)
Pl
z=M+r V
'

,N, then the following inequality holds

E2 < (N - M)q* log 2 \ - ( N N

< (w - M y log *
because p* log 2 ± > J2?=M+I

(P* 1O§2

(A 23)

M)p* log 2

\

,

*

(A 24)

£ (p.iog2^-)

^)

Inequality (A 24) establishes an upper bound for the additional enoi term Z?2
Based on inequalities (A 22) and (A 24), E2 is bounded by below and above as follows

-(N - M)p* log 2 — < E2 < (N - M) q* lo S2 — - P* !og2 —
p*
q*
p*
Let e = q* — p*

(A 25)

Then, we can provide a scconcl-oider asymptotic expansion on e in

older to approximate the uppei bound of E2 in inequality (A 25) Using a simplified
version of the asymptotic expansion given m toimula (A 18), we have the following

tp(e) = q* log 2 — - p* log 2 —
q*
p*

(A 26)
2

ln2

\ + 2 p

:

] - e log 2 p* + o (e )

as e —• 0

For e —>• 0, <£>(e) is negligible Inequality (A 25) may now be w n t t c n as

- ( i V - M)p* log 2 — < E2 < (N - M)f(e)
p*
The cmpnical probabilities qt, i = 1,2,

(A 27)

, M, do not follow a uniform distri-

bution, as noted in Section 2 3 2 The asymptotic approximation in foimula (A 18)
107

suggests that the theoretical probabilities, too, are not uniformly distributed in the
strict sense. Suppose that, for the purpose of finding a more reasonable lower bound,
we wish to smoothen the probabilities p*, i = M + 1 , . . . , N, with the caveat that the
probability values for i = 1, 2 , . . . , M are not modified. Suppose that, after smoothing,
the resulting probability value is p*. Then, the following inequality holds:1

1

< P* <N-M
A
N + S -"

'>

(A-28)

for \S\ < M. We focus on the left-hand side of inequality (A.28)

"' a JTTi •

(A29)

Taking the logarithm base 2 of both sides in (A.29), we have:

log2p*>log2-l—.
N +5'

(A.30)

Multiplying both sides in (A.30) by — p*, we have

-P* lo &2 P* < ~P* l o S2

p*\og2\<p*\og2(N
p*

N +S

(A.31)

+ 5)

Multiplying both sides in (A.31) by — (N — M), we have:

-(N - M)p* l o g 2 1 > -(iV - M)p* log2(7V + 5) .
p*

x

(A.32)

To see why this is the case (for theoretical piuposes), let S = X^i? 5 *- Then, a unifor m
probability value p* for probabilities ]\, i = M + 1 to J = N, could be an aveiage value p*
jfz~M Y2',=A4+\ V%- This value may be rewritten as p* = jj^jj> which is less than jfzjj-

108

Multiplying both sides of inequality (A 29) by ~(N - M) \og2(N + 5) gives
N - M
N + S \og2(N + 5)>-{N-

M)p* \og2(N + S)

(A 33)

For \S\ < M, the following holds
N

~log2(N
N +S

+ S)>-log2(N

+ S)

(A 34)

From the right-hand side of inequality (A 28), we have (N — M)p* < 1 Multiplying
both sides by — \og2(N + 5) gives

-(N-

M)p* log2(N + 5) > - log2(N + S)

(A 35)

Taking the logarithm to base 2 of the tcims m inequality (A 29) and multiplying both
sides by (N — M)p* yields the following

-(N-

M)P* l o g 2 1 > -(AT - M)p* log2(7V + 5)

(A 36)

From formulas (A 32) and (A 35), wc have

-(TV - M)p* log2 - i > - log2(/V + 5)

(A 37)

Fiom inequalities (A 27) and (A 37), wc may bound enoi E2, given m formula (A 20),
as follows
-\og2{N

+ 8) < E2 < (N-M)tp(e)

,

(A 38)

where ip(e) is given m formula (A 26)
In the case of 8 x 8 blocks, the bounds in formula (A 38) suggest that the additional crroi incur led on the constiuctcd codcbook is negligible Foi the lowci bound,
109

consider the worst case when 5 is very close to M = 6952 and let N = 2 64 . Then,
log2(iV + S) ~ 64 bits. The upper bound, on the other hand, tends to zero as e —> 0.

110

Appendix B
Test Images
W h a t most experimenters take for granted before they begin
their experiments is infinitely more interesting than any lesults
to which their experiments lead
- NORBbRl WlENbR

B.l

Binary Images

Heie, we exhibit over 100 binary images (source

[41]) employed for compiession

via the pioposed method and JBIG2

Displayed underneath each image are the di-

mensions and the compression lesults

Oveiall, the pioposed method outpeifoims

JBIG2 by 5 33% with aiithmetic coding and 5 45% with CA In addition, Cy\ outpci
foims Co m all images It can be noticed that Huffman and Arithmetic coding yield
close compiession ratios

111

T a b l e B . l : Solid test images

001

002

274 x 208
Ep
84 2
E*
88 66
AC
90 06
JBIG2 81 18

gr I
003

004

1006 x 669
Ep
93 11
E*
96 78
AC
96 6
JBIG2 95 46

900 x 899
Ep
91 23
E*
95 43
AC
95 31
JBIG2 94 27

512 x 800
Ep
93 25
E*
95 99
AC
95 73
JBIG2 94 69

006

007

008

2126 x 1535
Ev
89 97
E*
96 23
AC
95 75
JBIG2 95 78

640 x 492
Ep
93 29
E*
96 56
AC
96 41
JBIG2 95 32

2400 x 3000
Ep
92 56
E;
97 59
AC
96 98
JBIG2 98 34

1024 x 768
Ep
88 35
E*
92 89
AC
93 92
JBIG2 89 06

010
1061 x 1049
Ep
87 9
E*
92 31
AC
94 05
JBIG2 88 44

2329 x 854
Ep
93 47
E*
96 68
AC
96 86
JBIG2 94 83

Oil

012

013

014

m

800 x 524
Ep
91 77
E*
95 47
AC
94 96
JBIG2 94 3

575 x 426
Ep
90 85
E*
94 45
AC
95 29
JBIG2 92 82

535 x 518
Ep
89 57
E*
94 85
AC
94 51
JBIG2 92 84

498 x 395
Ep
88 4
E*
92 74
AC
93 69
JBIG2 87 58

1986 x 1500
Ep
91 32
E;
96 5
AC
96 12
JBIG2 95 37

112

015

Table B 1 (continued)

MPI
017

QUANTUM3D

018

400 x 400
Ev
77 45
E*
819
AC
86 17
JBIG2
74

542 x 493
Ep
87 27
E*p
93 27
AC
93 98
JBIG2 90 8

019

o

791 x 663
Ep
91 68
E*p
94 84
AC
95 3
JBIG2 91 81

448 x 444
Ep
89 12
E*
94 63
AC
94 54
JBIG2 92 29

A <§1

483 x 464
Ep
89 67
E*
93 73
AC
94 78
JBIG2 89 8

020

3>
®
m
INEXTI

i 1TL
360 x 441
Ev
88.49
E*
90 38
AC
90 98
JBIG2 85 94

022

023

024

025

315 x 394
Ep
84 38
E*
89 66
AC
93 32
JBIG2 91 94

241 x 490
Ep
86 9
E*p
89 44
AC
91 03
JBIG2 84 36

542 x 564
Ep
94 4
E*
97 31
AC
97 19
JBIG2 97 01

1500 x 845
Ep
93 28
E*p
96 88
AC
96 78
JBIG2 95 92

ft
026

027

3000 x 2400
Ep
94 1
E*
97 5
AC
97 2
JBIG2 97 14

3000 x 2150
Ep
94 14
E*
97 26
AC
96 92
JBIG2 94 96

T

030

028

2400 x 1920
Ep
94 37
E;
97 73
AC
97 47
JBIG2 98

113

196 x 390
Ep
84 85
E*
89 17
AC
91 61
JBIG2 81 59

167 x 252
Ep
87 28
E*
92 79
AC
93 69
JBIG2 86 67

Table B.l (continued)

031

032

H

034

035

372 x 217
Ep
81.71
El
85.56
AC
87.16
JBIG2 80.1

527 x 354
Ep
89.33
E*p
94.67
AC
95.37
JBIG2 92.7

033

263 x 318
Ep
90.41
E*v
93.95
AC
94.39
JBIG2 89.77

603 x 337
Ep
88.28
E*
92.65
AC
93.47
JBIG2 88.97

M flV

205 x 207
Ep
86.84
E*p
92.43
AC
93.52
JBIG2 86.92

036

037

038

1103 x 1088
Ep
90.18
E;
96.06
AC
95.68
JBIG2 94.98

357 x 281
Ep
84.59
E*
91.73
AC
92.52
JBIG2 87.85

226 x 418
Ep
84.99
E*v
91.52
AC
93.53
JBIG2 90.07

w

wm
039

040

805 x 447
Ep
85.67
El
91.12
AC
91.72
JBIG2 87.5

300 x 300
Ep
92.12
E*p
95.38
AC
96.18
JBIG2 91.4

044

043

340 x 493
Ep
84.47
E*
89.68
AC
90.55
JBIG2 85.38

490 x 481
Ep
86.04
E*
89.92
AC
90.96
JBIG2 84.99

150 x 149
Ep
83.88
E*p
89.02
AC
91.76
JBIG2 78.92

114

391 x 282
Ep
88.94
E'l
93.22
AC
94.17
JBIG2 88.91

fi
045

287 x 481
Ep
88.55
E;
92.51
AC
93.58
JBIG2 85.81

Table B 1 (continued)

046

1

048

049

050
325 x 299
Ep
92 03
E*
94 83
AC
95 26
JBIG2 91 36

047

325 x 195
Ep
90 46
E*
93 92
AC
95 09
JBIG2 89 18

225 x 375
Ep
90 63
E*
94 26
AC
94 87
JBIG2 90 75

325 x 195
Ep
90 79
E;
93 7
AC
94 91
JBIG2 89 21

275 x 275
Ep
90 55
E*
93 75
AC
94 81
JBIG2 86 71

051

T

054

350 x 229
Ep
91 75
E*
95 75
AC
95 46
JBIG2 93 69

052
360 x 270
Ep
92 76
E*
95 4
AC
95 29
JBIG2 92 29

053

350 x 340
Ep
88 65
E*
94 29
AC
94 34
JBIG2 91 92

263 x 233
Ep
87 87
E*
92 31
AC
93 09
JBIG2 85 14

E*
92 1
AC
92 81
TBIG2 87 81

056

057

058

059

060

450 x 360
Ep
85 44
E;
91 16
AC
92 69
JBIG2 86 86

180 x 210
Ep
83 55
E*
89 31
AC
91 35
JBIG2 80 17

300 x 300
Ep
91 36
E;
94 87
AC
96 22
JBIG2 90 03

200 x 329
Ep
88 99
E*
93 72
AC
94 58
JBIG2 88 58

586 x 193
Ep
90 38
E*
93 76
AC
94 34
JBIG2 90 88

115

I

055
216 x 348
Ep
87 76

Table B 1 (continued)

061

062

063

064

065

640 x 439
Ep
93 98
E*v
96 39
AC
96 7
JBIG2 96 01

640 x 412
Ep
94 62
E*
97.09
AC
97 23
JBIG2 96 42

640 x 439
Ep
93 71
E*v
96 76
AC
96 91
JBIG2 96 24

640 x 439
Ev
94 59
E*
96 77
AC
96 93
JBIG2 96 84

576 x 640
Ep
91 61
E*
9614
AC
95 86
JBIG2 93 88

V

>
066

067

068

069

070

922 x 649
Ep
92 36
E*
96 48
AC
96 06
JBIG2 96 2

403 x 616
Ep
88 46
E*
94 46
AC
94 36
JBIG2 91 76

460 x 460
Ep
93 16
E*
96 49
AC
96 44
JBIG2 95 02

784 x 536
Ep
93 03
E;
96 69
AC
96 5
JBIG2 95 21

524 x 641
Ep
90 58
E;
95 63
AC
95 12
JBIG2 94 08

*fc X071

072

073

074

075

545 x 393
Ep
90 72
E*
93 75
AC
94 22
JBIG2 89 82

729 x 434
Ep
91 63
E*p
95 54
AC
95 3
JBIG2 93 64

434 x 365
Ep
93 36
E*}
96 65
AC
96 49
JBIG2 95 8

203 x 247
Ep
86 9
E*
92 66
AC
93 15
JBIG2 87 68

790 x 480
Ep
92 78
E*
96 5
AC
96 25
JBIG2 94 8

116

Table B.l (continued)

<

*

*

076

077

078

079

080

245 x 226
Ev
86.24
E*
92.75
AC
93.41
JBIG2 86.82

450 x 295
Ep
88.47
E*
95.65
AC
95.38
JBIG2 94.83

285 x 504
Ep
90.65
E*
94.64
AC
95.1
JBIG2 90.71

245 x 158
Ep
85.35
E*
91.42
AC
91.96
JBIG2 84.91

491 x 449
Ep
91.86
E*
95.71
AC
95.86
JBIG2 92.17

X

*

081

082

083

084

085

245 x 248
Ep
89 2
E*
92 84
AC
93 3
JBIG2 86.69

491 x 526
Ep
92 21
E*
96 M
AC
96 08
JBIG2 94.31

354 x 260
Ep
88 93
E;
95 29
AC
95 48
JBIG2 92.24

240 x 394
Ep
87 85
El
92 52
AC
92.63
JBIG2 89.31

167 x 405
Ep
86 9
El
92 55
AC
93.62
JBIG2 87.7

117

Table B 1 (continued)

wmr~
086

087

088

089

090

335 x 500
Ep
91 12
E*p
95 88
AC
95 62
J B I G 2 94 97

447 x 459
Ep
89 89
E*
96 19
AC
95 73
J B I G 2 93 85

1203 x 1200
Ep
91 12
E*
95 88
AC
95 3
J B I G 2 95 94

610 x 763
Ep
90 39
E*
95 85
AC
95 3
J B I G 2 94 36

350 x 357
Ep
86 68
E*
95 02
AC
94 8
J B I G 2 92 18

Q

$

091

092

093

094

095

381 x 497
Ep
88 65
E*
93 65
AC
93 42
JBIG2 91 35

500 x 500
Ep
90 78
E*
96 08
AC
96 11
J B I G 2 94 43

516 x 687
Ep
93 04
E*
96 02
AC
96 02
JBIG2 94 87

1018 x 486
Ep
92 43
E*
96 42
AC
96 32
JBIG2 96 57

680 x 449
Ep
92 86
E*
96 06
AC
96 41
JBIG2 95 02

Vk

i

*

&

« &

096

097

098

099

100

516 x 687
Ep
93 6
E*
97 08
AC
97 06
JBIG2 97 9

510 x 727
Ep
94 06
E*
9719
AC
96 88
1BIG2 96 87

561 x 339
Ep
91 39
E*
94 86
AC
95 55
JBIG2 92 86

889 x 567
Ep
94 64
E*
97 22
AC
97 36
J B I G 2 97 16

765 x 486
Ep
95 54
El
97 53
AC
97 59
J B I G 2 97 71

Table B 2 displays the six binaiy images (souire [53]) compiismg boundaiy lines

118

and the inverted counterparts. Results for these images are exhibited in Tables 3.9
and 3.10 in Section 3.1.
Table B.2: Test images with boundary lines

i^~

^0

-40il
"^

101-w

102-w

103-w

(w y
•^A*r
104-w

105-w

106-w

101-b

102-b

103-b

104-b

105-b

106-b

119

B.2

Selected Discrete-Color Images

Table B 3 exhibits the three topographic maps (source [20]) used m Section 3 2
Each map contains four layers at 24-bit depth and 200 dpi resolution

Compression

results for these maps are exposed in Table 3 12
Table B.3: Topographic maps

t

Map 1

Map 2

Map 3
Table B 4 lllustiatcs the rhaits and giaphs used in Table 3 13 of Section 3 2
120

Table B.4: Charts and graphs

'111,
;.',

SSscCfiiaL MM

.1,1 i

i,i

fum H!

2
97 x 181

3
97 x 174

4
103 x 164

5
86 x 86

6
73 x 163

7
74 x 157

8
98 x 209

9
74 x 157

114 x 221

m.

J

IJJMIJJ

1 '

1 <

' A

.JIJ.. i

10
74 x 157

Till1
ti* i>
• <\v

11
90 x 169

12
93 x 177

14
74 x 157

15
74 x 157

t—>•
t~

—r

13
74 x 157

121

5

11

Table B 4 (continued)

,

1

|p«B<^|§|
J

%

16
74 x 157

17
74 x 157

18
74 x 98

19
105 x 114

20
74 x 157

21
74 x 157
§5110~—7^
13 ^ 0 ^ 0 0 ,
V Qe, un

^^o^*^

16 I

»/'

5 ti—i

1 ££
—

22
101 x 205

13X

—

23
74 x 157

122

F~) 6

/ / t

14

vA5

/7
8

1T— —HTf

24
71 x 83

Index
Arithmetic coding, 15, 68, 83
codebook, 19
block distribution, 23, 29
coding, 32
decoding, 32
entiopy, 28
e n o i , 27, 102, 105
fixed-to-vaiiable, 19
redundancy, 28
coding scheme, 45, 48
C0, 53, 55, 73, 111
CA, 53, 55, 73, 111
expected compression late Ep, 47
expected compression rate E*, 52
compression, 9
decoding, 9
lossless, 9
lossy, 9
methods, 9
ratio, 16
CRV, see RCRC

RCRC, 34
column reference vector, 34
compiession probability, 40
reduced block, 35
row reference vector, 34
row-column reduction coding, see RCRC
RRV, see RCRC
waiting probability, 23, 100

entropy, 11
bmaiy, 13
fhst-ordei, 11
joint, 13
Huffman coding, 14, 23, 84
image, 16
bmaiy, 16, 72, 111
disciete-coloi, 16, 80
modeling, 10, 28
and coding paiadigm, 10
RB, see RCRC
123