
110 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 58, NO. 2, FEBRUARY 2011
A Modified Chaos-Based Joint Compression and Encryption Scheme
Jianyong Chen, Junwei Zhou, and Kwok-Wo Wong, Senior Member, IEEE
Abstract: An approach for improving the compression performance of an existing chaos-based joint compression and encryption scheme is proposed. The lookup table used for encryption is dynamically updated in the searching process. Once a partition not matched with the target symbol is visited, this and other partitions mapped to the same symbol are reallocated to a nonvisited symbol. Therefore, the target symbol eventually associates with more partitions, and fewer iterations are needed to find it. As a result, expansion of the ciphertext is avoided, and the compression ratio is improved. Simulation results show that the proposed modification leads to a better compression performance, whereas the execution efficiency is comparable. The security of the modified scheme is also analyzed in detail.
Index Terms: Chaos, compression, cryptography, simultaneous compression and encryption.
I. INTRODUCTION
THE EFFICIENCY and security requirements of information transmission have led to a substantial amount of research work in data compression and encryption. In order to improve the performance and the flexibility of multimedia applications, it is worthwhile to perform compression and encryption in a single process [1]–[5]. In general, there are two distinct research directions in this area. One of them embeds key-controlled confusion and diffusion in source-coding schemes, whereas the other incorporates compression in cryptographic algorithms.
Some attempts at introducing key-controlled operations in entropy coding can be found in [1]–[4]. The approach based on multiple Huffman tables [1] simultaneously performs encryption and compression by a key-controlled swapping of the left and right branches of the Huffman tree. Some approaches such as randomized arithmetic coding [2] and key-based interval splitting [3], [4] were proposed to embed cryptographic features in arithmetic coding. The compression capability of these algorithms is close to that of traditional entropy coding. However, security and efficiency problems were found [6]–[8], and further modifications are required.
Manuscript received August 4, 2010; revised November 2, 2010; accepted
December 21, 2010. Date of publication February 14, 2011; date of current
version February 24, 2011. This work was supported by Shenzhen University
Research and Development Fund under Grant 200903. This paper was recom-
mended by Associate Editor G. Grassi.
J. Chen and J. Zhou are with the Department of Computer Science
and Technology, Shenzhen University, Shenzhen 518060, China (e-mail:
jychen@szu.edu.cn).
K.-W. Wong is with the Department of Electronic Engineering, City Univer-
sity of Hong Kong, Kowloon, Hong Kong (e-mail: itkwwong@cityu.edu.hk).
Digital Object Identifier 10.1109/TCSII.2011.2106316
In recent years, there has been an increasing trend of designing ciphers based on chaos. This is because chaotic systems are sensitive to the initial condition and the system parameters, properties that are desirable in cryptography. Moreover, the knowledge on chaos and nonlinear dynamics can be applied in the field of cryptography. A chaos-based cipher designed by Baptista [9] searches the plaintext symbol in the lookup table using a key-dependent chaotic trajectory and treats the number of iterations on the chaotic map as the ciphertext. However, it suffers from ciphertext expansion: the ciphertext is usually about 1.5 to 2 times as long as the plaintext. An attempt was made in [5] to incorporate certain compression capability in the Baptista-type cipher by adaptively constructing the lookup table according to the probability of occurrence of the plaintext symbols. The ciphertext is not longer than the plaintext, but the compression ratio still falls short of the source entropy. This is because the chaotic search trajectory frequently lands on partitions corresponding to irrelevant source symbols, so that the number of iterations is larger than necessary.
In this brief, the model of sampling without replacement
is adopted in the joint compression and encryption scheme
proposed in [5], with the goal of improving the compression
performance. The lookup table used for encryption is dynam-
ically updated in the searching process. If a symbol visited
by the chaotic trajectory is not the one being encrypted, it
is removed from the current lookup table by assigning the
phase space associated with this symbol to another nonvisited
symbol. Since the target symbol eventually associates with a
larger phase space, it can be searched with a higher chance.
As a result, the number of iterations required for encryption is
reduced. The ciphertext is shortened, and a better compression
performance is achieved.
The rest of this brief is organized as follows. In the next
section, the problem of ciphertext expansion in the original
cipher [5], [9] and its mathematical model are analyzed. The
modified scheme is described in Section III. Simulation results
and security analyses are presented in Sections IV and V,
respectively. In the last section, some concluding remarks are
given.
II. CIPHERTEXT EXPANSION PROBLEM
Before the proposed scheme is described, a brief introduction to embedding compression in a chaos-based cryptosystem [5] is presented. This scheme can be considered as a hybrid cipher. Source symbols with high probability of occurrence are encrypted by searching in the dynamic lookup table, and this is called the search mode. Entropy coding is performed on the output of this mode. Then, both the entropy codewords and other less probable symbols are masked by a pseudorandom bitstream, and this is named the mask mode. The compression capability of this hybrid cipher is mainly contributed by the search mode, as the mask mode does not lead to any reduction in plaintext length. Therefore, the more symbols are encrypted in the search mode, the higher the compression ratio that can be achieved.
In the search mode, the plaintext symbol being encrypted
is searched in the lookup table using a pseudorandom
sequence generated by iterating a chaotic map from the key-
dependent parameters and initial condition. In [5], the logis-
tic map is chosen as the underlying chaotic map. It has the
form
$$x_{n+1} = b x_n (1 - x_n) \tag{1}$$
where $x_n \in [0, 1]$ is the output at discrete time $n = 0, 1, 2, \ldots$. The control parameter b should be a real number between 3.6 and 4 for generating chaotic output sequences.
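As a minimal sketch of the iteration in (1), the following Python function generates the first few outputs of the logistic map. The function name `logistic_trajectory` is an illustrative choice and does not come from [5]; the parameter values in the usage lines are the ones quoted for the example of Section III.

```python
def logistic_trajectory(x0, b, n):
    """Return the iterates x_1 .. x_n of x_{k+1} = b * x_k * (1 - x_k)."""
    if not 0.0 <= x0 <= 1.0:
        raise ValueError("x0 must lie in [0, 1]")
    xs = []
    x = x0
    for _ in range(n):
        x = b * x * (1.0 - x)  # one application of the logistic map (1)
        xs.append(x)
    return xs


if __name__ == "__main__":
    # Parameters of the worked example: b = 4.0, x_0 = 0.33866355.
    for i, x in enumerate(logistic_trajectory(0.33866355, 4.0, 4), start=1):
        print(f"x_{i} = {x:.8f}")
```

In double precision the four printed values agree, to about seven decimal places, with the iterates 0.89588219, 0.37310916, 0.93559486, and 0.24102849 quoted in Section III.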
The lookup table is composed of the partitioned phase space of the chaotic map and the corresponding symbol mapping. The phase space is first divided into a number of equal-width partitions, each of which maps to a possible plaintext symbol. More probable symbols are assigned more partitions so that they have a higher chance of being visited by the secret searching trajectory. The length of the searching trajectory is equal to the number of iterations of the chaotic map, which is then taken as the ciphertext.
The encryption process is similar to the following model: Encrypting the target plaintext symbol using the lookup table is equivalent to randomly fetching a symbol until the desired symbol is drawn. The number of draws is equivalent to the length of the searching trajectory, i.e., the ciphertext. Here, a model of sampling with replacement is presented to analyze the ciphertext expansion problem. Suppose that there are four source symbols, i.e., A, B, C, and D, with probabilities of occurrence of 1/2, 1/4, 1/8, and 1/8, respectively. Hence, half of the partitions are mapped to symbol A and a quarter to symbol B. Symbols C and D are each associated with 1/8 of the total number of partitions.
In sampling with replacement, the procedure of encrypting symbol A corresponds to the geometric distribution in probability theory and statistics. It can be considered as the first success in getting A at the kth draw after k − 1 failures. The total number of draws k is the ciphertext for symbol A. The probability of drawing A from the table with replacement is given by (2), where $[1 - P(A)]^{k-1}$ is the probability of failing to obtain A in the first (k − 1) draws. The cumulative probability for the first k draws, denoted as $CP_k(A)$, is given by (3). These expressions indicate that $P_k(A)$ is close to zero and $CP_k(A)$ approaches 1 only if k tends to infinity. The ciphertext of A could be any value from one to infinity in theory. It could occupy a large number of bits in practice and leads to ciphertext expansion in this type of chaos-based cipher [5], [9]:
$$P_k(A) = [1 - P(A)]^{k-1} P(A) \tag{2}$$
$$CP_k(A) = P(A) + [1 - P(A)] P(A) + \cdots + [1 - P(A)]^{k-1} P(A) = 1 - [1 - P(A)]^{k}. \tag{3}$$
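To make (2) and (3) concrete, the following Python sketch evaluates both expressions for the least probable symbol of the four-symbol example, P(C) = 1/8. The function names are illustrative only and do not appear in [5].

```python
def first_hit_probability(p, k):
    """P_k of (2): first success on draw k when each draw succeeds with probability p."""
    return (1.0 - p) ** (k - 1) * p


def cumulative_probability(p, k):
    """CP_k of (3), closed form: success within the first k draws."""
    return 1.0 - (1.0 - p) ** k


if __name__ == "__main__":
    p_c = 1 / 8  # symbol C of the four-symbol example
    for k in (1, 8, 16, 32):
        print(k, first_hit_probability(p_c, k), cumulative_probability(p_c, k))
```

For P(C) = 1/8, the cumulative probability after eight draws is 1 − (7/8)^8, about 0.66, so roughly one search in three for C would still be unfinished after eight iterations under this model.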
III. MODIFIED SCHEME
A. Encryption
Unlike the model of sampling with replacement as adopted
in [5], here, we propose to use the model of sampling without
replacement in updating the lookup table. The encryption pro-
cedures of the proposed scheme are described as follows.
Step 1) Following the approach of [5], as described in Section II, construct the lookup table according to the probabilities of occurrence of the source symbols.
Step 2) Sequentially encrypt each symbol in the plaintext by searching in the lookup table using a secret chaotic trajectory. If the symbol being encrypted is found, the number of iterations of the chaotic map is taken as the ciphertext. Otherwise, the lookup table is updated using a new process based on the model of sampling without replacement. If the partition just visited maps to a nontarget symbol, all the partitions associated with that symbol are reassigned to another symbol. In consideration of the compression ratio and the simplicity of the encryption process, those partitions are assigned to the nonvisited symbol with the highest probability. Note, however, that the partitions could instead be assigned to other symbols at random to increase the difficulty of attack. In the next iteration, the chaotic trajectory continues to search for the target symbol in the updated lookup table until the symbol is found.
Step 3) When the current plaintext symbol has been encrypted, the lookup table is initialized again according to the symbols' probabilities of occurrence, as described in Step 1). However, the exact partitions mapped to a symbol may be shifted [5] so that the lookup table for encrypting the next symbol may not be the same as the one for the current symbol. After that, the next symbol is encrypted using the same procedures in Step 2). These operations are repeated until all the symbols in the plaintext sequence have been processed.
As the proposed scheme mainly focuses on the searching
model, the other procedures remain the same as those described
in [5] and are not repeated here.
An illustration of the lookup table update process for the example source in Section II is shown in Fig. 1. The parameter b of (1) is arbitrarily set to 4.0, and the initial condition is randomly chosen as 0.33866355. Suppose that the target symbol is C and the first iterated value $x_1$ of the chaotic map is 0.89588219, which lands on a partition mapped to the irrelevant symbol A. For that reason, all the partitions corresponding to symbol A are assigned to symbol B for the second iteration, and the lookup table is updated. In the next iteration, the second chaotic map output $x_2$ is equal to 0.37310916. Unfortunately, it again does not land on a partition corresponding to symbol C. The searching process continues with all the partitions of B assigned to C, and the lookup table is updated again. The third iterated value $x_3$ of the chaotic map is 0.93559486, which falls into a partition associated with the target symbol C. Therefore, the ciphertext is 3, and the process of encrypting the current symbol is complete.
Fig. 1. Illustration of the proposed table update scheme. Only three iterations are needed to encrypt the symbol C; therefore, the ciphertext is 3.
Fig. 2. For comparison, the scheme used in [5] is presented. Four iterations are needed to encrypt the target symbol C, and therefore, the ciphertext is 4.
For comparison purposes, the original scheme [5] is illustrated in Fig. 2. After four iterations, the fourth iterated value $x_4$ of the chaotic map is 0.24102849, and the chaotic trajectory lands on the phase space mapped to symbol C. The corresponding ciphertext is 4, which is larger than that of the modified scheme. Thus, the required number of iterations, i.e., the ciphertext value, is reduced in the new approach.
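The search with and without table update can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the eight-partition layout `LAYOUT` is hypothetical (the actual layout of Figs. 1 and 2 is not given in the text) and was chosen only to be consistent with the quoted trajectory; ties between equally probable nonvisited symbols are broken in favor of the first symbol listed; entropy coding, the mask mode, and the table shifting of [5] are omitted.

```python
# Probabilities of the four-symbol example source.
PROBS = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}
# Hypothetical layout of 8 equal-width partitions (A: 4, B: 2, C: 1, D: 1).
LAYOUT = ["A", "C", "B", "D", "A", "B", "A", "A"]


def search(table, probs, target, x0, b, update, max_iter=64):
    """Return the number of logistic-map iterations needed to land on `target`.

    With update=True, every partition of a visited nontarget symbol is
    reassigned to the nonvisited symbol of highest probability (Step 2).
    With update=False, the table stays fixed as in the original search.
    """
    table = list(table)  # work on a copy; the table is reset for each symbol
    visited = set()
    x = x0
    for k in range(1, max_iter + 1):
        x = b * x * (1.0 - x)
        idx = min(int(x * len(table)), len(table) - 1)  # partition hit by x
        hit = table[idx]
        if hit == target:
            return k
        if update:
            visited.add(hit)
            # The target is never in `visited`, so a candidate always exists.
            heir = max((s for s in probs if s not in visited), key=probs.get)
            table = [heir if s == hit else s for s in table]
    raise RuntimeError("target not found within max_iter iterations")


if __name__ == "__main__":
    print(search(LAYOUT, PROBS, "C", 0.33866355, 4.0, update=True))
    print(search(LAYOUT, PROBS, "C", 0.33866355, 4.0, update=False))
```

With this assumed layout, the updated search returns 3 and the fixed-table search returns 4 for target symbol C, matching the ciphertext values reported for Figs. 1 and 2. A different layout would generally give different counts.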
B. Decryption
The secret key consists of the initial value and the parameter
of the chaotic map as well as the initial 32-bit mask block [5].
It must be secretly delivered to the receiver. In addition, the
information of plaintext probability must be available to the
receiver for reconstructing the lookup table. The decryption
process is similar to the encryption one. The decoder regen-
erates the chaotic trajectory from the secret key and then looks
up the plaintext symbol from the lookup table. If the number
of iterations is smaller than the ciphertext, the chaotic search
trajectory has not landed on the target symbol yet. Then, the
same symbol replacement process needs to be performed so as
to synchronize with the encryption part. With the correct key,
the original plaintext sequence can be reconstructed.
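The synchronization between decoder and encoder can be sketched as follows. The decoder does not know the target symbol, but the update rule depends only on which symbols have been visited, so replaying it for the first k − 1 iterations reproduces the encoder's table. As before, this is a sketch under assumptions: the layout `LAYOUT` is hypothetical, ties go to the first symbol listed, and the mask mode and entropy decoding of [5] are left out.

```python
PROBS = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}
# Hypothetical layout of 8 equal-width partitions, as assumed for illustration.
LAYOUT = ["A", "C", "B", "D", "A", "B", "A", "A"]


def decrypt_symbol(table, probs, k, x0, b):
    """Recover the plaintext symbol from the ciphertext value k (iteration count)."""
    if k < 1:
        raise ValueError("ciphertext value must be at least 1")
    table = list(table)
    visited = set()
    x = x0
    for i in range(1, k + 1):
        x = b * x * (1.0 - x)
        idx = min(int(x * len(table)), len(table) - 1)
        sym = table[idx]
        if i == k:
            return sym  # the k-th landing point is the plaintext symbol
        # Fewer than k iterations so far: the encoder had not found its
        # target yet, so replay the same replacement it performed.
        visited.add(sym)
        candidates = [s for s in probs if s not in visited]
        if not candidates:
            raise ValueError("ciphertext value too large for this table")
        heir = max(candidates, key=probs.get)
        table = [heir if s == sym else s for s in table]


if __name__ == "__main__":
    print(decrypt_symbol(LAYOUT, PROBS, 3, 0.33866355, 4.0))
```

Under these assumptions, the ciphertext value 3 of the encryption example decrypts back to symbol C.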
IV. PERFORMANCE OF THE MODIFIED SCHEME
A. Efficiency Analysis
Due to the hitting of nontarget symbols, the required number of iterations in the original scheme [5] is large, and the compression performance is thus limited. In order to prevent the chaotic map trajectory from falling again into the partitions corresponding to irrelevant symbols, those partitions are released in the modified scheme by replacing the nontarget symbol visited in the previous step by another symbol. This process is similar to the model of sampling without replacement, and it follows the hypergeometric distribution for successfully obtaining a sample from a finite population at the kth time without replacement. Equation (4) gives the probability of finding symbol A in the lookup table without replacement, where N is the total number of partitions, M is the number of partitions corresponding to A, and $C_n^r$ denotes the number of ways of choosing r items from n. The cumulative probability $CP_k(A)$ of the first k draws is given by (5). This value is equal to 1 when k is (N − M + 1). This means that the maximum value of the ciphertext for each plaintext symbol is (N − M + 1). The pigeonhole principle also illustrates this fact. The number of iterations does not exceed the number of distinct plaintext symbols, and therefore, the ciphertext is guaranteed to be no longer than the plaintext. The modified scheme finds the target plaintext symbol faster than the original one [5] and results in a higher compression ratio:
$$P_k(A) = \frac{C_M^1 \, C_{N-M}^{k-1}}{k \, C_N^k} \tag{4}$$
$$CP_k(A) = C_M^1 \sum_{s=1}^{k} \frac{C_{N-M}^{s-1}}{s \, C_N^s}. \tag{5}$$
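Expressions (4) and (5) can be checked numerically with exact rational arithmetic. The Python sketch below uses illustrative function names and an example size of N = 8 partitions with M = 4 mapped to A, which is an assumption made here for the check rather than a value taken from the simulations.

```python
from fractions import Fraction
from math import comb


def first_hit_probability(N, M, k):
    """P_k of (4): the first of M matching partitions is drawn on draw k,
    sampling without replacement from N partitions."""
    return Fraction(M * comb(N - M, k - 1), k * comb(N, k))


def cumulative_probability(N, M, k):
    """CP_k of (5): a matching partition is drawn within the first k draws."""
    return sum(first_hit_probability(N, M, s) for s in range(1, k + 1))


if __name__ == "__main__":
    N, M = 8, 4  # assumed example size
    for k in range(1, N - M + 2):
        print(k, first_hit_probability(N, M, k), cumulative_probability(N, M, k))
```

For N = 8 and M = 4, the per-draw probabilities are 1/2, 2/7, 1/7, 2/35, and 1/70, which sum to exactly 1 at k = N − M + 1 = 5, in line with the stated upper bound on the number of draws.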
B. Simulation Results
The proposed algorithm is implemented in C++ program-
ming language running on a personal computer with an Intel
Core 2 2.00-GHz processor and 2-GB memory. We follow the
choices in [5] so as to make a fair comparison with its results.
The logistic map is chosen as the underlying chaotic map. The
parameter b is set to 3.999999991, whereas the initial condition
is chosen as 0.3388. The maximum number of iterations is set
to 15.
To test the compression capability of the proposed scheme, the standard files from the Calgary Corpus [10] are used. There are 18 distinct files of different types, including text, executable, geophysical data, and picture. Two simulation configurations are chosen. In the first configuration, only the top 16 probable plaintext symbols are selected, and all of them are stored in one table. In the second configuration, the top 128 probable symbols are chosen. They are distributed to 16 tables, and each table contains eight symbols. For comparison with the modified scheme, the compression results of the original scheme are directly extracted from [5], without reexecution. This is because the compression ratio is independent of the hardware computing platform. In addition, the files are compressed by the Huffman-coding scheme without encryption as a reference for the performance of traditional entropy coding.
TABLE I
CIPHERTEXT-TO-PLAINTEXT RATIO OF THE CALGARY CORPUS FILES
Table I lists the ciphertext-to-plaintext ratios of the three approaches. All ratios are smaller than 100%, which implies that all files can be compressed. Moreover, the compression performance of our scheme is far better than that of [5]. For the 16-map case, the compression performance improves by 4.06%–14.62%, with an average improvement of 10.96%. For the one-map case, the improvement falls between 4.41% and 16.85%, with a mean value of 12.54%. Furthermore, the compression performance is, on average, only 4% worse than that of Huffman coding. This can be considered as the tradeoff for having additional cryptographic features in traditional entropy coding.
C. Encryption and Decryption Efficiency
For the 16-map case, the encryption and decryption times of the proposed scheme are listed in Table II. Moreover, the original algorithm [5] is reexecuted on our computing platform, and the results are also presented in the same table. The data show that the modified scheme is slightly faster than the original approach [5]. The files are also compressed by Huffman coding, followed by the 128-bit Advanced Encryption Standard (AES). The comparison results show that the proposed scheme needs less time in encrypting and decrypting 16 out of 18 files. Moreover, the average encryption and decryption speeds are increased by 8.7% and 11.8%, respectively.
TABLE II
COMPARISON OF ENCRYPTION AND DECRYPTION TIMES
V. SECURITY ANALYSES
A. Key Space, Key, and Plaintext Sensitivities
The key of the proposed scheme is composed of the control parameter b and the initial value $x_0$ of the logistic map, together with the 32-bit initial cipher block. In the software implementation, b and $x_0$ are represented in double-precision format using 52 bits each. The total key space can reach 136 bits and is comparable to that of 128-bit AES. However, b should be carefully chosen to avoid the nonchaotic regions with a negative Lyapunov exponent [11], [12]. A plot of the Lyapunov exponent against b is shown in Fig. 3. In the chaotic region with a positive Lyapunov exponent, the period of the output sequence can be considered as infinite. However, the actual period is limited by the computation precision. A constructive scheme for finding the maximum period of a chaotic map under limited precision is described in [13]. It is employed to find the period of the logistic map. The results show that all the periods in the chaotic region are far longer than $10^7$ in double-precision computation, which is sufficient for practical ciphers [13].
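One common way to screen a candidate value of b is to estimate the Lyapunov exponent numerically as the trajectory average of ln|f'(x)|, where f'(x) = b(1 − 2x) for the logistic map. The Python sketch below follows that standard estimator; it is not claimed to be the procedure used to produce Fig. 3, and the function name, transient length, and sample size are illustrative assumptions.

```python
import math


def lyapunov_exponent(b, x0=0.3388, transient=1000, n=20_000):
    """Estimate the Lyapunov exponent of x -> b*x*(1-x) as the average of
    ln|b*(1 - 2x)| along a trajectory, after discarding a transient."""
    x = x0
    for _ in range(transient):
        x = b * x * (1.0 - x)
    total = 0.0
    count = 0
    for _ in range(n):
        d = abs(b * (1.0 - 2.0 * x))
        if d > 0.0:  # skip the isolated point x == 0.5, where ln is undefined
            total += math.log(d)
            count += 1
        x = b * x * (1.0 - x)
    return total / count


if __name__ == "__main__":
    print(lyapunov_exponent(3.999999991))  # value of b used in the simulations
    print(lyapunov_exponent(3.2))          # periodic regime, outside [3.6, 4]
```

A positive estimate indicates chaotic behavior for that b, whereas a negative one (as for b = 3.2, where the map settles onto a period-2 cycle) marks a value that should be rejected as a key.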
The key and plaintext sensitivities are evaluated as follows. The files from the Calgary Corpus [10] are encrypted using two secret keys with only a tiny difference. The two resultant ciphertext sequences are then compared bit by bit, and the percentage of differing bits is calculated. Ten tests are performed for the one-map and 16-map cases, respectively. The average values are listed in Table III. They show that the bit change percentages are very close to 50%, which confirms the high sensitivity of the ciphertext to the key. To numerically evaluate the plaintext sensitivity, a bit is changed at different positions of the plaintext sequence, which is then encrypted using the same key. The two resultant ciphertext sequences are compared bitwise. These operations are repeated 20 times, and the average values can be found in the rightmost column in Table III. The measured average bit changes are close to 50%, which implies that the ciphertext is very sensitive to the plaintext. These tests confirm the high key and plaintext sensitivities of our scheme. This is because the lookup table is disturbed by the plaintext. A tiny change in the plaintext affects not only the corresponding ciphertext block but also the subsequent encryption process.
Fig. 3. Plot of the Lyapunov exponent computed at increment of 0.0001 of the control parameter b in the interval [3.6, 4].
TABLE III
AVERAGE KEY AND PLAINTEXT SENSITIVITIES
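The bitwise comparison used in these sensitivity tests can be sketched as below. The function name is illustrative, and the requirement that both sequences have equal length is an assumption of this sketch; how ciphertexts of different lengths were compared is not stated in the text.

```python
def bit_change_percentage(a: bytes, b: bytes) -> float:
    """Percentage of bit positions at which two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    # XOR leaves a 1 exactly where the two bytes differ; count those bits.
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return 100.0 * differing / (8 * len(a))


if __name__ == "__main__":
    print(bit_change_percentage(b"\x00\x00", b"\xff\x00"))
```

For two statistically unrelated bit sequences, this percentage is expected to be close to 50%, which is the reference value against which Table III is read.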
B. Other Security Issues
Two encryption rounds are suggested in the original chaos-
based joint compression and encryption scheme [5]. The search
mode can be considered as a variant of the Baptista-type
cryptosystem [9]. The mask mode is indeed a stream cipher that
masks the plaintext by a pseudorandom bitstream. Its security
is determined by the randomness of the occurrence of the
numbers or bits in the mask stream [14]. The statistical test
suite recommended by the U.S. National Institute of Standards
and Technology [15] is employed to evaluate the randomness of
the mask stream. In the test, 300 sequences, each of 1 000 000
bits, have been extracted. They all pass the statistical tests
including frequency, block frequency, cumulative sums, runs,
longest run, rank, and fast Fourier transform. All the P-values
are larger than 0.01. Therefore, the sequences are considered as
sufficiently random according to [15].
In the original scheme [5], most selected probable plaintext
symbols are encrypted in the search mode, but some are en-
crypted in the mask mode together with low probable symbols.
Since the expansion of ciphertext is avoided, all the chosen
probable symbols are encrypted in the search mode of the
proposed scheme. Therefore, the security of our scheme is
higher than that of [5].
VI. CONCLUSION
The existing approach of embedding compression in the chaos-based cryptosystem suffers from the drawback of low compression performance. This can be explained by the model of sampling with replacement. A modified scheme based on the model of sampling without replacement has been proposed. As a result, the number of chaotic map iterations wasted in visiting irrelevant symbols is reduced. The compression capability is improved and is close to that of conventional entropy coding. Moreover, the lookup table can be realized by memory chips or a field-programmable gate array, and therefore, the proposed joint compression and encryption scheme is easy to implement in hardware circuits.
REFERENCES
[1] C. P. Wu and C. C. J. Kuo, "Design of integrated multimedia compression and encryption systems," IEEE Trans. Multimedia, vol. 7, no. 5, pp. 828–839, Oct. 2005.
[2] M. Grangetto, E. Magli, and G. Olmo, "Multimedia selective encryption by means of randomized arithmetic coding," IEEE Trans. Multimedia, vol. 8, no. 5, pp. 905–917, Oct. 2006.
[3] J. Wen, H. Kim, and J. Villasenor, "Binary arithmetic coding with key-based interval splitting," IEEE Signal Process. Lett., vol. 13, no. 2, pp. 69–72, Feb. 2006.
[4] H. Kim, J. Wen, and J. Villasenor, "Secure arithmetic coding," IEEE Trans. Signal Process., vol. 55, no. 5, pp. 2263–2272, May 2007.
[5] K. W. Wong and C. H. Yuen, "Embedding compression in chaos-based cryptography," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 55, no. 11, pp. 1193–1197, Nov. 2008.
[6] J. Zhou, Z. Liang, Y. Chen, and O. C. Au, "Security analysis of multimedia encryption schemes based on multiple Huffman table," IEEE Signal Process. Lett., vol. 14, no. 3, pp. 201–204, Mar. 2007.
[7] G. Jakimoski and K. Subbalakshmi, "Cryptanalysis of some multimedia encryption schemes," IEEE Trans. Multimedia, vol. 10, no. 3, pp. 330–338, Apr. 2008.
[8] J. Zhou, O. C. Au, and P. H. Wong, "Adaptive chosen-ciphertext attack on secure arithmetic coding," IEEE Trans. Signal Process., vol. 57, no. 5, pp. 1825–1838, May 2009.
[9] M. S. Baptista, "Cryptography with chaos," Phys. Lett. A, vol. 240, no. 1/2, pp. 50–54, Mar. 1998.
[10] [Online]. Available: ftp://ftp.cpsc.ucalgary.ca/pub/projects/text.compression.corpus
[11] G. Alvarez and S. Li, "Some basic cryptographic requirements for chaos-based cryptosystems," Int. J. Bifurcat. Chaos, vol. 16, no. 8, pp. 2129–2151, 2006.
[12] C. M. Ou, "Design of block ciphers by simple chaotic functions," IEEE Comput. Intell. Mag., vol. 3, no. 2, pp. 54–59, May 2008.
[13] T. Addabbo, M. Alioto, F. Fort, A. Pasini, S. Rocchi, and V. Vignoli, "A class of maximum-period nonlinear congruential generators derived from the Rényi chaotic map," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 54, no. 4, pp. 816–828, Apr. 2007.
[14] S. Tezuka, Uniform Random Numbers: Theory and Practice. Norwell, MA: Kluwer, 1995.
[15] A. Rukhin, J. Soto, J. Nechvatal, M. Smid, E. Barker, S. Leigh, M. Levenson, M. Vangel, D. Banks, A. Heckert, J. Dray, and S. Vo. (2010, Apr. 27). A Statistical Test Suite for the Validation of Random Number Generators and Pseudo Random Number Generators for Cryptographic Applications, NIST Special Publication 800-22. [Online]. Available: http://csrc.nist.gov/groups/ST/toolkit/rng/documentation_software.html
