Abstract--This paper presents a new proposed message digest algorithm (MD). Many of its characteristics (applications domain, performance and implementation structure) are similar to those of the MD4 family of hash functions. The proposed algorithm takes as input a message of arbitrary length and produces as output a 128/160-bit fingerprint or message digest. New features of the proposed algorithm include the heavy use of data-dependent rotations, and the inclusion of integer multiplication as an additional primitive operation. These proposed features are expected to provide a high security level with enhanced throughput. The proposed algorithm is intended for digital signature applications, where a large file must be compressed in a secure manner before being signed (encrypted) with a private secret key under a public-key cryptosystem. The proposed algorithm is designed to be quite fast on 32-bit machines. In addition, it does not require any large substitution tables, so that the algorithm can be coded quite compactly. We describe the general characteristics, architecture and implementation, and give a complete specification for MD-160/128. Several test vectors are used for inspecting the validity of the proposed algorithm. Also, we compare the software performance of several MD4-based algorithms, which is of independent interest. Simulation results show that the throughput of the proposed MD-128 is about 76.4 Mbit/sec, while that of RIPEMD-128 is about 69.8 Mbit/sec.

Index Terms--Cryptographic Hash Functions, Cryptanalysis, Software Performance.

This work was supported by the Department of Computer Sci. & Eng. in the Faculty of Electronic Engineering, Menouf-32952, Egypt.
Abdul Hamid M. Ragab is the head of the Computer Sci. & Eng. Department, Faculty of Electronic Engineering, Menouf-32952, Egypt.
Nabil A. Ismail is with the Department of Computer Sci. & Eng., Faculty of Electronic Engineering, Menouf-32952, Egypt (e-mail: nabil_is@hotmail.com).
Osama S. Farag Allah is with the Department of Computer Sci. & Eng., Faculty of Electronic Engineering, Menouf-32952, Egypt (e-mail: osami-salah_faragallah@hotmail.com).

I. INTRODUCTION
CRYPTOGRAPHIC hash functions are an important tool in cryptography for applications such as digital fingerprinting of messages, message authentication, and key derivation. Hash functions map bitstrings of arbitrary finite length into strings of fixed length. A hash value is generated by a function H of the form h = H(M), where M is a variable-length message and H(M) is the fixed-length hash value. The purpose of a hash function is to produce a fingerprint of a file, message, or other block of data. To be useful for message authentication, a hash function must satisfy the following properties:
1- H can be applied to a block of data of any size.
2- H produces a fixed-length output.
3- H(x) is relatively easy to compute for any given input x.
4- One-way: for any given code h, it is computationally infeasible to find x such that H(x) = h.
5- Weak collision resistance: for any given block x, it is computationally infeasible to find y ≠ x with H(y) = H(x).
6- Strong collision resistance: it is computationally infeasible to find any pair (x, y) such that H(y) = H(x).
The strength of a hash function against brute-force attacks [1] depends solely on the length of the hash code produced by the algorithm. Table I summarizes the level of effort required to mount a birthday or square root attack (which is referred to as the strength of the hash code) for different types of hash functions, assuming an m-bit result.

TABLE I. COMPARISON OF HASH-CODE STRENGTH BETWEEN DIFFERENT TYPES OF HASH FUNCTIONS
Type of hash function | Strength of hash function
One-way | $2^{m}$
Strong collision resistance | $2^{m/2}$

Almost all hash functions are iterative processes, which hash inputs of arbitrary length by processing successive fixed-size blocks of the input. The input X is padded to a multiple of the block length and subsequently divided into t blocks $x_1$ through $x_t$. The overall structure of a typical secure hash function is indicated in Fig. 1.
The hash function h can then be described as follows:
$H_0$ = initial n-bit value
$H_i = f(H_{i-1}, x_i)$, $1 \le i \le t$, $h(X) = H_t$
The first constructions for hash functions were based on block ciphers (such as DES) [1]. Although some trust has been built up in the security of these proposals, their software performance is not very good, since they are typically from 2 to 4 times slower than the corresponding block cipher.
The most popular hash functions, which are currently used in a wide variety of applications, are the custom designed hash functions from the MD4-family. MD4 was proposed by R. Rivest [4,5]. It is a very fast hash function tuned towards 32-bit processors. Because of unexpected vulnerabilities identified in [6,7] (namely collisions for two rounds out of three), a strengthened version of MD4 was designed, which is called MD5 [8,9].
MD5 is slightly slower than MD4, but it is more conservative in design. It was quickly adopted in products. MD5 is probably the most widely used hash function, in spite of the fact that the compression function of MD5 is not collision resistant [10].
This does not pose a threat for standard applications of MD5, but still implies a violation of one of
II. COMPARATIVE DESCRIPTION OF THE MD4-FAMILY
was quite surprising. Moreover, it introduced a new technique to cryptanalyze this type of function. These techniques can be extended to produce collisions for MD4 [7], and for the compression function of the extended version of MD4 [4]. The attack on MD4 requires only a few seconds on a PC, and still leaves some freedom in the message; it clearly rules out the use of MD4 as a collision resistant function.
It is expected that these techniques can be used to produce collisions for MD5 and perhaps also for RIPEMD. This will probably require an additional effort. An independent reason to produce a new architecture of MD is the limited resistance against a brute-force collision search attack. Taking into account the fact that the cost of computation and memory is divided by four every three years (this observation is known as Moore's law), one can conclude that a 128-bit hash-result does not offer sufficient protection for the next ten years. The current situation brings us to the conclusion that it would be prudent to upgrade current implementations, and to consider a more secure scheme for standardization. SHA-1 already has a 160-bit result, and because of some of its properties it is quite likely that SHA-1 is not vulnerable to the known attacks. However, its design criteria and the attack on the first version are secret.

IV. DESIGN PRINCIPLES AND IMPLEMENTATION ISSUES OF MD
The main design principle of MD is to overcome the problems raised in the above section and to build on the confidence previously gained with RIPEMD and its predecessors MD4 and MD5. Also, it was decided to aim for a rather conservative design, which offers a high security level, rather than to push the limits of performance with the risk of a redesign a few years from now. The two main improvements are the heavy use of data-dependent rotations and the inclusion of integer multiplication as an additional primitive operation [17,18].
The basic design of MD was to have two parallel iterations that are completely different. The operation for MD-160 on the A register is related to that of MD4 (but five words are involved). The amount of rotation is determined by a quadratic function $f(x) = x(2x + 1) \bmod 2^{w}$; here it is computed as a function of the E register. Also, the rotate of the C register has been added to avoid the MD5 attack, which focuses on the most significant bit [10]. The step operation for MD-128 is identical to that of MD-160, but the amount of rotation is determined as a function of the D register.
The number of rounds is four (three for MD-128), and the two parallel lines are made more different. From the attack on RIPEMD we conclude that having only different additive constants in the two lines is not sufficient. The order of the message blocks in the two iterations is completely different. In addition, the order of the Boolean functions is reversed.
The philosophy of MD is to exploit operations such as rotations that are efficiently implemented on modern processors. It also takes advantage of the fact that 32-bit integer multiplication is now efficiently implemented on most processors. Integer multiplication is a very effective diffusion primitive, and is used in MD to compute rotation amounts, so that the rotation amounts are dependent on all of the bits of another register. As a result the proposed MD has much faster diffusion. This allows MD to run with fewer rounds at increased security and with increased throughput.
The permutation of the message words of MD (likewise RIPEMD) was designed such that two words that are 'close' in round 1-2 are far apart in round 2-3 (and vice versa). The permutation p was chosen such that two message words which are close in the left half will always be at least seven positions apart in the right half. For Boolean functions, it was decided to eliminate the majority function because of its symmetry properties and a performance disadvantage. The Boolean functions are now the same as those used in MD5. As mentioned above, the Boolean functions in the left and right half are used in a different order.
Our security goals are that the data-dependent rotation amount derived from the output of the transformation function f(x) should depend on all bits of the input word, and that the transformation should provide good mixing within the word. The particular choice of this transformation for MD is the quadratic function f(x). This transformation appears to meet our security goals while taking advantage of simple primitives that are efficiently implemented on most modern processors. f(x) is one-to-one modulo $2^{w}$, and the high-order bits of f(x), which determine the rotation amount used, depend heavily on all the bits of x. The quadratic function is aimed at providing a faster rate of diffusion, thereby playing a simple yet important role in complicating both linear and differential cryptanalysis. The quadratic transformation of x increases the nonlinearity of the scheme without losing any entropy [23].

V. DESCRIPTION OF THE PROPOSED MD-128/160
MD-128/160 is a hash function based on MD4, taking into account knowledge gained in the analysis of MD4, MD5, RIPEMD, and RIPEMD-128/160. The MD-128/160 compression function differs from MD4 in the number of words of chaining variable, the number of rounds, the round functions themselves, the order in which the input words are accessed, and the amounts by which results are rotated. The left and right computation lines differ from each other in these last two items, in their additive constants, and in the order in which the round functions are applied. This design is intended to improve resistance against known attack strategies.

A. MD-128/160 Primitive Operations
The proposed algorithm uses primitive operations as shown in Table IV.

TABLE IV. NOTATION FOR MD-128/160
Notation | Meaning
A + B | Two's complement addition of words
 | Bit-wise exclusive-OR of words
A <<< B | Cyclic rotation of word A left by B bits
A * B | Integer multiplication modulo $2^{32}$
 | 1's complement of word A
A ∧ B | Bit-wise AND of words
A ∨ B | Bit-wise OR of words
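The data-dependent rotation described in Section IV can be illustrated with a minimal Python sketch. The quadratic function and the cyclic rotation come from the text; the helper names (`rotl32`, `quad`, `rotation_amount`, `is_one_to_one`) are hypothetical, and the choice of the top five bits of f(x) as the rotation count is an assumption based on the statement that the high-order bits of f(x) determine the rotation amount.

```python
MASK32 = 0xFFFFFFFF  # 32-bit words, i.e. w = 32


def rotl32(x: int, n: int) -> int:
    """Cyclic left rotation of a 32-bit word x by n bits (A <<< B)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32


def quad(x: int, w: int = 32) -> int:
    """Quadratic function f(x) = x(2x + 1) mod 2^w from Section IV."""
    return (x * (2 * x + 1)) & ((1 << w) - 1)


def rotation_amount(x: int) -> int:
    """Rotation count in 0..31 derived from f(x).

    ASSUMPTION: the top 5 bits of the 32-bit value f(x) are used.
    The text only says the high-order bits determine the amount.
    """
    return quad(x) >> 27


def is_one_to_one(w: int) -> bool:
    """Exhaustively check that f is a bijection on w-bit words (small w only)."""
    return len({quad(x, w) for x in range(1 << w)}) == (1 << w)


if __name__ == "__main__":
    print(is_one_to_one(8))                    # exhaustive check for 8-bit words
    print(hex(quad(0x67452301)))               # f applied to a sample word
    print(rotation_amount(0x67452301))         # resulting rotation count
    print(hex(rotl32(0x80000001, 1)))          # rotation wraps the top bit around
```

The exhaustive check is only feasible for small word sizes; it is meant to make the one-to-one claim tangible rather than to prove it for w = 32.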
B. MD-160
The overall MD-160 compression function maps 21-word inputs (5-word chaining variable plus 16-word message block, with 32-bit words) to 5-word outputs. Each input block is processed in parallel by distinct versions (the left line and right line) of the compression function. The 160-bit outputs of the separate lines are combined to give a single 160-bit output. Fig. 2 shows an elementary operation of MD-160. An outline of the compression function is given in Fig. 3.
Fig. 2. Elementary MD-160 operation (single step)
Fig. 3. Outline of the compression function of MD-160

C. Algorithm MD-160 Hash Function
INPUT: bitstring x of bitlength b ≥ 0.
OUTPUT: 160-bit hash-code of x.
1. Notation. Primitive operations are shown in Table IV. Round functions are defined in Table VI, and the order in which round functions are applied is given in Table VII.
2. Constants. Define five 32-bit initial chaining values: h1 = 0x67452301, h2 = 0xEFCDAB89, h3 = 0x98BADCFE, h4 = 0x10325476, h5 = 0xC3D2E1F0. In addition:
(a) Define four new additive constants for the left line (square roots of 0,2,3,5): KL[j] = 00000000, (0 ≤ j ≤ 15); KL[j] = 5A827999, (16 ≤ j ≤ 31); KL[j] = 6ED9EBA1, (32 ≤ j ≤ 47); KL[j] = 8F1BBCDC, (48 ≤ j ≤ 63).
(b) Define four new additive constants for the right line (cube roots of 0,2,3,5): KR[j] = 50A28BE6, (0 ≤ j ≤ 15); KR[j] = 5C4DD124, (16 ≤ j ≤ 31); KR[j] = 6D703EF3, (32 ≤ j ≤ 47); KR[j] = 00000000, (48 ≤ j ≤ 63). These constants for both left and right lines appear in Table VIII.
(c) Ordering of the message words. Take the permutation p as shown in Table V. Further define the permutation v by setting v(i) = 9i+5 (mod 16). Table IX gives the access order for source words. Table X specifies the selection of message words in the left and right lines.
(d) The amount for rotate left (rol) is determined by the quadratic function $f(x) = x(2x + 1) \bmod 2^{w}$. This amounts to E*(2*E+1), since f(x) is a function of the E register.
(e) Compression function (step function):
A = (A + f(B,C,D) + X + K) <<< E*(2*E+1), C <<< 10.
3. Preprocessing. Pad x such that its bitlength is a multiple of 512 bits, as follows. Append a single 1-bit, then append r - 1 (≥ 0) 0-bits for the smallest r resulting in a bitlength 64 less than a multiple of 512. Finally append the 64-bit representation of b mod $2^{64}$, as two 32-bit words with least significant word first. (Regarding converting between streams of bytes and 32-bit words, the convention is little-endian.) Let m be the number of 512-bit blocks in the resulting string (b + r + 64 = 512m = 32*16m). The formatted input consists of 16m 32-bit words: x0 x1 ... x16m-1.
4. Processing. For each i from 0 to m - 1, copy the ith block of sixteen 32-bit words into temporary storage: X[j] = x[16i+j], 0 ≤ j ≤ 15. Then:
(a) Execute four 16-step rounds of the left line as follows:
(AL, BL, CL, DL, EL) = (h1, h2, h3, h4, h5).
(Left Round 1) For j from 0 to 15 do the following:
T = (AL + f1(BL, CL, DL) + X[SL[j]] + KL[j]) <<< E*(2*E+1),
(AL, BL, CL, DL, EL) = (EL, T, BL, CL<<<10, DL).
(Left Round 2) For j from 16 to 31 do the following:
T = (AL + f2(BL, CL, DL) + X[SL[j]] + KL[j]) <<< E*(2*E+1),
(AL, BL, CL, DL, EL) = (EL, T, BL, CL<<<10, DL).
(Left Round 3) For j from 32 to 47 do the following:
T = (AL + f3(BL, CL, DL) + X[SL[j]] + KL[j]) <<< E*(2*E+1),
(AL, BL, CL, DL, EL) = (EL, T, BL, CL<<<10, DL).
(Left Round 4) For j from 48 to 63 do the following:
T = (AL + f4(BL, CL, DL) + X[SL[j]] + KL[j]) <<< E*(2*E+1),
(AL, BL, CL, DL, EL) = (EL, T, BL, CL<<<10, DL).
(b) Execute in parallel with the above four rounds an analogous right line with (AR, BR, CR, DR, ER), SR[j], KR[j] replacing the corresponding quantities with subscript L, and the order of the round functions reversed so that their order is: f4, f3, f2, and f1. Start by initializing the right line working variables: (AR, BR, CR, DR, ER) = (h1, h2, h3, h4, h5).
(c) After executing both the left and right lines above, update the chaining values as follows: T = h2 + CL + DR, h2 = h3 + DL + ER, h3 = h4 + EL + AR, h4 = h5 + AL + BR, h5 = h1 + BL + CR, h1 = T.
5. Completion. The final hash-value is the concatenation: (h1 || h2 || h3 || h4 || h5) (with first and last bytes the low- and high-order bytes of h1, h5, respectively).
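The preprocessing step (step 3) and the block-to-word conversion at the start of step 4 can be sketched as follows. This is a minimal sketch under the assumption that the message is a whole number of bytes, so the single 1-bit is appended as the byte 0x80 (the usual convention in the MD4 family); the function names `md_pad` and `words_of_block` are hypothetical.

```python
import struct


def md_pad(message: bytes) -> bytes:
    """Pad a byte string to a multiple of 512 bits (64 bytes).

    ASSUMPTION: byte-oriented input, so the appended 1-bit is the byte 0x80.
    """
    bit_len = (len(message) * 8) & 0xFFFFFFFFFFFFFFFF  # b mod 2^64
    padded = message + b"\x80"
    # Zero-fill until the length is 64 bits (8 bytes) short of a block boundary.
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    # Little-endian 64-bit length: least significant 32-bit word first.
    padded += struct.pack("<Q", bit_len)
    return padded


def words_of_block(padded: bytes, i: int) -> list:
    """Return block i as sixteen 32-bit little-endian words X[0..15]."""
    return list(struct.unpack("<16I", padded[64 * i: 64 * i + 64]))


if __name__ == "__main__":
    p = md_pad(b"abc")
    print(len(p))                                  # one 64-byte block
    print([hex(w) for w in words_of_block(p, 0)])  # message, padding, then length
```

Packing the length with `"<Q"` gives the same byte layout as two little-endian 32-bit words with the least significant word first, which is why a single `struct.pack` call is enough here.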
i    | 0 | 1 | 2  | 3 | 4  | 5 | 6  | 7 | 8  | 9 | 10 | 11 | 12 | 13 | 14 | 15
p(i) | 7 | 4 | 13 | 1 | 10 | 6 | 15 | 3 | 12 | 0 | 9  | 5  | 2  | 14 | 11 | 8
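A single step of the left line and the chaining-value update of step 4(c) can be sketched in Python as below. This is an illustration under stated assumptions, not the authors' implementation: the round functions of Table VI and the word-selection tables are not reproduced here, so the round function is passed in as a parameter and a plain XOR is used as a stand-in; the rotation count is assumed to be the top five bits of E*(2*E+1) mod 2^32; and the names `left_step` and `update_chaining` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x: int, n: int) -> int:
    """Cyclic left rotation of a 32-bit word by n bits."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32


def rot_amount(e: int) -> int:
    """ASSUMPTION: rotation count = top 5 bits of E*(2*E+1) mod 2^32."""
    return ((e * (2 * e + 1)) & MASK32) >> 27


def left_step(state, f, x_word, k):
    """One step: T = (A + f(B,C,D) + X + K) <<< E*(2*E+1), then shift registers."""
    a, b, c, d, e = state
    t = rotl32((a + f(b, c, d) + x_word + k) & MASK32, rot_amount(e))
    return (e, t, b, rotl32(c, 10), d)


def update_chaining(h, left, right):
    """Combine both lines with the chaining values as in step 4(c)."""
    h1, h2, h3, h4, h5 = h
    al, bl, cl, dl, el = left
    ar, br, cr, dr, er = right
    return (
        (h2 + cl + dr) & MASK32,  # new h1 (the temporary T)
        (h3 + dl + er) & MASK32,  # new h2
        (h4 + el + ar) & MASK32,  # new h3
        (h5 + al + br) & MASK32,  # new h4
        (h1 + bl + cr) & MASK32,  # new h5
    )


def xor3(x, y, z):
    """Stand-in round function (illustrative only; see Table VI for the real ones)."""
    return x ^ y ^ z


if __name__ == "__main__":
    iv = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)
    after = left_step(iv, xor3, 0x80636261, 0x00000000)
    print([hex(v) for v in after])
```

Passing the round function as an argument keeps the sketch independent of the tables that define the per-round functions, constants, and message-word order.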
numbers are for realistic inputs, i.e., 256 Megabytes of data are hashed using an 8 K buffer. In addition, to improve the accuracy of our measurements, the throughput tests were executed 10 times, and we report the average of the throughput thereby obtained. Table XIII summarizes the comparative analysis between the proposed message digest algorithm (MD-128/160) and 6 hash algorithms, considered for different design parameters such as digest length, number of steps, and maximum message size. From Table XIII, we can note that MD-128/160 has the following new features:
1- The number of primitive operations used is seven (7), while other MD4-family hash functions use fewer primitive operations (6).
2- Use of 32-bit integer multiplication, which is now efficiently implemented on most processors. Integer multiplication is a very effective diffusion primitive. As a result the proposed MD-128/160 has much faster diffusion. This allows the proposed MD-128/160 to run with fewer rounds at increased security and with increased throughput.
3- The heavy use of data-dependent rotations.
4- The quadratic function is aimed at providing a faster rate of diffusion, thereby complicating both linear and differential cryptanalysis.
The proposed MD-128/160 makes use of these new features, which contribute to a high security level with increased throughput.
Also, Table XIV summarizes the performance comparison of the proposed MD-128/160 and several MD4-based hash functions at a fixed 256 Megabytes of data. The relative throughputs coincide more or less with predictions based on a simple count of the number of operations. MD-160 is 0.54% faster than RIPEMD-160 and about 15% slower than SHA-1. Also, MD-128 is 9.46% faster than RIPEMD-128, about 14% slower than RIPEMD, and nearly two times slower than MD4.
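The 9.46% figure for MD-128 can be reproduced from the two throughputs quoted in the abstract (76.4 Mbit/sec for MD-128 and 69.8 Mbit/sec for RIPEMD-128). The short Python sketch below shows the arithmetic; the function names are hypothetical, and the hashing-time estimate assumes that "256 Megabytes" means 256 * 2^20 bytes and that 1 Mbit is 10^6 bits, neither of which is stated in the text.

```python
def relative_gain_percent(new: float, baseline: float) -> float:
    """Throughput increase of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


def hashing_time_seconds(megabytes: float, mbit_per_sec: float) -> float:
    """Time to hash the data at a given throughput.

    ASSUMPTION: 1 Megabyte = 2**20 bytes and 1 Mbit = 10**6 bits.
    """
    bits = megabytes * 2**20 * 8
    return bits / (mbit_per_sec * 1e6)


if __name__ == "__main__":
    gain = relative_gain_percent(76.4, 69.8)
    print(f"MD-128 vs RIPEMD-128: {gain:.2f}% faster")
    print(f"256 MB at 76.4 Mbit/sec: {hashing_time_seconds(256, 76.4):.1f} s")
```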
TABLE XIII. COMPARISON BETWEEN DIFFERENT HASHING ALGORITHMS
REFERENCES
[1] R. Merkle, "One Way Hash Functions and DES," Advances in Cryptology, Proc. Crypto'89, LNCS 435, G. Brassard, Ed., Springer-Verlag, 1990, pp. 428-446.
[2] B. Preneel, R. Govaerts, J. Vandewalle, "Hash Functions Based on Block Ciphers: A Synthetic Approach," Advances in Cryptology, Proc. Crypto'93, LNCS 773, D. Stinson, Ed., Springer-Verlag, 1994, pp. 368-378.
[3] C.H. Meyer, M. Schilling, "Secure Program Load with Manipulation Detection Code," Proc. Securicom 1988, pp. 111-130.
[4] R.L. Rivest, "The MD4 Message Digest Algorithm," Advances in Cryptology, Proc. Crypto'90, LNCS 537, S. Vanstone, Ed., Springer-Verlag, 1991, pp. 303-311.
[5] R.L. Rivest, "The MD4 Message-Digest Algorithm," Request for Comments (RFC) 1320, Internet Activities Board, Internet Privacy Task Force, April 1992.
[6] B. den Boer, A. Bosselaers, "An Attack on the Last Two Rounds of MD4," Advances in Cryptology, Proc. Crypto'91, LNCS 576, J. Feigenbaum, Ed., Springer-Verlag, 1992, pp. 194-203.
[7] H. Dobbertin, "Cryptanalysis of MD4," Fast Software Encryption, LNCS 1039, D. Gollmann, Ed., Springer-Verlag, 1996.
[8] R.L. Rivest, "The MD5 Message-Digest Algorithm," Request for Comments (RFC) 1321, Internet Activities Board, Internet Privacy Task Force, April 1992.
[9] J. Touch, "Report on MD5 Performance," Request for Comments (RFC) 1810, Internet Activities Board, Internet Privacy Task Force, June 1995.
[10] B. den Boer, A. Bosselaers, "Collisions for the Compression Function of MD5," Advances in Cryptology, Proc. Eurocrypt'93, LNCS 765, T. Helleseth, Ed., Springer-Verlag, 1994, pp. 293-304.
[11] FIPS 180-1, Secure Hash Standard, NIST, US Department of Commerce, Washington D.C., April 1995.
[12] RIPE, "Integrity Primitives for Secure Information Systems. Final Report of RACE Integrity Primitives Evaluation (RIPE-RACE 1040)," LNCS 1007, Springer-Verlag, 1995.
[13] S. Vaudenay, "On the Need for Multipermutations: Cryptanalysis of MD4 and SAFER," Fast Software Encryption, LNCS 1008, B. Preneel, Ed., Springer-Verlag, 1995, pp. 286-297.
[14] R. Anderson, "The Classification of Hash Functions," Proc. of the IMA Conference on Cryptography and Coding, Cirencester, December 1995, Oxford University Press, 1995, pp. 83-95.
[15] I.B. Damgård, "A Design Principle for Hash Functions," Advances in Cryptology, Proc. Crypto'89, LNCS 435, G. Brassard, Ed., Springer-Verlag, 1990, pp. 416-427.
[16] H. Dobbertin, A. Bosselaers, B. Preneel, "RIPEMD-160: A Strengthened Version of RIPEMD," Fast Software Encryption, LNCS 1039, D. Gollmann, Ed., Springer-Verlag, 1996, pp. 71-82.
[17] R.L. Rivest, "RC5 Encryption Algorithm," Dr. Dobb's Journal, number 226, pp. 146-148, January 1995.
[18] B.S. Kaliski and Y.L. Yin, "On the Security of RC5 Encryption Algorithm," RSA Laboratories Technical Report TR-602, Version 1.0, September 1998.
[19] R.L. Rivest, M.J.B. Robshaw, R. Sidney, and Y.L. Yin, "The RC6™ Block Cipher," M.I.T. Laboratory for Computer Science, 545 Technology Square, Cambridge, MA 02139, USA; RSA Laboratories, 2955 Campus Drive, Suite 400, San Mateo, CA 98.
[20] B. Schneier, "Applied Cryptography," Second Edition, John Wiley and Sons, New York, 1996.
[21] W. Stallings, "Network and Internetwork Security: Principles and Practice," Prentice-Hall, New Jersey, 1995.
[22] W. Stallings, "Cryptography and Network Security: Principles and Practice," Prentice-Hall, New Jersey, 1999.
[23] Abdul Hamid M. Ragab, Nabil A. Ismail, and Osama S. Farag Allah, "Enhancements and Implementation of RC6™ Block Cipher for Data Security," to appear in IEEE TENCON 2001, Singapore, August 19-22, 2001.