PkCrack - README
Version info
This is the README file for pkcrack 1.2.1.
Version 1.2.1 is a bugfix release with few new features.
Disclaimer
This program may or may not do what you think it does. It may or may not
do what its documentation (including this file) tells you it does. Use 
at your own risk! The author may not be held liable for any damage
caused by running this program. Or any other damage, for that matter.
This program was written for people who have encrypted their own files and
forgotten their passphrases, or for people who have been the victim of
some 'practical joke'. In any case it is for people who have a legitimate
right (in whatever sense) to gain access to the encrypted data.
It is not meant as a tool for wannabe hackers who try to steal other
people's intellectual property!
Copyright
This package was written and is copyright by
Peter Conrad, 
<conrad@unix-ag.uni-kl.de>.
Commercial use in any form is strictly forbidden!
You may use parts of the code in your own programs for non-commercial use
in case you clearly state where you got it.
Do not release software using parts of the code without the author's
explicit consent.
What is this?
This package implements an algorithm for breaking the PkZip cipher that was
devised by Eli Biham and Paul Kocher. A paper describing the attack is
included in this package.
Since an astonishingly large number of people request the package every day,
I have decided to release this program as CardWare. I don't
remember who coined that term, but its meaning is simple:
  If you like the program, send me a postcard. Picture postcards from the area
  where you live are preferred. On the card you may write anything you like, 
  e. g. how much you like the program, what a great person I am, or whatever 
  comes to your mind. Be creative! :-)
My snail mail address is:
    Peter Conrad
    Am Heckenberg 1
    56727 Mayen
    Germany
Requirements
- ANSI compatible C-compiler (gcc is fine)
- about 33MB of (virtual) memory
   Note that most of the memory is used only during the first cycles
   of the key-reduction stage. It runs fine with 16MB RAM (and sufficient
   swap space).
- patience :-)
- PGP (if you want to check the signatures)
Building pkcrack
Unpack the package by entering 
  zcat pkcrack.tar.gz | tar xvf -
This will produce a directory "pkcrack". Cd into that directory:
  cd pkcrack
Since you're reading this file you probably have done that already.
The sources are kept in the file "src.tar". Unpack them by entering
  tar xvf src.tar
This will create a directory "src". Cd into that directory as well.
The program was written and tested under Linux, SolarisX86 2.4 and MSDOS
(the DOS version was built using DJGPP-2.00). In most cases typing 'make'
should be sufficient.
If you want to port this to other platforms you should check the definitions
of some data types:
byte	should have a range of [0..255]		(unsigned char)
ushort	should have a range of [0..65535]	(unsigned short)
uword	should have a range of [0..(2^32-1)]	(unsigned int)
I can't think of other important changes right now. Please inform me of
successful ports to other platforms; I may include them in the Makefile.
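When porting, the three ranges listed above can be checked before building anything else. The following is only a minimal sketch and not part of the package: the typedefs mirror the list above, and the function name check_pkcrack_types is made up for this example.

```c
/* Typedefs as listed above; adjust the right-hand sides for your platform. */
typedef unsigned char  byte;
typedef unsigned short ushort;
typedef unsigned int   uword;

/* Hypothetical helper, not part of pkcrack: returns 1 if all three types
   have exactly the ranges asked for above, 0 otherwise. Converting -1 to an
   unsigned type yields that type's maximum value. */
int check_pkcrack_types(void)
{
    if ((byte)-1 != 255u) return 0;           /* [0..255]      */
    if ((ushort)-1 != 65535u) return 0;       /* [0..65535]    */
    if ((uword)-1 != 4294967295u) return 0;   /* [0..2^32 - 1] */
    return 1;
}
```

If the function returns 0 on your platform, pick a different base type for the typedef that failed.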
Using pkcrack
With version 1.2 pkcrack has become one of those "fire & forget" programs.
If you are a hacker of the experimental kind you may want to look at the "more
complete instructions" below. Otherwise, stick to the "simple instructions".
The first thing you have to know is that this program applies a known
plaintext attack to an encrypted file. A known plaintext attack recovers
a password using the encrypted file and (part of) the unencrypted file.
Before you ask why somebody may want to know the password when he already
knows the plaintext, think of the following situations:
-  There's usually a large number of files in a ZIP-archive. All these files
     are encrypted using the same password. So if you know one of the files,
     you can recover the password and decrypt the other files.
-  You need to know only a part of the plaintext (at least 13 bytes). Many
     files have commonly known headers, like DOS .EXE-files. Knowing a
     reasonably long header you can recover the password and decrypt the
     entire file.
Back to the program.
Simple Instructions
You need two files: 
-  the ZIP-archive which you want decrypted, and
-  another ZIP-archive, containing at least one of the files from the
     encrypted archive in unencrypted form. This one has to be compressed
     with the same compression method used for the encrypted file.
Now, enter
  pkcrack -C encrypted-ZIP -c ciphertextname -P plaintext-ZIP -p plaintextname -d decrypted_file
("Real computer scientists never comment their code - the
 identifiers are so long they can't afford the disc space." :-/ )
- encrypted-ZIP
    is the name (and path) of the encrypted ZIP-archive (the first of the
    two files above)
- ciphertextname
    is the name of the file in the archive, for which you have
    the plaintext
- plaintext-ZIP
    is the name (and path) of the ZIP-archive containing the compressed
    plaintext (the second of the two files above)
- plaintextname
    is the name of the file in the archive containing the
    known plaintext
- decrypted_file
    is the name of a file to which the decrypted archive will be written
All you have to do now is wait a little. Depending on the size of the
plaintext and the speed of your computer, the program should terminate
after about an hour. If the plaintext is very short (less than 100 bytes
or so), it will probably take a lot longer.
After pkcrack is finished, you will find the decrypted archive in the file
decrypted_file. You can unzip it using any unzip-program, e. g.
pkunzip under DOS, or unzip under UNIX.
If decrypted_file doesn't exist, or if unzipping it produces
CRC-errors, there are several things that may have gone wrong:
- Maybe you used the wrong plaintext. Find better plaintext.
- Maybe you used the correct plaintext but used the wrong compression method.
   Compress the plaintext with a different method.
- Maybe pkcrack found more than one set of matching 'keys' and used the wrong
   one to decrypt the file. Examine the output of pkcrack, and use the
   sets of key[012]-values with the "zipdecrypt" program to decrypt the file.
- Maybe something else went wrong. Now you're in trouble. Read the "more
   complete instructions" and try to understand why the program failed. Read
   the FAQ below.
More Complete Instructions
This section will explain some of the more esoteric options of pkcrack, as
well as some of the other programs in this package.
Just like in the "simple instructions" you need two files. Not necessarily
ZIP-archives, though. You can specify any file containing nothing but plain
data, i. e. no ZIP-headers, and no other fancy stuff. In that case, simply
don't specify the -C or -P options, but only the -c and -p options.
So there are two possible ways to tell pkcrack where to find the ciphertext:
- -C encrypted-ZIP -c ciphertextname (see above), or
- -c ciphertextname
As I said, in the latter case ciphertextname is the name of a file 
containing nothing but encrypted data.
Analogously there are two possible ways to specify the plaintext:
- -P plaintext-ZIP -p plaintextname (see above), or
- -p plaintextname
In the latter case, plaintextname is the name of a file containing 
nothing but (compressed) plaintext.
Note that PkZip prepends 12 random bytes to the compressed data before
encryption, so the ciphertext file has to be 12 bytes longer than the plaintext.
If you know only part of the plaintext, the plaintext can be even shorter.
Usually, though, a difference of more or less than 12 bytes in the file sizes
indicates wrong plaintext, or wrong compression method of the plaintext.
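The size relation described above can be checked before starting a long run. This is a minimal sketch, assuming both files are already available as raw data (no ZIP headers); classify_sizes and its return values are invented for this example and are not part of pkcrack.

```c
#define ENC_HEADER_LEN 12L  /* bytes PkZip prepends before encryption */

/* Hypothetical helper: compares the sizes of raw ciphertext and plaintext.
   Returns  0 if the ciphertext is exactly 12 bytes longer (full plaintext),
            1 if the plaintext is shorter than that (partial plaintext, or
              possibly the wrong compression method),
           -1 if the plaintext is too long to belong to this ciphertext. */
int classify_sizes(long cipher_len, long plain_len)
{
    long diff = cipher_len - plain_len;

    if (diff == ENC_HEADER_LEN) return 0;
    if (diff > ENC_HEADER_LEN) return 1;
    return -1;
}
```

A result of 1 is not an error in itself, but unless you really know only part of the file it is a good reason to double-check the plaintext and its compression method.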
If you want to extract data from a ZIP-archive, you can use the "extract"
program contained in this package. Invoke it by entering
  extract ZIP-name name-in-ZIP
This will extract the (possibly encrypted and/or compressed) data stored in
the archive ZIP-name under the name name-in-ZIP, and write
those data to the file name-in-ZIP in the current directory.
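The extract program itself is not reproduced here. As a rough illustration of what locating the raw data of an archive member involves, the sketch below parses one local file header from a memory buffer, following the field layout given in the ZIP format description. The struct and function names are invented for this example and do not come from exfunc.c or readhead.c.

```c
#include <stddef.h>

/* Hypothetical summary of one local file header. */
typedef struct {
    int encrypted;              /* bit 0 of the general purpose flags     */
    unsigned method;            /* 0 = stored, 8 = deflated, ...          */
    unsigned long crc32;        /* CRC-32 of the uncompressed data        */
    unsigned long comp_size;    /* compressed size as stored in the header */
    unsigned long data_offset;  /* where the file data starts within buf  */
} local_entry;

/* ZIP headers store all numbers in little-endian byte order. */
static unsigned get16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

static unsigned long get32(const unsigned char *p)
{
    return (unsigned long)get16(p) | ((unsigned long)get16(p + 2) << 16);
}

/* Parses the local file header at the start of buf.
   Returns 0 on success, -1 if buf does not hold a complete header.
   Note: archives written with a data descriptor (flag bit 3) may carry
   zero in the CRC and size fields of the local header. */
int parse_local_header(const unsigned char *buf, size_t len, local_entry *e)
{
    unsigned flags, name_len, extra_len;

    if (len < 30 || get32(buf) != 0x04034b50UL)  /* "PK\3\4" signature */
        return -1;

    flags     = get16(buf + 6);
    name_len  = get16(buf + 26);
    extra_len = get16(buf + 28);

    e->encrypted   = (int)(flags & 1u);
    e->method      = get16(buf + 8);
    e->crc32       = get32(buf + 14);
    e->comp_size   = get32(buf + 18);
    e->data_offset = 30UL + name_len + extra_len;

    return (e->data_offset > len) ? -1 : 0;
}
```

Comparing the method field of the encrypted entry with that of the plaintext entry is a quick way to spot a mismatch in compression methods.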
Another option used in the "simple instructions" above is the
-d decrypted_file option. It tells pkcrack to decrypt the archive
specified with the -C option and write the decrypted results to the file
decrypted_file. Naturally, it can only be used in conjunction with the
-C option.
If you do not specify the -d option, pkcrack will try to find a PkZip-password
when it has found a set of keys. If it finds a password, you can use it to
decrypt the archive with the pkunzip-program. If it doesn't, you can use the
set of keys found by pkcrack with the zipdecrypt program contained in this
package to decrypt the archive. Zipdecrypt must be called as follows:
  zipdecrypt key0 key1 key2 encrypted_archive decrypted_archive
where key0, key1, key2 is a set of keys found by pkcrack,
encrypted_archive is the name of the archive to be decrypted, and
decrypted_archive is the name of the file to which the archive will be
written by zipdecrypt.
One option to pkcrack has not been mentioned yet: with -o offset
you can specify an offset of the plaintext data into the ciphertext.
This is for the special case that the known plaintext starts somewhere in the
middle of the encrypted data. The default value for offset is 0, i. e.
the 12 encrypted random bytes are not to be included in the offset.
There is another possible application of the offset: the so-called
"random" bytes aren't that random. Older versions of PkZip
used the CRC-checksum of the file as the last 4 "random" bytes, newer versions
use "only" one byte of CRC. In that case you have to prepend the known
CRC-bytes before the known plaintext file, and specify a
negative offset (e. g. -1 if you know the last one of the
"random" bytes). This feature has not been tested very thoroughly. Be warned.
The one thing that remains to be explained is the findkey program. You can
use it to find a PkZip-password for a set of key[012]-values found by
pkcrack. This is exactly what pkcrack does if you do not
specify the -d option. Periodically, findkey prints information
about its progress which can be used to restart the program at a later time.
The information printed is of the form
  10: xx, or
  11: xxxx, or
  12: xxxxxx and so on.
To restart findkey, enter
  findkey key0 key1 key2 pwdlen initvalue
where key0, key1, key2 is a set of keys found by
pkcrack, pwdlen is 10 or 11 or 12 (depending on the point where you
want to resume), and initvalue is the "xx" printed by findkey.
pwdlen and initvalue are optional parameters.
Note that findkey will take very long to find long passwords.
There are 255 times as many possible passwords with a pwdlen of 11 as
with a pwdlen of 10. It is probably wiser to use the zipdecrypt
program instead.
Some details
Here is a short description of the source-files:
- crc.c
    This file contains functions for calculating CRC-32 checksums.
    The CRC-polynomial used is defined in crc.h.
    A lookup-table is used, which has to be initialized first.
- crc.h
    Header file for crc.c. This file contains macros for computing
    CRC-checksums using a lookup-table which has to be initialized using a
    function in crc.c.
- exfunc.c
    This file contains a function for reading data from a ZIP-archive into
    memory.
- extract.c
    This file contains the main() function of the "extract" program, which
    may be used to extract data from a ZIP-archive and write it to a file.
- findkey.c
    This program tries to find a PkZip-password for a given initial state
    of key0, key1 and key2. In the current version it prints information
    about the progress of the search to stdout every couple of minutes. You
    can use that information for resuming the search at a later time.
- headers.h
    This header file contains declarations of several data types used in
    ZIP-archives.
- keystuff.c
    This file contains functions for initializing and updating the
    internal state of the PkZip cipher.
- keystuff.h
    This is a header file for keystuff.c
- main.c
    This file contains the main() function of the PkZip-cracker.
    It reads the ciphertext and plaintext files and makes calls to the
    actual cracking stages.
- mktmptbl.c
    This file contains a function for initializing a lookup-table that
    is used for finding "temp" values for a given "key3" (refer to the paper
    if you want to know what "temp" and "key3" are).
- mktmptbl.h
    This is a header file for mktmptbl.c
- pkcrack.h
    This header file contains some constants used in the program and some
    global variables from main.c
- readhead.c
    This file contains several functions for reading and parsing headers in
    a ZIP-archive. Refer to the file "pkzip.txt" for details.
- stage1.c
    This file implements stage 1 of the cracking process, namely finding
    initial values for key2_n and reducing the number of possible values.
    See sections 3.1 and 3.2 of the paper.
- stage1.h
    This is a header file for stage1.c
- stage2.c
    This file implements stage 2 of the cracking process, namely creating a
    list of key2-values, calculating the corresponding key1 and key0
    values, decrypting the 12 prepended bytes and finally calling stage 3.
    See sections 3.3, 3.4 and 3.5 of the paper.
- stage2.h
    This is a header file for stage2.c
- stage3.c
    This file implements stage 3 of the cracking process, namely finding
    a PkZip-password for a given internal state of key0, key1 and key2.
    It re-uses some code from stage2.c
    See section 3.6 of the paper.
- stage3.h
    This is a header file for stage3.c
- writehead.c
    This file is the counterpart of readhead.c - it contains functions for
    writing headers in a ZIP-archive.
- zdmain.c
    This file contains the main function of the zipdecrypt program.
- zipdecrypt.c
    This file contains a function for decrypting a ZIP-archive with a given
    set of key[012] values. It produces a ZIP-archive which can be unzipped
    using pkunzip under DOS or unzip under UNIX. I wrote this because stage 3
    (password generation) takes a couple of eons for finding long passwords.
For further information on the attack refer to the paper describing the
algorithm (pkzip.ps.gz). Information on the format of pkzip-archives is
contained in the file pkzip.txt (this was taken from a pkzip distribution).
That's it. Some of the less obvious sections of the code are commented.
Most aren't.
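For readers who wonder what the key[012] values mentioned throughout this file actually are: they are the internal state of the PkZip cipher. The sketch below shows that cipher as it is commonly documented for the ZIP format. It was written from that description and is not taken from keystuff.c or crc.c; all function and type names are invented for this example, and uword is assumed to be exactly 32 bits wide.

```c
#include <stddef.h>

typedef unsigned char byte;
typedef unsigned int  uword;   /* assumed to be exactly 32 bits wide */

static uword crctab[256];

/* Build the CRC-32 lookup table (reflected polynomial 0xEDB88320).
   Must be called once before any of the functions below. */
void init_crctab(void)
{
    uword i, c;
    int k;

    for (i = 0; i < 256; i++) {
        c = i;
        for (k = 0; k < 8; k++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
        crctab[i] = c;
    }
}

/* One CRC-32 step: feed byte b into the running value crc. */
#define CRC32(crc, b) (((crc) >> 8) ^ crctab[((crc) ^ (b)) & 0xff])

typedef struct { uword key0, key1, key2; } zipkeys;

/* Update the internal state with one plaintext byte. */
void keys_update(zipkeys *k, byte plain)
{
    k->key0 = CRC32(k->key0, plain);
    k->key1 = (k->key1 + (k->key0 & 0xff)) * 134775813u + 1;
    k->key2 = CRC32(k->key2, k->key1 >> 24);
}

/* Derive the state from a password: start from fixed constants and
   feed in the password one byte at a time. */
void keys_from_password(zipkeys *k, const char *pw)
{
    k->key0 = 0x12345678u;
    k->key1 = 0x23456789u;
    k->key2 = 0x34567890u;
    while (*pw)
        keys_update(k, (byte)*pw++);
}

/* The next key stream byte ("key3"), computed from key2 via "temp". */
byte keys_stream_byte(const zipkeys *k)
{
    uword temp = (k->key2 & 0xffff) | 2;
    return (byte)((temp * (temp ^ 1)) >> 8);
}

void zip_encrypt(zipkeys *k, byte *buf, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        byte p = buf[i];
        buf[i] = (byte)(p ^ keys_stream_byte(k));
        keys_update(k, p);           /* state always follows the plaintext */
    }
}

void zip_decrypt(zipkeys *k, byte *buf, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        byte p = (byte)(buf[i] ^ keys_stream_byte(k));
        buf[i] = p;
        keys_update(k, p);
    }
}
```

Note that decryption needs only the three key values, not the password that produced them. That is why a set of keys is enough for zipdecrypt, and why searching for the password afterwards is optional.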
Hints
- From a person wishing to remain anonymous:
> I had asked Dimitri for the source to the Win32 version
> of PkCrack because I wanted to change it to allow it
> to scan an encrypted PKSFX-style self-extracting
> Zip file.
> 
> However, after a while, I figured out
> that you can use PkZipFix to convert a Zip .EXE to
> a regular .Zip, which will work with PkCrack.
> PkCrack didn't recognize the .EXE file.
 
Frequently Asked Questions
- Q: "When I run the program it says something about increasing constants and
    rebuilding. What is that supposed to mean?"
- A: The program uses an internal array to store intermediate values. During
the first cycles of the key-reduction-stage the number of intermediate values
can exceed the size of the array. Since the program cannot increase the size
of the array it prints an error message and stops. There are two ways to 
handle that problem:
-  If you have lots of plaintext you can simply strip off some of the
     plaintext bytes from the end of the plaintext file. This may help.
-  Increase the value of the constant KEY2SPACE in the file pkcrack.h by
     at least (1<<21) (that's 2M entries with 4 bytes each = 8MB), and
     recompile the program. Obviously, you need the sources and a decent
     C-compiler to do this.
 
- Q: "Shall I use compressed or uncompressed plaintext?"
- A: You have to use plaintext compressed with exactly the
same method that was used to compress the ciphertext. So if the
ciphertext is uncompressed, use uncompressed plaintext. If the ciphertext is
shrunk (or imploded, or mega-hyper-special-compressed), shrink (or implode, or
mega-hyper-special-compress) your plaintext. A good indicator of the correct
compression method is the size of the compressed plaintext: it has to be 12
bytes shorter than the ciphertext (that's for pkzip -v output, unzip -v under
UNIX should report the same compressed sizes for plain- and ciphertext).
I have heard that there are subtle differences in the compression methods
in different versions of pkzip, so be sure to use the correct version for
compressing the plaintext.
- Q: "You said there were 12 random bytes prepended to the ciphertext. Does 
    the ciphertext input to pkcrack have to include these 12 bytes?"
- A: Yes.
- Q: "But where shall I get the plaintext for the random bytes?"
- A: You don't have to. Use only the plaintext, without anything prepended to
it.
- Q: "Hi, can you decrypt a file for me? I've attached it below." 
    (followed by a ton of MIME-encoded junk)
- A: Please don't send me your ZIP archives unless I ask you for them.
        Thanks.
Bye,
        Peter