PACKING ALGORITHMS
by Axe of Superior
Axe of Superior (previously known as Axe of Delight) is a German
hacker who has gained quite a lot of respect and fame by creating
what is assumed to be the best packer on the ST - "Pack Ice". In
this issue, we are proud to offer you an EXCLUSIVE
article he wrote about packing techniques - as well as the latest
version of the packer, which you will find in the PROGRAMS
folder. Be warned, dear reader, that some knowledge of
programming is required to understand this article!
Hello everybody. Well, here is an article that should interest
you if you want to save some disk space. In fact, it should
interest everybody who does some programming or other serious
work with a computer.
I am sure you have all been confronted with packers, which can
often be recognised by random colors flashing in the background
of the screen. Depending on the size of the program and the speed
of the packer, the flashing lasts a longer or shorter time.
I was not the first programmer who coded a packer on the ST, but
I have always been interested in compressing data. When I had my
Apple Computer, I was always angry that keeping screen shots
took so much memory, even if there was not much on the screens.
Every picture took 8192 bytes (I think) no matter if the screen
was empty or filled with nice graphics. For that reason I
programmed my first packer ever - a picture compressor. I first
wrote the program in Basic and then the unpacking routine in
Assembler. Unpacking took one second, which is very fast for
an 8-bit machine.
On the ST, there was a need to pack more than just pictures. All
data blocks and programs were bigger and needed to be packed. In
my search for a good packer, I tried several packing algorithms:
Run Length Encoding
It is easy to program a packer that packs repeating bytes and
sets a command byte to indicate that the following bytes are
packed, but this method is not very effective. I will explain it
anyway, because it is very easy. Suppose you have the following
bytes to pack:
--> 06, f0, 46, 46, 46, 46, 46, 46, 46, 46, 46, ef, 34.
You can define one byte as command byte (say ff) after which you
indicate the byte and amount to be repeated. This would give you
the following:
--> 06, f0, ff, 46, 09, ef, 34.
You have packed 13 bytes down to 7 bytes, which is about 50%.
This is a good packing rate, but most of the time you will not
have 9 repeating bytes in a row in Assembler or C code. Well, it
is good for packing blank data areas or screens with big areas of
one color.
By the way, it will be more effective if you select as command
byte the value that appears least often in the data you want to
pack. So first count all bytes, then store the chosen command
byte as the first byte of your packed data. This packing
algorithm is called Run Length Encoding.
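To make this a bit more concrete, here is a minimal sketch of such
a run length unpacker in C. It is only an illustration of the
principle described above - not the code of any packer mentioned
in this article - and it simply assumes that $ff was chosen as the
command byte:

#include <stddef.h>

#define CMD 0xff  /* command byte used in this example */

/* Run length unpacker: CMD is followed by <value> and <count>.
   Every other byte is copied through unchanged. A well-formed
   stream never ends in the middle of a CMD sequence.
   Returns the number of unpacked bytes written to dst. */
size_t rle_unpack(const unsigned char *src, size_t srclen,
                  unsigned char *dst)
{
    size_t i = 0, o = 0;

    while (i < srclen) {
        if (src[i] == CMD && i + 2 < srclen) {
            unsigned char value = src[i + 1];
            unsigned char count = src[i + 2];
            while (count--)
                dst[o++] = value;
            i += 3;
        } else {
            dst[o++] = src[i++];
        }
    }
    return o;
}

A real packer would also have to handle a literal command byte
appearing in the original data (for example by encoding it as a
run of length 1); choosing the rarest byte as command byte keeps
that overhead small.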
Bytekiller
Don't get too enthusiastic about a packing rate of 50% in the
example above. The bytes were specially selected. Just imagine
you have a data block with a repeating pattern. For example:
Assembler code: moveq #0,d0
nop
nop
nop
nop
nop
nop
bra.s test
If you assemble this, you will get the following bytes:
--> 70 00 4e 71 4e 71 4e 71 4e 71 4e 71 4e 71 60 08
Using the routine above, you will not be able to reduce the
length of the program by a single byte. You need a new routine
that looks for repeating byte strings and stores information for
the unpacker, like the following:
--> 70 00 4e 71 <Information to repeat the string> 60 08
This would look exactly like the following (after each step, the
unpacked data produced so far is shown):

Write the bytes 70 00 4e 71.
  Unpacked so far: 70 00 4e 71
Copy a string of length 2 from position 2 to the current position.
  Unpacked so far: 70 00 4e 71 4e 71
Copy a string of length 4 from position 2 to the current position.
  Unpacked so far: 70 00 4e 71 4e 71 4e 71 4e 71
Copy a string of length 4 from position 2 to the current position.
  Unpacked so far: 70 00 4e 71 4e 71 4e 71 4e 71 4e 71 4e 71
Write the bytes 60 08.
  Unpacked so far: 70 00 4e 71 4e 71 4e 71 4e 71 4e 71 4e 71 60 08
That means: you kept 6 bytes of data in your packed block and you
need 3 pieces of copy information to reproduce the rest. If you
consider that this copy information does not have to be stored as
whole bytes, but can also be stored as bits, you can save some
memory. Are you wondering how to mix bits with bytes?
No problem. There are 2 ways. The first one is used in the
"Bytekiller" (also known as the "Jek-packer" - it was not written
from scratch but converted from the Amiga to the ST by Sharaz
Jek. By the way, "Bytekiller" is still one of the most often used
packers). You rip every byte apart into 8 bits and combine them
with the bits of the packing information. All these bits together
are then written to the packed memory as bytes.
The other - 40% faster - method is used in "Pack-Ice" and most
of the good packers. The idea is to keep the unpacked bytes whole
and to mix them with information bytes that consist of packed
bits. The unpacking routine always knows whether an information
byte or an unpacked byte comes next.
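If you want to experiment with this yourself, the one tool you
need is a bit reader. Here is a minimal sketch in C - my own
illustration, not code taken from "Bytekiller" or "Pack-Ice" -
that reads bits from a packed buffer, most significant bit first:

#include <stddef.h>

/* Minimal bit reader over a packed buffer. Initialise with
   { data, 0, 0 } and call read_bit()/read_bits(). */
struct bitreader {
    const unsigned char *data;
    size_t pos;   /* current byte position                     */
    int bit;      /* bits already used in the current byte 0..7 */
};

/* Read a single bit, most significant bit of each byte first. */
static int read_bit(struct bitreader *br)
{
    int b = (br->data[br->pos] >> (7 - br->bit)) & 1;
    if (++br->bit == 8) {
        br->bit = 0;
        br->pos++;
    }
    return b;
}

/* Read n bits (n <= 16) and return them as one integer. */
static unsigned read_bits(struct bitreader *br, int n)
{
    unsigned v = 0;
    while (n--)
        v = (v << 1) | (unsigned)read_bit(br);
    return v;
}

With the first method you would read everything, literal bytes
included, through read_bits(br, 8); with the second method the
literal bytes stay on byte boundaries and only the information
codes are read bit by bit, which is presumably where the speed
difference comes from.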
Now I will tell you in detail what these information bits look
like and what they do, using "Pack-Ice" as an example.
When data can not be packed - that means there are no repeating
byte patterns - it must be copied to the packed data directly, so
that it can be copied back when unpacking. This is the only way
to keep all data. So you put those un-packable bytes in the
packed data, followed by information about the length, stored in
bits as shown below.
Let's say you have the following bytes:
--> d3 a7 7b 3a 44 69 79 b0
Those bytes can not be packed, so they can also be found in the
packed data, followed by the length information (8) in bits (see
below).
The number of bytes is indicated by the following bits.
Number of bytes: 0 Information Bits: %0
1 %10
2 %1100
3 %1101
4 %1110
5 %111100
6 %111101
7 %111110
8 %111111000
9 %111111001
10 %111111010
..... .....
As you can see, the smallest length (0) needs just one bit. The
reason is that this length appears most often - in fact it
appears every time packed data follows rather than unpacked bytes
that just have to be copied. So there is a good reason why the
bit codes look so strange and do not all have the same length.
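Just as an illustration, this is how an unpacker could decode the
codes listed above, using the read_bit() and read_bits() routines
from the bit reader sketch earlier on. The table only goes up to
10 here, and so does this sketch; the real tables continue beyond
that (the "....." in the table):

/* Decode the "number of unpacked bytes" code from the table above.
   Only the entries shown in the table (0..10) are handled here. */
static int decode_literal_count(struct bitreader *br)
{
    if (read_bit(br) == 0)            /* %0             -> 0      */
        return 0;
    if (read_bit(br) == 0)            /* %10            -> 1      */
        return 1;
    {
        unsigned v = read_bits(br, 2);
        if (v != 3)                   /* %1100..%1110   -> 2..4   */
            return 2 + (int)v;
    }
    {
        unsigned v = read_bits(br, 2);
        if (v != 3)                   /* %111100..%111110 -> 5..7 */
            return 5 + (int)v;
    }
    {
        unsigned v = read_bits(br, 3);
        if (v <= 2)                   /* %111111000..%111111010 -> 8..10 */
            return 8 + (int)v;
    }
    return -1;                        /* beyond the table shown here */
}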
If data can be packed, you first need the %0 from the table above
(meaning zero unpacked bytes follow) to switch to packed data,
and after that follows the information for the packed data, which
is the following:
Copy data with <length> from <offset> to the current position.
For length and offset you use similar tables to the one shown
above. When you have read the two values, you simply copy the
needed bytes.
Example: 01 02 03 04 05 06 07 08 09 0a 0b <<Copy some of the
previous bytes: offset 4, length 3>> 0c 0d 0e
Note that in "Pack-Ice" and nearly all other packers the offset
is not counted from the beginning of the file, but it is counted
backwards from the current position, so you will copy the
following bytes: 08 09 0a (4 to the left from << and then 3 bytes
taken). So you will get the following unpacked data:
01 02 03 04 05 06 07 08 09 0a 0b 08 09 0a 0c 0d 0e
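A sketch of this copy step in C, assuming dst already holds
everything unpacked so far and o is the current write position
(again only an illustration of the principle, not Pack-Ice's
actual code):

#include <stddef.h>

/* Copy <length> bytes that start <offset> bytes behind the current
   write position. Copying byte by byte matters: when length is
   bigger than offset the copy overlaps itself and keeps repeating
   the pattern, which is how a short string like 4e 71 can be
   repeated many times from a single piece of copy information. */
static size_t copy_back(unsigned char *dst, size_t o,
                        size_t offset, size_t length)
{
    while (length--) {
        dst[o] = dst[o - offset];
        o++;
    }
    return o;   /* new write position */
}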
Well, now you should know everything about the most often used
packing algorithm. And if you use it in an optimized way, like in
"Pack-Ice", you will get good packing rates. The old "Bytekiller"
uses the same algorithm, but not as optimized, which makes the
files a lot bigger and unpacking half as fast.
Huffman
I have now introduced you to two packing routines and will show
you a third method. It might be the best known one - the Huffman
Packer.
As the Huffman Packer is quite slow at unpacking, it is not
often used, except by some archive programs like "lzh" or "arc".
Right now I don't feel like explaining in detail how it works,
otherwise I won't finish this article in time and it won't be in
ST NEWS at all. I am already late with writing it. But if you are
interested in the Huffman Packer, you will find lots of
documentation in your local library.
The Huffman Packer expects that the bytes in a file appear with
different frequencies. Most of the time bytes like $00, $ff, $01
and $02 are more likely to come up than $be or $97. For that
reason, you can shorten the 8 bits of the most frequent bytes to
- say - 6 bits (to save 2 bits whenever such a byte appears) and
lengthen rare bytes from 8 bits to 10 or more bits. If you do
this in the best possible way, you will save a lot of bytes all
in all. To pack with Huffman, you need to create a tree and sort
the bytes from the most frequently to the least frequently
appearing ones. If every byte appears the same number of times as
the others, you will not save any bytes. In fact you will lose
some, because for the Huffman packer you need a table in front of
the packed data that tells the unpacker how many bits every byte
takes.
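Here is a rough sketch in C of the core of the idea - counting
frequencies, building the tree and reading off the code lengths.
It is purely my own illustration (not the code of "lzh", "arc" or
any packer named here) and it leaves out the actual bit writing
as well as the table that has to be stored with the packed data:

#include <stddef.h>

struct hnode {
    unsigned long freq;
    int left, right;      /* child node indices, -1 for a leaf */
};

/* Walk the finished tree; the depth of a leaf is its code length. */
static void set_depths(const struct hnode *n, int idx, int depth,
                       int bits[256])
{
    if (n[idx].left < 0) {            /* leaf: idx is the byte value */
        bits[idx] = depth;
        return;
    }
    set_depths(n, n[idx].left,  depth + 1, bits);
    set_depths(n, n[idx].right, depth + 1, bits);
}

/* Count how often each byte value occurs, build the Huffman tree
   and return the number of bits each value would get. Frequent
   values end up with short codes, rare values with long ones. */
void huffman_code_lengths(const unsigned char *data, size_t len,
                          int bits[256])
{
    struct hnode n[512];              /* 256 leaves + internal nodes */
    int live[512];                    /* roots still to be merged    */
    int nlive = 0, next = 256, i;

    for (i = 0; i < 256; i++) {
        n[i].freq = 0;
        n[i].left = n[i].right = -1;
        bits[i] = 0;
    }
    for (i = 0; i < (int)len; i++)    /* 1. count frequencies */
        n[data[i]].freq++;
    for (i = 0; i < 256; i++)
        if (n[i].freq > 0)
            live[nlive++] = i;

    while (nlive > 1) {               /* 2. merge the two rarest trees */
        int a = 0, b = 1, j, t;
        if (n[live[b]].freq < n[live[a]].freq) { t = a; a = b; b = t; }
        for (j = 2; j < nlive; j++) {
            if (n[live[j]].freq < n[live[a]].freq)      { b = a; a = j; }
            else if (n[live[j]].freq < n[live[b]].freq) { b = j; }
        }
        n[next].freq  = n[live[a]].freq + n[live[b]].freq;
        n[next].left  = live[a];
        n[next].right = live[b];
        live[a] = next++;             /* merged tree replaces one root */
        live[b] = live[--nlive];      /* last root fills the other gap */
    }

    if (nlive == 1)                   /* 3. read off the code lengths */
        set_depths(n, live[0], 0, bits);
}

(A file containing only one distinct byte value would end up with
a 0-bit code in this sketch; a real packer has to special-case
that.)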
Dynamic Huffman Packer
If you improve your Huffman Packer further, you will get a
dynamic Huffman Packer. It is better because it modifies its
table while unpacking. Imagine you have a lot of 00 bytes at the
beginning of the file and a lot of ff bytes at the end. In this
case it is useful to change your table during unpacking.
Shannon-Fano Compression
Shannon-Fano compression is another algorithm to reduce the
amount of bytes. Like Huffman, the SF packer is a variable length
packer. That means that you operate with bits, not with bytes. To
explain how it works, imagine you have a text. As with Huffman,
you count how often each character appears in this text. Then you
arrange your character set in order of probability of occurrence.
After that is done, the set must be divided into two subsets with
equal or almost equal total probability of occurrence. The codes
of the characters in the first subset get a %0 as their next
digit and those in the second subset get a %1. Now keep on
dividing the subsets until each character has its own binary
number.
Compared to the Huffman Packer, you will get similar packing
rates, so use the one you like best.
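A small sketch of the dividing step in C - my own illustration,
which only records how many bits each character ends up with:

#include <stddef.h>

/* Shannon-Fano sketch: syms[first..last] must already be sorted by
   falling frequency. The range is split into two parts with (almost)
   equal total frequency; one part gets a %0 appended to its codes,
   the other a %1, and both parts are split again until every
   character stands alone. Only the code lengths are recorded. */
static void shannon_fano(const unsigned long *freq, const int *syms,
                         int first, int last, int bits[256])
{
    unsigned long total = 0, upper = 0;
    int i, split;

    if (first >= last)                 /* single character: done */
        return;

    for (i = first; i <= last; i++)
        total += freq[syms[i]];

    /* grow the upper part until it holds at least half the total */
    split = first;
    for (i = first; i < last; i++) {
        upper += freq[syms[i]];
        split = i;
        if (upper * 2 >= total)
            break;
    }

    for (i = first; i <= last; i++)    /* every code grows by one bit */
        bits[syms[i]]++;

    shannon_fano(freq, syms, first, split, bits);      /* the %0 part */
    shannon_fano(freq, syms, split + 1, last, bits);   /* the %1 part */
}

You would call it with bits[] cleared to zero, freq[] holding the
character counts, and syms[] holding the characters that actually
occur, sorted by falling frequency.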
Others
There are still some other packing algorithms that are good for
special uses, like the compression of text, pictures, digital
sound etc.
To pack text, you can sometimes use the 8th bit that is not used
by ASCII. Or you can use a dictionary packer, in which the text
is analysed and every known word, like "and" or "computer", is
replaced by a number. This is a special use and requires a big
dictionary. You can also replace spaces by tabs (good for source
code). There are many ways to save some bytes in texts.
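To show the 8th-bit trick, here is a small sketch in C that packs
plain 7-bit ASCII so that every 8 characters only take 7 bytes.
It is only an illustration; the unpacker would also need to know
the original length of the text:

#include <stddef.h>

/* Pack 7-bit ASCII characters tightly: the 7 valid bits of each
   character are written one after another into the output.
   Returns the packed size. */
size_t pack7(const unsigned char *text, size_t len, unsigned char *out)
{
    size_t o = 0, i;
    unsigned long acc = 0;   /* bit accumulator */
    int nbits = 0;           /* valid bits currently in acc */

    for (i = 0; i < len; i++) {
        acc = (acc << 7) | (text[i] & 0x7f);   /* append 7 bits */
        nbits += 7;
        while (nbits >= 8) {                   /* flush whole bytes */
            nbits -= 8;
            out[o++] = (unsigned char)(acc >> nbits);
        }
    }
    if (nbits > 0)                             /* last partial byte */
        out[o++] = (unsigned char)(acc << (8 - nbits));
    return o;
}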
Pictures can be reorganized. Low res screens consist of 4
planes. One pixel is stored in 4 bits that lie in 4 different
consecutive words. If you put the 4 bits of a pixel together,
repeating pixels can be packed. You need different routines for
medium res screens. You don't need to change anything for hires
screens, because there is only one plane. Another way to pack
pictures is to use the fact that repeating pixels are often found
in vertical lines.
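As a sketch of that re-organisation, assuming the usual ST low
resolution layout (4 plane words per group of 16 pixels, 32000
bytes per screen), the following C routine collects the 4 bits of
each pixel into one byte. For simplicity it spends a whole byte
per pixel instead of packing two pixels into one byte:

#include <stddef.h>

/* Gather the 4 plane bits of every pixel into one byte, so that
   runs of identical pixels become runs of identical bytes that a
   run length or string packer can handle.
   screen: the screen memory read as 16000 16-bit words.
   chunky: 64000 output bytes, one per pixel. */
void planes_to_chunky(const unsigned short *screen,
                      unsigned char *chunky)
{
    int group, x, plane;

    for (group = 0; group < 16000 / 4; group++) {
        const unsigned short *w = &screen[group * 4];
        for (x = 0; x < 16; x++) {
            unsigned char pixel = 0;
            for (plane = 0; plane < 4; plane++)
                if (w[plane] & (1u << (15 - x)))
                    pixel |= (unsigned char)(1 << plane);
            chunky[group * 16 + x] = pixel;
        }
    }
}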
Digital sound is not easy to pack, but it can be done with
relative packing. That means you always calculate the difference
between every byte and the preceding byte. Those differences
repeat much more often than the original bytes do.
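A minimal sketch of this relative packing in C (the differences
would then be fed to one of the packers above):

#include <stddef.h>

/* Relative ("delta") packing: replace each sample by its difference
   to the preceding one. The differences repeat far more often than
   the raw sample bytes, so a following packer gets better results. */
void delta_encode(const unsigned char *in, unsigned char *out, size_t len)
{
    unsigned char prev = 0;
    size_t i;
    for (i = 0; i < len; i++) {
        out[i] = (unsigned char)(in[i] - prev);   /* wraps modulo 256 */
        prev = in[i];
    }
}

/* The exact inverse, used when unpacking. */
void delta_decode(const unsigned char *in, unsigned char *out, size_t len)
{
    unsigned char prev = 0;
    size_t i;
    for (i = 0; i < len; i++) {
        prev = (unsigned char)(prev + in[i]);
        out[i] = prev;
    }
}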
Animations are also packed with relative packing. That means you
just save the difference from the previous picture. This saves a
lot of data. Remember this when you digitize your next porno
show...
Packers - a short summary of the most popular ones
Now I have shown you the most common packing routines and I
think that your brain is burning by now. So let us move on to
something easier to read. I will introduce some different
packers:
The first packer I saw on the ST was the "Happy-Packer". It was
named Happy Packer because at that time there was an article
about packing algorithms in a German magazine called "Happy
Computer". That was maybe 4 years ago and, as you know, this
packer was not very good in terms of packed size or unpacking
speed. A 60 kB program was unpacked in about 10 seconds.
The "Happy Packer" is a two-pass packer. It first does run
length encoding and then uses the Huffmann Packer. In fact, it
works exactly as explained in the magazine.
Another packer that appeared was an improved "Happy Packer".
This was done by -me- of The Exceptions. He added a pass and
reduced the size of the packed programs. This packer was used in
the legendary "Union Demo".
The next packer was the "Jek-Packer" ("Bytekiller" on the
Amiga). It gave the best results compared to other packers and I
often used it to pack some data - until I suddenly found out that
there
were bugs in the unpacked data. The unpacked data was different
than the original data. I changed the program a little and
inserted a verify routine, that simply checks if packing was done
correctly and I was stunned to see how often there were errors.
The reason was that the data was unpacked to the same address
where the packed data was. This is a nice thing, but has the
effect that packed data is sometimes overwritten by the unpacked
data. This causes some fatal errors. There were some other bugs
in the packer and an improved version coming from Ilja/Level 16,
STC/Brainpower, -me-/Tex and Axe/Delight (note: now Superior!)
was finished more than one year ago (February 1990). I named this
packer "Delight Packer".
Oh yes, the "Jek Packer" only uses string packing, and it is
also one of the packers that unpack all data from behind. There
is not much advantage in this method, except that the files can
be packed from the front, starting with the first byte and ending
with the last. The "Bytekiller" uses d5 for a checksum and after
unpacking a byte tells if the file was damaged on the disk.
The "Automation Packer" was well known, but there were also a
lot of bugs in there. The packer didn't work on hard drives and
the like. In fact, it was badly programmed, and I am sure that
the packing routine was ported from the Amiga as well. It was the
only intelligent part of the program, even though it too was
full of bugs. Now, the "Automation Packer" has a nice shell and
has been improved a lot. I don't know if the packing routine has
also been improved or not. I never use it anyway.
The "Pompey Pirates Packer" is one of the better packers around.
It can compress strings, repeating bytes and replace bytes with
only 1 bit set (1,2,4,8,..) by less than 8 bits. This is quite
useful, as those bytes come up very often in files. Some things
could be better though: The string packer is not very optimized,
the offset could be set higher for better compression (up to
$10000) and packing takes very long (Hi JPM! Are these enough
improvement suggestions?).
Note: I am referring to the "Pompey Pirates Packer" 1.5, as I
don't have 1.9 yet. Well, it might be improved. I don't know.
A valuable lesson
By doing a lot of packing stuff, I have learned that it is very
useful to have a verify option in a packer, because there are
often bugs hidden in the packer, even if you think that you have
eliminated them all. And nothing is worse than creating damaged
files when packing. So I suggest that all packer writers include
a verify option.
Another thing that is interesting for all lamers: Please learn
to remove this horrible color flashing in the background. It
hurts my eyes, it makes good graphics look like shit and uses up
memory and processor time. Especially those guys who use "Pack-
Ice" and insert their color flashing shit in the unpacking
routine should be shot into space where they can do as much color
flashing as they want to. Why do I optimize an unpack routine as
much as possible if some lamer just adds 20 bytes for color
flashing?
If you want to demonstrate that a program is packed, then say
"Packed by...". That should do it.
I have often been asked when "Pack-Ice" 3.0 will be finished....
Well, to be honest, I haven't done any coding on "Pack-Ice" for
2 or 3 months. But right now I have the time to get back to
coding. Maybe you will not get 3.0, but only 2.5 or something,
but I'll try to put the best packer into your hands, so you
and everybody else can save a lot of disk space.