INTRODUCTION:
In the short time since its launch, Sun Microsystems's Java Technology
has become almost synonymous with portable software that can be distributed
across the internet. Java's pre-eminent position is reinforced by the fact
that built-in support for its distribution format, the JVM, is now not
only part of every World Wide Web browser, but is starting to appear even
within operating systems.
This distribution of mobile code is achieved using platform neutral
byte codes which are contained in a unit referred to as a ‘class’. One
of the most important factors to consider for code mobility is the network
transmission performance. For applications running across the internet,
this factor could be quite indeterministic and could prove to be a bottleneck
in more than one situation. To reduce the dependency on the network performance
it may become necessary to achieve good compression of code before sending
down the wire.
Taking a look at the class file structure defined in the JVM specification
it becomes obvious that there is potential scope to reduce the file size
using compression techniques.
OVERVIEW:
Even though there are several structures within a java class which can
be compressed this exercise primarily focuses on compression of methods
( byte codes). The primary technique used here for compression
is based on patternization as suggested by [Code Compression; Jens Ernst,
etc..][Custom inst. Sets for code compression; Fraser, Proebsting].
The algorithm tries to match the input instructions with known instruction
patterns and substitutes the matched instructions with new specialized
intructions.
Post byte code compression, the compression algorithm further compresses
using the GZIP compression to achieve more than 50% compression on the
overall class file size.
Finally a general purpose file and URL class loader have been implemented
which can read compressed as well as uncompressed class files. All compressed
class files are passed through a decompressor to return the original class
bytes back to the JVM. One additonal noteworthy benefit is that,
as the compression algorithm changes the output class file structure(maintaining
the semantic structure) in such a way that only a proprietary class File
reader can read the class, we automatically also get "obfuscation" of the
class files.
GOALS:
-
Provide a byte code compression algorithm which will compress any java
class file. The algorithm can be just specific to only handle java class
files.
-
Further provide compression of the byte code compressed classes using general
purpose compression techniques like GZIP, etc., to achieve overall compression.
-
Provide a decompression algorithm which can handle java classes which are
just byte code compressed or both, byte code compressed and GZIP compressed.
-
Provide ways by which unnecessary debugging information can be eliminated
from class files.
-
Compare the results.
-
Create a Class file loader which can load the classes which are compressed
or uncompressed.
FEW PUBLISHED COMPRESSION METHODS:
There are several compression techniques available today and trade-offs
have to be made to choose one over the other. For instance some of the
better compression techniques are multi-pass resulting in more time spent
for compression and decompression.
I)
One compression technique suggested involves patternizing the input
[Proebsting; Fraser and Proebsting]. Patternization accepts an actual program
and proposes specialized instructions that might help compress that program.
The patterns replace each combination of operands with wildcards.
For example the code "FetchInt( AddrLocal[4])" generates the patterns:
1. FetchInt(*)
2. FetchInt(AddrLocal[*])
3. FetchInt(AddrLocal[4])
II)
[Jens Ernst, etc] proposes the above technique along with other compression
techniques to achieve a compression factor of about 5. They use the term
wire-format for the compressed code as they cannot be interpreted directly
and need to be decompressed before they can be used.
The technique suggested involves patternizing out all literals, form
one stream for all patterns and one for containing the literal operands
associated with each opcode or class of related opcodes, MTF code(discussed
below) each stream, and gzip the resulting stream in isolation.
MTF coding (Move-to-front) technique starts by replacing sequence elements
with their indices in a table the changes dynamically. The table’s elements
are ordered such that the first element was the most recently accessed
element; after each new access, the accessed element is moved to the front
and all intermediate elements are shifted down one place. A sequence with
high spatial locality tends to yield a sequence of small indices, which
should compress well.
III)
[Franz, Kistler] suggest a different intermediate code representation
than the linear format seen in Java byte codes. This intermediate code
is referred to as "slim binaries". The slim binary representation is based
on abstract syntax trees and describes the actions of the original program
similar to a parse tree. This intermediate tree-representation is compressed
by merging isomorphic sub-trees, using a variant of Welch’s classes LZW
algorithm that has been specifically adopted towards compressing program
trees.
ANALYSIS:
The JVM instruction set lends itself well for the technique suggested in
I) for two primary reasons:
-
The instruction set is small and mostly stack based and uses numbered "local
variables" to store results. There are number of instruction which involve
moving data onto the top of the stack, operating on them and returning
the results back to the top of stack. E.g.:
1. ALOAD_0 -> INVOKEVIRTUAL <operand> ; 2. ALOAD_0 -> ICONST_0
-> PUTFIELD <operand> ( Notation:: -> stands for "followed by" )
-
The JVM instruction is a single byte instruction allowing 255 possible
instructions. Only 201 instructions are currently being generated for use
in class files. This potentially allows us to create about 50 new instructions
based on patterns matched.
The technique suggested by III), even though quite attractive does not
lend itself well for the current assignment, where it is essential for
current JVM’s in the market to be able to execute the code after decompression.
It has been designed for a virtual machine which can operate directly on
tree representations of the code.
BYTE CODE COMPRESSION USING Static Patterns:
The technique used here is similar to I) wherein patterns in the instruction
set are matched with templates and replaced with new instructions. It is
different from I) because rather than figuring out the repeating patterns
dynamically, the current implementation uses static templates (Found studying
many different disassembled class file’s). Similar to I) the technique
uses "operand specialization", factoring out the operands during template
matching. The use of having "static templates" makes the algorithm single
pass.
The compression technique can be best explained through a example. Consider
the following disassembled byte code shown below:
Method public void enable()
>> max_stack=2, max_locals=4 <<
0 ALOAD_0
1 GETFIELD #197 <Field Component.enabled:boolean>
4 iconst_1
5 if_icmpeq 38
8 ALOAD_0
9 ASTORE_1
10 aload_1
11 monitorenter
12 aload_0
13 iconst_1
14 putfield #197 <Field Component.enabled:boolean>
17 ALOAD_0
18 GETFIELD #176 <Field Component.peer:peer.ComponentPeer>
………
37 NEW #28 <CLASS JAVA.LANG.STRINGBUFFER>
40 DUP
A couple of the very commonly occurring instruction sequences is highlighted(
uppercase) in the sample code.
The compression technique basically deals with two types of instruction
sets:
-
Instructions using Numbered Variables.
-
Instructions accessing the constant pool.
Instructions using numbered Variables:
These are instructions that use variables to store intermediate values
during method execution. The variables are numbered from ‘0’ , onwards.
These instructions can be further classified into 2 distinct types: One
set of Instr's using variables '0' through '3' and another set using variable
'4' upwards. The instructions using variables '0' through '3' includes
the variable number as part of the opcode itself, occupying only a single
byte. E.g. ILOAD_0, ILOAD_1, ILOAD_2, ILOAD_3. Instructions using variables
'4' onwards use a single byte operand to store the variable number.
Now consider the instruction sequence ALOAD_0 -> ASTORE_1 ( Notation:
-> stands for , "followed by"), highlighted in the sample code are instructions
which use numbered variables. The compressor matches this instruction sequence
using the following template: ALOAD_# -> ASTORE_# . Here ‘#’ is any
numbered variable used in the instruction. The match results in the replacement
of the two instructions with a new instruction ALOAD_ASTORE, occupying
only a single byte.
By factoring out the numbered variable from the instructions, we can
see that a single instruction (ALOAD_ASTORE) matches a total of 16 different
combinations of ALOAD_# -> ASTORE_# . This factorization technique is very
useful given that only about 50 new instructions can be generated ( A different
approach where generation of new instructions of variable length, may also
have been possible, but not considered here ).
The compressor uses a separate stream to store all the factored out
numbered variables in the same sequence as they appear in the original
program. Now as all the numbered instructions only use variables ‘0’ through
‘3’ , only two bits are needed to store any given variable number.
So, for the considered instruction sequence using 2 bytes in the original
class, a single byte new instruction and 4 bits for storing the variable
numbers will be required. From this simple example, it can be observed
that only ½ a byte has been saved, but the idea can be extended
to match sequences containing more than 2 instructions, resulting in better
savings.
Instructions accessing the constant pool:
There are many instructions within the JVM instruction set which use a
16 bit unsigned integer index as an operand pointing to an entry in the
constant pool.
The following about the constant pool are worth noting:
-
The entries in the constant pool can be one of twelve different types.
Entry types like ‘FieldRef’ and ‘MethodRef’ contain values which in turn
point to other entries like ‘Class’ and ‘Name_and_type’.
-
Studying typical class files suggested that the references made by the
constant pool accessing instructions point to the first few hundred entries
of the constant pool, even though the total number of constant pool entries
might be quite large.
-
A large number of class files of typical java applications are relatively
small and have less than 255 constant pool entries.
The JVM instructions accessing the constant pool use a 16 bit integer index
into the constant pool irrespective of the size of the constant pool. The
compressor determines the size of the constant pool in the original program
and substitutes these indexes with a 8 bit integer index when all the operands
of constant pool accessing instructions is less than 255 ( Note: The actual
number of entries in the constant pool may be much larger).
Consider the instruction sequence NEW <index> -> DUP in the sample
program. The compressor matches this instruction sequence using the following
template: NEW <xx>-> DUP . The match results in the replacement
of the two instructions consuming 4 bytes with a new instruction NEW_DUP,
occupying only a single byte. The constant pool index <xx> is factored
out into a separate stream for constant pool indexes and stored as a single
byte. This technique results in 50% saving.
The below table shows the current patterns (templates) the compressor
supports along with the new instruction generated plus the extra stream
information corresponding to constant pool indexes and variable numbers.
Current Byte Code Patterns matched (All instructions are
single byte)
Pattern |
New Instruction |
o/p to Constant Pool Idx Stream |
o/p to Numbered Variable Stream |
|
|
|
|
ICONST_x -> ISTORE_x |
ICONST_ISTORE |
|
<x -> x> :: 4 bits |
ILOAD_x -> ISTORE_x |
ILOAD_ISTORE |
|
4 bits |
ALOAD_x ->ASTORE_x |
ALOAD_ASTORE |
|
4 bits |
ISTORE_x->ILOAD_x |
ISTORE_ILOAD |
|
4 bits |
ASTORE_x->ALOAD_x |
ASTORE_ALOAD |
|
4 bits |
ALOAD_x ->
GETFIELD yy |
ALOAD_GETFIELD |
1 byte |
2 bits |
ALOAD_x ->
INVOKEVIRTUAL yy |
ALOAD_
INVOKEVIRTUAL |
1 byte |
2 bits |
LDC yy ->
INVOKEVIRTUAL zz |
LDC_INVOKEVIRTUAL |
2 bytes |
|
ALOAD yy ->
INVOKEVIRTUAL zz |
ALOAD_
INVOKEVIRTUAL1 |
2 bytes |
|
NEW yy -> DUP |
NEW_DUP |
1 byte |
|
ALOAD_x ->
PUTFIELD yy |
ALOAD_PUTFIELD |
1 byte |
2 bits |
ALOAD_x ->
INVOKESPECIAL yy |
ALOAD_
INVOKESPECIAL |
1 byte |
2 bits |
ALOAD_x ->
GETSTATIC yy |
ALOAD_GETSTATIC |
1 byte |
2 bits |
ALOAD_x -> ICONST_x -> PUTFIELD yy |
ALOAD_ICONST_
PUTFIELD |
1 byte |
4 bits |
High Level Design:
Class Compressor Architecture:
The class files which have structure as defined in the JVM specification
are compressed using a "Class Compressor". A "class file reader" reads
the entire class file, parses it into a container "Class_info". A Class_info
itself is a aggregation of components like FieldInfo, MethodInfo, etc(
Class diagram below shows the relationships). The "Class Compressor" delegates
the responsibility of actual compression to a "Compressed Class File Writer".
The methods are compressed using a "Byte Code Compressor" and the entire
compressed "Class Info" structure is further compressed using a GZIP compression
stream. The final compressed "Class Info" structure is streamed out a file
with a ".cls" extension.
Class Decompressor Architecture:
A "Compressed Class File Reader" can read compressed .cls files, parse
it, decompress it and create a "Class Info" structure. If the .cls
file is GZIP compressed, it is first read through a GZIP inflation stream.
The methods read are passed on to a "Byte Code Decompressor" and the inflated
methods are written on to "Class Info". Finally a "Class File Writer" is
used to write the "Class Info" data structure back to a file stream (.classd).
Class diagram 1: Class Compression:
The diagram below shows some of the important classes used for class compression
along with their relationships.
Class diagram 2: Class Decompression:
The diagram below shows some of the important classes used for class(.cls)
decompression along with their relationships.
Class diagram 3: Compressed Class Loading:
Implementation Notes:
Most of the code for the Class File Reader and Writer was downloaded from
the world wide web. Few modifications and bug fixes had to be done.
The class CompressedClassFileReader was implemented ,overriding some
of the behaviour of ClassFileReader. It for instance, overrides "method
reading", uses a ByteCodeDecompressor to decompress the method byte code
and write the inflated byte code into ClassInfo.
Similarly class CompressedClassFileWriter overrides some of the behaviour
of ClassFileWriter. It for instance, overrides "method writing",
uses a ByteCodeCompressor to compress the method byte code and write the
deflated byte code into ClassInfo.
A trivial file and URL class loaders have been implemented which can
read compressed(.cls) or uncompressed(.class) files. The uncompressed classes
are in turn resolved using the primordial class loader, hence validating
the decompression.
The entire implementation is done using 'java' and has been implemented
without catering to any specific performance goals.
..........
To install and execute the Class compressor and Compressed Class file
loader, read the User's guide.
Results:
Byte Code compression Results( all figures in bytes)
Class |
Original Size ( bytecode) |
Compressed byte code size
(bytecode + operand stream) |
% change |
BoundObject.class |
4079 |
2537 + 613 = 3150 |
|
Applet.class |
255 |
179+32 = 211 |
|
Column.class |
1091 |
715+148 = 863 |
|
Animation.class |
1173 |
801+157 = 958 |
|
Assignment.class |
464 |
266+74 = 340 |
|
AWTEventMulticaster.class |
1250 |
990+109 = 1099 |
|
ClassFileReader.class |
2911 |
2534+154 = 2688 |
|
ClassFileWriter.cls |
2825 |
2404+175 = 2579 |
|
Overall Class Compression Results ( all figures
in bytes)
Class |
Original size |
Just bytecode compression |
Bytecode + GZIP + removal of debug data |
PKZIP only |
BoundObject.class |
13899 |
12974 |
4198 |
5986 |
Applet.class |
2713 |
2673 |
1071 |
1240 |
Column.class |
9352 |
9128 |
2359 |
2860 |
Animation.class |
6576 |
6365 |
2956 |
3656 |
Assignment.class |
3378 |
3258 |
807 |
967 |
AWTEventMulticaster.class |
7286 |
7139 |
1934 |
2406 |
ClassFileReader.class |
10919 |
10703 |
3526 |
4739 |
ClassFileWriter.class |
14175 |
13933 |
3759 |
6210 |
Future Enhancements:
-
Written in a native language such as 'C', to achieve better performance.
-
Change the ByteCode compression algorithm to generate patterns dynamically
and suggest new instructions.
-
Only about 15 new instructions are currently being generated. At least
30 new instructions, matching patterns could be added to the Byte
code compressor.
-
Consider packaging several class structures into a single compact structure.
-
Compress all the other structures within a class, to achieve good overall
compression.
-
etc..
Conclusion:
Byte code happens to constitute only about 10% to 20% of the overall class
file size. So even a 50% compression would only result in a 5 to 10% compression
of the overall class file size. So it is very essential to compress all
the other structures of the class file to achieve good compression ratio's.
Constant pool constitutes about 60% of the total class file size, and
its structure is ideal for compression.
For instance:
The tag bytes, one byte for each entry could be literally eliminated
by ordering the constant pool entries according to their type.
It is filled with UTF strings with many repeating string's like the
package name 'java.lang...', which could be reduced.
Roughly about 10% of the class size is consumed by debugging information.
This could be eliminated for production systems.
References:
[1] Christopher Fraser and Todd Proebsting, "Custom Instruction sets for
Code Compression", Unpublished Technical report, available in http://www.cs.arizona.edu/people/todd/papers/pldi2.ps
[2] Jens Ernst(Univ. of Arizona), Christopher Fraser(Microsoft Research),
etc,; "Code Compression".; ACM SIGPLAN'97
[3] Thomas Kistler and Micheal Franz, "A Tree based alternative to Java
Byte-Codes"; UC Irvine.
[4] The Java Virtual Machine Specification, Sun Microsystems.
[5] Thomas Kistler and Micheal Franz, "Slim Binaries"; UC Irvine.
[6] Micheal Franz, "Adaptive Compression of Syntax Trees and Iterative
Dynamic Code Optimization: Two basic technologies for Mobile-Object systems.";
UC Irvine.
================================================================================