Monday, March 31, 2008

Bases, merits, and defects of packed code

How many times have we seen an included JavaScript file that looks completely incomprehensible?
These examples should explain better what I mean:
packed.it via MyMin

eval((function(M,i,n){return '0("1");'.replace(/\w+/g,function(m){return (n[m]!=i[m]&&i[m])||(i[m]=M[parseInt(m,36)])})})('alert.test'.split('.'),{},Object.prototype))


Dean Edwards packer

eval(function(p,a,c,k,e,d){e=function(c){return c};if(!''.replace(/^/,String)){while(c--)d[c]=k[c]||c;k=[(function(e){return d[e]})];e=(function(){return'\\w+'});c=1};while(c--)if(k[c])p=p.replace(new RegExp('\\b'+e(c)+'\\b','g'),k[c]);return p}('1(\'0\');',2,2,'test|alert'.split('|'),0,{}))


These are only two packers whose aim is to reduce code size using an inline decompression technique, able to evaluate the source code after common keywords have been replaced back. Both snippets above, once decompressed, simply execute alert("test");.
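By the way, a quick trick to see what such a script really contains is to replace the outer eval with alert (or document.write) and run it; with the first snippet above, for example:

// same code as the packed.it example, with eval replaced by alert
// so the decompressed source is shown instead of executed
alert((function(M,i,n){return '0("1");'.replace(/\w+/g,function(m){return (n[m]!=i[m]&&i[m])||(i[m]=M[parseInt(m,36)])})})('alert.test'.split('.'),{},Object.prototype));
// shows: alert("test");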


Basis of portable and packed code


The main goal of these services is to compress a source by reducing the occurrences of repeated words, using different techniques to create both the compressed string and the inline function with the decompression algorithm.
In this paragraph we will see a rudimentary example of how it is possible to create our own compressor. Let's start with the first function:

function pack(code, keywords){

    var re = /\w+/g,
        filter = {},
        key;

    // for each word or number in the string
    while(key = re.exec(code))
        // make the found one unique:
        // if the property does not exist it is created,
        // otherwise it is simply overwritten
        filter[key[0]] = 0;
        // for example, if code is "a b a", filter
        // will contain only two properties, a and b,
        // with value 0 for both of them

    // with the list of unique words or numbers found
    for(key in filter)
        // avoid inherited Object.prototype properties or methods
        if(filter.hasOwnProperty(key))
            // add the key to the keywords array
            // and save its position in the array into filter,
            // converting it into a base 36 string
            filter[key] = (keywords.push(key) - 1).toString(36);
            // for example, if code is "a b a", filter.a will be "0"
            // and filter.b will be "1"

    // for each word or number,
    // return its keyword index in base 36 format
    return code.replace(re, function(key){
        return filter[key];
        // for example, if code is "a b a"
        // the returned code will be "0 1 0"
    });
};


var myCode = 'alert("this alert will show this text");',
    myKeywords = [],
    packed = pack(myCode, myKeywords);
alert(packed); // 0("1 0 2 3 1 4");

Reading the comments, we can understand the logic behind a client side compressor.
And since compression usually follows a "one to many" logic, where one is the compression operation and many are the clients that will decompress the code, the simpler and faster the decompression operation is, the more powerful, portable, and compatible our algorithm will be.

function unpack(code, keywords){

    // for each word or number
    return code.replace(/\w+/g, function(key){
        // return the related keywords value,
        // converting the base 36 string into
        // a base 10 index
        return keywords[parseInt(key, 36)];
        // for example, if code is "0 1 0"
        // the returned string will be
        // keywords[0] + " " + keywords[1] + " " + keywords[0]
    });
};

(unpack(packed, myKeywords) === myCode); // true

The last link in our chain is a function able to automatically create the final, portable code:

function portablePack(code){
    var
        // keywords container
        keywords = [],

        // packed version of the code
        packed = pack(code, keywords),

        // map used to make the returned string evaluable
        safe = {"'":"\\'", "\\":"\\\\", "\n":"\\n", "\r":"\\r"};

    // to avoid problems with some special chars inside the string,
    // make the packed result safe by replacing them with the safe object values
    packed = packed.replace(/'|\\|\n|\r/g, function(match){
        return safe[match];
    });

    // return an evaluable string
    return "eval((function(p,l){return p.replace(/\\w+/g,function(k){return l[parseInt(k,36)]})})('" + packed + "','" + keywords.join(".") + "'.split('.')))";

    // the created string contains an inline function
    // able to give the original code back
    // to the eval function. The created string will be
    // something like
    // eval((function(packedString,keywords){return unpack(packedString,keywords)})("packed string", "key.words.list".split(".")))
};

This is our first home made client side compression example: not really efficient, but good enough to understand the logic behind it.
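As a quick sanity check, here is a minimal sketch of the whole chain in action; the string shown in the comment is simply what the functions above should produce for this input:

// pack some code and get back an evaluable, self decompressing string
var portable = portablePack('alert("Hello packed world");');

// portable should now look like
// eval((function(p,l){return p.replace(/\w+/g,function(k){return l[parseInt(k,36)]})})('0("1 2 3");','alert.Hello.packed.world'.split('.')))

// evaluating it runs the original code
eval(portable);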


Merits of the client side compression technique


It simply does not require server side operations to optimize the size of our scripts, which grow every day more than ever "thanks" to the Web 2.0 era.
The required bandwidth will be lower than before, the download speed will increase, and the more code we pack, the better the ratio between source and packed code is likely to be, thanks to the common names used for common programming routines.
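To get a rough idea of when packing pays off, we can simply compare sizes using our toy compressor. This is only a sketch with an artificial, highly repetitive source, so real numbers will obviously vary:

// build an artificial source with a lot of repeated identifiers
var source = new Array(50).join('document.getElementById("container").style.display = "none";\n'),
    portable = portablePack(source);

alert([
    'clear size:  ' + source.length,
    'packed size: ' + portable.length,
    'ratio:       ' + (portable.length / source.length).toFixed(2)
].join('\n'));
// the more the same words are repeated, the better the ratio becomes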


Defects of packed code


Nowadays, almost every browser supports gzip or deflate runtime decompression.
These compression algorithms are really efficient thanks to their decompression speed, 10 to 100 times faster than pure JavaScript operations plus evaluation, and thanks to the support provided by every server side programming language.
Every time we download client side packed code, even if the file is saved in the browser cache, its decompression has to be executed every time we visit that page again.
So, if our goal is to increase page interaction speed during navigation, the JavaScript decompression delay will be a problem.
Whether this problem is negligible or heavy depends only on the client hardware and its browser performance.
At the same time, applying gzip or deflate compression over packed code will not truly increase performance, and the result will be bigger than gzip compression of the clear code.
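Just to give an idea, here is a trivial sketch that measures how long the unpack step of our toy example (the packed and myKeywords variables used above) takes. Real packers do more work than this, and the delay is paid on every page load, cached file or not:

// measure the pure JavaScript decompression time
var start = new Date,
    original = unpack(packed, myKeywords),
    elapsed = new Date - start;

alert('decompression took ' + elapsed + ' ms before eval could even start');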

The reason gzip over packed code performs worse is that common compression algorithms create the compressed output using a dictionary that contains the common, repeated words or letters found in the original string.

Since JavaScript compressors usually replace numbers or words with a unique identifier in order to be able to recreate the original string, the resulting string will contain many more character pairs than before, and every baseN encoded key, which is not human friendly and for this reason rarely written in the original code, could add one more byte to the final compressed string.
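As a small illustration of those keys, this is what the first 40 base 36 indexes look like; short tokens of this kind rarely appear, with this meaning, in the clear source:

// list the first 40 base 36 keys a packer would generate
var keys = [], i = 40;
while(i--)
    keys.unshift(i.toString(36));
alert(keys.join(' ')); // 0 1 2 ... 9 a b ... z 10 11 12 13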

Finally, client side compressors are usually not 100% compatible with every kind of code, and this is another reason to prefer minifiers + gzip|deflate to obtain the best, fastest, and ultimately smallest result.


Conclusion


Nowadays, a pure JavaScript client side runtime decompression technique is not really necessary, but in some cases it can do things that gzip or deflate will never be able to do: for example, merging both JavaScript and CSS using a single, common keywords list to produce a unique file that contains both JavaScript and CSS with the best size result, requiring only 1 download instead of 2 (1 for JavaScript, 1 for CSS). That is what packed.it has been able to do since 2007, but never forget the speed issue, and please remember that the bigger the packed code is, the more delay there will be in every page that uses it.
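Just to clarify the idea with our rudimentary functions (this is only a sketch, and not how packed.it works internally): since the pack function leaves non word characters untouched, an arbitrary separator such as "@@@" can glue JavaScript and CSS together around a single, shared keywords list.

// pack JavaScript and CSS together using one shared keywords list
var js = 'alert("red body");',
    css = 'body{color:red}',
    keywords = [],
    packedBoth = pack(js + '@@@' + css, keywords);

// one download later, the client splits them again
var both = unpack(packedBoth, keywords).split('@@@'),
    originalJs = both[0],  // alert("red body");
    originalCss = both[1]; // body{color:red}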
