Saturday, August 13, 2011

How To JSONP A Static File

Update: I have discussed this topic separately and I agree that the url could be used as the unique id as well.

In this case the server should use the static url as unique id:



StaticJSONP.notify("http://cdn.com/static/article/id.js",{..data..});



So that on client side we can use the simplified signature:



StaticJSONP.request(

"http://cdn.com/static/article/id.js",

function (uid, data) {

}

);



The callback will receive the uid in any case so that we can create a single callback and handle behaviors accordingly.

The script has been updated in order to accept 2 arguments but, if necessary, the explicit unique id is still supported.
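
A minimal sketch of how the two signatures could be told apart, assuming the explicit unique id is simply an optional middle argument. The helper name normalizeRequestArgs and the "article_id" value are hypothetical and are not taken from the published script:

```javascript
// hypothetical helper, not the actual StaticJSONP source:
// accepts (uri, callback) or (uri, uid, callback)
function normalizeRequestArgs(uri, uid, callback) {
  if (typeof uid === "function") {
    // simplified signature: the url doubles as unique id
    callback = uid;
    uid = uri;
  }
  return {uri: uri, uid: uid, callback: callback};
}

// simplified signature, the url is the unique id
var simple = normalizeRequestArgs(
  "http://cdn.com/static/article/id.js",
  function (uid, data) {}
);

// explicit unique id, still supported
var explicit = normalizeRequestArgs(
  "http://cdn.com/static/article/id.js",
  "article_id",
  function (uid, data) {}
);
```

In both cases the callback ends up associated with one uid, so the rest of the logic would not need to know which signature was used.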






Under the list of "incomplete and never posted stuff" I found this article, which I have finally reviewed.

I know it's "not that compact" but I would really like you to follow the reasoning behind my solution to a not so common, but quite nasty, problem.



Back in 2001, my early attempts to include callbacks remotely were based on server side runtime compilation of some JavaScript data passed through a single function.



<?php // demo purpose only code



// do something meaningful with server data



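// create the output data at runtime

$pairs = array();

foreach ($data as $key => $value) {

  // one key:"value" pair per entry

  $pairs[] = $key.':"'.$value.'"';

}

// pairs must be comma separated to produce a valid object literal

$output = '{'.implode(',', $pairs).'}';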



echo 'jsCallback('.$output.')';



?>



The technique above became obsolete a few years ago thanks to the widely adopted JSON format and its native or library implementations in hundreds of programming languages.

Moreover, the technique above became the wrong way to do it thanks to JSONP, which has been a definitely better solution since the very beginning.

Here is an example of what JSONP services do today:



<?php // still demo purpose only code



echo $_GET['callback'].'('.json_encode($data).')';



?>





JSONP Advantages

The callback parameter is defined on the client side, which means it can be "namespaced" or it can be unique per each JSONP request.

If we consider the first example, every script in the page has to rely on a single global jsCallback function.

At that time I was using my code and my code only, so problems like conflicts, or the possibility that another library would define a different jsCallback in the global scope, did not exist.

Today I still use "my code and my code only" :D when it comes to my personal projects, but I am more aware than ever of the conflicts between multiple libraries that the primordial technique may cause, even if all these libraries are my own.
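
To make the client side part concrete, here is a minimal sketch of a JSONP request that generates a unique, prefixed callback name per request. The function names, the "__jsonp_" prefix and the "callback" parameter name are assumptions for illustration only, and the jsonp function itself needs a browser document in order to inject the script:

```javascript
// illustrative sketch only: names and prefix are assumptions
var jsonpCounter = 0;

function jsonpCallbackName() {
  // unique per request, prefixed to avoid global conflicts
  return "__jsonp_" + (jsonpCounter++);
}

function jsonpUrl(url, name) {
  // append callback=name, respecting an existing query string
  return url + (url.indexOf("?") < 0 ? "?" : "&") +
         "callback=" + encodeURIComponent(name);
}

function jsonp(url, callback) {
  var name = jsonpCallbackName(),
      script = document.createElement("script");
  window[name] = function (data) {
    // clean up the temporary global and the script node,
    // then forward the data to the real callback
    window[name] = null;
    script.parentNode.removeChild(script);
    callback(data);
  };
  script.src = jsonpUrl(url, name);
  (document.head || document.documentElement).appendChild(script);
}
```

Since each request owns its temporary global function, two libraries in the same page never compete for the same name. This only works if the server reads the callback parameter, which is exactly what a static file cannot do.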



JSONP Disadvantages

Well, the same reason that makes JSONP a powerful and more suitable technique is the one that could make JSONP the wrong solution.

If we still consider the first code example, nobody could stop me from being "really smart" and precompiling that file into a static one.



// static_service.js by cronjob 2011-08-14T10:00:00.000Z

jsCallback({category:'post',author:'WebReflection',title:'JSONP Limits'});



While precompiled static content may or may not be what we need for our application/service, it is clear that if no server side language is involved the common JSONP approach will fail: nothing on the server can read the callback parameter, so every callback in the main page depends on "the single exit point" hardcoded in the file, the jsCallback function.



Advantages Of Precompiled Static Files

The fastest way to serve a file from a generic domain is to serve a static one.

A static file can be cached either in the disk cache, rather than being sought and read each time, or directly in server RAM.

Also, a static file does not require any programming language at all: the only code executed is the one in charge of serving the file over the network, aka the HTTP Server.

The most common real world example of static files is a generic CDN, whose purpose is indeed to support as many requests per second as possible and where static files are most likely the solution.

The only extra code possibly involved is the one in charge of statistics on the HTTP Server layer, but every file can be easily mirrored or stored in any sort of RAID configuration and be served as fast as possible.



Another real world example could be a system like blogger.com, where pages do not necessarily need to be served dynamically.

Most of the content in any blog system can be precompiled at runtime, and many services/blog applications indeed do so.



The same goes for any other application/service that does not require real time data computation and where different cron jobs behind the scenes are in charge of refreshing the content every N minutes or more.

If we think about any high traffic website, we could do this basic analysis:



# really poor/basic web server performances analysis



# cost of realtime computation

1% of average CPU + RAM + DISK ACCESS per each user

# performances

MAX_USERS = 100;

AVERAGE_MAX_USERS = 100;



# cost of a threaded cron job

20% of average CPU + RAM + DISK ACCESS per iteration

# cost of static file serving

0.1% of CPU + RAM + DISK ACCESS per user

# performances

MAX_USERS_NOCRON = 1000;

MAX_USERS_WHILECRON = 800; # MAX_USERS_NOCRON - 20%

AVERAGE_MAX_USERS = 900;





If we consider that we may choose to delegate the cron job to a separate server inside the intranet, the only operation per changed static file will be LOCK FILE $f EXCLUSIVE, WRITE NEW CONTENT INTO $f, UNLOCK FILE $f EXCLUSIVE. Since basically only the DISK ACCESS will be involved, we can even increase AVERAGE_MAX_USERS to 950 or more.
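
The figures of the basic analysis can be reproduced with a few lines of arithmetic. This is only a sketch of the same conceptual model: costs are expressed in parts per thousand to keep the math on integers, and the average assumes the cron job is running half of the time, which is my reading of how 900 was obtained:

```javascript
// all costs expressed as parts per thousand of the server resources
var TOTAL = 1000,        // 100% of CPU + RAM + DISK ACCESS
    REALTIME_COST = 10,  // 1% per user
    STATIC_COST = 1,     // 0.1% per user
    CRON_COST = 200;     // 20% per iteration

// realtime computation per each request
var MAX_USERS_REALTIME = TOTAL / REALTIME_COST;              // 100

// static files, with and without the cron job running
var MAX_USERS_NOCRON = TOTAL / STATIC_COST;                  // 1000
var MAX_USERS_WHILECRON = (TOTAL - CRON_COST) / STATIC_COST; // 800

// assumption: the cron job is running half of the time
var AVERAGE_MAX_USERS =
  (MAX_USERS_NOCRON + MAX_USERS_WHILECRON) / 2;              // 900
```

With the same model, the less the cron job costs on the serving machine, the closer AVERAGE_MAX_USERS gets to MAX_USERS_NOCRON.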

I know this is a sort of off topic and virtual/conceptual analysis but please bear with me, I will bring you there soon.



Static Content And RESTful APIs

There is a huge number of services out there based on JSONP. Many of them require realtime data but many probably do not. Especially in the latter case, I bet nobody is implementing the technique I am going to describe.

A Real World Example

Let's imagine I work for Amazon and I am in charge of the RESTful API able to provide any sort of article related data.

If we think about it, a generic online shopping cart article is nothing more than a group of static info that will rarely change much during the day, the week, the month, or even the year.

Do online users really need to be notified in realtime, on each request, about current user rating, reviews, related content, article description, author, and any other "doesn't change so frequently" info related to the article itself? NO.

The only field that should be as up to date as possible is the price but still, does the price change so frequently during the lifecycle of an Amazon article? NO.

Can my infrastructure be so smart that if, and only if, a single field of this article changes, the related static file is updated so that everybody will instantly receive the new info? YES.

... but how can we do that if JSONP does not scale with static files?



My StaticJSONP Proposal

The only difference from a normal JSONP request is that, by passing through a single shared callback, any sort of library should be able to be notified.

Since the client side library is in charge of creating the requested url, and the same library knows in advance what it is going to ask for and what it is going to receive, all this library needs is to be synchronized with the unique id the static server file will invoke. I am going to tell you more, but as a quick preview this is how the static server file will look:



StaticJSONP.notify("unique_request_id", {the:response_data});





Server Side Structure Example

Let's say we would like to keep the folder structure as clear as possible. In this Amazon example we can think about splitting articles by categories.



# / as web server root



/book/102304.js # the book id

/book/102311.js

/book/102319.js



/gadgets/1456.js

/gadgets/4567.js



A well organized folder structure will result in both better readability for humans and easier access for most common filesystems.

Every precompiled file on the list will contain a call to the global StaticJSONP object, e.g.



// book id 102311

StaticJSONP.notify("amazon_apiv2_info_book_102311",{...data...});
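
As a sketch of how such a file could be produced, for instance by the cron job, here is a hypothetical buildStaticFile function. Its name and the book data are made up for illustration; only the StaticJSONP.notify call and the uid format come from the example above:

```javascript
// hypothetical build step, e.g. run by the cron job:
// returns the content of the precompiled static file
function buildStaticFile(uid, data) {
  // JSON.stringify takes care of quoting and escaping
  return "StaticJSONP.notify(" +
    JSON.stringify(uid) + "," +
    JSON.stringify(data) +
  ");";
}

// example with made up book data
var content = buildStaticFile(
  "amazon_apiv2_info_book_102311",
  {title: "JSONP Limits", author: "WebReflection"}
);
// content is now:
// StaticJSONP.notify("amazon_apiv2_info_book_102311",{"title":"JSONP Limits","author":"WebReflection"});
```

The resulting string is what would be written into /book/102311.js each time the related data changes.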





The StaticJSONP Object

The main, and only, purpose of this tiny piece of script that almost fits in a tweet once minzipped (282 bytes) is to:


  • let any library, framework, custom code, be able to request a static file


  • avoid multiple scripts injection / concurrent JSONP for the same file if this has not been notified yet


  • notify any registered callback with the result
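
To give an idea of how little logic these three points require, here is a simplified sketch. It is not the published script: the name createStaticJSONP and the injectable load function are assumptions, made so that the logic can be followed and exercised without a browser. In a page, load would append a script element pointing to the uri.

```javascript
// simplified sketch, NOT the original script
// load(uri) is whatever injects the script for that uri
function createStaticJSONP(load) {
  var pending = {}; // uid => list of callbacks waiting for it
  return {
    request: function (uri, uid, callback) {
      if (typeof uid === "function") {
        // simplified signature: the url is the unique id
        callback = uid;
        uid = uri;
      }
      if (pending.hasOwnProperty(uid)) {
        // same file already requested: just wait for it
        pending[uid].push(callback);
      } else {
        pending[uid] = [callback];
        load(uri);
      }
    },
    notify: function (uid, data) {
      var list = pending.hasOwnProperty(uid) ? pending[uid] : [];
      // forget the uid so a later request loads the file again
      delete pending[uid];
      for (var i = 0; i < list.length; i++) {
        list[i](uid, data);
      }
    }
  };
}

// usage with a fake loader that only records requested files
var loaded = [], received = [];
var SJ = createStaticJSONP(function (uri) { loaded.push(uri); });

SJ.request("static/3.js", "static_service_3", function (uid, evt) {
  received.push("client1: " + evt.data);
});
SJ.request("static/3.js", "static_service_3", function (uid, evt) {
  received.push("client2: " + evt.data);
});

// what the static file would do once downloaded
SJ.notify("static_service_3", {data: 3});
```

After these calls the fake loader has been invoked only once, while both callbacks have been notified with the same data.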




Here is an example of a StaticJSONP interaction on the client side:



var

// just as example

result = [],



// library 1

client1 = function (uri, uid, delay) {

function exec() {

StaticJSONP.request(uri, uid, function (uid, evt) {

result.push("client1: " + evt.data);

});

}

delay ?

setTimeout(exec, delay) :

exec()

;

},



// library 2

client2 = function (uri, uid, delay) {

function exec() {

StaticJSONP.request(uri, uid, function (uid, evt) {

result.push("client2: " + evt.data);

});

}

delay ?

setTimeout(exec, delay) :

exec()

;

}

;

// library 1 does its business

client1("static/1.js", "static_service_1", 250);

// so does library 2

client2("static/2.js", "static_service_2", 250);



setTimeout(function () {

// suddenly both require the same service/file

client1("static/3.js", "static_service_3", 0);

client2("static/3.js", "static_service_3", 0);



setTimeout(function () {

alert(result.join("\n"));

}, 500);

}, 1000);



It is possible to test the live demo ... just wait a little bit and you will see this alert:



// order may differ depending on

// the website response time per file

client1: 1

client2: 2

client1: 3

client2: 3



If you monitor network traffic you will see that static/3.js is downloaded only once.

If the response is really big and the connection is not so good (3G or worse), it may happen that the same file is requested again while the first request has not finished yet.

Since the whole purpose of StaticJSONP is to simplify life on the server side, any redundant request is avoided on the client side.



The Unique ID ...

StaticJSONP can be easily integrated with a normal JSONP service.

As an example, if we need to obtain the list of best sellers, assuming this list is not static due to too frequent changes, we can do something like this:



// this code is for example purposes only

// it won't work anywhere



// JSONP callback to best sellers

JSONP("http://amazon/restful/books/bestSellers", function (evt) {

// the evt contains a data property

var data = evt.data;



// data is a list of books title and ids

for (var i = 0, li = []; i < data.length; i++) {

li[i] = '<a href="javascript:getBookInfo(' + data[i].id + ')">' + data[i].title + '</a>';

}



// show the content

document.body.innerHTML = '<ul><li>' + li.join('</li><li>') + '</li></ul>';





});



// the function to retrieve more info

function getBookInfo(book_id) {

StaticJSONP.request(



// the url to call

"http://amazon/restful/static/books/" + book_id + ".js",



// the unique id accordingly with the current RESTful API

"amazon_apiv2_info_book_" + book_id,



// the callback to execute once the server responds

function (uid, evt) {

// evt contains all book related data

// we can show it wherever we want

}

);

}



Now just imagine how many users in the world are performing similar requests right now to the same list of books, being best sellers ...



Unique ID Practices

It is really important to understand the reason StaticJSONP requires a unique id.

First of all it is neither possible nor convenient to "magically retrieve it from the url", because any RESTful API out there may have a "different shape".

The unique id is a sort of trusted, pre-agreed, and aligned piece of information the client side library must be aware of, since there is no way to change it on the server side, the file being created statically.

It is also important to prefix the id so that debugging will be easier on the client side.

However, the combination used to generate the unique id may already be ... well, unique, so it's up to us, on both client and server side, to define it in a consistent way.

The reason I did not use the whole uri + id info in the StaticJSONP request method is simple:

if both gadgets/102.js and books/102.js contain a unique 102 id, there is no way on the client side to understand which article has been requested, and both gadgets and books registered callbacks will be notified, one of the two surely with the wrong data.

It's really not complicated to namespace a unique id with a prefix, and this should be the way to go imho.
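
A tiny sketch of what such namespacing could look like, assuming client and server agree on the same convention. The createUid helper and the "gadget" category name are hypothetical; the prefix is the one used in the examples above:

```javascript
// hypothetical helper: the same convention must be
// used by whoever generates the static files
function createUid(prefix, category, id) {
  return [prefix, category, id].join("_");
}

// same numeric id, different articles, different unique ids
var bookUid = createUid("amazon_apiv2_info", "book", 102);
var gadgetUid = createUid("amazon_apiv2_info", "gadget", 102);
// "amazon_apiv2_info_book_102" and "amazon_apiv2_info_gadget_102"
```

With this convention the callbacks registered for the book can never be notified with the gadget data, even if both files end with 102.js.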



Conclusion

It's usually really difficult to agree unanimously on a solution for a specific problem, and I am not expecting that from tomorrow everyone will adopt this technique to speed up server side file serving over common "JSONP queries". Still, I hope you understood the reason this approach may be needed, and also how to properly implement a solution that does not cause client side conflicts, that scales, that does not increase the final application size in any relevant way, and that is ready to go for the day when, and if, you are going to need it. Enjoy
