Saturday, August 27, 2011

OS X Lion Automator And Mobile NOKIA Maps

When I read this article about Making Desktop Webapps in Lion, my first thought was "cool!", instantly followed by "what about an experiment with the Mobile NOKIA Maps WebApp?" ... and here I am :)



m.maps.nokia.com

is the beta project I am working on right now together with a bunch of HTML5 geeks :P in order to bring the mature NOKIA Maps experience to Android 2.2+, iOS4+, and other already supported or "coming soon" devices.

Optimized for mobile but still usable with a desktop Chrome or Safari browser, the web app is quite "cute" on an iPhone or other medium and small screens, and this experiment was about bringing the same "cuteness" to my Mac Mini as well: partially successful!



Lion Automator And WebPopup Limits

Unfortunately it is not possible to customize the popup browser user agent that much, and I am not even sure what kind of engine is used there ...

With the iPhone UserAgent the exposed version is 3 and WebKit 430+.

When it comes to the iPad UserAgent the version is 4, while with the Safari UA the version is the current one.

The GeoLocation API does not seem to work, and the cache seems to be cleared every time the app is closed.

Unfortunately these limits make the current beta less cool than usual, especially because every time the app is closed the storage seems to be reset, which means the last position is not shown next time, history and suggestions do not show up and, even more annoying, the "home to place" routing is not available due to the missing location.



Grab The Desktop App For OSX Lion

I have prepared everything you need to launch m.maps.nokia.com on your OSX Lion so you can give it a try.



Grab Mobile NOKIA Maps For Automator and, if necessary, extract its content, then click on the .app file.



I swear I did nothing different from what Andy Ihnatko described in his article, except changing the icon with the one automatically downloaded on the iPad if you pin the website to your home screen.



If you are on OSX Lion give it a try and play around, but bear in mind this beta offers much more on your smartphone ;)

Wednesday, August 24, 2011

Simulate Script Injection Via Data URI

Well, not only downloads on the fly: the data uri works for almost everything ( only the iOS 5 beta does not want to work with inline data uri AUDIO sources ... but this is another story ) ... so ...



How To Simulate Script Injection

Let's say you want a test but you don't want to bother a server; however, you want to be sure the test is asynchronous and that it simulates the server.



var
    head = document.getElementsByTagName("head")[0],
    script = document.createElement("script")
;
head.insertBefore(script, head.lastChild);
script.src = "data:text/javascript;base64," + btoa(
    "alert('Hello World')"
);





How To Simulate JSONP

Same trick, isn't it? ... except:



script.src = "data:text/javascript;base64," + btoa(
    "callback(" + JSON.stringify(dummyData) + ")"
);
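
For completeness, here is a self-contained version of the same trick, where callback and dummyData are, of course, placeholder names of mine:

// self-contained example: callback and dummyData are placeholders
function callback(data) {
    alert(data.result); // "pong"
}
var dummyData = {result: "pong"},
    script = document.createElement("script"),
    head = document.getElementsByTagName("head")[0];
script.src = "data:text/javascript;base64," + btoa(
    "callback(" + JSON.stringify(dummyData) + ")"
);
head.insertBefore(script, head.lastChild);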





How To Drop Server Requests

well, this is the tricky one ...
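
As a rough sketch of the idea ( names are mine, not the original snippet ): whatever createResponse() produces gets served through a data uri instead of hitting the network.

// hypothetical sketch: createResponse() fakes what the server
// would have returned for a given url
function createResponse(url) {
    return "alert('simulated response for " + url + "')";
}

// instead of pointing the script to the server,
// point it to a data uri built from the fake response
function dropServerRequest(url) {
    var head = document.getElementsByTagName("head")[0],
        script = document.createElement("script");
    script.src = "data:text/javascript;base64," + btoa(createResponse(url));
    head.insertBefore(script, head.lastChild);
}

dropServerRequest("/api/whatever");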



Surely there is some job to do in the createResponse() function but ... hey, we can stop bothering servers now ;)

Sunday, August 21, 2011

wru, unit tests have never been that easy

Do you remember my good old wru project? It has been refactored, readapted for both client and server side environments such as node.js and Rhino and, most important, it landed on github ;)



Please spend a few minutes reading the documentation and you'll realize why I chose this title for this post.
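
Roughly, from memory, this is how a wru test looks ( double check the documentation for the exact API ):

// a test is just an object, or a list of objects,
// with a name and a test function
wru.test([{
    name: "my first test",
    test: function () {
        wru.assert("it works", 1 + 1 === 2);
    }
}]);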



Have fun with JavaScript Unit Tests!

Saturday, August 20, 2011

Overloading the in operator

In all its "silliness", the CoffeeShit project gave me a hint about the possibilities of an overloaded in operator.



The Cross Language Ambiguity

In JavaScript, the in operator checks if a property is present, where the property is the name rather than its value.


"name" in {name:"WebReflection"}; // true



However, I bet at least once in our JS programming life we have done something like this, expecting true rather than false.


4 in [3, 4, 5]; // false

// Array [3, 4, 5] has no *index* 4





The Python Way

In Python, for example, the last operation is perfectly valid!


"b" in ("a", "b", "c") #True

4 in [3, 4, 5] # True

The behavior of the Python in operator is indeed friendlier in certain situations.



value in VS value.in()

What if we pollute the Object.prototype with an in method that is not enumerable and sealed? No for/in loop problems, nor cross browser issues: if it's possible in our target environment we do it, otherwise we don't ... a fair compromise?





Rules

  • if the target object is an Array, check if value is contained in the Array ( equivalent of -1 < target.indexOf(value) )


  • if the target object is an Object, check if value is contained in one of its properties ( equivalent of for/in { if(object[key] === value) return true; } ).

    I don't think nested objects should be checked as well, and right now they are not ( same as native Array#indexOf: if it's an array of arrays, internal array values are ignored, and that's how it should be for consistency reasons )


  • if the target object is a typeof "string", check if value is a subset of the target ( equivalent of -1 < target.indexOf(value) ).

    The Python logic on empty string is preserved since in JavaScript "whateverStringEvenEmpty".indexOf("") is always 0


  • if the target object is a typeof "number", check if value is a divisor of the target ( for example, (3).in(15) === true since 15 can be divided by 3 )
I didn't come up with other "sugarish cases" but feel free to propose some; a possible implementation following these rules is sketched below.
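
Here is a hedged sketch of how such a method could look; names and details are mine, not the original source, and an ES5 capable environment is assumed:

// a sketch only: non enumerable "in" method following the rules above
Object.defineProperty(Object.prototype, "in", {
    enumerable: false,
    configurable: false,
    writable: false,
    value: function (target) {
        var value = this.valueOf();
        if (target instanceof Array || typeof target == "string") {
            // arrays: contained value; strings: substring
            return -1 < target.indexOf(value);
        }
        if (typeof target == "number") {
            // numbers: value is a divisor of the target
            return target % value == 0;
        }
        // generic objects: contained in one of the properties
        for (var key in target) {
            if (target[key] === value) return true;
        }
        return false;
    }
});

// usage, following the rules
alert((4).in([3, 4, 5])); // true
alert("b".in("abc"));     // true
alert((3).in(15));        // true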



Compatibility

The "curious fact" is that in ES5 there is no restrictions on an IdentifierName.

This is indeed different from an Identifier, where in latter ReservedWord is not allowed.

"... bla, bla bla ..." ... right, the human friendly version of what I've just said is that obj.in, obj.class, obj.for etc etc are all accepted identifiers names.

Accordingly, to understand if the current browser is ES5 specs compliant we could do something like this:


var ES5;
try {
    ES5 = !!Function("[].try");
} catch (o_O) {
    ES5 = false;
}

alert(ES5); // true or false

Back on topic: IE9 and updated Chrome, Firefox, WebKit, or Safari are all compatible with this syntax.



New Possibilities

Do you like Ruby syntax?


// Ruby like instances creation (safe version)
Function.prototype.new = function (
    anonymous, // recycled function
    instance,  // created instance
    result     // did you know that if you use "new function"
               // and "function" returns an object,
               // the created instance is lost
               // and RAM/CPU polluted for no reason?
               // don't "new" if not necessary!
) {
    return function factory() {
        // assign prototype
        anonymous.prototype = this.prototype;
        // create the instance inheriting the prototype
        instance = new anonymous;
        // call the constructor
        result = this.apply(instance, arguments);
        // if the constructor returned an object, return it,
        // or return instance if result is null;
        // return instance in all other cases
        return typeof result == "object" ? result || instance : instance;
    };
}(function(){});



// example
function Person(name) {
    this.name = name;
}

var me = Person.new("WebReflection");
alert([
    me instanceof Person,       // true
    me.name === "WebReflection" // true
].join("\n"));



Details about bad usage of new apart, this is actually how I would use this method to boost up performance.



// Ruby like instances creation (fast version)
Function.prototype.new = function (anonymous, instance) {
    return function factory() {
        anonymous.prototype = this.prototype;
        instance = new anonymous;
        this.apply(instance, arguments);
        return instance;
    };
}(function(){});



cool?



Do Not Pollute Native Prototypes

It does not matter how cool it is and "how much sense it makes": it is always considered bad practice to pollute global, native constructor prototypes.

However, since in ES5 we have new possibilities, I would say that if everybody agrees on some specific case ... why not?

for/in loops can now be safe and some ReservedWord can be the most semantic name to represent a procedure, as demonstrated in this post.

Another quick example?


// after Function.prototype.new
// after the Person function declaration
Object.defineProperty(Object.prototype, "class", {
    enumerable: !1,
    configurable: !1,
    get: function () {
        return this.__proto__.constructor;
    }
});



var me = Person.new("WebReflection");

// create an instance out of an instance
var you = me.class.new("Developer");

alert([
    you instanceof Person,   // true
    you.name === "Developer" // true
].join("\n"));



I can already see a Prototype3000 framework coming out with all these magic tricks in place :D

Have fun with ES5 ;)

Friday, August 19, 2011

CoffeeShit, The CoffeeScript Parody



    ... because script happens!




This silly project does not mean to offend the CoffeeScript creator nor any CoffeeScript user, hoping both have a sense of humor :D



About

Every cult movie has one or more parodies ... well, every cult technology as well (or at least it should)!

This idiotic project is on github with a better explanation, a suite of unit tests, and both the source code and the partially hand-minified one ( manually because I discovered a nice bug with closure compiler and eval there ... )



About The Name

You can check the library by yourself and realize I have used all the worst possible practices in order to simulate a different syntax within the one allowed by JavaScript.

Such a pile of shit, written to mimic an excellent project as CoffeeScript is, could not have a better name, imo.

If you feel insulted by this idiotic experiment please let me know and I'll do my best to let people try it under a different name.



Example



// CoffeeScript
square = (x) -> x * x
fill = (container, liquid = "coffee") ->
    "Filling the #{container} with #{liquid}..."
mood = greatlyImproved if singing
eat food for food in ['toast', 'cheese', 'wine']
winner = yes if pick in [47, 92, 13]
speed ?= 75

// CoffeeShit
square = 'x'['->']('x * x')
fill = ['container', 'liquid = "coffee"']['->'](
    'Filling the #{container} with #{liquid}...'
)
mood = greatlyImproved.if(singing)
'eat(food)'.for('food').in(['toast', 'cheese', 'wine'])
winner = yes.if(pick.in([47, 92, 13]))
'speed'['?='](75)



Many more examples are in the landing page, and even more inside the unit tests file.



As Summary

I really could not resist :D

This post's aim is to let you write some comments; have fun with CoffeeShit!

Thursday, August 18, 2011

HTML5: How To Create Downloads On The Fly

this is a quick one I have already implemented in fuckn.es in the "create angry memory" button logic ...



The New Download Attribute

Hopefully soon, most updated browsers will implement the download attribute in hypertext links (aka the <a> tag).

The quick summary is this one:

The download attribute, if present, indicates that the author intends the hyperlink to be used for downloading a resource. The attribute may have a value; the value, if any, specifies the default filename that the author recommends for use in labeling the resource in a local file system.


And this is a basic example:



click here to
<a
    href="resource234.txt"
    download="license.txt"
>
    download the license
</a>





What Is Download For

Well, I am pretty sure you have read at least once in your life this kind of extra info beside a link:

right click to download the content and "Save As ..."

Moreover, I am pretty sure you have created, at least once, a server page able to force a generic file download.

All these instructions and server side headers/files may disappear thanks to this new attribute, also because there's no such thing as a "right button" on touch screens, nor on some newer device pointers.

If the file is meant to be downloaded, it will be ... cool?



Create Downloads On The Fly Via JavaScript

When I read about it, I instantly realized the potential of this attribute combined with the inline data uri scheme.
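
For instance, nothing stops us from serving a whole file from the href itself; a minimal, hypothetical example ( "SGVsbG8gV29ybGQ=" is simply btoa("Hello World") ):

<a
    download="hello.txt"
    href="data:text/plain;base64,SGVsbG8gV29ybGQ="
>
    download hello.txt
</a>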



Only HTML5 ?

Nope! Please note that many browsers already let us practice this technique. Some may open the file in a new blank page, while some others may directly download the file, as Chrome does with the CSV example.

As graceful degradation, the "right click" procedure will still do the trick.



Download Canvas As Image Example

First example is a classic one: how to save a canvas snapshot as image via "click".



// basic example
function createDownloadLink(canvas, name) {
    var a = document.createElement("a");
    a.download = name;
    a.title = "download snapshot";
    a.href = canvas.toDataURL();
    return a;
}

// some paragraph in the page
document.querySelector(
    "p.snapshot"
).appendChild(createDownloadLink(
    document.querySelector("#game"),
    "snapshot" + (-new Date) + ".png"
));



When the user taps/clicks on the link, the browser will simply start the download. No server side involved at all!



Save A Page As PDF

Thanks to this technique we may use the same trick to produce a PDF file out of any web page.



// basic example: pdf being whatever PDF generator
// exposes this kind of API
function createPDFLink(fileName) {
    var doc = new pdf();
    // whatever content you want to download
    var a = document.createElement("a");
    a.download = fileName;
    a.title = "download as PDF";
    a.href = doc.output('datauri', {"fileName": fileName});
    return a;
}

// some paragraph in the page
document.querySelector(
    "p.saveaspdf"
).appendChild(createPDFLink(
    "document-" + document.title + ".pdf"
));



Of course, if the page content changes, we can replace the old link with a freshly created one.



Save Table As CSV

Well, another classic here: the csv format out of a table. This is a basic but working example ;)



<script>
// really basic example
function tableToCSV(table) {
    for (var
        header = table.querySelectorAll("tr th"),
        rows = table.querySelectorAll("tr td"),
        hlength = header.length,
        length = hlength + rows.length,
        result = Array(hlength),
        i = hlength,
        j;
        i < length; ++i
    ) {
        j = i % hlength;
        j || result.push("\n");
        // rows are offset by the header length
        result.push(rows[i - hlength].innerHTML);
        ++j % hlength && result.push(",");
    }
    i = 0;
    while (i < hlength) {
        result[i] = header[i].innerHTML + (
            ++i < hlength ? "," : ""
        );
    }
    return result.join("");
}
this.onload = function () {
    var a = document.body.appendChild(
        document.createElement("a")
    );
    a.download = "table.csv";
    a.href = "data:text/csv;base64," + btoa(
        tableToCSV(document.querySelector("table"))
    );
    a.innerHTML = "download csv";
};
</script>
<table>
    <tr>
        <th>name</th>
        <th>age</th>
    </tr>
    <tr>
        <td>Dan</td>
        <td>33</td>
    </tr>
    <tr>
        <td>John</td>
        <td>32</td>
    </tr>
</table>



Most likely we can already test the above example, even if the file name probably won't be the chosen one.

For safe base64 encoding, compatible with UTF-8 pages, have a look at this script ( base64.encode() and base64.decode() ).





Compatibility

Different developers have already asked about compatibility.

As I have said before, we need to differentiate between "download" attribute compatibility AND inline data uri link compatibility.

In the first case I don't know of any browser that forces the download with the specified name yet, but I'll update this section as soon as I know of one.

In the latter case, IE9, Chrome, Firefox, Safari, WebKit based, and Opera seem to be already compatible.

The main problem/limit I have spotted in fuckn.es is the size of the data uri: in certain cases we may need a decent/fast machine, otherwise we may end up killing RAM and CPU.

IE8 is compatible as well, except IE8 has a limited data uri size for CSS images, for example, and I expect the same limit for this technique.

Bear in mind that when all browsers are compatible we will still have the data stream limit problem, or better: really big files have to be created on the fly in one shot, with no "download progress" possibility.





As Summary

... now you know ... ;)

Wednesday, August 17, 2011

JSONH And Hybrid JS Objects

I have already described JSONH and now I also have proof that it's as safe as native JSON, but on average 2X faster than native JSON operations, with small (10 objects), medium (100 objects), and massive (5000 objects, not a real world case, just a stress test to see how well JSONH scales) homogeneous collections.

Wherever it's not faster it's just "as fast", but the best part is that it seems to be always faster on slower ( mobile ) machines.

Moreover, the 5000 objects stress example shows that JSONH.stringify() produces a string with 54% of the original JSON.stringify() size, so here is the summary: JSONH is faster at both compression and decompression, plus it produces smaller output.





Yeah But ... What About Hybrid Objects?

To start with, if you don't recognize/understand what a homogeneous collection is and ask me "what about nested objects?", all I can do is point out that Peter Michaux explained this years before me.

Have a look there and please come back after the "aaaaahh, got it: right!"



Hybrid Objects

Nowadays JSON is used everywhere, but not everywhere with homogeneous collections. A simple example that screws up the JSONH possibility is an object like this:


// result of a RESTful service, Ajax, query
// once again about generic articles: books!
var result = {
    category: "books",
    subcategory: "fantasy",
    description: [
        {
            title: "The Lord Of The Rings",
            description: "Learn about the darkness"
        }, {
            title: "The Holy Bible",
            description: "Learn about both light and darkness"
        }
        // all other results out of this list
    ]
};



If we receive an object where one or more properties contain a homogeneous collection, as description does in the above example, we can already take advantage of JSONH.



JSONH On Hybrid Objects

It's that easy!


// before we send/store/write data on output
result.description = JSONH.pack(result.description);
print(JSON.stringify(result));



If the client is aware that one or more specific properties are homogeneous collections, to obtain the original object we can do this:


// stringifiedResult as XHR responseText
var obj = JSON.parse(stringifiedResult);
obj.description = JSONH.unpack(obj.description);

// or simply via JSONP callback
data.description = JSONH.unpack(data.description);



For the same reason JSONH is faster than JSON, this operation will grant us less bandwidth to both send and receive objects, and faster conversion performance.



As Summary

I am willing to think soon about a possible schema able to describe the homogeneous collection properties of an object ... a sort of JSONH "mapper" to automate the procedure on both server side and client side, and any suggestion will be more than welcome; a rough sketch of what I have in mind is below.
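
Nothing exists yet, but just to give an idea, such a mapper could be as simple as a list of property names shared between server and client ( everything below is an assumption, not an existing JSONH API ):

// hypothetical mapper: pack/unpack every listed property
function packWith(schema, obj) {
    for (var i = 0; i < schema.length; i++) {
        obj[schema[i]] = JSONH.pack(obj[schema[i]]);
    }
    return obj;
}
function unpackWith(schema, obj) {
    for (var i = 0; i < schema.length; i++) {
        obj[schema[i]] = JSONH.unpack(obj[schema[i]]);
    }
    return obj;
}

// server side
print(JSON.stringify(packWith(["description"], result)));

// client side
unpackWith(["description"], JSON.parse(stringifiedResult));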

At least so far we know already how to adopt this solution :)

Tuesday, August 16, 2011

Last Version Of JSON Hpack

Update

Created a github repository with (currently) JavaScript, PHP5 and Python versions.



Update: after a quick chat on twitter with @devongovett, who pointed out there is a similar standard called JSONDB, I have created a JSONH(Flat) version. It looks slightly faster on mobile so I may opt for this one rather than the Array of keys at index 0.

The whole array is flat and it changes from [{a:"A"},{a:"B"}] to [1,"a","A","B"] where the empty collection would be [0] rather than [[]].



Also more details here on how to JSONH Hybrid JS Objects.






A while ago I proposed a homogeneous collections optimizer nicknamed JSON.hpack.



What I wasn't expecting is that different projects and developers actually adopted this technique to shrink down JSON size.



Basic Advantage Of JSON.hpack

Gzip and deflate work really well with repeated chunks of strings, and this is why homogeneous collections have a really good compression ratio there.

However, gzip and deflate compression does not come for free!

If we compress everything on the server side we can easily measure the CPU overhead compared with uncompressed data.

Via JSON.hpack we can still serve small or huge amounts of dynamic and static data without necessarily using realtime compression.



Basic Notions On Compressors

There is no ideal compression algorithm yet; each of them has pros and cons.

A really good compression ratio may cost a lot, and an algorithm is efficient if at least decompression is fast. 7-Zip is one example: it takes longer than normal zip to create a single file, but the final ratio is usually much better and decompression is extremely fast.

An incremental compressor such as GIF is both fast to encode and fast to decode. However, its average compression ratio is really poor compared with PNG, which again is not as fast as GIF to encode, but almost as fast as GIF to decode and capable of bringing much more complex data inside.

On the client side we may like a truly fast compressor in order to send data to the server, where more horsepower can decompress it in a reasonable time. Still, servers do not have unlimited resources.



My Latest Improvements Over JSON.hpack



On the web it's all about network layer latency, completely unpredictable, especially in these smartphone/pad days.

We also need to consider high traffic, if things go really well, and, most important, mobile platform computation power: basically the equivalent of a Pentium 3 with a GeForce card from 2001.



Which Is The Best Compromise

The original version of JSON.hpack is able to understand which compression level is the best one for the current collection of objects. Unfortunately this is slow on the server side and even more so on the client side.

In my opinion an intermediate layer such as JSON.hpack should bring advantages as fast as possible on both client and server.

I probably failed the first time because I was more focused on coolness rather than efficiency.

For example, if it takes 3X the CPU load to save 5% of the bytes compared with the most basic compression ratio, something is wrong, because it's simply not worth it.

In summary, the best compromise for the latest version of this compressor is to be freaking fast, with small overhead, while providing a good average compression ratio.



Welcome JSONH

In a single call this object is able to pack and unpack homogeneous collections faster than native JSON, especially on mobile platforms.



How Is It Possible

To be honest I have no idea and I was surprised as well. All I can think of is the fact that JSONH makes data flat, which means no recursion per object in the original list.

This seems to boost performance while packing and makes JSON.parse's life easier while unpacking.

The extreme simplification of the algorithm may have helped a lot as well.
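
To give an idea of how little is going on, here is a minimal sketch of the technique, assuming the layout with the list of keys at index 0 ( not the actual source code ):

// a couple of loops is really all it takes
function pack(list) {
    var keys = Object.keys(list[0]), result = [keys];
    for (var i = 0; i < list.length; i++) {
        for (var j = 0; j < keys.length; j++) {
            result.push(list[i][keys[j]]);
        }
    }
    return result;
}

function unpack(packed) {
    var keys = packed[0], klen = keys.length, result = [];
    for (var i = 1; i < packed.length; i += klen) {
        for (var j = 0, obj = {}; j < klen; j++) {
            obj[keys[j]] = packed[i + j];
        }
        result.push(obj);
    }
    return result;
}

pack([{a:"A"},{a:"B"}]); // [["a"],"A","B"]
unpack([["a"],"A","B"]); // [{a:"A"},{a:"B"}]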



JSONH Source Code

now on github!

I had no time yet to create the equivalent C#, PHP, and Python versions.

In any case you can see how simple the logic is, and I bet anybody can easily reproduce that couple of loops in whatever programming language.

The minzipped size is 323 bytes, but the advantages over network calls can be massive. For example, if we check the console and the converted size in the test page, we can see the JSONH version of the same collection is 54% smaller ... and with a faster stringify and parse? ... it cannot be that good, can it? :)



JSONH Is Suitable For



  • any RESTful API that returns homogeneous collections


  • any case where gzip on the fly costs too much due to high traffic


  • map applications and routes, [{"latitude":1.23,"longitude":5.67},{"latitude":2.23,"longitude":6.67}]

    will be [["latitude","longitude"],1.23,5.67,2.23,6.67]


  • any other case I am not thinking about right now






As Summary

It is good to take old projects created a while ago and think about what could be done better today. It's about re-thinking with different skills and experience over real world cases. I am not sure I made everybody happy with this latest version, but I am pretty sure I won't ask the client or server side to be slower than native JSON + native gzip compression, since at that point all advantages would simply be lost.

This revisited version of JSONH is surprisingly faster, smaller, and easier to implement/maintain than the precedent one, so ... enjoy it if you need it ;)

Sunday, August 14, 2011

Once Again On Script Loaders

It's a long story I would like to summarize in a few concrete points ...



Three Ways To Include A Script In Your Page

First of all, you may not need loaders at all.

Most likely what you need is an easy-to-go and cross platform build process, and JSBuilder is only one of many.



The Most Common Practice

This way lets users download and visualize content first, but it also lets developers start the JS logic ASAP without the mandatory need to set a DOMContentLoaded or onload listener.



<!doctype html>
<head>
    <!-- head content here -->
</head>
<body>
    <!-- body content here -->
    <script src="app.minzipped.js">/* app here */</script>
</body>
</html>





The "May Be Better" Practice

I keep saying that a web application that does not work without JavaScript should never be shown to the user in that state. For example, if a form won't submit without JavaScript, what's the point of showing it before it can possibly work? If your page strongly depends on JavaScript, don't be afraid to let the user wait slightly longer before the layout is visualized. The alternative is a broken experience as a welcome. Accordingly, use the DOMContentLoaded listener over this ordered layout:



<!doctype html>
<head>
    <!-- head content here -->
    <script src="app.minzipped.js">/* app here */</script>
</head>
<body>
    <!-- body content here -->
</body>
</html>



If you don't trust the DOMContentLoaded listener you can combine both layouts:



<!doctype html>
<head>
    <!-- head content here -->
    <script src="app.minzipped.js">/* app here */</script>
</head>
<body>
    <!-- body content here -->
    <script>initApp();</script>
</body>
</html>





The Optional "defer" Attribute

We can eventually try to avoid the blocking problem using the defer attribute. However, this attribute is not yet widely supported cross browser and the result may be unexpected.

Since this attribute basically tells the browser not to block downloads, in the very near future it could be specified either on a head script or before the end of the body.

Everything I have said about possible broken UX is still valid so ... use it carefully.
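
For reference, the attribute is as simple as this ( the script downloads in parallel and executes only after the document has been parsed ):

<script defer src="app.minzipped.js"></script>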



The Loading Practice

A classic example is twitter on mobile browsers, as well as any native application with a loading bootstrap screen. Flash based websites have also used this technique for ages and users are used to it.

If the amount of JavaScript plus CSS and assets is massive, both preceding techniques will fail.

The first one will fail because the user doesn't know when the script will be loaded, plus it's blocking, so the page won't respond. Bye bye user.

The second approach will result in too long a waiting time over a blank page ... bye bye user.

This loading approach will entertain the user for a little while: it is lightweight, fast to visualize, and it can hold the user for up to "5 seconds" with a cleared cache ( hopefully much less next time with cache; if it takes more, we should really think about splitting the logic and lazy loading as much as possible ).



<!doctype html>
<head>
    <!-- head content here -->
    <!-- most basic CSS -->
</head>
<body>
    <!-- most basic content -->
    <!-- "animated gif" or loader spin -->
    <script src="bigstuff.minzipped.js">/* code */</script>
    <!-- optional BIG CSS -->
</body>
</html>





This page should be as attractive as possible and no interaction that depends on JavaScript should be shown.



Why Scripts Loaders

Because an articulated website may have articulated logic split across different files.

The main page may rely on jQuery, commonLogic, mainPageLogic.

Any sub section in the site may depend on jQuery, commonLogic, subSectionLogic, adHocSectionLogic, etc.

The build approach will fail big time here because every page will contain a different script to download in all its variants.

Moreover, thanks to CDNs some libraries can be cached cross domain, for example jQuery from a common CDN.

In this scenario a script loader is the best solution:



$LAB
    .script("http://commoncdn.com/jquery")
    .script("commonLogic.js")
    .wait()
    .script("subSectionLogic.js")
    .wait()
    .script("adHocSectionLogic.js")
    .wait(function () {
        // eventually ready to go
        // in this section
    })
;



The above example is based on LAB.js, a widely adopted library I have actually indirectly contributed to as well, solving one conflict with the jQuery.ready() method.



script() and wait()

LAB.js has been created with performance in mind: every script will be pre-downloaded as soon as it's defined in the chained logic.

The wait() method is a sort of "JS interpretation break point" and it's really useful when a script depends on another script.

Let's say commonLogic is just a set of functions while subSectionLogic starts with a jQuery.ready(function () { ... }) call: LAB.js will ensure that the latter script won't be executed until jQuery is ready ... got it?

The LAB.js size once minzipped is about 2.1Kb, and the best way to use it is to include LAB.js as the very first script in the page.

AFAIK LAB.js is not yet hosted in any major CDN but I do believe that will happen soon.



Preload Compatibility

LAB.js uses different techniques to ensure both pre-downloads and the wait() behavior. Unfortunately some of the adopted fallbacks look inevitably weak to me.

For example, I am not a big fan of "empty setTimeout" solutions, since these are used as workarounds over unpredictable behaviors.

One of these behaviors is the script readyState property: on the "complete" state the script may or may not have been interpreted already when the "onreadystatechange" notification fires.

If we have a really low powered machine, as my netbook is, the timeout used to decide that the script has already been parsed may not be enough.

I don't want to bother you with details, I simply would like you to understand why I came up with an alternative loader.

Before I reach that point I wanna show an alternative technique to get rid of wait() calls.



Update

It looks like a few setTimeout calls will be removed soon; plus, apparently, the setTimeout I pointed out has nothing to do with wait: my bad.

In any case I don't fancy empty timers; plus, the LAB.js logic is focused on cross browser parallel pre-downloads and for this reason it is a bit bigger in size than all I needed for my purpose.



Avoiding wait() Calls

JavaScript lets us successfully download and parse scripts like this without problems:



function initApplication() {
    jQuery.ready(function () {
        // whatever we need to do
    });
}



Please note that no error will be thrown even if jQuery has not been loaded yet.

The only way to have an error is to invoke the function initApplication() without the jQuery library in the global scope.

In a few words, we are not in the Java or C# world, where the compiler will complain if some namespace accessed in a defined method is not present/included as a dependency ... we are in JavaScript, much cooler, isn't it? ;)

Accordingly, if the current page initialization is wrapped in a single function we could simply use a single wait call at the end.



$LAB // no direct jQuery calls in the global scope
    .script("http://commoncdn.com/jquery")
    .script("commonLogic.js")
    .script("subSectionLogic.js")
    .script("adHocSectionLogic.js")
    .wait(function () {
        initApplication();
    })
;



The potential wait() problem I am worried about is still there, but at least in a single exit point rather than distributed through the whole loading process ... still bear with me please.



The Namespace Problem



The generic init function can be part of a namespace as well. If we have namespaces the problem is different 'cause we cannot assign my.namespace.logic.init = function () {} before the my.namespace object has been defined.

In this case we either create a global function for each namespace initialization/assignment, or we impose a wait() call between every included namespace based file. A sketch of the first option follows.
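
Everything below, file and function names included, is made up just to show the shape:

// file: my.namespace.js
function initMyNamespace() {
    window.my = {namespace: {logic: {}}};
}

// file: my.namespace.logic.js
function initMyNamespaceLogic() {
    my.namespace.logic.init = function () {
        // bootstrap this section
    };
}

// single exit point
$LAB
    .script("my.namespace.js")
    .script("my.namespace.logic.js")
    .wait(function () {
        initMyNamespace();
        initMyNamespaceLogic();
        my.namespace.logic.init();
    });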



yal.js - Yet Another ( JavaScript ) Loader



Update

yal.js now on github




As written in the yal.js landing page, I have been dealing with JS loaders for a while.

This library created a sort of "little twitter war" between me and @getify, where Kyle's main arguments were "why another loader?" followed by "LAB.js has better performance".



Why yal.js

It's really a tiny script that took me 1 hour, tests included, plus 20 minutes of refactoring in order to implement a sort of "forced preload" alternative ( which kinda works but I personally don't like, and neither does Kyle ).

yal.js is just an alternative to LAB.js and we all like alternatives, don't we?

The main focus of yal.js is being as small and as cross browser as possible using KISS and YAGNI principles.



No Empty Timers Essential Script Logic

yal.js is based on the script "onload" event, whose behavior is already defined as standard and widely compatible.

If not usable in some older browser, the more reliable "loaded" state of the readyState property is used instead. This state always comes after the "loading" or "complete" one.

I could not trigger any crash or problem with this approach and, together with the next point, there is no need to use unpredictable timers.



Simplified Wait Logic

In the basic version of the script any wait() call will block other scripts: these won't be pre-downloaded until the previous call has completed.

However, if we consider we may not even need wait calls:



yal // no direct jQuery calls in the global scope
    .script("http://commoncdn.com/jquery")
    .script("commonLogic.js")
    .script("subSectionLogic.js")
    .script("adHocSectionLogic.js")
    .wait(function () {
        initApplication();
    })
;



yal will perform parallel downloads the same way LAB.js does and, yal being just 1.5Kb smaller, performance will be slightly better with yal than with LAB.js.

Also, given my bad experience with the "complete" state, I feel a bit more secure knowing that when wait() is invoked in yal.js, everything before it has surely been interpreted and executed already ( but please prove me wrong, if you want, with a concrete example I can test online, thanks ).



Just What I Need

For my random and sporadic personal projects yal.js fits all my requirements. I do not use the forced parallel downloads and I don't care. I asked Kyle about grabbing a subset of LAB.js, getting rid of all the extra features surely useful for all possible cases out there but totally unnecessary for mine. Unfortunately that would not have happened any time soon, so I created the simplest solution for all I personally needed.



As Summary



I am actually sorry Kyle took my little loader as a "nonsense waste of time"; if that's what you think as well, or if you need much more from a loader, feel free to ignore it and go happily with LAB.js.

Also, I am not excluding that the day LAB.js lands on a major CDN I will start using it, since at that point there won't be any overhead at all, plus cross domain caching.



Finally, in this post I have tried to summarize different techniques and approaches to solve a very common problem in this RIA era; I hope you appreciated it.

Saturday, August 13, 2011

How To JSONP A Static File

Update: I have discussed this object separately and I agree that the url could be used as the unique id as well.

In this case the server should use the static url as unique id:



StaticJSONP.notify("http://cdn.com/static/article/id.js",{..data..});



So that on the client side we can use the simplified signature:



StaticJSONP.request(
    "http://cdn.com/static/article/id.js",
    function (uid, data) {
    }
);



The callback will receive the uid in any case so that we can create a single callback and handle behaviors accordingly.

The script has been updated in order to accept 2 arguments but, if necessary, the explicit unique id is still supported.






Under the list of "incomplete and never posted stuff" I found this article, which has eventually been reviewed.

I know it's "not that compact", but I really would like you to follow the reason I thought about a solution to a not so common, but quite nasty, problem.



Back in 2001, my early attempts to include callbacks remotely were based on server side runtime compilation of some JavaScript data passed through a single function.



<?php // demo purpose only code

// do something meaningful with server data

// create the output data at runtime
$pairs = array();
foreach ($data as $key => $value) {
    $pairs[] = $key.':"'.$value.'"';
}
$output = '{'.implode(',', $pairs).'}';

echo 'jsCallback('.$output.')';

?>



The above technique became deprecated a few years ago thanks to the widely adopted JSON protocol and its hundreds of native/coded implementations across programming languages.

Moreover, the above technique became the wrong way to do it thanks to a definitely better solution, as JSONP has been since the very beginning.

Here is an example of what JSONP services do today:



<?php // still demo purpose only code

echo $_GET['callback'].'('.json_encode($data).')';

?>





JSONP Advantages

The callback parameter is defined on the client side, which means it can be "namespaced" or it can be unique per JSONP request.

If we consider the first example, every script in the page would have to rely on a single global jsCallback function.

At that time I was using my code and my code only, so problems like conflicts, or the possibility that another library would have defined a different jsCallback in the global scope, were non-existent.

Today I still use "my code and my code only" :D when it comes to my personal projects, but at least I am more aware than ever of the multiple library conflicts the primordial technique may cause, even if all these libraries are my own.



JSONP Disadvantages

Well, the same reason that makes JSONP a powerful and more suitable technique is the one that could make JSONP the wrong solution.

If we still consider the first code example, nobody could stop me from being "really smart" and precompiling that file into a static one.



// static_service.js by cronjob 2011-08-14T10:00:00.000Z
jsCallback({category:'post',author:'WebReflection',title:'JSONP Limits'});



While precompiled static content may or may not be what we need for our application/service, it is clear that if no server side language is involved the common JSONP approach will fail, due to the limitations of "the single exit point" every callback in the main page depends on: the jsCallback function.



Advantages Of Precompiled Static Files

The fastest way to serve a file from a generic domain is a static one.

A static file can be cached either in disk memory, rather than sought and retrieved each time, or directly in server RAM.

Also, a static file does not require any programming language at all: the only code that will be executed will eventually be the one in charge of serving the file over the network, aka the HTTP Server.

The most common real world example of static files is a generic CDN, where the purpose is indeed to support as many requests per second as possible and where static files are most likely the solution.

The only extra code that would eventually be involved is the one in charge of statistics on the HTTP Server layer, but every file can easily be mirrored or stored in any sort of RAID configuration and served as fast as possible.



Another real world example could be a system like blogger.com where pages do not necessarily need to be served dynamically.

Most of the content in any blog system can be precompiled at runtime, and many services/blog applications are indeed doing it.

The same goes for any other application/service that does not require realtime data computation, where different cron jobs behind the scenes are in charge of refreshing the content every N minutes or more.

If we think about any big traffic website we could do this basic analysis:



# really poor/basic web server performance analysis

# cost of realtime computation
1% of average CPU + RAM + DISK ACCESS per user
# performance
MAX_USERS = 100;
AVERAGE_MAX_USERS = 100;

# cost of a threaded cron job
20% of average CPU + RAM + DISK ACCESS per iteration
# cost of static file serving
0.1% of CPU + RAM + DISK ACCESS per user
# performance
MAX_USERS_NOCRON = 1000;
MAX_USERS_WHILECRON = 800; # MAX_USERS_NOCRON - 20%
AVERAGE_MAX_USERS = 900;





If we consider that we may choose to delegate the cron job to a separate server behind the intranet, so that the only operation per changed static file will be a LOCK FILE $f EXCLUSIVE, WRITE NEW CONTENT INTO $f, UNLOCK FILE $f EXCLUSIVE, and basically only the DISK ACCESS will be involved, we can even increase AVERAGE_MAX_USERS to 950 or more.

I know this is a sort of off topic, virtual/conceptual analysis, but please bear with me, I will bring you there soon.



Static Content And RESTful APIs

There is a huge number of services out there based on JSONP. Many of them require realtime data but many probably do not. Especially in the latter case, I bet nobody is implementing the technique I am going to describe.

A Real World Example

Let's imagine I work for Amazon and I am in charge of the RESTful API able to provide any sort of article related data.

If we think about it, a generic online shopping cart article is nothing more than a group of static info that will rarely change much during the day, the week, the month, or even the year.

Do online users really need to be notified in realtime, per each request, about the current user rating, reviews, related content, article description, author, and any sort of "doesn't change so frequently" info related to the article itself? NO.

The only field that should be as up to date as possible is the price but still, does the price change so frequently during the lifecycle of an Amazon article? NO.

Can my infrastructure be so smart that if, and only if, a single field of this article changes, the related static file is updated so that everybody will instantly receive the new info? YES.

... but how can we do that if JSONP does not scale with static files?



My StaticJSONP Proposal

The only difference from a normal JSONP request is that, passing through the callback call, any sort of library should be able to be notified.

Since the client side library is in charge of creating the requested url, and since the same library knows what is going to be received before it asks for it, all this library needs is to be synchronized with the unique id the static server file will invoke. I am going to tell you more but, as a quick preview, this is how the static server file will look:



StaticJSONP.notify("unique_request_id", {the:response_data});





Server Side Structure Example

Let's say we would like to keep the folder structure as clear as possible. In this Amazon example we can think about splitting articles by categories.



# / as web server root

/book/102304.js # the book id
/book/102311.js
/book/102319.js

/gadgets/1456.js
/gadgets/4567.js

A well organized folder structure will result in both better readability for humans and easier access for most common filesystems.

Every precompiled file on the list will contain a call to the global StaticJSONP object, e.g.

// book id 102311
StaticJSONP.notify("amazon_apiv2_info_book_102311",{...data...});





The StaticJSONP Object

The main, and only, purpose of this tiny piece of script, which almost fits in a tweet once minzipped (282 bytes), is to ( a rough sketch follows the list ):


  • let any library, framework, custom code, be able to request a static file


  • avoid multiple scripts injection / concurrent JSONP for the same file if this has not been notified yet


  • notify any registered callback with the result
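
Just to make the behavior clear, a rough sketch of such an object could look like this ( the real implementation differs, this is only the concept ):

var StaticJSONP = (function () {
    var pending = {}; // uid => list of waiting callbacks
    return {
        request: function (uri, uid, callback) {
            if (pending[uid]) {
                // same file already in flight: just queue the callback
                pending[uid].push(callback);
            } else {
                pending[uid] = [callback];
                var script = document.createElement("script");
                script.src = uri;
                document.getElementsByTagName("head")[0].appendChild(script);
            }
        },
        notify: function (uid, data) {
            var callbacks = pending[uid] || [];
            delete pending[uid];
            for (var i = 0; i < callbacks.length; i++) {
                callbacks[i](uid, data);
            }
        }
    };
}());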




Here is an example of a StaticJSONP interaction on the client side:



var
    // just as example
    result = [],

    // library 1
    client1 = function (uri, uid, delay) {
        function exec() {
            StaticJSONP.request(uri, uid, function (uid, evt) {
                result.push("client1: " + evt.data);
            });
        }
        delay ? setTimeout(exec, delay) : exec();
    },

    // library 2
    client2 = function (uri, uid, delay) {
        function exec() {
            StaticJSONP.request(uri, uid, function (uid, evt) {
                result.push("client2: " + evt.data);
            });
        }
        delay ? setTimeout(exec, delay) : exec();
    }
;

// library 1 does its business
client1("static/1.js", "static_service_1", 250);
// so does library 2
client2("static/2.js", "static_service_2", 250);

setTimeout(function () {
    // suddenly both require the same service/file
    client1("static/3.js", "static_service_3", 0);
    client2("static/3.js", "static_service_3", 0);

    setTimeout(function () {
        alert(result.join("\n"));
    }, 500);
}, 1000);



It is possible to test the live demo ... just wait a little bit and you will see this alert:



// order may differ according to
// website response time per file
client1: 1
client2: 2
client1: 3
client2: 3



If you monitor network traffic you will see that static/3.js is downloaded only once.

If the response is really big and the connection not so good ( 3G or worse ), it may happen that the same file is required again while the first request has not finished yet.

Since the whole purpose of StaticJSONP is to simplify server side life, any redundant request will be avoided on the client side.



The Unique ID ...

StaticJSONP can easily be integrated together with a normal JSONP service.

For example, if we need to obtain the list of best sellers, assuming this list is not static due to too frequent changes, we can do something like this:



// this code is an example purpose only
// it won't work anywhere

// JSONP callback to best sellers
JSONP("http://amazon/restful/books/bestSellers", function (evt) {
    // the evt contains a data property
    var data = evt.data;

    // data is a list of book titles and ids
    for (var i = 0, li = []; i < data.length; i++) {
        li[i] = '<a href="javascript:getBookInfo(' + data[i].id + ')">' + data[i].title + '</a>';
    }

    // show the content
    document.body.innerHTML = '<ul><li>' + li.join('</li><li>') + '</li></ul>';
});

// the function to retrieve more info
function getBookInfo(book_id) {
    StaticJSONP.request(
        // the url to call
        "http://amazon/restful/static/books/" + book_id + ".js",
        // the unique id, according to the current RESTful API
        "amazon_apiv2_info_book_" + book_id,
        // the callback to execute once the server responds
        function (uid, evt) {
            // evt contains all book related data
            // we can show it wherever we want
        }
    );
}



Now just imagine how many users in the world are performing similar requests right now to the same list of books, being best sellers ...



Unique ID Practices

It is really important to understand the reason StaticJSONP requires a unique id.

First of all it is not possible, nor convenient, to "magically retrieve it from the url", because any RESTful API out there may have a "different shape".

The unique id is a sort of trusted, pre-agreed, and aligned piece of information the client side library must be aware of, since there is no way to change it on the server side, the file being statically created.

It is also important to prefix the id so that debugging will be easier on the client side.

However, the combination used to generate the unique id may itself be already ... well, unique, so it's up to us on both client and server side to define it in a possibly consistent way.

The reason I did not automatically derive the unique id from the url in the StaticJSONP request method is simple:

if both gadgets/102.js and books/102.js contain a plain 102 id, there is no way on the client side to understand which article has been required, and both the gadgets and the books registered callbacks will be notified, one of the two surely with the wrong data.

It's really not complicated to namespace a unique id prefix and this should be the way to go imho.



Conclusion

It's usually really difficult to agree unanimously on a solution for a specific problem, and I am not expecting that from tomorrow everyone will adopt this technique to speed up server side file serving over common "JSONP queries". But I hope you understood why this approach may be needed, and also how to properly implement a solution that does not cause client side conflicts, that scales, that does not increase the final application size in any relevant way, and that is ready to go for the day when, and if, you are gonna need it. Enjoy

Monday, August 8, 2011

Please Stop Reassigning For No Reason!

I swear it was meant to be a short one, but it was a must-write and yes, once again, since I keep seeing this kind of mistake everywhere!

This is not about JavaScript, this is about programming whatever language you want ... if you do this, you are doing it wrong!



The Problem

JS engine developers are desperate! They are even posting about how to help JIT compilers go faster, and developers seem to be so lazy that even the most basic good practices are avoided.



Performance apart, this pattern can also be dangerous, especially in this ES5 era where getters and setters are extremely common, especially on mobile browsers where nobody cares about the IE gap.



I am looking at you only if you are still writing something like this in your code:



window.whatever = window.whatever || {
    /* the whatever it is, object, function, anything */
};



where window is just the generic object example.



OMG, What Can Be Wrong ...

Everything! Any object property could be behind a native or user defined setter. In this case the above technique invokes the potential setter passing through the potential getter.

This simply means that:


  • according to the task, we are asking for N extra computations or function invocations for no reason


  • the setter may accept something different from what the getter returns. If this is true, the result could be an error rather than a "smart assignment"


  • if the setter was set already and implemented lazy re-assignment logic, we are bypassing the lazy assignment features/logic, requiring its execution instantly and, once again, without any reason


  • if the object has a getter only, the operation will throw an error while setting ( a tiny demonstration follows this list )
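
A tiny demonstration of the getter/setter points, using ES5 object literal accessors:

var o = {
    _whatever: {},
    get whatever() {
        alert("getter");
        return this._whatever;
    },
    set whatever(value) {
        alert("setter");
        this._whatever = value;
    }
};

// fires both alerts, even though nothing really changes
o.whatever = o.whatever || {};

// fires none of them
"whatever" in o || (o.whatever = {});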




No getters or setters? It does not matter, since this way the property has to be retrieved and, if present, reassigned to itself anyway.



How JavaScript Engines Work

Thanks for asking. Let's imagine every object refers, behind the scenes, to a stack of strings representing the list of properties. This stack is used to know if object.hasOwnProperty("propertyName") or if "propertyName" in object is present. Once it is decided that the object can access propertyName, a sort of -1 < objectProperties.indexOf("propertyName") procedure, the property has to be retrieved and eventually unboxed. Once this part is complete, the reassignment does not care much that it is the same property: here comes the propertyName to propertyValue procedure, which most likely will "erase" the older reference in order to reassign the new one. Even if the engine is truly smart, all possible checks/logics about the memory address of the value and the property name of the object have to be executed.

If you don't believe me you can simply check the WebKit engine Object source code and compare the operations needed to set and get VS JSObjectHasProperty: basically just a call to jsObject->hasProperty, rather than a whole logic chain moved via JSObjectGetProperty plus JSObjectSetProperty which, of course, invokes hasProperty as well.



How To Do The Same Better And Faster



"whatever" in window || (window.whatever = {

/* the whatever it is, object, function, anything */

});



// eventually easier to shortcut via minifier

var key = "whatever";

key in window || (window[key] = {

/* the whatever it is, object, function, anything */

});



// eventually easier to use but requires potentially extra memory

// 'cause the assignment has to be created in any case

// and it can't be ignored as it could be with precedent examples

function setDefault(object, key, value) {

key in object || (object[key] = value);

// here we can play with the returned value

// I chose the function itself but it can be anything

// or nothing if you prefer

return setDefault;

}



setDefault(

window, "whatever", {}

)(

window, "somethingElse", function(){}

)(

window, "howCoolIsIt", "very cool!"

);





The reason it's better is simple: no setters or getters invoked, plus no redundant/superfluous operations performed over *reassignment*. The above snippet will simply invoke JSObjectHasProperty and jsObject->hasProperty, so that the whole setter logic will be executed only if necessary and, in any case, no getter logic will ever be involved ... got it?



It Does Not Work

Oh ... really? Most likely it's a browser bug you should file ASAP, or even more likely a new feature not there yet. The in operator should indeed always work as expected, with or without inheritance involved.



var o = Object.defineProperty({}, "test", {
    enumerable: false,
    writable: false,
    configurable: false,
    value: "OK"
});

alert("test" in o && o.test); // ... guess what ...
// OK



Since defined properties respect the in operator, nobody stops us from defining them using the same pattern.



"prop" in object || Object.defineProperty(

object, "prop", descriptor

);





I Have To Do Refactoring

Good, so start with this RegExp to search for the evil code in all your files:



/([$_0-9a-zA-Z]+(?:\.[$_0-9a-zA-Z]+|\[[$_0-9a-zA-Z]+\]))\s*=\s*\1/
    .test(fileText)

// or, suggested by @joseanpg
/([$\w]+(?:\.[$\w]+|\[[$\w]+\]))\s*=\s*\1(?:[\s\|&,;]|$)/m



and modify accordingly. The second RegExp unfortunately requires the end of the assignment, so it may be weak.



I Kill A Kitten Each Time I Read That

This is what happens if you don't start doing it the right way now so, please, THINK ABOUT KITTENS!






Update On Alternative Ways

As @diegoperini pointed out in this tweet another way to avoid the setter is:



window.propertyName || (window.propertyName = {
    /* the whatever value */
});



However, the above technique still invokes the getter, either defined by the user or behind the JavaScript scenes ( in core ).

My first suggestion avoids "empty getters", so even if this latter technique may be considered a better approach, it can still suffer from or imply side effects.



Update On False Positives

As Diego pointed out, "prop" in obj may result in a false positive if obj.prop is assigned but falsy. I consider this really an edge case, and even such a pattern will be error prone if the value is truthy but not the expected one.



this.prop || (this.prop = {});

// now imagine before
this.prop = true;

// bye bye library



There is actually no easy or fast enough way to compare two clones at runtime in order to understand if the assigned object is the expected one. Also, at that point this extra overhead would be unnecessary, since re-assignment would just be faster.

At the end of the day it's up to us to be as safe as possible while still preserving common sense and good programming logic, but don't tell me I am talking about premature optimization, because all I am talking about is logic over a pattern that is similar in size to the one I am suggesting but completely different in terms of required operations and destroyed possibilities.



We should never hide bad practices behind the "premature" flag!



Update With Benchmark



As asked via comments, I have created a jsperf benchmark. Please bear in mind that the used test is not a real use case, while the described problems are still the same ones I have been talking about in this post.

JIT compilers may optimize repeated code and this is most likely what happens with the commonPattern approach.

I love the fact the benchmark basically proves me wrong when getters and setters are not involved on the JS side, because it's kinda illogical according to browser implementations.

That said, if anybody from Chrome development would be so kind as to comment on why re-assignment is faster than just in, it would be really nice.



Conclusions

100000 VS 2000000 ops for non real cases are not so useful to analyze, but what should be underlined is that especially with getters and setters the in operator is both faster and safer cross browser and cross platform.



For example, there is only one proper way to shim Array.isArray.

If we use return typeof obj.length == "number" rather than toString.call(obj) == "[object Array]" we will surely go faster, but we will also do it totally wrong.
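
That proper shim, based on the toString check just mentioned, being:

if (!Array.isArray) {
    Array.isArray = (function (toString) {
        return function isArray(obj) {
            return toString.call(obj) == "[object Array]";
        };
    }(Object.prototype.toString));
}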



A last note about WebKit: we have to deal with it since it has the majority of the mobile browsing market share. A special case is webOS, which implements V8 rather than JSC but still, don't be blind in front of millions of devices; just understand the side effects the common pattern could cause, against the right way to know if a property has already been set.

Monday, August 1, 2011

bit.ly bookmarklet

bit.ly offers a proper sidebar bookmarklet but you need to sign in; so, if you are really lazy and you want an easy way to shorten whatever page you are visiting, drag and drop the next link into your bookmarks and click it whenever you want.

bit.ly shortener
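
The original link is a javascript: url; a bookmarklet of that kind is typically something along these lines ( the exact bit.ly entry point here is an assumption, not the original ):

javascript:void(open("http://bit.ly/?url="+encodeURIComponent(location.href)))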

enjoy :)