Technology

Sunday, November 27, 2011

About Felix's Style Guide

Style guides are good as long as these are meaningful and a bit "open minded" because if one style is recognized as anti pattern then it must be possible to update/change it.

The Felix's Node.js Style Guide is surely a meaningful one, but I hope it's open minded too.

Why This Post

Because I write node.js code as well sometimes and I would like to write code accepted by the community. Since developers may go too religious about guides I don't want explain all these points per each present or future node.js module I gonna write ... I post it once, I will eventually update it, but it's here and I can simply point at this any time.

Style Guide Side Effect

As happened before with the older one or the linter, a style guide may mislead developers and if they are too religious and unable to think about some point, the quality of the result code may be worse rather than better.
I have talked about JSLint already too much in this blog so ... let's bring the topic in track and see what Felix may re-consider.

Quotes

Current status: Use single quotes, unless you are writing JSON
A common mistake for JavaScript developers, specially those with PHP, Ruby, or other scripting languages background, is to think that single quotes are different from double quotes ... sorry to destroy this myth guys but in JavaScript there is no difference between single and double quotes.
Once you get this, you must think that if by standard definition JSON uses double quotes there is no reason to prefer single quotes over double.
What should this difference tell us, that one is JSON and one is JavaScript for node.js? Such assumption is a bit ridiculous to me since JSON is JavaScript, nothing more, nothing less.
More over, I am not a big fan of HTML strings inside JS files because of separation of concerns, and I'll come back on it, but in English, as well as in many other languages, the single quote inside a string is quite common, isn't it?
Accordingly, are you really planning to prefer 'it\'s OK' over "it's OK"?
And if the suggested editor, in the guide itself, is one with supported syntax highlight, shouldn't we simply trust the highlighted string in order to recognize it, rather than distinguish between JSON and non JSON strings?
Shouldn't we write comfortably without silly rules, as Felix says in another point, so that we can chose whatever string delimiter we want accordingly with the content of the string?
Last, but not least, for those convinced that in HTML there is any sort of difference between single and double quotes, I am sorry to, once again, destroy this myth since there is no difference in HTML between quotes ... so, once again, chose what you want and focus on something else than distinction between JSON and non JSON strings ...

Variable declarations

Current status: Declare one variable per var statement, it makes it easier to re-order the lines. Ignore Crockford on this, and put those declarations wherever they make sense.
If there is one thing that I always agreed with Crockford is this one: variable declared at the beginning of the function, with the exception for function declarations that should be before variable declarations, and not after.
The reason is quite simple: wherever we declare a variable, this will be available at the very beginning of the scope so which place can be better than the beginning of the scope itself to instantly recognize all local scope variables rather than scrolling the whole closure 'till the end in order to find that name that completely screwed up the whole logic in the middle because is shadowing something else inside or outside the scope ?
Is that easy, I don't want to read potentially thousands of lines of code to understand which variable has been defined where, I want a single bloody place to understand what is local and what is not.
The only exception to this rule may be some temporary variable, as is the classic i used in for loops, or some tmp reference still common in for loops


// the exception ... in the middle of the code

for (var i = 0, length = something.length, tmp; i < length; ++i) {
    tmp = something[i];
    // do something meaningful 
}

Chances that above loop will destroy our logic because of a shadowed i or tmp reference are really rare and we should use meaningful variables name.
As summary, there are only advantages on declaring variables on top, and until we have no let statements it is only risky, and dangerous for our business, to define variables in the wild.


// how you may write code
(function () {
  return true;
  var inTheWild = 123;
  function checkTheWild() {
    return inTheWild === 123;
  }
}());

// how JS works in any case
(function () {

  // function declarations instantly available in the scope
  function checkTheWild() {
    return inTheWild === 123;
  }

  // any var declaration instantly available as undefined
  var inTheWild;

  return true;

  // no matters where/if we assign the value
  // the inTheWild is already available everywhere
  // with undefined as default value
  // and before this assignment
  inTheWild = 123;
  
  // shadows should be easy to track
  // top of the function is the right place to track them
}());

Constants

Current status: Constants should be declared as regular variables or static class properties, using all uppercase letters.

Node.js / V8 actually supports mozilla's const extension, but unfortunately that cannot be applied to class members, nor is it part of any ECMA standard.
I could not disagree more here ... the meaning of constant is something immutable and the reason we use constant is to believe these are immutable.
Why on earth avoid a language feature when it provides exactly what we need? Don't we want more reliable code? And what about security?
Neither in PHP we can define an object property and this does not mean we should avoid the usage of define function, isn't it?
Use const as much as you want unless the code does not have to run cross platform ( browsers included ) and if you want to define constants with objects, use the JavaScript equivalent to do that.


function define(object, property, value) {
    Object.defineProperty(object, property, {
        get: function () {
            return value;
        }
        // if you really want to throw an error
        // , set: function () {throw "unable to redefine " + property;}
    });
}

var o = {};
define(o, "TEST", 123);
o.TEST = 456;
o.TEST; // 123

I have personally proposed cross platform ways to do the same without or within objects themselves.
As we can see there are many ways to define constants and if the argument is that const is not standard it's a weak one because nobody is planning to remove this keyword from V8 or SpiderMonkey, most likely every other browser will adopt this keyword in the future but if that won't happen, to find and replace it with var only when necessary is a better way to go: security and features, we should encourage these things rather than ignore them, imho.

Equality operator

Current status: Programming is not about remembering stupid rules. Use the triple equality operator as it will work just as expected.
No it doesn't because coercion is not about equality only.


alert("2" < 3 && 1 < "2");

Programming IS about remembering rules, that's the reason we need to know the syntax in order to code something ... right?
Coercion must be understood rather than being ignored and rules are pretty much simple and widely explained in the specs, even more simplified in this old post of mine.
If any of you developers still do this idiotic check you are doing it wrong.


// WRONG!!!!!!!!!!!!!
arg === null || arg === undefined

// THE SAFER EXACT EQUIVALENT
arg == null

You should rather avoid undefined equality in any piece of code you wrote, unless you did not create an undefined reference by yourself.
As summary, a good developer does not ignore problems, a good developer understands them and only after decide which way is the best one.
Felix is a good developer and he should promote deeper knowledge ... this specific point was really bad because it's like saying "avoid regular expressions because you don't understand them and you don't want to learn them".

Extending prototypes

Current status: Do not extend the prototypes of any objects, especially native ones
Read this article and stick an "it depends" in your screen. Nobody wants to limit JS potentials and Object.defineProperty/ies is a robust way to do things properly.
Sure we must understand that even a method like Array#empty could have different meanings ... e.g.


var arr = Array(3);
// is arr empty ?
// length is 3 but no index has been addressed

If we pollute native prototype we must agree on the meaning or we must prefix our extension so that nobody can complain if that empty was good enough or just a wrong concept.


// a meaningful Array#empty
Object.defineProperty(Array.prototype, "empty", {
  value: function empty() {
    for (var i in this) return false;
    return true;
  }
});

alert([
  [].empty(),       // true
  Array(3).empty(), // true
  [null].empty()    // false
]);

Conditions

Current status: Any non-trivial conditions should be assigned to a descriptive variable
I often write empty condition on purpose and to both avoid superflous re-assignment and speed up the code.


// WRONG
object.property = object.property || defaultValue;

// better
object.property || (object.property = defaultValue):

// RIGHT
"property" in object || (object.property = defaultValue);

Why on earth would I assign that ? Same I would not do it if the condition is not reused at least twice.


if (done.something() && done.succeed) {
    // done.something() && done.succeed useful
    // here and nowhere else in the code
}

Accordingly with the fact spread variable declarations are a bad practice, I cannot think about creating flags all over the place for these kind of one shot checks.

Function length

Current status: Keep your functions short. A good function fits on a slide that the people in the last row of a big room can comfortably read. So don't count on them having perfect vision and limit yourself to ~10 lines of code per function.
I agree on this point but JavaScript is all about functions.
I understand the require approach does not need a closure to surround the whole module, being this already somehow sandboxed, but cross platform code and private functions or variables are really common in JS so that a module may be entirely written inside a closure.
A closure is a function so as long as this point is flexible is fine to me, also as long as the code is still readable and elegant ... I can already imagine developers writing 10 lines of functions using all 80 characters per line in order to respect this point ...

... now, that would be lame.

Object.freeze, Object.preventExtensions, Object.seal, with, eval

Current status: Crazy shit that you will probably never need. Stay away from it.
Felix here put language features together with "language mistakes".
With "use strict" directive, the one Felix should have put at the very beginning of his guideline, with statement is not even allowed.
eval has nothing to do with Object.freeze and others and these methods are part of the ES5 specification.
It is true you may not need them, to define them as something to stay away from, specially from somebody that said there are no constants for objects, is kinda limitative and a sort of non sense.
Learn these new methods, and use them if you think/want/need.

And that's all folks

Saturday, November 26, 2011

JSONH New schema Argument

The freaking fast and bandwidth saver JSONH Project has finally a new schema argument added at the end of every method in order to make nested Homogeneous Collections automatically "packable", here an example:


            var
                // nested objects b property
                // have same homogeneous collections
                // in properties c and d
                schema = ["b.c", "b.d"],
                
                // test case
                test = [
                    {   // homogeneous collections in c and d
                        b: {
                            c: [
                                {a: 1},
                                {a: 2}
                            ],
                            d: [
                                {a: 3},
                                {a: 4}
                            ]
                        }
                    }, {
                        a: 1,
                        // same homogeneous collections in c and d
                        b: {
                            c: [
                                {a: 5},
                                {a: 6}
                            ],
                            d: [
                                {a: 7},
                                {a: 8}
                            ]
                        }
                    }
                ]
            ;

The JSONH.pack(test, schema) output will be the equivalent of this string:


[{"b":{"c":[1,"a",1,2],"d":[1,"a",3,4]}},{"a":1,"b":{"c":[1,"a",5,6],"d":[1,"a",7,8]}}]

How Schema Works

It does not matter if the output is an object or a list of objects, as well as it does not matter if the output has nested properties.
As soon as there is an homogeneous collection somewhere deep in the nested chain and common for all other levels, the schema is able to reach that property and optimize it directly.
Objects inside objects do not need to be the same or homogeneous, these can simply have a unique property which is common for all items and this is enough to take advantage of the schema argument that could be one string, or an array of strings.

Experimental

Not because it does not work, I have added tests indeed, simply because I am not sure 100% this implementation covers all possible cases but I would rather keep it simple and let developers deal with more complex scenario via manual parsing through the JSONH.pack/unpack and without schema ... this is still possible as it has always been.
Let me know what you think about the schema, if accepted, I will implement it in Python and PHP too, thanks.

Friday, November 25, 2011

On Complex Getters And Setters

A common use case for getters and setters is via scalar values rather than complex data.
Well, this is just a programmer mind limit since data we could set, or get, can be of course much more complex: here an example


function Person() {}
Person.prototype.toString = function () {
    return this._name + " is " + this._age;
};
// the magic identity configuration object
Object.defineProperty(Person.prototype, "identity", {
    set: function (identity) {
        // do something meaningful
        this._name = identity.name;
        this._age = identity.age;
        // store identity for the getter
        this._identity = identity;
    },
    get: function () {
        return this._identity;
    }
});

With above pattern we can automagically update a Person instance name and age through a single identity assignment.


var me = new Person;
me.identity = {
    name: "WebReflection",
    age: 33
};

alert(me); // WebReflection is 33

While the example may not make much sense, the concept behind could be extended to any sort of property of any sort of class/instance/object.

What's Wrong

The problem is that the setter does something in order to keep the object updated in all its parts but the getter does nothing different than returning just the identity reference.
As summary, the problem is with the getter and the reason is simple: from user/coder perspective this may not make sense!


me.identity.name = "Andrea";

alert(me); // WebReflection is 33

In few words we are changing an object property, the one used as identity for the me variable, leaving the instance of Person untouched ... and semantically speaking this looks wrong!

A Better Approach

If the setter aim is to update a state there will be only a well known amount of properties to re-assign before this status can be updated.
In this example we would like to be sure that as soon as the identity has been set, the instance toString method will produce the expected result and without relying an external reference, the identity object itself.


// listOfNames and listOfAges are external Arrays
// with same length
for (var
  identity = {},
  population = [],
  i = 0,
  length = listOfNames.length;
  i < length; ++i
) {
  // reuse a single object to define identities
  identity.name = listOfNames[i];
  identity.age = listOfAges[i];
  population[i] = new Person;
  population[i].identity = identity;
  // out of this loop we want that each 
  // instanceof Person prints out
  // the right name and surname
}

// we cannot do it lazily in the toString method ...
// this will fail indeed
Person.prototype.toString = function () {
  return this._identity.name + " is " + this.identity._age;
};

Got it? It's basically what happens when we use an object to define a property, more properties, or to create inheriting from another one: properties and values are parsed and assigned at that moment, not after!


var commonDefinition = {
  enumerable: true,
  writable: true,
  configurable: false
};

var
  name = (commonDefinition.value = "a"),
  a = Object.defineProperty({}, "name", commonDefinition),
  name = (commonDefinition.value = "b"),
  b = Object.defineProperty({}, "name", commonDefinition)
;
alert([
  a.name,  // "a"
  b.name   // "b"
]);

As we can see if the property assignment would have been lazy the result of the latest alert would have been "b", "b" since the object used to define these properties has been recycled ... I hope you are still following ...

A Costy Solution

There is one approach we may consider in order to make this identity consistent ... the one that stores an identity with getters and setters.


// the magic identity configuration object
Object.defineProperty(Person.prototype, "identity", {
    set: function (identity) {
        var self = this;

        // do something meaningful
        self._name = identity.name;
        self._age = identity.age;

        // store identity as fresh new object with bound behavior
        this._identity = Object.defineProperties({}, {
            name: {
                get: function () {
                    return self._name;
                },
                set: function (name) {
                    self._name = name;
                }
            },
            age: {
                get: function () {
                    return self._age;
                },
                set: function (age) {
                    self._age = age;
                }
            }
        });
    },
    get: function () {
        return this._identity;
    }
});

What changed ? The fact that now we are able to pass through the instance and operate through the identity:



me.identity.name = "Andrea";

alert(me); // Andrea is 33

Cool hu ? ... well, it could be better ...

A Faster Solution

If for each Person and each identity we have to create a fresh new object plus at least 4 functions, 2 for getters and 2 for relative setters, our memory will fill up so quickly that we won't be able to define any other person identity soon and here comes the fun part: use the knowledge gained by this pattern internally!
Yes, what we could do to make things slightly better is to recycle the internal identity setter object definition in order to borrow maximum 4 functions rather than 4 extra per each Person instance ... sounds cool. uh?


function Person() {}
Person.prototype.toString = function () {
  return this._name + " is " + this._age;
};

(function () {

  var identityProperties = {
    name: {
      get: function () {
        return this.reference._name;
      },
      set: function (name) {
        this.reference._name = name;
      },
      configurable: true
    },
    age: {
      get: function () {
        return this.reference._age;
      },
      set: function (age) {
        this.reference._age = age;
      },
      configurable: true
    },
    reference: {
      value: null,
      writable: true,
      configurable: true
    }
  };

  Object.defineProperty(Person.prototype, "identity", {
    get: function () {
      return this._identity;
    },
    set: function (identity) {
      // something meaningful
      this._name = identity.name;
      this._age = identity.age;

      // set the reference to the recycled object
      identityProperties.reference.value = this;

      // define the _identity property
      if (!this._identity) {
        Object.defineProperty(
          this, "_identity", {value: {}}
        );
      }
      Object.defineProperties(
        this._identity,
        identityProperties
      );
    }
  });

}());

var
  identity = {},
  a = new Person,
  b = new Person
;

identity.name = "a";
identity.age = 30;
a.identity = identity;

identity.name = "b";
identity.age = 31;
b.identity = identity;

alert([
  a,  // "a is 30"
  b   // "b is 31"
]);

So what we have there? A lazy _identity definition, good for those scenario where some getter or setter may never been invoked, plus a smart recycled property definition through a single object descriptor, and performances boosted up N times per each instance since no multiple different functions are assigned per identity properties getters and setters and no extra objects are created runtime ... arf, arf .. are you still with me ?

As Summary

Some JS developer keep asking for standard ways to do these kind of crazy stuff without realizing that few other programming languages are that flexible as JavaScript is ... it's maybe not that simple to find better, optimized, in therms of both memory consumption and raw performances, patterns to cover weird scenario but what we should appreciate is that with JS we almost have, always, a way to simulate something we did not even think about until the day before.
Have fun with JS ;)