Monday, October 25, 2010

JavaScript Coercion Demystified

This post is another complementary one for my front-trends slides, about performances and security behind sth == null rather than classic sth === null || sth === undefined.
I have already discussed about this in my JSLint: The Bad Part post but I have never gone deeper into this argument.

Falsy Values

In JavaScript, and not only JavaScript, we have so called falsy values. These are respectively: 0, null, undefined, false, "", NaN. Please note the empty string is empty, 'cause differently from php as example, "0" will be considered truish, and this is why we need to explicitly enforce a number cast, specially if we are dealing with input text value.
In many other languages we may consider falsy values even objects such arrays or lists:

<?php
if (array())
echo 'never'
;
?>

#python
if []:
print 'never'


Above example will not work in JavaScript, since Array is still an instanceof Object.
Another language that has falsy values is the lower C, where 0 as example could be considered false inside an if statement.
Falsy values are important to understand, specially if we would like to understand coercion.

About Coercion

In JavaScript world, coercion is considered a sort of evil and unexpected implicit cast, while in my opinion it's simply a feature, if we understand it and we know how to use it.
Coercion is possible only via == (eqeq) operator, but the thing is that everything is properly implemented cross browsers accordingly with ECMAScript 3 Standard.
This is the scary list newcomers have probably never read, directly from the even newer ECMAScript 5 specification, just to inform you that coercion will hopefully always be there, and nothing is evil.

The comparison x == y, where x and y are values, produces true or false. Such a comparison is performed as
follows:


  1. If Type(x) is the same as Type(y), then

    1. If Type(x) is Undefined, return true: undefined == undefined

    2. If Type(x) is Null, return true: null == null

    3. If Type(x) is Number, then

      1. If x is NaN, return false: NaN != NaN

      2. If y is NaN, return false: NaN != NaN

      3. If x is the same Number value as y, return true: 2 == 2

      4. If x is +0 and y is −0, return true: 0 == 0

      5. If x is −0 and y is +0, return true: 0 == 0

      6. Return false: 2 != 1


    4. If Type(x) is String, then return true if x and y are exactly the same sequence of characters (same length and same characters in corresponding positions). Otherwise, return false: "a" == "a" but "a" != "b" and "a" != "aa"

    5. If Type(x) is Boolean, return true if x and y are both true or both false. Otherwise, return false: true == true and false == false but true != false and false != true

    6. Return true if x and y refer to the same object. Otherwise, return false: var o = {}; o == o but o != {} and {} != {} and [] != [] ... etc etc, all objects are eqeq only if it's the same


  2. If x is null and y is undefined, return true: null == undefined

  3. If x is undefined and y is null, return true: undefined == null

  4. If Type(x) is Number and Type(y) is String, return the result of the comparison x == ToNumber(y): 2 == "2"

  5. If Type(x) is String and Type(y) is Number, return the result of the comparison ToNumber(x) == y: "2" == 2

  6. If Type(x) is Boolean, return the result of the comparison ToNumber(x) == y: false == 0 and true == 1 but true != 2

  7. If Type(y) is Boolean, return the result of the comparison x == ToNumber(y)

  8. If Type(x) is either String or Number and Type(y) is Object, return the result of the comparison x == ToPrimitive(y): ToPrimitive means implicit valueOf call or toString if toString is defined and valueOf is not


About last point, this is the object coercion we are all scared about ...

var one = {
valueOf: function () {
return 1;
},
toString: function () {
return "2";
}
};

alert(one == 1); // true
alert(one == "2"); // false

If we remove the valueOf method, we will implicitly call the toString one so that one == "2" or, more generally, {} == "[object Object]".

null == undefined And null == null, Nothing Else!

99% of the time we do a check such:

function something(arg) {
if (arg === undefined) {
}
}

We are asking the current engine to check if the undefined variable has been redefined in the current scope, up to the global one, passing through all outer scopes.
Even worse, we may end up with the most silly check ever:

function something(arg) {
if (arg === undefined || arg === null) {
}
}

which shows entirely how much we don't know JavaScript, being the exact equivalent of:

function something(arg) {
if (arg == null) {
}
}

with these nice to have differences:

  • null cannot be redefined, being NOT a variable

  • null does NOT require scope resolution, neither lookup up to the global scope

The only side effect we may have when we check against null via == is that we consider for that particular case null and undefined different values .... now think how many times you do this ... and ask yourself why ...

Performances

Once again I send you to this simple benchmark page, where if you click over null VS undefined or null VS eqeq, or one of the lines showed under the header, you can realize that while it's usually faster and safer, it provides even more control when compared against the ! not operator.
The only way to reach better performances when we mean to compare against an undefined value in a safer way is to declare the variable locally without assigning any value, so that minifiers can shrink the variable name while the check will be safer.

// whatever nested scope ...
for (var undefined, i = 0; i < 10; ++i) {
a[i] === undefined && (a[i] = "not anymore");
}


There Is NO WTF In JavaScript Coercion!

Coercion in JavaScript is well described and perfectly respected cross browser being something extremely simple to implement in whatever engine. Rules are there and if we know what kind of data we are dealing with, we can always decide to be even safer and faster.
Of course if we are not aware the strict equivalent operator === is absolutely the way to go, but for example, how many times you have written something like this?

if (typeof variable === "string") ...

being typeof an operator we cannot overwrite neither change, and being sure that typeof always returns a string there is no reason at all to use the strict eqeqeq operator since String === String behavior is exactly the same of String == String by specs.
Moreover, as said before, coercion could be absolutely meant in some case, check what we can do with other languages, as example:

# Python 3
class WTF:
def __eq__(self, value):
return value == None

# Python 2
class WTF():
def __eq__(self, value):
return value == None

# in both cases ...
if WTF() == None:
"""WTF!!!"""

While this is a C# example:

using System;

namespace wtf {

class MainClass {

public static void Main (string[] args) {
WTF self = new WTF();
Console.WriteLine (
self ?
"true" : "false"
);
}
}

class WTF {
static public implicit operator bool(WTF self) {
return false;
}
}

}

Can we consider some sort of coercion latest cases as well? It's simply operator overloading, virtually the same JavaScript engines implemented behind the scene in order to consider specifications points when an eqeq is encountered.
Have fun with coercion ;-)

No comments:

Post a Comment