Sunday, November 29, 2009

XML To HTML Snippet

This John Resig post about nodeName demonstrates once again how trustfulness are edge cases in JavaScript.
I must agree 100% with @jdalton: frameworks or selector libraries should not be concerned about these cases.

First of all there is no universal solution so whatever effort able to slow down libraries won't be perfect, then why bother?

Secondly, I cannot even understand why on earth somebody could need to adopt XML nodes in that way.
Agreed that importNode or adoptNode should not be that buggy, but at the same time I have always used XSL(T) to inject XML into HTML and I have never had problems.

Different Worlds

In an XML document a tag is just a tag. It does not matter which name we chose or which JavaScript event we attached, XML is simply a data protocol, or transporter, and nothing else.
A link, a div, an head, the html node itself, does not mean anything different in XML so again: why do we need to import in that way?
In Internet Explorer we have the xml property which is, for HTML represented inside XML docs, the fastest/simplest way to move that node and what it contains inside an element and via innerHTML.
Moreover, namespaces are a problem, we cannot easily represent them into an HTML document so, in any case, we need to be sure about represented data.

A Better Import

Rather than ask every selector library to handle these edge cases we could simply adopt a better import strategy.
This XML to HTML transformer is already a valid alternative, but we can write something a bit better or more suitable for common cases.
As example, a truly common case in these kind of transformations is a CDATA section inside a script node.
With CDATA we can put almost whatever we want inside the node but 99.9% of the time what we need is a JavaScript code, rather than a comment.
Another thing to consider is that if we need to import or adopt an XML node, 99.9% of the time we need to import its full content and not an empty node (otherwise we should ask us why we are representing data like that, no?).
I bet the deep variable will be basically true by default so, here there is my alternative proposal which should be a bit faster, avoiding implicit boolean cast for each attribute or node, and considering what I have said few lines ago:

function XML2HTML(xml){
// WebReflection Suggestion - MIT Style License
for(var
nodeName = xml.nodeName.toUpperCase(),
html = document.createElement(nodeName),
attributes = xml.attributes || [],
i = 0, length = attributes.length,
tmp;
i < length; ++i
)
html.setAttribute((tmp = attributes[i]).name, tmp.value)
;
for(var
childNodes = xml.childNodes,
i = 0, length = childNodes.length;
i < length; ++i
){
switch((tmp = childNodes[i]).nodeType){
case 1:
html.appendChild(XML2HTML(tmp));
break;
case 3:
html.appendChild(document.createTextNode(tmp.nodeValue));
break;
case 4:
case 8:
// assuming .text works in every browser
nodeName === "SCRIPT" ?
html.text = tmp.nodeValue :
html.appendChild(document.createComment(tmp.nodeValue))
;
break;
}
};
return html
};

I have tried to post above snippet into John post as well but for some reason it is still not there (maybe waiting to be approved)
We can test above snippet via this piece of code:

var data = '<section data="123"><script><![CDATA[\nalert(123)\n]]></script><option>value</option></section>';
try{
var xml = new ActiveXObject("Microsoft.XMLDOM");
xml.loadXML(data);
}catch(e){
var xml = new DOMParser().parseFromString(data, "text/xml");
}
var section = xml.getElementsByTagName("section")[0];
onload = function(){
document.body.appendChild(XML2HTML(section));
alert(document.body.innerHTML);
};

We can use latest snippet to test the other function as well and as soon as I can I will try to compare solutions to provide some benchmark.

No comments:

Post a Comment