Tag: coding

Javascript Slim-Down

July 24, 2006 / fforw / 0 Comments

Modern javascript libraries tend to be fat. This increases the page load latency, wastes bandwidth and is a real problem for people with slower connections (There are still about 50% dialup users). So trying to slim things down comes almost naturally.

In this article I will try to present two compression methods that together will make your scripts lose up to 80% of their weight and I will also present an alternative way to serve gzip compressed files with Turbogears : a precompressing gzip controller for Turbogears.

The first thing one can do is javascript compression. Stripping unnecessary whitespace and renaming local variables can already bring down the size a bit, The dojo javascript compressor ( online version ) does both.

Let’s take two of the javascript libraries used for this site as examples:

ff-src.js – source version of my own javascript library
Uncompressed: 10.1 KB
Javascript compressed: 6.7 KB
Compression: 33%
tinymce.js – all sources I need from the tinymce richtext editor joined together
Uncompressed: 329.0 KB
Javascript compressed: 233.6 KB
Compression: 29%

This sure is nice for a start, but we can go even one step further with gzip compression. Gzip compression is a w3 standard ( defined in RFC 2616 ) and can be used to send the output of a server compressed to a client. Using this our files shrink even more:

ff-src.js
Uncompressed: 10.1 KB
Js and gzip compressed : 2.8 KB
Compression: 73%
tinymce.js
Uncompressed: 329.0 KB
Js and gzip compressed : 57.8 KB
Compression: 82%

Most modern web servers allow you to enable general gzip compression. IIS does it, Apache has mod_gzip for it, Turbogears can do gzip compression via cherrypy’s gzip filter.

Personally this felt like an overkill to me. I don’t really want to use my precious server CPU cycles to compress output where it’s no larger win to do so. So I wrote myself a precompressing gzip controller for Turbogears. It creates gzipped copies of a configurable set of files and automatically updates them. If the client browser supports gzip encoding, the gzipped file copy is sent, otherwise the normal file is send.

The controller is meant to be used as a subcontroller list this

class Root( controllers.RootController ):
# subcontroller
gz = GzipController()

This would map the gzip controller to /gz/ .

Download:

gz.py — precompressing gzip controller
serveFile.patch — Update: the patch has been integrated into the turbogears distribution

Annotating DOM nodes with JSON

July 3, 2006 / fforw / 0 Comments

One problem when trying to do “The Right Thing ™” and separate your webdesign into different layers is what to do when you need additional information to transform your nicely id and class annotated DOM nodes into fancy, javascript-enabled goodness:

Where do you store that information? How do you associate it to the DOM nodes? I will present three approaches to this problem..

Method 1: The squeaky clean

The first method is relatively straight-forward: You add ids or classes to your DOM nodes. Then you include a dynamically generated javascript library in your document.

The problem with this is that the generated javascript library is requested in a new request which makes it nescessary to provide this request with the nescessary context to generate it. You also need to write code which finds your DOM nodes and finds the data from the included javascript library and associates both. You also walk over the server side data twice: Once to generate the DOM nodes and once to create the javascript library.

Method 2: Compromising

The second approach is to compromise a little on the separation ideal and render the javascript data from the first method right into the page source.

You don’t need to carry the context for the data into a second request for the javascript library, but you still need to wak the data twice on the server side, you still need to associate the DOM nodes to the data, and you have a big block of js data in the head section of your document.

Method 3: Annotating DOM nodes with JSON

I thought about a way to directly associate the DOM with additional data. JSON seemed to be a good format to store the data. At first I thought about using a custom attribute but I did not like the idea of invalid HTML markup. Then I got another idea:

Why not use the event handlers? Something like

<a href="/no_js" onclick="return { foo: 'extra', bar: 1};">Link</a>

works pretty well. It is totally valid HTML, the onclick contains valid javascript code which can also be easily parsed by other tools. (It is basically just a JSON string wrapped with a “return […] ;” ) When javascript is disabled, the link just executes normally and can provide non-javascript functionality. The data can be retrieved by executing “var data=linkNode.onclick();”. “But what about the fake event handler?” you might ask. Well.. if someone clicks on the unmodified link in a javascript enabled browser: nothing happens. The script just returns some data and does nothing else. The return code will be ignored by the browser environment due to this age-old, pre-DOM standard of canceling the event if the handler returns false and just going on if it returns true — since the data is something it evaluates to true so it’s just ignored.

On the server side things get much easier. Not only does no context need to be carried into another request but you also only need to walk over the data once. You can ouput the HTML and the additional javascript data into the same part of the document.

I admit that this approach bends the rules of separation a little, but in my opinion that’s ok.

It uses onclick=”” but does not really put any code in there, just some data
The semantics of the original markup are kept as they are. The method only allows to annotate this “base data” with additional information.
The pros vastly outweigh the cons

Separate it!

July 3, 2006 / fforw / 0 Comments

One of the most important practices when developing complex web applications is to seperate them into different layers. This applies to the server side as well as to the client side of things. On the client side the separation usually is:

Content
The content of the web application expressed in semantic HTML markup, enriched with additional ids and classes to refer to in the other layers
Layout
CSS rules for styling the content
Behaviour
Javascript code transforming the DOM to provide a better user experience

It is important that the second and third layer are optional. The Content layer should contain the basic information and be accessible without the Layout and the Behaviour layer. Content and Layout should look nice and work without the Behaviour layer.

While this wisdom is very much common place for the first two layers, many people still don’t do the Behaviour layer correctly. There are several important points to this:

Build every single page so that it works well without any javascript. Do not rely on javascript for navigation or load content via javascript etc.
Build your scripts so that they use DOM methods to transform the non-javascript page into a better page using javascript.
Use w3 DOM scripting in your pages. No document.write nonsense, no onclick=”” for events. If you want to or have to support Internet Explorer that means using additional addEvent methods. (See the links for John Resig’s excellent addEvent )

Links:

A List Apart: Separation: The Web Designer’s Dilemma
mezzoblue Markup Guide — about semantic HTML markup
John Resig – Flexible Javascript Events

Dojo Javascript Compression and IE conditional compilation

July 1, 2006 / fforw / 0 Comments

I often use the dojo javascript compressor system which is a very nice way to reduce the site of larger js libs. It uses the Rhino javascript interpreter to reduce the size of all local variables and removes all unessecary spaces and comments — but most importantly it keeps the external API of a javascript lib like it is.

A minor problem with that was that I like to use Internet Explorer specific conditional compilation to handle Internet Explorer’s non-standard js garbage. (I would really like to be able to fully ignore IE, but ignoring 70% of all Internet users is not really going to please my clients).

Since the compressor removes all comments from the js files it also removes conditional code sections. There are plans to make the compressor keep them but they’re postponed until at least dojo 0.4.

So for now I wrote myself a little bash script that replaces the conditional comments with javascript expressions that are not removed, compresses the js file and reinserts the conditional comments.

I have tested this only on linux but you should also be able to use it under OS-X or under cygwin for Windows.

dojo compressor bash script ( needs java, expects custom_rhino.jar to be in the same directory )
custom_rhino.jar (see dojo site for how to get the source)

Using this script

/* Javascript compression Test */

alert("Js compression test");

function foo(arg)
{
        var local=arg;
}

/*@cc_on @*/
/*@if (@_jscript_version >= 5)

alert("You are using Internet Explorer 5+");

function bar(arg)
{
        var localIE=arg;
}

@end @*/

is compressed to

alert("Js compression test");
function foo(_1){
var _2=_1;
}
/*@cc_on @*/
/*@if (@_jscript_version >= 5)
alert("You are using Internet Explorer 5+");
function bar(_3){
var _4=_3;
}
@end @*/

That does not seem that much for this small example, but for larger libs the reduction is not that bad. The current version of my ff js lib e.g. compresses from 12005 bytes to 7644 bytes (36% reduction).

Update: corrected custom_rhino.jar URL

Storing hierarchical data in a database

May 24, 2006 / fforw / 0 Comments

When looking for a good way to store nested comments in a SQL database I stumbled upon the explanation of an algorithm called “modified preorder tree traversal” algorithm [2]. Instead of storing a parent-child relationship into the database and having to do one SQL query for every node, it allows querying the whole comment tree (and subtrees) with a single query.

The basic idea is to walk along the outside of the tree hierarchy and to number the left and right side of every node along the way:

While this is rather unintuitive it allows to:

Get the sub tree of an node by selecting all nodes whose left value is between the left and the right value of the top node of the subtree
Get the path of a node by selecting all nodes whose left value is smaller than the left value of the node in question and whose right value is larger than its right value
Determine the number of children of a node by calculating:
( right value – left value -1 ) / 2

The disadvantage of this approach is that the addition of a single node leads to changes in all nodes to the “right” of it. After pondering the problem for a while I came up with a variant of the algorithm that gets rid of that. Instead of using simple consecutive numbers for the left and right values I wrote an algorithm which spreads all the left/right values over a fixed range of numbers. The top node starts with a fixed left and right value and children are inserted by splitting that interval into smaller sub-intervals. This makes it possible to not change the key values on inserting and do that insert with a single writing operation.