topics: functional programming, concurrency, web-development, REST, dynamic languages

Thursday, February 21, 2008

JavaScript parasitic inheritance, power constructors and instanceof.

Abstract. This posting shows how one can make Crockford's power constructor functions play nicely with the JavaScript keyword 'instanceof.'

[update: March 18, 2008. I asked Crockford what he thinks about this pattern, and he actually discourages the use of 'instanceof' -- instead, he prefers to "... rely instead on good design and polymorphism."]

The inheritance model of JavaScript is based on a combination of the 'new' keyword and the prototype property of (constructor) functions. JavaScript Guru Douglas Crockford (aka 'Yoda') argues that this model (which he calls pseudo-classical) is awkward. Instead, he proposes an elegant, powerful and simple model (parasitic inheritance), using so-called power constructor functions. Note, familiarity with the above concepts is necessary for complete understanding of this post [and something that every web developer should know anyway ;-)].

The advantages of power constructor functions include support for private, shared and public variables as well as simplicity (avoiding new and prototype). There is a mismatch, however, between power constructors and the JavaScript keyword instanceof. Consider the following example:

//recall that the object function creates a new object which has
//the input object, o, as its prototype
var object = (function() {
function F() {}
return function(o) {
F.prototype = o;
return new F();
};
})();//included for completeness.
var OriginalPerson = {
sayHello: function(){
return "Hello, my name is "+this.getName();
},
getName: function(){return 'Adam';}
};

function Person(name) {
var p = object(OriginalPerson);
p.getName = function() {return name;};
return p;
}

function Guru(name,topic) {
var g = object(Person(name));//Technically we don't need object(.) here
g.getTopic = function() {return topic;};
return g;
}

var karl = Person('Karl');
var crockford = Guru('Douglas','JavaScript');

karl instanceof Person;//<- false
crockford instanceof Guru;//<- false

Hmm... Clearly, any environment that makes crockford instanceof Guru evaluate to false must be getting something wrong!

In general, one has to do something extra to make power constructors work with instanceof. The expression exp1 instanceof exp2, where exp1 and exp2 are JS expressions (exp2 must evaluate to a function and exp1 should evaluate to an object), has the following semantics:

First exp1 is evaluated, say, to o; then exp2 is evaluated, say, to f.
If o is an object and f is a function, the entire expression evaluates to true if and only if f.prototype can be reached by following o's [[prototype]] chain.

This means that to make this work, we must ensure that the object created has Person.prototype or Guru.prototype in its prototype chain.
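The lookup described above can be sketched by hand. The helper name isInstance is hypothetical, and the sketch assumes Object.getPrototypeOf is available (it was standardized in ECMAScript 5, after this post was written). It only approximates what the engine does for ordinary functions.

```javascript
// Hypothetical helper: approximates `o instanceof f` by walking
// o's [[prototype]] chain and looking for f.prototype.
function isInstance(o, f) {
    if (typeof f !== 'function') {
        throw new TypeError('right-hand side must be a function');
    }
    if (o === null || (typeof o !== 'object' && typeof o !== 'function')) {
        return false; // primitives are never instances
    }
    var target = f.prototype;
    var current = Object.getPrototypeOf(o);
    while (current !== null) {
        if (current === target) {
            return true;
        }
        current = Object.getPrototypeOf(current);
    }
    return false;
}

function Plain() {}
var made = new Plain();          // chain: made -> Plain.prototype -> Object.prototype
var literal = { kind: 'plain' }; // chain: literal -> Object.prototype

isInstance(made, Plain);     // true
isInstance(literal, Plain);  // false, Plain.prototype is not in the chain
isInstance(literal, Object); // true
```

Seen this way, a power constructor that returns object(OriginalPerson) produces an object whose chain contains OriginalPerson but never Person.prototype, which is why karl instanceof Person is false.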

What I would really like to end up with at the end of this blog entry, is to be able to write code similar to:

var OriginalPerson = {
sayHello: function(){
return "Hello, my name is "+this.getName();
},
getName: function(){return 'Adam';}
};

var Person = OriginalPerson.parasite(function(Host,name) {
var p = object(Host());
p.getName = function() {return name;};
return p;
});

var Guru = Person.parasite(function(Host, name,topic) {
var g = object(Host(name));
g.getTopic = function() {return topic;};
return g;
});

Guru('Douglas Crockford','JavaScript') instanceof Guru;//<-- true
Guru('Douglas Crockford','JavaScript') instanceof Person;//<-- true


The extra parameter Host is supposed to represent the "Host" of the parasite (i.e., Person in the case of Guru), the idea being that the parasite function will somehow 'wrap' the Person function to set it up so that 'instanceof' works, and then finally feed this wrapped function to the parasite (via the Host variable). I won't be able to write the code exactly as above, but we will get close... Anyway, hopefully this will make more sense very soon!

We will get there in two steps. First we code it up manually (so to speak) and secondly we will do the meta-programming with parasite.

Incidentally, we can exploit Crockford's 'shared secrets' technique to get the prototype chain working. Consider the following code.

function Person(name, proto) {
var p = object(proto || Person.prototype);
p.getName = function() {return name;};
return p;
}
Person.prototype = {
sayHello: function(){
return "Hello, my name is "+this.getName();
},
getName: function(){return 'Adam';}
};
Person('Karl') instanceof Person;//<-- true

The key here is the statement: object(proto || Person.prototype). This ensures that the object created has Person.prototype in its [[prototype]] chain. The proto ||-part is intended to be used as a 'shared secret' between a parasite 'sub type'/'sub power constructor'; the invariant is that proto || Person.prototype will always denote an object which is Person.prototype or has it in its prototype chain. It can be used as follows:

function Guru(name,topic,proto) {
var g = object(Person(name, proto || Guru.prototype));
g.getTopic = function() {return topic;};
return g;
}
Guru.prototype = object(Person.prototype);
Guru('Douglas Crockford','JavaScript') instanceof Guru;//<-- true
Guru('Douglas Crockford','JavaScript') instanceof Person;//<-- true
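To see that the shared secret keeps working one level further down, here is a sketch that repeats the definitions above and adds a third power constructor. Jedi and useForce are hypothetical names introduced only for this illustration.

```javascript
// The post's object function: creates a new object with o as its prototype.
var object = (function () {
    function F() {}
    return function (o) {
        F.prototype = o;
        return new F();
    };
})();

function Person(name, proto) {
    var p = object(proto || Person.prototype);
    p.getName = function () { return name; };
    return p;
}
Person.prototype = {
    sayHello: function () {
        return "Hello, my name is " + this.getName();
    },
    getName: function () { return 'Adam'; }
};

function Guru(name, topic, proto) {
    var g = object(Person(name, proto || Guru.prototype));
    g.getTopic = function () { return topic; };
    return g;
}
Guru.prototype = object(Person.prototype);

// Hypothetical third level, not part of the original post.
// It hands its own prototype down through Guru to Person.
function Jedi(name, topic, proto) {
    var j = object(Guru(name, topic, proto || Jedi.prototype));
    j.useForce = function () { return name + ' uses the Force'; };
    return j;
}
Jedi.prototype = object(Guru.prototype);

var yoda = Jedi('Yoda', 'JavaScript');
yoda instanceof Jedi;   // true
yoda instanceof Guru;   // true
yoda instanceof Person; // true
yoda.sayHello();        // "Hello, my name is Yoda"
```

The object at the bottom of the chain is always created by Person, but with whatever prototype the outermost constructor passed down, so every constructor on the way sees its own prototype in the chain.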

Actually, I feel some pleasure in the assignments:

Person.prototype = {
sayHello: function(){
return "Hello, my name is "+this.getName();
},
getName: function(){return 'Adam';}
};
and Guru.prototype = object(Person.prototype);. Intuitively, these objects are the 'prototypes' of the objects created by the power constructors, so it feels natural to make these assignments.

So far so good - we have instanceof working with power-constructors, but we can do better. The problem now is to do the meta-programming that will handle the 'secret-sharing' behind the scenes.

In this example, we will enhance Object.prototype and Function.prototype. For simplicity I've introduced the constraint that all power-constructor functions should take only one argument [however, I'm convinced this can be generalized to more than one argument].

Note, slightly off topic here...
In either case, I've developed a taste for single-argument functions. Consider:

/** power-constructor for Guru objects
 * @param conf {Object} a configuration object with properties:
 * name and topic
 * @return a Guru object with name: conf.name and topic: conf.topic.
 */
function Guru(conf){
    var g = object(Person(conf)),
        t = conf.topic || 'none';
    g.getTopic = function(){return t;};
    return g;
}
Guru({name:'Yoda', topic:'JavaScript'});

The disadvantage is that there is slightly more code to write. The advantages are (1) readability: Guru({name:'Yoda', topic:'JavaScript'}) can be read without having to consult the constructor function about which argument is the name and which is the topic; (2) optional/default arguments: you can leave out either of the arguments: Guru({name:'Yoda'}) and Guru({topic:'JavaScript'}) are both valid. In the multiple-argument constructor with name as the first parameter, you'd have to write Guru('Yoda') (which is fine) and Guru(undefined,'JavaScript') (which is not).

Back on track
The following is my implementation. I extend Object.prototype and Function.prototype so that we can write:

var OriginalPerson = {
sayHello: function(){
return "Hello, my name is "+this.getName();
},
getName: function(){return 'Adam';}
};

var Person = OriginalPerson.parasite(function(Host, conf) {
var p = object(Host()),
name = conf.name || 'Anonymous';
p.getName = function() {return name;};
return p;
});

var Guru = Person.parasite(function(Host, conf) {
var g = object(Host(conf)),
topic = conf.topic || 'none';
g.getTopic = function() {return topic;};
return g;
});
var h = Guru({
name: 'Douglas Crockford',
topic: 'JavaScript'
});
h instanceof Guru;//<-- true
h instanceof Person;//<-- true


I've implemented it as follows (with comments):

Object.prototype.parasite = function(parasite){
    /* This is the function returned as the result
       of this call; it represents a wrapper for the
       function in parameter parasite. wrapper will simply
       call the parasite function, but supplying a Host function
       as the first argument. If wrapper is called with proto === undefined
       then the Host function will create an object with its prototype === this,
       otherwise an object with prototype === proto is created (this lets
       sub-parasites supply the proto parameter).
    */
    function wrapper(conf,proto) {
        var p = proto;//Exercise: why is this necessary?
        Array.prototype.splice.call(arguments,0,0,function(){
            return object(p || wrapper.prototype);
        });
        return parasite.apply(this, arguments);
    }
    /* it is important that wrapper.prototype is set to this object, both so that
       o instanceof wrapper works, and so that objects created with
       object(p || wrapper.prototype) above will inherit properties of this */
    wrapper.prototype = this;
    return wrapper;
};
Function.prototype.parasite = function(parasite) {
    var host_cons = this;//the constructor function for the host of parasite

    /* Again, this function is the result of the computation.
       When called it splices a Host function on the 0'th pos in the arguments array.
       The Host function will call the host_cons and (important!) supplies an
       additional last argument (proto). If proto === undefined we are in the case
       where client code is calling wrapper, so we call the host_cons function
       supplying wrapper.prototype; if instead proto is provided we call host_cons
       with this object (this is the case where wrapper is called by a sub-parasite).
    */
    function wrapper(conf,proto) {
        var wrapper_this = this,
            p = proto;//Exercise: why?
        Array.prototype.splice.call(arguments,0,0, function() {
            Array.prototype.splice.call(arguments,arguments.length,0,
                                        p || wrapper.prototype);
            return host_cons.apply(wrapper_this,arguments);
        });
        return parasite.apply(this, arguments);
    }
    /* our prototype is an object which inherits properties from this.prototype,
       e.g., Guru.prototype inherits from Person.prototype.
    */
    wrapper.prototype = object(this.prototype);
    return wrapper;
};

Monday, February 18, 2008

Designing client/server web-applications

This particular entry will be the first in a series concerning some recent thoughts I've had about designing so-called 'rich' web-applications, which I will be thinking of as any distributed client/server application that has the following properties:

  • client and server communicate using HTTP
  • client is developed in JavaScript (GUI is made using CSS+HTML) and runs in an ("A-grade") browser

The series will cover a range of topics which should span all aspects of developing such an application. At present the topics are:

  • Patterns for client application design using Ext JS [JavaScript namespacing, module patterns, inheritance in JavaScript, Model-View-Controller in web clients, Observer pattern]
  • file organization, the build process and proper serving [splitting and building JavaScript programs, static checking, automated unit testing, JS 'compression', building and serving]
  • the RESTful server [designing RESTful backends to JS clients, Ruby on Rails implementation]


Throughout the series I will be developing a mini application called TriBook. The application is for booking meeting rooms at Trifork; it is developed using the JavaScript library Ext JS and Ruby on Rails for the server.

Part I, Section (i): JavaScript namespacing.

The WHAT:

Everyone knows that variables declared at the top level are global, e.g.,

function trim(str) {
return str.replace(/^\s+|\s+$/g, "");
}

is globally accessible using the name trim. When a page includes scripts from different sources that it may not control, there is a chance of one script overriding the value of another's variables (the value of PI may indeed change!). Namespacing together with naming conventions significantly reduces the probability of such unintended script interactions.

The namespacing conventions I will be following attempt to mimic packages in Java. A package name is a name of the form: x.y.z... with "x" the top level name of an organization, "y" the organization's domain and then one or more subdomains or application identifiers. Package names are lowercase characters. In the example application, I will be defining JavaScript objects that "live" in the following package: com.trifork.tribook.

Now, "hold on!" you may say, continuing: "JavaScript doesn't have namespaces or packages". While this is true, we can achieve something close using just objects. Many JavaScript libraries support some sort of a namespace function. For instance, Ext JS has the Ext.namespace function. The statement

Ext.namespace('com.trifork.tribook');

ensures that there exists a global variable com which references an object that has a property 'trifork', which in turn has a property 'tribook'. So you can write

Ext.namespace('com.trifork.tribook');
com.trifork.tribook.Application = { ... };
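What such a call has to guarantee can be written out by hand. In this sketch root is a hypothetical stand-in for the global object (window in a browser), and the Application object is a placeholder. It illustrates the resulting object structure only and says nothing about how Ext JS implements Ext.namespace.

```javascript
// `root` stands in for the global object so the sketch runs anywhere.
var root = {};

// Each level is created only if it does not exist yet, so an
// already-populated package is never overwritten.
root.com = root.com || {};
root.com.trifork = root.com.trifork || {};
root.com.trifork.tribook = root.com.trifork.tribook || {};

// Placeholder application object, for illustration only.
root.com.trifork.tribook.Application = { title: 'TriBook' };

// Running the same "ensure" steps again leaves existing content alone.
root.com = root.com || {};
root.com.trifork = root.com.trifork || {};
root.com.trifork.tribook = root.com.trifork.tribook || {};
root.com.trifork.tribook.Application.title; // 'TriBook'
```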

However, while many libraries have the 'namespace' function (or something similar), none that I am aware of have any additional support for actually using namespaces in JavaScript programs. (And this should be where this post hopefully gets interesting and new).

Suppose you are structuring your application by a kind of Model-View-Controller pattern [one way to do this will be the topic of a later posting!]. Naturally you are using namespacing, and you want an application structure with model, view and controller in distinct packages. In our example, we will have a package structure


com.
  trifork.
    tribook.
      model
      view
      controller


For example, the package com.trifork.tribook.model will contain a singleton object Model which contains the domain model of our application. We will develop our own namespace function which lets us define this layout in one statement:

namespace({
com:{
trifork:{
tribook:['model',
'view',
'controller']
}
}
});

Our namespace function is polymorphic in the sense that one can call it with one of several input types:

namespace('com.trifork.tribook');

namespace(['com.trifork.tribook.model','com.trifork.tribook.view',
'com.trifork.tribook.controller']);


and the form shown above. It is important to note that namespace('x.y') ensures that x.y exists: if it does not exist, it is created, and if it already exists, it is left alone (implying that namespace is idempotent).

The WHY?

We'll get back to the namespace function later. Here we continue with another function: using. This function lets us write code like:

namespace('com.trifork.tribook.model');

using(com.trifork.tribook.model).run(function(m){

m.Room = function(room_name) {
this.name = room_name;
...
};
});


The pattern: using(exp).run(function(n){...}); applies the inner function (function(n){...}) to the result of evaluating the expression exp.

In our example above, the code defines a constructor function for the Room domain objects; this function is accessed as com.trifork.tribook.model.Room (similarly to what one would do in a Java packaged world).

Now, as we shall see in the next couple of paragraphs, there are several benefits to structuring code this way:

(i) As with all namespacing: we don't clutter the global object and we reduce probability of unintended collisions.

(ii) using can take several parameters and introduces short names for deep namespaces.

namespace('long.boring.namespace.deep.as.well.controller');
using(long.boring.namespace.deep.as.well.model,
long.boring.namespace.deep.as.well.view).run(function(model,view) {

...
});

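Not only is it easier to write model than long.boring.namespace.deep.as.well.model; it is also more efficient, since model is a local variable and the chain of property lookups is performed once instead of at every use.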

(iii) If every JavaScript file in your application's file organization [and this will be a topic of a later posting] has the form:

namespace('xxx.yyy.zzz');

using(xxx.yyy.a,xxx.yyy.zzz).run(function(a,z){...});

Then two things are immediately clear to anyone reading the code (perhaps for the first time): (1) this file defines objects that live in xxx.yyy.zzz, and (2) the objects in this file depend on objects in the packages xxx.yyy.a and xxx.yyy.zzz. While you may not appreciate this immediately, I do believe that as JavaScript applications are getting larger and larger, we need all the help we can get in organizing the application (and I really think this benefit is useful).

(iv) private variables for free. In the statement:

namespace('x.y.controller');

using(x.y.model,x.y.view,x.y.controller).run(function(model,view,ctrl){

function trim(str) {
return str.replace(/^\s+|\s+$/g, "");
}
var PI = 3.14159;

ctrl.Controller = {
init: function(url) {
model.Model.url = trim(url);

},
getPI: function(){return PI;}
};

});
The function trim is private to the code defined within the function being "run". Hence the global namespace is not cluttered with functions that are intended to be used only locally. Also, no other script code can ever accidentally redefine PI, which sounds like a good thing ;-) (One may say that this structure lets one implement some of Douglas Crockford's patterns in a readable way!)
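The privacy claim can be checked with a small sketch. It repeats the using function from later in this post so that it runs on its own; the package object x is built by hand instead of with namespace, and the method name clean is hypothetical.

```javascript
// using, as implemented later in this post.
function using() {
    var args = arguments;
    return { run: function (inner) { return inner.apply(args[0], args); } };
}

// Hand-built stand-in for namespace('x.y.controller').
var x = { y: { controller: {} } };

using(x.y.controller).run(function (ctrl) {
    var PI = 3.14159; // visible only inside this function
    function trim(str) {
        return str.replace(/^\s+|\s+$/g, "");
    }
    ctrl.Controller = {
        clean: function (s) { return trim(s); }, // hypothetical method
        getPI: function () { return PI; }
    };
});

x.y.controller.Controller.clean('  room 1  '); // "room 1"
x.y.controller.Controller.getPI();             // 3.14159
typeof trim;                                   // "undefined": trim never leaked out
typeof PI;                                     // "undefined"
```

Only what is explicitly attached to ctrl becomes reachable from outside; everything else lives in the closure of the function passed to run.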


The HOW?
Ironically, when implementing namespace-using, I've decided not to use namespacing! The reason for this is that while

Trifork.namespace('x.y');

Trifork.using(x.y).run(function(y){...});

reduces probability of collision, the non-namespaced version feels much more like a language extension; in:

namespace('x.y');

using(x.y).run(function(y){...});

It almost feels as though 'namespace' and 'using' are language keywords and not user-defined functions. It is like working in an extended JavaScript language that has packages and imports -- although I would have preferred something like:

namespace x.y.z;

import x.y as y and y.z, y.w in function(y,z,w){
...
}


Implementing using is almost trivial:
using(exp).run(function(v){..}) is (almost) equivalent to (function(v){..})(exp). The full implementation is:

function using() {
var args = arguments;
return {run:function(inner){return inner.apply(args[0],args);}};
}

(Note that when running the inner function, 'this' is bound to the first argument of using.)
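The "almost" in "almost equivalent" comes down to that binding of 'this'. A small sketch makes the difference visible; the app object and its two packages are hypothetical and built by hand for the illustration.

```javascript
function using() {
    var args = arguments;
    return { run: function (inner) { return inner.apply(args[0], args); } };
}

// Hypothetical packages, built by hand for this sketch.
var app = { model: { name: 'model' }, view: { name: 'view' } };

var viaUsing = using(app.model, app.view).run(function (m, v) {
    // `this` is the first argument passed to using, i.e. app.model
    return [this === m, m.name, v.name];
});

var viaCall = (function (m, v) {
    // a plain immediately-invoked function: `this` is not app.model here
    return [this === m, m.name, v.name];
})(app.model, app.view);

// viaUsing is [true, 'model', 'view']
// viaCall  is [false, 'model', 'view']
```

Apart from 'this', both forms hand the same objects to the inner function under the same short names.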

The namespace function is more code, but relatively straightforward:

//Update: generalize array case to handle a mix of strings and objects via recursion.
//Update [June 11, 2008] Simplifications noticed by Aaron Gray, thx.

function namespace(spec,context) {
    var validIdentifier = /^(?:[a-zA-Z_]\w*[.])*[a-zA-Z_]\w*$/,
        i,N;
    context = context || window;
    spec = spec.valueOf();
    if (typeof spec === 'object') {
        if (typeof spec.length === 'number') {//assume an array-like object
            for (i=0,N=spec.length;i<N;i++) {
                namespace(spec[i],context);
            }
        }
        else {//spec is a specification object, e.g., {com: {trifork: ['model','view']}}
            for (i in spec) if (spec.hasOwnProperty(i)) {
                context[i] = context[i] || {};
                namespace(spec[i], context[i]);//recursively descend tree
            }
        }
    } else if (typeof spec === 'string') {
        (function handleStringCase(){
            var parts;
            if (!validIdentifier.test(spec)) {
                throw new Error('"'+spec+'" is not a valid name for a package.');
            }
            parts = spec.split('.');
            for (i=0,N=parts.length;i<N;i++) {
                spec = parts[i];
                context[spec] = context[spec] || {};
                context = context[spec];
            }
        })();
    }
    else {
        throw new TypeError();
    }
}
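The three call forms and the idempotence claim can be tried without touching the global object by passing an explicit context as the second argument. The sketch repeats the function above so that it runs on its own; root and the marker property are hypothetical and exist only for the illustration.

```javascript
function namespace(spec, context) {
    var validIdentifier = /^(?:[a-zA-Z_]\w*[.])*[a-zA-Z_]\w*$/,
        i, N;
    context = context || window; // window is only consulted when no context is given
    spec = spec.valueOf();
    if (typeof spec === 'object') {
        if (typeof spec.length === 'number') { // array-like: handle each entry
            for (i = 0, N = spec.length; i < N; i++) {
                namespace(spec[i], context);
            }
        } else { // specification object: descend into each own property
            for (i in spec) if (spec.hasOwnProperty(i)) {
                context[i] = context[i] || {};
                namespace(spec[i], context[i]);
            }
        }
    } else if (typeof spec === 'string') {
        (function handleStringCase() {
            var parts;
            if (!validIdentifier.test(spec)) {
                throw new Error('"' + spec + '" is not a valid name for a package.');
            }
            parts = spec.split('.');
            for (i = 0, N = parts.length; i < N; i++) {
                spec = parts[i];
                context[spec] = context[spec] || {};
                context = context[spec];
            }
        })();
    } else {
        throw new TypeError();
    }
}

var root = {}; // hypothetical stand-in for the global object

// String form, then mark the package so we can see it survive.
namespace('com.trifork.tribook', root);
root.com.trifork.tribook.marker = 'kept';

// Array form and object form, both touching the same package again.
namespace(['com.trifork.tribook.model', 'com.trifork.tribook.view'], root);
namespace({ com: { trifork: { tribook: ['controller'] } } }, root);

root.com.trifork.tribook.marker;            // 'kept': existing packages are left alone
typeof root.com.trifork.tribook.controller; // 'object'
```

An invalid name such as '1bad' is rejected by the validIdentifier check with an Error before anything is created.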