Type-checking Tips
Friday, February 24, 2012 | 9:02 AM
Closure Compiler’s type language is a bit complicated. It has unions (“variable x can be A or B”), structural functions (“variable x is a function that returns a number”), and record types (“variable x is any object with properties foo and bar”).
A lot of people have told us that that’s still not expressive enough. There are many ways that you can write JavaScript that do not fit cleanly into our type system. People have suggested that we should add mixins, traits, and post-hoc naming of anonymous objects.
This was not particularly surprising to us. The rules of objects in JavaScript are a little bit like the rules of Calvinball. You can change anything, and make up new rules as you go along. A lot of people think that a good type system gives you a powerful way to describe how your program is structured. But it also gives you a set of rules. The type system ensures that everyone agrees on what a “class” is, and what an “interface” is, and what “is” means. When you’re trying to add type annotations to untyped JS, you’re inevitably going to run into issues where the rules in my head don’t quite match the rules in your head. That’s OK.
But we were surprised that when we gave people this type system, they often found multiple ways to express the same thing. Some ways worked better than others. I thought I’d write this to describe some of the things that people tried, and how they worked out.
Function vs. function()
There are two ways to describe a function. One is to use the {Function} type, which the compiler literally interprets as “any object x where ‘x instanceof Function’ is true”. A {Function} is deliberately mushy. It can accept any arguments, and return anything. You can even use ‘new’ on it. The compiler will let you call it however you want without emitting warnings.
A structural function is much more specific, and gives you fine-grained control over what the function can do. A {function()} takes no arguments, but we don’t care what it returns. A {function(?): number} returns a number and takes exactly one argument, but we don’t care about the type of that argument. A {function(new:Array)} creates an Array when you call it with “new”. Our type documentation and JavaScript style guide have more examples of how to use structural functions.
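As a rough sketch, here is how those three structural function types might appear on parameters. The helper names (runLater, measureValue, makeList) are made up for this illustration and are not part of any library.

```javascript
/**
 * Calls a callback that takes no arguments; its return value is ignored.
 * @param {function()} callback
 */
function runLater(callback) {
  callback();
}

/**
 * Applies a one-argument function that must return a number.
 * @param {function(?): number} measure
 * @param {*} value
 * @return {number}
 */
function measureValue(measure, value) {
  return measure(value);
}

/**
 * Accepts a constructor that creates an Array when called with "new".
 * @param {function(new:Array)} ctor
 * @return {!Array}
 */
function makeList(ctor) {
  return new ctor();
}

var ran = false;
runLater(function() { ran = true; });

var wordLength = measureValue(function(s) { return String(s).length; }, 'burger');

var list = makeList(Array);
```

The annotations only matter to the type checker; at runtime this is ordinary JavaScript.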
A lot of people have asked us if {Function} is discouraged, because it’s less specific. Actually, it’s very useful. For example, consider the definition of Function.prototype.bind. It lets you curry functions: you can give it a function and a list of arguments, and it will give you back a new function with those arguments “pre-filled in”. It’s impossible for our type system to express that the returned function type is a transformation of the type of the first argument. So the JSDoc on Function.prototype.bind says that it returns a {Function}, and the compiler has to have hand-coded logic to figure out the real type.
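To make the bind case concrete, here is a small sketch using a made-up add function. Note that bind takes the "this" value first and the pre-filled arguments after it. The declared return type is only {Function}; any more precise type for addFive would have to come from the compiler's special handling described above.

```javascript
/**
 * @param {number} a
 * @param {number} b
 * @return {number}
 */
function add(a, b) {
  return a + b;
}

// Pre-fill the first argument. The first parameter to bind is the "this"
// value, which add does not use, so null is passed.
var addFive = add.bind(null, 5);

// addFive now behaves like a function of one number.
var result = addFive(10);
```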
There are also many cases where you want to pass a callback function to collect results, but the results are context-specific.
rpc.get('MyObject', function(x) {
  // process MyObject
});
The “rpc.get” method is a lot clumsier if the callback you pass has to type-cast everything it receives. So it’s often easier just to give the parameter a {Function} type, and accept that the callback’s exact signature isn’t worth type-checking.
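Here is one way such a declaration might look. The rpc object below is a hypothetical stand-in with canned responses, written only to show where the {Function} annotation goes. It is not a real RPC library.

```javascript
// Hypothetical stand-in for an RPC layer, for illustration only.
var rpc = {};

// Canned responses keyed by name, in place of a real server.
rpc.fakeResponses_ = {'MyObject': {id: 42}};

/**
 * @param {string} name
 * @param {Function} callback Receives whatever came back for that name.
 */
rpc.get = function(name, callback) {
  callback(rpc.fakeResponses_[name]);
};

var receivedId = null;
rpc.get('MyObject', function(x) {
  // No cast needed here: the callback is deliberately loosely typed.
  receivedId = x.id;
});
```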
Object vs. Anonymous Objects
Many JS libraries define a single global object with lots of methods. What type annotation should that object have?
var bucket = {};
/** @param {number} stuff */ bucket.fill = function(stuff) {};
If you come from Java, you may be tempted to just give it type {Object}.
/** @type {Object} */ var bucket = {};
/** @param {number} stuff */ bucket.fill = function(stuff) {};
That’s usually not what you want. If you add a “@type {Object}” annotation, you’re not just telling the compiler “bucket is an Object.” You’re telling it “bucket is allowed to be any Object.” So the compiler has to assume that anybody can assign any object to “bucket”, and the program would still be type-safe.
Instead, you’re often better off using @const.
/** @const */ var bucket = {};
/** @param {number} stuff */ bucket.fill = function(stuff) {};
Now we know that no other object can be assigned to bucket, and the compiler’s type inference engine can make much stronger checks on bucket and its methods.
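A slightly fuller sketch of the @const version follows. The level property is an assumption added so that fill has something to do. The commented-out lines show the kinds of mistakes I would expect the stronger checks to catch, though the exact warnings depend on your compiler version and flags.

```javascript
/** @const */
var bucket = {};

/**
 * Running total, added only for this illustration.
 * @type {number}
 */
bucket.level = 0;

/** @param {number} stuff */
bucket.fill = function(stuff) {
  bucket.level += stuff;
};

bucket.fill(3);       // Matches the declared {number} parameter.
// bucket.fill('3');  // A string argument, which the checker can now flag.
// bucket = {};       // A reassignment, which @const rules out.
```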
Can Everything Just Be a Record Type?
JavaScript’s type system isn’t that complicated. It has 8 types with special syntax: null, undefined, boolean, number, string, Object, Array, and Function. Some people have noticed that record types let you define “an object with properties x, y, and z”, and that typedefs let you give a name to any type expression. So between the two, you should be able to define any user-defined type with record types and typedefs. Is that all we need?
Record types are great when you need a function to accept a large number of optional parameters. So if you have this function:
/**
* @param {boolean=} withKetchup
* @param {boolean=} withLettuce
* @param {boolean=} withOnions
*/
function makeBurger(withKetchup, withLettuce, withOnions) {}
you can make it a bit easier to invoke like this:
/**
 * @param {{withKetchup: (boolean|undefined),
 *          withLettuce: (boolean|undefined),
 *          withOnions: (boolean|undefined)}=} options
 */
function makeBurger(options) {}
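For a sense of what this buys the caller, here is a sketch with a body filled in. The body and the returned list of topping names are assumptions made for the example; the original function is empty.

```javascript
/**
 * @param {{withKetchup: (boolean|undefined),
 *          withLettuce: (boolean|undefined),
 *          withOnions: (boolean|undefined)}=} options
 * @return {!Array<string>} Topping names; a made-up return value.
 */
function makeBurger(options) {
  // The whole record is optional, so fall back to an empty object.
  var opts = options || {};
  var toppings = [];
  if (opts.withKetchup) toppings.push('ketchup');
  if (opts.withLettuce) toppings.push('lettuce');
  if (opts.withOnions) toppings.push('onions');
  return toppings;
}

// Callers name only the options they care about, in any order.
var plain = makeBurger();
var loaded = makeBurger({withOnions: true, withKetchup: true});
```

Compare the second call with the positional version, which would have to be written makeBurger(true, undefined, true).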
This works well. But when you use the same record type in many places across a program, things can get a bit hairy. Suppose you create a type for makeBurger’s parameter:
/**
 * @typedef {{withKetchup: (boolean|undefined),
 *            withLettuce: (boolean|undefined),
 *            withOnions: (boolean|undefined)}}
 */
var BurgerToppings;
/** @const */
var bobsBurgerToppings = {withKetchup: true};
function makeBurgerForBob() {
  return makeBurger(bobsBurgerToppings);
}
Later, Alice builds a restaurant app on top of Bob’s library. In a separate file, she tries to add onions, but screws up the API.
bobsBurgerToppings.withOnions = 3;
Closure Compiler will notice that bobsBurgerToppings no longer matches the BurgerToppings record type. But it won’t complain about Alice’s code. It will complain that Bob’s code is making the type error. For non-trivial programs, it might be very hard for Bob to figure out why the types don’t match anymore.
A good type system doesn’t just express contracts about types. It also gives us a good way to assign blame when code breaks the contract. Because a class is usually defined in one place, the compiler can figure out who’s responsible for breaking the definition of the class. But when you have an anonymous object that’s passed to many different functions, and has properties set from many disparate places, it’s much harder, for both humans and compilers, to figure out who’s breaking the type contracts.
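One way to get that blame assignment back is to give the toppings a class. What follows is a sketch under assumptions, not a prescription: the BurgerToppings constructor and the countToppings helper are hypothetical replacements for the typedef and makeBurger above.

```javascript
/**
 * Hypothetical class standing in for the BurgerToppings record type.
 * @constructor
 */
function BurgerToppings() {
  /** @type {boolean} */
  this.withKetchup = false;
  /** @type {boolean} */
  this.withLettuce = false;
  /** @type {boolean} */
  this.withOnions = false;
}

/**
 * Hypothetical consumer that accepts only BurgerToppings instances.
 * @param {!BurgerToppings} toppings
 * @return {number} How many toppings are switched on.
 */
function countToppings(toppings) {
  var count = 0;
  if (toppings.withKetchup) count++;
  if (toppings.withLettuce) count++;
  if (toppings.withOnions) count++;
  return count;
}

/** @const */
var bobsBurgerToppings = new BurgerToppings();
bobsBurgerToppings.withKetchup = true;

// Alice's mistake, left commented out. Because withOnions is declared as
// {boolean} in one place, the mismatch should be reported on this line,
// in Alice's file, rather than at Bob's call site.
// bobsBurgerToppings.withOnions = 3;

var toppingCount = countToppings(bobsBurgerToppings);
```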
Posted by Nick Santos, Software Engineer
4 comments:
Unknown said...
Excellent post Nick. Thank you.
February 24, 2012 at 12:01 PM
Anonymous said...
Great post, Nick. Re: your last point about not overusing record types, do you suggest any alternate methods for enforcing contracts?
Relatedly, what do you suggest to enforce such contracts at runtime? This is particularly important when validating API responses, which return data blobs that need to be validated before they can be used.
April 4, 2012 at 12:49 PM
Anonymous said...
Ah, I believe I understand what you're suggesting, now. You're saying that for complex applications, it's best to define classes rather than rely on anonymous objects to enforce contracts.
April 4, 2012 at 3:07 PM
Tim said...
Thank you, and Google for making such an amazing library available as open source. I think a lot of the javascript community doesn't (yet) appreciate the enormous value these tools represent, because most are still writing small web applications. But as javascript client gets fatter, (and when Microsoft releases their own competing library), the popularity of Google Closure Tools will only grow.
Cheers,
July 31, 2012 at 10:28 AM