A Property By Any Other Name, Part 3

Tuesday, January 25, 2011 | 2:15 PM

This is the last in a series of blog posts on how Closure Compiler decides what properties. Part 1 was about the --compilation_level=ADVANCED_OPTIMIZATIONS flag and part 2 was about renaming policies that didn't work. This blog post will be a bit of a grab bag of newer tools for better property renaming.

Using Type Checking to Fill in Missing Externs


Your externs file should declare all the properties that are defined outside of your program. If you haven't done this before, then finding all those properties can be a pain. Fortunately, Closure Compiler can point us in the right direction by checking for missing properties.

There are a few ways to turn on the missing properties check. For example, you can specify the flags:
--warning_level=VERBOSE --jscomp_warning=missingProperties
or
--jscomp_warning=checkTypes
The compiler will try to find places where you've used dot access (foo.bar) to read a property that can't possibly be defined on that object, perhaps because you've forgotten to declare it in your externs.

Notice that this check is subtly different than compiler checks in more static languages, like Java. The Java compiler uses a "must-define" approach: the property bar must be declared on all possible values of foo (except null), or it will be a compiler error. Closure Compiler uses a "may-define" approach. It only requires that some possible value of foo has a property bar. For example, if you have:

function f(x) {
return x.apartment;
}


Closure Compiler will emit a warning if the apartment property is not assigned anywhere in the program. Similarly, if you have:

/** @param {Element} x */
function f(x) {
return x.apartment;
}


the compiler will emit a warning if the apartment property is not assigned on any object that could possibly be an Element. It will not emit a warning if apartment is assigned on some specific subtype of Element, like HTMLTextAreaElement.

As you can see, you don't need complete type annotations to use this check, but more type annotations will get better results.

Using Type Analysis for Better Renaming


If a property is listed in the externs file under the "All Unquoted" naming policy, that property will not be renamed anywhere in the program. This is a bummer. We went through a lot of trouble to make the Closure Library events API consistent with the native browser events API. But because we gave the methods the same name as extern methods, those methods can't be renamed. Could we use type information to differentiate between method calls on user-defined objects from method calls on browser-defined objects?

We can. Closure Compiler's Java API has two options: disambiguateProperties and ambiguateProperties.

Disambiguation means that Closure Compiler will look at all accesses to a property
x in the program. If two types have a property xx// externs file
/** @constructor */ function Element() {}
Element.prototype.id;

// user code
/** @constructor */
function MyWidget() {
this.id = 3;
}

Disambiguate properties will rename this to something like:

/** @constructor */
function MyWidget() {
this.MyWidget$id = 3;
}


By design, disambiguate properties gives things very verbose names, on the assumption that they will be given shorter names by the "All Unquoted" naming policy. This makes it easier to debug, because you can turn off this optimization independently of "all unquoted" property renaming.

Disambiguate properties allows us to rename some properties that are in the externs file, but it creates a new problem: there are more unique properties, which makes the gzipped code bigger. To solve this problem, we use ambiguateProperties to minify the number of unique properties. Ambiguate properties will look at two property names on different objects such that there's no chance those objects will appear in the same variable. Then it will give those properties the same name.

Disambiguate and ambiguate properties are very conservative. They will only rename things if they are reasonably sure that it's safe to do so. (They can never be absolutely sure, because you could always pass in external objects that violate the declared types.) These optimizations only make sense when used in conjunction with "All Unquoted" renaming.

Short Names Aren't Necessarily Better


So far in this series, we've been assuming that short names make your binary smaller, and will always be better than long names.

That makes sense if you're sending your entire JS file across the network. But what if your visitors had old versions of your JS in their cache? You might want to figure out what version they have, and send them only the parts that had changed.

This is called delta encoding. If the compiler is choosing the shortest possible names for your properties, then small changes to your JS may change every compiled property in the output. The delta between two versions may be as large as the original files.

What we really want is a way to tell the compiler, "give these properties the shortest possible names, unless you've seen them before, and in that case give them the same name you gave them last time." The compiler has 4 flags for this:

--variable_map_input_file
--variable_map_output_file
--property_map_input_file
--property_map_output_file


The output maps from one compilation can be used as the input map for the next compilation. Although the compiler cannot guarantee that a property will be renamed the same way in both binaries, it will make a best-effort attempt to do so.

That covers most of our major property renaming policies. If you have ideas for more, let us know at the Closure Compiler discussion group.

2 comments:

trooper said...

Very informative series of articles. Thanks :)

One question: do browsers (in conjunction with web servers) actually use delta encoding when obtaining updated versions of files?

Nick said...

@trooper: I've seen a couple specifications for delta-encoding, but it hasn't really become mainstream.

The SDCH protocol (http://groups.google.com/group/SDCH) definitely works better when the diffs between versions are small. AFAIR, there are plugins for it for Chrome, FF, IE, and Apache.