High-level overview of a compilation job

Thursday, March 3, 2016 | 9:31 PM

This post gives a short overview of Closure Compiler's code, and the main classes that are involved in a compilation. It is intended for people who are getting started with the compiler, and want to make changes to it and understand how it works.

For each compilation, an instance of Compiler is created. The compiler class contains several compile methods, and they all end up calling compileInternal.

The CommandLineRunner takes the command-line arguments and creates an instance of CompilerOptions. The compiler uses the options object to determine which passes will be run (some checks and some optimizations). This happens in DefaultPassConfig. The two important methods in this class are getChecks and getOptimizations.

Before running the checks, the compiler parses the code and creates an abstract-syntax tree (AST). The structure of the AST is described in Node, IR and Token. NodeUtil contains many static utility functions for manipulating the AST.

PhaseOptimizer takes the list of passes created in the pass config and runs them. Running the checks is simple, we just go through the list of checks and run each check once. Some optimization passes run once, and others run in a loop until they can no longer make changes. During an optimization loop, the compiler tries to avoid running passes that are no longer making changes. If you are experimenting with the compiler and want to see the code after each pass, use the command-line flag --print_source_after_each_pass. If you want to see how long each pass takes, and how each pass changes code size, use the flag --tracer_mode=ALL.

After all checks and optimizations are finished, the AST is converted back to JavaScript source. See CodeGenerator and CodePrinter.

This is basically it. Below, we briefly describe some of the common compiler passes.

NodeTraversal has utility methods to traverse the AST. All compiler passes use a traversal to go through the code, rather than hand-written recursion.

The type-checking code lives in TypedScopeCreator, TypeInference and TypeCheck. The code for the new type checker (still under development) lives in GlobalTypeInfo and NewTypeInference.

To see some of the optimizations, start at getMainOptimizationLoop and look at the passes used there.

Also, the debugger is really useful. You can paste some JS, turn individual passes on and off, and see the AST, the generated code, and the compiler warnings.

Posted by Dimitris Vardoulakis, Software Engineer

2 comments:

Jonny said...

So useful information for me! Like it!

Jon said...

This is great, thank you for the overview!