Beyond Source Maps

March 12th, 2014

There's been some recent talk on es-discuss about standardizing source maps in ECMAScript 7. Before that happens, we should take a moment to reflect on what source maps have done well, where they are lacking, and meditate on what a more perfect debug format for compilers targeting JavaScript might be. My response quickly outgrew an email reply, and so I am collecting my thoughts and posting them here.

Before diving in, it makes sense to tell you where I am coming from. I implemented the mozilla/source-map library and the source map support in the Firefox Developer Tools' debugger. I'm a coauthor of the document specifying source maps, although the actual design of the format was done by John Lenz. I mostly documented and polished the parts that weren't about the format itself, such as how to find a script's source map. On top of that, I've debugged scripts with source maps compiled from browserify, the r.js AMD module optimizer, CoffeeScript, ClojureScript, Emscripten, UglifyJS, Closure Compiler, and more.

When do source maps work well?

Source maps work pretty well for:

When do source maps fall short?

Whenever you start doing "real compilation" instead of "transpiling" and your source language's semantics don't match JavaScript's semantics, source maps break down:

A stepping debugger isn't useful without the ability to inspect bindings and values in the environment, and the above issues create humungous hurdles for doing that!

Moreover, using the console as a REPL, adding a watch expression, or setting a new conditional breakpoint must be done with JavaScript, not the source language.

SourceMap.next?

Requirements

An ideal SourceMap.next format should support the existing use of source maps:

Additionally, any SourceMap.next format should support the debugging needs of JavaScript object code that the existing source map format fails to satisfy:

A Kernel of a Solution

The ideas I present here aren't fully baked yet, but I have confidence in the approach.

Because a SourceMap.next must do more than translate file, line, and column locations, it no longer makes sense to build the format on location mappings. Instead, it should annotate the abstract syntax tree of the JavaScript object code with debugging information. A location mapping attaches debugging information to a line and column in the JavaScript object code, which can then only be approximately translated into a specific JavaScript statement or expression. Annotating the AST attaches debugging information directly to a specific JavaScript statement or expression. It cuts out the middleman while simultaneously simplifying the architecture. Many existing tools are already using this approach: they create a JavaScript AST, annotate it with source language locations, and use escodegen to generate both the JavaScript object code and a source map!

Annotating the AST enables us to take advantage of the hierarchy present in the tree. If you are searching for a specific annotation and the current AST node does not have it, recurse on the node's parent. This makes it easy to provide either very detailed information or a less fine-grained summary. For example, in the following AST, rather than requiring source location annotations on every AssignmentExpression, Identifier, BinaryExpression, and Literal node, one could simply add the annotation to the AssignmentExpression node. Then, any query for source location information on any of the other nodes would bubble up to the AssignmentExpression and yield its annotated source location information.

AssignmentExpression
  operator: "="
  left: Identifier
    name: "answer"
  right: BinaryExpression
    operator: "*"
    left: Literal
      value: 6
    right: Literal
      value: 7
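
The bubbling lookup described above can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: the `annotations` and `parent` properties, the `sourceLocation` annotation kind, the `answer.example` file name, and the helper names `linkParents` and `findAnnotation` are all hypothetical and do not come from any existing format or library.

```javascript
// Hypothetical helpers for illustration; not part of any existing library.

// Record each node's parent so that lookups can walk up the tree.
function linkParents(node, parent = null) {
  node.parent = parent;
  for (const [key, value] of Object.entries(node)) {
    if (key === "parent" || key === "annotations") continue;
    const children = Array.isArray(value) ? value : [value];
    for (const child of children) {
      if (child !== null && typeof child === "object" &&
          typeof child.type === "string") {
        linkParents(child, node);
      }
    }
  }
  return node;
}

// Return the nearest annotation of the given kind, starting at `node`
// and bubbling up through its ancestors. Yields undefined if none has it.
function findAnnotation(node, kind) {
  for (let n = node; n != null; n = n.parent) {
    if (n.annotations &&
        Object.prototype.hasOwnProperty.call(n.annotations, kind)) {
      return n.annotations[kind];
    }
  }
  return undefined;
}

// The AST from above, with a single (assumed) annotation on the root.
const ast = linkParents({
  type: "AssignmentExpression",
  operator: "=",
  left: { type: "Identifier", name: "answer" },
  right: {
    type: "BinaryExpression",
    operator: "*",
    left: { type: "Literal", value: 6 },
    right: { type: "Literal", value: 7 },
  },
  annotations: {
    sourceLocation: { source: "answer.example", line: 1, column: 0 },
  },
});

// The Literal `6` has no annotation of its own, so the query bubbles up
// to the AssignmentExpression and returns its source location.
console.log(findAnnotation(ast.right.left, "sourceLocation"));
```

A compiler that wants finer-grained locations could add its own `sourceLocation` to the BinaryExpression, and queries on the two Literal nodes would then stop there instead of reaching the root.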

Here are a few debugging annotations that should cover the requirements listed above:

"Solving the SourceMap.next problem once and for all" reduces to defining a compact binary encoding of JavaScript's AST that supports an arbitrary number of arbitrary debugging annotations on each node. The beauty of this approach is that as long as we take care that this format is future-extensible with new types of debugging annotations, we can resolve any future issues without needing to update the binary format. We just add new debugging annotations and keep the old annotations for legacy consumers until they die off. The only time the binary format changes is when a new version of the ECMAScript standard is released and defines new syntactic forms that the AST must incorporate.

The final affordance of having an extensible debugging format is that compilers which target JavaScript can progressively add more debugging information over time. Since specific annotations are optional, the compiler can simply skip those for which it can't provide information. When a later version of the compiler can provide that information, it can start adding those annotations.
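
To make the extensibility argument concrete, here is a minimal sketch of how optional annotations might behave, assuming annotations are stored per node as a map from kind to value. The kinds `sourceLocation` and `sourceName`, the consumer tables, and the `describe` function are hypothetical names chosen for this example only.

```javascript
// Hypothetical annotation kinds and consumers, for illustration only.

// A consumer is modeled as a table of the annotation kinds it understands.
const legacyConsumer = new Map([
  ["sourceLocation", (v) => `${v.source}:${v.line}:${v.column}`],
]);

// A newer consumer understands everything the legacy one does, plus more.
const newerConsumer = new Map([
  ...legacyConsumer,
  ["sourceName", (v) => `source name ${v}`],
]);

// Interpret the annotations a consumer knows about and skip the rest.
function describe(node, consumer) {
  const lines = [];
  for (const [kind, value] of Object.entries(node.annotations ?? {})) {
    const handler = consumer.get(kind);
    if (handler !== undefined) lines.push(handler(value));
  }
  return lines;
}

// An early compiler version can only provide source locations.
const fromOldCompiler = {
  type: "Identifier",
  name: "a$1",
  annotations: {
    sourceLocation: { source: "main.example", line: 3, column: 4 },
  },
};

// A later version adds a new annotation and leaves the old one in place.
const fromNewCompiler = {
  ...fromOldCompiler,
  annotations: { ...fromOldCompiler.annotations, sourceName: "a" },
};

console.log(describe(fromOldCompiler, newerConsumer));
console.log(describe(fromNewCompiler, legacyConsumer));
console.log(describe(fromNewCompiler, newerConsumer));
```

In this sketch, old compiler output still works in a new debugger, and new compiler output still works in an old debugger, because each side simply ignores the annotation kinds it cannot produce or interpret.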
