
Optimizing ClojureScript Function Invocation

16 March 2015

For all the available abstractions, the #1 tool of any Clojure and ClojureScript programmer is the humble function. That makes it extremely important to optimize function invocation. This is not as straightforward as it may seem when compiling to JavaScript, because many Clojure (and thus ClojureScript) functions leverage multiple arities.

On the JVM this is handled via the clojure.lang.IFn interface, as the JVM already has good support for methods with multiple arities. When writing JavaScript by hand, this feature is simulated by dispatching on the length property of the magic arguments object available to any JavaScript function. However, using the arguments object is not free, so we would rather not pay this cost if we don't have to.
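To make the hand-written approach concrete, here is a minimal sketch of dispatching on arguments.length. The function name nthByHand and its bounds-checking behavior are assumptions made for illustration; this is not the code the ClojureScript compiler emits.

```javascript
// Hypothetical hand-written multi-arity function (not compiler output).
// The number of arguments actually passed selects the behavior.
function nthByHand(coll, n, notFound) {
  switch (arguments.length) {
    case 2:
      // Two-argument form: an out-of-range index is an error.
      if (n < 0 || n >= coll.length) throw new Error("Index out of bounds");
      return coll[n];
    case 3:
      // Three-argument form: an out-of-range index yields the fallback.
      return n >= 0 && n < coll.length ? coll[n] : notFound;
    default:
      throw new Error("Invalid arity: " + arguments.length);
  }
}

console.log(nthByHand([1, 2, 3], 1));         // 2
console.log(nthByHand([1, 2, 3], 4, "oops")); // oops
```

Every call pays for touching arguments and for the switch, even when the caller knows exactly which arity it wants.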

In the following we'll dig into the JavaScript generated by the ClojureScript compiler and the differences between the available optimization settings. All you need is an install of Java 8.

Run the following in a terminal in a directory of your choosing:

mkdir -p compiler_fun/src/compiler_fun
cd compiler_fun
touch src/compiler_fun/core.cljs
curl -OL https://github.com/clojure/clojurescript/releases/download/r3123/cljs.jar
touch build.clj

Use your favorite text editor to edit src/compiler_fun/core.cljs. Make it look like the following:

(ns compiler-fun.core)

(nth [1 2 3] 4 :oops)

Let's compile this simple program. Edit build.clj to look like the following:

(require 'cljs.closure)

(cljs.closure/build "src"
  {:output-to "out/main.js"
   :verbose true})

Let's compile:

java -cp cljs.jar:src clojure.main build.clj

Examine out/compiler_fun/core.js. Notice that nth got compiled to something like the following:

cljs.core.nth.call(null,new cljs.core.PersistentVector(...));

You're probably wondering why we would do the obviously slow thing and go through call here. In ClojureScript, data structures are functions too. By invoking through call, higher-order call sites can invoke data structures as well as functions.
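A rough sketch of why going through call gives a uniform call site. ToyVector and invokeWith are hypothetical names invented here; ToyVector is only a stand-in and not the real PersistentVector.

```javascript
// Hypothetical stand-in for a data structure that can be invoked.
function ToyVector(items) {
  this.items = items;
}

// The type supplies its own `call` method. The first parameter mirrors the
// `this` slot of Function.prototype.call and is ignored here.
ToyVector.prototype.call = function (_unused, index) {
  return this.items[index];
};

// A higher-order call site written in the "always go through call" style.
function invokeWith(f, arg) {
  return f.call(null, arg);
}

const inc = function (x) { return x + 1; };
const v = new ToyVector(["a", "b", "c"]);

console.log(invokeWith(inc, 1)); // 2
console.log(invokeWith(v, 1));   // b
```

The same call site works for a real function (via Function.prototype.call) and for the data structure (via its own call method), without inspecting what it was handed.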

If we examine out/cljs/core.js and search for the definition of cljs.core.nth, we'll see a fairly large function that is invoked immediately. This function creates the arity dispatcher. The dispatcher looks something like the following and, interestingly, has the different arities assigned as properties as well. We'll see how these properties are used shortly:

cljs$core$nth = function(coll, n, not_found) {
    switch (arguments.length) {
        case 2:
            return cljs$core$nth__2.call(this, coll, n);
        case 3:
            return cljs$core$nth__3.call(this, coll, n, not_found);
    }
    throw (new Error('Invalid arity: ' + arguments.length));
};
cljs$core$nth.cljs$core$IFn$_invoke$arity$2 = cljs$core$nth__2;
cljs$core$nth.cljs$core$IFn$_invoke$arity$3 = cljs$core$nth__3;
return cljs$core$nth;

Why do we use call again here, paired with this? The dispatcher might end up as a method on a data structure, in which case this has to be passed through to the arity implementation. That is useless in this context, but it keeps the compiler uniform. We won't be going through the dispatcher in production anyway.
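Here is a small sketch of the case where forwarding this matters. The names toyMap, lookup, lookup1 and lookup2 are hypothetical, and the shape is simplified compared with real compiler output.

```javascript
// Hypothetical per-arity implementations that rely on `this`.
const lookup1 = function (k) {
  return this.entries[k];
};
const lookup2 = function (k, notFound) {
  return k in this.entries ? this.entries[k] : notFound;
};

// Dispatcher in the same style as the generated code: it forwards its own
// `this` to the selected arity via call.
const lookup = function (k, notFound) {
  switch (arguments.length) {
    case 1:
      return lookup1.call(this, k);
    case 2:
      return lookup2.call(this, k, notFound);
  }
  throw new Error("Invalid arity: " + arguments.length);
};

// Installed as a method, the dispatcher receives the object as `this`.
const toyMap = { entries: { a: 1 }, lookup: lookup };

console.log(toyMap.lookup("a"));         // 1
console.log(toyMap.lookup("b", "oops")); // oops
```

Had the dispatcher called lookup1(k) directly, this would not be the map inside the arity implementation. For a top-level function like nth the forwarding does nothing useful, which is the uniformity trade-off mentioned above.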

Let's change build.clj to the following:

(require 'cljs.closure)

(cljs.closure/build "src"
  {:output-to "out/main.js"
   :optimizations :simple
   :static-fns true
   :pretty-print true
   :verbose true})

(System/exit 0)

And rebuild:

java -cp cljs.jar:src clojure.main build.clj

This time we examine out/main.js. Since we use a Google Closure optimization pass, we end up with a single JavaScript file. By setting :static-fns we are asking the ClojureScript compiler to leverage the static information it has about functions at each call site. If we look at the end of the file we'll see something like this:

cljs.core.nth.cljs$core$IFn$_invoke$arity$3(new cljs.core.PersistentVector(...));

The indirection of call has disappeared, which means we're no longer switching on arguments.length. The ClojureScript compiler has static information about nth: it knows precisely which arities it supports and can emit a direct call to the matching one.
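The two call styles can be compared side by side in a toy version. The names toyNth, arity$2 and arity$3 are shortened, hypothetical stand-ins for the generated names, and the bodies are simplified to work on plain arrays.

```javascript
// Toy multi-arity function: a dispatcher with per-arity properties attached.
const toyNth = (function () {
  const nth2 = function (coll, n) {
    if (n < 0 || n >= coll.length) throw new Error("Index out of bounds");
    return coll[n];
  };
  const nth3 = function (coll, n, notFound) {
    return n >= 0 && n < coll.length ? coll[n] : notFound;
  };
  const dispatcher = function (coll, n, notFound) {
    switch (arguments.length) {
      case 2:
        return nth2.call(this, coll, n);
      case 3:
        return nth3.call(this, coll, n, notFound);
    }
    throw new Error("Invalid arity: " + arguments.length);
  };
  dispatcher.arity$2 = nth2;
  dispatcher.arity$3 = nth3;
  return dispatcher;
})();

// Development style: through call, then through the arguments.length switch.
const viaDispatcher = toyNth.call(null, [1, 2, 3], 4, "oops");

// Static style: straight to the implementation for the known arity.
const direct = toyNth.arity$3([1, 2, 3], 4, "oops");

console.log(viaDispatcher, direct); // oops oops
```

Both paths return the same value; the static one simply skips the dispatcher.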

Why don't we always do this? The problem is that doing so for development would break redefinition. Many ClojureScript programmers enjoy redefining running programs as evidenced by the popularity of REPL driven development and even more radical tools like Figwheel.
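One way to picture the redefinition problem, as a sketch only: assume a call site was compiled against an earlier definition that exposed an arity property, and the function is then redefined as a plain function without it. The names app, greet and arity$1 are hypothetical, and this is just one illustrative failure mode.

```javascript
// Hypothetical namespace object standing in for compiled output.
const app = {};

// First definition: exposes a per-arity property, as a multi-arity function would.
app.greet = function (name) { return app.greet.arity$1(name); };
app.greet.arity$1 = function (name) { return "Hello, " + name; };

// Two call sites compiled against that first definition.
const dynamicSite = function () { return app.greet.call(null, "world"); };
const staticSite = function () { return app.greet.arity$1("world"); };

// Redefinition, REPL style: a plain function with no arity$1 property.
app.greet = function (name) { return "Hi, " + name; };

console.log(dynamicSite()); // Hi, world
try {
  staticSite();
} catch (e) {
  console.log(e instanceof TypeError); // true
}
```

The call site that goes through call picks up the new definition, while the statically dispatched one is tied to the shape of the old one.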

There's one final production optimization. Many JavaScript engines do not optimize chains of nested property accesses like the one above. Google Closure in advanced compilation mode will collapse these namespaced property chains into flat names.

Change your build.clj to the following:

(require 'cljs.closure)

(cljs.closure/build "src"
  {:output-to "out/main.js"
   :optimizations :advanced
   :pretty-print true
   :pseudo-names true
   :verbose true})

(System/exit 0)

And rebuild:

java -cp cljs.jar:src clojure.main build.clj

We pretty-print and enable :pseudo-names so that we can see human-readable output. We no longer set :static-fns explicitly because it is enabled by default under :advanced. Now if we look at the last line of out/main.js we'll see:

$cljs$core$nth$$.$cljs$core$IFn$_invoke$arity$3$(...);

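The nested namespace lookups are gone: cljs.core.nth has been collapsed into a single flat name, leaving only the access to the arity property. This code will execute significantly faster than the development version.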
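As a rough picture of what the collapsing buys, here are the two shapes written by hand. The names toy.core.add, $toy$core$add$$ and $arity$2$ are hypothetical and only imitate the style of the :pseudo-names output.

```javascript
// Before: the function hangs off nested namespace objects, so the call site
// walks toy -> core -> add -> arity$2 before it can call anything.
const toy = { core: {} };
toy.core.add = function (a, b) { return a + b; };
toy.core.add.arity$2 = function (a, b) { return a + b; };
const before = toy.core.add.arity$2(1, 2);

// After: the namespace path is flattened into one variable, leaving a single
// property access for the arity.
const $toy$core$add$$ = function (a, b) { return a + b; };
$toy$core$add$$.$arity$2$ = function (a, b) { return a + b; };
const after = $toy$core$add$$.$arity$2$(1, 2);

console.log(before, after); // 3 3
```

The results are identical; only the number of lookups needed to reach the function changes.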

Future posts will cover protocol dispatch, type inference, arithmetic, and other neat things we're doing to ensure that ClojureScript is zippy across JavaScript engines old and new.