This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
After spending much more time and effort than I expected (rare for a software development effort, right?? 😉 ), I've got a dev branch of Joker that starts up 2x (MacBook Pro) to 4x (Ubuntu on AMD Ryzen 3) faster (at the cost of a larger executable: 24MB versus 16MB, so about 50% larger).
It's too intertwined with current Joker code (makes assumptions about how Objects are defined, mainly) to be worth PR'ing, I think; and, now that I have a clearer understanding of what's needed as well as possible, I think I can refactor (basically rewrite) it to be much closer to 10x faster and yet much simpler and more maintainable, though with some key changes to Joker itself (mainly, how it initializes its data structures) that should pose minimal risk. (While I'm confident I can get that faster startup speed out of Joker, I have no expectations with regard to what'll happen to the executable size. Right now my branch generates lots of runtime code that would become compile-time initializations; that might be smaller, but maybe not by much.)
If anybody wants me to push this branch to my fork (entirely separate from the `gostd` branch, which I've left mostly alone while working on this), let me know. I'd be interested in any input, though (again) I think the work is best (mostly) discarded in favor of something much better.
this sounds intriguing! What optimization techniques did you use to achieve the speedup? I have not looked into startup time in a while since it's currently good enough for me, but if it can be made faster without adding too much complexity, it's always great.
The current approach (which passes the automated tests) adds a new program, `gen_code`, that gets run after `gen_data`. Similar to the latter, the former reads in the `core/data/*.joke` files (after first taking a snapshot of predefined vars in `joker.core`, as they need to be handled differently). It currently supports converting only
Then it walks the namespace mappings and emits new files named
`a_core_code.go`. The former initializes things like strings and keywords, which don't really "belong" to namespaces; the latter inits per-namespace stuff. When it emits `a_core_code.go`, it deletes `a_core_data.go` (which had already been generated by
The initialization code that is generated uses static (file-scope) initialization where possible, runtime otherwise (in a `func init()` or `func coreInit()` function).
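To make the static-versus-runtime split concrete, here is a minimal sketch in Go. The `Keyword` type, the `kwDoc`/`kwArglists` variables, and the `byName` map are hypothetical names made up for illustration; they are not taken from Joker or from what `gen_code` actually emits.

```go
package main

import "fmt"

// Keyword is a hypothetical stand-in for an interpreter object.
// It is NOT Joker's actual type.
type Keyword struct {
	Name string
	Hash uint32
}

// File-scope composite literals whose fields are all constants.
// These are candidates for static initialization: nothing here
// needs to execute at startup to compute the values.
var kwDoc = Keyword{Name: "doc", Hash: 0x1234}
var kwArglists = Keyword{Name: "arglists", Hash: 0x5678}

// byName is filled in at runtime in this sketch, standing in for
// the parts of the generated code that cannot be expressed statically.
var byName map[string]*Keyword

// init runs once at program startup, before main.
func init() {
	byName = make(map[string]*Keyword, 2)
	byName[kwDoc.Name] = &kwDoc
	byName[kwArglists.Name] = &kwArglists
}

func main() {
	fmt.Println(byName["doc"].Name, len(byName))
}
```

The more of the generated data that fits the first pattern, the less code has to run in `init()` before the interpreter is usable.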
Then Joker is built with those `a_*code.go` files in place. The build takes longer. But the resulting executable starts up with a useful `joker.core` namespace without parsing or evaluating any of the `core.joke` code (as it normally would, out of the digested form in
That saves a fair amount of runtime. But since there are quite a few fields that are not "stable" (the same from Joker run to Joker run, or build to build), the amount of runtime-init code that is generated is substantial.
As I found and fixed bugs, that init code grew and grew, ultimately doubling (or so) the startup time.
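As an illustration of what a "stable" field looks like, the sketch below computes a hash that depends only on its input bytes, so a generator could write the value into a source file as a literal. The helper name `stableHash` and the choice of FNV-1a are assumptions for the example; this says nothing about how Joker actually computes its `.hash` fields.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// stableHash is a hypothetical helper: FNV-1a (32-bit) over the bytes
// of s. The result depends only on s, so it is identical from run to
// run and from build to build.
func stableHash(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s)) // Write on a hash.Hash never returns an error
	return h.Sum32()
}

func main() {
	for _, name := range []string{"doc", "arglists"} {
		// A code generator could emit lines like this into a .go file,
		// treating the hash as just another constant field.
		fmt.Printf("{Name: %q, Hash: %#x},\n", name, stableHash(name))
	}
}
```

A hash that mixes in a per-process seed or a memory address could not be emitted this way and would have to be recomputed at startup.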
There's currently too much complexity (IMO) in this approach.
First, `gen_code` has to keep track of preexisting variable definitions and treat them differently. Think `(add-doc-and-meta ...)` versus a straightforward
Second, in part because there are special cases like the above (and the unstable `.hash` fields), the code generation currently consists of one method/receiver per `Object` (or whatever) that is generated -- parsed, evaluated, written into an `a_*_data.go` file, and later read back in -- during normal startup/namespace-loading processing.
There seems to be a straightforward path to resolving the above as well as getting that 2x (or so) startup-time performance improvement back:
• Move all existing initialization code (`TYPES`, namespace mappings including `Procs`, and the like) into distinct Go source files built by default, but not built given a build tag (let's call it
• Stabilize all (relevant) hashcode generation, so hashes can be treated as "constants" just like other fields.
• Change `gen_code` to take advantage of the above by emitting almost-entirely-static initialization. (Go doesn't support circular initialization such as a `List.rest` member that points back to itself, so those would still need to be initialized at runtime.)
• The previous step might be best done by using reflection directly, so just a handful of "agnostic" functions that don't really know (much) about Joker internals. That way, we wouldn't need to modify `gen_code` due to adding a new `Object` type or changing/adding/removing an existing one's field(s).
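The circular-initialization limitation mentioned in the list above can be sketched as follows. The `List` type and the `empty` variable are hypothetical and only mirror the "`rest` points back to itself" shape; Joker's real types may look different.

```go
package main

import "fmt"

// List is a hypothetical cons-style list, not Joker's actual List type.
type List struct {
	first string
	rest  *List
}

// A self-referential file-scope initializer such as
//
//	var empty = &List{rest: empty}
//
// is rejected by the Go compiler as an initialization cycle,
// so only the non-circular part is declared statically...
var empty = &List{first: "<empty>"}

// ...and the back-pointer is patched in at startup.
func init() {
	empty.rest = empty
}

func main() {
	fmt.Println(empty.rest == empty)
}
```

Only the one pointer assignment is left for runtime; everything else about the value stays in the file-scope declaration.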
I'm working on the 2nd item above (stabilizing hashcode generation). Then I'll work on the 1st. If there aren't any major roadblocks in those, I hope to start into the 3rd and 4th soon, perhaps simultaneously (i.e. just write the 4th as a replacement for the 3rd, perhaps several `Object` types at a time -- creeping replacement).
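For the reflection-driven idea (the 4th item), a rough sketch of a type-agnostic emitter might look like the following. `emitLiteral` and the `Symbol` struct are hypothetical; the sketch handles only flat structs with string, boolean, and integer fields, and it is not how `gen_code` is written.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Symbol is a hypothetical object type used only to exercise the emitter.
type Symbol struct {
	Name  string
	Hash  uint32
	Macro bool
}

// emitLiteral renders a flat struct value as Go composite-literal
// source text. It knows nothing about Symbol specifically: adding or
// removing a field requires no change here.
func emitLiteral(v interface{}) string {
	rv := reflect.ValueOf(v)
	rt := rv.Type()
	var b strings.Builder
	b.WriteString(rt.Name() + "{")
	for i := 0; i < rt.NumField(); i++ {
		if i > 0 {
			b.WriteString(", ")
		}
		f := rv.Field(i)
		b.WriteString(rt.Field(i).Name + ": ")
		switch f.Kind() {
		case reflect.String:
			fmt.Fprintf(&b, "%q", f.String())
		case reflect.Bool:
			fmt.Fprintf(&b, "%t", f.Bool())
		case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
			fmt.Fprintf(&b, "%d", f.Uint())
		case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
			fmt.Fprintf(&b, "%d", f.Int())
		default:
			// Pointers, slices, maps, and nested structs are out of
			// scope for this sketch.
			panic("unsupported field kind: " + f.Kind().String())
		}
	}
	b.WriteString("}")
	return b.String()
}

func main() {
	fmt.Println(emitLiteral(Symbol{Name: "inc", Hash: 42}))
}
```

A real generator would also have to follow pointers and handle cycles, which is where the runtime patch-up for self-referential values would come in.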
One thing I haven't confirmed is whether static/file-scope `map` initializations happen at build (compile) time. I was very happy when I confirmed such initializations happen for `struct` and array types, as that wasn't obvious from reading the docs (Go doesn't yet have the concept of "constant" structs or arrays).
Avoiding runtime initialization of
STRINGS would be nice wins, perhaps even measurable.
Forgot to mention, among the bullet points above, that the use of a build tag would replace the current (kludgy) deleting of e.g. `a_core_data.go`, as those `a_*_data.go` files would themselves be tagged as
Similarly, the newly generated `a_*code.go` files would be tagged as `fast-init` (i.e. built only when that tag is specified).
Besides getting rid of the kludge of deleting a previously generated file, that'd solve one pain point I currently have, which is that I'm using two distinct build scripts (the new one wrapping `run.sh`), depending on which version of Joker I want to build. And of course I do a lot of A/B testing, sometimes after modifying "normal" Joker to add trace capabilities and the like to track down bugs.
Hope that helps! Let me know if you want me to push the current work to my fork as a branch you could then peruse, try out, etc.
Thank you for the detailed explanation! Yes, I am interested to look at the code, so if you could push it somewhere it'd be great! Also, do you think it makes sense to create a GitHub issue like "Improve startup time" to track this work and keep the comments like the ones above? Otherwise they may disappear due to Slack retention policy (whatever it currently is).
I think an Issue would be a great idea -- better to preserve discussion (worthy of a permanent record) there than on Slack. Pushed my work as of yesterday (I've started refactoring it to the new approach since then) here: https://github.com/jcburley/joker/commits/gen-code
Pushed a new version of the code to the same branch. Substantial rewrite, with about a 20-40% improvement on my Ryzen 3 running Ubuntu, now at maybe a 7x speedup over the vanilla version (not so much on my MacBook Pro; maybe a 2.5x speedup?). See the latest commit for more info. A few more bugs to fix (as it passes all tests but generates slightly different documentation), and a fair amount of cleanup to do. Plus I should provide much better documentation so Joker developers know how to care for the new code (with the concomitant changes to Joker itself). But, as deep as this rabbit hole turned out to be (I seriously thought it'd take a week or two when I started out -- several months ago!), there appears to be a light at the end of the tunnel. Here's the branch: https://github.com/jcburley/joker/tree/gen-code
(This is about improving only the startup time of Joker; out of context, the above might appear to be describing overall improvements, which was not intended.)
The latest version, just pushed, squeezes another 2ms or so out of startup time on my MacBook Pro (OS X), though it's barely measurable as an improvement on my Ryzen 3:
`./run.sh` as usual; the resulting `joker` executable, also named (via hardlink) `joker.fast`, is the fast-startup version, while `joker.slow` is the normal version.
I hope to make this PR-able by next Thursday, possibly sooner. Needs more cleanup, but the list of known optimizations to pursue is now empty. (The list could start growing again if somebody analyzes why it's still 2.5x or so slower starting up than a simple command-line-echo program written in Go.)