Announcing "CLJS str", a drop-in replacement for ClojureScript's str which is 4-300x faster (depending on the input)
https://github.com/borkdude/cljs-str
Thanks to @qythium for the initial idea
Nice! Out of curiosity, if it is drop-in replacement, why a library instead of a PR to CLJS?
I was just going to ask that. Could/should this just be folded into cljs itself?
I guess one consideration would be that CLJS targets older versions of JS where string templates aren't available, but I can fix that by not using them. The only reason I'm using them right now is multi-line strings, but I can normalize
"foo
bar"
to "foo\nbar" and then I don't need to use them anymore.
If @dnolen is interested, I'd be happy to make this into a CLJS issue + patch, but I'm sure someone else has thought about this before
Oh yes, another consideration is that ?? isn't available in older CLJS environments
nice! I think you've got to escape any backticks in the string literals though - otherwise
(macroexpand '(str "`alert('oops')`"))
;; => (js* "``alert('oops')``")
nice gotcha
I can get rid of the template altogether
Here's how i did it:
(defmethod emit `str [[_ & xs]]
  (str "`" (str/join (for [x xs]
                       (if (or (char? x) (string? x))
                         (-> x
                             (str/replace "\\" "\\\\") ; Escape backslashes
                             (str/replace "`" "\\`"))  ; And backticks
                         (str "${" (emit x) "}"))))
       "`"))
I'll get rid of the template and preferably also of ?? so in CLJS you can use it to target pre-2020 JS engines
fixed. it's now compatible with older versions of JS, so you could even use it in libraries where string concatenation is very important performance wise, e.g. in #honeysql (cc @seancorfield)
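For anyone who keeps the template-literal variant, here is a rough plain-JS sketch of the escaping discussed above. `escapeForTemplate` and `toTemplateSource` are made-up names, not part of cljs-str, and escaping `${` is my own assumption on top of the backslash and backtick cases shown earlier:

```javascript
// Hypothetical helper: escape a constant string so it can sit inside a JS template literal.
function escapeForTemplate(s) {
  return s
    .replace(/\\/g, "\\\\")     // backslashes first, so later escapes are not doubled
    .replace(/`/g, "\\`")       // then backticks, which would otherwise close the literal
    .replace(/\$\{/g, "\\${");  // assumption: also neutralize "${", which starts an interpolation
}

// Hypothetical helper: wrap the escaped constant in backticks to get JS source text.
function toTemplateSource(s) {
  return "`" + escapeForTemplate(s) + "`";
}

const tricky = "`alert('oops')` costs ${1 + 1} \\ done";
// Evaluate the generated source and check that it yields the original string back.
const roundTripped = new Function("return " + toTemplateSource(tricky) + ";")();
console.log(roundTripped === tricky); // true
```

The round trip through `new Function` is only there to show the generated literal evaluates back to the same string.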
is there a built-in function in CLJS which does this?
(defn ?? [x]
  (if (nil? x) "" x))
I needed to introduce this helper to make it old-JS compatible
I don't know about cljs itself, but I found https://github.com/google/closure-library/blob/master/closure/goog/string/string.js#L1118
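For reference, a minimal sketch of what such a helper looks like on the JS side without the `??` operator. `orEmpty` is a made-up name, and this is not necessarily what the CLJS compiler emits for the helper above:

```javascript
// Hypothetical stand-in for `x ?? ""` on engines without nullish coalescing.
function orEmpty(x) {
  // `== null` is true for both null and undefined, and for nothing else.
  return x == null ? "" : x;
}

console.log("" + orEmpty(null));      // ""
console.log("" + orEmpty(undefined)); // ""
console.log("" + orEmpty(0));         // "0"
console.log("" + orEmpty(false));     // "false"
```

Note that `x || ""` would not be equivalent, since it also replaces `0` and `false`.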
not too familiar with cljs either, but aren't there compiler options that let you polyfill the template-literal syntax back to any pre-es6 form if needed?
it doesn't matter, outputting this is equally fast
ah right, it looked a bit strange how it's doing the + concatenation of constants at runtime, but I guess engines would easily optimize that away
yep, they optimize it equally well
Now upstreamed to CLJS! https://clojure.atlassian.net/browse/CLJS-3452
hmm, since this is emitting raw js* expressions, what would happen if the literal strings happened to contain "~{}" or anything else it parses as meaningful syntax (I couldn't find an official reference to it in the cljs docs) - would those have to be escaped? I don't have a cljs build setup but on self-hosted planck it appears to throw a compile-time error
(macroexpand '(str hmm "~{}"))
;; => (js* "''+~{}+\"~{}\"" (borkdude.cljs-str/?? hmm))
$ planck
ClojureScript 1.11.132
cljs.user=> (js* "''+\"~{}\"+~{}" :hmm)
Execution error (SyntaxError) at (<cljs repl>:1).
Unexpected end of script
that works because the right hand side is the literal string expression, but only the left hand side is the js* string part
f00 function g(){return"12truefalsedude~{}"}
basically you get:
(js* "~{}" "~{}")
which evaluates to: compile this expression, the string, in place of the ~{}
which then is just the literal string
similar when you do this in clojure: (format "foo %s" "%s") => "foo %s"
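A toy model of that single-pass substitution, in plain JS. This is only an illustration of the idea, not how the CLJS compiler implements js*, and `fill` is a made-up name:

```javascript
// Toy model: replace each "~{}" in the template with the next already-compiled argument.
// The substituted text is never scanned again for further placeholders.
function fill(template, ...compiledArgs) {
  let i = 0;
  // Using a function as the replacement avoids any special "$" handling.
  return template.replace(/~\{\}/g, () => compiledArgs[i++]);
}

// Each argument is the JS source for a value; the second is a string literal containing "~{}".
console.log(fill("''+~{}+~{}", "hmm", '"~{}"')); // ''+hmm+"~{}"
```

The "~{}" inside the second argument survives untouched because only the template is scanned, the same way the "%s" argument survives in the format example.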
ahh I was running an older version of the macro which coalesced constants into the first argument to js* - okay on the latest one I can see that it works fine
yeah I did migrate away from that, since the representation in a JVM macro might be different than how CLJS emits it to JS, so I went with "let CLJS emit it". Which also solves your problem, nice
@qythium your variation could also use this:
`${"1 `ss2 3"}${"4 5 6"}`
In that case you can leave the strings as is. And JS engines will likely just interpret this as efficiently as without the double quotes (at least that's what I'm measuring here).
thanks! oh I was also wondering why your ?? was calling .toString on its output when they were all getting +'d onto the same chain of strings, guessing there's some weird type inferencey optimization that you've benchmarked
@qythium can you give a code example?
(defn ?? [x]
  (if (nil? x) ""
    (.toString x))) ;; <- why not just x
ah yes. this is because + has a nasty edge case that it calls .valueOf on objects instead of .toString
now I've made cljs.core/str arity one do the same as ?? so I can re-use that in the CLJS patch
for squint I might go with the same solution:
'' + "1 `ss2 3" + "4 5 6" + str1(x)
or:
`${"1 `ss2 3"}${"4 5 6"}${x ?? ''}`
In squint I can use string templates and ?? so that's fine. I'm not sure yet which I'll use. esbuild will optimize the inline ?? when it's sure that x is non-nil. But it doesn't reason across functions like google closure does.
foo={valueOf:()=>'x',toString:()=>'y'}
[''+foo, [foo].join('')]
Array [ "x", "y" ]
oof, TIL yet another js footgun
yeah, a footgun for sure. if Array.join were just the fastest overall, we could use that everywhere
but it's not.. :'(
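To put the options side by side, a small sketch comparing how each one stringifies an object that defines both methods. `str1` here is a hypothetical plain-JS stand-in for the one-argument helper mentioned above, not the actual emitted code:

```javascript
const foo = { valueOf: () => "x", toString: () => "y" };

// Hypothetical stand-in for the arity-one str: nil becomes "", everything else goes through toString.
function str1(x) {
  return x == null ? "" : x.toString();
}

const results = {
  plus: "" + foo,              // "x": + asks for a default primitive, so valueOf wins
  template: `${foo}`,          // "y": template literals convert to string, so toString wins
  join: [foo].join(""),        // "y"
  viaStr1: "" + str1(foo),     // "y"
  nilViaStr1: "" + str1(null), // ""
};
console.log(results);
```

So both the explicit `.toString` call and the template literal sidestep the valueOf behavior that plain `+` has.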
(squint issue here: https://github.com/squint-cljs/squint/issues/723#issuecomment-3402203243)
yeah I'm probably going to go with the inline ??''