This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2023-05-04
Together with @mfikes and @dnolen we recently improved the random-uuid implementation in ClojureScript making it considerably faster (https://clojure.atlassian.net/browse/CLJS-3369). However, it is not ideal, since it didn't fix the ultimate problem that it relies on Math/random in the form of rand-int instead of proper crypto functions. For that I thought about something like this:
;; Fetch 2048 random 16-bit words (4096 bytes) in one API call.
(defn get-array
  []
  (js/Array.from (.getRandomValues js/crypto (js/Uint16Array. 2048))))

;; Take the next word from the buffer; refill the buffer when it is empty.
;; Hack: the buffer is (re)defined from inside the function.
(defn get-uint16
  []
  (let [uint16 (when crypto-array
                 (.shift crypto-array))]
    (if-not uint16
      (do (def ^:dynamic crypto-array (get-array))
          (get-uint16))
      uint16)))

(defn crypt3-random-uuid
  "Improved UUIDv4 generation."
  []
  (letfn [(quad-hex []
            ;; one 16-bit word as exactly four hex digits
            (let [unpadded-hex ^string (.toString (get-uint16) 16)]
              (case (count unpadded-hex)
                1 (str "000" unpadded-hex)
                2 (str "00" unpadded-hex)
                3 (str "0" unpadded-hex)
                unpadded-hex)))]
    ;; version nibble 4 and variant bits 10 are forced into their words
    (let [ver-tripple-hex ^string (.toString (bit-or 0x4000 (bit-and 0x0fff (get-uint16))) 16)
          res-tripple-hex ^string (.toString (bit-or 0x8000 (bit-and 0x3fff (get-uint16))) 16)]
      (uuid (str (quad-hex) (quad-hex) "-" (quad-hex) "-"
                 ver-tripple-hex "-" res-tripple-hex "-"
                 (quad-hex) (quad-hex) (quad-hex))))))
It just shows that getting a bigger chunk of random data and then taking smaller pieces from that buffer is quite efficient. However, this is too concrete, hacky and very likely thread-unsafe in my view. I would much rather open a discussion in the Clojure and ClojureScript community about how we would like to treat getting cryptographically random data and whether everybody is fine with the current state of things.
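The buffering idea can also be sketched in plain JavaScript, to show the host API calls without the ClojureScript wrapper. This is only an illustration, not the code from CLJS-3369: `RandomPool`, `hex4` and `uuidV4FromPool` are hypothetical names, and a typed array with a read index is assumed in place of `Array.shift`.

```javascript
// Assumption: a Web Crypto object is reachable; Node versions without the
// global expose the same object via the built-in "crypto" module.
const cryptoObj = globalThis.crypto ?? require("crypto").webcrypto;

// Hypothetical pool: refill 4096 bytes in one call, hand out 16-bit words.
class RandomPool {
  constructor(words = 2048) {
    this.buf = new Uint16Array(words); // 2048 words * 2 bytes = 4096 bytes
    this.pos = words;                  // start "empty" so the first read refills
  }
  nextUint16() {
    if (this.pos >= this.buf.length) {
      cryptoObj.getRandomValues(this.buf); // one API call per refill
      this.pos = 0;
    }
    return this.buf[this.pos++];
  }
}

// Format a 16-bit value as exactly four lowercase hex digits.
const hex4 = (n) => n.toString(16).padStart(4, "0");

function uuidV4FromPool(pool) {
  const q = () => hex4(pool.nextUint16());
  const ver = hex4(0x4000 | (pool.nextUint16() & 0x0fff)); // version nibble 4
  const res = hex4(0x8000 | (pool.nextUint16() & 0x3fff)); // variant bits 10
  return `${q()}${q()}-${q()}-${ver}-${res}-${q()}${q()}${q()}`;
}
```

Each UUID consumes eight words, so a 2048-word pool serves 256 UUIDs per `getRandomValues` call.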
I envision basically a /dev/urandom but with configurable output: e.g. a byte stream, an endless well of characters (possibly constrained to some alphabet), strings (possibly from some alphabet), doubles or longs (possibly from some range), or whatever the literals and possibly even platform-specific types are. This would make it much simpler to implement random-uuid in ClojureScript properly, and it could probably also make it much simpler to implement portable libraries for cryptographic key generation or for generating truly random data, e.g. for testing. The idea is to focus on practical usability and good performance (it shouldn't be much slower than tapping directly into /dev/urandom). My naive JavaScript implementation reached about 1/8th of the /dev/urandom bandwidth, measured by how many UUIDv4s I can generate per second in Chrome. The biggest bottleneck seems to be the browser and its API.
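As a rough illustration of what "constrained to some range" or "from some alphabet" could mean on top of raw random words, here is a minimal JavaScript sketch. The names `randomIntBelow` and `randomString` are hypothetical and not part of any existing library; rejection sampling is assumed as the way to avoid modulo bias. For clarity it asks the API for one word per draw, which is the slow pattern the thread argues against, so a real version would read from a larger buffer.

```javascript
// Assumption: a Web Crypto object is reachable; Node versions without the
// global expose the same object via the built-in "crypto" module.
const cryptoObj = globalThis.crypto ?? require("crypto").webcrypto;

// Uniform integer in [0, n) for 1 <= n <= 2^32, using rejection sampling.
function randomIntBelow(n) {
  if (!Number.isInteger(n) || n <= 0 || n > 0x100000000) {
    throw new RangeError("n must be an integer in [1, 2^32]");
  }
  // Largest multiple of n that fits in 32 bits; draws at or above it are
  // rejected so that every residue is equally likely.
  const limit = Math.floor(0x100000000 / n) * n;
  const buf = new Uint32Array(1);
  let x;
  do {
    cryptoObj.getRandomValues(buf);
    x = buf[0];
  } while (x >= limit);
  return x % n;
}

// Random string of the given length over the given alphabet.
function randomString(length, alphabet) {
  let out = "";
  for (let i = 0; i < length; i++) {
    out += alphabet[randomIntBelow(alphabet.length)];
  }
  return out;
}
```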
What do you think?
I would personally like to find a way to rely on the Web Cryptography API, but it has poor support outside of the latest browsers; e.g. Node.js only supports randomUUID() in 19+, and I am not sure whether GCC (the Google Closure Compiler) has "good" polyfills for this.
.getRandomValues works outside the secure context too and has been widely supported for many years. I would need to test/check more in depth on old versions of Node.js for compatibility.
We have implemented an alternative in Penpot: https://github.com/penpot/penpot/blob/develop/common/src/app/common/uuid_impl.js Maybe it can be useful as another example.
You will probably lose the performance battle when getting small buffers of 16 bytes, especially when you need more than a single UUID (which is very often the case). The expensive part is the API call that fetches the random data. Fetching it in chunks of the OS page size (usually 4096 bytes, i.e. 2048 "shorts") turns out to be quite efficient. Working with strings and single numbers that the JIT recognizes as fitting in 31 bits is quite efficient in JS, but your version could probably be slightly faster there than mine.
Agree, but in this case it is already pretty fast (twice as fast as random-uuid), which is more than enough. Using BigInt math makes it a bit slower than the best possible performance, but reduces the code and complexity a lot, so this is a tradeoff. Our approach generates true v7/v8-type UUIDs fast enough for us.
We have the same implementation for the JVM too, so it is completely multi-platform: https://github.com/penpot/penpot/blob/develop/common/src/app/common/UUIDv8.java We could consider extracting it into a library if there is some interest, but for now it lives in the Penpot codebase.
v8 (based on the same ideas as v7) outperforms all other fully random v4 implementations by a good margin, because it does not need to generate random values on each invocation.
> would personally like to find a way to resort to the Web Cryptography API
I guess you could probe if it exists via js/globalThis.crypto
and then use it, if not, then fall back on the older less optimal impl
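The probe-then-fall-back idea might look roughly like this in plain JavaScript. `makeFiller` is a hypothetical helper name, and the fallback branch deliberately mirrors the weaker Math.random approach rather than anything in cljs.core.

```javascript
// Hypothetical helper: returns a function that fills a Uint16Array in place.
// `root` defaults to globalThis and is a parameter only to make probing testable.
function makeFiller(root = globalThis) {
  const c = root.crypto;
  if (c && typeof c.getRandomValues === "function") {
    // Web Crypto is present: use it.
    return (arr) => c.getRandomValues(arr);
  }
  // Older, less optimal path: Math.random is NOT cryptographically secure.
  return (arr) => {
    for (let i = 0; i < arr.length; i++) {
      arr[i] = Math.floor(Math.random() * 0x10000);
    }
    return arr;
  };
}
```

The probe runs once when the filler is created, so the per-call cost is only the chosen branch.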
TIL there's globalThis. Instead of adding self to Node or adding global to browsers, they added globalThis everywhere.
Funny detail is that in SCI environments like scittle or nbb, I map js to globalThis, so js/crypto is just looking up crypto from globalThis.
If you do:
npx nbb@latest -e '(= js js/globalThis)'
it returns true, because js is just the global object, not a special syntax.
Reading this to understand the reasoning: https://github.com/tc39/proposal-global/blob/master/NAMING.md Do you have any idea what they mean by "realm"?
Phew, it's not a new entity: > The language reference uses abstract terms because JavaScript environments can vary widely. In the browser, a window (a frame, a window opened with window.open(), or just a plain browser tab) is a realm.
Oh, we also have this wonder, albeit at the proposal stage: https://github.com/tc39/proposal-compartments Every single time I read something about JS, I feel like the relative amount of my knowledge about it reduces dramatically.
Maybe they should watch Rich Hickey's new talk. It seems they are starting with a solution rather than a problem statement.
One huge complication for Web Cryptography API usage is that browsers seem to only expose self.crypto in a “secure context”. So we’ll need more than just a polyfill if we intend to use it in a browser. Overall, I think the current implementation of random-uuid is fine as-is, but if we want to try making use of the host, then there is a lot of tinkering one can do in this area.
funny thought that just occurred to me: we can expose a random-uuid macro in case people want to compute UUIDs at compile-time and avoid as much run-time generation as possible 😄
in any case, i think my opinion is firmly “keep random-uuid as-is”
if compile time random uuids are a solution to your problem, are you sure you needed random uuids at all?
As noted before, I obviously thought about the API support a bit... .getRandomValues works outside the secure context too and has been widely supported for many years. I would need to test/check more in depth on old versions of Node.js for compatibility.
random-uuid should use crypto by default - the same as the Clojure (Java) version. It was very unfortunate to use Math/random by default at all especially since the API getRandomValues seems to be older than ClojureScript.
why is it important that this happens in cljs.core and not a library? just curious but what are you doing that the speed of random-uuid matters?
If you are giving up on speed just because you can, and the implementation isn't an order of magnitude simpler, there is no good reason to just burn the cycles. Getting random data in any form with reasonable speed is always good. That way you can use the API in many situations without thinking twice, e.g. you can give every animation and every span a UUID and don't need to invent your own identifiers and think about all the potential problems. Especially for the UUID case, I thought the implementation in quads was somewhat more elegant and an obvious way to reduce the number of function calls. It turned out to bring the expected speed-up with it. Now I would like to actually get the proper crypto too (which might reduce performance a bit, but it will probably still end up faster overall than the previous version).
random-uuid is in core, so an endless well of cryptographically secure pseudo-random data would be very welcome there, usable by random-uuid without duplication of effort. I am not opposed to having the "random well" as a separate library; however, it would be quite strange if random-uuid tapped into it only when the library was present, for instance.
I'm more concerned about the availability of js/crypto. if it's not available universally then it can't be in core. like is it available in node, deno, bun, browsers, jsc and whatever other engines might exist?
That needs to be tested. We can also keep the current implementation as a fallback while e.g. setting some kind of flag to note that the fallback was used, or throwing/logging an error if it is used, for more constrained environments or environments with stricter requirements that are now quite possibly misled about the quality of random-uuid.
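One possible shape for such a fallback with a visible flag, sketched in plain JavaScript. Everything here is hypothetical: `makeRandomSource`, its `strict` option and the `secure` field are made-up names, not an existing ClojureScript or library API.

```javascript
// Hypothetical factory: picks a random source once and records which one.
// `root` defaults to globalThis and is a parameter only to make probing testable.
function makeRandomSource({ strict = false, root = globalThis } = {}) {
  const c = root.crypto;
  const secure = Boolean(c) && typeof c.getRandomValues === "function";
  if (!secure && strict) {
    // Stricter environments can refuse the weak fallback outright.
    throw new Error("No cryptographic random source available");
  }
  if (!secure) {
    console.warn("random source: falling back to Math.random (not secure)");
  }
  const fill = secure
    ? (arr) => c.getRandomValues(arr)
    : (arr) => {
        // Assumes a Uint16Array; Math.random is not cryptographically secure.
        for (let i = 0; i < arr.length; i++) {
          arr[i] = Math.floor(Math.random() * 0x10000);
        }
        return arr;
      };
  // `secure` lets callers check up front which source they got.
  return { secure, fill };
}
```

A caller that hands out session tokens could pass `strict: true` or check `secure` before generating anything.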
For older Node, we might need to add a suitable workaround. Modern Node supports the standard it seems. https://nodejs.org/docs/latest/api/crypto.html#crypto_crypto_randomfillsync_buffer_offset_size https://nodejs.org/en/download/releases https://developer.mozilla.org/en-US/docs/Web/API/Crypto/getRandomValues
but if I see this correctly in node it's not a global js/crypto but something you require/import?
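For a Node without a global `crypto`, the built-in module from the docs linked above could be loaded explicitly. A minimal sketch, assuming a CommonJS context where `require` is available; `nodeFill` is a hypothetical name.

```javascript
// Hypothetical helper for Node: fill a typed array from the built-in module.
function nodeFill(arr) {
  // Loaded on demand, so merely defining this function is harmless in a browser.
  const nodeCrypto = require("crypto");
  // randomFillSync fills the given buffer in place and returns it.
  return nodeCrypto.randomFillSync(arr);
}
```

Which Node versions accept typed arrays other than Buffer here would still need to be checked against the minimum version ClojureScript targets.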
May be different with very new Node.js though. I definitely would need more guidance, I am not sure what minimal version ClojureScript targets. Perhaps with the next major version though, the requirement could be lifted to only support slightly more recent stuff that is not so constrained in regards to cryptography for instance.
It seems, the non-randomness in the cryptographic sense of Math.random can be exploited rather practically: https://www.youtube.com/watch?v=_Iv6fBrcbAM It would help a lot to know what the minimal supported environment should look like. I guess we can have the Math.random implementation as a fallback if there is no supported cryptographic implementation. In such a case, it should be possible for the developer to know up front if actual randomness will be used. My worry is this is just so easy to overlook and use e.g. UUIDs as some session tokens with Node.js backend possibly making guessing a practical attack vector.