Fork me on GitHub
#clojure-dev
<
2020-10-13
>
rutledgepaulv02:10:35

I had an idea today and am wondering if it's ever been discussed / if a patch is worth considering. Currently, the swap implementation of an atom always executes the validator + compareAndSet + watcher notifications, however if the returned value from f.invoke(v) is identical to v I think it should be reasonable to skip the cas and validator and just return v / newV (skipping watchers might break an existing expectation). If swap was implemented that way I would feel more comfortable optimistically swapping to perform a conditional modification in f (versus doing a double checking pattern to avoid unnecessary CAS contention and validator invocations).

rutledgepaulv02:10:44

as a more concrete example:

rutledgepaulv02:10:52

(defonce public-key-cache (atom {}))


; currently I feel somewhat obligated to write things this way to avoid unecessary CAS contention:

(defn get-public-key [oidc-config key-id]
  (let [fetch (delay (get-indexed-signing-keys oidc-config))]
  	(or (get @public-key-cache key-id)
  	    (get
	      (swap! public-key-cache
	             (fn [old-keys]
	               (if (contains? old-keys key-id)
	                 old-keys
	                 (merge old-keys (force fetch)))))
	      key-id))))


; if the swap implementation never performed a CAS for an identical value, then i could always comfortably write this instead:

(defn get-public-key [oidc-config key-id]
  (let [fetch (delay (get-indexed-signing-keys oidc-config))]
    (get
      (swap! public-key-cache
             (fn [old-keys]
               (if (contains? old-keys key-id)
                 old-keys
                 (merge old-keys (force fetch)))))
      key-id)))

rutledgepaulv03:10:07

ahh let me think about that. it still guarantees f is run at least once, and it still guarantees that the value in the AtomicReference is at some point the value of f.invoke(v). Is there some other guarantee the current implementation is providing that the early exit does not?

seancorfield03:10:26

I deleted my comment because I convinced myself I was wrong, but now I'm not so sure -- I think it could lose data, because even if v is identical to newV, v might no longer be the current version by the time of the CAS.

rutledgepaulv03:10:15

yeah it's a little hard to think about. but i think it would not offer any fewer guarantees than already exist. there's no guarantee that by the time swap returns the value in the ref is still newV

seancorfield03:10:30

Is your suggestion that if v is identical to newV, it returns out of the for loop?

seancorfield03:10:57

In that case, it would not retry if v was no longer equal to the current value.

rutledgepaulv03:10:03

but having the AtomicReference be equal to newV at the time of return is not a guarantee currently offered. just that at some point in time AtomicReference was successfully set to the value of f.invoke(currentValueOfAtomicRef)? I see your point that the ref may have still changed before f.invoke(v) is done, but i am not sure that exiting early would violate any guarantees provided since there's no guarantee about the order of competing swap applications being applied anyway

seancorfield03:10:08

Yeah, that's an interesting point about what the final outcome is guaranteed (or not guaranteed) to be in cases where there would matter...

Alex Miller (Clojure team)03:10:59

unless I misunderstand what you want, you can't and shouldn't use atoms for this

rutledgepaulv04:10:40

can you clarify the "can't" scenario?

Alex Miller (Clojure team)04:10:28

you can't "read and then conditionally write" as separate operations

rutledgepaulv04:10:41

understood, i'm talking about deciding not to change the value passed into the swap function and just returning that same value in some scenarios

Alex Miller (Clojure team)04:10:59

you are spending more cycles trying to avoid contention that the cost of the contention

rutledgepaulv04:10:10

hm ok i will have to write some perf tests to prove it to myself. thanks

Alex Miller (Clojure team)04:10:12

"if the swap implementation never performed a CAS for an identical value" is nonsensical - you can't know the read value to determine identity without being in the CAS (that's the C)

Alex Miller (Clojure team)03:10:54

if you want to know whether a particular swap changed the value, you can do that with the newer swap-vals! which returns both the old and new value

rutledgepaulv03:10:35

thanks, I'm aware of swap-vals! - this isn't about detecting the result of a swap from user code, it's just a suggested way to reduce cas contention when using a swap function that doesn't always result in a modification

seancorfield03:10:53

@rutledgepaulv It feels a bit like you're trying to reinvent clojure.core.cache.wrapped or clojure.core.memoize? Can you explain a bit more about the actual problem you're trying to solve here?

seancorfield04:10:02

I think your function is trying to do something similar to clojure.core.cache.wrapped/lookup-or-miss (which deals with various edge cases around cache miss/race conditions).

rutledgepaulv04:10:49

Totally agree those are available / better mechanisms for doing caching. i've sometimes used an approach like this as a poor man's cache though and haven't encountered issues with doing so. Presumably there are other cases where a swap function sometimes degrades into identity ? i don't yet see a reason to spend a cas in those cases

seancorfield04:10:59

Given all the other issues involved with avoiding race conditions etc, avoiding CAS in this one case seems like a premature optimization...?

rutledgepaulv04:10:50

could be! I haven't measured any of this but if I had a lot of simultaneous operations on one atom the retries on f can increase pretty quickly and this is one way to remove a lot of those if some of the simultaneous f invocations end up not needing to do anything

seancorfield04:10:30

Try clojure.core.cache.wrapped/lookup-or-miss and see how it behaves. I'd be interested in feedback.

rutledgepaulv04:10:38

definitely looks like it implements the same delay / swap pattern. what i am wondering is if the swap! itself adds some overhead in the "hit" cases (if your hit is a no-op, which is probably not true of core.cache due to eviction policies, etc)

seancorfield04:10:12

The basic cache is a plain atom.

rutledgepaulv04:10:57

i'll take a look. fwiw i just wanted to discuss the idea and see if there was a potential opportunity for a backwards compatible improvement in core, i'm totally happy with "no" or "use something else if that matters to you" as the answer 🙂

seancorfield04:10:24

My gut says your proposed change could break existing programs but I have no idea how I would construct a test to show that 🙂 so I may be wrong and the differences wouldn't actually "matter". But I will say that it took quite a while to arrive at how lookup-or-miss works to 🙂

Alex Miller (Clojure team)04:10:00

atoms are very fast - clojure's atom is based on Java's atoms which typically compile right down to things with hardware support. I would not suggest trying to optimize it

rutledgepaulv04:10:30

that makes sense. but running f multiple times when f is content with a no-op seems like an opportunity?

Alex Miller (Clojure team)04:10:38

I just don't think it matters. people do this stuff with atoms all the time

rutledgepaulv04:10:19

yeah i don't think it's a massively important thing, just an idea. i will try to cobble together some perf test scenarios for my own edification if nothing else

vlaaad21:10:55

get/`get-in` could have 1-arity to make them useable in transducer context

vlaaad21:10:06

I'm in some very deep trees...

vlaaad21:10:45

ah, there is completing for that, right