Fork me on GitHub
#clojure-dev
<
2019-08-02
>
andy.fingerhut00:08:56

OK, for any Java concurrency experts out there, I am now less sure of transient collection thread safety for passing transient vectors (at least, as first example I am looking at) from one thread to another. Suppose everything is nicely synchronized between two threads when things begin, then thread 1 calls transient on a vector and does a pop! on it, which caused its count to decrease from 3 to 2. Thread 2 gets a reference to this post-pop! transient vector, and calls count on it. Is thread 2 guaranteed to get the updated count when pop! completes? A few more details in the thread I will start on this.

andy.fingerhut00:08:36

I think maybe the answer is "no". Why not? Because even though all of the fields of class PersistentVector$TransientVector are volatile: https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentVector.java#L520-L524

andy.fingerhut00:08:18

The pop! operation for transient vectors ends by assigning a value to the root field, followed later by decrementing cnt: https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentVector.java#L787-L792

andy.fingerhut00:08:45

Thread 2's count starts by calling ensureEditable, which reads the volatile field root (I am assuming that this happens strictly after pop! completes for this example), then reads the volatile field cnt: https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentVector.java#L537-L540

andy.fingerhut00:08:17

So we know Thread 1's pop! causes write of root to happen-before write of cnt, by program order of operations in a single thread.

andy.fingerhut00:08:30

And we know Thread 2's count causes read of root to happen-before read of cnt, for the same reason.

andy.fingerhut00:08:59

But even if we assume that Thread 1's write of root happens-before Thread 2's read of root, that doesn't seem to require Thread 2's read of cnt to see the value of Thread 1's update of cnt. That is: separate volatile fields do not have any coordination between them.

andy.fingerhut00:08:18

For this example scenario, it seems that putting an assignment to the field root in method pop at the very end, after updating cnt, would be safer.

andy.fingerhut00:08:47

All TransientVector methods that are public either start with ensureEditable(), or some other function that calls ensureEditable(), which reads the field root (good), but in order for that to guarantee a consistent state among all 4 volatile fields, it seems like all methods that modify the collection state should also end with an assignment to root.

andy.fingerhut00:08:38

Perhaps an answer is: depending upon precisely how thread 2 got the reference to the transient collection, it might introduce more synchronization constraints with thread 1, that guarantee things are correct.

Alex Miller (Clojure team)04:08:27

You might be right, but I’d have to do some code study to tell. I vaguely recall there being a jira in this area too if you want to hunt it up

Alex Miller (Clojure team)04:08:05

There is a total ordering of all volatiles and some additional rules related to happen-before too iirc

andy.fingerhut07:08:00

https://clojure.atlassian.net/browse/CLJ-1580 is the one I found that looks most related, and its patch was merged in with the same Clojure release where the birth-thread check was removed from transients.

leonoel07:08:05

what makes transients memory consistent is not volatiles, it's the golden rule of transients : always use return value for later changes. To follow this rule in a multithreaded context implies a synchronization of t1 and t2 such that t1's pop! returned HB t2's count called, thus ensuring all changes made by t1 are visible on t2.

andy.fingerhut07:08:49

I agree that may be correct, but is it necessarily true that t2 getting a reference to the updated transient value requires a synchronization from t1 to t2? If so, why all the warnings in JCIP about improper publishing of a reference to an object?

andy.fingerhut07:08:43

And if it isn't the volatile's, why bother adding them to the implementation?

leonoel07:08:32

JCIP's safe publication stuff is about visibility guarantees on race conditions. Transients are explicitly designed to be updated one thread at a time, so I guess it's safe to assume no race conditions here.

leonoel07:08:22

BTW I would be very glad to know the purpose of all these volatiles in the implementation as well

leonoel11:08:39

probably the same purpose as volatiles in stateful transducers

andy.fingerhut20:08:57

In response to this statement of yours Alex: "There is a total ordering of all volatiles and some additional rules related to happen-before too iirc" I think it is true that there is a total ordering of all accesses to a single volatile field among all threads, but at least from my reading so far I haven't seen anything that says there must be a total ordering among accesses to two different volatile fields (unless there is some other constraint like program order or monitor locks, etc.)

andy.fingerhut23:08:16

FYI for those interested I finally noticed that Goetz's book "Java Concurrency In Practice" (JCIP) has a chap. 16 that attempts to make explicit connections between the Java Memory Model specification terminology, and recommendations made elsewhere in the book. Nice.

👍 4
andy.fingerhut02:08:25

One source for the purpose of the volatiles is here: https://clojure.atlassian.net/browse/CLJ-1580 Those were a continuation of changes started with this ticket: https://clojure.atlassian.net/browse/CLJ-1498