Fork me on GitHub
#architecture
<
2018-08-05
>
john15:08:17

So today I'm endeavoring to build, in clojurescript, a threadsafe alloc/`free` mechanism over a js/SharedArrayBuffer slab. Pointers are welcome! Feel free to point out any pointers on pointers.

orestis15:08:13

I thought SAB was decommissioned after the Intel Spectre bugs?

john15:08:56

Temporarily... They're going to introduce some time obfuscation at the super low level, at some later date, iiuc

john15:08:15

SABs are critical to WASM eventually interoperating with JS, if I understand correctly

john15:08:20

SABs aren't thread safe, but they have an js/Atomics interface where you can atomically (CAS) swap out one byte at a time.

john15:08:55

You can js/Atomics.wait on a particular byte, and then another tread can `js/Atomics.wake` js/Atomics.notify on that byte and any waiting threads will be woken up.

john15:08:17

So when we just wake up every thread trying to get a lock, each tries to swap their id into lock the byte. Whoever wins proceeds with access to the rest of the data structure in the allocation. The rest go back to sleep waiting on the lock byte.

john15:08:13

What I've been doing, the data structure is duplicated in two parts of the SAB. So when someone gets a lock, they can only write to a second version of the structure. Once they're done writing, the flip the internal pointer to the new structure. That way, you get thread safe, non-blocking reads.

john15:08:12

The downside is that we cannot synchronously expand the size of a SAB. So unless we can predict the possible expansion rate of the memory beforehand, we may fail on asynchronously growing the memory fast enough.

john15:08:41

If you know you'll never add more than twice the previous size of the memory, we can pre-swap out the SAB with bigger SABs.

john15:08:48

One problem with that scheme is that you double your memory usage, obviously. Also, constantly growing and shrinking little SABs is more likely to hit those edge cases where you fail to grow them fast enough. So just using pointers on a SAB slab would amortize that failure rate as well across a larger memory pool.

john15:08:12

So this is the code I'm looking at right now: (you can ignore the sab and int32, as they're not yet touched in the below implementation. just the control maps are showing the memory would be managed. btw, in the maps, keys are indexes into the int32, and values are lengths)

john15:08:43

Also, there's going to be another SAB in gmem which is the "control structure" that stores :alloc and :free which controls the locking and current address. There'll likely be a pointer map that atomically points to the latest/current address of the object. And a lock map (of CASable bytes) that control who has access to updating the structure.

john15:08:22

I'm thinking I may need a read counter too, which readers can atomically inc and dec on, delaying any free calls until the read counter is at 0, so as to prevent new allocations from writing over a slow reader.

john15:08:29

I can't imagine a writer winning that race though, since this is an ordered block of memory and writers will always be staring after readers, at a location to the left any readers in the buffer.

john16:08:37

I'm not as much worried about the performance of the above implementation at them moment, over correctness. For best performance, the whole thing could be converted to bit-twiddling directly on the SAB

john16:08:19

For the time being, I'm going to store all these control maps as serialized strings in the control SAB

john17:08:12

the problem with the read-counter idea (if it's even necessary) is that misbehaving workers may die or get killed by a watchdog service during mid-read, so the counter may never reach zero. In that case, we'd need rather a read-registry of IDs, so a watchdog service can remove all registrations for readers that fail some health checks.

john17:08:57

Or perhaps free can have a few dozen seconds timeout on reads, and just assume any threads that haven't decd the counter must be dead workers. No single read should take that long anyway.

john23:08:30

K, here's a more thorough example model. It has a vector (vab) pretending to be an ArrayBuffer. As new objects are assigned to pointers, the old memory gets deallocated.