Fork me on GitHub
#core-async
<
2017-05-25
>
jfntn14:05:33

Is it possible to encode a chan as transit as if it were a single collection?

noisesmith17:05:24

you would need a new channel type if you wanted to get the contents of the chan (unless you want to dump the contents into a collection and send that I guess) - what is the purpose of sending a chan?

noisesmith17:05:34

I definitely have abstractions where each item coming from another host is decoded by transit and put onto a chan, but I can’t think of why I would want to send a chan, or if I did how I would want that chan to behave

jfntn18:05:17

@noisesmith our use-case is we’re getting a chan of db rows from cassandra, and we want to use that as a response body in pedestal

jfntn18:05:52

but we need to encode the chan as transit, and have it read back as a collection on the client

noisesmith18:05:53

don’t send the chan, send the things that come off the chan

noisesmith18:05:09

a chan is not a container

noisesmith18:05:25

it’s an object that does io

jfntn18:05:36

the response can be large and it’d be great if we could use the chan as a backpressure medium

jfntn18:05:58

@mpenet yes! alia is great really enjoying it 🙂

noisesmith18:05:42

@jfntn sure, you can pipe from the channel into a stream that encodes each element as it comes off for example

noisesmith18:05:55

a transit writer that loops on chan reads can do this

jfntn18:05:02

There’s a hack that works, where we wrap the chan and return a chan of transit strings, we wrap the contents in brackets, and interpose commas, the client reads that back as a vector

noisesmith18:05:27

you don’t need this with transit - just keep writing more objects to the writer

noisesmith18:05:34

and make sure you loop on reading on the other end

noisesmith18:05:00

the output stream coming from the writer will have the bytes ready as they come if you tie it together properly

noisesmith18:05:18

none of this is core.async or chan specific though

jfntn18:05:21

hmm need to look into that, think this would have to bridge to java.nio to work with pedestal though

noisesmith18:05:24

the transit shouldn’t ever need to become a string in this scenario

noisesmith18:05:43

@jfntn make the chan’s writer out of your nio output stream direclty

mpenet18:05:50

You can pass a custom chan to execute-chan-buffered that has a transducer that does transit encoding of each chunk

noisesmith18:05:13

it’s simpler to just wrap your network io object in the transit writer htough

noisesmith18:05:26

and write each item as you get it

mpenet18:05:25

I am not familiar with transit, and the particular use case, but there are options for surr :)

mpenet18:05:35

(on phone meh)

jfntn18:05:57

@mpenet I happened to stumble upon the issue you opened for pedestal regarding blocking calls in the go-loop, do you think this use case is worth the trouble given the current limitations?

noisesmith18:05:53

if this is about doing io inside go, definitely always put io writes and reads inside a thread call if you are in a go block

mpenet18:05:56

If they didnt fix it I d stay away from use cases like this yes

noisesmith18:05:23

or is this something where pedestal is already doing it wrong and you don’t really get to decide?

mpenet18:05:29

Pedestal does io in go blocks (oddly enough)

mpenet18:05:50

Dont remember all the details but check the issue

noisesmith18:05:16

that’s too bad

mpenet18:05:27

It was acknoledged to be a real issue in privmsg on slack, and would be fixed asap, but apparently it s still here

jfntn18:05:45

It might be better to turn the results chan into a ReadableByteChannel like @noisesmith suggested, this would be more performant than my core.async hack, and would sidestep the core.async issue in pedestal :thinking_face:

jfntn18:05:46

wasn’t planning on shaving that yak!

noisesmith18:05:08

in fact, code that writes transit directly to your destination will actually be simpler than the code that used strings in many cases - I think it’s intentional that transit doesn’t provide any string conveniences

jfntn18:05:25

@noisesmith I’m not familiar with nio, would you happen to know whether I should be looking into encoding to a ReadableByteChannel or a ByteBuffer? These are the two streamed and async containers that pedestal supports.

noisesmith19:05:00

use the one that can most conveniently be the target of an OutputStream - iirc that’s easy with a ByteBuffer

noisesmith19:05:51

you can make a readablebytechannel from an inputstream using newchannel, the inputstream can be tied to the outputstream you write to from transit

noisesmith19:05:36

(I haven’t done this with nio - hope it’s not too tricky but that api looks like it will do it)

noisesmith19:05:22

@jfntn I would move the a/thread to directly wrap the io call, and park on the result in a go block

noisesmith19:05:55

But yeah, that's petty close to it

jfntn19:05:50

Not sure what you mean by parking the result?

noisesmith20:05:03

(go-loop [] …. (<! (a/thread (transit/write ...))) …)

noisesmith20:05:33

that way you get the advantage of using core.async (go blocks that can park) without starving the thread pool

jfntn20:05:20

Oh ok, you wait on the values from the chan in a go block, but you start a thread for the write?

noisesmith20:05:34

right, that is what a/thread is for

noisesmith20:05:41

so you can park on something that would otherwise block

jfntn20:05:26

So this would have the advantage of not using up a thread unless doing actual i/o right?

noisesmith20:05:43

right - spending more time parking rather than using a thread

noisesmith20:05:25

(it lets you use <! instead of <!! in the channel take above, which likely takes longer than the individual transit/write calls do)

noisesmith20:05:54

yup, and the go could be a go-loop, but that’s just syntax it’s the same behavior

jfntn20:05:31

I think I need it for the resource cleanup, but anyway thanks for the feedback

noisesmith20:05:44

what? - you still try/catch

noisesmith20:05:47

but inside go-loop

noisesmith20:05:55

oh - wait, I think I get it now

jfntn20:05:46

I still find it really difficult to get a sense for when work should be offloaded to a thread vs right in the go block, esp if things are cpu bound. Say I had a transform fn in there, as in transform before writing, I think in the latest version i’d just apply it before the transit/write inside the thread, but I wouldn’t be confident that’s the best place to do it

noisesmith20:05:01

yeah- there’s some ambiguity - but in my experience things that use enough CPU to be a real issue will be pretty rare and I’ll know what they are - and they definitely go in a thread call

noisesmith20:05:24

like if I’m calculating graph metrics on an adjacency list and I know it’s going to take a few minutes

jfntn20:05:01

hmm so unless something takes multiple seconds, just don’t worry about it and use go?

noisesmith20:05:29

even the transit write - until you get a network hiccup it’s actually not going to block long enough tobe a problem - it’s just the unpredictable network lag that makes the thread call neccessary

noisesmith20:05:05

I’d say unless it aproaches a decent fraction of a second - but it’s worth measuring - the symptoms of go being overloaded are noticible

jfntn20:05:25

right so if you benchmarked it it’d be faster in the go block, but when things get congested in prod you’re better off with defering to a thread

noisesmith20:05:28

and it will be a different consideration depending on how many parts of your app are using go blocks

noisesmith20:05:05

@jfntn yeah - or in my app’s case I profile and I look for long green rectangles in the CPU usage overview inside threads that belong to core.async

noisesmith20:05:31

I take each one and move them to thread calls, and then I just need to profile at least ~once per major release of the code

noisesmith20:05:36

which is a good thing to be doing anyway

jfntn20:05:05

sounds good

jfntn20:05:52

One thing I’m not clear on is I used to only have one transit/read invocation per payload on the ClojureScript side. But with the function we looked at it seems I’ll need some out of band information to decide when to use a different read fn that will wrap multiple invocations of transit/read in a lazy-seq, is that how you’d do it @noisesmith ?

noisesmith20:05:06

@jfntn the most elegant thing would be to always return a sequence of results, and the old code would just get a single item

noisesmith20:05:24

but that means changing the old code’s consumers of course

noisesmith20:05:47

because no matter how many times a single transit was written to, you can always loop until it has no more and put them all in a coll

noisesmith20:05:13

but really I’d use my best judgment based on how deeply the concept of only one result from transit is baked into the app logic

jfntn20:05:04

right, I guess the bracket/commas hack is also an option with the ReadableByteChannel

noisesmith20:05:47

yeah, but I don’t like using hacks I don’t need, and transit already handles the case the brackets / commas thing was meant to cover

noisesmith20:05:00

but it’s your code, of course 😄

jfntn20:05:33

I think the best solution for our re-frame app will be to set a stream? flag on our remote effect and use that in the interpreter as a signal to use the lazy reader

jfntn20:05:58

It will be a while before I can run some full-stack benchmarks to compare the solutions, but I feel like this has been a productive yak shaving session, thanks for your help @noisesmith & mpenet !

noisesmith20:05:32

heh - no problem, I hope it proves productive

jfntn20:05:49

I’ll let you know when I have some numbers to compare

jfntn22:05:57

@noisesmith starting to look like the ClojureScript reader for transit doesn’t actually support parsing that, have you tried that approach yourself?

noisesmith22:05:58

I’ve done multiple writes followed by multiple reads… but not between cljs and clj

hlship22:05:27

I'm writing a simple patch for core.async, but I don't know how to run the clojurescript tests. lein cljsbuild test builds a test.js file, but doesn't seem to execute it. I haven't touched ClojureScript in a couple of years, not sure how to proceed.

hlship23:05:46

Ah, found the script/test file ... let's see ...

hlship23:05:19

I've installed the current version of v8, but still fails:

16:08:30 ~/workspaces/github/core.async > V8_HOME="/usr/local/bin" script/test
Compiling ClojureScript...
Compiling "tests.js" from ["src/test/cljs" "src/main/clojure/cljs"]...
WARNING: Wrong number of args (1) passed to cljs.core.async.runner-tests/pause at line 71 src/test/cljs/cljs/core/async/runner_tests.cljs
WARNING: Wrong number of args (1) passed to cljs.core.async.runner-tests/pause at line 83 src/test/cljs/cljs/core/async/runner_tests.cljs
Successfully compiled "tests.js" in 9.292 seconds.
Testing with V8
tests.js:2: ReferenceError: document is not defined
if(typeof goog == "undefined") document.write('<script src="../out/goog/base.js"></script>');
                               ^
ReferenceError: document is not defined
    at tests.js:2:32

SPIDERMONKEY_HOME not set, skipping SpiderMonkey tests
JSC_HOME not set, skipping JavaScriptCore tests
16:09:08 ~/workspaces/github/core.async >

Alex Miller (Clojure team)23:05:28

I do not think that script is actually used now (or at least I haven’t used it recently, maybe David uses it)

Alex Miller (Clojure team)23:05:18

I usually: 1. lein do clean, cljsbuild once 2. open script/runtests.html 3. Open JavaScript console and look for test output like: “Ran 43 tests containing 134 assertions. 0 failures, 0 errors.”

hlship23:05:01

Ran 44 tests containing 134 assertions.

hlship23:05:13

@alexmiller I'm updating CONTRIBUTING.md as part of this. Will it break anything if I change "runtests.html" to simply say "Check console for results."?

hlship23:05:39

And if the script is not used, could we delete it?