Fork me on GitHub
#core-async
<
2015-07-31
>
rauh11:07:49

@meow: This might sound harsh but: I wouldn't use/create a function like chan-of-primes (except a convenience function) since it's very inflexible and can barely be re-used. All you can ever do is create a channel of primes. You instead want to transform data independently, ie. without information about channels. (if that's possible). If you go the transducer way you can still compose it with other transducers, apply it to your normal collections and to your channels. You should also accept input instead of hard-coding the chan-of-ints. This means you've solved a more general problem which might allow future re-use of some kind.

rauh11:07:18

Remember, in clojure it's all about functions. If your function doesn't accept any parameter then you've likely written a non-reusable function.

meow11:07:52

@rauh: I haven't even read what you wrote yet. I woke up feeling like I understood what was going on with the transducer you showed me yesterday so I dove into the code and, wow! Check it out:

(defn chan-of-primes []
  (let [ints   (chan-of-ints 2)
        sieve  (posmod-sift)
        primes (chan 1 sieve)]
    (pipe ints primes)
    primes))

rauh11:07:13

@meow: Yes! That looks a lot better simple_smile. I like the name posmod-sift

rauh11:07:10

You could even do (onto-chan (chan 1 sieve) (range 2)) I believe

meow11:07:32

@rauh: I agree with everything you've said and am ever so grateful for you being willing to be "harsh" because that's the kind of feedback I need to improve as a programmer (and you weren't really harsh at all)

meow11:07:11

After I worked up this new code I couldn't believe how ugly my previous attempts looked.

rauh11:07:55

Cheers to that! I'm sure you learnt a lot with the dozen versions you've come up with.

meow11:07:11

I just knew there had to be better ways to solve this problem and I wasn't seeing it in the wild.

meow11:07:33

And, yes, now that I got the version using only channels, I'll follow up with examples of how this transducer will let you work with other types of collections and contexts and how the channel-of-ints isn't even necessary.

meow12:07:36

@rauh: This is so cool - this final version of the code is the first one that works well in my cljs app - I'm able to run it and still get 60 frames per second.

rauh12:07:06

@meow: Awesome! That's probably because of the volatile and mutable state. It's pretty fast.

meow12:07:39

@rauh: I'm loving it. simple_smile

meow12:07:50

at 30,000 primes found I'm down to 50fps, but that's not bad for a simple algorithm like SofE

rauh12:07:14

In cljs?

meow12:07:36

I'm working on using things like core.async and goog.async and my own event loops to be able to have cljs apps that can sustain 60fps

meow13:07:07

at 64,000 primes found I'm still getting 23fps

meow13:07:06

I think this quote from Rich Hickey sums things up nicely:

In working recently on providing algorithmic combinators for core.async, I became more and more convinced of the superiority of reducing function transformers over channel->channel functions for algorithmic transformation.

erik_price13:07:45

The problem I have with that is that in my own experience, the majority of channel->channel functions aren’t algorithmic transformations but rather involve I/O… given this channel of entities, look up some other entities for each one and do X with them, etc.

erik_price13:07:21

Obviously that’s a function of the kind of systems that I’ve been working on, and not applicable to everyone, but I’m sure there’s a large subset of Clojure developers to whom it is.

meow13:07:26

@erik_price: From what I'm now understanding about transducers the nature of the algorithmic transformation is pretty open-ended.

meow13:07:03

Well, we might need to work up a specific example, but let's start with the premise that the operations you are describing could be coded as transducers.

erik_price13:07:37

Imagine I have a source of messages (coming off a distributed message queue). Each message goes into a channel pipeline, where they are first validated, then transformed – those can be done via transducers – but the next step of my channel pipeline is to fetch a file off of S3 and also issue a database query. My channel pipeline then joins this data into some data structure, and passes it along to the next phase. That step requires that I either use something like clojure.core.async/pipeline-async, or break the channel pipeline up by introducing a go block that pulls values out of the channel, performs the work, and then puts them back into a channel and passes that along to the next part of the pipeline. Earlier, I said “the problem I have with that”, and that was not a wise choice of words. What I should have said is “there are certain transformations for which transducers cannot or should not be used”.

meow13:07:33

I am at that point where I'm wondering what the limits of transducers are or should be

meow13:07:57

Clearly I haven't been using them enough or appreciating their flexibility

meow13:07:36

So I'll probably push them by using them where they shouldn't be used, just to learn from my own mistakes. simple_smile

meow13:07:29

But I suppose that putting a go block inside a transducer is probably a warning sign that I've gone too far...

erik_price13:07:08

To my understanding, which is not expert, blocking operations of any kind should be avoided in transducers. Just use them for, as Rich puts it, “algorithmic transformation”.

ghadi19:07:57

Keep in mind channel transducers run in the context of a channel mutex, so they may possibly exacerbate contention

erik_price19:07:31

@ghadi: are you saying that the execution of a transducer places an exclusive lock on the channel as far as take and put operations are concerned?

meow20:07:13

@ghadi: I was actually wondering how channel transducers are processed because I was thinking that syntax like the following would be nice:

(into (chan) xform (range 100))

meow20:07:50

just treating a chan like a coll

meow20:07:13

and then I thought that the xform on a channel must somehow sort of treat it that way already

meow20:07:33

gonna go dive into the source code and see what I find

ghadi20:07:24

@meow: just as transducers transform some sort of "stepping function", channel transducers transform the add! operation of adding to a channel's buffer. https://github.com/clojure/core.async/blob/d07389619/src/main/clojure/clojure/core/async/impl/channels.clj#L288

meow21:07:51

@ghadi: right, I saw that

meow21:07:35

@ghadi: so would there be any advantage to being able to treat a chan as a coll where something like into just did a put! onto the channel?

ghadi21:07:02

not really. there's already core.async/into that does what you want

meow21:07:07

not sure that's what I'm looking for, but I find the docstring utterly confusing even though the code is only this:

(reduce conj coll ch)

meow21:07:40

onto-chan is what I've been using

meow21:07:38

yeah, core.async/into is definitely not what I had in mind

meow21:07:29

what I was thinking of was a way to take a stream of values, like from (range) and use them as input into a chan

meow22:07:28

so basically being able to use into instead of core.async/onto-chan

meow22:07:09

@rauh: just for the record, my earlier stats about frames per second while generating primes in the browser was with unoptimized code compilation - compiling with :advanced is way faster still

rauh22:07:47

That's awesome, I wonder how early the some? exits

rauh22:07:17

since I guess most of the time it's really early and only every now and then it has to run through the whole thing

rauh22:07:44

sorry, i mean every?

rauh22:07:01

though it could be turned into a some by boolean logic with negating both simple_smile

meow22:07:25

not sure if you noticed that I have the transducer state using a vector instead of a set for that very reason - so every? will fail asap

rauh22:07:40

Regarding into: It has different semantics. The first argument is a collection which always allows adding elements. With channels you don't have that. You can say "channel full, hold on"

meow22:07:58

ah, good point

rauh22:07:21

So it wouldn't make sense (or it would surprise people) to do extend that. Being explicit (thus other function) is better

rauh22:07:49

The whole thing about core.async is to find synchronization points in your program

rauh22:07:26

from both ends, producer and consumer. Which is very powerful. (much more than other async libraries in other languages)

meow22:07:16

No, the whole point about core.async is to invent synchronization points in your program so you can play with a new toy... 😉

rauh22:07:36

or that simple_smile

meow22:07:50

so this is the final version I'm playing with now, and I think I'm quickly losing interest in taking this primes thing any further:

(defn chan-of-primes []
  (let [inputs (filter odd? (drop 3 (range)))
        primes (chan 1 (posmod-sift))]
    (put! primes 2)
    (onto-chan primes inputs)
    primes))

meow22:07:24

it drops the evens so they won't fill up the transducer state

rauh22:07:42

Yeah, I'd be content with that solution. Not sure what more to poke around. I'm kinda out of ideas

rauh22:07:32

A parallel version, maybe? simple_smile

meow22:07:22

well, I was toying with trying to force range to go from 3 to infinity with a step of 2 as a possible optimization but it would have to work on both clj and cljs and at some point this isn't really about optimizing the finding of primes anyhow but just having a good example

rauh22:07:01

That's actually interesting: Write a pevery? or peach? that does runs with k parallelism and each thread testing every kth value.

meow22:07:02

so I punted with filter odd? as good enough

meow22:07:52

no peaches in the browser - only one thread

rauh22:07:55

Then coordinating an short circuit if one of them fails so the condition can't be satisfied

meow22:07:45

but don't let me stop you... simple_smile

rauh22:07:03

ha, I have no intention to do that.

ghadi23:07:25

pkobrien: async/into takes and returns a chan: (let [v (async/into [] achannel)] (<!! v))

ghadi23:07:52

v will only yield its vector when achannel closes

ghadi23:07:36

(same with the underlying async/reduce, which is different than core/reduce)