Drew Verlee00:12:50

I assume you can reduce over a lazy-seq without any trouble. This code seems to be running a bit longer than I would think though (about 3 mins so far)... The file is 10 GB though.

(with-open [in-file (io/reader (io/resource kf))]
  (->> (cd-csv/read-csv in-file :separator \|)
       (sc/mappify {:header schema})
       ;; collect every :customer value into a set
       (reduce
        (fn [coll {:keys [customer]}]
          (conj coll customer))
        #{})))

Drew Verlee00:12:05

that's semantic-csv


you could replace the reduce with (into #{} (map :customer)) (leaving the rest the same) and that should be faster


since your reduce is equivalent to mapping :customer across the input


(then putting in a set)

Drew Verlee00:12:17

oh, right. Why would that speed things up?


because into uses a transient which is marginally faster for building collections, and it's likely (map :customer) inlines better (edit - for larger collections the speedup can be better than marginal)
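A minimal sketch of the two shapes being compared, using a small in-memory vector (`rows`, hypothetical) in place of the mappified CSV records. It only shows that both produce the same set; it does not measure the speed difference.

```clojure
;; Hypothetical rows standing in for the mappified CSV records.
(def rows [{:customer "a"} {:customer "b"} {:customer "a"}])

;; The original shape: conj each :customer onto a persistent set.
(def via-reduce
  (reduce (fn [coll {:keys [customer]}] (conj coll customer)) #{} rows))

;; Same result with a transducer; into builds the set through a transient.
(def via-into
  (into #{} (map :customer) rows))

(= via-reduce via-into) ;; => true
```

Inside the `->>` pipeline the collection is threaded in as the last argument, so `(into #{} (map :customer))` becomes the three-argument transducer form shown here.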

Drew Verlee00:12:03

is it possible to leverage multiple threads over chunks of the lazy sequence? Kind of like a map reduce?


What's the idiomatic way to do a map that has some state? I have a function f that basically takes x and returns y, but also takes in a "state of the world" s and returns a "changed state of the world" new-s, so that (f x s) is [y new-s]. Given an initial state s0, and some vector of xs like [x0 x1 x2 ...], I would like to compute y0 which comes from (f x0 s0) being [y0 s1], then y1 which comes from (f x1 s1) being [y1 s2], and so on for y2, y3, etc. (Also I would like to get the final value of s.)


If it wasn't for the s stuff, this would just be (map f xs).


There is probably some kind of Haskell higher-order function that does this, but of course I am interested in doing this in idiomatic Clojure.


that's a reduce


haskell calls it fold


I thought that maybe I can do something with reduce or reductions, but I could not see a nice way to arrange it.


you can use a hash map as your accumulator, and hold your state-of-the-world plus your ys under two keys
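A minimal sketch of that suggestion. The function `f` here is a made-up example (y is x plus the running state, and the new state is the old state plus x), and `map-with-state` and the keys `:state` and `:ys` are hypothetical names chosen for illustration.

```clojure
;; Hypothetical f: takes x and state s, returns [y new-s].
(defn f [x s]
  [(+ x s) (+ s x)])

(defn map-with-state
  "Threads state through f over xs; returns {:ys [...] :state final-s}."
  [f s0 xs]
  (reduce (fn [{:keys [state ys]} x]
            (let [[y new-s] (f x state)]
              {:state new-s :ys (conj ys y)}))
          {:state s0 :ys []}
          xs))

(map-with-state f 0 [1 2 3])
;; => {:state 6, :ys [1 3 6]}
```

The accumulator map carries both the collected ys and the latest state, so the final state falls out of the same reduce.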


@noisesmith Ah yes, that works very nicely, thank you!


What's the most computationally efficient way of mutating elements of a Java array? So far I've tried doseq + aset and it seems terribly slow.


I would guess use loop/recur to walk the index and make sure you type hint and deal with boxes correctly.


dotimes is probably a better higher level form than loop/recur on second thought


@naylyn.gaffney you could check out amap and areduce - you can run into pitfalls with numeric boxing with loops if you are not careful but those functions abstract that stuff a bit


also be sure to check for runtime reflection and numeric boxing in general
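A rough sketch of those suggestions for a `double` array. The function names (`scale!`, `squares`, `total`) are hypothetical, and whether this is fast enough depends on the actual workload.

```clojure
;; At the REPL, (set! *warn-on-reflection* true) flags any reflective calls.

;; In-place mutation: the ^doubles hint avoids reflection on aget/aset,
;; and ^double keeps the multiplier unboxed.
(defn scale! [^doubles arr ^double k]
  (dotimes [i (alength arr)]
    (aset arr i (* k (aget arr i))))
  arr)

;; amap returns a new array of the same type.
(defn squares [^doubles arr]
  (amap arr i ret (let [v (aget arr i)] (* v v))))

;; areduce folds over the array with a primitive accumulator.
(defn total ^double [^doubles arr]
  (areduce arr i acc 0.0 (+ acc (aget arr i))))

(def a (double-array [1.0 2.0 3.0]))
(vec (scale! a 2.0))   ;; => [2.0 4.0 6.0]
(vec (squares a))      ;; => [4.0 16.0 36.0]
(total a)              ;; => 12.0
```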


Having no experience with Java prior to Clojure it feels like there are a lot of landmines I need to avoid.


for performance of numerics there are a bunch of gotchas - honestly it's often easier to write java code and call it from clojure instead


but for correctness of programs that use a lot of threads (and readability of that code) - clojure can really excel


the reflection and numeric boxing are a tradeoff for using a language that can't be effectively statically analyzed - there are benefits too (repl based development, being able to define things piece by piece without restarting the whole thing) but it makes optimal numeric performance tricky sometimes


yeah I'm seeing that, it's starting to feel like I'm writing something closer to java than clojure at this point


What causes Math.abs to be available in my Clojure code? In one of my source files, the REPL says CompilerException java.lang.ClassNotFoundException: Math.abs. The same thing works in another file though. I have no idea why. I have not done anything to refer (or anything like that) Math in either file.

Alex Miller (Clojure team)16:12:11

All classes in java.lang are automatically imported for you in every namespace


@sabbatical2017 it should be (Math/abs -123)
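A short illustration of both points: static methods are called with `Class/method`, and `java.lang` classes need no import while classes from other packages do. `java.util.UUID` is just an arbitrary example of a class outside `java.lang`.

```clojure
;; Static methods use Class/method, not Class.method.
(Math/abs -123)   ;; => 123
(Math/max 3 7)    ;; => 7

;; java.lang classes such as Math need no import; other packages do.
(import 'java.util.UUID)
(string? (str (UUID/randomUUID)))  ;; => true
```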


@scriptor @alexmiller Thank you! I've been blind to that Math.abs instead of Math/abs ... too much Python exposure 😃

Drew Verlee18:12:55

Does anyone have any good resources about modeling the abstractions of your system as a FSM or maybe something like statecharts? I’m curious about the idea of creating a more contractual model for designing systems, and modeling them as a FSM seems like one way to do this. For example, a simple game might have something like player-one -> player-two -> game-over. It wouldn’t capture all the possible things that could happen to the program but would be a means to express the business domain.


@drewverlee this might not be what you are looking for, but petri nets are a formal language for designing asynchronous stateful systems that has a well defined semantics where you can make proofs about eg. whether it halts or deadlocks - similar to fsm but with no shared clock


I think reduce-fsm does something similar -

Drew Verlee18:12:52

Wow lots of responses. I’ll take a look at these in a bit. Thanks!


how would I take this object and turn it into something I can access the data of #object[clojure.core$future_call$reify__8097 0x3f7533a9 {:status :ready, :val 0}]?


(def x (future 10)) --> @x


That will return the thing that x is defined to be


You can also use the deref function explicitly, (deref x)


you can also use realized? to see if it is done yet (otherwise deref will block until it is)


you can also provide an optional extra arg to deref to tell it how long to pause and wait for it to complete before giving up
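A small sketch pulling those together. The two futures (`fast`, `slow`) and the `:timed-out` fallback value are made up for illustration.

```clojure
;; One future that finishes right away, and one that takes a while.
(def fast (future 10))
(def slow (future (Thread/sleep 2000) :done))

@fast                        ;; => 10 (blocks until the value is ready)
(deref fast)                 ;; => 10, the same thing spelled out
(realized? slow)             ;; false while the body is still running
(deref slow 100 :timed-out)  ;; => :timed-out, waited 100 ms and gave up
(future-cancel slow)         ;; interrupts the sleeping future
```

The three-argument `deref` takes a timeout in milliseconds and the value to return if the future has not finished by then.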


it’s definitely been realized since this is part of a result on a blocking function call, im not creating the future


and in the repl calling (realized?) returns true


right, just mentioning that in general that is a concern


im just not sure how to get at the data heh


like admay said - @ aka deref


A common pattern with realized? is (if (realized? x) @x ( ... ))


weird im just getting 0 when i do deref


oh is the status part of the future itself


there’s also future-cancel which will stop it (if it is sleeping or waiting on IO at least)


but that’s a no-op if it’s realized iirc


ah yeah so @ is doing the trick, i assumed :status was part of the return data for the future but i guess it’s an internal thing


thanks guys 🙂


I’m trying to figure out how go and <! behave in the context of the JS environment. The go documentation says that any visible calls to <!, >! and alt!/alts! channel operations within the body will block. First: what does “visible” mean? Second, does a function which uses a go block return immediately or does it only return once the go block finishes?


It returns a channel immediately.


No values will appear on the channel until the go completes.


@lee.justin.m The macro rewrites the body of the block into a state machine that will attach callbacks to the channels wherever there's a >! or <!


okay so visible literally means textually visible because go is a macro


although the "visible" bit is a bit strange, since <! and >! can be inside a macro and go will still see them


but it doesn't dig into function calls.


i see. so basically one needs to pass the channel around until ready to consume, then use it with a go block and the <! macro.


Well if you just want to consume, then use put! and take!


oh. i was just following the readme from cljs-http


the go blocks run in a task queue in JS. When the browser isn't busy the pending go blocks will run


i’m trying to map all this stuff to promises and/or callbacks and having a hard time of it


Well, it's a lot of the same sort of thing. Let's say you have (http-get url callback) in JS.


You could write this as (http-get url (fn [result] (reset! a result)))


Or we could put that value on a channel: (http-get url #(put! ch %))


but we may want to notify something when that put is done: (http-get url #(put! ch % done-cb))


So what core.async is doing is transforming calls to >! and <! into calls to (put! ch val done-cb) and take!


At a simplified level, when we have a callback we have to run later, core.async is calling (js/setTimeout .. cb) to queue it up
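A rough sketch of the callback-to-channel bridge described above. It assumes the core.async library is on the classpath, and `fake-http-get` is a hypothetical stand-in for a callback-style API like `(http-get url callback)`. It is written for JVM Clojure so it can be tried at a REPL; `chan`, `put!`, `take!`, `go` and `<!` are also available in ClojureScript.

```clojure
(require '[clojure.core.async :as async :refer [chan put! take! go <!]])

;; Hypothetical stand-in for (http-get url callback): calls back at once.
(defn fake-http-get [url callback]
  (callback {:url url :status 200}))

(def ch (chan 1))

;; Callback style: bridge the callback onto a channel with put!.
(fake-http-get "http://example.com" #(put! ch %))

;; Consume with a plain callback via take! ...
(take! ch (fn [result] (println "take! got" result)))

;; ... or inside a go block, where <! parks until a value arrives.
;; The go form itself returns a channel immediately; the body's value
;; is delivered on that channel once the block finishes.
(def result-ch
  (go (let [c (chan 1)]
        (fake-http-get "http://example.com" #(put! c %))
        (:status (<! c)))))
```

On the JVM, `(async/<!! result-ch)` blocks for the result; ClojureScript has no blocking take, so there the value is read with `take!` or another go block.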


I see. So ergonomically go blocks are more like promises or async/await and put!/take! are more like callbacks. The upshot seems to be that you need to use the right functions from core.async to interact with the channel (which of course makes total sense)


I get that they are not implemented the same, but that’s very helpful.


actually you're right, async/await in JS are very much the same sort of transform