This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-07-09
Channels
- # announcements (2)
- # babashka (33)
- # beginners (122)
- # bristol-clojurians (1)
- # calva (6)
- # chlorine-clover (3)
- # cider (45)
- # clara (10)
- # clj-kondo (3)
- # cljsrn (17)
- # clojure (80)
- # clojure-dev (21)
- # clojure-europe (86)
- # clojure-italy (5)
- # clojure-japan (5)
- # clojure-losangeles (7)
- # clojure-nl (5)
- # clojure-portugal (3)
- # clojure-uk (31)
- # clojurescript (30)
- # conjure (4)
- # core-async (29)
- # cursive (20)
- # data-science (25)
- # datomic (7)
- # duct (17)
- # figwheel-main (73)
- # fulcro (23)
- # jobs-discuss (36)
- # juxt (5)
- # kaocha (2)
- # lambdaisland (6)
- # luminus (5)
- # malli (17)
- # mount (10)
- # music (7)
- # off-topic (16)
- # re-frame (30)
- # ring (17)
- # rum (1)
- # shadow-cljs (10)
- # spacemacs (10)
- # specmonstah (4)
- # sql (45)
- # tools-deps (21)
- # xtdb (20)
I read this article about using core.async and transducers to read and process a CSV file https://www.javacodegeeks.com/2017/12/gettin-schwifty-clojures-core-async.html The author closes on a slightly disappointed note regarding the performance. I wonder if there is something that could be done to optimize their code. Also, I read elsewhere that it's better to use blocking puts for IO.
on a quick skim it looks like they are doing IO inside a go block, which is a very bad idea
go isn't a mechanism for faster throughput, a dedicated thread pool is much better at that, it's a mechanism for async coordination, which this task hardly needs
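The two points above (keep blocking IO out of go blocks; use blocking puts for IO) can be sketched roughly like this. This is an illustration, not the article's code; `lines->chan` is a hypothetical helper name:

```clojure
(require '[clojure.core.async :as async :refer [chan thread >!! close!]])

;; Sketch: do blocking file IO on a real thread via `thread`, never
;; inside a `go` block, and use blocking puts (>!!) for the IO side.
(defn lines->chan
  "Reads lines from a java.io.BufferedReader on a dedicated thread,
   putting each line onto the returned channel, then closes it."
  [rdr]
  (let [out (chan 1024)]
    (thread
      (with-open [r rdr]
        (doseq [line (line-seq r)]
          (>!! out line)))
      (close! out))
    out))
```

A consumer can then take from the channel with `<!!` (on a real thread) or `<!` (inside a go block) without the reader ever parking a go-block thread.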
everything about this is, imo, a weird approach that forces the task into using every core.async construct
it is probably much simpler and faster to just write a tight sequential loop
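For contrast, the "tight sequential loop" would be just a reduce over the file's lines, with no channels at all. A minimal sketch, where `process-row` stands in for whatever per-row work the article does:

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

;; Sketch of a tight sequential loop over a CSV file: one pass, one
;; thread, no coordination machinery. Returns the number of rows seen.
;; `process-row` is a hypothetical stand-in for the real per-row work.
(defn process-file [path process-row]
  (with-open [r (io/reader path)]
    (reduce
      (fn [n line]
        (process-row (str/split line #","))  ; naive split, no quoting
        (inc n))
      0
      (line-seq r))))
```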
if you truly want to parallelize it, you probably want to memory-map it or use RandomAccessFile, break it into n chunks, then run that same tight loop over each chunk. the first part of that is somewhat complicated interop (and needs to take into account finding "line" breaks)
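A rough sketch of the chunking step using RandomAccessFile (the simpler of the two interop options mentioned): pick n approximately even byte offsets, then push each one forward to the next newline so no line straddles two chunks. Illustration only, with no handling of quoted CSV fields containing newlines:

```clojure
(import '(java.io RandomAccessFile))

(defn chunk-offsets
  "Sketch: split the file at `path` into ~n byte ranges [start end],
   each ending on a line break, so chunks can be processed in parallel."
  [path n]
  (with-open [raf (RandomAccessFile. ^String path "r")]
    (let [len (.length raf)
          ;; advance pos to just past the next \newline (or to EOF)
          align (fn [pos]
                  (.seek raf pos)
                  (loop [p pos]
                    (let [b (.read raf)]
                      (cond (neg? b) len
                            (= b 10)  (inc p)       ; byte 10 = \newline
                            :else     (recur (inc p))))))
          starts (into [0] (distinct (mapv #(align (quot (* % len) n))
                                           (range 1 n))))]
      (mapv vector starts (conj (subvec starts 1) len)))))
```

Each `[start end]` pair can then be handed to a worker thread that seeks to `start` and runs the tight loop until `end`.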
ghadi spoke a bit about something like this in Slack a while ago: using a custom pipeline iteration and a file-walker pump to saturate cores. I made a gist out of it but would love to see a proper blog post about it
well the ghadi stuff above is eventually probably coming to clojure and core.async and there will be some bloggy things when we get to that
or maybe I'm conflating
yeah, sorry nvm!
the gist has comments below it explaining how to wire it together, which are super helpful for getting an idiomatic core.async pipeline up and running while safely doing tons of work
the stuff excerpted above takes a filesystem walker, and pipelines over the stream of files, shelling out to a process for each file
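The shape being described (not the gist itself) looks roughly like this: a thread pumps a file walk onto a channel, and `pipeline-blocking` bounds the number of concurrent shell-outs by the core count. `wc -l` is a placeholder for whatever per-file process the real code ran:

```clojure
(require '[clojure.core.async :as async]
         '[clojure.java.io :as io]
         '[clojure.java.shell :as sh])

;; Sketch: walk a directory tree onto a channel, then pipeline-blocking
;; over the stream of files, shelling out to one process per file.
(defn process-tree [dir]
  (let [files   (async/chan 64)
        results (async/chan 64)]
    ;; pump: feed file paths from a real thread (blocking puts are fine here)
    (async/thread
      (doseq [^java.io.File f (file-seq (io/file dir))
              :when (.isFile f)]
        (async/>!! files (.getPath f)))
      (async/close! files))
    ;; bounded parallelism for the blocking shell-outs
    (async/pipeline-blocking
      (.availableProcessors (Runtime/getRuntime))
      results
      (map (fn [path] [path (sh/sh "wc" "-l" path)]))  ; placeholder command
      files)
    results))
```

`pipeline-blocking` (rather than `pipeline`) is the right variant here because each step blocks on an external process.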
As an aside, I can delete or make that gist private if you don't like me copying and preserving you like that
@alexmiller may be worth considering making CompletableFuture interop with channels better
@hiredman has a gist about it, and L8-23 above are a manual adaptation of CF -> channel
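One common shape for such a manual adaptation (a sketch, not the gist verbatim) is to complete a promise-chan from the CompletableFuture's callback:

```clojure
(require '[clojure.core.async :as async])
(import '(java.util.concurrent CompletableFuture)
        '(java.util.function BiConsumer))

;; Sketch: adapt a CompletableFuture to a core.async channel. The
;; returned promise-chan yields either the result or the Throwable.
;; Caveat: a CF that completes with nil would need separate handling,
;; since channels can't carry nil.
(defn cf->chan [^CompletableFuture cf]
  (let [out (async/promise-chan)]
    (.whenComplete cf
      (reify BiConsumer
        (accept [_ v ex]
          (async/put! out (or ex v))
          (async/close! out))))
    out))
```

Callers can then `<!` the channel inside a go block instead of blocking on `.get`.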