#core-async
2018-11-16
kirill.salykin15:11:54

Hi all, I'm trying to implement a web scraper

(System/setProperty "clojure.core.async.pool-size" "1024")

  (def to-fetch-url (async/chan 1000))
  (def to-fetch-attributes (async/chan 1000))
  (def to-persist (async/chan 1000))

  (defn fetch-url [loc out]
    (when-let [url (url loc)]
      (async/>!! out (assoc loc :url url))))

  (defn fetch-attributes [loc out]
    (async/>!! out (attributes loc)))

  (async/pipeline-async 500 to-fetch-attributes fetch-url to-fetch-url false)
  (async/pipeline-async 500 to-persist fetch-attributes to-fetch-attributes false)
Both fetch-url and fetch-attributes do the web scraping. The problem: it invokes fetch-attributes only once. Please advise, what is wrong?

kirill.salykin15:11:20

but if I do:

(async/go-loop []
    (async/<! to-fetch-url)
    (async/<! to-fetch-attributes)
    (async/<! to-persist)
    (recur))
everything works, which seems weird…

kirill.salykin15:11:29

it turns out that the out chan passed to fetch-url and fetch-attributes should be closed after the result is put on it
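
For readers following along, here is a minimal sketch of what closing the out chan might look like. It assumes the scraping call can be replaced by a hypothetical stand-in, `fake-attributes` (the real `attributes` function is not shown in the log), and it uses deliberately small buffer and parallelism numbers. `pipeline-async` expects the async function to return immediately and to close the result channel once its value(s) have been delivered.

```clojure
(require '[clojure.core.async :as async])

;; Hypothetical stand-in for the real scraping call, which the log does not show.
(defn fake-attributes [loc]
  (assoc loc :attributes {:title (str "title for " (:url loc))}))

;; The async function returns right away; the work happens on another thread,
;; and `out` is always closed afterwards so the pipeline can move on.
(defn fetch-attributes [loc out]
  (async/thread
    (try
      (async/>!! out (fake-attributes loc))
      (finally
        (async/close! out)))))

(def to-fetch-attributes (async/chan 4))
(def to-persist (async/chan 4))

;; Default close? is true, so to-persist closes once the input is exhausted.
(async/pipeline-async 2 to-persist fetch-attributes to-fetch-attributes)

(doseq [loc [{:url "a"} {:url "b"} {:url "c"}]]
  (async/>!! to-fetch-attributes loc))
(async/close! to-fetch-attributes)

;; Collect everything that reaches the end of the pipeline.
(def results (async/<!! (async/into [] to-persist)))
```

If a function like fetch-url has nothing to emit for an input, closing `out` without putting anything should simply produce no output for that input.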

hiredman17:11:35

I would strongly advise against using pipeline-async your first go around

hiredman17:11:14

it is kind of weird and unintuitive, pipeline-blocking is much more likely to do what you want
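
A rough sketch of the pipeline-blocking variant, assuming the scraping step is an ordinary blocking function (the `fetch-attributes-blocking` stand-in below is hypothetical and not from the log). Here the step is passed as a transducer, and the pipeline itself takes care of putting results on the output channel, so there is no result channel to remember to close.

```clojure
(require '[clojure.core.async :as async])

;; Hypothetical stand-in for a blocking HTTP call; not from the original log.
(defn fetch-attributes-blocking [loc]
  (Thread/sleep 10)
  (assoc loc :attributes {:fetched true}))

(def to-fetch (async/chan 4))
(def fetched (async/chan 4))

;; The work is expressed as a transducer and run on dedicated threads.
(async/pipeline-blocking 2 fetched (map fetch-attributes-blocking) to-fetch)

(doseq [loc [{:url "a"} {:url "b"} {:url "c"}]]
  (async/>!! to-fetch loc))
(async/close! to-fetch)

;; Output order follows input order.
(def blocking-results (async/<!! (async/into [] fetched)))
```

To skip inputs that yield nothing (as fetch-url does when no URL is found), a `keep` transducer could be used in place of `map`.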

hiredman17:11:37

also early in development I would suggest using smaller numbers everywhere, because larger buffers and threadpool sizes will mask bugs (channels not being consumed in a timely manner, certain kinds of deadlocks become less likely with a larger pool size)
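
One way to see the masking effect of large buffers, as a small sketch: count how many puts a channel accepts when nothing is consuming from it. The helper name `accepted-without-consumer` is made up for illustration. It uses `offer!`, which returns true when the value is accepted and nil when the channel cannot take it immediately.

```clojure
(require '[clojure.core.async :as async])

;; Counts how many of `attempts` non-blocking puts succeed on a fresh
;; channel with the given buffer size and no consumer attached.
(defn accepted-without-consumer [buffer-size attempts]
  (let [ch (async/chan buffer-size)]
    (count (filter true? (repeatedly attempts #(async/offer! ch :item))))))

;; With a large buffer, every put succeeds and the missing consumer goes unnoticed.
(def accepted-large (accepted-without-consumer 1000 10))

;; With a buffer of 1, the second put is already refused.
(def accepted-small (accepted-without-consumer 1 10))
```

With a blocking put instead of `offer!`, the small-buffer case would stall on the second item, which is exactly the kind of early signal a small number gives.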