
Hi all, I'm trying to implement a web scraper:

(System/setProperty "clojure.core.async.pool-size" "1024")

(def to-fetch-url (async/chan 1000))
(def to-fetch-attributes (async/chan 1000))
(def to-persist (async/chan 1000))

(defn fetch-url [loc out]
  (when-let [url (url loc)]
    (async/>!! out (assoc loc :url url))))

(defn fetch-attributes [loc out]
  (async/>!! out (attributes loc)))

(async/pipeline-async 500 to-fetch-attributes fetch-url to-fetch-url false)
(async/pipeline-async 500 to-persist fetch-attributes to-fetch-attributes false)
Both fetch-url and fetch-attributes do web scraping. The problem: it invokes fetch-attributes only once. Please advise, what is wrong?


but if I do:

(async/go-loop []
  (async/<! to-fetch-url)
  (async/<! to-fetch-attributes)
  (async/<! to-persist)
  (recur))

everything works, which seems weird…


it turns out that the out chan should be closed by the async function
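A minimal sketch of that fix, using hypothetical stand-ins for the poster's `url` and `attributes` helpers: with `pipeline-async`, the async function is called with the value and a fresh result channel, and it must close that channel when done, otherwise the pipeline treats the job as still in flight and stalls. The af should also not block, so the puts go inside a `go` block instead of `>!!`:

```clojure
(require '[clojure.core.async :as async])

;; Hypothetical stand-ins for the poster's `url` and `attributes` helpers.
(defn url [loc] (str "http://example.com/" (:id loc)))
(defn attributes [loc] (assoc loc :attrs {:status 200}))

;; pipeline-async calls (af input result-chan); af MUST close result-chan
;; when finished, or the pipeline never completes that job.
(defn fetch-url [loc out]
  (async/go
    (when-let [u (url loc)]
      (async/>! out (assoc loc :url u)))
    (async/close! out)))

(defn fetch-attributes [loc out]
  (async/go
    (async/>! out (attributes loc))
    (async/close! out)))

(def to-fetch-url (async/chan 8))
(def to-fetch-attributes (async/chan 8))
(def to-persist (async/chan 8))

;; 4-arg arity: downstream channels close when upstream does.
(async/pipeline-async 4 to-fetch-attributes fetch-url to-fetch-url)
(async/pipeline-async 4 to-persist fetch-attributes to-fetch-attributes)

(async/onto-chan! to-fetch-url [{:id 1} {:id 2} {:id 3}])
(def result (async/<!! (async/into [] to-persist)))
(println result)
```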


I would strongly advise against using pipeline-async your first go-around


it is kind of weird and unintuitive; pipeline-blocking is much more likely to do what you want
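A sketch of the `pipeline-blocking` shape, again with hypothetical stand-ins for the real scraping calls: it takes a transducer, so each step is a plain function that is allowed to block (e.g. a synchronous HTTP GET), and there is no output channel to manage or close:

```clojure
(require '[clojure.core.async :as async])

;; Hypothetical stand-ins for the real blocking HTTP calls.
(defn fetch-url* [loc] (assoc loc :url (str "http://example.com/" (:id loc))))
(defn fetch-attributes* [loc] (assoc loc :attrs {:status 200}))

(def to-fetch (async/chan 8))
(def to-persist (async/chan 8))

;; pipeline-blocking takes a transducer; the step functions may block,
;; and the pipeline handles channel bookkeeping and closing itself.
(async/pipeline-blocking 4 to-persist
                         (map (comp fetch-attributes* fetch-url*))
                         to-fetch)

(async/onto-chan! to-fetch [{:id 1} {:id 2}])
(def result (async/<!! (async/into [] to-persist)))
(println result)
```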


Also, early in development I would suggest using smaller numbers everywhere, because larger buffers and thread-pool sizes will mask bugs (channels not being consumed in a timely manner; certain kinds of deadlocks become less likely with a larger pool size).