This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2015-12-10
Hi, quick question: does peer count correlate to some sort of hard limit on JVM thread pool size?
@yusup Yes, though lifecycles and input/output plugins add some variability to it, as some plugins use additional reader/buffer or sender threads
So I can't spawn more threads within functions than a certain threshold that is correlated to peer count?
Oh no, there isn't really a hard limit, aside from what the JVM/your app can handle
But spawning within functions is almost always a bad idea
I highly recommend using Flight Recorder / Mission Control rather than YourKit btw
It's built in to Java 8
What sort of task is leading you to want to spawn threads from your onyx/fn?
I think there are probably ways to do it successfully but you have to be careful
What are you using to scrape?
Here's a few things I'd keep in mind.
If you need to do any expensive upfront, reusable initialisation in your task, do it in a lifecycle.
If you're doing a lot of IO, you may want to just increase the number of peers that you use for the IO tasks, rather than spinning up lots of threads within those tasks
If the unit of work in a segment is too big and you therefore want to parallelise it in your task, consider splitting it up and sending it to downstream tasks
(Just some thoughts. I don't really have any experience with tika and they could be wrong)
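The three suggestions above could look roughly like this. It is only a sketch: the namespace, `make-client`, `fetch-page`, `split-document`, and the segment keys are all hypothetical, and the peer counts are illustrative. It assumes the usual Onyx convention that a `:lifecycle/before-task-start` function returns a map merged into the event, with `:onyx.core/params` prepended to the task function's arguments.

```clojure
(ns my.app.tasks) ; hypothetical namespace

;; Hypothetical expensive, reusable resource (e.g. a parser or HTTP client).
(defn make-client []
  {:client :placeholder})

;; 1. Do the upfront initialisation in a lifecycle, not in the task function.
;;    Assumption: the returned map is merged into the event map, and
;;    :onyx.core/params become the leading arguments of the task function.
(defn inject-client [event lifecycle]
  {:onyx.core/params [(make-client)]})

(def client-calls
  {:lifecycle/before-task-start inject-client})

(def lifecycles
  [{:lifecycle/task :fetch-page
    :lifecycle/calls :my.app.tasks/client-calls}])

;; Task function: receives the injected client first, then the segment.
(defn fetch-page [client segment]
  (assoc segment :fetched? true))

;; 2. Scale IO by giving the task more peers rather than more threads.
(def fetch-page-entry
  {:onyx/name :fetch-page
   :onyx/fn :my.app.tasks/fetch-page
   :onyx/type :function
   :onyx/min-peers 4 ; illustrative value
   :onyx/max-peers 4 ; illustrative value
   :onyx/batch-size 20})

;; 3. Split a large unit of work into smaller segments for downstream tasks.
;;    An Onyx function may return a sequence of segments instead of one.
(defn split-document [segment]
  (mapv (fn [page] {:doc-id (:doc-id segment) :page page})
        (:pages segment)))
```

With this shape, `(split-document {:doc-id 1 :pages ["a" "b"]})` yields one segment per page, so the per-page work can be spread across the peers of a downstream task.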
Yeah it really depends on how much work is being done in the task
Be careful with max-pending / pending-timeout on your input tasks. If the amount of work derived from an input segment is large, it may not finish within pending-timeout and you can get retries. Just something to monitor
What's your input source?
K. Max pending is your main backpressure knob there
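For reference, both knobs live on the input task's catalog entry. A sketch only: the input source isn't stated here, so the core.async plugin is just a stand-in, `:read-urls` is a hypothetical task name, and the numbers are illustrative rather than recommended.

```clojure
;; Catalog entry for a hypothetical input task.
{:onyx/name :read-urls
 :onyx/plugin :onyx.plugin.core-async/input ; stand-in input plugin
 :onyx/type :input
 :onyx/medium :core.async
 ;; Maximum number of un-acked input segments in flight at once.
 ;; Lowering it applies more backpressure.
 :onyx/max-pending 500
 ;; Milliseconds before an un-acked segment is retried.
 ;; Raise it if one input segment fans out into a lot of work.
 :onyx/pending-timeout 120000
 :onyx/batch-size 20
 :onyx/max-peers 1
 :onyx/doc "Reads URLs to process from a core.async channel"}
```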
Catch ya