This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2016-12-22
Channels
- # adventofcode (1)
- # beginners (172)
- # boot (47)
- # cider (7)
- # cljs-dev (30)
- # cljsrn (43)
- # clojure (180)
- # clojure-dusseldorf (1)
- # clojure-greece (1)
- # clojure-italy (3)
- # clojure-russia (41)
- # clojure-spec (67)
- # clojure-uk (101)
- # clojurescript (128)
- # core-async (4)
- # cursive (13)
- # datomic (29)
- # devcards (5)
- # emacs (19)
- # events (1)
- # hoplon (38)
- # lein-figwheel (1)
- # luminus (8)
- # midje (1)
- # off-topic (47)
- # om (10)
- # onyx (23)
- # protorepl (1)
- # re-frame (11)
- # reagent (7)
- # ring (3)
- # ring-swagger (9)
- # rum (6)
- # sql (5)
- # untangled (4)
okay, thank you
I'm having issues getting my job to start. Right now, I'm seeing things like "Not enough virtual peers have warmed up to start the task yet, backing off and trying again..."
But it just spits out those messages repeatedly and doesn't seem to be doing any sort of backing off. Is there some setting I need to tweak?
I have 8 tasks, so I've been trying 8+ peers
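(For context: Onyx won't start a job until there's at least one virtual peer available per task, so 8 tasks need 8+ peers. A minimal sketch of starting them, assuming Onyx's public API and a `peer-config` map defined elsewhere:)

```clojure
(require '[onyx.api])

;; Assumes `peer-config` is an Onyx peer-config map defined elsewhere.
(def peer-group (onyx.api/start-peer-group peer-config))

;; Start 8 virtual peers -- at least one per task in the job.
(def v-peers (onyx.api/start-peers 8 peer-group))
```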
it starts just fine if I swap out the kafka plugin reader stuff with core-async
but right now I'm curious as to why the logging isn't actually backing off
got it
Any ideas why the log messages aren't actually backing off and retrying?
@stephenmhopper any better?
@stephenmhopper hmm, that’s weird if it spits out those messages repeatedly. It should only happen if a log message is being applied to the cluster coordination log
@stephenmhopper that implies that something might be changing in your cluster a lot (if it’s just peers starting up, that’s fine)
Yeah, I'm not sure exactly what the issue was. I updated `:onyx.peer/job-not-ready-back-off` to be less aggressive. Right now, I'm doing development in a REPL, but ZK, Kafka, and BookKeeper are all running locally in Docker. It's possible that Docker was running out of memory (I had only allocated 2GB of RAM for all three to share). I updated it to 3GB and everyone seems to be happy now. But I also killed the containers entirely and recreated them. It's also possible that the BookKeeper data was somehow corrupted, as the new container couldn't start up until I removed the old BookKeeper mounted volume
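(A hedged sketch of the setting being discussed: `:onyx.peer/job-not-ready-back-off` is an Onyx peer-config key giving the wait, in milliseconds, before a peer re-checks whether its job's tasks are ready. The map below is a partial peer-config fragment; the value and the ZooKeeper address are illustrative, not defaults:)

```clojure
{:zookeeper/address "127.0.0.1:2181"
 ;; Wait 2000 ms between "not enough virtual peers" retries
 ;; (illustrative value -- tune to taste).
 :onyx.peer/job-not-ready-back-off 2000}
```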
Yeah, I could see you having some join churning going on if you were using a lot of peers
@stephenmhopper That message is typically emitted when a peer is beginning a task but can’t make initial connections, so it’s retrying.