This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2016-06-30
Is there anything we can do to get a better error message when we submit a job and the cluster/test environment does not have enough peers?
@dspiteself: There’s this https://github.com/onyx-platform/onyx/blob/0.9.x/src/onyx/test_helper.clj#L40
For the with-test-env macro, if that’s what you’re talking about
awesome
could I push for that to happen by default?
we have probably lost 4-8 hours of total development time as each developer runs into that issue
@dspiteself: unfortunately there’s not a good solution other than that sort of check, because the cluster is dynamic and not having enough peers can be a normal condition if you have lots of jobs
how about a default on with-test-env ?
or a warning
I could see a wrapper for submit-job that uses with-test-env, or does the submit as part of with-test-env, but those are more limited. Making it a warning could be bad under certain conditions: I could see cases where thousands of messages get printed every time a new cluster event happens (e.g. do you want to print it every time a peer joins, leaves, etc.?)
with-dummy-test-env
I will think about it some more. We get blocked on it because there isn’t really a great solution.
it would be more like wrapping submit-job
to be submit-test-job
I think that’s something that would be pretty simple to do in your own codebase if needed.
we are going to wrap submit-job in ours. I am just worried about the beginner experience of Onyx for the next learners.
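A minimal sketch of the submit-test-job idea from the messages above. It assumes each task needs `:onyx/n-peers` peers if that key is set, otherwise `:onyx/min-peers`, otherwise one. The names `min-peers-required` and `submit-test-job` and the `submit-fn` parameter are hypothetical. This is not the validation function linked earlier, and a real wrapper would pass in whatever function actually submits the job.

```clojure
(defn min-peers-required
  "Sums the peers each task in a catalog needs before the job can start.
  Assumption: :onyx/n-peers wins if present, then :onyx/min-peers, else 1."
  [catalog]
  (reduce + (map (fn [task]
                   (or (:onyx/n-peers task)
                       (:onyx/min-peers task)
                       1))
                 catalog)))

(defn submit-test-job
  "Hypothetical wrapper: throws a descriptive error instead of submitting
  when n-peers is below what the job's catalog requires.
  submit-fn stands in for the real submit function."
  [submit-fn n-peers job]
  (let [required (min-peers-required (:catalog job))]
    (when (< n-peers required)
      (throw (ex-info (str "Job needs at least " required
                           " peers, but only " n-peers " are running.")
                      {:required required :available n-peers})))
    (submit-fn job)))
```

Because the check runs in the test code and not in the scheduler, it fails fast in tests while leaving the cluster free to treat "not enough peers yet" as a normal condition.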
Onyx is designed to intentionally handle both under and over allocated clusters. Every other platform has some notion of process slots, there's nothing terribly different going on here.
It worries me too, but we’ve not found a good user friendly way to inform users since it’s the expected behaviour. @michaeldrogalis what about a submit-job version that will auto-kill if the log entry plays and it can’t allocate it? We could print a message at that point. Of course, submit-job would have just returned :success?
so that’s not ideal either
I understand. I read that when I hit the problem after 2-3 hours of fiddling, but now another developer of ours went through the same thing.
and Storm as of 0.9.X did give error messages for that case
I respect that stance
The onyx peers do give error messages though right? They claim not enough peers to start the job.
@michaeldrogalis: I guess I could see printing a warning when the submit-job entry is played, and only then.
I am just making noise about felt pain
@gardnervickers: that’s only printed when the job has been started but all the peers aren’t warmed up
Our architecture is substantially different. This sounds a bit awful, but now that you've hit that problem, you're unlikely to make that mistake again. Losing a couple of hours and learning about how the scheduler works is preferable to doing something hacky in the architecture.
ah gotcha
@dspiteself: I feel it since it comes up enough, which is why we added that validation function
@dspiteself: Definitely. I understand.
Serious question though -- does my previous paragraph sound that terrible?
@michaeldrogalis: after thinking about it some more, I think we could output some log entries for this properly.
The “Our architecture” one?
if I had seen "waiting for node to become available" I would have known what to do
I'm alright with someone stumbling for a few hours if it's only going to happen once.
It happens many times tho, and people do forget
I’ve forgotten that it could be the reason
and fumble around and then figure it out… which is why I added that validation function
it was just once but once for several people on the team
maybe a validation function is not necessary if the node allocation logs were clear enough.
What I’m thinking is that we output something to timbre under the following conditions:
1. When the submit-job entry is played, but the job isn’t scheduled, output a warning.
2. When a job is newly scheduled, output an info
that would have been perfect for us
It would have been a bit cumbersome before when every peer would’ve printed it, but we have a peer-group now.
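The two logging conditions proposed above can be sketched as a pure decision function. This is only an illustration of the rule, not Onyx's implementation: the function name, its arguments, and the message strings are all hypothetical, and it returns a level/message pair where real code would call timbre.

```clojure
(defn scheduling-log-event
  "Decides what, if anything, to log after a log entry is played.
  entry-type      - keyword naming the entry just played (assumed :submit-job for submissions)
  was-scheduled?  - whether the job was scheduled before the entry was played
  now-scheduled?  - whether the job is scheduled after the entry was played
  Returns [level message], or nil when nothing should be logged."
  [entry-type was-scheduled? now-scheduled?]
  (cond
    ;; Condition 1: the submit-job entry played, but the job is not scheduled.
    (and (= entry-type :submit-job) (not now-scheduled?))
    [:warn "Job submitted but not scheduled; waiting for enough peers."]

    ;; Condition 2: the job has just become scheduled.
    (and (not was-scheduled?) now-scheduled?)
    [:info "Job is now scheduled."]

    ;; Any other cluster event (peer joins, leaves, ...) logs nothing.
    :else nil))
```

Tying the warning to the submit-job entry alone is what keeps it from repeating on every peer join or leave.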
I feel like a substantial number of people don't even realize Onyx writes to a log file though, which is a problem all on its own
This is true.
I’m not suggesting we try to solve that one 😛
Can someone make an issue for this so I can think about it later? Please keep discussing -- I gotta run though
(Or reopen the issue I linked to)
agreed we were looking at our logs
@dspiteself: For what it's worth, I use this in all my tests so I never have to think about adjusting the peer count until I go to production: https://github.com/onyx-platform/learn-onyx/blob/master/src/workshop/workshop_utils.clj#L28
hey all! What happens if a task function returns nil as a segment? Is it then passed as a segment further or is it skipped?
@asolovyov: Undefined. Been meaning to raise an exception for that. Functions need to either return a map or a sequence of maps
so if I need to skip, I have to return something and then stop it with flow condition?
@asolovyov: you can return an empty sequence of maps
Part of me would like to see nil treated as an empty sequence, since map/empty?/etc all do. I’m of two minds about it though
@dspiteself: you mean just []? 🙂
@lucasbradstreet: I would say I'm also not completely sure. On one hand it seems really confusing, on the other it really complements the ability to return lists
Yeah, there’s an implicit mapcat when you return a vector of segments. So returning [] means you’re just dropping all the output
@dspiteself: thanks! that's a nice hack 🙂
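The "implicit mapcat" described above can be modelled in plain Clojure. This is a sketch of the semantics as stated in the chat (a task function returns either a map or a sequence of maps), not Onyx's actual code, and `apply-task`, `keep-even` and `duplicate` are hypothetical names. It says nothing about `nil`, which the chat calls undefined.

```clojure
(defn apply-task
  "Models how a task function's return values become output segments:
  a single map counts as one segment, a sequence as zero or more."
  [f segments]
  (mapcat (fn [segment]
            (let [result (f segment)]
              (if (map? result) [result] result)))
          segments))

(defn keep-even
  "Passes even segments through and drops odd ones by returning []."
  [segment]
  (if (even? (:n segment)) segment []))

(defn duplicate
  "Emits every segment twice by returning a vector of two segments."
  [segment]
  [segment segment])

(apply-task keep-even [{:n 1} {:n 2} {:n 3} {:n 4}])
;; => ({:n 2} {:n 4})

(apply-task duplicate [{:n 1}])
;; => ({:n 1} {:n 1})
```

Returning `[]` contributes nothing to the concatenated output, which is why it drops the segment without needing a flow condition.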