This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2017-03-10
Channels
- # aleph (1)
- # aws-lambda (1)
- # beginners (80)
- # boot (20)
- # cider (75)
- # cljs-dev (45)
- # cljsjs (1)
- # cljsrn (11)
- # clojure (428)
- # clojure-dusseldorf (13)
- # clojure-italy (4)
- # clojure-russia (153)
- # clojure-spec (47)
- # clojure-taiwan (1)
- # clojure-uk (62)
- # clojurescript (84)
- # cursive (19)
- # datascript (96)
- # datomic (75)
- # dirac (9)
- # docs (3)
- # emacs (19)
- # jobs (5)
- # jobs-discuss (20)
- # jobs-rus (17)
- # lein-figwheel (5)
- # leiningen (1)
- # liberator (4)
- # luminus (12)
- # off-topic (4)
- # om (31)
- # onyx (102)
- # pamela (1)
- # parinfer (3)
- # pedestal (3)
- # proton (1)
- # protorepl (14)
- # re-frame (54)
- # reagent (22)
- # rum (40)
- # spacemacs (2)
- # specter (8)
- # test-check (5)
- # unrepl (110)
- # untangled (80)
- # vim (3)
- # yada (46)
Are any onyx users spinning up Onyx jobs remotely in response to user requests?
I’m curious how one would design a system to handle what was previously an operational overhead as part of a more functional workflow.
Unrelated to the above question... Posting this in case the core team wants to add onyx to this open source clojure list: http://open-source.braveclojure.com/
I want to build up state on a peer by putting :onyx/group-by-key on one task, and have another task downstream use that state, but only once all of the segments have passed through the first task. The problem I'm running into is: how do I pass that built-up state from the first task to the second? And is there any way to make sure the two tasks run on the same peer?
@atroche When you say build up state, I’m guessing you mean in the JVM and want it to be shared between tasks?
E.g. if I have segments like this: {:property :x :value "asdf"} {:property :y :value "qwer"} And I only want to output segments which have a :property which is featured in more than 50% of all of the segments…
I assume the state will be rather large which is why you want to avoid messaging it downstream?
@lucasbradstreet I’m not averse to messaging it downstream
OK, if you build up state using Onyx’s windowing features, you can emit it downstream via http://www.onyxplatform.org/docs/cheat-sheet/latest/#trigger-entry/:trigger/emit
then you won’t need to worry about whether the downstream task is on the same node
That’s right. Or you can progressively emit the state, assuming it can be split up.
The trigger/emit doc is wrong btw, it should say "A fully qualified, namespaced keyword pointing to a function on the classpath at runtime. This function takes 5 arguments: the event map, the window map that this trigger is defined on, the trigger map, a state-event map, and the window state as an immutable value. It must return a segment, or vector of segments, which will flow downstream."
I’m about to push a new release
with that fix in it
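Putting the pieces together, a window plus a trigger using :trigger/emit might look like the sketch below. All the names here (:collect-props, ::emit-window-state) are placeholders I made up, not anything from this thread — check your Onyx version’s cheat sheet for the exact keys.

```clojure
;; Sketch: accumulate window state on one task and emit it downstream
;; when the trigger fires. Task and fn names are illustrative only.
(def windows
  [{:window/id :property-counts
    :window/task :collect-props
    :window/type :global
    :window/aggregation :onyx.windowing.aggregation/conj}])

(def triggers
  [{:trigger/window-id :property-counts
    :trigger/id :emit-state
    :trigger/on :onyx.triggers/segment
    :trigger/threshold [1 :elements]
    ;; :trigger/emit points at the 5-arity fn described above
    :trigger/emit ::emit-window-state}])

(defn emit-window-state
  "Takes the event map, the window map, the trigger map, a state-event
   map, and the window state as an immutable value. Returns a segment
   (or vector of segments) to flow downstream."
  [event window trigger state-event state]
  {:window-state state})
```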
You mean you want to use that state as part of what is used in a predicate in another task?
That’ll be a bit harder to achieve, though it’s possible. I’m not sure it’ll be a good idea. I’d consider combining it with the windowed/triggered task that builds up the state, and build an aggregation/trigger that both builds up the state and emits segments. If you make it emit segments with keys that are looked at by the predicate, you could then use flow conditions to decide which segments are routed where.
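As a rough sketch of that routing idea (again, every name here is a placeholder): the emitting task attaches a key, and a flow condition predicate looks at it.

```clojure
;; Flow-condition sketch: route segments based on a key the
;; windowed/triggered task attached to each emitted segment.
(def flow-conditions
  [{:flow/from :collect-props
    :flow/to [:downstream-task]
    :flow/predicate ::popular-property?}])

(defn popular-property?
  "Flow predicates take [event old-segment new-segment all-new] and
   return true to route the new segment to the :flow/to tasks."
  [event old-segment segment all-new]
  (true? (:popular? segment)))
```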
but how can I emit a segment with the aggregate, when not all the segments have come through yet?
You could try using a punctuation trigger. Punctuation triggers are called with a state event (containing the segment) http://www.onyxplatform.org/docs/cheat-sheet/latest/#/state-event and a predicate. Now that Onyx has in-order processing, you could put a punctuation segment into your stream and ensure that the trigger only fires after that segment is seen.
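Roughly like this — I’m hedging on the exact predicate arity, so treat it as a sketch and verify against the cheat sheet for your version; the :marker sentinel and fn names are made up:

```clojure
;; Punctuation-trigger sketch: only fire once a sentinel marker
;; segment is seen in the (in-order) stream.
(def triggers
  [{:trigger/window-id :property-counts
    :trigger/id :fire-on-marker
    :trigger/on :onyx.triggers/punctuation
    :trigger/pred ::marker-seen?
    :trigger/emit ::emit-window-state}])

(defn marker-seen?
  "Punctuation predicate; the state-event carries the current segment.
   Fires the trigger when the sentinel marker segment arrives."
  [trigger {:keys [segment] :as state-event}]
  (= :marker (:type segment)))
```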
Those are my general thoughts given what you’ve told me anyway.
I’m off to sleep, so can’t really help much more. Hopefully that helps.
by the way, I’m trying to replicate my Apache Beam / Google Dataflow workflow in Onyx, and to do this kind of aggregation I’m using “side inputs”, which let you use data from another branch of your pipeline when processing segments: https://beam.apache.org/documentation/programming-guide/#transforms-sideio
We don’t support anything like that, but that’s interesting. I’d have to have a think about how we could achieve that in a sensible way.
Hey all! I'm hitting a pretty severe problem (it almost kills my whole Onyx cluster), manifested by this error: Aeron messaging publication error: io.aeron.exceptions.ConductorServiceTimeoutException: Timeout between service calls over 5000000000ns It happens hundreds of times a second (it also kills my Sentry, which is my exception gatherer 🙂). Any thoughts on what's going wrong?
There’s a 5s timeout on an Aeron publisher to detect if a client is dead. Have you verified that your peers and media drivers can connect to each other?
What does your cluster setup look like?
@asolovyov are you on 0.9.x?
@gardnervickers hmmm... well, I have zookeeper, and 4 nodes
@asolovyov we’ve improved the way that it recovers from aeron timeouts. It’s likely due to a combination of our bad connection handling in 0.9 (logging too frequently, not bringing connections back up well enough), combined with timeouts caused by GCs (or something similar)
that would be my suggestion
what's the news on the 0.10 release? 🙂 I'm looking through cheat sheet from time to time and those sweet deprecations are exciting 🙂
you want aeron.client.liveness.timeout
btw, I'm pretty silent except for the problems, but we're using Onyx quite a lot for various tasks right now and it's working really great for us 🙂
and possibly one or more timeout, I’d have to look
@asolovyov that’s great to hear 🙂
@asolovyov Sometimes we release a detrimental bug to smoke out the silent users 😉
Kidding 🙂
@michaeldrogalis well I hope it's a joke, because my cluster died twice today! 🙂
That’s awesome to hear. Glad it’s working well. We’re pausing a bit on 0.10 development while we beat it up with Pyroclast. Fixing problems as we go.
I mean I could live, but it sends emails to users right now and they are pretty unhappy 🙂
it’s in a pretty stable state thankfully
in many ways (like this issue), I trust it more than 0.9.x, but I’m not ready to call it production ready yet.
Right. There’s not much left in the way of feature development for 0.10. Just a matter of giving it more usage hours before final release.
@asolovyov are you using the G1GC gc?
that can help get rid of long GCs, which could cause timeouts
again, it’s mostly Onyx’s fault, but reducing timeouts can help too
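For context, switching the peer JVMs to G1 is just a couple of JVM flags, e.g. in the peer launcher’s project.clj (a sketch, not a pause target tuned for any particular workload):

```clojure
;; project.clj fragment: enable G1GC and cap pause times for the peer
;; JVMs, to reduce the long GC pauses that can trip Aeron's timeouts.
:jvm-opts ["-XX:+UseG1GC"
           "-XX:MaxGCPauseMillis=50"]
```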
I guess so! I see, we had no real problems with Onyx performance and stability before that, so our JVMs have almost vanilla config 🙂
@asolovyov Most Aeron configuration goes through JVM parameters with -D
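So raising the liveness timeout mentioned above would look something like this (sketch; Aeron takes the value in nanoseconds, and its default of 5000000000 matches the 5s in the error):

```clojure
;; project.clj fragment: raise Aeron's client liveness timeout from
;; the 5s default to 10s via a -D JVM parameter.
:jvm-opts ["-Daeron.client.liveness.timeout=10000000000"]
```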
Heh, beat me.
Hi guys, I have a question about flow-conditions: what happens when my predicate returns false? Is the job done, or do I have to specify the false flow to end the job? I'm asking because in (with-test-env) the test doesn't end. Is that the expected behavior?
@lellis flow conditions won’t affect whether the job ends or not. That is achieved via punctuation on your input sources (i.e. the done message)
flow conditions will only help you decide whether to flow segments downstream (or retry).
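With the core.async input plugin, for example, that punctuation is the :done sentinel placed on the input channel (sketch; channel name and segments are made up):

```clojure
;; Ending a job via input punctuation: with a core.async input task,
;; put the sentinel :done onto the input channel and the job can
;; complete once it has been processed.
(require '[clojure.core.async :as a])

(def input-chan (a/chan 100))

(a/>!! input-chan {:property :x :value "asdf"})
(a/>!! input-chan :done)   ; the "done" punctuation ends the input
(a/close! input-chan)
```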
@lucasbradstreet So in a :onyx.plugin.datomic/read-log, how can I end my test? Any idea?
@lellis you can either wait for some condition (e.g. some output to appear on the output task) and then just end. Or you can find out what the final basis-t is before you submit the job and supply ` :datomic/log-end-tx <<OPTIONAL_TX_END_INDEX>> `
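That second option might look roughly like this — a catalog-entry sketch, with the Datomic URI, basis-t lookup, and batch size all illustrative; check the onyx-datomic README for the exact keys in your version:

```clojure
;; Sketch: bound a read-log input task so it (and a test) can finish,
;; by fixing the end tx before submitting the job.
(require '[datomic.api :as d])

(def conn (d/connect "datomic:mem://example"))

;; Find the final basis-t before submitting the job
(def end-t (d/basis-t (d/db conn)))

(def read-log-task
  {:onyx/name :read-log
   :onyx/plugin :onyx.plugin.datomic/read-log
   :onyx/type :input
   :onyx/medium :datomic
   :datomic/uri "datomic:mem://example"
   :datomic/log-end-tx end-t
   :onyx/max-peers 1
   :onyx/batch-size 20})
```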
wow, got a new one (not as severe as previous one though):
Aeron write from buffer error: java.lang.IllegalArgumentException: Encoded message exceeds maxMessageLength of 2097152, length=2648406
@asolovyov How large are the messages that you’re sending through Onyx?
Aeron caps max message size to a configurable value - one of your messages went over that.
@michaeldrogalis can you point me at config option please? I looked through aeron config and didn't find anything.
See Term Buffer Max Length in the doc linked above. ^
Er, whoops
One sec.
aeron.term.buffer.length / 8
is the max message size
Thanks ^
by default term.buffer.length is 16MB, for a max message size of 2MB
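So to fit the failing 2,648,406-byte message, the sizing works out like this (sketch; Aeron term buffer lengths must be a power of two):

```clojure
;; Max message size is term buffer length / 8. The default 16MB term
;; buffer gives 2MB (2097152) max messages, which the 2,648,406-byte
;; message exceeded. Doubling the term buffer to 32MB via
;; -Daeron.term.buffer.length gives a 4MB max message size.
(def term-buffer-length (* 32 1024 1024))
(def max-message-length (quot term-buffer-length 8)) ; 4194304 bytes
```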