This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2016-08-12
Channels
- # bangalore-clj (4)
- # beginners (40)
- # boot (53)
- # cider (34)
- # cljs-dev (9)
- # cljsrn (11)
- # clojure (113)
- # clojure-boston (5)
- # clojure-dev (9)
- # clojure-dusseldorf (4)
- # clojure-russia (8)
- # clojure-spec (11)
- # clojure-uk (12)
- # clojurescript (88)
- # cloverage (17)
- # conf-proposals (4)
- # core-async (30)
- # cursive (9)
- # datomic (107)
- # euroclojure (5)
- # hoplon (196)
- # luminus (10)
- # off-topic (20)
- # om (24)
- # om-next (1)
- # onyx (80)
- # parinfer (3)
- # pedestal (16)
- # planck (44)
- # proton (8)
- # protorepl (1)
- # re-frame (19)
- # reagent (7)
- # spacemacs (2)
- # untangled (29)
- # yada (25)
@aengelberg: Yes to 2nd question, uniqueness key is still required.
So windows might count records twice if I don't include the uniqueness key?
First question - @lucasbradstreet will want to confirm since he authored most of it, pretty sure trigger synchronization happens after a segment is processed, but before it is ack'ed.
So if I input the segment {:x 1}
and the window is summing all the x's, my trigger may get called with the value 2? That didn't sound like what Onyx was guaranteeing on the state and accumulation page.
@aengelberg: If you have a uniqueness key, and you're summing x's, you'll only add 1 no matter how many times you see that segment
A unique key needs to be present in order to make that guarantee though.
If you don't have a key, you might consider using a consistent hash to roll your own if you know every value in your domain is unique - https://github.com/replikativ/hasch
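To make that concrete, here is a minimal sketch of where the uniqueness key goes. The task name, plugin, and :message-id key are all placeholders; the point is that :onyx/uniqueness-key lives on the input task's catalog entry and names the key Onyx dedupes on:

```clojure
;; Hypothetical input catalog entry. :my/read-task and :message-id are
;; placeholder names; :onyx/uniqueness-key tells Onyx which key in each
;; segment identifies it uniquely, so replayed segments only count once.
{:onyx/name :my/read-task
 :onyx/plugin :onyx.plugin.kafka/read-messages
 :onyx/type :input
 :onyx/medium :kafka
 :onyx/uniqueness-key :message-id
 :onyx/batch-size 100}
```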
I get that the uniqueness key would only count multiple records of the same key as one record according to the window. I'm just trying to figure out if, without the uniqueness key, there are scenarios where a trigger gets called with a value not equal to what it's supposed to be (all x's read in summed together).
What do you mean by "what it's supposed to be"? Under what circumstances?
You get no guarantees about idempotent state updates without telling Onyx what the uniqueness key is - which includes both the windowing value and the trigger criteria.
You can still use those features if you're okay with that, but it's been designed to use the key in general.
@aengelberg: yes to first question
btw in http://www.onyxplatform.org/docs/user-guide/latest/monitoring.html, the example is wrong: it should be (def v-peers (onyx.api/start-peer-group 3 peer-group monitoring-config)) instead of start-peers 🙂
Hmm that's no good. Thank you!
Just before an Aeron conductorTimeout times out, the latency of :zookeeper-force-write-chunk 8250 increases; it happens regularly after transacting approx. 200,000 records (using the out-of-the-box onyx/sql as input and onyx/datomic as output). JVM heap size is 3 GB and looks fine in jvisualvm (using zookeeper/server = true). Any ideas what this could be an indicator of?
I haven't used jvisualvm because Java Mission Control is better (run via jmc on the command line). Does jvisualvm show you how long GCs are taking?
JMC can do this
Will try with jmc 🙂
Cool, let me know if you need any tips. FYI, Flight Recorder has a great feature where it can write X mins of history to a ring buffer, which doesn't hurt performance much, so it can be used in prod (assuming you pay Oracle for the privilege :P)
Which options do you normally use ?
For this sort of thing I'd just start up jmc and attach to it while it runs. Alternatively you can add these lines to your java-opts https://github.com/onyx-platform/onyx-benchmark/blob/master/project.clj#L34
then open up localrecording.jfr from mission control after a run / killing the JVM
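For reference, flags along these lines enable a continuous Flight Recorder ring buffer on an Oracle JDK 8 JVM (the jar name is a placeholder; exact options may differ from the linked project.clj):

```shell
# Hypothetical JVM invocation: on Oracle JDK 8, commercial features must be
# unlocked before Flight Recorder can record. dumponexit=true writes the
# recording out when the JVM stops, so it can be opened in Mission Control.
java -XX:+UnlockCommercialFeatures \
     -XX:+FlightRecorder \
     -XX:StartFlightRecording=disk=true,dumponexit=true,filename=localrecording.jfr \
     -jar my-onyx-app.jar
```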
@zamaterian: that monitoring doc looks good to me. Can you clarify? start-peer-group takes a monitoring config as the last parameter
hey @lucasbradstreet -- forgive the Off-topic, have you guys done any web load-testing before?
Not really, though it's something I could get into
@lucasbradstreet: it states "start-peers" instead of start-peer-group 🙂
;; Pass the monitoring config as a third parameter to the start-peers
function.
(def v-peers (onyx.api/start-peers 3 peer-group monitoring-config))
Oh. Apologies. We must have fixed it because I looked at the doc in the develop branch and it was correct
Thanks for the report :)
no problem, nice to help out 🙂
lucasbradstreet: The latency increase is probably because of GC pauses (longest pause is 4 s in ParallelOld). Thx for the pointer, 🍺 on me.. again.
@michaeldrogalis: just to follow up from the other day. I did get it working. My main issue was not starting all of the stateful bits using the lifecycles, so I had data coming in but never writing out. Once I added that, it output to Elasticsearch just fine. 🙂
So we are having an issue where our resulting data appears to be incorrect. It looks like we lost data: our out data does not match the number coming from the in. To be more clear, we are reading data from Kafka and running it through some tasks, ending with a window that is doing grouping so we can collapse like records into one and write them into Elastic. We are using a session window with a watermark trigger. We keep a count in the collapse so we can see how many records it corresponds to.
To figure out what is going on, we attached another window to the :out task to count total packets flowing through, and those numbers sometimes appear to be more and sometimes less. lol. Any way to diagnose what may be happening? I know this is probably tough to answer without seeing our stuff.
@camechis: if it weren't for you seeing "sometimes less", I would suggest that you may be seeing retried segments
@lucasbradstreet: I lied, we are seeing more on the :out channel not less. Our Collapse is always short. I do think we are getting retries. I am seeing a large spike at the beginning of the run and then it kind of goes away
OK, :onyx/uniqueness-key
can help with that if you're not using it already. It'll help with deduping on the window
The second thing I would try is decreasing :onyx/pending-size
and/or increasing :onyx/pending-timeout
Ah, well done. It won't help with anything not on a window (e.g. possibly the out channel), but it sounds like you're having trouble with the window too
One other thing to look at: when you use group-by-key/fn, it'll implicitly create a window for each key. If you're firing the trigger to write to a static key, you may be overwriting the value over and over again
e.g. say you have keys A, B, C. It'll have windows for each of A, B, C, and the trigger will fire for each
if you're writing to a single key each time it's fired for A, B, and C, you'll only end up with the value for C
Right, so when you write out, are you using the value of :group-key in the state-event that is supplied to your trigger in any way?
What I'm worried about is if you have grouping keys A, B, and C
your trigger could be called once for each A, B and C
A will succeed, but what you write to elasticsearch might be overwritten by the next fire for B, and B by the next fire for C
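One way to avoid that overwrite is to fold the group key into the document id your trigger writes to. A sketch, assuming the five-argument trigger sync signature of the Onyx 0.9-era API; es/put-document! and the id scheme are placeholders, not a real client call:

```clojure
;; Hypothetical trigger sync fn. The state-event carries the :group-key for
;; the window extent that fired, so each group gets its own ES document id
;; instead of every group clobbering one static key.
(defn sync-to-es! [event window trigger state-event extent-state]
  (let [group-key (:group-key state-event)]
    ;; es/put-document! is a placeholder for your Elasticsearch client call
    (es/put-document! (str "counts-" group-key)
                      {:group group-key
                       :value extent-state})))
```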
i think each write to elastic is unique. Essentially a completely new document on each trigger fire
Ah, ok, but are you adding up all of the counts for all of the group keys?
from all of the documents that you wrote to?
OK, sounds like you have things fairly right then
i will say at one point we used a fixed window with a timer trigger. Things seemed much slower but our data looked much better
It's certainly possible that there is a problem with the way that you're using session windows
try a global window
that should absolutely match up with your segment count
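For reference, a global window that counts segments might look like this; the window id and task name are placeholders, but :onyx.windowing.aggregation/count is a built-in aggregation, and since a global window has a single extent, its count should match the total number of segments the task saw:

```clojure
;; Sketch of a global counting window attached to a hypothetical :out task.
;; Every segment lands in the one global extent, so the aggregated count
;; gives a baseline to compare session-window output against.
{:window/id :out-segment-count
 :window/task :out
 :window/type :global
 :window/aggregation :onyx.windowing.aggregation/count}
```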
Yeah, from going through this discussion it does sound like you have a handle on the way the triggers are working
global was actually kind of my first instinct but session windows also had some nice things that sound like what we wanted. lol
Absolutely - global may just help you nail down whether the issue is due to how you build your windows
I think I have all the information I need about window guarantees, thanks @michaeldrogalis (and @camechis for confirming that just now with an anecdote 🙂 )
so still struggling with the windowing, but we are seeing big spikes off and on in retries. What were the parameters we should try to adjust? I know you mentioned onyx/pending-size but not sure where that goes?
@camechis: http://www.onyxplatform.org/docs/cheat-sheet/latest/#catalog-entry/:onyx/max-pending
Set it on the catalog task map of the input task you want to alter.
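Putting that together, a sketch of an input catalog entry with the retry knobs set; the task name, plugin, and values are illustrative, not recommendations:

```clojure
;; Hypothetical Kafka input entry. Lowering :onyx/max-pending throttles how
;; many segments can be in flight unacked, and raising :onyx/pending-timeout
;; gives slow downstream tasks longer before a segment is retried.
{:onyx/name :read-from-kafka
 :onyx/plugin :onyx.plugin.kafka/read-messages
 :onyx/type :input
 :onyx/medium :kafka
 :onyx/max-pending 2000       ;; illustrative; smaller than the default
 :onyx/pending-timeout 120000 ;; ms before an unacked segment is retried
 :onyx/batch-size 100}
```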
What are the limitations on windows? If I try to keep one open for the span of a week, I assume I would hit some problems.
You shouldn't as long as you don't run out of disk space