Fork me on GitHub

Any news on the ignored :aggregation/init function? Or the extra event state parameters?


BTW, is there anything in Onyx that could be separated to a "distribution" lib that anyone can use to implement new abstractions, for example if I want to support externally-initialized aggregates, or implement actors model etc.?


I'm not sure how much more we could split out. It's a little tricky. The new ABS work may allow extra lifecycle hooks because we've turned the whole task into a state machine


Thanks Lucas. What's a good place to learn about how things will work in the new ABS release?


We don't have much in the way of documentation onyx wise yet


I'm trying to learn to read onyx-metrics and this looks a bit weird to me: I'm not sure why complete latency gets posted only sometimes, and why reading from kafka always has bigger throughput than sending emails, even though we don't experience any lags?


complete-latency will be posted whenever a message is acked. Are you seeing retries?


That could also explain why your send-email is lower than your read-email throughput. If messages are being lost, or similar, then you will see retries and a lower throughput at send-email


It also might be explained by having two peers on the send-email task, since you’re taking an mean of the throughput, assuming you’re not grouping by peer


i.e. you have two 2500 throughput readings per second which are being averaged to 2500 throughput overall


rather than being summed


sorry for not answering right away, your first messages got me to check what's going on and we indeed had retries 🙂


fixed that now and will see if chart changes


oh, three peers on send-email and a single one on read-email


I guess I'll have to change from mean to sum, but I'm not sure how to handle situation when grafana adds grouping by time... anyway, I'll see, thanks! 🙂


There should only be one throughout event per second, so it's safe to sum them if you care about only the aggregates


It's good to be able to split them so you know what each peer is doing though


what do you think is a sensible setting for max-pending? 🙂


@asolovyov how long is a piece of string? 🙂. It really depends on what your tasks are doing. I’ve used a max pending of 100000 for simple tasks that are processable in microseconds, whereas if you’re making synchronous calls that take 100ms maybe 100 is more appropriate


Best I can tell you is to increase it until you get retries and then back off a bit for a safety factor


I need to invent some way to experiment with tasks in production 🙂 it's really hard to emulate in development, and doing release on every small change is painful 🙂


test environment? The turn around times can still be long though.


got an error like that and not sure how to find the actual problem, any ideas?

#error {
 :cause "No matching field found: getCause for class clojure.lang.PersistentArrayMap"
 :data {:original-exception :java.lang.IllegalArgumentException}
 [{:type clojure.lang.ExceptionInfo
   :message "No matching field found: getCause for class clojure.lang.PersistentArrayMap"
   :data {:original-exception :java.lang.IllegalArgumentException}
   :at [onyx.compression.nippy$fn__13700$fn__13701 invoke "nippy.clj" 33]}]
 [[onyx.compression.nippy$fn__13700$fn__13701 invoke "nippy.clj" 33]


ah, found better one in the logs


is it possible to catch exception in flow-condition and stop doing any processing on that segment? Onyx expects that post-transform will return something sensible, but I wonder if that is possible to avoid?


@asolovyov looks like our exception serialization is failing you


because I've been returning stuff I shouldn't 🙂


so I try to post something to Sentry


but then I just want to skip this segment


flow conditions would normally be able to do it, but it might not be because the exception handling is failing internally


You’re throwing an array map or something?


just though that it's probably my sentry handler throwing something


Shouldn’t do that regardless


Which version of onyx are you using?


@yonatanel We’re open to taking most PR’s into lib-onyx.


It’s the general experimental library.


I haven’t had time to look at :aggregation/init being ignored in local-rt. If the fix seems obvious to you, can you PR and send in a fix? It should roughly match what’s in core, and IIRC that’s where you found the discrepency?


It's been awhile since I checked, but if I call kill-job, does the clean shutdown imply that the job will stop taking new segments and if a segment has not made its way to the out tasks, the job will continue to run until it drains the existing segments that need to make their way through the workflow?


It will not drain segments, no. Shutdown is immediate.


OK, that's what I thought and generally want, however is there a way to drain segments?


I guess I can send a :done


OK, good, that's how I am doing it now. I hadn't touched onyx since a few versions back so just verifying I am remembering things right.


If I end up with a few thousand jobs in my cluster that are mostly parked, is it better to shut some down explicitly after a long period of inactivity or is the overhead of leaving it parked not really going to have any implications?


That is to say, those jobs can and will be run again so there's a tiny justification for just keeping them there, but I could just as easily check if a corresponding job type is running, and if not, launch a new one


The peers will remain blocked on real JVM threads.


OK, I thought the other day someone said the peers will be reclaimed


So in that case, it would make sense then to have some inactivity threshold and end the jobs either by killing them or sending a completion.


OK, I've read through this and the source and I think I'll end up writing my own scheduler to tweak things to my needs.


Oh wow, that murdered my formatting, but whatever, you get the point. It's rendering the raw template markup there


@yonatanel Patch for local-rt is out on Thanks!