This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2015-12-03
@lucasbradstreet: ok if we use https://github.com/pyr/clj-statsd/blob/master/project.clj in our onyx-metrics PR for statsd?
Sure, but make it part of the dev dependencies of the project, and then document that the dependency should be explicitly included if you use the functionality.
That's what we do with riemann
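The dev-dependency arrangement described above might look something like this in a Leiningen `project.clj`. The project name and version numbers here are illustrative, not taken from the actual onyx-metrics PR:

```clojure
;; Sketch: the statsd client lives in the :dev profile only, so it is
;; not pulled in transitively by consumers of the library. Users who
;; want the statsd functionality add clj-statsd to their own
;; :dependencies explicitly, as the docs would instruct.
(defproject your-onyx-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.7.0"]
                 [org.onyxplatform/onyx-metrics "0.8.2"]]
  :profiles {:dev {:dependencies [[clj-statsd "0.3.11"]]}})
```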
clever!
One day we may split out the projects so we can upgrade the library dependencies for our users but this works well enough for now
last two days have been bliss
not much input, true, but the latencies have been flat!
Yeah, that’s a really good sign that you’ve fixed the underlying issue which was mostly performance
Looks great
we’ll be going multi-node next - back down to 4-core 8gb ram machines, but two of them
probably going to need input on externalising aeron from you soon
@lucasbradstreet: i am SO happy with our Onyx system now. thank you so much for your patience and mentorship. it means the world!
I'm really glad to hear it! You definitely understand what's going on a lot better now, so I think you're well placed to deal with any potential issues that might come up.
Do you have any other Onyx shaped problems that you might tackle in the future?
yes. we’re going to be working on triggered/scheduled/reminder messages, and gamification badges quite soon. it’s all going to be tasks in Highstorm
the only thing that won’t be in HS is the actual scheduler. that’ll live outside. but the scheduler will create work for HS to perform
Cool, within the same job? Or will you decouple it?
fantastic question!
i suppose we’ll start with the same job and see
splitting it apart isn’t that hard to do. we have a find-tasks-in-namespaces abstraction, so we can easily just re-org namespaces and duplicate the job and link things correctly
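A helper like the find-tasks-in-namespaces abstraction mentioned above could plausibly be a metadata scan over a list of namespaces. This is a hypothetical sketch, not the actual Highstorm code; the `:onyx/task?` metadata key and the namespace names are invented for illustration:

```clojure
;; Hypothetical sketch: collect all public vars tagged with ::task-style
;; metadata from a set of namespaces. Because the job is assembled from
;; whatever tasks the scan finds, splitting one job into two is mostly a
;; matter of re-organising namespaces and pointing each job at its own list.
(defn find-tasks-in-namespaces [ns-syms]
  (for [ns-sym ns-syms
        :let   [_ (require ns-sym)]
        [_ v]  (ns-publics ns-sym)
        :when  (:onyx/task? (meta v))]
    v))

;; usage (namespace names are made up):
;; (find-tasks-in-namespaces '[highstorm.tasks.chat highstorm.tasks.stats])
```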
One cool idea I just had, and an advantage of splitting the job up: under high load you could run in a degraded mode by killing the secondary job until you get another box up.
that’s true. that’s a good reason to split things up early
Either way, it should be relatively easy to switch the code between the two
just pushed 1700 segments through, no latency spikes above 1s
backfilling all the missing stats from the last 90 days
-super impressed-
Killing it!
How's the load on the server looking? Depending on how much you've improved things you may want to look at using a higher max pending in the future.
Not that I'm suggesting you start mucking with things now that you have it working nicely :D
@robert-stuttaford: Nice man, congrats
we’re going multi-node next, and then we’ll look at raising that value
Sounds reasonable. I guess the key is how it performs under load / on hard data.
With datomic you could probably run experiments for that pretty easily in your test env by setting the starting tx suitably low and see how it handles the catch up.
Assuming you have something there that looks like the real db
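The catch-up experiment suggested above could be sketched against Datomic's log API: point the processor at a transaction point in the past and let it replay forward. This assumes a connection to a test database that resembles production; the handler function is a placeholder:

```clojure
;; Sketch: replay every transaction from a chosen basis-t onward, using
;; Datomic's log API. Setting start-t suitably low simulates a large
;; backlog and lets you watch how the system handles the catch-up.
(require '[datomic.api :as d])

(defn replay-from [conn start-t handle-tx!]
  ;; tx-range with a nil end yields all transactions from start-t to
  ;; the present; each entry's :data is the seq of datoms asserted or
  ;; retracted in that transaction.
  (doseq [tx (d/tx-range (d/log conn) start-t nil)]
    (handle-tx! (:data tx))))
```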
That's the awesome thing about log processing
we have some simulation tests going already. working on one that will let us go back to a previous backup, and replay transactions up till whatever point beyond that, at the same real time they occurred - and then adding a param to vary that replay speed. should be interesting. have a sim test currently that just spams it with chat events: on this i7 macbook, with chrome etc running in the background, the system can handle around 120 users concurrently no problem, which is around ~1500 transactions a minute. i'm really happy with that, thanks so much.
also working to get a close to production test setup going on aws, so we can do proper sim tests
That’s sweet
Exactly what you need!
It’ll be great when you implement new features
amen to that
@lucasbradstreet, question, do you know what the pros and cons are of doing the onyx-metrics approach, of summing and doing percentiles etc within the app, and then just sending these to something to plot them, versus not doing any calculations inside the app, but rather sending them directly to something like statsd to do the summing / percentiles etc, for you? i'm just wondering where one would be better than the other.
for high throughput work, we’d never be able to send that many events without a hit
so we really have no choice but to coalesce them in the app
I think for the current work you’re doing, you could probably send all the events and not take too much of a hit
For StatsD, are you guys going to initially do what we do and transform the events that the main metrics lifecycle puts on the channel, and send them out? As in here: https://github.com/onyx-platform/onyx-metrics/blob/0.8.x/src/onyx/lifecycle/metrics/riemann.clj
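A StatsD variant of that riemann sender might look roughly like this: drain the channel that the main metrics lifecycle writes to and forward each event. This is a hedged sketch - the `:service`/`:value` event keys are an assumption based on the 0.8.x metrics code, not a documented contract, and versions of the clj-statsd API may differ:

```clojure
;; Sketch: consume metric events from a core.async channel and forward
;; each one to StatsD as a gauge via clj-statsd.
(require '[clojure.core.async :as a]
         '[clj-statsd :as statsd])

(defn start-statsd-sender!
  "Forward events from ch to a StatsD daemon at host:port.
   Stops when ch closes."
  [ch host port]
  (statsd/setup host port)
  (a/go-loop []
    (when-let [{:keys [service value]} (a/<! ch)]
      (statsd/gauge service value)
      (recur))))
```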
I really have to clean up https://github.com/onyx-platform/onyx-metrics/blob/0.8.x/src/onyx/lifecycle/metrics/metrics.clj at some point
@greywolve: back to the first point, we’re processing on the order of 3M segments a second on some of our benchmark tests, where it would be pretty costly to send that order of events to riemann
sending out a whole event via TCP, versus adding one new measurement to an interval-metrics reservoir are very different orders of cost
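The cost difference can be seen in what sits on the hot path. In the coalescing approach, each segment only pays for an in-memory reservoir update; one small snapshot per interval is all that crosses the network. The interval-metrics function names below are from memory of that library and should be treated as a sketch:

```clojure
;; Sketch: per-segment work is a lock-free in-memory update!; only the
;; periodic snapshot! (one small map of rate + latency quantiles) gets
;; serialized and sent over TCP. Sending a whole event per segment at
;; ~3M segments/sec would instead mean millions of network writes.
(require '[interval-metrics.core :as im])

(def segment-metrics (im/rate+latency))

;; hot path - called once per segment, pure in-memory arithmetic:
(im/update! segment-metrics 123456) ; latency in nanoseconds

;; cold path - called once per reporting interval:
(im/snapshot! segment-metrics)
```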
A little peek at what's coming out next: https://gist.github.com/MichaelDrogalis/9f6109703c660789839b
Automated transfer of data across storage mediums through Onyx. Removes the grunt work.