@yonatanel we have a doc with all our production recommendations http://www.onyxplatform.org/docs/user-guide/0.9.x/#performance-tuning
No worries. Just wanted to make sure you don’t miss it. Performance tuning is probably a bad section header for it
@akiel no. The best you can currently do is play the log back and figure it out from the time on the log entry that started the job
@lucasbradstreet We could display this info in the dashboard, like Apache does: two blocks, running jobs and finished jobs, with info such as the job-id. It's common that people want this info. I could implement this generally, I think.
@mariusz_jachimowicz I think he wants to get it from inside the peer as it’s running though, not from the dashboard
@lucasbradstreet Sure. I just think that there are common questions and we should provide answers in the dashboard, so people could have more visibility into what's going on. I will work on the implementation then.
@mariusz_jachimowicz: absolutely. More stuff like that in the dashboard would be appreciated
Yes, I'd like the information from inside a peer. The background is that my peers write files into S3 and I'd like to have a date inside the S3 key. But I'd like to have the same date for all peers. In case the job runs overnight, I don't want to have different dates for each peer.
@lucasbradstreet Yes, I could put it inside the catalog as a function param. Can I mix function params from the catalog with params from the lifecycle? My function already gets params from the lifecycle.
If it already gets params from the lifecycle, you can grab what you want out of the task-map and add it to the params from the lifecycle when you add your other params
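One way to combine the two, sketched very loosely against the Onyx 0.9.x lifecycle API: a `:lifecycle/before-task-start` hook can read a value out of the catalog entry (the task-map on the event) and append it to `:onyx.core/params`. The `:my/start-date` key, the `:write-s3` task name, and the namespace are all hypothetical stand-ins.

```clojure
;; Sketch only: assumes an Onyx 0.9.x-style lifecycle API.
;; :my/start-date is a hypothetical key you would put in the catalog
;; entry at submission time.
(defn inject-start-date
  [event lifecycle]
  ;; The task-map (catalog entry) is available on the event map, so a
  ;; value submitted in the catalog can be appended to the params that
  ;; other lifecycles already provide.
  (let [start-date (get-in event [:onyx.core/task-map :my/start-date])]
    {:onyx.core/params (conj (vec (:onyx.core/params event)) start-date)}))

(def calls
  {:lifecycle/before-task-start inject-start-date})

;; Lifecycle entry referencing the calls map above:
;; {:lifecycle/task :write-s3
;;  :lifecycle/calls :my.app/calls}
```

Because the submitter writes the date into the catalog once, every peer running the task sees the same value, which addresses the same-date-across-peers concern above.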
Hello people, a little help here. Is it possible to test, with onyx-local-rt, a job created to read from onyx-datomic? I am getting the following error when using the RT.
CompilerException java.lang.IllegalArgumentException: No method in multimethod 'write-chunk' for dispatch value: [nil :chunk]
It’d be nice, and we may support it in the future, but currently too much is going on with the run-time to support it
I think I figured out how to implement my aggregate that needs initialization per group by doing it at the input level instead of in the aggregation, and should I need to initialize from a snapshot I will send it as a segment. Durable snapshots will be saved in the sync function.
@lucasbradstreet In the dashboard, should I display job duration as
min-duration / max-duration, where
min-duration = the min of the
(:seal-output timestamp − :signal-ready timestamp) values
max-duration = the max of the
(:seal-output timestamp − :signal-ready timestamp) values
to show the min and max diff for processing a job, right?
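The proposed calculation itself is simple once the per-peer timestamp pairs are in hand; a minimal sketch in plain Clojure (the map keys and function name are illustrative, not dashboard API, and timestamps are assumed to be milliseconds):

```clojure
;; Given, per peer, the :signal-ready and :seal-output log-entry
;; timestamps (ms), take the min and max of the per-peer differences.
(defn min-max-duration
  [timestamp-pairs]
  (let [durations (map (fn [{:keys [signal-ready seal-output]}]
                         (- seal-output signal-ready))
                       timestamp-pairs)]
    {:min-duration (apply min durations)
     :max-duration (apply max durations)}))

(min-max-duration [{:signal-ready 100 :seal-output 450}
                   {:signal-ready 120 :seal-output 900}])
;; => {:min-duration 350, :max-duration 780}
```

As the next message points out, the hard part is not the arithmetic but knowing which log entries actually apply, since the job may end up in :completed-jobs or :killed-jobs.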
It’s not really possible to use any of those things without actually looking at whether the log entry moves the job to :completed-jobs or :killed-jobs
If I recklessly open new threads inside a task or aggregation, is it possible to starve a peer, or do you use your own pool?
@yonatanel most of the processing is done on threads created with core.async/thread https://github.com/clojure/core.async/blob/master/src/main/clojure/clojure/core/async.clj#L428
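For reference, `core.async/thread` runs its body on core.async's own cached thread pool (separate from the go-block dispatch pool) and returns a channel that yields the body's result, so blocking work does not starve the go-block threads. A minimal sketch:

```clojure
(require '[clojure.core.async :as async])

;; thread runs the body on a cached thread pool (not the go-block
;; dispatch pool) and returns a channel delivering the body's result.
(def result-ch
  (async/thread
    (Thread/sleep 50)   ; stand-in for blocking work
    :done))

(async/<!! result-ch)
;; => :done
```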
@lellis The reason that we don’t (and likely will not) support it is that local-rt is designed to be functionally pure. We can’t integrate external storage plugins and achieve that goal.
You can make a simple adapter to dump data into the local-rt from whatever source you’re reading from and manage that in your application.
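A loose sketch of such an adapter, assuming the onyx-local-rt API (`init` / `new-segment` / `drain` / `stop` / `env-summary`). Here `read-source-segments` is a hypothetical function standing in for whatever pulls data from the real input (e.g. a Datomic query), and `:in` is assumed to be the job's input task name:

```clojure
;; Sketch only: feed externally-read segments into a pure local-rt env.
(require '[onyx-local-rt.api :as api])

(defn run-locally
  [job read-source-segments]
  (let [segments (read-source-segments)]   ; hypothetical source reader
    (-> (reduce (fn [env segment]
                  ;; push each segment into the input task by hand,
                  ;; instead of going through a storage plugin
                  (api/new-segment env :in segment))
                (api/init job)
                segments)
        (api/drain)
        (api/stop)
        (api/env-summary))))
```

Keeping the read outside the runtime preserves local-rt's purity: the env only ever sees plain segments, never a connection to external storage.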
Re @akiel @mariusz_jachimowicz Grabbing the job start time is a good idea, but figuring out what that time is needs some consideration. The best “shared location” to get the wall clock time would be off the znode in the ZooKeeper cluster, but I agree with @lucasbradstreet that the submitter of the job has the best notion of “now”, and the job start time should be supplied through the catalog at submission time.
The dashboard (and other log subscribers) can read the wall clock time off the submit-job znode from ZooKeeper for the former.
@lellis If that ends up being a giant pain in the neck, we can look at making a library of adapters to move data from the plugin targets to the local-rt and back. That seems useful.
Not to mention that the current plugin API is very nasty for that kind of use
@lellis Sweet, glad it wasn’t too bad. The other reason for that design choice is that local-rt is used in ClojureScript 🙂
@lellis I don’t know if it helps, but you’ve probably noticed a lack of validation with local-rt compared to core. There’s a mostly complete Spec in the namespace onyx.spec that you can leverage if you want.
Each log entry has a created-at value, so that will be the timestamp value for me
You can see that we do some tracking in here https://github.com/onyx-platform/onyx-dashboard/blob/0.9.x/src/cljs/onyx_dashboard/controllers/websocket.cljs#L41
You could just look for when jobs are added between two replica states, and add that to the om state
In general we shouldn’t be storing what can be discovered by replay. Replaying the log is meant to be short in duration to catch up to the head.
I dunno. A little bit of extra state so you don’t have to replay some things is not so bad
The lag is more or less invisible once the subscriber is caught up. Removing state from the equation in exchange for a quick one-time lag is a better play IMO.
I am suggesting that as the replicas are updated on the client, we should record the created-at time that jobs are first submitted/killed/completed in om’s state
Oh, I thought we were talking about something completely different. Nevermind 😛
I think @mariusz_jachimowicz was suggesting that we could just read the znodes for the job chunks when he said "so it could be taken from ZK rather than replaying log entries”, so that’s probably what you were responding to
Yes, I was just curious whether we could store some useful data directly in ZK, without needing to replay log entries, so that we could (in the dashboard) answer common questions like general activity history (each job's duration, start/stop time) and whether any errors were seen…
There's no sane way for me to guarantee I write to Kafka with Onyx in order, is there? Without going too much into what I'm doing, let's say I want to deliver paged results from a DB in order over time, in batches. Can I guarantee in any way that I write results 1-10 first, then 11-20 next, and so on? I can potentially have a delay between batches and use some form of windowing, but I don't want it to be some black-magic trick similar to a Thread.sleep that works only 90% of the time. I need things in order as much as possible. As far as I know, there's no great way to do this given the distributed nature of things.
I'm doing some of that now, but I'm thinking it won't work for me. I would need a lot of jobs per window definition, which in my case varies with the input data.
So user A would be getting results every 5 seconds ordered, but user B needs results every 10 seconds for example.
I guess if windows are the only way, I'll have to either go with actors or think of some way to design my job to handle dynamic portions from the input data without requiring separate jobs.