This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2017-06-30
Going through the user guide and the sample project https://github.com/onyx-platform/onyx-examples/blob/0.9.x/multi-output-workflow/src/multi_output_workflow/core.clj, may I test a few assertions about the architecture here?
1 existing cluster deploy tools should be used; no need for Onyx to invent its own mechanism for it
2 peers are network nodes, or the process that manages virtual peers on a node; virtual peers are workers running within peers
3 tasks are split evenly across all peers and virtual peers
4 nodes can be added and removed at will; a new node joins the cluster and seamlessly begins performing tasks
5 tasks have types: input, function (which is really the processing part of it all), output
6 the number of virtual peers participating in a task is determined by the max-peers key of the task
7 system-wide snapshots are somehow taken (in the latest version, through an algorithm optimizing for space efficiency rather than latency) such that if a node is removed or stops responding, work is never (?) lost
8 exactly-once semantics are enforced (but side-effects may occur more than once)
9 gc should be invoked by user code or an admin command; this is kind of a rough edge, something to automate in future versions, or just a tuning task for the system architects of a particular system
1. Yes.
2. Yes.
3. Depends on the scheduler being used. It is controlled by both the job and task schedulers, and by per-task constraints like onyx/min-peers and onyx/max-peers.
4. Yes.
5. Yes.
6. max-peers is the upper bound, but there are other factors. A task can receive fewer peers unless min-peers is specified. Otherwise it could get some number in between.
7. Other way around, sort of. I'd argue it's generally optimized for latency. We used a more space-efficient algorithm in 0.9; 0.10 dumps an entire snapshot at every epoch checkpoint.
8. Yes. "Exactly-once aggregations" are supported. Exactly-once side effects are snake oil. 🙂
9. Yeah. Though, most of the time we switch to a new tenancy before invoking GC.
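For readers following along, points 3, 5 and 6 can be pictured with a rough sketch of an Onyx job as plain data. This is an illustration, not the configuration of the linked sample project. The task names, the function `:my.app/my-inc`, the batch sizes, and the peer counts are all made up, and the keys are written as I understand the catalog format, so check them against the user guide for the version you run.

```clojure
;; Hypothetical three-task job; every name and number here is illustrative.

;; The workflow is a vector of edges between task names.
(def workflow
  [[:in :inc]
   [:inc :out]])

;; One catalog entry per task. :onyx/type is :input, :function, or :output.
(def catalog
  [{:onyx/name :in
    :onyx/type :input
    :onyx/plugin :onyx.plugin.core-async/input
    :onyx/medium :core.async
    :onyx/max-peers 1
    :onyx/batch-size 10}

   {:onyx/name :inc
    :onyx/type :function
    :onyx/fn :my.app/my-inc        ; hypothetical function, not from the sample project
    :onyx/min-peers 2              ; lower bound on virtual peers for this task
    :onyx/max-peers 4              ; upper bound; the scheduler may assign fewer
    :onyx/batch-size 10}

   {:onyx/name :out
    :onyx/type :output
    :onyx/plugin :onyx.plugin.core-async/output
    :onyx/medium :core.async
    :onyx/max-peers 1
    :onyx/batch-size 10}])

;; The task scheduler is chosen per job.
(def job
  {:workflow workflow
   :catalog catalog
   :task-scheduler :onyx.task-scheduler/balanced})

;; The job scheduler is chosen in the peer configuration (fragment only).
(def peer-config-fragment
  {:onyx.peer/job-scheduler :onyx.job-scheduler/balanced})
```

With bounds like these, the `:inc` task would get somewhere between 2 and 4 virtual peers, depending on how many are available and on what the schedulers decide.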
michaeldrogalis: thanks a lot 🙂 I'll continue learning through further reading beyond this now.
@U050A65BL What did you mean, literally, by "most of the time we switch to a new tenancy before invoking GC"? What does switching a tenancy mean?
@matan I'll get back to you on the mailing list tonight, btw