This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2017-05-17
Channels
- # aws (16)
- # beginners (82)
- # boot (29)
- # cider (43)
- # cljs-dev (90)
- # cljsrn (14)
- # clojure (79)
- # clojure-dev (12)
- # clojure-greece (4)
- # clojure-italy (12)
- # clojure-russia (81)
- # clojure-shanghai (1)
- # clojure-spec (39)
- # clojure-uk (28)
- # clojurescript (159)
- # consulting (1)
- # cursive (16)
- # data-science (6)
- # datomic (18)
- # devops (3)
- # emacs (22)
- # figwheel (1)
- # graphql (15)
- # hoplon (3)
- # jobs (1)
- # jobs-discuss (8)
- # leiningen (1)
- # luminus (6)
- # lumo (1)
- # off-topic (18)
- # om (6)
- # onyx (38)
- # pedestal (30)
- # perun (3)
- # re-frame (38)
- # reagent (8)
- # ring-swagger (2)
- # rum (2)
- # sql (2)
- # unrepl (14)
- # untangled (1)
- # vim (8)
Morning all, I'm working on an issue that has been safe to ignore but troubling me nonetheless - implementing a "buffer" for DB writes. My workflow is effectively "analyse msg, attach side-effects, assert side-effects, reason about side-effects, save side-effects", and it's the saving that has an issue: the possibility that a message creates the same side-effect. So I'm trying to implement a "buffer" that allows the first msg through while aggregating subsequent messages, until the system is working on an up-to-date DB entry so that it no longer creates side-effects but modifies them, whereupon I would release the aggregated modifier. (I hope this makes sense)
I'm just looking for some direction on how I might go about doing this... in other situations I "prime" the DB so that all fields exist and every side-effect is a modifier, but that doesn't seem correct here as it will quickly bloat. So instead I'm attempting to use a session window with a segment trigger to allow the first msg through while aggregating the others. This seems like a good way to tackle it, but I'm open to suggestions from more experienced devs on this subject. A part of me wonders if this is the perfect situation for flow-conditions? I went with window/trigger because of the need to accumulate state.
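The session-window-plus-segment-trigger idea above might be sketched roughly like the following Onyx window and trigger entries. This is a hedged illustration only: the task name `:write-side-effects`, the session key `:doc-id`, the timeout gap, and the sync function `::flush-buffer` are all hypothetical, and the exact keys should be checked against the Onyx windowing docs for the version in use:

```clojure
;; Sketch: a session window that groups segments per document, so
;; side-effects for the same DB entry accumulate together.
(def windows
  [{:window/id          :side-effect-buffer
    :window/task        :write-side-effects   ; hypothetical task name
    :window/type        :session
    :window/aggregation :onyx.windowing.aggregation/conj
    :window/window-key  :event-time
    :window/session-key :doc-id               ; hypothetical grouping key
    :window/timeout-gap [5 :seconds]}])       ; close session after a quiet gap

;; Fire on every segment, so the first message can be released
;; immediately while later ones keep accumulating in the window state.
(def triggers
  [{:trigger/window-id :side-effect-buffer
    :trigger/id        :per-segment
    :trigger/on        :onyx.triggers/segment
    :trigger/threshold [1 :elements]
    :trigger/sync      ::flush-buffer}])      ; hypothetical sync fn
```

The `:trigger/sync` function would be where the "first create goes through, subsequent creates fold into a modify" decision lives.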
@theblackbox what database are you using?
I'm using mongo
and you are totally correct, except that the async nature means I can't guarantee the side-effect is going to be an upsert
that's what I mean by buffering until a segment is a modifier then simply aggregating the buffered creates into the modify
I think I'm on the right path with window/trigger/state to getting a working solution, then I think I need to improve it by utilising a persistent queue - which I have available in the guise of Kafka
right, what you can also do is version your data and try to do a 'compare and swap' operation when storing it -- if the data changed in the meantime, you can redo the calculation
this is a form of optimistic locking, though, and if your chance of having conflicts is too high, it will not scale
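The versioned compare-and-swap suggestion could look something like this against Mongo. A minimal sketch, assuming the Monger client library; the collection name `"docs"`, the `:version` field, and the retry limit are all hypothetical choices, not anything from the conversation:

```clojure
(require '[monger.collection :as mc])

;; Sketch: each document carries a :version field. An update only
;; matches when the version we read is still current; if nothing
;; matched, another writer won the race, so recompute and retry.
(defn cas-save!
  [db id compute-fn max-retries]
  (loop [attempt 0]
    (let [{:keys [version] :as current} (mc/find-map-by-id db "docs" id)
          updated (compute-fn current)
          result  (mc/update db "docs"
                             {:_id id :version version}        ; compare
                             {"$set" (dissoc updated :version) ; swap
                              "$inc" {:version 1}})]
      (when (and (zero? (.getN result))  ; no doc matched => conflict
                 (< attempt max-retries))
        (recur (inc attempt))))))
```

As noted above, this only pays off when conflicts are rare; under heavy contention the retry loop becomes the bottleneck.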
cheers, appreciate the advice
just got the Unfreezable type: class clojure.lang.Delay
error when starting a job in a fresh environment / deploy.
found relevant slack logs:
https://clojurians-log.clojureverse.org/onyx/2017-01-04.html
https://clojurians-log.clojureverse.org/datomic/2016-10-04.html
onyx "0.9.15"
onyx-datomic "0.9.15.0"
datomic-pro "0.9.5561"
restarted the job several times to see if it consistently repros - it does. the value is:
#object[clojure.lang.Delay 0x5bb71dc7 {:status :pending, :val nil}]
I didn't see any resolution on this issue in the Slack logs. should onyx-datomic work around it by checking if tx-range returned a delay?
this same job is running successfully in another staging env against a separate datomic db that has nearly identical data. this is the first time i've seen that error.
Does the problem persist across JVM restarts? I wonder if it's something particular with your dataset that can be used to help diagnose the problem with the Datomic team.
Error communicating with HOST 0.0.0.0 or ALT_HOST datomic on PORT 4334
– i see this intermittently. trying again
this is a fresh datomic db with schema installed and about 500 generated test entities.
would be useful to have some logging in onyx-datomic to verify tx-range is for certain returning the Delay
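For the workaround idea, `clojure.core/force` derefs a `clojure.lang.Delay` and passes any other value through unchanged, so the defensive check plus logging suggested above could be as small as this sketch (`realize-checkpoint` is a hypothetical name, not anything in onyx-datomic):

```clojure
;; force behaves like:
;;   (force (delay 1)) => 1
;;   (force 1)         => 1
(defn realize-checkpoint
  "Hypothetical guard: log and realize any Delay before the value
  reaches the checkpoint serializer, so it is never 'unfreezable'."
  [value]
  (when (delay? value)
    (println "WARN: tx-range returned a Delay, forcing it:" value))
  (force value))
```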
That's really good that you found a repro, I'm sure they'll be able to assist further.
I don't think the dashboard provides that yet. Are you trying to get at the actual checkpoint state, or metrics on checkpointing?
should that be exposed on an endpoint, e.g. https://github.com/onyx-platform/onyx-peer-http-query#endpoints ?
@devth If the Datomic team doesn’t have an update, it’s probably a good idea for us to work around it in onyx-datomic. Why that’s happening is still a mystery to me.
Thanks!
Been doing a lot of design work on log based architectures the last few months. Not directly related to Onyx, but hopefully useful for this crowd: http://pyroclast.io/blog/2017/05/17/the-future-of-event-stream-processing.html
Thanks. Hope some of the insight is fresh. 🙂
If you’re also willing to vote it up on https://news.ycombinator.com/newest it’d be appreciated 🙂
Nice writeup, it seems we are increasingly ditching b-trees for logs. However it does not even hint at possible issues/how to resolve things like read-after-write, monotonicity etc. I would love to know what you guys think 🙂
@nha Only so much room in one post. 🙂 Future material.
Though, important to note that in the context of event steaming, I’m not trying to say that materialized views are replacements for any sort of database. In the full landscape of data storage engines, this is just one narrow case.