2017-05-26
having trouble grokking the "set" "unset" in the documentation for altering a schema to be non-unique
(d/transact conn '[{:db/id :user/email
                    :db/unique unset}])
not working like I would hope it would
ahhh hahaha
darn users ... can't they read?
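(For reference: there is no literal unset value to assert; the documented way to drop a uniqueness constraint is to retract the attribute's current :db/unique value. A minimal sketch, assuming :user/email currently carries :db.unique/identity:)
;; retract the existing uniqueness value from the attribute entity itself
@(d/transact conn [[:db/retract :user/email :db/unique :db.unique/identity]])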
does anyone have any idea why these might be conflicting?
{:d1 [17592186045678 :user/id 52 13194139534544 true],
:d2 [17592186045678 :user/id 71 13194139534544 true]}
or, what questions would you ask to determine if they are conflicting?
this is the schema:
#:db{:ident :user/id,
:cardinality :db.cardinality/one,
:valueType :db.type/long}
You can't assert multiple values against the same EA pair in a single transaction if the attribute is cardinality one
to further the mystery... I attempt to transact a bunch at once (d/transact conn list-of-maps)
and I was getting lots of conflicts, but when I do
(doseq [m list-of-maps] (d/transact conn [m]))
it works fine
Because the entirety of a transaction is atomic (i.e. it all happens at exactly the same time), how would you know which value to assert?
ohhh... so, all those maps were being assigned the same entity?
I really should've gone to the g*dd*mn Day of Datomic training ... do you know if they're doing another one at the conj this fall?
dooo ittttt
so is the second way of doing it preferred for a bunch of separate entities?
Looking at your original 2 conflicting datoms - you're saying the same entity has both 52 and 71 as id
i.e. for a more complete example
(d/transact conn [#:user{:disable false,
:email "i********@****.net",
:authenticated false,
:pwdhash
"pbkdf2:sha1:1000$mIrT****************",
:lastname "B***",
:hawk "ib******",
:username "ibi*****",
:firstname "I*****",
:id 53,
:group_name "client",
:count 0,
:last 1485361260000}
#:user{:disable false,
:email "robe*****@***.net",
:authenticated false,
:pwdhash
"pbkdf2:sha1:1000$****************,
:lastname "F",
:hawk "r*****",
:username "r"****,
:firstname "R****",
:group_name "admin",
:count 0,
:id 77
:last 1491327360000}])
there were a bunch more of those in the transaction but I just pulled two
those are somehow all being regarded as the same entity?
oh, I filtered those out in desperation
Ah. Yes it seems that you have multiple maps referring to the same entity. Do you have a unique identity or value attribute?
no I didn't make any of them unique
if I had made at least one of the attributes unique would a bulk-add have worked?
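(For context, a sketch of what an upsertable identifier would look like if :user/id had been declared unique from the start; whether that's the right modeling choice here is an open question:)
;; with :db.unique/identity, maps asserting the same :user/id upsert onto the
;; same entity, and maps with different ids resolve to different entities
{:db/ident       :user/id
 :db/valueType   :db.type/long
 :db/cardinality :db.cardinality/one
 :db/unique      :db.unique/identity}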
You could add an explicit db/id to each to be sure, but the behavior you describe is unexpected
I thought it implicitly created a db/id
Unless there's a unique attribute or you have the same temp ids in more than one map
It does. But you can use an arbitrary string temp id for instance to refer to other entities in the same txn
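(A sketch of the explicit-tempid variant; the string values are arbitrary, anything distinct per map works, and the attribute values are placeholders:)
;; each map gets its own string tempid, so they cannot collapse into one entity
(d/transact conn [{:db/id "user-53" :user/id 53 :user/email "i***@****.net"}
                  {:db/id "user-77" :user/id 77 :user/email "r***@***.net"}])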
ah I see ... let me check
datomic-pro-0.9.5561
If you'll email me a repro (schema and txn that fail) I'll have a look tomorrow morning.
sure thing
@val_waeselynck I just came across your datofu project (via a slack log archive)
I'm wondering what happened to the idea of defining the schema using helper :db/fns as you suggested in https://stackoverflow.com/a/31480922
at the time you said you hadn't tried it because you were happy with generating the schema via code.
have you tried it since, or do you know anyone who has tried it?
onetom: haven't tried it, I've been moving more in the opposite direction. Regarding modeling, I see the Datomic schema as a derived thing rather than a source of truth; my approach for http://bandsquare.com is to store model metadata in a DataScript database from which installation transactions are derived.
I still believe that for most projects, datofu's approach will be the most reasonable one, at least for getting started. Datomic's transactions being data doesn't mean they have to be written as data literals
I should add this to the SO question
I now believe even less in the database-functions approach than I did at the time - they'd just be an un-portable DSL disguised as data
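(A sketch of the "generating the schema via code" style being contrasted here; the helper name is made up:)
(defn string-attr
  "Builds a cardinality-one string attribute map; extra key/values override the defaults."
  [ident & {:as opts}]
  (merge {:db/ident       ident
          :db/valueType   :db.type/string
          :db/cardinality :db.cardinality/one}
         opts))

(def schema
  [(string-attr :user/email :db/unique :db.unique/identity)
   (string-attr :user/firstname)])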
interesting... however what does porting mean? you would expect that some other system might want to read the same EDN data which describes the schema as transaction function calls? it will see a vector of lists which contain a symbol a few keywords and a string. if you would just use the data literal, then you would get a vector of namespaced-keyword-keyed maps. then what would be the next step that other system would do with this data? it would still need to interpret it somehow. the db/fn approach means it would just deal with positional parameters as opposed to named ones... and if that other system would understand datomic schema attribute names already, then it should just receive the output of a datomic query which returns a schema as maps using pull... π
not that I don't like the functional approach, it's just that the person I'm working with at the moment insists on using .edn files for the schema and similar seed data.
and it works for now, so instead of resisting, I'd like to trick him toward a more concise solution
In this case, trick him by using custom EDN tagged literals
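(A sketch of that idea: keep the .edn file, but let a custom tag expand compact entries into full schema maps. The my/attr tag, the attr helper, and the file layout are all made up for illustration:)
;; schema.edn stays a plain data file:
;;   [#my/attr [:user/email :db.type/string :db.unique/identity]
;;    #my/attr [:user/id    :db.type/long]]
(require '[clojure.edn :as edn])

(defn attr
  "Expands a compact [ident value-type & flags] vector into a full schema map."
  [[ident value-type & flags]]
  (cond-> {:db/ident       ident
           :db/valueType   value-type
           :db/cardinality :db.cardinality/one}
    (some #{:db.unique/identity} flags) (assoc :db/unique :db.unique/identity)))

(def schema
  (edn/read-string {:readers {'my/attr attr}} (slurp "schema.edn")))
;; then: (d/transact conn schema)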
anyone have any tips for muscling through large sql imports?
my transactor keeps timing out
do smaller imports you say? that's a solid idea. I'm glad we had this talk
yeah I think part of the problem is I'm using jdbc
and pulling the whole table into memory which isn't great either
I need to figure out a way to do a lazy-seq on the rows
and I am doing transact-async, and I'm assuming it's without derefing b/c I didn't know derefing was a technique you could employ
(defn import-table! [conn db table-name tx-fn]
  (do (import-schema! conn db table-name)
      (d/transact-async conn (import-table conn db table-name tx-fn))))
what's a better practice?
loop over the rows and transact them one at a time or in chunks?
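(A sketch of avoiding the full in-memory table, assuming clojure.java.jdbc: processing rows inside :result-set-fn consumes the row seq off the open ResultSet instead of realizing the whole table first. The name import-table-streaming! is hypothetical, row->tx-map stands in for whatever tx-fn already does per row, and the chunk size is a guess:)
(require '[clojure.java.jdbc :as jdbc])

(defn import-table-streaming! [conn db-spec table-name row->tx-map]
  (jdbc/query db-spec
              [(str "select * from " table-name)]
              {:result-set-fn
               (fn [rows]
                 ;; rows is consumed incrementally here; chunk, transact, and deref as we go
                 (doseq [chunk (partition-all 100 rows)]
                   @(d/transact conn (map row->tx-map chunk))))}))
;; depending on the JDBC driver you may also need a fetch size (and other
;; driver-specific streaming settings) before rows actually stream rather than buffer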
to be fair this stuff is fairly cutting edge as far as tech goes
it's really nice we have a nice community (aka (== @favila 'community))
you know it's also interesting ... part of the brilliance of this all is that Prolog has been around for a long time and so have databases, and it's so awesome that someone finally put them together
in order of importance: 1) transact in chunks of 1000 ish datoms 2) use pipelining 3) do it with a separate amped-up transactor with no other load, or on a local machine (or whatever) and get it into production with a backup/restore 4) dial up memoryIndexThreshold and memoryIndexMax (to avoid indexing as long as possible)
baller... that oughta be pinned
thanks again
the rest you can often ignore. since your entire import fits in memory anyway it's unlikely the other stuff matters much
well ... some tables
okay so what would that look like?
(doseq [chunk chunks]
  @(d/transact conn chunk))
?
I'm not sure I understand the significance of derefing the transaction
ah I see
but if you deref it you get the benefits of async and sync
it automatically adds a timeout, and does not return until the future is either done or throws because it timed out
ahhhh interesting
but really long waits are not abnormal on a bulk import job so you don't want the timeout
however, that doesn't wait at all, so if you call it over and over without deref you are just overwhelming the transactor with potentially thousands of tx requests
(and not checking for errors either--transactions may legitimately fail but you won't see the error and won't stop issuing txes)
however immediately derefing is slow: it means tx is sent, and no new tx is sent until response is received
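(For illustration: deref blocks until the tx future completes, and the three-argument form gives you an explicit timeout, which is roughly what d/transact layers on for you. The 60-second figure is arbitrary:)
(let [fut (d/transact-async conn chunk)]
  ;; waits up to 60s; a failed tx throws here, a stuck one returns ::timed-out
  (deref fut 60000 ::timed-out))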
that's where pipelining comes in: you send maybe 10-20 d/transact-async at a time and deref later or in another thread
yeah that's fine, slow is not an issue
just steady and reliable is important
:thinking_face:
so you're saying send 10-20 chunks sized 1000 and then deref somewhere?
pipelining relies on not waiting for a deref to finish before sending another d/transact-async
this one does (but I rarely use it, not sure how bug-free it is) https://gist.github.com/favila/3bc6fae005228a3290d5509c088e2f11
in-order isn't so important
gather the in-flight futures into a vector. when it reaches the desired size, start derefing them and removing the derefed ones from the vector, then keep going
oooo okay fantastic
I think it would probably be doable to use channels for that
don't deref all the in-flights at once?
why is flushing the pipeline bad?
or is just inefficient?
hmm is that true even if you are derefing them on a separate thread/channel?
so you go from e.g. 20 in flight, then your depth is reached and you start derefing them all, so then you have 0 in flight, then you issue 20 in flight all over again
ahhhh
when your inflights fill, be careful to deref only some, not all of your backlog, or else you will empty your pipeline
I see
so... fill up 20, deref 10 or so, bring on 10 more, deref 10 or so, etc?
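(A rough sketch of that windowed approach, under the assumptions above: window of 20, drain roughly half when it fills, never empty the pipeline. The function name and numbers are illustrative:)
(defn pipeline-transact!
  "Sends tx chunks with d/transact-async, keeping at most `window` in flight.
  When the window fills, derefs about half of the oldest futures (surfacing
  any tx errors) before sending more, so the transactor is never left waiting."
  [conn window chunks]
  (loop [in-flight [] chunks chunks]
    (cond
      (and (empty? chunks) (empty? in-flight))
      :done

      ;; window full, or nothing left to send: drain some (or all remaining) futures
      (or (empty? chunks) (>= (count in-flight) window))
      (let [n (if (empty? chunks) (count in-flight) (max 1 (quot window 2)))]
        (doseq [fut (take n in-flight)] @fut)
        (recur (vec (drop n in-flight)) chunks))

      ;; otherwise keep the pipeline fed
      :else
      (recur (conj in-flight (d/transact-async conn (first chunks)))
             (rest chunks)))))
;; e.g. (pipeline-transact! conn 20 (partition-all 100 tx-maps))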
If I had time for such things this would make for an interesting study
I don't know how much depth matters, just as long as the txor never has to wait for another tx from you
gotcha. I did a NoSQL -> MySQL transfer pipeline once that did a similar thing: it attempted to optimize its write speed by varying chunk size, but honestly I'm not sure it was worth the effort; the gains were fairly marginal
gotcha. Excellent, gives me a great place to start