2017-07-16
I want to find every entity that has the attribute :my/keys
.. How do I write that in Datalog?
(d/q '[:find ?e :in $ [?keys ...] :where [?e :my/keys ????]] db [:foo/bar :bar/quiux])
@U2J4FRT2T ????
-> ?keys
:my/keys is a ref-to-many. Each of its values should have a :db/ident. If I pass :a :b :c, I want just the entities that have these keys.
You have to convert ?keys from idents to entity ids, using datomic.api/entid or with another clause
The v slot of query match clauses is never interpreted because its interpretation depends on the attribute
Only the e and a slots understand lookup refs and idents because their types are known structurally
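A minimal sketch of the "another clause" option mentioned above: resolve each ident to its entity id with a :db/ident clause, then match that id in the v slot. The names keys-query, ?ident and ?k are made up for illustration; :my/keys and the input idents come from the question. Note that this returns entities that have any of the given keys, not necessarily all of them.

```clojure
;; The query is plain data, so it can be defined without a database at hand.
(def keys-query
  '[:find ?e
    :in $ [?ident ...]
    :where
    [?k :db/ident ?ident]   ; resolve each ident keyword to an entity id
    [?e :my/keys ?k]])      ; the v slot now holds an entity id, which matches refs

;; Usage, assuming `d` aliases datomic.api and `db` is a database value:
;;   (d/q keys-query db [:foo/bar :bar/quiux])
;;
;; Alternatively, resolve the idents up front and pass entity ids in:
;;   (d/q '[:find ?e :in $ [?k ...] :where [?e :my/keys ?k]]
;;        db
;;        (map #(d/entid db %) [:foo/bar :bar/quiux]))
```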
I am a newb so don’t take my word for it, but Datomic executes Datalog queries by steps (I think @favila is the one who explained this to me, so he might be able to correct me / explain it to you)
but this does not mean you cannot query a dataset larger than what fits in memory of course
just that the data returned by the query (and the data used in each of the intermediate steps) should fit in memory
I suspect this is also the reason Datomic threw the following exception at me: > Exception Insufficient bindings, will cause db scan
It’s basically a degenerate case of the “query step data does not fit in memory”. If clauses are not specific enough then Datomic cannot use the indexes to narrow down the data to get from storage, and so it would have to scan the whole database, which in most applications would not fit in RAM
There might be other reasons, but since you asked the question and I just had this error 5 min ago I thought it might be partially related
Unrelated question: are datomic backups storage-agnostic? e.g. if I use “dev” mode for a while and then decide to move to Dynamo or SQL later, will I be able to smoothly transition by populating the new storage from a backup?
Sure you can
@val_waeselynck thanks!
@val_waeselynck maybe ^
With dates, is it common practice to store them as java dates and coerce them to clj-time/joda dates each time you query? Or is there some better way, such as just keeping them as clj-time/joda dates in datomic, so everywhere they are always clj-time/joda dates? The 'clj-time/joda everywhere' approach makes sense to me, but all the examples I've seen have java dates being stored.
There is https://receptive.io/app/#/case/17713 to request support for java.time Instants
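For context on the question above: Datomic's :db.type/instant values are java.util.Date, which is why the examples store java dates. A rough sketch of converting at the edge of the application is below. It uses java.time interop rather than clj-time, purely so the snippet needs no extra dependency; with clj-time, clj-time.coerce/from-date and clj-time.coerce/to-date would play the same role. The function names are illustrative only.

```clojure
(import '(java.util Date)
        '(java.time Instant))

(defn date->instant
  "java.util.Date (what Datomic hands back for an instant attribute)
  -> java.time.Instant."
  [^Date d]
  (.toInstant d))

(defn instant->date
  "java.time.Instant -> java.util.Date, suitable for transacting."
  [^Instant i]
  (Date/from i))
```

Both directions keep millisecond precision, so a value read from Datomic, converted, and converted back is equal to the original.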
@hmaurer well that explains why they tout a customer using Spark to overcome the limitation http://www.datomic.com/nubanks-story.html
But definitely the weak spot of the datomic architecture, even if most queries in a given system don't hit this wall.
By the way, while the docs still say that memory is cheap enough to fit all the data in memory, this is not a reality with enterprise data center memory prices (even if it is for your desktop machine).
The problem here is scalability and reliability: one day you can simply find out that your queries no longer fit in memory, just because data has accumulated over time. That is quite a terrible situation unless you can add more memory on demand to each machine in your cluster in emergency mode, which is, well, a terrible scenario..
I've been through the process of moving all our aggregations from Datomic to Elasticsearch and it went quite well. I see a lot of people who use a relational store and end up in a much worse situation when they hit that wall - because mutable databases simply aren't well suited to feeding derived data stores, as they can't answer 'what changed' queries out of the box
@val_waeselynck are you using the log api to keep ES in sync? Or do you follow another approach? out of curiosity
Yeah it’s much easier to do on Datomic… There is bottledwater for postgres but it’s much more complex: https://github.com/confluentinc/bottledwater-pg
Is the log API simply "the way" to sync all data changes to an external target, such as ES or even HDFS?
@matan you can do it in whatever way you want, but even when using a SQL datastore usually you want to sync data changes from a flux of events that describe all changes in your main data store
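To make the log-based approach concrete, here is a small sketch of flattening one transaction from the log into change records that a sync job could forward to an external store. It assumes each log entry is a map with :t and :data (as returned by datomic.api/tx-range) and that datoms support keyword lookup of :e, :a, :v and :added. The name tx->changes and the shape of the output maps are made up for illustration.

```clojure
(defn tx->changes
  "Flatten one log transaction into a seq of change maps."
  [{:keys [t data]}]
  (for [datom data]
    {:t  t
     :e  (:e datom)
     :a  (:a datom)   ; attribute entity id, not the keyword ident
     :v  (:v datom)
     :op (if (:added datom) :assert :retract)}))

;; Usage, assuming `d` aliases datomic.api, `conn` is a connection and
;; `last-synced-t` is wherever the previous sync stopped:
;;   (mapcat tx->changes (d/tx-range (d/log conn) last-synced-t nil))
```

Whether to push these records one by one or batch them per transaction depends on the target store; the sketch leaves that part out.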
On your earlier message, that’s not quite true. I don’t know Datomic’s internals in detail but I am pretty sure that standard queries on the “present” will not degrade in performance / memory consumed for a large database
I only commented on not scaling by query data size, not the size of the database... some queries will grow with the size of the database, and then ......
@matan I was just commenting on the part ” as you can simply one day find out that your queries no longer fit in memory just because data accumulation had persisted over time”
@hmaurer I know 🙂 and it still holds. Some queries grow with the database, so my statement holds 😉
@matan possibly, but Datomic’s target market isn’t big data. Also if you have a query which would need to go over extremely large amounts of data you are likely better off denormalising into another datastore