#datalog
2020-09-11
simongray 10:09:56

Is there any kind of standard format to import/export datalog triples? I’m thinking something simple like ndjson, but for datalog triples. I realise it could easily be implemented, but I was just wondering if anything had been agreed on. I tried expanding my search to include newline-delimited EDN and found this, which I guess is good enough: https://github.com/lambdaisland/edn-lines

whilo 17:09:50

@simongray This is how our simple export and import functions in Datahike work as well, we just print one datom in each line. The problem for general import and export is to map the entity and attribute ids between different triple stores. What are you trying to do?
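As a rough illustration of the one-datom-per-line idea and the id-mapping problem mentioned above, here is a minimal Python sketch. It uses JSON lines rather than EDN, and the function names (`export_triples`, `import_triples`) and the `id_map` parameter are hypothetical; it is not how Datahike's export/import is actually implemented.

```python
import json

def export_triples(triples):
    """Serialize (entity, attribute, value) triples as one JSON array per line."""
    return "\n".join(json.dumps(list(t)) for t in triples)

def import_triples(text, id_map=None):
    """Parse newline-delimited triples.

    id_map (hypothetical) translates entity ids from the source store into
    ids valid in the target store; unmapped ids pass through unchanged.
    Values that are themselves entity references are NOT remapped here.
    """
    id_map = id_map or {}
    result = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        e, a, v = json.loads(line)
        result.append((id_map.get(e, e), a, v))
    return result

triples = [
    (1, "person/name", "Alice"),
    (1, "person/age", 30),
    (2, "person/name", "Bob"),
]
dump = export_triples(triples)
restored = import_triples(dump, id_map={1: 101, 2: 102})
```

Without an `id_map`, the round trip returns the original triples; with one, the entity ids are rewritten for the target store, which is the part that is hard to get right across different databases.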

simongray 17:09:08

Just doing preliminary research for a research project. Why would mapping entity and attribute ids be an issue? Aren't the mappings implied by the triples?

refset 18:09:01

In the classic Datomic datoms model, entity ids are intended for internal-only usage and shouldn't be communicated (or persisted!) outside the boundaries of the particular instance of the database system. For contrast, Crux took the opposite approach and requires the user to provide an explicit :crux.db/id value for each document.

whilo 19:09:31

You can also explicitly pass the entity ids in Datomic, Datahike, datalevin and Datascript, if you want. The difficulty is typically figuring out the mappings between different databases in hindsight. RDF has solutions for that; basically, you need to scope all the ids properly.
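One way to read "scope all the ids properly" is to prefix each local entity id with a namespace unique to its database, roughly as RDF does with IRIs. The sketch below assumes that reading; `scope_ids` and the `example.org` base IRIs are hypothetical and only illustrate the idea.

```python
def scope_ids(triples, base_iri):
    """Turn store-local entity ids into globally scoped identifiers.

    base_iri is an assumed, per-database namespace prefix.
    """
    return [(f"{base_iri}{e}", a, v) for e, a, v in triples]

# Two stores that both happen to use the local entity id 1.
db_a = [(1, "name", "Alice")]
db_b = [(1, "name", "Bob")]

merged = (
    scope_ids(db_a, "http://example.org/db-a/")
    + scope_ids(db_b, "http://example.org/db-b/")
)
```

After scoping, the two entities no longer collide when the triples are merged, even though both stores used the same local id.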

simongray 14:09:35

@U1C36HC6N I actually have a bunch of RDF data from a WordNet, but I’m not sure if getting it into one of the Clojure Datalog dbs makes sense, or if I’m better off using either Apache Jena or Neo4j. I have no experience with either.

refset 14:09:26

@simongray we've done a lot of benchmarking for Crux using RDF bench suites (LUBM and WatDiv, specifically), so there's quite a bit of code you could use or borrow in crux-bench and crux-rdf e.g. https://github.com/juxt/crux/blob/master/crux-rdf/src/crux/rdf.clj and https://github.com/juxt/crux/blob/master/crux-bench/src/crux/bench/watdiv.clj

refset 19:09:23

Are there any avid users of sub-queries / "nested queries" here? I am curious about the kinds of practical use-cases that people have come across. We added sub-query support to Crux a couple of days ago (releasing next week) with the immediate motivation being: the ability to transpose TPC-H queries without touching Clojure. This is unrelated to our prior crux-sql TPC-H work. Some test examples, for context: https://github.com/juxt/crux/blob/master/crux-test/test/crux/query_test.clj#L1186

lilactown 19:09:46

I did not even know that was a thing. Is that supported by datomic and/or datascript too?

whilo 19:09:57

@lilactown Yes, you can just call query again, but you need to provide an explicit binding between the surrounding query and the nested query. Datomic had some restrictions on how you can pass databases around, last time I checked.

lilactown 19:09:42

“just call query again” you mean calls to d/q?

pithyless 21:09:31

#TIL; also, this doesn't seem to be mentioned anywhere in the on-prem docs: https://docs.datomic.com/on-prem/query.html#built-in-expressions Is this just an oversight, are the cloud docs usually kept more up-to-date, or is this just not supported on-prem?