#datalog
2022-03-24
David Epstein07:03:25

Hi everyone. I'm new to the JVM and noticed there are many datalog implementations related to clojure. I have three questions: (1) what is the relationship between datalog and clojure? (2) can these implementations be used from kotlin? (3) what implementation has good rule unification for a prolog/datalog style symbolic AI and inference project?

refset10:03:35

Hi 🙂 I'm travelling at the moment so can't write up a proper response right now, but this example may be interesting to you (regarding XTDB's Datalog implementation, at least): https://gist.github.com/refset/21b3fc1dec9a6928943073809e13356d

refset10:03:29

XT has quite a few Kotlin users, and there's a Kotlin query DSL for Datalog in the repo also if you would prefer to avoid edn

refset10:03:15

Datalog appeals to Clojurists because it's relatively simple (few features), declarative, and composable

David Epstein10:03:45

Thank you, @U899JBRPF. I'm familiar with Prolog but not Datalog. I'm looking forward to learning more about it. Thanks for the link!

refset11:03:30

Ah, in that case I should add that the main interest in Datalog from the Clojure ecosystem perspective has been as a lightweight database query language, but there are certainly whole other worlds beyond that. See how RDFox uses Datalog, for instance

David Epstein11:03:57

@U899JBRPF, the link you provided mentions Flix as well, which appears to be an entirely different JVM language?

refset11:03:11

This other recent thread might be interesting to you also, if you hadn't already seen it: https://clojurians.slack.com/archives/CJ322KHNX/p1647192994651699?thread_ts=1647192994.651699&cid=CJ322KHNX I have a "Sudoku solver in XT Datalog" blog post in the works that scales up the same idea

David Epstein11:03:00

@U899JBRPF, is there an implementation that would allow me to work close to traditional logic programming syntax? For example:

female(mary).
female(alice).
parent(alice, mary). 
mother(M, C) :- parent(M, C), female(M).
Or maybe in tuples like:
(mary isa female)
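
For readers unfamiliar with how such a rule is evaluated, here is a minimal sketch (in Python, purely for illustration) of the `mother` rule as a join over facts stored as triples, roughly in the spirit of the `(mary isa female)` form. The triple encoding and the name `mother_pairs` are assumptions made for this example; they are not the API of any of the databases discussed here.

```python
# Facts from the message above, encoded as (entity, attribute, value) triples.
# This encoding is a hypothetical illustration, not any database's storage format.
triples = {
    ("mary", "isa", "female"),
    ("alice", "isa", "female"),
    ("alice", "parent", "mary"),
}


def mother_pairs(facts):
    """mother(M, C) :- parent(M, C), female(M)."""
    # female(M): collect every entity asserted to be female.
    females = {e for (e, a, v) in facts if a == "isa" and v == "female"}
    # parent(M, C) joined with female(M): the shared variable M becomes
    # a membership check against the set computed above.
    return {(m, c) for (m, a, c) in facts if a == "parent" and m in females}


mothers = mother_pairs(triples)
print(sorted(mothers))  # [('alice', 'mary')]
```

The point of the sketch is only that a Datalog rule body is a conjunction of patterns joined on shared variables; real engines add indexes and rule handling on top of this idea.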

refset11:03:36

I can't recall seeing anything written in Clojure that supports the Prolog-style syntax, unfortunately. We did implement a working parser for XT at one point, but didn't go much further (e.g. translating to edn Datalog)

David Epstein11:03:34

XTDB seems like a reasonable option. Looks like there are others: datahike, datalevin, datascript, naga, asami, etc. If my goal is to build up an expert system via rule-based inferences connecting via kotlin (possibly mobile)--what do you recommend? or what additional questions should I be asking?

refset11:03:23

What kind of data volumes are you hoping to handle? I'm completely biased towards XT and would gladly help you figure out how to apply it, but normally I'd start with whatever looks simplest to play around with and figure out whether any of these Clojure incarnations of Datalog are what you're really after, before losing too much time in evaluation mode

David Epstein12:03:15

@U899JBRPF, not a lot of data, but I'd like the final app to run on mobile and tablet directly without a server. 50k to 500k facts and 100 to 500 rules with the goal of answering 1 to 10 queries per second. I'd also like access to a datalog REPL so I don't need to compile and deploy just to check the logic. Naga's Pabu language seems similar to Prolog, but that project has not been updated for two+ years--I'm not certain of its status: https://github.com/quoll/naga/. Does XTDB seem like a good fit for a desktop & mobile project?

Steven Deobald17:03:50

As long as your desktop and mobile apps run on the JVM, xtdb, asami, datalevin, and datahike could all work in theory. XTDB has modular storage (for documents, tx-logs, and indexes) so the storage modules you choose would also need to work on your target platform. (The smaller/simpler Datalog databases may make life easier, in that regard.) I'm not sure there's any sensible way to get any of these Datalog databases running on iOS, however.

David Epstein18:03:12

Thanks @U01AVNG2XNF. I'm not sure Asami permits datalog rules. Is datascript intended for web development? Between datalevin, datahike, and others (naga?) which is most robust in terms of features? which is the easiest to use from kotlin?

Steven Deobald18:03:02

Right, Asami doesn't have rules at the moment. DataScript, from the second line in the readme (https://github.com/tonsky/datascript):
> DataScript is meant to run inside the browser.
There have been attempts to make DataScript persistent (on-disk), such as https://github.com/hraberg/datascript-mapdb, but none I'm aware of that is production-ready. #datascript is probably the best place to ask about non-web-dev applications of DataScript. I'm not sure I've heard of anyone using the others (datalevin, datahike) from Kotlin, but like @U899JBRPF I work on xtdb so I'm biased toward the information I have. Datalevin is DataScript ported/inspired, but I haven't seen a non-Clojure client lib for it. It's also worth noting that Datalevin is not an immutable database, unlike most Datalog databases. Datahike is also DataScript ported/inspired but it does have a Java API available, which might make it comfortable from Kotlin.

Steven Deobald18:03:11

@U0395BDNRG8 I assume you've seen https://clojurelog.github.io/ but if you haven't, it describes the space in broad strokes. The comparison tables on that page are still lacking some details, so if you discover that one of the Datalog databases is particularly good for you, we'd love a PR that highlights the features you felt were important: https://github.com/clojurelog/clojurelog.github.io

David Epstein18:03:16

@U01AVNG2XNF, wow--I've read all the project websites but somehow missed the main comparison site! thanks!

Pieter Koornhof06:03:38

@U0395BDNRG8 Just for completeness, there are some other Clojure logic programming projects outside of Datalog that you can have a look at: https://github.com/clojure/core.logic for logic programming (it implements miniKanren) and http://www.clara-rules.org/ for a forward-chaining rules engine

David Epstein06:03:49

Thanks @UC465C762. I'll look at the alternatives, too. I'm coming from Prolog, so something similar to that would be helpful. Datalog seems less prone to infinite loops, which is nice though 🙂 How do core.logic and clara rules compare with Datahike & XTDB in terms of performance? Both datalogs use indexes, but only XTDB has a query planner, right?

Pieter Koornhof07:03:45

Both of those are built with performance in mind, but I have only used them at a toy level. I will add that the Datalog query language in Clojure is the closest thing I have found to Prolog (again, I have only done Prolog at a toy level, so I'm by no means an expert). I basically went from Clojure to Prolog and then started looking for Prolog alternatives in Clojure. If you want a 1-to-1 mapping to Prolog you can look at https://github.com/mauricioszabo/spock, which just wraps SWI-Prolog

David Epstein12:03:12

@UC465C762 thanks. SWI-Prolog is great, but it doesn't have the flexibility of running locally on mobile. TuProlog is a possibility, since I think the new version is written in Kotlin. However, I'd also like to use something that has a relatively large userbase.

David Epstein12:03:47

While I don't know what to select yet, it is clear there are far more options on the JVM than for C#. I am definitely tempted to pick up Kotlin for access to these tools while keeping the ability to write code for mobile. Clojure looks interesting but there is no direct path to mobile, correct?

mauricio.szabo03:03:47

@U0395BDNRG8 TuProlog is insanely slow so I don't really recommend it for production usage. SWI was used in Spock exactly because TuProlog did not deliver in terms of speed and feature completeness.

mauricio.szabo03:03:01

Core.logic is faster than tuprolog by orders of magnitude, and SWI is faster than core.logic again by orders of magnitude: https://mobile.twitter.com/mauricio_szabo/status/1408179649180487681

mauricio.szabo03:03:57

One thing that could help: SWI runs on wasm. I was not able to compile it to wasm, but if this were possible, maybe a ClojureScript version would be possible (so anywhere that can run JS could run Spock w/ SWI without needing to install binaries)

timo09:03:39

@U0395BDNRG8 Datahike has a java interface and we are working on our http-implementation datahike-server. Datahike is a fork of Datascript that has a few persistent backends like file and jdbc.

David Epstein12:03:41

Thanks @U3Y18N0UC. Those speed tests are helpful. Have you conducted any speed tests with Datahike, Naga + Asami, or Clara? I know nothing about wasm. Would this only be used if I were going to make a webapp or use a web GUI framework?

David Epstein12:03:52

@U4GEXTNGZ, Datahike is definitely one of the packages I'm considering. I'm not sure yet what I would lose and gain coming from Prolog. Could you summarize for me?

timo12:03:16

Sorry, I don't have an answer to that. Never used Prolog myself.

David Epstein12:03:09

@U4GEXTNGZ, that is OK. I'll continue to read through the docs and perhaps ask questions here?

timo12:03:44

I guess you should ask questions related to Datalog and Prolog here in this channel. There are people here who know a lot about these topics. Datahike's API is quite similar to DataScript's and Datomic's, as well as Datalevin's. So asking here or in #datahike #datalevin #datomic is fine, but you could get good answers in #asami and #xtdb as well.

mauricio.szabo13:03:56

I don't think that Datahike compares to Prolog. Datahike/DataScript are databases, and they don't implement a query planner, for example, so you'll probably miss some features. For example, you will lose the unification features, and there's no way to define a rule (you can query, but a "rule" in that sense would be a view that you can query over, and I'm not sure if any of the DBs here implement views), etc. You can kinda use views in user space (by defining a Datalog query and then augmenting it in code), but I don't see that as a viable solution to replace Prolog (again, there's no query planner, so you have to be aware of how you're querying)

mauricio.szabo13:03:23

At the same time, Clara is a rules engine; it's not a logic programming environment. What Clara does (and it does REALLY WELL) is implement pattern matching to replace what would eventually become a deeply nested if with a more feasible, parallel, and optimized structure: you insert facts, wait until they converge into a final state, and then query that state, but that's all.

David Epstein13:03:25

@U3Y18N0UC, https://dzone.com/articles/how-to-use-datahike-an-immutable-datalog-based-dat Datahike example shows how to define a recursive rule:

(def rule '[[(ancestor ?e1 ?e2)
               [?e1 :ancestor ?e2]]
              [(ancestor ?e1 ?e2)
               [?e1 :ancestor ?t]
               (ancestor ?t ?e2)]])
?e1 must match ?e1 throughout the rule--isn't that Prolog-like unification? I'll ask about asami on the #asami channel.
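
As a rough illustration of what the two clauses of that rule compute, the sketch below evaluates them bottom-up in Python until no new pairs appear. This is a naive fixpoint loop written for this example, with a hypothetical `ancestors` function and made-up data; it is not a description of how Datahike evaluates rules internally.

```python
def ancestors(edges):
    """Naive bottom-up evaluation of the two-clause ancestor rule."""
    # Clause 1: (ancestor ?e1 ?e2) holds for every direct [?e1 :ancestor ?e2] fact.
    derived = set(edges)
    while True:
        # Clause 2: join [?e1 :ancestor ?t] with already derived (ancestor ?t ?e2).
        # Reusing the name t in both patterns is expressed as the equality t == t2.
        new = {(e1, e2) for (e1, t) in edges for (t2, e2) in derived if t == t2}
        if new <= derived:
            # Nothing new was derived, so the fixpoint has been reached.
            return derived
        derived |= new


# Hypothetical data containing a cycle: alice -> bob -> carol -> alice.
edges = {("alice", "bob"), ("bob", "carol"), ("carol", "alice")}
result = ancestors(edges)
print(len(result))  # 9
```

Because the set of possible pairs is finite and the loop only ever adds pairs, it stops even on cyclic data, whereas a depth-first Prolog evaluation of the same clauses without tabling can loop on such a cycle.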

mauricio.szabo14:03:03

@U0395BDNRG8 kinda, because it works only on equality of attributes. For example, it's close to impossible (maybe even impossible) to define string_concat/3 (concatenate 2 strings into a third one) with full unification, or to implement some of the more complex rules like List = [1, 2], member(X, List), List = [X | List2], member(Y, List2). I may be wrong too; it's been a while since I worked with Datalog DBs, so make your own tests 🙂
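
To make "full unification" for string_concat/3 concrete: in Prolog the same relation can be called with different arguments left unbound, and it enumerates every consistent solution. The Python sketch below imitates that behaviour under an assumed convention where `None` marks an unbound argument; the function is a hypothetical illustration, not part of any Datalog database or Prolog system.

```python
def string_concat(a, b, c):
    """Yield every (a, b, c) with a + b == c; None marks an unbound argument."""
    if a is not None and b is not None:
        # Both inputs bound: compute c, or check it if c is bound too.
        if c is None or c == a + b:
            yield (a, b, a + b)
    elif c is not None:
        # Result bound: enumerate every split of c that agrees with
        # whichever of a or b happens to be bound.
        for i in range(len(c) + 1):
            left, right = c[:i], c[i:]
            if (a is None or a == left) and (b is None or b == right):
                yield (left, right, c)
    else:
        # Too few bound arguments: the solution set would be infinite.
        raise ValueError("not enough bound arguments to enumerate solutions")


splits = list(string_concat(None, None, "ab"))
print(splits)  # [('', 'ab', 'ab'), ('a', 'b', 'ab'), ('ab', '', 'ab')]
```

A query pattern that only checks equality of stored attribute values can express the first case (both inputs known) through a function call, but not the enumeration in the second case, which is the gap described in the message above.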

David Epstein14:03:29

@U3Y18N0UC, got it. If Datahike looks like a good fit, I'll be sure to test it carefully before making a final decision.