Fork me on GitHub
#core-logic
<
2018-09-05
>
rickmoynihan11:09:06

Are there any docs on tabling in core.logic and how it works?

rickmoynihan11:09:20

Backstory: I’ve implemented a SPARQL/Datalog-like query DSL for querying in memory RDF graphs with core.logic. The DSL essentially tries to hide a lot of the minikanren/relational-programming stuff and present a simpler “SPARQL like” model, where you don’t need to fresh variables etc… I use db-rel to get acceptable performance. Currently I compile the Basic Graph Pattern vectors my macro receives and convert them to calls to db-rel. What I have works quite well, but it does of course mean that you can’t generate queries at runtime. I’d like to explore doing the same but with functions, so I can also provide a way to build tabled queries dynamically. Is this possible?

rickmoynihan11:09:39

For reference this is the project: https://github.com/Swirrl/matcha

rickmoynihan11:09:09

e.g. the graph pattern [[:foo ?p ?o]] would currently compile into something like:

rickmoynihan11:09:11

(clojure.core.logic/fresh
          [?p ?o]
          (grafter.matcha.alpha/triple ?s ?p ?o))

rickmoynihan11:09:26

I can obviously define a relation that will recursively apply the triple relation, but won’t that eliminate the advantages of indexing? I’m also not entirely sure of all the semantics around using clojure inside relations etc, which was my first instinct to avoid using a macro.

bajrachar15:09:27

@rickmoynihan - sounds like an interesting project - It would be great to have something to query in-memory graph - without digging into minikanren

bajrachar15:09:45

I didn't quite follow what you mean by generating queries dynamically

bajrachar15:09:58

I recently also used core.logic for graph traversal - but due to large size of the db it wasnt as efficient

rickmoynihan15:09:52

Sometimes you want to generate queries dynamically based on inputs. Currently in Matcha you can’t do this kinda thing: (let [query-pattern (conj '[?s ?p] '?o)] ((select query-pattern) friends)

rickmoynihan15:09:54

because select is a macro, so query-pattern has to be hardcoded as a literal value at compile time

bajrachar15:09:52

ok I see what you mean - how does tabling help in this scenario?

rickmoynihan15:09:12

it makes it orders of magnitude faster

bajrachar15:09:19

and you want to use functions in place of macro

bajrachar15:09:53

sorry - just trying to understand the problem 🙂 -

bajrachar16:09:08

I don't know if it is applicable in this particular case - but for my particular use case of making graph traversal faster on a very large in-memory db - I went about re-writing some parts of pldb and db-rel

bajrachar16:09:25

making the matching logic parallel - which seemed to work better than tabling

bajrachar16:09:52

as issue for me wasn't that query was diverging

rickmoynihan16:09:58

> and you want to use functions in place of macro @bajrachar Yes.

rickmoynihan16:09:43

I think the problem is that without tabling I’d need to scan the database on each join, so performance becomes exponential?/quadratic?/something-really bad. Making it parallel might help, but at best parallelisation will only make linear gains in terms of processors; rather than the orders of magnitude gains you can get through indexing/tabling/memoization/better-algorithms. For me querying just 1000 edges was taking my naive version longer than querying millions of edges with tabling

bajrachar16:09:56

sounds interesting - definitely would like to see how you've used tabling

bajrachar16:09:56

by join - I assume its traversing the graph to depth n?

rickmoynihan18:09:20

I mean join in the RDBMS sense. Effectively a triple store is a single table with three columns; every time you walk from one edge (row) to another through a ?variable binding you’re effectively doing a self join back on that table. But yes each traversal is effectively another join.

rickmoynihan18:09:25

I think I understand roughly how this works; but need to figure out the finer details

rickmoynihan18:09:08

the second link in the source code 404's 😞

rickmoynihan18:09:15

I think the first link is Byrds thesis… I’ve skimmed this before; but can’t recall the details of tabling

rickmoynihan19:09:46

hmmm… maybe I’m misunderstanding the relationship between tabling and pldb