#asami
2021-02-18
rickmoynihan09:02:02

@quoll picking up on this thread: https://clojurians.slack.com/archives/C09GHBXRC/p1613605836089200?thread_ts=1613605226.087200&cid=C09GHBXRC I get that some users might be more familiar with having a “mutable” database/connection, and that this mirrors datomic… However I find this sort of thing makes naga/asami harder to use for these use cases. It’s much less obvious how to wire things together, as we have to learn asami’s connection APIs; that’s negative value compared to something more value-oriented, e.g.

(->> asamival/empty-db 
       (asamival/transact {:tx-data [[:db/add :foo :bar :baz]]})
       (nagaval/materialise-inferences program) 
       (asami/q '[:find ,,,,]))
In terms of API design I feel like the “clojure way” to create maximally useful APIs is to essentially avoid creating singletons, defing atoms etc… From my perspective state management is largely an application concern, not a library concern. That’s not to say there isn’t value in providing those sorts of interfaces; but I feel that’s the distinction between a library and a framework, i.e. frameworks take on application concerns by providing organisational conventions around them. Frameworks get in the way, however, if their paradigm isn’t what suits your application… For example, what if I wanted to use clojure agents to change asami graphs, or refs to transact them with other in-memory state, or provide my own storage layer with a shape different to asami’s protocols? I feel like asami and naga are possibly doing too much by being frameworks rather than libraries, and their use and implementation could be simpler by doing less.

@quoll… Sorry to criticise by the way: naga and asami look absolutely fantastic and I really like what I see here. I also know it’s possible to work around these issues, but it would be great to be able to support the in-memory use case without this sort of friction. I don’t know if it’s possible at this stage to extract the pure stuff into a smaller separate library, and leave the frameworky bits elsewhere?

Similarly, I noticed naga’s project.clj pulls in some rather large deps that I think should either be moved into separate profiles or put in a “provided” scope, essentially making them optional; i.e. the library use case doesn’t need a CLI or datomic-free or postgresql. Anyway, thanks again for all the hard work here. I’ve really been enjoying playing with these libraries, and may even be able to contribute relevant changes after I’ve learned more. 🙇

dominicm10:02:06

I also ran into these issues fwiw 🙂. I avoided by going to the lower-level API underneath, but I found that easier as I have used naga before the conn concept was introduced and knew where to poke.

rickmoynihan10:02:10

Yeah not done any digging yet, but good to know.

quoll14:02:51

Try this:

(require '[asami.graph :as graph])
(require '[asami.index :as index])
(require '[asami.core :as asami])

;; reverse the arguments, since you don't need a varargs seq of inputs
(defn graph-q [graph query] (asami/q query graph))

(-> index/empty-graph
    (graph/graph-transact 0 [[:foo :bar :baz]] nil)
    (graph-q '[:find ?e ?a ?v :where [?e ?a ?v]]))
This doesn’t fit too neatly with Naga, but it can be done

rickmoynihan10:02:41

@quoll: For fun I had a go at defining rdfs entailment in naga, as I think your skos.rlog definition was missing some rdfs rules, e.g.:

rdf:type(YYY,XXX) :- rdfs:domain(AAA, XXX), AAA(YYY, ZZZ).    /* rdfs2 */
rdf:type(ZZZ,XXX) :- rdfs:range(AAA, XXX), AAA(YYY, ZZZ).     /* rdfs3 */
One small issue is that I don’t think rules like rdfs1 https://www.w3.org/TR/rdf11-mt/#rdfs-entailment are expressible in naga (or at least pabu). I can obviously materialise these instances myself pretty easily, but I was wondering if an extension might be possible, to allow arbitrary clojure predicate functions to be called as unary predicates in rule bodies. Logically I guess they’d just be wrapped to return success or failure goals and essentially trigger backtracking (though I’m not sure how these concepts map into your RETE implementation). I guess if you were to do this, there’d need to be another restriction that the variables used here would have to be ground, but I expect the compiler could figure that out. I’ve not dug into the grammar yet, so I don’t know how you’d represent such a thing syntactically, but I was imagining you could just use clojure.core/requiring-resolve, and that with something like this it might be possible to express rdfs1 as something approximating:
rdf:type(A, rdfs:Datatype) :- A(B,C), clojure.core/keyword?(A) .
rdf:type(B, rdfs:Datatype) :- A(B,C), clojure.core/keyword?(B) .
rdf:type(C, rdfs:Datatype) :- A(B,C), clojure.core/keyword?(C) .

quoll14:02:45

I’ll confess that I haven’t been looking at making that skos stuff work. It was lifted out of Mulgara. But since I know Naga/Asami can do everything Mulgara can do, I figured it just needed a little tweaking, if anything.

quoll14:02:54

Also, it’s skos, so it’s not supposed to infer rdfs. A lot of rdfs inferences are trivial. For instance, I found the inference that everything is an rdfs:Resource to be useless from a practical perspective, because if it’s in the database, then it’s a resource.

quoll14:02:43

As for these rules, I don’t believe that they are valid.

quoll14:02:50

Nor consistent

quoll14:02:54

If you’re doing RDF (which Asami can approximate, but isn’t trying to be), then resources can only be datatypes if they’re IRIs (or URIReferences)

quoll14:02:13

However, if you’re in Clojure, then keywords are basically QNames, and can be used as such. In that case, your inferences are OK, and are consistent, but I don’t believe they’re valid?

quoll14:02:34

Actually, it’s hard to make something invalid when you’re only in RDFS, so I guess it’s valid, but it isn’t useful. e.g. this will infer that rdf:type (the QName, or as a keyword :rdf/type) is an rdfs:Datatype. That makes the following syntactically correct, but nonsense: "foo"^^<rdf:type>

rickmoynihan15:02:57

:thumbsup: Yes, I’m aware of most of that (though you’re right to pick me up on misunderstanding rdfs:Datatype and rdfs1; it’s about literals/datatype URIs, and not quite what I thought, i.e. the classes of things you can speak of (BNodes/IRIs)). Re: RDFS, I know it’s not part of skos… I just want a combination of some rdfs inferences (mostly domains/ranges, subproperties etc.), and likely also some or all of the skos you have too. Though at this stage I’m really just tinkering rather than having concrete plans for any of this, so I was mainly wanting to write the rdfs rules out as an excuse to play with naga; the suggestion above came out of that tinkering. My suggestion really wasn’t about RDF though; it was primarily: do you think it might be useful to support arbitrary clojure predicates in naga logic programs like this?

quoll16:02:04

Quick answer: yes

🥳 3
quoll16:02:51

They need to be identified as such (and not as edges to be searched for in the database), and then they get turned into filters instead

👍 3
quoll16:02:26

a trivial way to do this might be to look for namespacing with a / character, instead of a : character
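
(A minimal sketch of that check; the function name is hypothetical, and the real classification would live in the rule parser:)

(require '[clojure.string :as str])

;; An atom like clojure.core/keyword?(A) names a Clojure predicate
;; (to become a filter); rdf:type(A,B) is an edge to match in the graph.
(defn clojure-predicate? [pname]
  (str/includes? pname "/"))

(clojure-predicate? "clojure.core/keyword?")  ;=> true
(clojure-predicate? "rdf:type")               ;=> false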

rickmoynihan16:02:51

yeah that’s what I was thinking actually

rickmoynihan16:02:25

though might be better not to put it in pabu?! And just have it with naga?

quoll16:02:34

BTW, I just checked in a pabu modification that allows for -- comments 🙂

quoll16:02:54

Pabu is just a parser that generates Naga rules

quoll16:02:04

It takes a string and returns Naga rules

rickmoynihan16:02:45

Incidentally, something else I noticed in the README: /* */ comments aren’t ISO Prolog comments; Prolog uses %. Some implementations, e.g. SWI, do additionally support C-style comment blocks, but as far as I know they’re non-standard.

quoll16:02:59

I was about to show my manager this thing I’d just gotten working, except it was only configurable in code, and I knew him well enough to know that he would immediately ask me to try changing things to see if they worked. That would work best if I had a parser for rules. Which was why I put Pabu together so quickly. It was never supposed to last. But when I saw how it could run all sorts of simple Prolog code… well, I kept it 🙂

quoll16:02:40

I don’t know Prolog very well, so that’s good to know, thank you

rickmoynihan16:02:15

I’m not suggesting you add a 3rd comment form — but might % not be a better choice, as it will probably make your datalog a proper subset of prolog syntax… and thus work properly in emacs prolog-mode etc.

rickmoynihan16:02:49

I think sicstus might also support C style ones

quoll16:02:08

errr, well… I’ve already done it, and am just running the regression tests now 😜

rickmoynihan16:02:49

oh well the ship has sailed 🚢 👋

quoll16:02:51

I like the -- style, because I kept seeing it in various places. Also SQL

quoll16:02:20

plus, it should make those mulgara rules work

quoll16:02:49

not the mulgara:UriReference function though. I’ll give that one some thought

rickmoynihan16:02:17

Yes I quite like them visually… just a shame to miss out on free syntax highlighting support

rickmoynihan16:02:33

and automatically commenting out blocks

rickmoynihan16:02:54

not a big deal though

quoll19:02:02

BTW, I don’t know if it was clear earlier… there is already external predicate support in Naga (via Asami). So the Pabu-style rules:

rdf:type(A, rdfs:Datatype) :- A(B,C), clojure.core/keyword?(A) .
rdf:type(B, rdfs:Datatype) :- A(B,C), clojure.core/keyword?(B) .
rdf:type(C, rdfs:Datatype) :- A(B,C), clojure.core/keyword?(C) .
Could be generated in code as:
(r [?a :rdf/type :rdfs/Datatype] :- [?b ?a ?c] [(keyword? ?a)])
(r [?b :rdf/type :rdfs/Datatype] :- [?b ?a ?c] [(keyword? ?b)])
(r [?c :rdf/type :rdfs/Datatype] :- [?b ?a ?c] [(keyword? ?c)])

rickmoynihan23:02:06

Ok, this is fantastic and exactly what I was asking for. I didn’t realise that’s what you were saying, so it’s amazing to see you’re already doing it! :thumbsup: I can imagine this will be very handy.

quoll02:02:31

Honestly… Naga isn’t all that complex. Executing a rule just means turning the body into a where clause, and then projecting the results into groups of 3, which then get inserted as statements for every result row.
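
(A rough sketch of that step, not Naga’s actual code; the rule shape {:head :body :vars} and the name run-rule are assumptions:)

(require '[asami.core :as asami]
         '[asami.graph :as graph])

;; Run the body as a query, substitute each result row into the head
;; pattern, then assert the projected triples.
(defn run-rule [g {:keys [head body vars]}]
  (let [rows    (asami/q (vec (concat [:find] vars [:where] body)) g)
        triples (for [row rows
                      :let [bindings (zipmap vars row)]]
                  (mapv #(get bindings % %) head))]
    (graph/graph-transact g 0 triples nil)))

;; e.g. rdfs2: rdf:type(YYY,XXX) :- rdfs:domain(AAA,XXX), AAA(YYY,ZZZ).
(def rdfs2 '{:head [?y :rdf/type ?x]
             :body [[?a :rdfs/domain ?x] [?y ?a ?z]]
             :vars [?y ?x]})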

quoll03:02:28

The tricks are in things like:
1. identifying the parts of the :where clauses which can be affected by the output of any rule
2. fill the queue with rules
3. if the queue is empty, exit
4. take the first rule from the queue, and check if any of the parts in its body (the :where clause patterns) have changed. If not, return to step 3
5. cache the new results of the parts of the :where clause patterns (for comparison next time). We use semi-naïve reasoning, which means that we need only store the count. If we start getting more aggressive about negation operations, then this may need to turn into a hash (which is more expensive)
6. something changed, so run the rule. This executes the :where clause, and projects each row into a group of triples. These are all inserted
7. check if any rules have parts that can be changed by this rule. If so, add them to the queue. (The queue will ignore any duplicates)
8. go back to step 3
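
(A toy version of that loop, reusing the run-rule sketch above; hypothetical names, not Naga’s implementation. The semi-naïve change check stores pattern counts via count-triple, and the dependency test is deliberately crude:)

(defn pattern-counts [g rule]
  (into {} (for [pat (:body rule)]
             [pat (apply graph/count-triple g pat)])))

(defn may-trigger? [producer consumer]
  ;; crude step-1 check: the producer's head predicate appears (or a
  ;; variable sits) in the predicate position of a body pattern
  (let [p (second (:head producer))]
    (boolean (some (fn [[_ bp _]] (or (symbol? bp) (= bp p)))
                   (:body consumer)))))

(defn run-rules [g rules]
  (loop [g g, queue (vec rules), counts {}]
    (if-let [rule (first queue)]                      ; step 3: empty? exit
      (let [fresh (pattern-counts g rule)]
        (if (= fresh (select-keys counts (keys fresh)))
          (recur g (subvec queue 1) counts)           ; step 4: unchanged, skip
          (let [g'     (run-rule g rule)              ; step 6: run and insert
                rest-q (subvec queue 1)
                queued (set rest-q)]
            (recur g'                                 ; step 7: requeue affected
                   (into rest-q (filter #(and (may-trigger? rule %)
                                              (not (queued %)))
                                        rules))
                   (merge counts fresh)))))           ; step 5: cache counts
      g)))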

quoll03:02:56

Also, when generating new entities (unbound variables in the head of the rule), then the :where clause is updated to exclude any results which will generate an entity that is exactly equal to one that already exists (this is a cute bit of query rewriting, and was actually the impetus to get not into Asami).
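
(Illustratively: if a rule head creates a new entity, e.g. [?n :rdf/type :skos/Concept] [?n :skos/prefLabel ?label] with ?n unbound, the rewritten body gains a clause approximating the following, assuming Asami’s not operator:)

(not [?e :rdf/type :skos/Concept]
     [?e :skos/prefLabel ?label])
;; rows that would regenerate an entity identical, statement for
;; statement, to an existing one are excluded before ?n is created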

quoll03:02:22

There… now you know how to build a rule engine 🙂

rickmoynihan10:02:14

I guess such a thing might be better expressed in the naga representation than the pabu one though… as pabu rules could then be kept compatible with other datalogs, whereas it would be reasonable to assume that if you’re using the clojure rules representation you have access to clojure… Actually, do you have something like this already? I seem to recall you mentioned negation and (or ,,,)

quoll14:02:35

Yes. Rules are actually just a :where clause, and a projection to groups of triples for assertion. Pabu is just a parser that takes Datalog and creates such a thing, but because the parsing hasn’t been looked at for a long time, it has limited syntactic capabilities.

quoll14:02:49

If I were to expand Pabu, I should really do it with instaparse. At the moment it uses Parsatron, which was really fun and cool to play with at the time. I just needed a quick way to parse something, and I remembered a previous colleague of mine had ported Parsatron from Haskell parser combinators, so I grabbed it quickly, and 45 minutes later the first version of Pabu existed. It was really only that well thought out 🙂

rickmoynihan10:02:31

:thinking_face: hmm, looks like mulgara might have had some special predicates/hacks for this sort of thing too, e.g. mulgara:UriReference

quoll14:02:42

SPARQL does specifically allow for extensions like this, so hopefully it’s not a “hack” 🙂

rickmoynihan10:02:06

Actually I’m curious: it looks like mulgara “fixes” a frustration I’ve had in the past with the rdfs entailments… i.e. in rdfs IIRC literally everything is an rdfs:Resource, which means that Literals and URIs are kind of indistinguishable. It looks like mulgara’s entailments might try to keep these distinct? Is that what I’m looking at here @quoll? https://github.com/quoll/mulgara/blob/36ee68b9cccaca26f55a39d37511fb4664b004e0/rules/rdfs.dl#L38-L39 i.e.:
- 4a says any subject is a resource (therefore it must be a URI, as you can’t speak of literals)
- 4b (appears to) say that any object is a resource iff it’s an object and of type uri?

quoll14:02:42

URIReference, but yes.

quoll14:02:49

RDF does not allow literals as a subject or predicate, and that avoids the type error that would result. Asami lets you do this, which actually allows for some interesting data expressions. Like:

12 :math/factor 6
12 :math/factor 4
12 :math/factor 3
12 :math/factor 2
It also let someone use strings as a kind of “magic” predicate in a graph that was being rendered (keywords were for attributes on objects, and strings were edges between the objects). That was definitely a hack, but it allowed a previous data structure to be ported into Asami with no effort.

rickmoynihan15:02:16

:thumbsup: yup I’m aware of this. matcha is similar.

quoll19:02:19

I figure it was worth calling out since I’ve been explicit in stating my RDF background, and that a lot of design was influenced by Mulgara

rickmoynihan23:02:26

Oh absolutely. I’m very grateful for all your explanations.

rickmoynihan10:02:10

ok actually going to have to tear myself away and stop digging any further, and do some real work. 😢

quoll14:02:32

Originally, everything was done via a graph protocol, which is reasonably small:

(defprotocol Graph
  (new-graph [this] "Creates an empty graph of the same type")
  (graph-add [this subj pred obj tx] "Adds triples to the graph")
  (graph-delete [this subj pred obj] "Removes triples from the graph")
  (graph-transact [this tx-id assertions retractions] "Bulk operation to add and remove multiple statements in a single operation")
  (graph-diff [this other] "Returns all subjects that have changed in this graph, compared to other")
  (resolve-triple [this subj pred obj] "Resolves patterns from the graph, and returns unbound columns only")
  (count-triple [this subj pred obj] "Resolves patterns from the graph, and returns the size of the resolution"))
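
(To show how small that surface is, here’s a toy, inefficient implementation backed by a plain set of [s p o] triples; it assumes variables are represented as symbols, the way Asami’s index graphs treat them:)

(require '[asami.graph :as graph])

(defrecord SetGraph [triples]
  graph/Graph
  (new-graph [_] (->SetGraph #{}))
  (graph-add [_ subj pred obj _tx]
    (->SetGraph (conj triples [subj pred obj])))
  (graph-delete [_ subj pred obj]
    (->SetGraph (disj triples [subj pred obj])))
  (graph-transact [this tx-id assertions retractions]
    (as-> this g
      (reduce (fn [g [s p o]] (graph/graph-delete g s p o)) g retractions)
      (reduce (fn [g [s p o]] (graph/graph-add g s p o tx-id)) g assertions)))
  (graph-diff [_ other]
    (distinct (map first (remove (:triples other) triples))))
  (resolve-triple [_ subj pred obj]
    ;; keep triples that match the bound positions; return unbound columns
    (let [pattern [subj pred obj]]
      (for [triple triples
            :when (every? (fn [[p v]] (or (symbol? p) (= p v)))
                          (map vector pattern triple))]
        (vec (keep (fn [[p v]] (when (symbol? p) v))
                   (map vector pattern triple))))))
  (count-triple [this subj pred obj]
    (count (graph/resolve-triple this subj pred obj))))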

quoll14:02:18

originally, graph-transact and graph-diff weren’t there. graph-diff came about because people wanted to see what had changed after Naga had run on it. I consider it optional, as nothing in Asami uses it.

quoll14:02:39

graph-transact is much more recent. Originally, there were 2 functions in the query namespace (not a great place for them, but :woman-shrugging: )

(defn add-to-graph
  [graph data]
  (reduce (fn [acc d] (apply graph/graph-add acc d)) graph data))

(defn delete-from-graph
  [graph data]
  (reduce (fn [acc d] (apply graph/graph-delete acc d)) graph data))
As you can see, these just called graph-add or graph-delete for each statement

quoll14:02:05

The new graph-transact function does exactly this… applying deletions first, then assertions. It also takes a transaction ID, which is not (currently) used in the in-memory database
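
(Roughly, in terms of the two helpers above; starred to mark it as a sketch:)

;; deletions first, then assertions; tx-id accepted but unused in memory
(defn graph-transact* [graph tx-id assertions retractions]
  (-> graph
      (delete-from-graph retractions)
      (add-to-graph assertions)))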

quoll14:02:25

Anyway, I believe that the `Graph` protocol is the API you want @rickmoynihan

quoll14:02:21

What I found was that no one I worked with felt comfortable learning Asami. So I wrapped that Graph API in a Database/Connection façade, and voilà! They started using it! 🙂

quoll14:02:34

But it’s all still a Graph under the covers

quoll14:02:49

If you have a Connection, then you get the most recent Database using (asami.core/db connection). If you have a Database, then you get the graph for it using (asami.core/graph database)

quoll14:02:01

You can then do whatever you want with the graph (querying it the same as you do a Database… in fact, the q function only works on graphs, and calls graph on a database to get the graph). When you’re done, you can make it look like a Connection/Database again by calling (asami.core/as-connection graph)
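
(Putting those accessors together; the mem:// URI is just an example:)

(require '[asami.core :as asami])

(let [conn (asami/connect "asami:mem://example")
      d    (asami/db conn)    ; most recent Database for the Connection
      g    (asami/graph d)]   ; the Graph underneath that Database
  (asami/as-connection g))    ; wrap the graph back up as a Connection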

quoll14:02:57

Optionally, you can give it a URI for the connection to associate it with:

(asami/as-connection graph "asami:")

quoll14:02:43

This will actually replace any registration of prior connections at that URI, which is probably something to be aware of, and also really useful

quoll15:02:15

So, referring back to that original code:

(-> index/empty-graph
    (graph/graph-transact 0 [[:foo :bar :baz]] nil)
    asami/as-connection
    (naga-engine/run my-program)  ; naga-engine assumed to alias Naga's engine namespace
    asami/db
    asami/graph
    (graph-q '[:find ?e ?a ?v :where [?e ?a ?v]]))

quoll15:02:40

Should this all go in the Wiki?

rickmoynihan15:02:56

Thanks for posting all the above… I’ll try and digest it all later 🙇

quoll16:02:32

Also… I just remembered: Yes, you’re right. The datomic dependency is entirely optional and should be in a separate profile. I’ve been too lazy to do this.

quoll16:02:09

When you say that it brings in a lot, that’s only because of Datomic. Remove that, and there’s VERY little to come in. Most of it is code that I’ve written myself

quoll17:02:05

A big exception there is Plumatic Schema.

quoll17:02:58

I’m not sure if I should keep that or not. It was so useful during development, and people looking at my code said that they really appreciated seeing it. That said, it does nothing at runtime

quoll19:02:42

@rickmoynihan this morning you inspired me to clean up my builds. Try depending on Naga 0.3.12 and tell me what you think about the dependencies

quoll19:02:01

Surprisingly, one of Asami’s plugin dependencies (`cider-nrepl`) was included in the Asami release. I have no idea why this would happen. But I don’t use it anyway, so I’ve removed that, and it’ll propagate through later

rickmoynihan00:02:32

Looks much better 👌 You could possibly tidy it a little further like this: https://github.com/threatgrid/naga/pull/134 but arguably you’ll be hitting diminishing returns pretty soon.

quoll02:02:18

It’s coincidental that you’d be mentioning some of this, since it came up this week.

quoll02:02:42

I’m tempted to move cli.clj to a separate project entirely, or at least a module under this one (I’ve never played with modules before, so I don’t know how useful that would be). The CLI was actually written as an example of how to use Naga. It ended up doing more than I first expected, and it was kinda cool. But it’s really not supposed to be part of the project.

quoll02:02:26

I’m also tempted to remove Asami as a dependency, since it’s not needed at all if you want to run Naga with Datomic. But since we’re using it here, it wasn’t going to hurt me to leave it where it was. It also makes the CLI useful. Maybe if the Asami and Datomic adapters are made into modules, then the CLI can be separated out entirely, and depend on Naga + Asami-adapter? This is more Leiningen than I know right now.

quoll02:02:11

The reason it came up this week was because the CLI was being referenced by :main, which compiled Naga and all its dependencies. That was a horrible mistake to discover! 😖 I started by removing :main altogether, but put it back when I discovered the ^:skip-aot metadata. But if I split it out, then a lot of these problems go away.

quoll02:02:21

Finally… Zuko needs Cheshire, due to its ability to parse JSON. I could use clojure/data.json but that’s not as fast, so I was reluctant to go that way. The JSON related code is not actually core Naga functionality, but it’s something we use a lot.

quoll02:02:32

I’ll give some thought to all of this, and I’ll also look into how modules are put together. If you have any more to contribute I’d love to hear it!

rickmoynihan10:02:35

Thanks. That all makes sense. Re: leiningen and modules… Have you considered using tools.deps instead? In my experience it makes having multiple modules within the same project quite a bit simpler. Especially if you don’t need a build. In particular git deps, with roots into the project can make stuff like the CLI work really nicely in an independent way. AFAIK the only build you have is for the clojurescript stuff, but I think that could probably be done quite easily too with just cljs.main and deps.edn.
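
(For illustration, a hypothetical deps.edn for a standalone CLI module pulling the library as a git dep with :deps/root; the coordinates, sha placeholder, and main namespace are all assumptions:)

;; cli/deps.edn (illustrative only)
{:deps {threatgrid/naga {:git/url   "https://github.com/threatgrid/naga"
                         :sha       "<full-commit-sha>"
                         :deps/root "."}}
 :aliases {:run {:main-opts ["-m" "naga.cli"]}}}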

rickmoynihan10:02:22

It definitely makes some things harder though; but I think overall it may be simpler for you and provide you with more advantages than disadvantages for this collection of projects.

quoll15:02:42

While trying to figure it all out yesterday, I started looking at what leiningen itself does. There’s actually another project within leiningen called leiningen-core. This is built manually, as if it were an unrelated project, and then the main project has a dependency on it. It’s not automated, but I’m actually OK with this approach. So I decided to go with it.

quoll15:02:51

So now if you go to Naga, the project just builds the library, and nothing else. Inside it there is a cli directory, which contains a dependency on this library.

quoll15:02:47

This has the nice effect that the cli is now completely optional, and if you want it then it gets built with full AOT (meaning that it can be run easily as a CLI without needing dependencies set up for the classpath)

quoll15:02:13

Also… no, I had not considered using tools.deps 🙂

rickmoynihan15:02:19

what you have looks good. Definitely better to separate the concerns like that :thumbsup:

quoll15:02:42

The CLI was never envisioned for this in the first place. Neither was Pabu, to be honest 😄 But I’ll admit that when I took some simple Prolog and it just ran without modification, then I was so incredibly happy.

rickmoynihan15:02:52

It’s essentially what I wanted to do in my initial PR. I’d just assumed you preferred the app to be the most prominent thing rather than the library. But as it’s mainly an example, inverting that relationship like you have here is perfect.

quoll15:02:53

No, it wasn’t supposed to be prominent. I really just built the CLI to provide a template for people to understand how to call Naga. But then I kept being asked for more features on it :rolling_on_the_floor_laughing:

quoll15:02:37

Sort of like “runnable documentation”

quoll15:02:30

This is exactly why making everything open source is so valuable. People note my mistakes (ouch!) and I fix things that I wouldn’t have thought of otherwise.

👍 3
quoll00:02:06

@rickmoynihan you may be amused to note that your PR not only prompted me to rearrange Naga, but it also resulted in submitting a PR to lein-modules 😂

👍 3