#rdf
2023-08-18
quoll18:08:44

The essential elements of Datalog are:
• a body, which matches patterns against data
• a head, which uses bound values from the body to express new data

quoll18:08:20

Traditionally, that’s done using predicate syntax: commas separate the conjunctive terms in the body clause, and :- separates the head and body clauses:

headTerm(arg,arg) :- pattern1(arg,arg), pattern2(arg,arg) .

Arguments can be ground terms or variables.

quoll18:08:13

However, this is awkward with RDF, since it reorders the triple into predicate(subject, object). Also, because any of these terms can be IRIs or QNames, it doesn’t look much like standard predicate syntax.

wikipunk18:08:23

[?d, :deptAvgSalary, ?z] :-
    [?d, rdf:type, :Department],
    AGGREGATE(
        [?x, :worksFor, ?d],
        [?x, :salary, ?s]
        ON ?d
        BIND AVG(?s) AS ?z) .
is the rule I adapted from the RDFox documentation

quoll18:08:24

But most importantly, predicate logic like this doesn’t expect variable predicates, which are perfectly fine in RDF. So a triple-based syntax seems much more appropriate. RDFox made that switch (triple patterns rather than predicate statements) but left everything else as-is. Datomic also uses triple patterns, but went with its usual data structures for the body, wrapped them in a function-looking syntax, and invokes a rule in a query like a function, rather than having the head create new statements. There’s enough familiarity that it works
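(For illustration, a minimal sketch of that function-like rule use — the attribute idents are hypothetical, `db` is an existing database value, and % binds the ruleset:)

(require '[datomic.api :as d])

(d/q '[:find ?name
       :in $ %
       :where
       [?e :person/name ?name]   ; ordinary triple pattern
       (works-for ?e ?dept)]     ; rule invoked like a function
     db
     '[[(works-for ?x ?d)
        [?x :person/dept ?d]]
       [(works-for ?x ?d)
        [?x :person/manager ?m]
        (works-for ?m ?d)]])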

wikipunk18:08:07

yeah, if the head were a pattern too, then you could have a head like [?x :owl/sameAs ?y]

quoll18:08:09

Well, that would result in a new transaction, which isn’t what Datomic queries are for. But the result of a query can easily be passed in as the data for a transaction that follows it 🙂
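(A sketch of that chaining, with hypothetical names — `conn` is an existing connection and `rules` is a hypothetical ruleset that derives same-as pairs:)

(require '[datomic.api :as d])

;; derive the pairs with a query...
(def pairs
  (d/q '[:find ?x ?y
         :in $ %
         :where (same-as ?x ?y)]
       (d/db conn) rules))

;; ...then assert them in a follow-up transaction
@(d/transact conn
   (for [[x y] pairs]
     [:db/add x :owl/sameAs y]))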

wikipunk18:08:34

Well, I definitely would like to be able to infer triples inside a query; can't speak for most others though

wikipunk18:08:52

But for now, yeah, I have to do inference and then chain logic

wikipunk18:08:58

But it's still pretty nice!

quoll18:08:53

That can be done where you say something like (rule-name ?entity ?attr ?value) instead of a standard [?entity ?attr ?value] pattern.
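(For instance — idents hypothetical again — a rule head standing in for a generic [?e ?a ?v] pattern, with one derived case:)

(def rules
  '[;; base case: the triple is asserted directly
    [(triple ?e ?a ?v)
     [?e ?a ?v]]
    ;; derived case: pull values up through rdfs:subPropertyOf
    [(triple ?e ?a ?v)
     [?sub :rdfs/subPropertyOf ?a]
     (triple ?e ?sub ?v)]])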

wikipunk18:08:11

Furthermore, there is definitely a lot of room to compile Datalog rules that are more formally specified, in a syntax like the one you shared, into Clojure/Datomic forms

quoll18:08:12

But you may need a big ruleset to make it work

wikipunk18:08:38

OWL RL isn't too big

wikipunk18:08:50

I guess it depends on what counts as big

quoll18:08:17

Some of those rules aren’t easy to encode though

wikipunk18:08:39

The only one I am having trouble with is the property chain axiom

wikipunk18:08:44

That one is a mind bender

wikipunk18:08:48

The rest are pretty straightforward
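(For reference, the property chain rule walks the RDF list of properties recursively. A hypothetical Datomic-style sketch, assuming list cells are stored under :rdf/first and :rdf/rest, and that :rdf/nil is an entity ident:)

(def chain-rules
  '[;; last cell in the chain: one hop from ?start to ?last
    [(check-chain ?start ?pc ?last)
     [?pc :rdf/first ?p]
     [?pc :rdf/rest :rdf/nil]
     [?start ?p ?last]]
    ;; interior cell: hop once, then recurse on the rest of the list
    [(check-chain ?start ?pc ?last)
     [?pc :rdf/first ?p]
     [?pc :rdf/rest ?tl]
     [?start ?p ?next]
     (check-chain ?next ?tl ?last)]
    ;; the chained property ?p then holds from ?start to ?last
    [(prp-spo2 ?start ?p ?last)
     [?p :owl/propertyChainAxiom ?pc]
     (check-chain ?start ?pc ?last)]])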

quoll18:08:15

Yes, I’ve been thinking that int1 would be easier in Clojure (SPARQL is awful for it)

wikipunk18:08:44

I'm trying to understand the semantics of that vs int2

wikipunk18:08:54

https://www.w3.org/TR/rif-owl-rl/ this has been the most useful reference I've found, because it provides RIF equivalents for the rules and deals with the lists

🙏 1
wikipunk18:08:31

(* <#cls-int1> *)
  Forall ?y ?c ?l (
    ?y[rdf:type->?c] :- And (
      ?c[owl:intersectionOf->?l]
      _allTypes(?l ?y) ))

  Forall ?l ?y ?ty ?tl (
    _allTypes(?l ?y) :- And (
      ?l[rdf:first->?ty rdf:rest->?tl]
      ?y[rdf:type->?ty]
      _allTypes(?tl ?y) ))

  Forall ?l ?y ?ty (
    _allTypes(?l ?y) :- And (
      ?l[rdf:first->?ty rdf:rest->rdf:nil]
      ?y[rdf:type->?ty] ))

quoll18:08:39

OK, that’d work. It’s very recursive, but I guess that’s ideal for Clojure 🙂
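(For illustration, the same recursion as Datomic rules — the :rdf/first, :rdf/rest, :rdf/type, and :owl/intersectionOf idents are assumptions, mirroring the RIF above:)

(def int1-rules
  '[;; last list cell: ?y has the final type in the intersection list
    [(all-types ?l ?y)
     [?l :rdf/first ?ty]
     [?l :rdf/rest :rdf/nil]
     [?y :rdf/type ?ty]]
    ;; interior cell: ?y has this type, and all the remaining ones
    [(all-types ?l ?y)
     [?l :rdf/first ?ty]
     [?l :rdf/rest ?tl]
     [?y :rdf/type ?ty]
     (all-types ?tl ?y)]
    ;; cls-int1: ?y is an instance of every member of the intersection
    [(cls-int1 ?y ?c)
     [?c :owl/intersectionOf ?l]
     (all-types ?l ?y)]])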

wikipunk18:08:30

http://webont.org/owled/2009/papers/owled2009_submission_16.pdf something like this is the "holy grail" for me -- what I want is to describe as much as possible with pure OWL and only drop down into rules when necessary. If there were a well-specified syntax then we could create tools to statically analyze a subset of rules that are DL-safe. I think that would be a tremendous result for safety. The thing that I have found most amazing about OWL, and this holds for all of its profiles, is how expressive it is. I can think almost in the same way I would think if I were in Common Lisp, modeling classes with CLOS. That ability to think at a high level about my domain and then actually use those classes in programs is enticing to me. In my ideal world there would be no rules that could not be encoded in RDF.

wikipunk18:08:25

Also relevant food for thought if anyone is interested in the problem of encoding rules in RDF: https://spinrdf.org/spin-shacl.html SPIN, which I did try to wrap my mind around and failed, apparently attempted to be a more expressive RDF syntax for SPARQL rules, and they have now moved to SHACL. That doesn't really help OWL, but it does make me wonder if there's a way to combine the node expressions of SHACL with SHACL triple pattern rules, and create a SWRL vocabulary mapping to Datomic built-ins instead of XML built-ins. Thinking out loud. Have a good weekend everyone!

quoll19:08:25

SPIN was a pseudo-Datalog, where they decided that mapping standard Datalog constraints to SPARQL was too hard, so they just encoded the SPARQL directly. This made it unstructured, and consequently there was no ability to automate dependencies between productions or rule scheduling. That made it inefficient. But… it worked. There were issues with trying to use it at scale though.

quoll19:08:13

SHACL is basically a concession to software engineers who can’t get their head around the open world assumption, and how to construct constraints in OWL

quoll19:08:20

It’s easy to build, since it can be built out of SPARQL

quoll19:08:18

I do like that it’s declarative, but it is designed for a CWA (closed-world assumption) system. That’s fine if you’re trying to use your RDF graph as a CWA representation of data, but it’s not “Semantic Web”, and it breaks a number of assumptions for both RDF and OWL

wikipunk19:08:02

SPDX adopted SHACL so I need to learn it anyway for software bill of materials

quoll19:08:29

Oh, it’s popular. You need to know it. I just disagree with it 🙂

wikipunk19:08:46

https://github.com/spdx/spdx-3-model/issues/464 specifications are not easy to collaborate on. If you have any insight, feel free to add comments. I’m trying to understand a bunch of different aspects of the new model and had to reimplement it myself

quoll20:08:43

I’m at work, so I don’t have time to read this, but I’m instantly on my guard reading:

> “Blank Nodes” - generally said to be harmful in the literature

💯 2
quoll20:08:04

As step 1, I’d like to know which literature

quoll20:08:19

There are occasions where blank nodes are painful to deal with. In those cases, I advocate for the minting of IRIs. But I can’t fathom what’s considered wrong about them

wikipunk20:08:09

Ha, yeah. But this kind of thing is common in my experience specifically in this domain

wikipunk20:08:53

I just have to figure out how to contribute the right ideas :) maybe you can help, but not now

quoll20:08:08

I mean… I’ve heard complaints. I’ve complained about them myself. But the complaints usually miss some elements of modeling. I know that I didn’t get it, because I was looking at it from a storage/querying point of view

quoll20:08:01

I know that there have been some databases which allocated IDs that were externally addressable (e.g. Talis)

wikipunk20:08:56

A bigger problem (which I’m not sure how to broach effectively) is that they are defining an RDF model with markdown, using a WIP format that doesn’t really compile into Turtle currently

wikipunk20:08:07

Instead of collaborating in Turtle directly

wikipunk20:08:26

It doesn’t make contribution easier, because contributors need to learn the markdown format, which isn’t well specified

wikipunk20:08:38

And needs to be extended as they learn about RDF