rdf

wikipunk 2023-08-18T17:34:39.478069Z

https://docs.oxfordsemantic.tech/4.0/reasoning-in-rdfox.html comparing RDFox datalog to Datomic is interesting

quoll 2023-08-18T18:07:44.576629Z

The essential elements of Datalog are:
• a body, which matches patterns against data
• a head, which uses bound values from the body to express new data

quoll 2023-08-18T18:10:20.497679Z

Traditionally, that’s done using predicate syntax, with commas separating the conjunctive terms in the body and :- separating the head from the body:
    headTerm(arg,arg) :- pattern1(arg,arg), pattern2(arg,arg) .
Arguments can be ground terms or variables.
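To make the head/body split concrete, here is a minimal sketch of evaluating one rule in predicate syntax. Everything in it is an assumption for illustration: facts are plain tuples, variables are strings starting with "?", and the grandparent/parent rule is a made-up example, not something from RDFox or Datomic.

```python
# Facts are (predicate, arg, arg) tuples; variables are strings starting with "?".
def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def match(pattern, fact, bindings):
    """Unify one body pattern with one fact; return extended bindings or None."""
    if len(pattern) != len(fact):
        return None
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            # Bind the variable, or check it agrees with an earlier binding.
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new

def solve(body, facts, bindings=None):
    """Yield every binding set that satisfies all body patterns (a conjunction)."""
    bindings = {} if bindings is None else bindings
    if not body:
        yield bindings
        return
    for fact in facts:
        extended = match(body[0], fact, bindings)
        if extended is not None:
            yield from solve(body[1:], facts, extended)

def apply_rule(head, body, facts):
    """Instantiate the head once for each solution of the body."""
    return {tuple(b.get(t, t) for t in head) for b in solve(body, facts)}

facts = {("parent", "alice", "bob"), ("parent", "bob", "carol")}
# grandparent(?x, ?z) :- parent(?x, ?y), parent(?y, ?z) .
head = ("grandparent", "?x", "?z")
body = [("parent", "?x", "?y"), ("parent", "?y", "?z")]
derived = apply_rule(head, body, facts)
print(derived)
```

This applies the rule once; a real engine would keep re-applying rules until no new facts appear.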

quoll 2023-08-18T18:12:13.619519Z

However, this is awkward with RDF, since it reorders the triple into predicate(subject,object). Also, because any of these terms can be IRIs or QNames, it doesn’t look much like standard predicate syntax.

wikipunk 2023-08-18T18:12:23.755969Z

[?d, :deptAvgSalary, ?z] :-
    [?d, rdf:type, :Department],
    AGGREGATE(
        [?x, :worksFor, ?d],
        [?x, :salary, ?s]
        ON ?d
        BIND AVG(?s) AS ?z) .
is the rule I adapted from the RDFox documentation
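For anyone who wants to see what that rule computes, here is a rough Python sketch over an in-memory set of triples. The sample data and the function name dept_avg_salary are assumptions for illustration only; this is not how RDFox evaluates AGGREGATE, just the same grouping logic spelled out.

```python
from collections import defaultdict

# Hypothetical sample data: (subject, predicate, object) triples.
triples = {
    ("dept1", "rdf:type", ":Department"),
    ("dept2", "rdf:type", ":Department"),
    ("ann", ":worksFor", "dept1"), ("ann", ":salary", 100),
    ("bob", ":worksFor", "dept1"), ("bob", ":salary", 200),
    ("cat", ":worksFor", "dept2"), ("cat", ":salary", 150),
}

def dept_avg_salary(triples):
    """Group salaries by department (ON ?d) and emit one average per group."""
    departments = {s for s, p, o in triples if p == "rdf:type" and o == ":Department"}
    works_for = [(s, o) for s, p, o in triples if p == ":worksFor"]
    salaries = [(s, o) for s, p, o in triples if p == ":salary"]

    groups = defaultdict(list)
    for x, d in works_for:
        if d not in departments:
            continue  # the body requires [?d, rdf:type, :Department]
        for x2, s in salaries:
            if x2 == x:  # join [?x, :worksFor, ?d] with [?x, :salary, ?s]
                groups[d].append(s)

    # Head: [?d, :deptAvgSalary, ?z] with ?z bound to AVG(?s).
    return {(d, ":deptAvgSalary", sum(v) / len(v)) for d, v in groups.items()}

averages = dept_avg_salary(triples)
print(averages)
```

A department with no matching employees produces no group here, so no triple is emitted for it.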

quoll 2023-08-18T18:16:24.799849Z

But most importantly, predicate logic like this doesn’t expect variable predicates, which are just fine in RDF. So using a triple-based syntax seems much more appropriate. RDFox made that switch (triple patterns, rather than predicate statements), but left everything else as-is. Datomic also uses triple-patterns, but then went with its usual data structures for the body, wrapped them in a function-looking syntax, and uses it in a query like a function, rather than the head creating new statements. There’s enough familiarity that it works

wikipunk 2023-08-18T18:18:07.009579Z

yeah if the head were a pattern too then you could have a head like [?x :owl/sameAs ?y]

quoll 2023-08-18T18:19:09.245949Z

Well, that would result in a new transaction, which isn’t what Datomic queries are for. But the result of a query can easily be passed in as the data for a transaction that follows it 🙂

wikipunk 2023-08-18T18:19:34.218089Z

Well, I definitely would like to be able to infer triples inside a query, can't speak for most others though

wikipunk 2023-08-18T18:19:52.991319Z

But now yeah I have to do inference and then chain logic

wikipunk 2023-08-18T18:19:58.768159Z

But it's still pretty nice!

quoll 2023-08-18T18:20:53.550809Z

That can be done where you say something like (rule-name ?entity ?attr ?value) instead of a standard [?entity ?attr ?value] pattern

wikipunk 2023-08-18T18:21:11.220089Z

Furthermore, there is definitely a lot of room to compile Datalog rules that are more formally specified, in a syntax like the one you shared, into Clojure/Datomic forms

quoll 2023-08-18T18:21:12.438159Z

But you can need a big ruleset to make it work

wikipunk 2023-08-18T18:21:38.626989Z

OWL RL isn't too big

wikipunk 2023-08-18T18:21:50.598529Z

I guess it depends on what counts as big

quoll 2023-08-18T18:23:17.357449Z

Some of those rules aren’t easy to encode though

wikipunk 2023-08-18T18:23:39.275379Z

The only one I am having trouble with is the property chain axiom

wikipunk 2023-08-18T18:23:44.145819Z

That one is a mind bender

wikipunk 2023-08-18T18:23:48.734899Z

The rest are pretty straightforward
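As a way of reading the property chain rule (prp-spo2 in OWL 2 RL), here is a small Python sketch. It assumes the chain has already been pulled out of its RDF list into a plain Python list, and the names apply_property_chain, :hasParent, :hasBrother and :hasUncle are hypothetical. It does a single pass, not a fixpoint.

```python
def apply_property_chain(super_prop, chain, triples):
    """Follow chain[0], chain[1], ... and link the two endpoints with super_prop."""
    # Pairs (start, current) reachable by the first property in the chain.
    frontier = {(s, o) for s, p, o in triples if p == chain[0]}
    for prop in chain[1:]:
        step = {(s, o) for s, p, o in triples if p == prop}
        # Extend each partial path by one hop along the next property.
        frontier = {(start, nxt)
                    for start, mid in frontier
                    for s, nxt in step
                    if s == mid}
    return {(start, super_prop, end) for start, end in frontier}

triples = {
    ("alice", ":hasParent", "bob"),
    ("bob", ":hasBrother", "carl"),
    ("dana", ":hasParent", "erin"),
}
# :hasUncle owl:propertyChainAxiom ( :hasParent :hasBrother )
inferred = apply_property_chain(":hasUncle", [":hasParent", ":hasBrother"], triples)
print(inferred)
```

The awkward part in a rule language is the variable-length list; once the chain is a concrete sequence, each step is an ordinary join.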

quoll 2023-08-18T18:24:15.862699Z

Yes, I’ve been thinking that int1 would be easier in Clojure (SPARQL is awful for it)

wikipunk 2023-08-18T18:26:44.167799Z

I'm trying to understand the semantics of that vs int2

wikipunk 2023-08-18T18:29:54.355329Z

https://www.w3.org/TR/rif-owl-rl/ This has been the most useful reference I've found, because it provides RIF equivalents for the rules and deals with the lists

🙏 1
wikipunk 2023-08-18T18:30:31.559239Z

(* <#cls-int1> *)
  Forall ?y ?c ?l (
    ?y[rdf:type->?c] :- And (
      ?c[owl:intersectionOf->?l]
      _allTypes(?l ?y) ))

  Forall ?l ?y ?ty ?tl (
    _allTypes(?l ?y) :- And (
      ?l[rdf:first->?ty rdf:rest->?tl]
      ?y[rdf:type->?ty]
      _allTypes(?tl ?y) ))

  Forall ?l ?y ?ty (
    _allTypes(?l ?y) :- And (
      ?l[rdf:first->?ty rdf:rest->rdf:nil]
      ?y[rdf:type->?ty] ))
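A rough Python transcription of those three RIF rules, to show the recursion over the RDF list. The helper names (objects, all_types, cls_int1) and the sample triples are assumptions for illustration, and the sketch assumes well-formed, acyclic lists.

```python
RDF_TYPE, RDF_FIRST, RDF_REST, RDF_NIL = "rdf:type", "rdf:first", "rdf:rest", "rdf:nil"
OWL_INTERSECTION_OF = "owl:intersectionOf"

def objects(triples, s, p):
    """All objects o such that (s, p, o) is in the graph."""
    return {o for s2, p2, o in triples if s2 == s and p2 == p}

def all_types(triples, lst, y):
    """Mirror of _allTypes(?l ?y): y has every type in the list starting at lst."""
    for ty in objects(triples, lst, RDF_FIRST):
        if (y, RDF_TYPE, ty) not in triples:
            continue
        for rest in objects(triples, lst, RDF_REST):
            # Base case: rest is rdf:nil. Recursive case: check the tail.
            if rest == RDF_NIL or all_types(triples, rest, y):
                return True
    return False

def cls_int1(triples):
    """Infer y rdf:type c whenever y has every type in c's intersectionOf list."""
    subjects = {s for s, p, o in triples if p == RDF_TYPE}
    inferred = set()
    for c, p, lst in triples:
        if p == OWL_INTERSECTION_OF:
            for y in subjects:
                if all_types(triples, lst, y):
                    inferred.add((y, RDF_TYPE, c))
    return inferred

triples = {
    (":WorkingParent", OWL_INTERSECTION_OF, "_:l1"),
    ("_:l1", RDF_FIRST, ":Parent"), ("_:l1", RDF_REST, "_:l2"),
    ("_:l2", RDF_FIRST, ":Employee"), ("_:l2", RDF_REST, RDF_NIL),
    ("ann", RDF_TYPE, ":Parent"), ("ann", RDF_TYPE, ":Employee"),
    ("bob", RDF_TYPE, ":Parent"),
}
result = cls_int1(triples)
print(result)
```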

quoll 2023-08-18T18:38:39.210609Z

OK, that’d work. It’s very recursive, but I guess that’s ideal for Clojure 🙂

wikipunk 2023-08-18T18:43:30.587649Z

http://webont.org/owled/2009/papers/owled2009_submission_16.pdf Something like this is the "holy grail" for me. What I want is to describe as much as possible with pure OWL and only drop down into rules when necessary. If there were a well-specified syntax, then we could create tools to statically analyze a subset of rules that are DL-safe. I think that would be a tremendous result for safety. The thing that I have found most amazing about OWL, and this holds for all of its profiles, is how expressive it is. I can think almost in the same way I would if I were in Common Lisp and modeling classes with CLOS. That ability to think at a high level about my domain and then actually use those classes in programs is enticing to me. In my ideal world there would be no rules that could not be encoded in RDF.

wikipunk 2023-08-18T18:52:25.467799Z

Also relevant food for thought if anyone is interested in the problem of encoding rules in RDF: https://spinrdf.org/spin-shacl.html SPIN, which I tried and failed to wrap my mind around, apparently attempted to be a more expressive RDF syntax for SPARQL rules, and now they have moved to SHACL. That doesn't really help OWL, but it does make me wonder if there's a way to combine the node expressions of SHACL with SHACL triple-pattern rules, but create a SWRL vocabulary mapping to Datomic built-ins instead of XML built-ins. Thinking out loud. Have a good weekend everyone!

quoll 2023-08-18T19:26:25.280249Z

SPIN was a pseudo-Datalog, where they decided that mapping standard Datalog constraints to SPARQL was too hard, and they just encoded the SPARQL directly. This made it unstructured, and consequently there was no ability to automate dependencies between productions and rule scheduling. This made it inefficient. But… it worked. There were issues with trying to use it at scale though.

quoll 2023-08-18T19:27:13.447109Z

SHACL is basically a concession to software engineers who can’t get their head around the open world assumption, and how to construct constraints in OWL

quoll 2023-08-18T19:28:20.027269Z

It’s easy to build, since it can be built out of SPARQL

quoll 2023-08-18T19:35:18.723189Z

I do like that it’s declarative, but it is designed for a CWA system. That’s fine if you’re trying to use your RDF graph as a CWA representation of data, but it’s not “Semantic Web”, and breaks a number of assumptions for both RDF and OWL

wikipunk 2023-08-18T19:55:02.801939Z

SPDX adopted SHACL so I need to learn it anyway for software bill of materials

quoll 2023-08-18T19:56:29.849229Z

Oh, it’s popular. You need to know it. I just disagree with it 🙂

wikipunk 2023-08-18T19:57:46.966329Z

https://github.com/spdx/spdx-3-model/issues/464 Specifications are not easy to collaborate on. If you have any insight, feel free to add comments. I’m trying to understand a bunch of different aspects of the new model and had to reimplement it myself

quoll 2023-08-18T20:02:43.801179Z

I’m at work, so I don’t have time to read this, but I’m instantly on my guard reading:
> “Blank Nodes” - generally said to be harmful in the literature

💯 1
quoll 2023-08-18T20:03:04.433379Z

As step 1, I’d like to know which literature

quoll 2023-08-18T20:04:19.947939Z

There are occasions where blank nodes are painful to deal with. In those cases, I advocate for the minting of IRIs. But I can’t fathom what’s considered wrong about them

wikipunk 2023-08-18T20:05:09.665609Z

Ha, yeah. But this kind of thing is common in my experience specifically in this domain

wikipunk 2023-08-18T20:05:53.184779Z

I just have to figure out how to contribute the right ideas :) maybe you can help, but not now

quoll 2023-08-18T20:06:08.592899Z

I mean… I’ve heard complaints. I’ve complained about them myself. But the complaints usually miss some elements of modeling. I know that I didn’t get it, because I was looking at it from a storage/querying point of view

quoll 2023-08-18T20:07:01.652469Z

I know that there have been some databases which allocated IDs that were externally addressable (e.g. Talis)

wikipunk 2023-08-18T20:07:56.837119Z

A bigger problem (which I’m not sure how to broach effectively) is that they are defining an RDF model in markdown, using a WIP format that doesn’t really compile into Turtle currently

wikipunk 2023-08-18T20:08:07.030099Z

Instead of collaborating with Turtle directly

wikipunk 2023-08-18T20:08:26.831219Z

It doesn’t make contribution easier because the contributors need to learn the markdown format which isn’t specified well

wikipunk 2023-08-18T20:08:38.260189Z

And needs to be extended as they learn about RDF