Bart Kleijngeld09:02:43

I may have asked this before, but how would one go about querying RDF lists? In SPARQL you can use property paths to achieve this relatively easily:

    :myNodeShape (sh:and/rdf:rest*/rdf:first)* ?otherShapes .
AFAIK Asami supports `*` and `+`, but not sequence paths (`/`). Is that right? Any tips on how to deal with this?


That will get you down nested lists, which asami can do, but I’m guessing that’s not what you want?


:where [:myNodeShape :sh/and ?head]
       [?head :rdf/rest* ?list]
       [?list :rdf/first ?otherShapes]


This binds ?head to the first list node. Using :rdf/rest* means that ?list will bind to every node in the list, including the ?head node (which is zero steps away from the node bound to ?head). For each one of these nodes, you get the :rdf/first value, which is the data in the list.


Your original query is a property path that is very strange… Step along sh:and. From there, follow zero or more rdf:rest steps (which will be everything in the list), and then follow the rdf:first step to the entry in the list. From each of these list elements, recurse, i.e. follow a sh:and to the head of a list, and traverse to its nodes, recursing on each of these as well.


I had never considered nested transitive closures. This will be interesting to encode when I get to that part of SPARQL 😬

Bart Kleijngeld22:02:38

Hmm reading back I might have butchered the example a bit. I'll look at it closer tomorrow, it's bed time here now. But indeed, nested transitivity is probably a cool and powerful thing to support anyways :)

Bart Kleijngeld10:02:13

@U051N6TTC To provide a little more context: I want to recursively fetch all node shapes being referred to through sh:and. So if I have:

  a sh:NodeShape ;
  sh:and ( :BShape :CShape )

  a sh:NodeShape ;
  sh:and ( :DShape )

:CShape a sh:NodeShape .
:DShape a sh:NodeShape .
:EShape a sh:NodeShape .
I'd like to obtain:

Bart Kleijngeld10:02:41

> From there, follow zero or more rdf:rest steps (which will be everything in the list), and then follow the rdf:first step to the entry in the list.

Also, following rdf:rest* does not give everything in the list, but everything but the first element, right?


No, rdf:rest* is the entire list. rdf:rest+ is everything but the first element.
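
To make the difference between the two operators concrete, here is a small sketch in plain Clojure (no Asami involved). The `cells` map and the `rest*` / `rest+` helpers are hypothetical names invented for illustration; they only mimic what the path operators do on a well-formed RDF list.

```clojure
;; Hypothetical in-memory RDF list: each cell has a :first value and a :rest pointer.
(def cells
  {:c1 {:first :BShape :rest :c2}
   :c2 {:first :CShape :rest :rdf/nil}})

(defn rest*
  "Cells reachable from head in zero or more :rest steps, so head is included.
  :rdf/nil is dropped because it carries no :first value."
  [cells head]
  (take-while #(contains? cells %)
              (iterate #(get-in cells [% :rest]) head)))

(defn rest+
  "Cells reachable from head in one or more :rest steps, so head is excluded."
  [cells head]
  (rest (rest* cells head)))

(defn values
  "The :first value of each cell."
  [cells cell-ids]
  (map #(get-in cells [% :first]) cell-ids))

(values cells (rest* cells :c1)) ; the whole list
(values cells (rest+ cells :c1)) ; everything but the first element
```

Zero steps from the head is the head itself, which is why the `*` form does not skip the first element.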


OK, if you have nested shapes, then are you going for an arbitrary level of nesting, or just 1 level?

Bart Kleijngeld16:02:43

Arbitrary level of nesting was the idea


Then your original query was actually correct. And I need to add that functionality


In the meantime, I’d use a magic sets approach

Bart Kleijngeld16:02:04

I see now I was indeed mistaken in how I read the rdf:rest*. What is the magic set approach? Something "Google"-able suffices


INSERT {?outer internal:includes ?inner}
WHERE {
  ?outer a sh:NodeShape ;
         sh:and/rdf:rest*/rdf:first ?inner
}
Then you can do simple queries of:
SELECT ?other
WHERE { :myNodeShape internal:includes ?other }


It just means that you instantiate extra edges that meet the requirements of what you’re looking for
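
As a rough illustration of "instantiating extra edges", here is a sketch in plain Clojure over a set of triples. Nothing here is Asami API: `triples`, `objects`, `list-members` and `includes-edges` are hypothetical helpers, the list cell names `:l1` and `:l2` are made up, and the list is assumed to be well formed (no cycles). It materializes one step only: every shape that appears directly in a shape's sh:and list.

```clojure
;; Hypothetical triples: :myNodeShape has an sh:and list of (:BShape :CShape).
(def triples
  #{[:myNodeShape :sh/and :l1]
    [:l1 :rdf/first :BShape] [:l1 :rdf/rest :l2]
    [:l2 :rdf/first :CShape] [:l2 :rdf/rest :rdf/nil]})

(defn objects
  "All objects o such that [s p o] is in triples."
  [triples s p]
  (for [[s' p' o] triples
        :when (and (= s s') (= p p'))]
    o))

(defn list-members
  "Walks rdf:rest from head and collects each cell's rdf:first value."
  [triples head]
  (loop [cell head
         acc  []]
    (if-let [v (first (objects triples cell :rdf/first))]
      (recur (first (objects triples cell :rdf/rest)) (conj acc v))
      acc)))

(defn includes-edges
  "One new [outer :internal/includes inner] edge per member of each sh:and list."
  [triples]
  (set (for [[outer p head] triples
             :when (= p :sh/and)
             inner (list-members triples head)]
         [outer :internal/includes inner])))

(includes-edges triples)
```

Once edges like these are stored alongside the original data, the follow-up query only has to match a single predicate.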


You can do it without instantiation if you use subqueries… Which you can actually do with Asami, but I never showed anyone how. It’s not (yet) part of the query syntax. Instead, you feed the results of one query into the input of the next

Bart Kleijngeld16:02:10

Ah right, so those are SPARQL queries ^. And I use these to CONSTRUCT (instead of INSERT, right?) triples like you suggest, and then fetching them is straightforward, also in Asami. I get the gist at least


Ah, wait… I forgot that the INSERT/WHERE is still only doing a single step. Sigh. You can loop on that in code, but it’s messy


I’m in a meeting, so I only have half my attention on the problem. Consequently, the following is probably wrong. But I would use this kind of approach:

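;; Repeat the single-step query until no new shapes turn up.
(loop [nodes #{:myNodeShape}]
  (let [part (q '[:find [?inner ...]
                  :in $ [?node ...]
                  :where [?node :a :sh/NodeShape]
                         [?node :sh/and ?list]
                         ;; ?elem is each list cell; reusing ?node here would clash with the input binding
                         [?list :rdf/rest* ?elem]
                         [?elem :rdf/first ?inner]]
                my-db nodes)
        next-nodes (into nodes part)]
    (if (= next-nodes nodes)
      nodes                 ; nothing new was found, so this is the result
      (recur next-nodes))))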


The idea is, use the output of a query as the input of the next query. That’s how subqueries usually work too (you can thread queries into each other with ->>)


And that’s how I do transitivity. I collect things in sets, and loop on the query. When the set doesn’t change, you know you’ve finished the search
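
The set-based loop can be tried out without a database. In this sketch the `includes` map is a hypothetical stand-in for the single-step query (shape to the shapes named in its sh:and list), and `closure` is an invented name; the shape names are borrowed from the example earlier in the thread.

```clojure
;; Hypothetical stand-in for one query step: shape -> shapes in its sh:and list.
(def includes
  {:myNodeShape [:BShape :CShape]
   :BShape      [:DShape]})

(defn closure
  "Grows the set of nodes one step at a time until it stops changing."
  [step start]
  (loop [nodes #{start}]
    (let [next-nodes (into nodes (mapcat step nodes))]
      (if (= next-nodes nodes)
        nodes                 ; fixed point reached
        (recur next-nodes)))))

(closure includes :myNodeShape)
```

The result contains the starting node as well as everything reachable from it, and a shape that nothing refers to (such as :EShape in the earlier example) never enters the set.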


It’s a little more low-level, which allows me to filter out things that have already been seen, reducing the size of each step, but that’s just an optimization
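
A sketch of that optimization, again in plain Clojure with invented names (`includes`, `closure-frontier`): each round only expands the nodes found in the previous round and discards anything already seen. The data below includes a deliberate cycle to show that the loop still terminates.

```clojure
;; Hypothetical stand-in for one query step, with a cycle back to the start.
(def includes
  {:myNodeShape [:BShape :CShape]
   :BShape      [:DShape]
   :DShape      [:myNodeShape]})

(defn closure-frontier
  "Like a plain fixed-point loop, but each round only expands newly found nodes."
  [step start]
  (loop [seen     #{start}
         frontier #{start}]
    ;; Keep only results that have not been seen in an earlier round.
    (let [fresh (set (remove seen (mapcat step frontier)))]
      (if (empty? fresh)
        seen
        (recur (into seen fresh) fresh)))))

(closure-frontier includes :myNodeShape)
```

The answer is the same as looping over the whole set; only the amount of work fed into each step shrinks.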


I’ve corrected it now, but the :in clause needs “relations bindings” because that’s the result format from a prior query


i.e. [[?nodes]] needs the double brackets

Bart Kleijngeld17:02:24

I'm going to have to read this a few times and mull over it somewhat to understand it in detail. But I understand the gist, and I see a solution in this indeed. Thanks, I'm learning a lot as always!


I just realized… I needed to extract results, (which I could do, by mapping first across the output), but I also needed to wrap nodes in a seq, since it’s not actually a prior result. It’s just easier to project the results into a seq instead. Fixed.


See? I told you I only had half of my attention on this. (I’m very sorry)

Bart Kleijngeld17:02:20

Haha don't be sorry. Grateful for your help, even during costly meetings 😉
