rdf

Mathias Picker 2023-01-16T18:12:19.480109Z

Has anyone tried to use the Jena libs with GraalVM? I'm trying it for the first time and getting a ClassNotFoundException. I've seldom used Graal, and never with Java libraries in a Clojure project, so I'm running into a wall here. Pure Clojure projects work fine with this setup. Demo project: https://github.com/virtual-earth-de/jena-graal

Mathias Picker 2023-01-17T11:28:25.780649Z

…thanks, I got over the reflection bit for the query parser, but now I'm seeing a new one with Xerces :) It would be great to have a working semantic web environment in Clojure; it's on my todo list to check out Asami for RDF. Right now I'm testing and benchmarking commercial triple stores and found RDFox, with its main-memory implementation, quite impressive: more than 4 billion triples in 512 GB, not too shabby. And query performance is impressive compared to its on-disk contenders. I'm wondering how much work it would be to import test data from e.g. the Berlin SPARQL Benchmark into Clojure Datalog stores. Maybe I'll work on that; maybe I can even use Fluree's SPARQL parser to run the tests…

Kelvin 2023-01-17T14:45:30.766949Z

Apologies if this sounds narcissistic for promoting my own libs, but looking at your code you might want to check out https://github.com/yetanalytics/flint, the Clojure -> SPARQL compiler I made.

Kelvin 2023-01-17T14:46:56.229029Z

I also wrote https://github.com/yetanalytics/flint-jena, which is a version that builds Jena Query/UpdateRequest instances directly instead of strings.

Kelvin 2023-01-17T14:48:00.571839Z

I’m curious how they would actually work in GraalVM. Vanilla Flint should work just fine since that’s 100% Clojure, but flint-jena might run into issues since it calls a lot of low-level Jena methods directly

2023-01-17T14:56:03.474379Z

Have you traced reflective calls with the -agentlib:native-image-agent?

2023-01-17T14:58:19.962699Z

I’ve not done this with Jena, but I have done it with RDF4j: using the agent to trace the reflective calls generated the configs I needed just fine. IIRC it does tend to give you much more config than you need (which IIRC can bloat the binaries a bit), but you can then go through it by hand and clean out the config you don’t require easily enough.

quoll 2023-01-17T15:14:46.175049Z

@mathias.picker Asami is not there for RDF yet. I have started on this, but other solutions will be better

Kelvin 2023-01-17T15:17:01.066629Z

Speaking of Asami I wish there was a “Korra” library that would allow you to use Asami with SPARQL

Kelvin 2023-01-17T15:17:15.206029Z

> I have started on this
Though this might be a good sign

quoll 2023-01-17T16:07:15.035699Z

I am trying to create a transform library that takes SPARQL and generates Asami queries. Why do it that way? Because Asami queries are data structures, and the structures are executed directly

quoll 2023-01-17T16:07:44.448389Z

Asami does not have 100% coverage of SPARQL functionality, but it has a LOT of it

Kelvin 2023-01-17T16:08:03.820739Z

Yes that sort of SPARQL -> Asami compilation was exactly what I was envisioning

quoll 2023-01-17T16:08:50.084859Z

I spend my days doing SPARQL queries now, and it seems ridiculous that I don’t have it on Asami. And embarrassing

2023-01-17T16:39:17.583769Z

That would be a pretty cute addition to Asami for sure. What I’d really like, though, would be something to go from SPARQL to flint (though IIRC it doesn’t have 100% coverage of the SPARQL grammar) or to an intermediate data representation of SPARQL. We currently rewrite SPARQL queries by parsing them into an AST with Jena, munging the AST, and then printing it back to SPARQL for execution against the database. It's all written in Clojure, but the Jena APIs for this aren’t ideal. That said, if such a thing existed I probably wouldn’t replace what we have, as it’s battle tested and works in a bunch of production settings… However, if I were to do something like this again and were in less of a hurry, I’d probably go down this route 🙂

Kelvin 2023-01-17T16:41:10.982439Z

> though IIRC it doesn’t have 100% coverage of the SPARQL grammar
Yeah unfortunately it doesn’t, which is why I never made a SPARQL -> flint parser

Kelvin 2023-01-17T16:41:24.840479Z

I’d say it’s like 90-95% coverage though

👍 1
Kelvin 2023-01-17T16:41:36.570759Z

And as an aside I would love to see Flint work with Asami

quoll 2023-01-17T16:42:29.768739Z

Asami’s functionality has been driven by SPARQL. There were a couple of occasions when Asami implemented a query option before Datomic did. However, there are a couple of minor semantic differences between SPARQL and Datomic. Whenever that happened, I always took the SPARQL route

quoll 2023-01-17T16:43:23.264799Z

(I had lots of reasons for this, but probably the real reason was laziness… I already understood the SPARQL semantics)

😁 1
quoll 2023-01-17T16:47:49.403909Z

Going back to the original statement that started this thread… My biggest reason for reimplementing things from scratch is platform neutrality. My company has a big investment in .NET, and I don’t want to write C# 🙂

quoll 2023-01-17T16:48:11.987579Z

Also, embedding in web pages can be an important use case

quoll 2023-01-17T16:48:48.316069Z

This is also why I try to build each component as a library, rather than a monolithic project. The smaller the dependencies for a web page, the better

Kelvin 2023-01-17T16:49:37.864609Z

Also because multiple libraries are easier to maintain than a big monolithic one

quoll 2023-01-17T16:50:03.149379Z

That’s not been my experience

Kelvin 2023-01-17T16:50:39.584659Z

Ok maybe “maintain” was the wrong word - I should say “easier to navigate each lib”

Kelvin 2023-01-17T16:50:57.498549Z

I’m just basing this on my experience of having to trawl through all of Jena

quoll 2023-01-17T16:51:21.550259Z

That’s true. The boundaries are far clearer

quoll 2023-01-17T16:51:49.168449Z

Nothing worse than going through module A only to see it reach into the internals of something else

Kelvin 2023-01-17T16:52:28.849049Z

Also doesn’t help that Jena is a real-life version of Enterprise FizzBuzz

quoll 2023-01-17T16:53:03.833809Z

I’m guessing that you didn’t see the Jena codebase circa 2002

😄 1
quoll 2023-01-17T16:53:20.277139Z

It was… 😖

quoll 2023-01-17T16:53:51.390169Z

I appreciate that they describe their query plans in a lisp-like syntax, but that’s it

quoll 2023-01-17T16:57:01.499649Z

Even as late as 2004, they did BGP matching against triples using a filter. Yes, a filter. If you inserted 1000 triples, in which you have 10 instances of my:Type, and you did a basic query of:

SELECT ?y
WHERE { ?x rdf:type my:Type . ?x rdf:value ?y }
Then you would iterate through that data with 10,000 tests

😮 1
2023-01-17T16:58:33.443779Z

yeah I always much preferred the code quality and cleanliness of RDF4j to Jena… Though it’s not quite as fully featured as Jena.

quoll 2023-01-17T16:59:12.771119Z

Actually, I guess it was 11000 tests. 1000 to match the type, and then for each of the 10 matches you’d iterate through 1000 statements
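
A rough sketch of that arithmetic, not Jena's actual code: it assumes a hypothetical in-memory list of 1000 triples (10 typed as my:Type, the rest rdf:value statements) and resolves each pattern by scanning the whole list with a filter. The names `build_triples` and `naive_query` are made up for illustration.

```python
def build_triples(total=1000, typed=10):
    """Hypothetical data set: `typed` rdf:type triples, the rest rdf:value triples."""
    triples = [(f"s{i}", "rdf:type", "my:Type") for i in range(typed)]
    triples += [(f"s{i}", "rdf:value", f"v{i}") for i in range(total - typed)]
    return triples


def naive_query(triples):
    """Answer { ?x rdf:type my:Type . ?x rdf:value ?y } with filter-style full scans.

    Returns the ?y bindings and the number of triple tests performed.
    """
    tests = 0
    results = []
    # First pattern: test every triple against ?x rdf:type my:Type.
    for s, p, o in triples:
        tests += 1
        if p == "rdf:type" and o == "my:Type":
            # Second pattern: for each match, scan every triple again
            # looking for <s> rdf:value ?y.
            for s2, p2, o2 in triples:
                tests += 1
                if s2 == s and p2 == "rdf:value":
                    results.append(o2)
    return results, tests


results, tests = naive_query(build_triples())
# tests is 1000 + 10 * 1000 = 11000, for only 10 result rows
```

An index keyed on predicate and object would let the first pattern jump straight to the 10 matches instead of testing all 1000 triples.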

quoll 2023-01-17T16:59:45.165179Z

Sesame was much, much better. Hmmm, I wonder what happened to them? I should ask

2023-01-17T17:00:01.611339Z

Sesame was renamed RDF4j when it moved to eclipse

👍 1
2023-01-17T17:01:00.600479Z

it’s still basically the same — though they repackaged all the classes under org.eclipse and have made a few breaking changes along the way

2023-01-17T17:01:58.497899Z

Jeen Broekstra is still very active

2023-01-17T17:02:37.464179Z

It is of course still a huge java project with lots of modules etc

quoll 2023-01-17T05:54:28.611179Z

All love to Andy, but this is one of the reasons why I reimplement these things in Clojure 🙂

quoll 2023-01-17T05:55:18.716609Z

One of the big reasons for Graal to fail on Java libs is the use of reflection. I can't tell you where that's happening, but it's the first thing to look for.