#rdf
2023-01-16
Mathias Picker18:01:19

Has anyone tried to use the Jena libs with GraalVM? I'm trying it for the first time and getting a ClassNotFoundException. I've seldom used Graal, and never with Java libraries in a Clojure project, so I'm running against a wall here. Pure Clojure projects work fine with this setup. Demo project: https://github.com/virtual-earth-de/jena-graal

quoll05:01:28

All love to Andy, but this is one of the reasons why I reimplement these things in Clojure 🙂

quoll05:01:18

one of the big reasons for Graal to fail on Java libs is when reflection gets used. I can't tell you where that's happening, but that's a first thing to look for.

Mathias Picker11:01:25

…thanks, I got over the reflection bit for the queryparser, but now I'm seeing a new one with xerces :) It would be great to have a working semantic web environment in Clojure; it's on my todo list to check out Asami for RDF. Right now I'm testing / benchmarking commercial triple stores and found RDFox, with its main-memory implementation, quite impressive. > 4 billion triples in 512 GB, not too shabby. And query performance is impressive compared to its on-disk contenders. I'm wondering how much work it would be to import test data from e.g. the Berlin SPARQL Benchmark into Clojure Datalog stores. Maybe I'll work on that, maybe I can even use Fluree's SPARQL parser to run the tests…

Kelvin14:01:30

Apologies if this sounds narcissistic for promoting my own libs, but looking at your code you might want to check out https://github.com/yetanalytics/flint, the Clojure -> SPARQL compiler I made.

Kelvin14:01:56

I also wrote https://github.com/yetanalytics/flint-jena, which is a version that builds Jena Query/UpdateRequest instances directly instead of strings.

Kelvin14:01:00

I’m curious how they would actually work in GraalVM. Vanilla Flint should work just fine since that’s 100% Clojure, but flint-jena might run into issues since it calls a lot of low-level Jena methods directly

rickmoynihan14:01:03

Have you traced reflective calls with the -agentlib:native-image-agent?

rickmoynihan14:01:19

I’ve not done this with Jena, but have done it with RDF4j and using the agent to trace the reflective calls, generated the configs I needed just fine. IIRC it does tend to give you much more config than you need (which IIRC can bloat the binaries a bit) but you can then go through them by hand and clean up the config you don’t require easily enough.

quoll15:01:46

@Mathias Picker Asami is not there for RDF yet. I have started on this, but other solutions will be better

Kelvin15:01:01

Speaking of Asami I wish there was a “Korra” library that would allow you to use Asami with SPARQL

Kelvin15:01:15

> I have started on this
Though this might be a good sign

quoll16:01:15

I am trying to create a transform library that takes SPARQL and generates Asami queries. Why do it that way? Because Asami queries are data structures, and the structures are executed directly

quoll16:01:44

Asami does not have 100% coverage of SPARQL functionality, but it has a LOT of it

Kelvin16:01:03

Yes that sort of SPARQL -> Asami compilation was exactly what I was envisioning

quoll16:01:50

I spend my days doing SPARQL queries now, and it seems ridiculous that I don’t have it on Asami. And embarrassing

rickmoynihan16:01:17

That would be a pretty cute addition to Asami for sure. What I’d really like, though, would be something to go from SPARQL to flint (though IIRC it doesn’t have 100% coverage of the SPARQL grammar) or an intermediate data representation of SPARQL. We currently rewrite SPARQL queries by parsing them into an AST with Jena, munging the AST, and then printing it back to SPARQL for execution against the database. All written in Clojure, but the Jena APIs for this aren’t ideal. That said, if such a thing existed I probably wouldn’t replace what we have, as it’s battle tested and works in a bunch of production settings… However, if I were to do something again and were in less of a hurry I’d probably go down this route 🙂

Kelvin16:01:10

> though IIRC it doesn’t have 100% coverage of the SPARQL grammar
Yeah unfortunately it doesn’t, which is why I never made a SPARQL -> flint parser

Kelvin16:01:24

I’d say it’s like 90-95% coverage though

👍 2
Kelvin16:01:36

And as an aside I would love to see Flint work with Asami

quoll16:01:29

Asami’s functionality has been driven by SPARQL. There were a couple of occasions when Asami implemented a query option before Datomic did. However, there are a couple of minor semantic differences between SPARQL and Datomic. Whenever that happened, I always took the SPARQL route

quoll16:01:23

(I had lots of reasons for this, but probably the real reason was laziness… I already understood the SPARQL semantics)

😁 2
quoll16:01:49

Going back to the original statement that started this thread… My biggest reason for reimplementing things from scratch is platform neutrality. My company has a big investment in .net, and I don’t want to write C# 🙂

quoll16:01:11

Also, embedding in web pages can be an important use case

quoll16:01:48

This is also why I try to build each component as a library, rather than a monolithic project. The smaller the dependencies for a web page, the better

Kelvin16:01:37

Also because multiple libraries are easier to maintain than a big monolithic one

quoll16:01:03

That’s not been my experience

Kelvin16:01:39

Ok maybe “maintain” was the wrong word - I should say “easier to navigate each lib”

Kelvin16:01:57

I’m just basing off my experiences of having to trawl through all of Jena

quoll16:01:21

That’s true. The boundaries are far clearer

quoll16:01:49

Nothing worse than going through module A only to see it reach into the internals of something else

Kelvin16:01:28

Also doesn’t help that Jena is a real-life version of Enterprise FizzBuzz

quoll16:01:03

I’m guessing that you didn’t see the Jena codebase circa 2002

😄 2
quoll16:01:20

It was… 😖

quoll16:01:51

I appreciate that they describe their query plans in a lisp-like syntax, but that’s it

quoll16:01:01

Even as late as 2004, they did BGP matching against triples using a filter. Yes, a filter. If you inserted 1000 triples, in which you have 10 instances of my:Type, and you did a basic query of:

SELECT ?y
WHERE { ?x rdf:type my:Type . ?x rdf:value ?y }
Then you would iterate through that data with 10,000 tests

😮 2
rickmoynihan16:01:33

yeah I always much preferred the code quality and cleanliness of RDF4j to Jena… Though it’s not quite as fully featured as Jena.

quoll16:01:12

Actually, I guess it was 11000 tests. 1000 to match the type, and then for each of the 10 matches you’d iterate through 1000 statements
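
A rough sketch of that arithmetic, assuming a plain scan-and-filter evaluator. This is not Jena's actual code; the function name `count_filter_tests`, the `ex:` names, and the sample data are made up for illustration. It counts one test every time a triple is checked against a pattern:

```python
def count_filter_tests(triples, type_pred, type_obj, value_pred):
    """Count per-triple tests for a naive scan-and-filter evaluation of
    { ?x type_pred type_obj . ?x value_pred ?y } (illustrative only)."""
    tests = 0
    results = []

    # First pattern: test every triple against (?x, type_pred, type_obj).
    subjects = []
    for s, p, o in triples:
        tests += 1
        if p == type_pred and o == type_obj:
            subjects.append(s)

    # Second pattern: for each binding of ?x, scan the whole data set again.
    for x in subjects:
        for s, p, o in triples:
            tests += 1
            if s == x and p == value_pred:
                results.append(o)

    return tests, results


# Hypothetical data set: 1000 triples, 10 of which are rdf:type my:Type.
triples = []
for i in range(10):
    triples.append((f"ex:s{i}", "rdf:type", "my:Type"))
    triples.append((f"ex:s{i}", "rdf:value", i))
while len(triples) < 1000:
    triples.append((f"ex:other{len(triples)}", "ex:p", "ex:o"))

tests, results = count_filter_tests(triples, "rdf:type", "my:Type", "rdf:value")
print(tests)  # 1000 + 10 * 1000 = 11000
```

An indexed store would instead look up the 10 matching subjects directly, so the number of tests would scale with the matches rather than with the full data set.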

quoll16:01:45

Sesame was much, much better. Hmmm, I wonder what happened to them? I should ask

rickmoynihan17:01:01

Sesame was renamed RDF4j when it moved to eclipse

👍 2
rickmoynihan17:01:00

it’s still basically the same — though they repackaged all the classes under org.eclipse and have made a few breaking changes along the way

rickmoynihan17:01:58

Jeen Broekstra is still very active

rickmoynihan17:01:37

It is of course still a huge java project with lots of modules etc