RDF and the future of LLMs: https://www.youtube.com/watch?v=OxzUjpihIH4
Loved this.
@luke seems to have been doing a lot of deep thinking on this.
Yeah, it was pretty interesting. There’s a lot of people looking at using GOFAI to augment LLMs/ML systems. We’ve also seen 3 RDF database vendors acquired recently; largely, I suspect, to boost the AI stories of the purchasers. How much of this translates into meaningful progress is still to be seen though. Regardless, I think there’s a lot of mileage in combining symbolic and ML approaches. My belief is that symbolic approaches are inherently rooted in mathematics, formal methods, symbolic logic etc; and I don’t think anyone is saying that an LLM removes the utility of maths. So my argument for knowledge graphs and representation, which I’ve not heard many people make (though I think the belief underpins @luke’s talk), is that in principle these are tools “for thought”, and if they’re helpful for us, they’re helpful for AIs/LLMs too. For example, CoT over ontologies and GOFAI reasoning feels like an interesting research direction.
Also Erik Meijer has some pretty cool stuff in this space
100%, it doesn't matter how "smart" an LLM's generative capacity is, it needs to be able to do the equivalent of breaking out a pencil and paper to make some calculations.
and I think everyone realizes this, but the prevailing approach is just to have them write code and then run it in a sandbox, which kind of works but I think is a dead end.
Great to have you here @luke 👋
Your talk really resonated with me, as I’d come to similar conclusions.
I’ve spent many years working on modelling government stats with RDF and data cubes; though it’s unfortunately not an area I’m currently paid to work on 😞
There are real/concrete problems in how statisticians work with statistics, and in understanding what the statistics mean, that would benefit from these kinds of approaches.
Often they’re just given (or publish) a spreadsheet of stats with very little description of what the statistics are really about. What are the dimensions of the data, what is being counted, what are the units, is an ethnicity of Other comparable to an ethnicity of Other in a different dataset, etc… Not to mention issues around changing definitions, geographic boundaries etc.
The idea is to model all these concepts and changes such that statistics are better described, more comparable, more harmonised in their concepts etc.
So there’s a lot of mileage here to use LLMs to help with that; in particular on the tooling side to augment the extra metadata, but then also to use that metamodel to support grokking previously unseen datasets etc.
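A rough sketch of the kind of modelling described above (dimensions, what is being counted, units, and whether "Other" means the same thing in two datasets). It uses plain Python dataclasses rather than RDF to keep it short, and every class name, dataset and example.org URI below is made up for illustration, not taken from any real vocabulary:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Code:
    """One value of a dimension, e.g. the ethnicity category 'Other'."""
    label: str
    concept_uri: str  # hypothetical identifier for the concept behind the label


@dataclass
class Dimension:
    name: str
    codes: dict = field(default_factory=dict)  # label -> Code


@dataclass
class Dataset:
    title: str
    measure: str  # what is being counted
    unit: str
    dimensions: dict = field(default_factory=dict)  # name -> Dimension


def comparable(a: Dataset, b: Dataset, dimension: str, label: str) -> bool:
    """Same-labelled codes are only comparable if they share a concept."""
    try:
        code_a = a.dimensions[dimension].codes[label]
        code_b = b.dimensions[dimension].codes[label]
    except KeyError:
        # Dimension or code is missing from one dataset: nothing to compare.
        return False
    return code_a.concept_uri == code_b.concept_uri


# Two illustrative datasets that both have an ethnicity of "Other",
# but defined against different (hypothetical) classifications.
census = Dataset(
    title="Population by ethnicity",
    measure="usual residents",
    unit="persons",
    dimensions={"ethnicity": Dimension("ethnicity", {
        "Other": Code("Other", "http://example.org/concept/ethnicity/other-5-group"),
    })},
)
survey = Dataset(
    title="Survey respondents by ethnicity",
    measure="respondents",
    unit="persons",
    dimensions={"ethnicity": Dimension("ethnicity", {
        "Other": Code("Other", "http://example.org/concept/ethnicity/other-18-group"),
    })},
)

print(comparable(census, survey, "ethnicity", "Other"))
```

The point of the sketch is that the label alone ("Other") says nothing about comparability; it is the link to an explicitly modelled concept that does, and the same idea extends to units, changing definitions and geographic boundaries.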