The idea of connecting LLMs to knowledge bases does seem like a good PhD research topic
Indeed… Though I suspect it'd be "relatively straightforward"* to train or fine-tune a "supervisor NN" to check a knowledge base as part of the reward function as a mesa-optimiser, and then use that to train the base model or do something like an adversarial learning approach. I think the bigger problem is likely the quality and coverage of existing knowledge bases. Or approaches where you train it how to fact check its output… I think Bing's GPT had an ability to take actions like search the web, so I think you could reasonably teach it how to fact check against an oracle, and then use RLHF to check that it fact checked the right things, in the right context, and correctly interpreted the fact-checking results from its prompt to the fact-checker. * Obviously by "relatively straightforward" I mean straightforward for experts, not for me (a total layman).
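To make the "check a knowledge base as part of the reward function" idea a bit more concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the function name `kb_reward` is hypothetical, claims are assumed to have already been extracted from the model output as (subject, predicate, object) tuples, and the knowledge base is just a set of such tuples. It is nowhere near an actual training setup.

```python
def kb_reward(claims, knowledge_base):
    """Toy reward: fraction of extracted claims found in the knowledge base.

    Assumption: claims and knowledge_base entries are (subject, predicate,
    object) tuples. A claim missing from the knowledge base is scored as
    unsupported, even if it happens to be true.
    """
    if not claims:
        return 0.0
    supported = sum(1 for claim in claims if claim in knowledge_base)
    return supported / len(claims)


# Hypothetical knowledge base and model claims, for illustration only.
knowledge_base = {("Paris", "capital_of", "France")}
claims = [
    ("Paris", "capital_of", "France"),
    ("Lyon", "capital_of", "France"),
]
reward = kb_reward(claims, knowledge_base)
```

Note that a true claim the knowledge base simply does not contain gets the same score as a false one, which is exactly where the quality and coverage concern bites.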
I'm very confident that an interesting approach will be developed with the release of GPT-4, which has a technical report (https://cdn.openai.com/papers/gpt-4.pdf) with lots of detail on (checks notes): absolutely nothing about how it fits together:
> "Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar."
That right there is "cannot be safely used for any purpose other than (arguably) research".
coming back to the channel topic though… @quoll I was reminded again of your Kiara adapter for Datomic… can you say a bit about how it informed your later approaches? or put another way: if you needed something like that today, would you still use the same broad approach?
From an efficiency POV, connecting objects in a graph makes sense. Datomic does this too, insofar as those objects are "entities". I like the homogeneity of RDF though. Maybe people prefer having properties on objects to be separate from the edge labels, but I like the way RDF allows greater flexibility here. We see it with things like SKOS notations, where the value can be structured or a literal.
Agreed. And without RDF you don't have standardized, defined semantics for graph merge, and you have to do the integration yourself ~every time.
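A minimal Python sketch of why merge is so cheap in RDF: if triples are identified by global IRIs, merging graphs is just set union. The `http://example.org/` namespace, the data, and the `merge_graphs` helper are all made up for illustration, and the sketch assumes ground graphs only (no blank nodes, which a real RDF merge has to rename apart before taking the union).

```python
EX = "http://example.org/"  # hypothetical namespace, for illustration only


def merge_graphs(*graphs):
    """Merge ground RDF graphs (no blank nodes) by taking the set union.

    Each graph is a set of (subject, predicate, object) tuples whose terms
    are plain strings: IRIs as-is, literals wrapped in double quotes.
    """
    merged = set()
    for graph in graphs:
        merged |= set(graph)
    return merged


graph_a = {
    (EX + "alice", EX + "knows", EX + "bob"),
    (EX + "alice", EX + "name", '"Alice"'),
}
graph_b = {
    (EX + "alice", EX + "knows", EX + "bob"),  # same triple as in graph_a
    (EX + "bob", EX + "name", '"Bob"'),
}

# The shared triple collapses into one; no per-source integration code needed.
merged = merge_graphs(graph_a, graph_b)
```

Because both sources use the same IRIs for the same things, the duplicate statement disappears on its own and the rest lines up without any mapping step.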
I would probably work more with IRIs (URIs on the JVM) and not try to use keywords as much as I did. That's for scalability reasons.
I would also be more willing to create entities for each IRI, and not try to eliminate those as much as I did. Partly because I don't think it would hurt scalability as much as I was worried about, and also because I needed too many exceptions to check for raw IRIs (permissible in the "value" or object position, but not in the "entity" or subject position).
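A rough Python sketch of the "create an entity for every IRI" option. This is not how Kiara worked; the `to_datoms` and `is_iri` helpers, the integer entity ids, and the angle-bracket IRI convention (borrowed from N-Triples) are all assumptions made for illustration. The point is only that when every IRI gets an entity, subject and object positions can be handled by the same code path, with no special case for raw IRIs.

```python
from itertools import count


def is_iri(term):
    # Crude assumption: IRIs are written in angle brackets, as in N-Triples.
    return term.startswith("<") and term.endswith(">")


def to_datoms(triples):
    """Map triples to (entity_id, attribute, value) tuples.

    Every IRI in subject or object position is given an integer entity id;
    literals are kept as raw values. Predicates are left as IRI strings.
    """
    ids = {}
    next_id = count(1)

    def entity(iri):
        if iri not in ids:
            ids[iri] = next(next_id)
        return ids[iri]

    datoms = []
    for s, p, o in triples:
        subject_id = entity(s)
        value = entity(o) if is_iri(o) else o
        datoms.append((subject_id, p, value))
    return ids, datoms


triples = [
    ("<http://example.org/a>", "<http://example.org/knows>", "<http://example.org/b>"),
    ("<http://example.org/a>", "<http://example.org/name>", '"A"'),
]
ids, datoms = to_datoms(triples)
```

The cost is one extra entity per IRI, which is the scalability trade-off mentioned above.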
I would also probably work more with RDFS than I did. I work more with models now, while in the past I was happier with raw RDF. These days I'm thinking that it's not such a terrible thing to request that a schema be available (one can always be derived, of course)
It's been a few years since I looked at it though! I'm operating on memory that is like… a decade old at this point
LOL… all very helpful still!
then again, my current top challenge is "graph == neo4j" so I might just be tilting at windmills. not the first time I'd be described that way, to be fair.