This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2023-03-15
The idea of connecting LLMs to knowledge bases does seem like a good PhD research topic :thinking_face:
Indeed… Though I suspect it’d be “relatively straightforward”* to train or fine-tune a “supervisor NN” that checks a knowledge base as part of the reward function, acting as a mesa-optimiser, and then use that to train the base model, or take something like an adversarial-learning approach. I think the bigger problem is likely the quality and coverage of existing knowledge bases. There are also approaches where you train it to fact-check its own output… I think Bing’s GPT had the ability to take actions like searching the web, so you could reasonably teach it to fact-check against an oracle, then use RLHF to verify that it fact-checked the right things, in the right context, and correctly interpreted the results coming back from its prompt to the fact-checker. * Obviously by “relatively straightforward” I mean straightforward for experts to do, not for me (a total layman).
I'm very confident that an interesting approach will be developed with the release of GPT-4, which has a technical report (https://cdn.openai.com/papers/gpt-4.pdf) with lots of detail on (checks notes): absolutely nothing about how it fits together: > "Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar."
That right there is “cannot be safely used for any purpose other than (arguably) research”.
coming back to the channel topic though… 🙂 @quoll I was reminded again of your kiara adapter for datomic … can you say a bit about how it informed your later approaches? or put another way — if you needed something like that today would you still use the same broad approach?
I would probably work more with IRIs (URIs on the JVM) and not try to use keywords as much as I did. That’s for scalability reasons.
I would also be more willing to create entities for each IRI, and not try to eliminate those as much as I did. Partly because I don’t think it would hurt scalability as much as I feared, and also because I needed too many exceptions when checking for raw IRIs (permissible in the “value” or object position, but not in the “entity” or subject position).
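Purely for illustration (this is not the Kiara code, and all names here are made up): the asymmetric position rule described above can be sketched as a validator that accepts a raw IRI in the object position but requires the subject to resolve to an entity. Creating an entity for every IRI, the approach preferred in hindsight, makes the special case disappear.

```python
# Hypothetical sketch of the subject/object position rule described above.
# A raw IRI is allowed as an object (value), but a subject must be an
# entity known to the store.

def is_iri(x):
    """Very rough IRI check: a string containing a scheme separator."""
    return isinstance(x, str) and "://" in x

def validate_triple(subject, predicate, obj, entities):
    """Subjects must be known entities; objects may be entities,
    raw IRIs, or plain literals."""
    if subject not in entities:
        raise ValueError(f"subject {subject!r} must be an entity")
    return (subject, predicate, obj)

# If an entity is created for *every* IRI up front, this asymmetric
# check is no longer needed:
entities = {"http://example.org/alice", "http://example.org/bob"}
triple = validate_triple("http://example.org/alice",
                         "http://xmlns.com/foaf/0.1/knows",
                         "http://example.org/bob",
                         entities)
```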
I would also probably work more with RDFS than I did. I work more with models now, while in the past I was happier with raw RDF. These days I’m thinking that it’s not such a terrible thing to request that a schema be available (one can always be derived, of course)
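A minimal sketch of “a schema can always be derived”: one pass over raw triples, recording for each predicate whether its values are IRIs or literals, already yields a rough RDFS-like description. This is an illustrative toy, not any particular library’s API; the data is invented.

```python
from collections import defaultdict

# Illustrative only: derive a rough schema from raw triples by noting,
# per predicate, whether the object position holds IRIs or literals.

triples = [
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", "Alice"),
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows",
     "http://example.org/bob"),
]

def derive_schema(triples):
    schema = defaultdict(set)
    for _s, p, o in triples:
        kind = "iri" if isinstance(o, str) and "://" in o else "literal"
        schema[p].add(kind)
    return dict(schema)

schema = derive_schema(triples)
# e.g. foaf:name -> {'literal'}, foaf:knows -> {'iri'}
```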
It’s been a few years since I looked at it though! I’m operating on memory that is like… a decade old at this point
then again, my current top challenge is “graph == neo4j” so I might just be tilting at windmills. not the first time I’d be described that way, to be fair.
From an efficiency POV, connecting objects in a graph makes sense. Datomic does this too, insofar as those objects are “entities”. I like the homogeneity of RDF though. Maybe people prefer having properties on objects be separate from the edge labels, but I like the way RDF allows greater flexibility here. We see it with things like SKOS notations, where the value can be structured or a literal.
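To illustrate the flexibility mentioned above, here is a hedged plain-Python sketch of the same skos:notation property pointing at either a plain literal or a datatyped (structured) value. The skos:notation IRI is real; the example concept and datatype IRIs and the helper are hypothetical.

```python
# Sketch: one property, two value shapes. A value is modelled as a
# (lexical-form, datatype-IRI-or-None) pair.

SKOS_NOTATION = "http://www.w3.org/2004/02/skos/core#notation"

# Plain literal value:
t1 = ("http://example.org/concept/42", SKOS_NOTATION, ("42", None))

# Typed value: the literal carries its own (made-up) datatype IRI,
# giving it structure beyond a bare string.
t2 = ("http://example.org/concept/42", SKOS_NOTATION,
      ("42", "http://example.org/ns/hexCode"))

def value_datatype(triple):
    """Return the datatype IRI of a triple's value, or None if plain."""
    _s, _p, (_lex, dtype) = triple
    return dtype
```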