Time will tell. I think the verdict is still out. Perhaps we can apply knowledge bases as a kind of restriction on the output of language models or other statistical models, especially if they’re doing actually important stuff.
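
A minimal sketch of the "knowledge base as a restriction on the output" idea, not a real system. It assumes the model's candidate claims have already been parsed into (subject, predicate, object) triples; the toy facts and the `first_supported` helper are hypothetical names made up for illustration.

```python
from typing import Iterable, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical toy knowledge base: a set of (subject, predicate, object) facts.
KNOWLEDGE_BASE: Set[Triple] = {
    ("aspirin", "treats", "headache"),
    ("aspirin", "interacts_with", "warfarin"),
}


def first_supported(candidates: Iterable[Triple], kb: Set[Triple]) -> Optional[Triple]:
    """Return the first candidate claim that the knowledge base contains, else None."""
    for claim in candidates:
        if claim in kb:
            return claim
    return None  # abstain rather than emit an unsupported claim


# Stand-in for ranked model output; a real system would have to extract
# these triples from generated text, which is the hard part.
ranked_candidates = [
    ("aspirin", "treats", "diabetes"),  # plausible-sounding but unsupported
    ("aspirin", "treats", "headache"),  # supported by the knowledge base
]
answer = first_supported(ranked_candidates, KNOWLEDGE_BASE)
```

Returning `None` instead of the top-ranked candidate is the point of the sketch: the statistical model proposes, and the knowledge base decides what is allowed out.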


Yeah, it seems likely to me that there’s a role for combining GOFAI with the more stochastic deep-learning techniques. It’s not that you can’t in principle train a neural net to do search, or logic, or mathematics; it’s just that, as I understand it, doing so wastes a huge number of parameters (space, training time, computation, etc.) on something we already know how to do efficiently. Also, if you look at AlphaGo/Zero etc. you’ll see that though the hype focussed on the neural networks learning how to play Go, the neural network didn’t learn Monte Carlo tree search or minimax; that was designed into the architecture, just like gradient descent was designed into the training process. So these systems are to some degree already a mixture of classical and modern techniques. I kind of suspect that in the future, if we progress towards AGI, these systems will learn to outsource that stuff to classical programs and techniques, and use them just like we do… i.e. many of the things humans aren’t good at, particularly mechanistic things, NNs won’t be good at either, and they’ll just write programs to do those bits for them, like we do.


An approach I would like to see would be an NN connected to a KMS (which is often a graph with an ontology). Systems like GPT are based on NNs (with various feedback cycles, etc.), and appear to be great for coming up with ideas, generating language around them, etc. Unfortunately, they are divorced from knowledge bases, and only include knowledge as an artifact of building the language model, which is why they appear confident when they spout nonsense. (Coincidentally, we do see this behaviour in humans as well, particularly politicians.) Connecting a language model to a knowledge base would fill a lot of the gaps that exist now. The thing is… I don’t know how to do this. I know there are approaches for building graphs using matrices, but I haven’t yet seen anything on connecting this to an NN. I also don’t know if there are other approaches. There may be great work going on that I don’t know about, and it’s definitely an area I need to research, but I certainly haven’t heard anyone claiming a Eureka moment in this integration. (And yes, this kind of integration is something I really need for my paid job. It’s not just curiosity on my part.)
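
For the "graphs using matrices" part, here is a rough sketch of one common encoding: one adjacency matrix per predicate, with graph traversal expressed as a vector-matrix product. The triples and helper names are invented for illustration, and it deliberately stops short of the open question above, which is how to wire such matrices into an NN.

```python
# Toy triples; the class names are made up for the example.
triples = [
    ("Dog", "subClassOf", "Mammal"),
    ("Cat", "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),
]

# Give every entity a fixed row/column index.
entities = sorted({e for s, _, o in triples for e in (s, o)})
index = {name: i for i, name in enumerate(entities)}
n = len(entities)

# adjacency[p][i][j] == 1 means (entities[i], p, entities[j]) is in the graph.
adjacency = {}
for s, p, o in triples:
    matrix = adjacency.setdefault(p, [[0] * n for _ in range(n)])
    matrix[index[s]][index[o]] = 1


def one_hot(name):
    """Encode a single entity as a 0/1 vector."""
    return [1 if i == index[name] else 0 for i in range(n)]


def hop(vector, matrix):
    """Vector-matrix product: follow every edge of one predicate once."""
    return [sum(vector[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def names(vector):
    """Decode a vector back into the entity names with non-zero weight."""
    return [entities[j] for j, v in enumerate(vector) if v]


parents = hop(one_hot("Dog"), adjacency["subClassOf"])
grandparents = hop(parents, adjacency["subClassOf"])
```

Here `names(parents)` gives `["Mammal"]` and `names(grandparents)` gives `["Animal"]`. Plain lists are used only to keep the sketch dependency-free; in practice this would be a tensor library, which is what makes the representation interesting as something an NN could consume.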

👍 2

Natural language to knowledge base will take more time, I guess, but knowledge base to natural language, which is then queried in natural language with the help of a language model, seems more feasible to me.


And it can make the knowledge base accessible to people who cannot write queries. But I don't know, I just thought of it as a possibility.
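
A small sketch of the "knowledge base to natural language" direction, under loud assumptions: the templates, predicates, and the `verbalize` and `build_prompt` helpers are all hypothetical, and the actual call to a language model is left out because it depends on whichever model or library is used.

```python
# Hypothetical sentence templates, one per predicate.
TEMPLATES = {
    "subClassOf": "Every {s} is a {o}.",
    "hasPart": "A {s} has a {o}.",
}


def verbalize(triple):
    """Turn one (subject, predicate, object) triple into an English sentence."""
    s, p, o = triple
    template = TEMPLATES.get(p, "{s} {p} {o}.")  # generic fallback wording
    return template.format(s=s, p=p, o=o)


def build_prompt(triples, question):
    """Assemble a prompt that asks the model to stay within the listed facts."""
    facts = "\n".join("- " + verbalize(t) for t in triples)
    return (
        "Answer using only the facts below. "
        "If they are not sufficient, say you do not know.\n"
        f"Facts:\n{facts}\n"
        f"Question: {question}\n"
    )


prompt = build_prompt(
    [("Dog", "subClassOf", "Mammal"), ("Dog", "hasPart", "tail")],
    "Is a dog a mammal?",
)
```

The instruction to say "I do not know" is only a request to the model, not a guarantee, so this does not by itself solve the hallucination problem.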


Prompting LLMs with RDF (and many other external data sources) is possible through libraries like Following the example in, I was able to list and query on some personal RDF with varying levels of accuracy. It was fun that it can answer very basic questions about subclasses. However, for me it's a commercial blocker that the default behavior is to hallucinate when an LLM doesn't know an answer. GPT-4 acknowledges these limitations, so hopefully this gets better soon.


I’m deeply concerned with this statement in Llama’s README: > LLMs are a phenomenonal piece of technology for knowledge generation and reasoning. No, no, no, 1000% no. LLMs cannot generate knowledge, unless you redefine “knowledge” to include bullshit (Frankfurt, 2005). And they are absolutely not capable of reasoning.

⬆️ 2

Agree mostly. Do you know of other libraries that allow you to use RDF data sources with an LLM?


Not so far. My current approach is looking at it from a more traditional QA perspective, possibly involving a query grammar.
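
To make "traditional QA with a query grammar" concrete, here is a toy sketch and only a guess at the general shape, not the approach described above. Each rule pairs a question pattern with a triple pattern; the `GRAMMAR` rules, the `TRIPLES` data, and the helper names are all hypothetical.

```python
import re

# Toy data; a real system would query an RDF store instead.
TRIPLES = {
    ("Dog", "subClassOf", "Mammal"),
    ("Cat", "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),
}

# Each rule: a question pattern and a function that builds a triple pattern,
# where None acts as a wildcard.
GRAMMAR = [
    (re.compile(r"^what is an? (\w+) a subclass of\?$", re.I),
     lambda m: (m.group(1), "subClassOf", None)),
    (re.compile(r"^what are the subclasses of (\w+)\?$", re.I),
     lambda m: (None, "subClassOf", m.group(1))),
]


def match(pattern, triples):
    """Return the triples that agree with every non-wildcard slot (case-insensitive)."""
    return sorted(
        t for t in triples
        if all(want is None or want.lower() == got.lower()
               for want, got in zip(pattern, t))
    )


def answer(question, triples=TRIPLES):
    """Answer a question the grammar covers; return None for anything else."""
    for regex, to_pattern in GRAMMAR:
        m = regex.match(question.strip())
        if m:
            return match(to_pattern(m), triples)
    return None  # outside the grammar: refuse instead of guessing
```

For example, `answer("What are the subclasses of Mammal?")` returns the `Cat` and `Dog` triples, while `answer("Why is the sky blue?")` returns `None`. The trade-off against an LLM is the obvious one: coverage is tiny, but every answer is traceable to a stored fact.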


it’s a research area. 🙂


Cool. Would be interested to hear what you come up with

👍 2

@U050N4M9Z I 1000% agree with your statement about knowledge/bullshit. For me the crucial point is that they’re indistinguishable to current LLMs. It seems GPT-4 is trying to solve that with RLHF, but I remain to be convinced that this will generalise at scale to data outside the training set. I think the research indicates that they are in principle capable of reasoning though. See for example Othello-GPT. I think the challenge is that this sort of reasoning can take a lot of space… for example, training a GPT on basic equations will almost certainly lead to it learning the rules of arithmetic; it’s just a very inefficient way to spend a million dollars of training budget. Incidentally, it seems GPT-4 is a lot better at maths problems, so they may have solved this one… though I don’t know what approach they took. Also, Khan Academy have a very impressive video demoing their GPT-4 integration, where it will tutor you through maths problems in the Socratic style.


The key thing there is “indistinguishable”. The humans are the ones doing the sensemaking and projecting it back. It’s extremely difficult for us to not imagine an intelligence behind it, but it’s literally all in our heads.


We’re in fierce agreement on the indistinguishable thing. On the sensemaking/projecting thing, I think the Othello-GPT link shows that they are building models of the world, and it seems to me that those models can do sensemaking/reasoning. From a black-box perspective the current problem is you don’t know if it went through the sensemaking apparatus or the nonsense-making apparatus. Regarding projecting intelligence onto it: I certainly see lots of people doing that… Maybe I am too, but I think I differ in that I have more than half a foot in the camp that the magic of intelligence is overrated, and that almost all imaginable systems are exhibiting some level of intelligence. I.e. intelligence is an emergent quantity, and there isn’t really a qualitative difference between intelligence and any systemic behaviour. Or if there is a qualitative difference, it’s likely banded into something analogous to a Chomsky hierarchy. I am just an amateur armchair philosopher in this field though…


This is a good place to start grounding the argument:

👍 2

(full disclosure: Emily Bender trained me 🙂 )


@U050N4M9Z Thanks for the paper link 🙇 . I think it’s a pretty good, and importantly accessible, critique of what is going on in the field. I agree it’s important to also take a top-down view whilst making bottom-up progress; this feels largely like a restatement of Kuhn’s theory of scientific revolutions, but with different terms. Essentially it’s clear there’s a Cambrian explosion of approaches in modern AI, and debatably even an explosion of competing paradigms, or mini-paradigms (hills), within which people are working bottom-up. The cautionary words strike me as being very wise, as do the points/risks around the anthropomorphised terminology.
I certainly agree that LLMs don’t have a good sense of goals/intent, and that goals, or to frame it in other GOFAI terms BDI (beliefs/desires/intentions), are important to invalidating beliefs and BS. However, I think in a weaker sense LLMs do have goals. Firstly, I think they “learn” latent goal(s) in training; the goals aren’t necessarily what we want or think they are, but they are there, though more often than not misaligned… e.g. we want the chatbot’s goal to be writing true and good answers, but it learned that the easier goal of bullshitting and producing plausible-sounding answers was sufficient to excel in the training environment. Arguably all approaches within the paradigm of having a training environment (which is separate from the real-world environment) will adopt essentially the same goal of excelling in the training environment. Alignment research strikes me as one of the most important sub-fields we need to make progress in. Secondly, I think you can argue that an LLM prompt serves as a latent goal, in that the prompt forces the model to adopt a simulacrum, or persona, which will have a latent/implicit goal shaped by the form of the prompt. In order to do really good next-word prediction, you need to simulate the world model of something… e.g. to provide a medical answer, you need to adopt the simulacrum of a doctor, and an LLM may have implicitly learned the Hippocratic oath, or at least, given the first point, know to respond in forms that latently signal that; capacity for BS notwithstanding. Thirdly, I think fine-tuning can be used to embed more concrete goals.
I’m also not convinced by section 3.1, partly because I share the contrarian position of Noam Chomsky, which is that language is not really about communication at all, that communication was just a by-product of its evolution, and that human language is almost exclusively a tool for understanding, evidenced by almost all language use (well over 99.9%) being internal. That’s not to say communication isn’t useful or important, just that it represents probably less than 0.1% of all language use/utility. I should add that Chomsky’s view for the last 30 years has been that statistical approaches, neural nets (and now LLMs) are essentially stochastic parrots, and not capable of real insight. In light of recent evidence, I’m not sure I share this view anymore. So I think a definition of meaning relying on communicative intent is misplaced. I’d personally tend towards definitions where a form or utterance’s degree of meaningfulness pertains to how much it represents an identifiable property, or even abstraction, of its world/context. I’m also essentially a Buddhist, so I think meaning extends beyond language; it is latent in the environment itself, and the environment also includes language. Language is basically a tool for map-making, and the map is not the territory (Korzybski); but nevertheless a good map contains meaning, and that meaning is a measure of its utility. So for me language provides a method of symbolic reasoning and of mining or constructing meaning out of either internalised or externalised forms.
I also find the statement “We argue that a model of natural language that is trained purely on form will not learn meaning: if the training data is only form, there is not sufficient signal to learn…” to be a bit of a strawman. LLMs aren’t trained on form alone; they are also given strong signals through training, e.g. RLHF, which reward them for finding meaning in the forms they are fed; I think this is a very important neglected detail. To be fair, other signals are mentioned in the counterarguments, but not this one, which I feel is more important than any listed.
Regardless, I really enjoyed the paper, and definitely share many other views expressed in it. Particularly the view that current LLMs have access to just a reflection of human meaning; after being fed a diet of only words, they lack a wider sense of experience. So in some sense they’re book smart, but with no common sense, lived experience, wisdom, or presently capacity to distinguish BS… however I don’t think that means ML/neural nets aren’t in principle capable of attaining it. I think these problems are reasonably well understood by the researchers, though it’s not clear we have good solutions to them (yet).

👍 4

Glad you enjoyed it! It’s probably worth noting that the paper overlaps somewhat with the rise of the RLHF component, but I would also counter that a) RLHF rewards the models for presenting forms that humans correlate with meaning, which is not a trivial distinction in this context, and b) we don’t actually know what the humans did; the models are opaque. I don’t see how LLMs could be anything other than stochastic parrots — there’s no “there” there, as the saying goes. Where I definitely part ways with Manning (and Chomsky, to a great extent) is that that entails human brains as stochastic parrots. ANNs are loosely inspired by the ~human nervous system, but only by analogy, and there are incredibly complicated architectural and feedback mechanisms that we don’t even understand, much less can simulate, and it doesn’t necessarily follow that better simulation will get us there, any more than running faster will get us to Mars (not my metaphor, sadly). Over and above all of that, though, my real concern is that the current dominant researchers (and funders) don’t actually seem to take any of the issues seriously, and certainly aren’t engaging with the critiques in any meaningful way. “We promise we’re working on safety” is not particularly trustworthy, given that both MS and Google have fired their AI ethics teams. And it’s all unreproducible results, at staggering cost in resources, and sucking the oxygen away from any other avenues of NLP research.


I’d love to discuss this deeper, maybe there’s a more appropriate place for it…
> a) RLHF rewards the models for presenting forms that humans correlate with meaning
If you’re pointing out there are subtle and potentially serious alignment problems here, I agree entirely. However, I think it’s worth picking holes in our wording. For example, I’d argue that in human-to-human communication all we do is “present forms” to each other, and those forms are only “fingers pointing to the moon of meaning”; the degree to which they do that (given a listener’s internal state of knowledge) only ever has a correlation with meaning. Communicative forms are just a map of a map and are pretty far removed from the territory, the second map being an internal state of mind, or a model, in the communicator. I was a bit sloppy with my wording when I said “I think meaning extends beyond language, it is latent in the environment itself”. I was essentially equating meaning to information, and I think meaning may be better thought of as “information relative to a model”. You find meaning in information or a statement if it aligns with your model, and you can either extend or revise that model on the basis of it.
> I don’t see how LLMs could be anything other than stochastic parrots — there’s no “there” there, as the saying goes.
I believe many things we have trouble pinning down are emergent properties of complex systems. Consciousness, thought, and intelligence all fall into this bucket. How can any of this arise from dumb atoms? You can put it down to the existence of a soul, or you can say we can’t make progress here until we understand ourselves first, or alternatively you can hope that AI R&D is a way to finally turn 2000 years of philosophy of mind into a science. I’d rather assume that intelligence (human or otherwise) isn’t something magic or unique to humans, but is an emergent property of complex systems. Which is another way of saying I agree there’s no “there” there, but that’s OK, there’s no need for there to be one. I do however agree entirely with your other points about not having a good handle on the complexity and feedback mechanisms involved… though I do think AI research has pushed the envelope a little on helping untangle those complexities. Similarly, I agree that safety and alignment are HUGE issues. I can’t comment on the impact of this on academic funding, but I can entirely believe it.


Thanks for this fascinating discussion! And also the paper on NLU. @U06HHF230 it sounds like you might be sympathetic to some kind of causal-informational theory of meaning (a la Fodor?) whereas I get the sense that a lot of linguists have more of an "internalist" (Chomsky, Pietroski) or "mentalist" (Grice) view that emphasizes intentions or inferences. Sorry if I'm lumping too much together, my background is more in philosophy of language than linguistics. Also sorry for digging up this old thread, feel free to direct message me or not 🙂


This is literally the problem I’m getting paid to work on right now. 🙂

🙂 2