
Hello, does anyone know if ChatGPT or similar AI tools will be able to produce big programs any time soon? They can already code-complete small pieces of code (10-100 lines), but I guess that's because they have been trained on millions of similar small snippets. It seems unlikely to me that they could produce a big program unless they have seen the same type of big program implemented thousands of times, but I don't have a clue how these deep learning things work.

Rupert (All Street) 08:02:58

The training sets typically include big codebases downloaded from GitHub in addition to small snippets from places like Stack Overflow. One key limitation preventing this is the context window size. Early versions of GPT-3 could only look backwards 2,000 to 4,000 tokens (approximately 1,500 to 3,000 words). This means that after the AI has generated 1,500 to 3,000 words of output, it can only see what it has produced and not what came before it (i.e. the prompt from the human). It may not even be able to see the namespace name or :require statements! This limitation can be fixed/improved over time with non-fixed context windows, larger fixed context windows, or some other approach. This is one reason why tools like Copilot can struggle to recommend/utilise existing functions/libraries within a large codebase - there is simply no way to feed all the relevant functions/libraries that might be needed into the prompt. Instead, Copilot in a large codebase often resorts to suggesting open source libraries (seen during training), re-inventing the wheel from scratch, or hallucinating new/made-up functions.
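The fixed-window effect described above can be sketched as a toy model (this is only an illustration, not the real tokenizer or attention mechanism; the 2,048 figure and all names here are illustrative):

```python
CONTEXT_WINDOW = 2048  # an early-GPT-3-style limit, in tokens (illustrative)

def visible_context(prompt_tokens, generated_tokens, window=CONTEXT_WINDOW):
    """Return the tokens the model can still attend to at the next step."""
    full = prompt_tokens + generated_tokens
    return full[-window:]  # anything earlier has scrolled out and is invisible

# e.g. the (ns ...) form and :require lines of a Clojure file:
prompt = [f"p{i}" for i in range(500)]
# a long generated function body, exactly one window's worth:
output = [f"g{i}" for i in range(2048)]

seen = visible_context(prompt, output)
print(len(seen))                              # 2048
print(any(t.startswith("p") for t in seen))   # False: the prompt is gone
```

Once the generated output alone fills the window, none of the original prompt tokens remain visible, which is why the model can "forget" the namespace and :require statements it was supposed to be working against.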

👍 1

Yes, it looks like it has problems combining with the context it has already produced, thank you :) I hope AI will be used for problems that we don't have enough people to solve; otherwise it will start replacing people, especially in knowledge-based jobs.


I think if you read this, you can get a pretty good idea of what ChatGPT is doing, and thus what its limits are:

👍 2

I can confirm that it hallucinates often. One of those hallucinations nearly ruined my entire PLT career. We shouldn't see it as a source of truth or knowledge; we should see it as a more developed version of what we used to try to find via Google or similar tools.