What's the correct license for vibecoded code? Am I liable when the big AI corps get convicted for stealing licensed code/IP? Just asking for a friend 🙂 It feels strange to put a license on code that is generated by a machine.
Currently some of them offer to cover your lawsuit fees if you get sued.
Well, but does anyone have any doubts they used pirated content to train their models? I mean... 😂
@afoltzm that collocation is spot on. Never thought about such a scenario. Pretty interesting
I have been using MIT for code from AI. I'm no lawyer, but it's definitely murky territory, and it will take years for the courts to figure it out. I do think it's ridiculous that these companies hoovered up all the data they could, even illegally, and now resell it back to everyone, mashed up, at a premium.
Well, in a hypothetical legal dispute, which party isn't part of mega-investments most likely co-opted by the government in that jurisdiction? Weigh that against the level of trust in the judicial system of the particular country.
one could imagine a fully GPL-respecting LLM that would automatically release every line of code it generates under the GPL in its own public database of output. that way it would at least attempt to maintain the intellectual commons rather than enclosing it. but it is also easy to see why no company would ever invest in a tool like that
The AI companies have already gotten a pass for what they do (and have done). There is no future where these companies would accept a regression in the quality of their models by beginning to respect ownership rights. I have gotten answers from ChatGPT about the internal implementation of source-unavailable C# systems that basically aren't published anywhere in the public web, and I've confirmed some of the answers to be accurate by decompilation. Do you think that if, say, a bank ends up practically copying the work of such source-unavailable systems by using LLMs, it wouldn't be a target of litigation just by saying "that's what the LLM produced"?
> I have gotten answers from ChatGPT about the internal implementation of source-unavailable C# systems that basically aren't published anywhere in the public web

I am no expert, but that seems like a prompt that would be highly prone to hallucinated replies
Sure, that's why I was cross-checking the answer. But the C# methods described, their exact signatures, and the non-standard, domain-specific naming hinted that the source code was part of the corpus of data available to it. There was definitely enough bullshit in the rest of the chat session. It's not too dissimilar to how people early on showed that paywalled scientific papers and other copyrighted materials are part of the corpus of data.