• HiddenLayer555@lemmy.ml

    The question is: What is an effective legal framework that focuses on the precise harms, doesn’t allow AI vendors to easily evade accountability, and doesn’t inflict widespread collateral damage?

    This is entirely my opinion and I’m likely wrong about many things, but at minimum:

    1. The model has to be open source: freely downloadable, freely runnable, and copyleft, satisfying the distribution requirements of any copyleft source material it was trained on (I’m willing to give a free pass to making it copyleft in general rather than matching each source’s exact license, since different copyleft licenses can have contradictory distribution requirements; IMO the leap from permissive to copyleft is the more important part). I suspect this alone would kill the AI bubble, because as soon as they can’t exclusively profit off it they won’t see AI as “the future” anymore.

    2. All training data needs to be freely downloadable and independently hosted by the AI creator. It goes without saying that only material you can legally copy and host on your own server can be used as training data. This solves the IP theft issue: IMO, if your work is licensed such that it can be redistributed in its entirety, it should logically also be okay to use it as training data, and if you can’t even legally host it on your own server, using it to train AI is off the table. The independently hosted dataset (complete with metadata about where each item came from) also serves as attribution, since creators can then search the training data for their own work (see the first sketch after this list).

    3. Pay server owners for the use of their resources. If you’re scraping for AI, you need, at the very least, a way for server owners to send you bills. And no content can be scraped from the original source more than once (see point 2, and the second sketch after this list).

    4. Either have a mechanism for tracking attribution and accurately generating references alongside the generated code (a rough sketch of what that output could look like follows this list), or, if that’s too challenging, I’m personally also okay with a blanket policy where anything AI-generated is public domain. The idea that you can use AI-generated code derived from open source in your proprietary app, and then sue anyone who has the audacity to copy your AI-generated code, is ridiculous and unacceptable.
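
    For point 2, here’s a minimal sketch of the kind of per-document metadata record the hosted dataset could carry so that it doubles as an attribution index. The record fields and the find_by_creator helper are names I made up for illustration, not any existing standard:

        # Hypothetical metadata record for one document in a published training set.
        from dataclasses import dataclass

        @dataclass
        class TrainingRecord:
            source_url: str    # where the document was originally obtained
            creator: str       # author/creator named in the source metadata
            license_id: str    # SPDX identifier, e.g. "CC-BY-SA-4.0" or "GPL-3.0-or-later"
            content_path: str  # path to the hosted copy inside the published dataset

        def find_by_creator(records: list[TrainingRecord], creator: str) -> list[TrainingRecord]:
            """Let a creator search the published dataset for their own works."""
            return [r for r in records if r.creator == creator]

        if __name__ == "__main__":
            dataset = [
                TrainingRecord("https://example.org/post/1", "alice", "CC-BY-SA-4.0", "docs/0001.txt"),
                TrainingRecord("https://example.org/code/lib", "bob", "GPL-3.0-or-later", "code/0002.tar.gz"),
            ]
            for hit in find_by_creator(dataset, "alice"):
                print(hit.source_url, hit.license_id)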
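
    For point 3, a minimal sketch of “scrape once, and be billable”: the crawler keeps a local ledger of URLs it has already fetched, so nothing is pulled from the origin twice, and identifies itself with a contact address that server owners can invoice. The ledger format, crawler name, and contact address are all assumptions:

        import json
        import urllib.request
        from pathlib import Path

        LEDGER = Path("fetched_urls.json")          # hypothetical local scrape ledger
        BILLING_CONTACT = "billing@example-ai.org"  # hypothetical invoicing contact

        def fetch_once(url: str) -> bytes | None:
            """Fetch a URL at most once; repeat requests never hit the origin again."""
            seen = set(json.loads(LEDGER.read_text())) if LEDGER.exists() else set()
            if url in seen:
                return None  # already scraped; refuse to re-fetch (points 2 and 3)
            req = urllib.request.Request(url, headers={
                "User-Agent": f"example-ai-crawler (billing: {BILLING_CONTACT})",
            })
            with urllib.request.urlopen(req) as resp:
                body = resp.read()
            seen.add(url)
            LEDGER.write_text(json.dumps(sorted(seen)))
            return body

        # Usage (commented out to avoid a live network call):
        # fetch_once("https://example.org/article")  # fetches and records the URL
        # fetch_once("https://example.org/article")  # returns None, origin untouched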
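
    And for point 4, a sketch of what “references generated along with the code” could look like on the output side. How the relevant sources would actually be identified is the hard part this deliberately skips; the function and field names are hypothetical:

        def with_attribution(generated_code: str, sources: list[dict]) -> str:
            """Prepend a comment block listing source, license, and creator for each match."""
            header = ["# Derived in part from:"]
            for src in sources:
                header.append(f"#   {src['url']} ({src['license']}, by {src['creator']})")
            return "\n".join(header) + "\n" + generated_code

        if __name__ == "__main__":
            snippet = "def clamp(x, lo, hi):\n    return max(lo, min(x, hi))\n"
            refs = [{"url": "https://example.org/code/utils", "license": "MIT", "creator": "carol"}]
            print(with_attribution(snippet, refs))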