cross-posted from: https://europe.pub/post/7719730
cross-posted from: https://europe.pub/post/7719728
Here it is: https://annas-archive.org/blog/backing-up-spotify.html
Download all existing literature to build a library for preservation and you’re called a pirate. Download all existing literature from aforementioned library to train an LLM and you’re a tech innovator. What a strange world we live in.
Hey, let’s create our own LLM, or something that can pass as an LLM 😏 Maybe then we can get away with the pirating.
Are you rich? Otherwise we’ll still be arrested.
If we’re pirates then they’re privateers, and I know which I respect less.
Said library contains petabytes of the exact text of each and every piece of literature.
Said model contains gigabytes of a bunch of weights that can never go back to the exact words of the book.
It’s not strange at all. It’s degrees of compression. You compress a JPEG to the point that it’s unrecognizable, and it’s no longer breaking copyright. It’s essentially like trying to write a book you just read based on memory.
so you’re saying degrading quality while getting filthy rich by stealing everyone else’s work is better than archival efforts? not sure what your point is.
His point is basically that if you remove every 5th word of a book, it’s legal to hoard because it’s “compressed”.
Caping for big tech?
Nasty work.
Lol Meta literally torrented 81 TB of data from the site. Stop with this “degrees of compression” bs
And yet, the tech bros do have access to the exact words. The only difference is that they don’t share, instead choosing to extract value from it by training an LLM and (eventually, hypothetically) turn a profit. The product is created by processing the intellectual labor of billions of people into a formless amalgam of human creativity, which is then exploited for their private benefit.