Spotify Music Library Scraped by Pirate Activist Group

tfm@europe.pub · 1 day ago

Nilz@sopuli.xyz · 1 day ago

Download all existing literature to build a library for preservation and you’re called a pirate. Download all existing literature from aforementioned library to train an LLM and you’re a tech innovator. What a strange world we live in.

Galactose@sopuli.xyz · edit-2 14 hours ago

Hey, let’s create our own LLM, or something that can pass as an LLM 😏 Maybe then we can get away with the pirating.

Nilz@sopuli.xyz · 10 hours ago

Are you rich? Otherwise we’ll still be arrested.

Schmoo@slrpnk.net · 1 day ago

If we’re pirates then they’re privateers, and I know which I respect less.

P03 Locke@lemmy.dbzer0.com · edit-2 1 day ago

> Download all existing literature to build a library for preservation and you’re called a pirate.

Said library contains petabytes of the exact text of each and every piece of literature.

> Download all existing literature from aforementioned library to train an LLM and you’re a tech innovator.

Said model contains gigabytes of a bunch of weights that can never go back to the exact words of the book.

> What a strange world we live in.

It’s not strange at all. It’s degrees of compression. You compress a JPEG to the point that it’s unrecognizable, and it’s no longer breaking copyright. It’s essentially like trying to write a book you just read based on memory.

hexagonwin@lemmy.sdf.org · 14 hours ago

so you’re saying degrading quality while getting filthy rich by stealing everyone else’s work is better than archival efforts? not sure what your point is.

Nilz@sopuli.xyz · 10 hours ago

His point is basically that if you remove every 5th word of a book it’s legal to hoard as it’s compressed.

01011@monero.town · 15 hours ago

Caping for big tech?

Nasty work.

upstroke4448@lemmy.dbzer0.com · 1 day ago

Lol Meta literally torrented 81 TB of data from the site. Stop with this “degrees of compression” bs

Schmoo@slrpnk.net · 1 day ago

> Said model contains gigabytes of a bunch of weights that can never go back to the exact words of the book.

And yet, the tech bros do have access to the exact words. The only difference is that they don’t share, instead choosing to extract value from it by training an LLM and (eventually, hypothetically) turn a profit. The product is created by processing the intellectual labor of billions of people into a formless amalgam of human creativity, which is then exploited for their private benefit.