Nvidia accused of trying to cut a deal with Anna’s Archive for high‑speed access to the massive pirated book haul

FundMECFS@anarchist.nexus · 1 day ago

themurphy@lemmy.ml · 1 day ago

I also identify as an LLM who needs training. Then it’s okay, right?.. Right?

Goodlucksil@lemmy.dbzer0.com · 1 day ago

Are you rich? No? Shut up.

The SCOTUS

stephen01king@piefed.zip · 1 day ago

Was there a court case where the decision was that pirated data is legally allowed to be used for LLM training?

i_stole_ur_taco@lemmy.ca · 1 day ago

It’s 2026 in the worst timeline. You don’t ask that anymore. You ask if any entity faced consequences for doing it.

far_university1990@reddthat.com · 1 day ago

https://www.cnet.com/tech/services-and-software/meta-won-its-ai-fair-use-lawsuit-but-judge-says-authors-are-likely-to-often-win-going-forward/

Meta’s use of copyrighted books to train its Llama AI was fair use, a judge ruled.

“This ruling does not stand for the proposition that Meta’s use of copyrighted materials to train its language models is lawful,” he wrote. “It stands only for the proposition that these plaintiffs made the wrong arguments and failed to develop a record in support of the right one.”

The plaintiffs focused their arguments on how Meta’s AI models can reproduce exact snippets from their works and how the company’s Llama models hurt their ability to license their books to AI companies. These arguments weren’t as compelling in Chhabria’s eyes – he called them “clear losers” – so he sided with Meta.

That’s different from the Anthropic ruling, where Judge William Alsup focused on the “exceedingly transformative” nature of the use of the plaintiff’s books in the results AI chatbots spit out. Chhabria wrote that while “there is no disputing” that the use of copyrighted material was transformative, the more urgent question was the effect AI systems had on the ecosystem as a whole.

Maybe? Not lawyer, but sound like train might fair use? And generate not?

stephen01king@piefed.zip · 24 hours ago

But that judgement clearly had nothing to do with the use of pirated material, right? It might give a partial pass to the use of copyrighted material for training LLMs, but it says nothing about pirating material being legal if it is used for training LLMs, which the top comment was alluding to.

far_university1990@reddthat.com · 24 hours ago

https://torrentfreak.com/meta-secures-bittersweet-fair-use-victory-in-ai-piracy-case-250626/

Yesterday, U.S. District Court Judge Vince Chhabria ruled on both motions, which at first sight offers a clear win for Meta. The court denied the authors’ motion to hold Meta liable for direct copyright infringement after it obtained pirated books from shadow libraries via BitTorrent.

Did have piracy part. Just not listed on first website.

stephen01king@piefed.zip · 23 hours ago

Thanks for the source. It also seems like the distribution part is not ruled on yet, so we don’t know if they’ll get away with pirating stuff just yet.

far_university1990@reddthat.com · 23 hours ago

Yes. Apparently meta try to only leech by modify config. But also say not use facebook server/ip to mask any seed. So not sure if actually seed. Or if matter at all.

petrescatraian@libranet.de · 17 hours ago

Hmmm, that got me thinking: if you self-host, you make sure you also install ollama or some other LLM you can self-host. You don’t need to use the LLM yourself at all. Then if something goes south and you’re accused of piracy, you can just argue in your defense that you used all these materials to train your own LLM. That should get you out of trouble, right?

far_university1990@reddthat.com · 8 hours ago

If you billion dollar company. Probably not if individual.

petrescatraian@libranet.de · 6 hours ago

@far_university1990 yes but the legal precedent has been set, lol

(/s maybe)

far_university1990@reddthat.com · 24 hours ago

That part not, but Meta pirate lot of material. Think that always part of judgement? Will look up case more.

stephen01king@piefed.zip · 23 hours ago

There might be a different court case for the piracy part. I’ll also keep a look out for them.

themurphy@lemmy.ml · 1 day ago

Don’t know, ask an LLM.

stephen01king@piefed.zip · 24 hours ago

If you don’t know, where did you get the idea it would be okay to pirate books if it is used to train an LLM?

comrade_twisty@feddit.org · edit-2 1 day ago

Better yet, I identify as a human who craves knowledge that’s not AI generated.

FuckyWucky [none/use name]@hexbear.net · 1 day ago

your brain is training with new data

homes@piefed.world · 1 day ago

And I bet Nvidia AI systems get to train on that massive pirate haul of literature, now available to Nvidia without worrying about any messy copyright bullshit.

the_q@lemmy.zip · 1 day ago

Money > law.

FundMECFS@anarchist.nexus · 23 hours ago

But also money writes law. The copyright laws weren’t written to protect the common Joe.

They were pushed by powerful publisher lobbies back in the day.

Aceticon@lemmy.dbzer0.com · 23 hours ago

The obligation to obey the copyright of others is only for the riff-raff.

LadyMeow@lemmy.blahaj.zone · 18 hours ago

The obligation to obey ~~the copyright of others~~ anything is only for the riff-raff.

apotheotic (she/her)@beehaw.org · 1 day ago

Insert “he’s so sweet” vs “um, HR??” comic panels

Nvidia accused of trying to cut a deal with Anna’s Archive for high‑speed access to the massive pirated book haul — allegedly chased stolen data to fuel its LLMs