Dropsitenews published a list of websites that Facebook uses to train its AI. Multiple Lemmy instances are on the list, as noticed by user BlueAEther.

Hexbear is on there too. Also Facebook is very interested in people uploading their massive dongs to lemmynsfw.

Full article here.

Link to the full leaked list download: Meta leaked list pdf

    • FaceDeer@fedia.io
      11 days ago

      I’ve given my suggestion in other comments in this thread. In short: if you don’t want your comments to be seen by all, then don’t post them on a public forum that uses an open protocol specifically designed to broadcast your comments to everyone who cares to listen. Perhaps use some closed-off forum instead, preferably run by a large and litigious company that guards its possessions jealously.

        • FaceDeer@fedia.io
          11 days ago

          Got any citations about this being illegal? If it is, then the whole ActivityPub protocol is in trouble.

          • 反いじめ戦隊@ani.social
            11 days ago

            [Insert the copyright law of whatever jurisdiction you’re domiciled in] Consult your lawyer about copyright violations of federated content. I am not yours to violate.

            • FaceDeer@fedia.io
              11 days ago

              Have there been any relevant lawsuits you could point me to? Vaguely waving in the air and declaring “copyright” is not helpful.

                • FaceDeer@fedia.io
                  11 days ago

                  That article’s proposal is incompatible with how the Fediverse works. It proposes licensing models for viewing, printing, and copying, but all of this hinges on the content being delivered in a protected format that enforces those restrictions. It describes using encrypted “software envelopes” that check with a central server for authorization before allowing access. If content is freely accessible without technical restrictions, then legally, it’s considered published and available to the public.

                  I am never going to ask you for a license to read your posts. Go ahead, sue me.

                  • 反いじめ戦隊@ani.social
                    11 days ago

                    “If content is freely accessible without technical restrictions, then legally, it’s considered published and available to the public.”

                    That’s not how copyrighted content works. Consult your lawyer.

                    “I am never going to ask you for a license to read your posts. Go ahead, sue me.”

                    Thank you for your permission to send you to court.

    • qaz@lemmy.world
      9 days ago

      They’re just using very simple scrapers that don’t have any knowledge about how the site operates. The simplest counter would probably be using Anubis on the web interface.

      I wouldn’t mind waiting 2-3 seconds when first loading the site, and mobile apps would remain unaffected since they use the API.
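
For readers wondering why a tool like Anubis costs a visitor a couple of seconds, here is a minimal sketch of the general proof-of-work idea such tools rely on. It is not Anubis's actual code: the function names, the challenge format, and the difficulty value are all assumptions made for illustration.

```python
import hashlib
import secrets


def make_challenge() -> str:
    """Server side: issue a random challenge string (hypothetical helper)."""
    return secrets.token_hex(16)


def is_valid(challenge: str, nonce: int, difficulty: int) -> bool:
    """True if SHA-256(challenge + nonce) starts with `difficulty` zero hex digits."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce. This is the one-time wait on first load."""
    nonce = 0
    while not is_valid(challenge, nonce, difficulty):
        nonce += 1
    return nonce


if __name__ == "__main__":
    challenge = make_challenge()
    # Difficulty 4 is an assumed value; each extra hex digit costs about 16x more work.
    nonce = solve(challenge, difficulty=4)
    print("nonce:", nonce, "valid:", is_valid(challenge, nonce, 4))
```

Checking a solution takes the server one hash, while finding it takes the client many. A single human visitor barely notices the cost, but it adds up for a scraper fetching huge numbers of pages. In the setup described above, only the web interface would sit behind such a check, so apps talking to the API would skip it.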