Anubis is awesome! Stopping (AI)crawlbots

zoey@lemmy.librebun.com · edited 20 hours ago

Daniel Quinn@lemmy.ca · 15 hours ago

This all appears to be based on the user agent, so wouldn’t that mean that bad-faith scrapers could just declare themselves to be a typical search engine user agent?

SorteKanin@feddit.dk · 6 hours ago

Most search engines publish a list of verified IP addresses that their bots crawl from, so you could check the IP of a claimed search bot against that list to know.

SheeEttin@lemmy.zip · 14 hours ago

Yes. There’s no real way to differentiate.

SorteKanin@feddit.dk · 6 hours ago

Actually, I think most search engines publish a list of verified IP addresses that their bots crawl from, so you could check the IP of a claimed search bot against that list to know.
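For anyone curious, here is a rough sketch of what that IP check could look like. It is only an illustration, not how Anubis or any particular search engine does it: the function name `is_verified_crawler` and the `PUBLISHED_RANGES` list are made up for this example, and the ranges are reserved documentation blocks standing in for whatever list a search engine actually publishes.

```python
import ipaddress

# Hypothetical placeholder ranges (reserved documentation blocks). In practice
# these would be loaded from the list the search engine publishes.
PUBLISHED_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def is_verified_crawler(client_ip, published_ranges=PUBLISHED_RANGES):
    """Return True if client_ip falls inside any of the published ranges."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Not a parseable IP address, so it cannot be verified.
        return False

    for cidr in published_ranges:
        net = ipaddress.ip_network(cidr)
        # Only compare addresses and networks of the same IP version.
        if addr.version == net.version and addr in net:
            return True
    return False


# A request claiming a search engine user agent would only be trusted
# if its source IP also passes this check.
print(is_verified_crawler("192.0.2.17"))
print(is_verified_crawler("198.51.100.5"))
```

The point is that the user agent string alone is never trusted: a scraper can copy the string, but it cannot easily send traffic from an address inside the published ranges.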

Incoherent rant.

Behold, Anubis.

“Weighs the soul of incoming HTTP requests to stop AI crawlers”