Will check this out. Thanks!
Thank you for the detailed reply.
Keeping on top of this is a full-time job!
I guess that’s why I’m interested in a tooling-based solution. My self-hosting is small-fry junk, but a lot of others like me are hosting entire fedi communities or larger websites.
In that case I’m interested in tools to automate doing that.
I hadn’t heard of that before, thanks for the link.
I haven’t read through the docs yet… But PoW makes me wonder what the work is and if it’s cryptocurrency related.
Edit: Found it: https://altcha.org/docs/proof-of-work/
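For anyone else wondering what the "work" in a proof-of-work captcha can look like, here is a rough Python sketch of a generic hash-based challenge. It is only an illustration of the general idea, not ALTCHA's actual API: the function names, the salt format, and the `max_number` value are all made up for this example.

```python
import hashlib
import secrets
from typing import Optional, Tuple

MAX_NUMBER = 50_000  # assumed difficulty knob; higher means more client work


def make_challenge(max_number: int = MAX_NUMBER) -> Tuple[str, str, int]:
    """Server side: pick a secret number, publish only the salt and the hash."""
    salt = secrets.token_hex(8)
    number = secrets.randbelow(max_number + 1)
    challenge = hashlib.sha256(f"{salt}{number}".encode()).hexdigest()
    return salt, challenge, number


def solve(salt: str, challenge: str, max_number: int = MAX_NUMBER) -> Optional[int]:
    """Client side: brute-force the number. This loop is the 'work'."""
    for candidate in range(max_number + 1):
        digest = hashlib.sha256(f"{salt}{candidate}".encode()).hexdigest()
        if digest == challenge:
            return candidate
    return None


def verify(salt: str, challenge: str, candidate: int) -> bool:
    """Server side: checking a submitted answer costs a single hash."""
    return hashlib.sha256(f"{salt}{candidate}".encode()).hexdigest() == challenge


if __name__ == "__main__":
    salt, challenge, _secret = make_challenge()
    answer = solve(salt, challenge)
    print("solved:", answer is not None and verify(salt, challenge, answer))
```

The point of the asymmetry is that the visitor has to compute many hashes while the server only computes one to check the answer. Nothing here involves a blockchain or mining coins; it is just hashing.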
In the Hacker News comments for that geraspora link, people discussed websites shutting down due to hosting costs, which may be attributed in part to the overly aggressive crawling. So maybe it’s just a different form of DDoS than we’re used to.
A commenter in the Hacker News post has created this: https://marcusb.org/hacks/quixotic.html
I’m interested, but it seems like an easy way for bots to exhaust your own server resources before they give up crawling.
Thank you for the detailed response. It’s disheartening to consider the traffic is coming from ‘real’ browsers/IPs, but that actually makes a lot of sense.
I’m coming at this from the angle of AI bots ingesting a website over and over to obsessively look for new content.
My understanding is there are two reasons to try blocking this: to protect bandwidth from aggressive crawling, or to protect the page contents from AI ingestion. I think the former is doable, and the latter is an unwinnable task. My personal reason is that I’m an AI curmudgeon: I’d rather spend CPU resources blocking bots than serving any content to them.
Thank you for the reply, but at least one commenter claims they’ll impersonate Chrome UAs.
That’s pretty neat. Thanks!