No more Standard Search for standard.site

17 June 2026

Recently I was working on some DB optimisations on Standard Search, my hobby search engine for pages which have standard.site records on AT Protocol. Partway through, as I sifted through the data, I decided to shut it down.

One of the fun™ things about standard.site records is that anyone can create records for any URL. There's nothing stopping you. You're master of your own PDS. If you are an appview, i.e. some sort of service that wants to ingest and work with published standard.site records, then one of the things you need to do is verify each linked page. Each one is supposed to have a <link> tag in its <head> pointing back to the matching atproto record, so everything checks out. This way someone can't just point to a random URL and claim it as their own.

This is all well and good for end users, who never see the pages that fail verification, but appview servers need to go to these websites and download the pages to see if the verification tags exist. This means resolving the domain name, making an HTTP request for the exact URL named in the record, and inspecting the response programmatically.
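To make the inspection step concrete, here is a minimal sketch in Python of checking already-downloaded HTML for a back-link. It is not the code Standard Search ran. The `rel` value, the helper names and the example AT URI are all assumptions for illustration, so check the standard.site documentation for the real tag format.

```python
from html.parser import HTMLParser

# Assumption: placeholder rel value, not taken from the standard.site spec.
ASSUMED_REL = "site.standard.document"


class HeadLinkCollector(HTMLParser):
    """Collects (rel, href) pairs from <link> tags found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:
            attributes = dict(attrs)
            # Attribute values can be None when written without a value.
            rel = attributes.get("rel") or ""
            href = attributes.get("href") or ""
            self.links.append((rel, href))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def page_links_back(html, record_uri, rel=ASSUMED_REL):
    """Return True if the page's <head> has a <link> pointing at record_uri."""
    collector = HeadLinkCollector()
    collector.feed(html)
    return any(
        rel in found_rel.split() and found_href == record_uri
        for found_rel, found_href in collector.links
    )


# Hypothetical record URI and page, purely for illustration.
EXAMPLE_URI = "at://did:plc:example/site.standard.document/abc123"
EXAMPLE_PAGE = (
    "<html><head><title>Post</title>"
    f'<link rel="{ASSUMED_REL}" href="{EXAMPLE_URI}">'
    "</head><body><p>Hello</p></body></html>"
)
verified = page_links_back(EXAMPLE_PAGE, EXAMPLE_URI)
```

The sketch deliberately leaves out the download. Parsing is the easy part; the DNS lookup and HTTP request are what put your IP address in someone else's logs.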

Most of the time this is cool and normal, but when thousands of records of… well… presumably not illegal but certainly exotic pornography are being added to the standard.site corpus, automated checks against these sites aren't necessarily traffic that you want to have associated with your IP address, either with your ISP or more broadly. And if that's what's being added "by accident", wait until malicious records show up.

So I'm tapping out. Not something I can be bothered with as a hobbyist.

I'm not naïve: I realised from the beginning that this outcome was a possibility. It just happened much sooner than I anticipated. It's one thing to index the blogs of all the lovely people on pckt, offprint, leaflet, etc. But the bridgy project brings a looot of material over from activitypub, and this has accelerated the rate at which I have to deal with pages that were never intended to be verified, or are otherwise pretty strange.

I'm not upset or anything. I knew if standard.site blew up then moderation (in all its forms) would be a challenge. If there's a lesson to take away, it's to be cautious before writing software that acts on whatever comes through the jetstream/firehose. Some things are best left to companies with legal counsel.

I will continue to operate the standard.site validator. This is much less risky because users have to supply their own URLs, and since it's so lightweight I can easily run it on fly.io rather than my own hardware.

Serious Computer Business Blog by Thomas Karpiniec