They paved paradise and put up a bunch of slop

30 April 2025

For a short few decades we've had something miraculous. Virtually anybody can set up a website to provide a service or serve a community, plugging in a server wherever connectivity is available. Nowadays this is rapidly getting harder due to spam, abuse, and crawlers slurping up data for LLM training.

Although they're culpable, we can't entirely blame the LLMs. This happy egalitarian concept has been under attack for a long time. Spam, DoS, hackers, and bots have long existed and continue to play havoc with anyone who runs a site with open registration. Despite this, low-budget independent websites have been holding on, even thriving. It remains practical to spin up a bare VPS with an IP address and start hosting—to a point. If your site is boring or unpopular you're probably not going to become a specific target for spam or DoS. Once you do, everything gets more challenging.

There are ways and means of coping. These include moderation services like Akismet, Cloudflare's free tier, and outsourcing the anti-bot work to other companies by requiring email or phone verification for new accounts. Each of these cedes independence or has other annoying trade-offs but for the most part it has continued to be possible for humans to set up online services for other humans.

Now we have LLMs too. I recently had an experience with an online retailer's "live chat" function which was miserable for two reasons. First I had to speak with a chatbot, which was annoying but quick to bypass. Then while communicating with a human I noticed my messages were stuck for a long time in the "Sending" phase, my CPU fan roaring, settling down once the message was sent. It's clear that my browser was being asked to complete a proof-of-work challenge, on the theory that a spambot would give up or at least be slowed down. Unfortunately, like many people, I have a laptop that is around 9 years old and the difficulty of this challenge meant that it took up to two minutes for my computer to send each message. Obviously, this sucks.
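For the curious, a challenge like this typically works along hashcash-style lines: the server hands the browser a random challenge string, and the browser must grind through nonces until it finds one whose hash meets a difficulty target. A minimal sketch in Python (the function names and the exact scheme are my own illustration, not whatever this chat widget actually used):

```python
import hashlib
import itertools

def solve_pow(challenge: str, difficulty_bits: int) -> int:
    """Find a nonce such that SHA-256(challenge + nonce) begins with
    `difficulty_bits` zero bits. Expected cost: ~2**difficulty_bits hashes."""
    target = 1 << (256 - difficulty_bits)  # digests below this value qualify
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce

def verify_pow(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """The server's side: checking a solution costs just one hash."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))
```

The appeal is the asymmetry: solving costs the client an expected 2^k hashes while verification costs the server one, and every extra bit of difficulty doubles the client's work. That doubling is exactly why a high difficulty setting punishes an old laptop so badly.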

It's difficult to blame the chat system though. LLM bots can cheaply and easily produce unique messages indistinguishable from real people. What other options do they have to keep spam away from their human customer service agents? The fact is, we're scraping the bottom of the barrel. Website operators have an existing pile of tricks, and we're getting a few more, but they're at best a deterrent. Bad actors have access to CPU time too, often stolen. The jig is up. We can't just put things on the internet and hope for the best any more, especially things where users can post content or do their own signups, and expect that those users are real people.

I probably sound overly pessimistic. Don't we have the same forums and social media operating today as we did a few years ago? Indeed, but existing services have an advantage, an established network of probably-human users. These provide an ongoing reliable core of non-bot content and they have a rich tapestry of social metadata such as follows and likes which reinforce links between people. Things will seem normal for a while yet. I assume that having this kind of history makes it easier to detect new bots and bot networks since they'll have much less overlap with the existing userbase. I recently signed up on Bluesky and the spam is kind of nasty—I suspect it's hard to manage particularly because they didn't have pre-LLM users. It's only going to get harder as bots get increasingly sophisticated.

So what options do we have left?

  1. Large corporate identity providers. Login with Facebook, Microsoft, or GitHub. They have the funding, staff, and enormous scale to identify and stop spammers in a way that small operators can't, making them a much more reliable test than "does this user have a working email address?". But now you and your users are totally dependent on these services and their benevolence, and these corporate behemoths know which of their users registered on your site.
  2. Financial relationships. Spammers will try many things to place affiliate links on your website but they will probably not give you two dollars. Somebody who signed up for a paid service is less likely to spam you and even if they did it's easy to block that specific account. Something Awful has successfully used this strategy for a long time.
  3. Invite-only communities. This model works fairly well for the forum lobste.rs, where invitations can be sent by an existing member and the tree of who invites whom is publicly viewable. It isn't a silver bullet—the mod log provides occasional examples of spammers or voting rings getting banned but the quantity is surely less. I'll be curious to see how well this holds up. In the worst case, bots could slip in over time causing the discourse to slowly suffer, driving out interested humans until it's largely bots left. Smaller communities that are private by default will presumably have an easier time of it, such as those on Discord.

Obviously, this sucks. However, there's no going back and we will have to make do with the options we have.

One question remains: whose fault is it anyway? Clearly, the people doing the spamming and conducting the fraud are directly responsible, but it's largely impossible to hold them accountable and that's kind of the point. This intractable situation was made possible by the researchers and developers who created LLM technology; yet algorithms are simply knowledge. Imagine if someone found an efficient way to compute an input producing a specific SHA-256 hash. This would be enormously disruptive but you could hardly say it was their fault.

No, if I blame anyone it's the LLM maximalists: the companies that make it easy to jam LLMs into everything, the companies that do jam LLMs into everything, and the people who go around talking about how fantastic they are. They single-mindedly brag about their own "gains" while brushing off the economic turmoil and disruption of internet norms. They have no taste and no respect for the human essence of creation and communication. They paved paradise and put up a bunch of slop.

Without the efforts of these maximalists and their investors, perhaps the world would not have been consumed by AI frenzy as quickly. We could have had some proper discussions in the media and in parliaments about how this kind of tech should be trained and deployed, about who is empowered and disempowered by its spread. Instead we're just winging it. Obviously, this sucks.


Serious Computer Business Blog by Thomas Karpiniec