Web Bot Auth

This discussion on Hacker News centers on Cloudflare's "Verified Bots" program and the broader concept of Web Bot Authentication. Opinions are divided, with some viewing it as a necessary solution to online abuse and others as an overreach by a powerful, centralized entity.

Concerns About Cloudflare's Centralizing Influence and Trustworthiness

A significant portion of the discussion expresses apprehension about Cloudflare’s growing control over internet traffic and its potential to act as a gatekeeper. Critics worry about Cloudflare’s power to dictate which bots are allowed access and the implications of trusting a single company with this authority.

Cloudflare as a Central Chokepoint: Users like mips_avatar are critical, stating, "Cloudflare's verified bots program is a terrible idea. They want to be the central chokepoint for agents, and they're doing it in shady ways like auto enrolling customers into blocking agents."
Regulation of the Internet: zb3 echoes this sentiment, asserting, "Seems like Cloudflare wants to regulate the internet.. they should not have that power."
Lack of Trust and Past Behavior: cuuupid is particularly distrustful, citing Cloudflare's past actions: "Cloudflare is the last party that should be running this for two reasons. 1. They have already proven to be a bad faith actor with their "DDoS protection." 2. This is pretty much the typical Cloudflare HN playbook. They release something targeted at the current wave and hide behind an ideological barrier; meanwhile if you try to use them for anything serious they require a call with sales who jumps you with absurdly high pricing."
Monetization and Enshittification: observationist paints a bleak picture of Cloudflare’s motives: "You not understanding those reasons is not an excuse for allowing a giant tech company to step in and be the gatekeeper for a huge portion of the internet. Nor to monetize, enshittify, balkanize, and fragment the web with no effective recourse or oversight. Cloudflare shouldn't be allowed to operate, in my view."
Gatekeeping and Outsourced Login Walls: nemothekid points out a perceived hypocrisy: "They did exactly that [put up a login wall], they just outsourced it to cloudflare. The problem became bad enough that a lot of other people did the same thing."

Support for Web Bot Authentication as a Necessary Standard

Conversely, many participants see Cloudflare's Verified Bots program, and the underlying Web Bot Authentication standard, as a practical and necessary solution to the proliferation of malicious bots and the abuse of web content. They argue that existing methods are insufficient and that verifiable bot identity is crucial for a healthier web.

Protecting Content and Intellectual Property: kylehotchkiss defends the initiative, stating, "Not everybody wants their sites scraped and their content used to train a model that they'll never see a penny from. Cloudflare is the only party who wants to build a system where both the models and individual sites have their interests respected."
Critique of Open-Source Alternatives: tick_tock_tick questions whether the alternatives work at all: "I have [looked into open-source alternatives], sadly they are basically worthless and often worse than worthless as they negatively impact the site."
A Credible Identity Mechanism: bobbiechen champions the technical merits of Web Bot Auth: "I believe Web Bot Auth is a useful and non-centralized emerging standard for self-identifying bots and agents... Web Bot Auth is a way for bots to self-identify cryptographically. Unlike the user agent header (which is trivially spoofed) or known IPs (painful to manage), Web Bot Auth uses HTTP Message Signatures using the bot's key, which should be published at some well-known location. This is a good thing! We want bots to be able to self-identify in a way that can't be impersonated." They also emphasize its potential for enabling better "Agent Experiences."
Addressing the Spam Problem: tick_tock_tick highlights the severity of the issue: "There is just too much spam and it's not clear that is a solvable problem without Cloudflare (or some other similar service)."
Standardization and Commoditization: maxwellg suggests that while Cloudflare may be first, the standard itself will likely be adopted and commoditized by others: "Cloudflare is only the first to market with a solution. If this proposal catches on every WAF vendor under the sun will have it implemented before the next sales cycle. Enforcement of this standard will be commoditized down to nothing."
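
To make bobbiechen's description more concrete, here is a minimal sketch of how the string a bot signs might be assembled, loosely following the signature-base layout of HTTP Message Signatures (RFC 9421). The covered components, the `web-bot-auth` tag, the key id, and the example hosts are assumptions for illustration; none of them come from the thread. The signing step itself (for example with an Ed25519 key) is left out because it needs a cryptography library outside the Python standard library.

```python
# Sketch only: builds the text that would be signed, not the signature itself.
# Component names, tag value, key id, and hosts below are illustrative assumptions.

def signature_params(components, created, expires, keyid, tag="web-bot-auth"):
    """Serialize the covered components and metadata as one parameter string."""
    inner = " ".join(f'"{c}"' for c in components)
    return f'({inner});created={created};expires={expires};keyid="{keyid}";tag="{tag}"'


def signature_base(values, components, params):
    """Join one line per covered component, then the signature-params line."""
    lines = [f'"{c}": {values[c]}' for c in components]
    lines.append(f'"@signature-params": {params}')
    return "\n".join(lines)


if __name__ == "__main__":
    components = ["@authority", "signature-agent"]
    values = {
        "@authority": "example.com",                   # site being requested
        "signature-agent": '"https://bot.example"',    # hypothetical key directory host
    }
    params = signature_params(components, 1700000000, 1700000300, "hypothetical-key-id")
    print(signature_base(values, components, params))
```

A verifier would rebuild the same string from the incoming request, fetch the bot's published public key, and check the signature over that string. That is why a spoofed User-Agent header alone would not pass.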

The Role of Cloudflare in Website Security and Management

A segment of the discussion acknowledges the practical realities of running a website today, suggesting that services like Cloudflare are often necessary due to the overwhelming burden of security and traffic management.

Necessity Due to Real-World Hassles: tick_tock_tick argues against the criticism of Cloudflare's intermediary role: "Web operators choose to use them; hell they even pay Cloudflare to be between them. Seriously I just think you don't understand how bad it is to run a site without someone in-front of it."
Avoiding the Hassle: mcspiff concurs, drawing parallels to other complex infrastructure: "Couldn’t agree more — Much like running my own DNS or email server, I don’t think I’ll ever go back to running my own website directly on the internet. It’s just not worth the hassle."

Technical Nuances and Future Directions of Bot Authentication

There is also discussion around the technical specifications of Web Bot Auth and potential further developments, including policy layers for authorization and usage control.

Beyond Authentication to Authorization: jithinraj points out a missing piece: "Web Bot Auth solves authentication (“who is this bot?”) but not authorization/usage control. We still need a machine-readable policy layer so sites can express “what this bot may do, under which terms” (purpose limits, retention, attribution, optional pricing) at a well-known path, robots.txt-like, but enforceable via signatures." They propose a flow involving fetching policy and presenting signed receipts.
APIs vs. Web Crawling: nerdsniper questions the necessity of this approach for bots, positing, "Why use a 'web bot' instead of an API? Either can be driven by an AI 'agent'...but this just seems like an 'API key for a visual api interface', and rather wasteful in cost and resources." notatoad counters that not all data is available via APIs, and some entities want to crawl websites directly.
The Website as the New API: mediaman offers a perspective: "The website the human sees is the new API. That's needed because many APIs are either nonexistent or extremely marginal in design and content coverage."
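
The policy layer jithinraj describes does not exist as a settled format in the thread, so the following is only a sketch of what a machine-readable policy and an authorization check could look like. The JSON field names, the purpose strings, and the bot identifiers are all hypothetical, and the signed-receipt step is not modeled.

```python
import json

# Hypothetical policy a site might publish at a well-known path.
# Every field name here is an assumption made for illustration.
POLICY_JSON = """
{
  "default": {"purposes": [], "retention_days": 0},
  "bots": {
    "https://bot.example": {
      "purposes": ["search-indexing"],
      "retention_days": 30,
      "attribution_required": true
    }
  }
}
"""


def allowed(policy, bot_id, purpose):
    """Return True if the authenticated bot may act for the given purpose."""
    # Bots without an explicit entry fall back to the default (deny-all) rule.
    rule = policy["bots"].get(bot_id, policy["default"])
    return purpose in rule["purposes"]


policy = json.loads(POLICY_JSON)

if __name__ == "__main__":
    print(allowed(policy, "https://bot.example", "search-indexing"))
    print(allowed(policy, "https://bot.example", "model-training"))
```

The check only makes sense after authentication: the `bot_id` is assumed to be the identity already proven by the bot's signature, which is the split between "who is this bot?" and "what may it do?" that the comment draws.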

Concerns about Implementation Details and Failure Modes

Even among those who see potential in bot authentication, there are concerns about how such systems are implemented and the potential for errors.

Private Gatekeeper Design: binarymax agrees "in principle, but I disagree that it should be designed and mandated by a private gatekeeper."
Failure Modes and Recourse: binarymax raises critical questions: "What happens if cloudflare decides you are a bot and you’re not. What recourse do you have? What are the formal mechanisms to ensure a person is not blocked from the majority of the web because cloudflare is a middleman and you are a false positive?"
User Experience Issues: realityfactchex expresses frustration with Cloudflare's impact on human users: "No offense, but screw CloudFlare, screw their captchas for humans, and screw their wedging themselves between web operators and web users."

The Role of Standards Development Processes

The thread touches on how web standards evolve, with one user suggesting that an interested party often starts the process before it moves into formal standardization.

Standards Evolution from Interested Parties: jacobn observes, "Isn't that how most web standards got their start? One of the interested parties pushed something, then things evolved through the standards process?"