OneTrust consent management and Zaraz tag manager visible in page source; third-party analytics/tracking infrastructure present
Terms of Service
—
No on-domain ToS observable; deferred to parent domain Cloudflare
Accessibility
+0.05
Article 2, Article 27
Semantic HTML structure observed (headings with IDs, navigation); view-transition reduced-motion preference respected; no major accessibility barriers detected in accessible markup
Mission
+0.10
Article 19, Article 20
Cloudflare's public mission includes open internet and developer empowerment; content demonstrates transparency in technical development process
Editorial Code
—
No explicit editorial policy observable on-domain
Ownership
+0.05
Article 2, Article 8
Cloudflare is a publicly-traded US company with established legal/corporate governance; no predatory ownership signals
Access Model
+0.15
Article 19, Article 27
Content freely accessible, no paywall; open-source project (vinext) linked and promoted; supports developer access to tools and knowledge
While neither I nor the company I work for is directly impacted by this outage, I wonder how long Cloudflare can take these hits and keep apologizing for them. I truly appreciate them being transparent about it, but businesses care more about SLAs and uptime than about the incident report.
Insufficient mock data in the staging environment? Like no BYOIP prefixes at all? Since even one prefix should have shown that it would be deleted by that subtask...
From all the recent outages, it sounds like Cloudflare is barely tested at all. Maybe they have lots of unit tests etc, but they do not seem to test their whole system... I get that their whole setup is vast, but even testing that subtask manually would have surfaced the bug
I do not work in the space at all, but it seems like Cloudflare has been having more network disruptions lately than they used to. To anyone who deals with this sort of thing, is that just recency bias?
> Because the client is passing pending_delete with no value, the result of Query().Get(“pending_delete”) here will be an empty string (“”), so the API server interprets this as a request for all BYOIP prefixes instead of just those prefixes that were supposed to be removed. The system interpreted this as all returned prefixes being queued for deletion.
if v := req.URL.Query().Get("pending_delete"); v != "" {
    // ignore other behavior and fetch pending objects from the ip_prefixes_deleted table
    prefixes, err := c.RO().IPPrefixes().FetchPrefixesPendingDeletion(ctx)
    if err != nil {
        api.RenderError(ctx, w, ErrInternalError)
        return
    }
    api.Render(ctx, w, http.StatusOK, renderIPPrefixAPIResponse(prefixes, nil))
    return
}
Even if the client had passed a value, it would still have done exactly the same thing, since the value of "v" (or anything else from the request) is not used in that block.
This blog post is inaccurate. The prefixes were being revoked over and over: to keep your prefixes advertised you had to have a script that would re-add them, or else they would be withdrawn again. The way they worded it seems really dishonest.
Sure vibe-coded slop that has not been properly peer reviewed or tested prior to deployment is leading to major outages, but the point is they are producing lots of code. More code is good, that means you are a good programmer. Reading code would just slow things down.
Hindsight is 20/20 but why not dry run this change in production and monitor the logs/metrics before enabling it? Seems prudent for any new “delete something in prod” change.
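A dry run of the kind suggested here could look roughly like the sketch below, written in Go because that is the language of the quoted snippet. Everything in it is hypothetical: planDeletion, the dryRun flag and the maxFraction guard are illustrative names, not anything from Cloudflare's code base. The idea is only that a cleanup job first reports what it would delete, and refuses outright when the candidate set is implausibly large.

```go
package main

import (
	"errors"
	"fmt"
)

// errTooMany is returned when a run would delete a suspiciously large share of all prefixes.
var errTooMany = errors.New("refusing to delete: candidate set exceeds safety threshold")

// planDeletion decides which prefixes a cleanup run is allowed to delete.
// total is the number of prefixes known to the system, candidates are the ones
// selected for deletion, and maxFraction is a hypothetical guard (0.05 = 5% per run).
// In dry-run mode the candidates are only logged and nothing is returned for deletion.
func planDeletion(candidates []string, total int, maxFraction float64, dryRun bool) ([]string, error) {
	if total > 0 && float64(len(candidates)) > maxFraction*float64(total) {
		return nil, errTooMany
	}
	if dryRun {
		for _, p := range candidates {
			fmt.Printf("dry-run: would delete %s\n", p)
		}
		return nil, nil
	}
	return candidates, nil
}

func main() {
	all := 100
	toDelete, err := planDeletion([]string{"192.0.2.0/24"}, all, 0.05, true)
	fmt.Println("to delete:", toDelete, "err:", err)
}
```

With a guard like this, a query that accidentally returned every prefix would trip the threshold in both dry-run and live mode instead of queueing everything for deletion.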
The one redeeming feature of this failure is staged rollouts. As someone advertising routes through CF, we were quite happy to be spared from the initial 25%.
> Because the client is passing pending_delete with no value, the result of Query().Get(“pending_delete”) here will be an empty string (“”), so the API server interprets this as a request for all BYOIP prefixes instead of just those prefixes that were supposed to be removed.
Lmao, iirc a long time ago Google's internal system had the exact same bug (treating empty as "all" in the delete call) that took down all their edges. Surprisingly there was little impact as traffic just routed through the next set of proxies.
It's something we debated in our team: if there's an API that returns data based on filters, what's the better behavior if no filters are provided - return everything or return nothing?
The consensus was that returning everything is rarely what's desired, for two reasons: first, if the system grows, allowing API users to return everything at once can be a problem both for our server (lots of data in RAM when fetching from the DB => OOM, and additional stress on the DB) and for the user (the same problem on their side). Second, it's easy to forget to specify filters, especially in cases like "let's delete something based on some filters."
So the standard practice now is to return nothing if no filters are provided, and we pay attention to it during code reviews. If the user does really want all the data, you can add pagination to your API. With pagination, it's very unlikely for the user to accidentally fetch everything because they must explicitly work with pagination tokens, etc.
Another option, if you don't want pagination, is to have a separate method named accordingly, like ListAllObjects, without any filters.
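As a rough illustration of that convention, here is a minimal Go sketch. The Store, Filter and Object types are made up for the example, and returning an error for an empty filter is one possible reading of "return nothing" (an empty result would be the other). The explicit ListAll method is the separately named escape hatch mentioned above.

```go
package main

import (
	"errors"
	"fmt"
)

// Filter is a hypothetical listing filter; its zero value means "no filter given".
type Filter struct {
	Owner  string
	Status string
}

func (f Filter) empty() bool { return f == (Filter{}) }

// Object is a hypothetical stored record.
type Object struct {
	ID, Owner, Status string
}

// Store is a toy in-memory store standing in for a database.
type Store struct{ objects []Object }

// ErrEmptyFilter is returned when List is called without any filter field set.
var ErrEmptyFilter = errors.New("list: at least one filter field is required")

// List returns the objects matching every non-empty filter field.
// An empty filter is rejected rather than treated as "match everything".
func (s *Store) List(f Filter) ([]Object, error) {
	if f.empty() {
		return nil, ErrEmptyFilter
	}
	var out []Object
	for _, o := range s.objects {
		if (f.Owner == "" || o.Owner == f.Owner) && (f.Status == "" || o.Status == f.Status) {
			out = append(out, o)
		}
	}
	return out, nil
}

// ListAll is the explicit, separately named way to ask for everything.
func (s *Store) ListAll() []Object {
	return append([]Object(nil), s.objects...)
}

func main() {
	s := &Store{objects: []Object{{"1", "alice", "active"}, {"2", "bob", "pending_delete"}}}
	_, err := s.List(Filter{})
	fmt.Println("empty filter:", err)
	pending, _ := s.List(Filter{Status: "pending_delete"})
	fmt.Println("pending:", pending)
	fmt.Println("all:", len(s.ListAll()))
}
```

A caller that forgets to set a filter now gets an error at the call site, and anyone who really wants everything has to say so by name.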
The code they posted doesn't quite explain the root cause. This is a good case study for resilient API design and testing.
They said their /v1/prefixes endpoint has this snippet:
if v := req.URL.Query().Get("pending_delete"); v != "" {
    // ignore other behavior and fetch pending objects from the ip_prefixes_deleted table
    prefixes, err := c.RO().IPPrefixes().FetchPrefixesPendingDeletion(ctx)
    [..snip..]
}
What's implied but not shown here is that the endpoint normally returns all prefixes. They modified it to return just those pending deletion when passed a pending_delete query string parameter.
The immediate problem of course is this block will never execute if pending_delete has no value:
This is because Go defaults query params to empty strings and the if statement skips this case. Which makes you wonder, what is the value supposed to be? This is not explained. If it's supposed to be:
Then this would work, but the implementation fails to validate this value. From this you can infer that no unit test was written to exercise the value:
The post explains "initial testing and code review focused on the BYOIP self-service API journey." We can reasonably guess their tests were passing some kind of "true" value for the param, either explicitly or using a client that defaulted param values. What they didn't test was how their new service actually called it.
So, while there's plenty to criticize on the testing front, that's first and foremost a basic failure to clearly define an API contract and implement unit tests for it.
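To make the contract point concrete, here is a small Go sketch of what an explicit contract for such a flag might look like. It assumes (and this is only an assumption, since the post never says) that pending_delete is meant to be a boolean. parsePendingDelete is a hypothetical helper. It relies on two real standard library behaviours: url.Values.Get returns "" both for a missing key and for a key with no value, while url.Values.Has (Go 1.17+) tells the two apart.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// parsePendingDelete interprets a hypothetical boolean pending_delete parameter.
// Absent key: (false, nil). Valid boolean: (value, nil).
// Key present with an empty or non-boolean value: an error, never a silent default.
func parsePendingDelete(rawQuery string) (bool, error) {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return false, err
	}
	if !q.Has("pending_delete") {
		return false, nil
	}
	v := q.Get("pending_delete")
	b, err := strconv.ParseBool(v)
	if err != nil {
		return false, fmt.Errorf("pending_delete: invalid value %q", v)
	}
	return b, nil
}

func main() {
	// The four cases a unit test for this contract would want to cover.
	for _, raw := range []string{"", "pending_delete", "pending_delete=true", "pending_delete=potato"} {
		b, err := parsePendingDelete(raw)
		fmt.Printf("%q -> %v, %v\n", raw, b, err)
	}
}
```

The loop in main is effectively the missing test table: absent, bare key, valid value, garbage value. The bare-key case is the one the cleanup client reportedly sent.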
But there's a third problem, in my view the biggest one, at the design level. For a critical delete path they chose to overload an existing endpoint that defaults to returning everything. This was a dangerous move. When high-stakes data-loss bugs are a potential outcome, it's worth considering a more restrictive API that is harder to use incorrectly. If they had implemented a dedicated endpoint for pending deletes, they would likely have omitted this default behavior meant for non-destructive read paths.
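A dedicated endpoint might be sketched like this in Go. Only /v1/prefixes comes from the post. The /v1/prefixes/pending-deletion path, the in-memory data and the fetch helper are invented for illustration. The point is that the route the cleanup agent calls has no parameters to get wrong and no "return everything" fallback.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Hypothetical in-memory data standing in for the real tables.
var (
	allPrefixes     = []string{"192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"}
	pendingDeletion = []string{"203.0.113.0/24"}
)

func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	// General read path: returns everything and ignores unknown parameters.
	mux.HandleFunc("/v1/prefixes", func(w http.ResponseWriter, r *http.Request) {
		json.NewEncoder(w).Encode(allPrefixes)
	})
	// Dedicated path for the cleanup job: only ever returns pending deletions.
	mux.HandleFunc("/v1/prefixes/pending-deletion", func(w http.ResponseWriter, r *http.Request) {
		json.NewEncoder(w).Encode(pendingDeletion)
	})
	return mux
}

// fetch issues an in-process GET against the mux and decodes the JSON list.
func fetch(path string) []string {
	rec := httptest.NewRecorder()
	newMux().ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	var out []string
	_ = json.NewDecoder(rec.Body).Decode(&out)
	return out
}

func main() {
	fmt.Println(fetch("/v1/prefixes/pending-deletion"))
	fmt.Println(len(fetch("/v1/prefixes?pending_delete")))
}
```

Whatever query string the agent sends to the dedicated route, it cannot fall through to the full listing.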
In my experience, these sorts of decisions can stem from team ownership differences. If you owned the prefixes service and were writing an automated agent that could blow away everything, you might write a dedicated endpoint for it. But if you submitted a request to a separate team to enhance their service to return a subset of X, without explaining the context or use case very much, they may be more inclined to modify the existing endpoint for getting X. With that little context and communication, the risks involved can easily be missed.
Final note: It's a little odd that the implementation uses Go's "if with short statement" syntax when v is only ever used once. This isn't wrong per se but it's strange and makes me wonder to what extent an LLM was involved.
the 'empty string = select all' pattern in the delete path is the kind of bug that static typing and explicit null handling would catch at compile time. when your delete API accepts a bare query string, you're one missed validation away from this. probably the deeper lesson is that destructive operations should require explicit confirmation of scope, not just 'no filter = everything.'
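One way to express "explicit confirmation of scope" in Go is a small scope type whose zero value is rejected. This is a sketch only: DeleteScope, ScopeIDs, ScopeAll and deletePrefixes are hypothetical names, and the check happens at run time rather than compile time. In a real package the unexported fields would force callers through the constructors.

```go
package main

import (
	"errors"
	"fmt"
)

// DeleteScope says what a delete call may touch. The zero value means
// "nothing specified" and is rejected by deletePrefixes.
type DeleteScope struct {
	ids []string
	all bool
}

// ScopeIDs limits a delete to the given IDs; an empty list is an error.
func ScopeIDs(ids ...string) (DeleteScope, error) {
	if len(ids) == 0 {
		return DeleteScope{}, errors.New("delete scope: no IDs given")
	}
	return DeleteScope{ids: ids}, nil
}

// ScopeAll is the only way to request deletion of everything.
func ScopeAll() DeleteScope { return DeleteScope{all: true} }

// deletePrefixes removes the scoped entries from store and reports how many were removed.
func deletePrefixes(store map[string]bool, scope DeleteScope) (int, error) {
	if !scope.all && len(scope.ids) == 0 {
		return 0, errors.New("delete: empty scope is rejected")
	}
	n := 0
	if scope.all {
		for k := range store {
			delete(store, k)
			n++
		}
		return n, nil
	}
	for _, id := range scope.ids {
		if store[id] {
			delete(store, id)
			n++
		}
	}
	return n, nil
}

func main() {
	store := map[string]bool{"a": true, "b": true, "c": true}
	_, err := deletePrefixes(store, DeleteScope{})
	fmt.Println("zero scope:", err, "remaining:", len(store))
	scope, _ := ScopeIDs("a")
	n, _ := deletePrefixes(store, scope)
	fmt.Println("deleted:", n, "remaining:", len(store))
}
```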
Old tech could work around these outages. Set up GSLB at a DNS provider that does health checks, or perform your own health checks against both origin and CDNs and use APIs to change DNS. If the origin servers are OK and the CDN is not, automatically change DNS to a different CDN. There should be multiple probes that form a consensus. This process assumes one is managing the configurations of their CDNs through code and API so that one can set up and tear down any number of CDNs on a whim.
That does mean having contracts with more than one CDN provider; however, the cost should be negotiated based on monthly volume, i.e. the CDN with the most uptime gets the most money. If an existing CDN under contract refuses to negotiate, then move some non-critical-path services to them and let that contract expire. Institute a company-wide policy to never return to a vendor if their contract was intentionally not renewed.
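The consensus step of that failover could be sketched as below. This is only the decision logic, under assumed names (ProbeResult, chooseCDN, "cdn-a", "cdn-b"). The actual health checks and the DNS provider API call are provider-specific and left out.

```go
package main

import "fmt"

// ProbeResult is one vantage point's view of the origin and of each CDN.
type ProbeResult struct {
	OriginOK bool
	CDNOK    map[string]bool // keyed by a hypothetical CDN name
}

// majority reports whether strictly more than half of the votes are true.
func majority(votes []bool) bool {
	yes := 0
	for _, v := range votes {
		if v {
			yes++
		}
	}
	return yes*2 > len(votes)
}

// chooseCDN returns the CDN that DNS should point at.
// It keeps current if the origin itself looks down (switching CDNs would not help),
// if current is still healthy, or if no alternative is healthy by majority vote.
func chooseCDN(current string, preference []string, probes []ProbeResult) string {
	healthy := func(name string) bool {
		votes := make([]bool, 0, len(probes))
		for _, p := range probes {
			votes = append(votes, p.CDNOK[name])
		}
		return majority(votes)
	}
	origin := make([]bool, 0, len(probes))
	for _, p := range probes {
		origin = append(origin, p.OriginOK)
	}
	if !majority(origin) || healthy(current) {
		return current
	}
	for _, name := range preference {
		if healthy(name) {
			return name
		}
	}
	return current
}

func main() {
	probes := []ProbeResult{
		{OriginOK: true, CDNOK: map[string]bool{"cdn-a": false, "cdn-b": true}},
		{OriginOK: true, CDNOK: map[string]bool{"cdn-a": false, "cdn-b": true}},
		{OriginOK: true, CDNOK: map[string]bool{"cdn-a": true, "cdn-b": true}},
	}
	fmt.Println(chooseCDN("cdn-a", []string{"cdn-a", "cdn-b"}, probes))
}
```

Requiring a majority keeps a single flaky probe from flipping DNS back and forth.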
maybe go can do (string v, ok bool) for this or add proper sum types...
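For what it's worth, Go can already express the (value, ok) form here, because url.Values is a plain map[string][]string and map indexing supports the comma-ok idiom. A minimal sketch, with queryParam as a made-up helper name:

```go
package main

import (
	"fmt"
	"net/url"
)

// queryParam returns the first value of key in rawQuery and whether the key
// appeared at all, so "absent" and "present but empty" stay distinguishable.
func queryParam(rawQuery, key string) (string, bool) {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", false
	}
	vs, ok := q[key]
	if !ok || len(vs) == 0 {
		return "", false
	}
	return vs[0], true
}

func main() {
	for _, raw := range []string{"", "pending_delete", "pending_delete=1"} {
		v, ok := queryParam(raw, "pending_delete")
		fmt.Printf("%q: v=%q ok=%v\n", raw, v, ok)
	}
}
```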
Score Breakdown
+0.42
Preamble
High A:transparency-accountability F:public-interest P:open-access
Editorial
+0.50
Structural
+0.30
SETL
+0.32
Combined
ND
Context Modifier
ND
Post explicitly asserts commitment to transparency, public explanation of service failure, and accountability. Preamble values of dignity, justice, and peace supported by transparent incident reporting demonstrating corporate responsibility.
+0.41
Article 1: Freedom, Equality, Brotherhood
Medium A:equality-accountability
Editorial
+0.40
Structural
+0.30
SETL
+0.20
Combined
ND
Context Modifier
ND
Content treats all impacted customers equally in remediation explanation. Transparency applies to all parties. Structural accessibility supports equal information access.
+0.38
Article 2: Non-Discrimination
Medium P:accessibility P:non-discrimination
Editorial
+0.20
Structural
+0.40
SETL
-0.28
Combined
ND
Context Modifier
ND
Semantic HTML and accessibility features support non-discriminatory access to information. No discriminatory language observed in editorial. Context modifier includes both accessibility (+0.05) and ownership (+0.05) modifiers.
0.00
Article 3: Life, Liberty, Security
Editorial
0.00
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND
Right to life, liberty, and security of person not directly addressed in technical incident post content.
0.00
Article 4: No Slavery
Editorial
0.00
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND
Slavery and servitude provisions not applicable to technical service documentation.
0.00
Article 5: No Torture
Editorial
0.00
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND
Torture and cruel treatment provisions not applicable to technical incident post.
0.00
Article 6: Legal Personhood
Editorial
0.00
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND
Recognition as person before the law not directly implicated in technical service documentation.
0.00
Article 7: Equality Before Law
Editorial
0.00
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND
Equal protection under law provisions not directly addressed in technical incident context.
+0.19
Article 8: Right to Remedy
Medium A:remedial-action
Editorial
+0.10
Structural
+0.20
SETL
-0.14
Combined
ND
Context Modifier
ND
Post discusses remedies available to customers and commitment to prevention. Ownership modifier (+0.05) applied for legitimate corporate governance. No evidence of rights violations in content itself.
0.00
Article 9: No Arbitrary Detention
Editorial
0.00
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND
Arbitrary arrest and detention not applicable to technical service documentation.
0.00
Article 10: Fair Hearing
Editorial
0.00
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND
Fair trial rights not addressed in technical incident post.
0.00
Article 11: Presumption of Innocence
Editorial
0.00
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND
Presumption of innocence provisions not applicable to incident documentation.
-0.31
Article 12: Privacy
High P:tracking-analytics F:consent-dependent
Editorial
+0.10
Structural
-0.30
SETL
+0.35
Combined
ND
Context Modifier
ND
Page source shows Zaraz tag manager and OneTrust consent management. Analytics infrastructure reduces privacy guarantee despite consent mechanism. Privacy modifier (-0.15) and ad_tracking modifier (-0.10) both directly apply to Article 12, totaling -0.25 context modifier.
+0.67
Article 13: Freedom of Movement
High A:freedom-of-movement-information F:open-internet
Editorial
+0.60
Structural
+0.40
SETL
+0.35
Combined
ND
Context Modifier
ND
Post explicitly documents movement of information on Cloudflare network and BGP routing. Detailed technical transparency supports freedom of information. Access_model modifier (+0.15) applied for free, unrestricted content access.
+0.16
Article 14: Asylum
Low A:remedy-seeking
Editorial
+0.20
Structural
+0.10
SETL
+0.14
Combined
ND
Context Modifier
ND
Post demonstrates company taking responsibility and outlining remediation processes, which supports asylum and refuge principles indirectly. Limited direct relevance to refugee/asylum context.
0.00
Article 15: Nationality
Editorial
0.00
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND
Right to nationality not addressed in technical service documentation.
0.00
Article 16: Marriage & Family
Editorial
0.00
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND
Marriage and family rights not applicable to technical incident post.
0.00
Article 17: Property
Editorial
0.00
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND
Property rights provisions not directly addressed in incident documentation.
+0.10
Article 18: Freedom of Thought
Low F:technical-transparency
Editorial
+0.10
Structural
+0.10
SETL
0.00
Combined
ND
Context Modifier
ND
Technical documentation of systems supports freedom of thought indirectly through information access. Limited direct relevance to conscience and belief.
+0.87
Article 19: Freedom of Expression
High A:freedom-of-expression A:transparency-accountability F:open-internet P:unrestricted-access
Editorial
+0.70
Structural
+0.50
SETL
+0.37
Combined
ND
Context Modifier
ND
Post exemplifies freedom to seek, receive, and impart information. Detailed technical disclosure, open access without paywall, and commitment to transparency demonstrate strong alignment. Mission modifier (+0.10) and access_model modifier (+0.15) applied; total +0.25.
+0.56
Article 20: Assembly & Association
Medium A:freedom-of-assembly F:developer-community
Editorial
+0.50
Structural
+0.40
SETL
+0.22
Combined
ND
Context Modifier
ND
Content promotes developer community engagement and open technical discussion. Cloudflare's mission of open internet supports peaceful assembly and association. Mission modifier (+0.10) applied.
+0.26
Article 21: Political Participation
Medium A:democratic-participation
Editorial
+0.30
Structural
+0.20
SETL
+0.17
Combined
ND
Context Modifier
ND
Post demonstrates corporate accountability and transparent governance in response to service failure. Democratic participation principles supported through information access and stakeholder communication.
+0.20
Article 22: Social Security
Low F:social-responsibility
Editorial
+0.20
Structural
+0.20
SETL
0.00
Combined
ND
Context Modifier
ND
Post demonstrates corporate social responsibility in acknowledging impact and committing to improvements. Limited direct relevance to social security and economic rights.
0.00
Article 23: Work & Equal Pay
Editorial
0.00
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND
Right to work and fair wages not addressed in technical incident documentation.
0.00
Article 24: Rest & Leisure
Editorial
0.00
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND
Rest and leisure provisions not applicable to technical service post.
0.00
Article 25: Standard of Living
Editorial
0.00
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND
Standard of living rights not directly addressed in incident documentation.
+0.45
Article 26: Education
Medium A:education-transparency P:open-knowledge
Editorial
+0.30
Structural
+0.30
SETL
0.00
Combined
ND
Context Modifier
ND
Post provides detailed technical education about BGP, IP addressing, and system architecture. Free access to knowledge supports education principles. Access_model modifier (+0.15) applied.
+0.64
Article 27: Cultural Participation
Medium A:cultural-participation P:open-internet-access
Editorial
+0.40
Structural
+0.50
SETL
-0.22
Combined
ND
Context Modifier
ND
Post promotes participation in technical culture and internet ecosystem through transparent knowledge sharing. Accessibility features and open knowledge access support participation in cultural life. Access_model (+0.15) and accessibility (+0.05) modifiers applied; total +0.20.
+0.36
Article 28: Social & International Order
Medium A:social-order-accountability
Editorial
+0.40
Structural
+0.30
SETL
+0.20
Combined
ND
Context Modifier
ND
Post demonstrates company establishing social and international order necessary for full realization of rights through transparency, accountability, and commitment to system improvements.
+0.30
Article 29: Duties to Community
Medium F:community-responsibility
Editorial
+0.30
Structural
+0.30
SETL
0.00
Combined
ND
Context Modifier
ND
Post acknowledges duties to community through incident disclosure and remediation commitment. Demonstrates responsibility in exercising rights that affect others' internet access.
+0.20
Article 30: No Destruction of Rights
Low F:interpretation-limitation
Editorial
+0.20
Structural
+0.20
SETL
0.00
Combined
ND
Context Modifier
ND
Post does not contain provisions that could be interpreted to destroy rights. Content supports UDHR purposes through transparency and accountability.