This page displays the subscription paywall interface for a Financial Times article on OpenAI and DeepSeek, with no editorial content accessible. The observable structural signal is a subscription-based access model that restricts information availability to paying subscribers, which conflicts with UDHR Articles 19 (freedom to receive information), 25 (adequate standard of living), and 27 (cultural and scientific participation). While FT maintains strong editorial standards and journalistic mission, the paywall structure prioritizes commercial interests over universal information access.
While I'm as amused as everyone else, I think it's technically accurate to point out that the "we trained it for $6 million" narrative is contingent on the prior investment made by others.
I'm not being sarcastic, but we may soon have to torrent DeepSeek's model. OpenAI has a lot of clout in the US and could get DeepSeek banned in western countries for copyright.
They don't care; T&Cs and copyright are void unless it affects them, and everyone else can go kick rocks. It's not surprising that they and OpenAI are headed for a legal battle over this.
> “It’s also extremely hard to rally a big talented research team to charge a new hill in the fog together,” he added. “This is the key to driving progress forward.”
Well I think DeepSeek releasing it open source and on an MIT license will rally the big talent. The open sourcing of a new technology has always driven progress in the past.
The last paragraph is also where OpenAI seems to be focusing its efforts.
> we engage in countermeasures to protect our IP, including a careful process for which frontier capabilities to include in released models ..
> ... we are working closely with the US government to best protect the most capable models from efforts by adversaries and competitors to take US technology.
So they'll go for getting DeepSeek banned like TikTok was, now that a precedent has been set?
I lived through a similar scenario in 2010 or so.
Early in my professional career I worked for a media company that scraped other sites (think Craigslist, but for our local market) to republish the content on our competing website. I wasn't working on that specific project, but I did work on an integration for my team's project where the scraping team could post jobs on our platform directly. When others started scraping "our content", there were a couple of urgent all-hands-on-deck meetings scheduled, with a high level of disbelief.
OpenAI's models were trained on ebooks from a private ebook torrent tracker, leeched en masse during a freeleech event by people who hated private torrent trackers and wanted to destroy their "economy."
The books were all in EPUB format, converted and cleaned to plain text, and hosted on a public data-hoarder site.
> “It is (relatively) easy to copy something that you know works,” Altman tweeted. “It is extremely hard to do something new, risky, and difficult when you don’t know if it will work.”
The humor/hypocrisy of the situation aside, it does seem to be true that OpenAI is consistently the one coming up with new ideas first (GPT 4, o1, 4o-style multimodality, voice chat, DALL-E, …) and then other companies reproduce their work, and get more credit because they actually publish the research.
Unfortunately for them it’s challenging to profit in the long term from being first in this space and the time it takes for each new idea to be reproduced is getting shorter.
OpenAI is going after a company that open sourced their model, by distilling from their non-open AI?
OpenAI talks a lot about the principles of being Open, while still keeping their models closed and not fostering the open source community or sharing their research. Now when a company distills their models using perfectly allowed methods on the public internet, OpenAI wants to shut them down too?
This reminds me of the railroads: once they were invented, there was a huge investment boom of everyone trying to make money off them, but competition brought costs down to the point where it generally wasn't the railroads that made the money and got the benefit; consumers and regular businesses did, and competition caused many railroads to fail.
AI is probably similar: Moore's law and other advancements will eventually allow people to run open models locally and bring down the cost of operation. Competition will make it hard for all but one or two players to survive, and most of the investments in AI by large companies like Nvidia, OpenAI, and DeepSeek will fail to generate substantial wealth, though they may earn some sort of return, or maybe not.
All the top level comments are basking in the irony of it, which is fair enough. But I think this changes the Deepseek narrative a bit. If they just benefited from repurposing OpenAI data, that's different than having achieved an engineering breakthrough, which may suggest OpenAI's results were hard earned after all.
Is OpenAI claiming copyright ownership over the generated synthetic data?
That would be a dangerous precedent to establish.
If it's a terms of service violation, I guess they're within their rights to terminate service, but what other recourse do they have?
Other than that, perhaps this is just rhetoric aimed at introducing restrictions in the US, to prevent access to foreign AI, to establish a national monopoly?
There is a lot of discussion here about IP theft. Honest question, from deepseek's point of view as a company under a different set of laws than US/Western -- was there IP theft?
A company like OpenAI can put whatever licensing they want in place. But that only matters if they can enforce it. The question is, can they enforce it against deepseek? Did deepseek do something illegal under the laws of their originating country?
I've had some limited exposure to media-related licensing when releasing content in China, and what is allowed there is very different from what is permitted in the US.
The interesting part, which points to innovation moving outside of the US, is that US companies are beholden to strict IP laws while many places in the world don't have such restrictions and will be able to utilize more data more easily.
I think there are two different things going on here:
"DeepSeek trained on our outputs and that's not fair because those outputs are ours, and you shouldn't take other peoples' data!" This is obviously extremely silly, because that's exactly how OpenAI got all of its training data in the first place - by scraping other peoples' data off the internet.
"DeepSeek trained on our outputs, and so their claims of replicating o1-level performance from scratch are not really true" This is at least plausibly a valid claim. The DeepSeek R1 paper shows that distillation is really powerful (e.g. they show Llama models get a huge boost by finetuning on R1 outputs), and if it were the case that DeepSeek were using a bunch of o1 outputs to train their model, that would legitimately cast doubt on the narrative of training efficiency. But that's a separate question from whether it's somehow unethical to use OpenAI's data the same way OpenAI uses everyone else's data.
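The distillation effect described above can be sketched in miniature. This is not DeepSeek's or OpenAI's actual pipeline; it's a toy model (an invented bigram table standing in for the teacher) showing the core mechanic: a student fit only on the teacher's sampled outputs ends up reproducing the teacher's behaviour without ever touching its weights.

```python
import random
from collections import defaultdict

random.seed(0)

# Hypothetical "teacher": a fixed bigram table standing in for a large
# model's next-token distribution. Table and names are invented.
teacher = {
    "a": {"b": 0.9, "c": 0.1},
    "b": {"c": 0.8, "a": 0.2},
    "c": {"a": 1.0},
}

def sample_teacher(ctx):
    """Sample the teacher's next token for a one-character context."""
    r, acc = random.random(), 0.0
    for tok, p in teacher[ctx].items():
        acc += p
        if r < acc:
            return tok
    return tok  # guard against float rounding

# "Distillation" in miniature: query the teacher many times, then fit the
# student to the generated (context, next-token) pairs by counting.
counts = defaultdict(lambda: defaultdict(int))
for _ in range(5000):
    ctx = random.choice(sorted(teacher))
    counts[ctx][sample_teacher(ctx)] += 1

student = {
    ctx: {tok: n / sum(nxt.values()) for tok, n in nxt.items()}
    for ctx, nxt in counts.items()
}

# The student now reproduces the teacher's preferred continuation in every
# context, without ever having seen the teacher's weights.
for ctx in teacher:
    assert max(student[ctx], key=student[ctx].get) == max(teacher[ctx], key=teacher[ctx].get)
```

Real distillation fine-tunes a neural student on teacher-generated text (the R1 paper's Llama results are this at scale), but the economics are visible even in the toy: the cost is querying the teacher and fitting the student, not any privileged access to the teacher's internals.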
Everyone is responding to the intellectual property issue, but isn't that the less interesting point?
If Deepseek trained off OpenAI, then it wasn't trained from scratch for "pennies on the dollar" and isn't the Sputnik-like technical breakthrough that we've been hearing so much about. That's the news here. Or rather, the potential news, since we don't know if it's true yet.
Hey, OpenAI, so, you know that legal theory that is the entire basis of your argument that any of your products are legal? "Training AI on proprietary data is a use that doesn't require permission from the owner of the data"?
You might want to consider how it applies to this situation.
I was wondering if this might be the case, similar to how Bing’s initial training included Google’s search results [1]. I’d be curious to see more details of OpenAI’s evidence.
It is, of course, quite ironic for OpenAI to indiscriminately scrape the entire web and then complain about being scraped themselves.
The cat is out of the bag. This is the landscape now, r1 was made in a post-o1 world. Now other models can distill r1 and so on.
I don't buy the argument that distilling from o1 undermines DeepSeek's claims around expense at all. Just as OpenAI used the tools available to them to train their models (e.g. everyone else's data), R1 is using today's tools.
Does OpenAI really have a moral or ethical high ground here?
If it's true, how is it problematic? It seems aligned with their mission:
> We will attempt to directly build safe and beneficial AGI, but will also consider our mission fulfilled if our work aids others to achieve this outcome.
> We will actively cooperate with other research and policy institutions; we seek to create a global community working together to address AGI’s global challenges.
OpenAI has been in a war-room for days searching for a match in the data, and they just came out with this without providing proof.
My cynical opinion is that the training corpus has some small amount of data generated by OpenAI, which is probably impossible to avoid at this point, and they are hanging on that thread for dear life.
For the curious, it was vertical integration across the railroad, oil, and coal industries where the money was made.
The problem for AI is the hardware is commodified and offers no natural monopoly, so there isn't really anything obvious to vertically integrate-towards-monopoly.
The railroads drama ended when JP Morgan (the person, not yet the entity) brought all the railroad bosses together, said "you all answer to me because I represent your investors / shareholders", and forced a wave of consolidation and syndicates because competition was bad for business.
Then all the farmers in the midwest went broke not because they couldn't get their goods to market, but because JP Morgan's consolidated syndicates ate all their margin hauling their goods to market.
Consolidation and monopoly over your competition is always the end goal.
> US and could get DeepSeek banned in western countries for copyright
If the US is going to proceed with a trade war on the EU, as it was planning anyway, then DeepSeek will be banned only in the US. It seems like the term "western countries" is slowly eroding.
The most interesting part is that China has been ahead of the US in AI for many years, just not in LLMs.
You need to visit mainland China and see how AI applications are everywhere, from transport to goods shipping.
I'm not surprised at all. I hope this in the end makes the US kill its strict IP laws, which is the problem.
If the US doesn't, China will always have a huge edge on it, no matter how much NVidia hardware the US has.
And you know what, Huawei is already making inference hardware... it won't take them long to finally copy the TSMC tech and flip the situation upside down.
When China can make the equivalent of H100s, it will be hilarious because they will sell for $10 in Aliexpress :-)
I understand they just used the API to talk to the OpenAI models. That... seems pretty innocent? Probably they even paid for it? OpenAI is selling API access, someone decided to buy it. Good for OpenAI!
I understand ToS violations can lead to a ban. OpenAI is free to ban DeepSeek from using their APIs.
But it does mean the moat is even less defensible for companies whose fortunes are tied to their foundation models having some performance edge, and to a shift in the kinds of hardware used for inference (smaller, closer to the edge).
That's not correct. First of all, training off of data generated by another AI is generally a bad idea because you'll end up with a strictly less accurate model (usually). But secondly, and more to your point, even if you were to use training data from another model, YOU STILL NEED TO DO ALL THE TRAINING.
Using data from another model won't save you any training time.
> where the Moore’s law and advancement will eventually allow people to run open models locally
Probably won't be Moore's law (which is kind of slowing down) so much as architectural improvements (both on the compute side and the model side - you could say that R1 represents an architectural improvement of efficiency on the model side).
> So they'll go for getting DeepSeek banned like TikTok was, now that a precedent has been set?
Can't really ban what can be downloaded for free and hosted by anyone. There are many providers hosting the ~700B parameter version that aren't CCP aligned.
I think that's a very possible outcome. A lot of people investing in AI are thinking there's a google moment coming where one monopoly will reign supreme. Google has strong network effects around user data AND economies of scale. Right now, AI is 1-player with much weaker network effects. The user data moat goes away once the model trains itself effectively and the economies of scale advantage goes away with smart small models that can be efficiently hosted by mortals/hobbyists. The DeepSeek result points to both of those happening in the near future. Interesting times.
> The humor/hypocrisy of the situation aside, it does seem to be true that OpenAI is consistently the one coming up with new ideas first (GPT 4, o1, 4o-style multimodality, voice chat, DALL-E, …) and then other companies reproduce their work, and get more credit because they actually publish the research
I claim one just can't put the humor/hypocrisy aside that easily.
What OpenAI did with the release of ChatGPT was productize research that was open and ongoing at DeepMind and other leading labs, at least as much. And everything after that was an extension of the basic approach: improved, expanded, but ultimately the same sort of beast. One might even say OpenAI's situation relative to DeepMind was like Apple's relative to Xerox. Productizing is nothing to sneeze at; it requires creativity and work to turn basic research into a product. But you naturally get end users who consider the productizers the "fountainheads", overestimating them because products are all they see.
Why would it cast any doubt? If you can use o1's output to build a better R1, then use R1's output to build a better X1, then a better X2 ... XN, that just shows a method for creating better systems at a fraction of the cost from where we stand. If it were that obvious, OpenAI should have done it themselves. But the disruptors did it. In hindsight it might sound obvious, but that is true of all innovations. It is all good stuff.
Reminds me of the Bill Gates quote when Steve Jobs accused him of stealing the ideas of Windows from Mac:
> Well, Steve... I think it’s more like we both had this rich neighbor named Xerox and I broke into his house to steal the TV set and found out that you had already stolen it.
Xerox could be seen as Google, whose researchers produced the landmark Attention Is All You Need paper, and the general public, who provided all of the training data to make these models possible.
On another subject: if it belongs to OpenAI because it was made using OpenAI, then doesn't that mean everything produced using OpenAI belongs to OpenAI? Isn't that a reason not to use OpenAI? It's very similar to saying that because you searched with Google, your product now belongs to Google. They couldn't figure out how to respond, so they went crazy.
Even if all that about training is true, the bigger cost is inference and Deepseek is 100x cheaper. That destroys OpenAI/Anthropic's value proposition of having a unique secret sauce so users are quickly fleeing to cheaper alternatives.
Google Deepmind's recent Gemini 2.0 Flash Thinking is also priced at the new Deepseek level. It's pretty good (unlike previous Gemini models).
The existence of R1-Zero is evidence against any sort of theft of OpenAI's internal CoT data. The model sometimes outputs illegible text that's useful only to R1. You can't do distillation without a shared vocabulary. The only way R1 could exist is if they trained it with RL.
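The shared-vocabulary point can be made concrete with a toy example. The two "tokenizers" below are invented for illustration; the point is that logit-level distillation compares next-token distributions position by position over token IDs, which is only meaningful if both models map the same text to the same IDs.

```python
# Two hypothetical tokenizers (invented for this example) that cover the same
# words but assign different token IDs.
teacher_vocab = {"open": 0, "ai": 1, "deep": 2, "seek": 3}
student_vocab = {"deep": 0, "seek": 1, "open": 2, "ai": 3}

def encode(text, vocab):
    """Map whitespace-separated words to token IDs."""
    return [vocab[w] for w in text.split()]

text = "deep seek open ai"
print(encode(text, teacher_vocab))  # [2, 3, 0, 1]
print(encode(text, student_vocab))  # [0, 1, 2, 3]

# The ID sequences differ, so a position-by-position comparison of the two
# models' next-token logits is not defined. Text-level distillation (training
# on sampled outputs, as in the toy example further up the thread) sidesteps
# this, which is why it's the only route without access to the teacher's
# tokenizer and logits.
assert encode(text, teacher_vocab) != encode(text, student_vocab)
```

In practice real tokenizers also split words into subword pieces differently, which makes the mismatch even worse than this toy suggests.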
DDoSing web sites and grabbing content without anyone's consent is not hard-earned at all. They did spend billions on their thing, but nothing was "earned", as they could never have done that legally.
Editorial Channel
What the content says
ND
Preamble
Medium Practice
No editorial content accessible; paywall prevents evaluation of article substance
FW Ratio: 60%
Observable Facts
Page displays subscription-gated interface with primary call-to-action 'Subscribe to unlock this article'
Only headline visible: 'OpenAI says it has evidence China's DeepSeek used its model to train competitor'
Page structure consists of navigation menus, subscription tier options, and paywall messaging rather than article content
Inferences
Paywall model indicates structural prioritization of revenue over universal information access
Access barrier design restricts practical realization of universal dignity for those without financial capacity to subscribe
ND
Article 1: Freedom, Equality, Brotherhood
Low Practice
No editorial content accessible
FW Ratio: 50%
Observable Facts
Subscription pricing tiers create differential access based on payment capacity
Inferences
Paywall structure creates de facto inequality in access to information based on wealth
ND
Article 2: Non-Discrimination
Low
No relevant editorial content observable
ND
Article 3: Life, Liberty, Security
No relevant content
ND
Article 4: No Slavery
No relevant content
ND
Article 5: No Torture
No relevant content
ND
Article 6: Legal Personhood
No relevant content
ND
Article 7: Equality Before Law
No relevant content
ND
Article 8: Right to Remedy
No relevant content
ND
Article 9: No Arbitrary Detention
No relevant content
ND
Article 10: Fair Hearing
No relevant content
ND
Article 11: Presumption of Innocence
No relevant content
ND
Article 12: Privacy
Medium Practice
No editorial content accessible
FW Ratio: 67%
Observable Facts
Subscription interface requires email and payment credentials
Privacy Policy mentioned in footer indicating data collection practices
Inferences
Paywall monetization model necessitates personal data collection, creating privacy concerns
ND
Article 13: Freedom of Movement
No relevant content
ND
Article 14: Asylum
No relevant content
ND
Article 15: Nationality
No relevant content
ND
Article 16: Marriage & Family
No relevant content
ND
Article 17: Property
No relevant content
ND
Article 18: Freedom of Thought
No relevant content
ND
Article 19: Freedom of Expression
High Practice
No editorial content accessible due to paywall; cannot evaluate editorial stance on freedom of expression and information
FW Ratio: 60%
Observable Facts
Article headline visible but full content restricted behind 'Subscribe to unlock this article' barrier
Page prominently displays three subscription tiers (Standard Digital, FT Digital Edition, Premium Digital) as mandatory for content access
No free or alternative access option provided for article content
Inferences
Paywall structure creates deliberate structural barrier to information access contrary to universal information rights
Subscription model restricts freedom to receive information to those with financial capacity, violating universal accessibility principles
ND
Article 20: Assembly & Association
No relevant content
ND
Article 21: Political Participation
No relevant content
ND
Article 22: Social Security
No relevant content
ND
Article 23: Work & Equal Pay
No relevant content
ND
Article 24: Rest & Leisure
No relevant content
ND
Article 25: Standard of Living
Medium Practice
No editorial content accessible
FW Ratio: 60%
Observable Facts
Navigation displays Markets, Economics, Personal Finance, and Business sections all behind paywall
Visible headlines include economic content (UK property crisis, Iran military operations, US bank stocks) accessible only to subscribers
No summary or free tier of economic/financial information available
Inferences
Information access required for informed economic decision-making is restricted to paid subscribers
Unequal access to economic and financial information affects ability to maintain and improve adequate living standards
ND
Article 26: Education
Low Practice
No educational content accessible
FW Ratio: 50%
Observable Facts
Business Education and educational content sections visible in navigation but behind paywall
Inferences
Access to educational and knowledge-building content restricted to subscribers only
ND
Article 27: Cultural Participation
Medium Practice
No editorial content accessible
FW Ratio: 60%
Observable Facts
Life & Arts section (Books, Food & Drink, Art, Travel, Puzzles) visible in navigation but behind paywall
Scientific analysis, opinion, and intellectual commentary restricted to subscribers
FT Magazines and special reports gated behind subscription
Inferences
Subscription model creates unequal access to cultural participation and scientific information
Intellectual and cultural output restricted to those with financial means to subscribe
ND
Article 28: Social & International Order
No relevant content
ND
Article 29: Duties to Community
No relevant content
ND
Article 30: No Destruction of Rights
No relevant content
Structural Channel
What the site does
ND
Preamble
Medium Practice
Observable paywall structure restricts user access to information, contrary to UDHR's emphasis on universal human dignity and equal rights to knowledge
ND
Article 1: Freedom, Equality, Brotherhood
Low Practice
Paywall creates unequal access conditions; those without financial means cannot access information regardless of other characteristics
ND
Article 2: Non-Discrimination
Low
No observable discrimination based on protected characteristics in paywall access policy
ND
Article 3: Life, Liberty, Security
No observable relevance to life, liberty, or personal security
ND
Article 4: No Slavery
No observable relevance
ND
Article 5: No Torture
No observable relevance
ND
Article 6: Legal Personhood
No observable relevance
ND
Article 7: Equality Before Law
No observable relevance
ND
Article 8: Right to Remedy
No observable relevance
ND
Article 9: No Arbitrary Detention
No observable relevance
ND
Article 10: Fair Hearing
No observable relevance
ND
Article 11: Presumption of Innocence
No observable relevance
ND
Article 12: Privacy
Medium Practice
Subscription model requires personal data collection (account creation, payment information); user privacy implicated by business structure
ND
Article 13: Freedom of Movement
No observable relevance to freedom of movement
ND
Article 14: Asylum
No observable relevance
ND
Article 15: Nationality
No observable relevance
ND
Article 16: Marriage & Family
No observable relevance
ND
Article 17: Property
No observable relevance
ND
Article 18: Freedom of Thought
No observable relevance
ND
Article 19: Freedom of Expression
High Practice
Paywall explicitly restricts access to information; directly contradicts UDHR Article 19 which guarantees freedom to 'seek, receive and impart information and ideas through any media and regardless of frontiers' without financial gatekeeping
ND
Article 20: Assembly & Association
No observable relevance to freedom of assembly and association
ND
Article 21: Political Participation
No observable relevance
ND
Article 22: Social Security
No observable relevance
ND
Article 23: Work & Equal Pay
FT operates within a labor law framework; no violations in employment practices are observable from the paywall page
ND
Article 24: Rest & Leisure
No observable relevance
ND
Article 25: Standard of Living
Medium Practice
Paywall restricts access to information (Markets, Economics, Personal Finance, Business) that materially contributes to ability to maintain adequate standard of living
ND
Article 26: Education
Low Practice
Paywall restricts access to educational content and knowledge resources (Business Education section visible but gated)
ND
Article 27: Cultural Participation
Medium Practice
Paywall restricts access to cultural, scientific, and intellectual content (Arts, Books, Food & Drink, Travel, analysis); contradicts universal access to human cultural heritage and scientific knowledge
ND
Article 28: Social & International Order
No observable relevance to social order framework
ND
Article 29: Duties to Community
No observable relevance
ND
Article 30: No Destruction of Rights
No observable relevance
Supplementary Signals
How this content communicates, beyond directional lean.
build 6157e1d+ai0o · deployed 2026-02-28 16:55 UTC · evaluated 2026-02-28 16:29:11 UTC