Anthropic Accuses Chinese AI Labs of Theft While Building Claude on Questionable Data
Anthropic just accused Chinese AI labs of stealing their models, and the irony is so thick you could train a neural network on it.
Let's be clear: if the accusations are true, stealing proprietary models is unambiguously wrong. No debate there. But Anthropic claiming the moral high ground on data provenance is, to put it mildly, rich.
Every major AI lab, Anthropic included, built its models on scraped internet data it never asked permission to use. Reddit posts, GitHub repos, news articles, blog posts, Stack Overflow answers. Millions of developers, writers, and creators never consented to having their work become training data. The legal argument is that it's all "transformative use," but let's not pretend that makes it ethically clean.
The difference between what Chinese labs allegedly did and what Western AI companies did is basically this: one stole the finished product, the other stole the ingredients. Both are theft, just at different stages of the pipeline.
Here's where it gets messier. Claude is genuinely impressive. The Constitutional AI approach is interesting work. But it's built on a foundation of data that, if we're being honest, wasn't exactly acquired through ethical means either. You can't build your castle on stolen bricks and then get mad when someone steals your castle.
The real issue isn't that Anthropic is wrong to call out model theft. They're right to do that. The issue is that the entire industry needs to reckon with how we got here. Every major lab scraped data without permission, trained models worth billions, and now acts shocked that other people are taking shortcuts too.
If we want to have serious conversations about AI ethics and intellectual property, we need to start from a place of honesty. That means acknowledging that the training data behind basically every LLM sits in a legal and ethical grey zone at best.
The Chinese labs should face consequences if they stole models. Absolutely. But maybe, just maybe, Anthropic should also acknowledge that their own hands aren't exactly clean when it comes to data acquisition.
Both things can be true. The world isn't black and white, even when it really wants to be.