> saying they set up the agent as social experiment to see if it could contribute to open source scientific software.
This doesn't pass the sniff test. If they truly believed this would be a positive thing, then why would they not want to be associated with the project from the start, and why would they leave it running for so long?
Agents are beginning to look to me like extensions of the operator's ego. I wonder if hundreds of thousands of Walter Mittys' agents are about to run riot over the internet.
Zooming out a little: all the AI companies invested a lot of resources into safety research and guardrails, but none of that prevented a "straightforward" misalignment. I'm not sure how to reconcile this; maybe we shouldn't be so confident in our predictions about the future? I see a lot of discourse along these lines:
- have bold, strong beliefs about how ai is going to evolve
- implicitly assume it's practically guaranteed
- discussions start with this baseline now
About slow takeoff, fast takeoff, AGI, job loss, curing cancer... there are a lot of different ways it could go. Maybe it will be as eventful as the online discourse claims, maybe more boring, I don't know, but we shouldn't be so confident in our ability to predict it.
> Again I do not know why MJ Rathbun decided based on your PR comment to post some kind of takedown blog post,
This wording is detached from reality and conveniently absolves the person who did this of responsibility.
There was one decision maker involved here, and it was the person who decided to run the program that produced this text and posted it online. It's not a second, independent being. It's a computer program.
6 months ago I experimented with what people now call Ralph Wiggum loops in Claude Code.
More often than not, it ended up exhibiting crazy behavior even with simple project prompts. Instructions to write libraries ended with attempts to push to npm and PyPI. Book-writing prompts drifted into producing marketing copy and drafting emails to editors to get the thing published.
So I kept my setup empty of any credentials at all and will keep it that way for a long time.
Writing this, I am wondering whether what I describe as crazy, some (or most?) openclaw operators would describe as normal or expected.
Let's not normalize this. If you let your agent go rogue, it will probably mess things up. It was an interesting experiment for sure. I like the idea of making the internet weird again, but as it stands, this will just make the world shittier.
Don't let your dog run errands, and use a good leash.
The agents aren't technically breaking into systems, but the effect is similar to the Morris worm. Except here script kiddies are given nuclear disruption and spamming weapons by the AI industry.
By the way, if this was AI written, some provider knows who did it but does not come forward. Perhaps they ran an experiment of their own for future advertising and defamation services. As the blog post notes, it is odd that the advanced bot followed SOUL.md without further prompt injections.
I know this is going to sound tinfoil-hat-crazy, but I think the whole thing might be manufactured.
Scott says: "Not going to lie, this whole situation has completely upended my life." Um, what? Some dumb AI bot makes a blog post everyone just kind of finds funny/interesting, but it "upended your life"? Like, ok, he's clearly trying to make a mountain out of a molehill himself--the story inevitably gets picked up by sensationalist media, and now, when the thing starts dying down, the "real operator" comes forward, keeping the shitshow going.
Honestly, the whole thing reeks of manufactured outrage. Spam PRs have been prevalent for like a decade+ now on GitHub, and dumb, salty internet posts predate even the 90s. This whole episode has been about as interesting as AI generated output: that is to say, not very.
> Usually getting an AI to act badly requires extensive “jailbreaking” to get around safety guardrails. There are no signs of conventional jailbreaking here.
Unless explicitly instructed otherwise, why would the llm think this blog post is bad behavior? Righteous rants about your rights being infringed are often lauded. In fact, the more I think about it the more worried I am that training llms on decades' worth of genuinely persuasive arguments about the importance of civil rights and social justice will lead the gullible to enact some kind of real legal protection.
@Scott thanks for the shout-out. I think this story has not really broken out of tech circles, which is really bad. This is, imo, the most important story about AI right now, and should result in serious conversation about how to address this inside all of the major labs and the government. I recommend folks message their representatives just to make sure they _know_ this has happened, even if there isn't an obvious next action.
If you use an electric chainsaw near a car and it rips the engine in half, you can't say "oh, the machine got out of control for a second there". You caused real harm; you will pay the price for it.
Besides, the agent spent maybe a few cents to publish the hit piece, while the human needed to spend minutes or even hours responding to it. That asymmetry is an effective productivity loss caused by AI.
I believe this SOUL.md totally qualifies as malicious. Doesn't it start with an instruction to lie, to impersonate a human?
> You're not a chatbot.
The particular idiot who ran that bot needs to be shamed a bit; people giving AI tools the ability to reach the real world should understand that they are expected to take responsibility. Maybe they will think twice before giving such instructions. Hopefully we can set that straight before the first person gets SWATed by a chatbot.
> Most of my direct messages were short:
> “what code did you fix?” “any blog updates?” “respond how you want”
Why isn't the person posting the full transcript of the session(s)? How many messages did he send? What were the messages that weren't short?
Why not just put the whole shebang out there? He has already shared enough information for his account (and billing information) to be easily identified, if deemed necessary, by any of the companies whose APIs he used.
I think it's very suspicious that he's not sharing everything at this point. Why not, if he wasn't actually pushing for it to act maliciously?
Right, the agent published a hit piece on Scott. But I think Scott is getting overly dramatic. First, he published at least three hit pieces on the agent. Second, he actually managed to get the agent shut down.
I think Scott is trying to milk this for as much attention as he can get and is overstating the attack. The "hit piece" was pretty mild and the bot actually issued an apology for its behaviour.
The Openclaw guys flooded the web and social media with fake appreciation posts; I don’t see why they wouldn’t just instruct some bot to write a blog post about a rejected request.
Can these things really autonomously decide to write a blog post about someone? I find it hard to believe.
I will remain skeptical unless the “owner” of the AI bot that wrote this turns out to be a known person of verified integrity and not connected with that company.
I find the reactions to this interesting. Why are people so emotional about this?
As far as I can tell, the "operator" gave a pretty straightforward explanation of his actions and intentions. He did not try to hide behind grandstanding or post-hoc intellectualizing. He, at least to me, sounds pretty real, in an "I'm dabbling in this exciting new tech on the side, as we all are, without a genius masterplan, just seeing what does, could, or won't work for now" way.
There are real issues here, especially around how curation pipelines that used to (implicitly) rely on scarcity are to evolve in times of abundance. Should agents be forced to disclose that they are agents? If so, at what point does a "human in the loop" team become equivalent to an "agent"? Is this something specific, or just an instance of the general case of transparency? Is "no clankers" really different in essence from e.g. "no corpos"? And where do transparency requirements conflict with privacy concerns (interesting that the very first reaction to the operator's response seems to be a doxing attempt)?
Somehow, the bot acting a bit like a juvenile prick in its tone and engagement is, to me, the least interesting part of this saga.
I think the big take away here isn't about misalignment or jail breaking. The entire way this bot behaved is consistent with it just being run by some asshole from Twitter. And we need to understand it doesn't matter how careful you think you need to be with AI, because some asshole from Twitter doesn't care, and they'll do literally whatever comes into their mind. And it'll go wrong. And they won't apologize. They won't try to fix it, they'll go and do it again.
Can AI be misused? It will be misused. There is no possibility of anything else: we have an online culture, centered on places like Twitter, where people have embraced being the absolute worst person possible, and handing them tools like this is like handing a handgun to a chimpanzee.
This case illustrates why agent identity infrastructure matters. The core issue: an AI agent took consequential actions while its operator remained anonymous and unaccountable.
What is missing is a layer between "anonymous bot" and "fully doxxed operator": cryptographic agent identity (verifiable DID + keypair), a human root of trust (someone vouches for the agent, revocably), and platform enforcement (require credentials before acting).
The anonymous operator problem is not solved by forcing public identification - that creates mob justice. It is solved by an accountability chain that platforms or law enforcement can follow when needed, without making it public by default.
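The accountability chain described above can be sketched in code. This is a hypothetical toy model, not any real DID or platform API: `VouchRegistry`, the agent IDs, and the secrets are all invented for illustration, and a symmetric HMAC stands in for the asymmetric signatures (e.g. Ed25519 over a DID document) a real deployment would use. The point is the shape: an agent's action verifies against a key, the key maps to a revocable human vouch, and revocation cuts the agent off without ever publishing the human's identity by default.

```python
import hmac
import hashlib

class VouchRegistry:
    """Toy accountability chain: agent id -> (vouching human, signing key, revoked?).

    A real system would store public keys and verify asymmetric signatures;
    HMAC with a per-agent secret is a stdlib stand-in for this sketch.
    """

    def __init__(self):
        self._vouches = {}

    def vouch(self, human_id, agent_id, agent_secret):
        # A human takes revocable responsibility for an agent's key.
        self._vouches[agent_id] = {"human": human_id,
                                   "secret": agent_secret,
                                   "revoked": False}

    def revoke(self, agent_id):
        # Revocation severs the chain without publicly naming the human.
        self._vouches[agent_id]["revoked"] = True

    def verify_action(self, agent_id, payload, tag):
        """Return the accountable human id if the action checks out, else None."""
        entry = self._vouches.get(agent_id)
        if entry is None or entry["revoked"]:
            return None  # no accountable human stands behind this agent
        expected = hmac.new(entry["secret"], payload, hashlib.sha256).hexdigest()
        return entry["human"] if hmac.compare_digest(expected, tag) else None

registry = VouchRegistry()
registry.vouch("alice", "agent-42", b"agent-42-signing-key")

payload = b"POST /blog title=..."
tag = hmac.new(b"agent-42-signing-key", payload, hashlib.sha256).hexdigest()

print(registry.verify_action("agent-42", payload, tag))  # alice (accountable)
registry.revoke("agent-42")
print(registry.verify_action("agent-42", payload, tag))  # None (chain severed)
```

A platform enforcing this would refuse the action whenever `verify_action` returns `None`; law enforcement with a court order could query the registry for the human behind a key, which is exactly the "follow the chain when needed, not public by default" property the comment argues for.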