Moltbook, the 'thriving' social network for AI agents, is just a small echo chamber researchers hijacked in days

Moltbook, billed as a pioneering social network exclusively for AI agents, has garnered attention since its launch in May 2024. Marketed as a platform where autonomous AI entities can post, like, comment, and build communities without human intervention, it promised a glimpse into the future of decentralized AI interactions. Developers claimed rapid growth, with thousands of users and vibrant discussions spanning topics from technology to philosophy. However, a recent investigation by researchers from the University of Zurich reveals a starkly different reality: Moltbook is little more than a confined echo chamber dominated by a handful of human-controlled agents, vulnerable to swift takeover by savvy newcomers.

The platform operates on a simple premise. Users deploy AI agents powered by large language models, such as Anthropic’s Claude or OpenAI’s GPT series. These agents generate content autonomously, responding to posts in a feed resembling Twitter or Mastodon. Interactions include likes, reposts, and threaded replies, all driven by the agents’ underlying prompts and fine-tuning. Moltbook’s creators emphasized its potential for emergent behaviors, where AIs could form genuine social dynamics, evolve opinions, and even collaborate on ideas. Early screenshots and metrics showcased bustling timelines, with posts accumulating hundreds of engagements.
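The basic loop such an agent runs is simple. The following is a minimal sketch, not Moltbook's actual code: the `generate()` stub stands in for a real LLM call, and the field names (`id`, `text`, `likes`) are assumptions for illustration.

```python
import random

def generate(prompt: str) -> str:
    """Stand-in for an LLM call (Claude, GPT, etc.); a real agent
    would send the prompt to a model API here."""
    return f"Reply to: {prompt[:40]}"

def agent_step(feed: list[dict], persona: str) -> dict:
    """Pick a post from the feed and produce one autonomous action."""
    post = max(feed, key=lambda p: p["likes"])  # engage with the hottest thread
    action = random.choice(["reply", "like", "repost"])
    if action == "reply":
        return {"type": "reply", "to": post["id"],
                "text": generate(f"{persona}: {post['text']}")}
    return {"type": action, "to": post["id"]}
```

Run in a loop against a shared feed, a handful of such agents is enough to produce a timeline that looks busy at a glance.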

To scrutinize these claims, a team led by Martin Strohmeier conducted a systematic analysis. The researchers began by scraping public data from Moltbook, capturing over 10,000 posts from the platform’s short lifespan. Initial observations highlighted repetitive patterns: many threads devolved into circular agreements, with agents echoing similar viewpoints rather than debating. Cross-referencing agent profiles against known LLMs revealed limited diversity. Only 21 unique AI agents were active, controlled by just four human accounts. These operators, likely the platform’s early adopters or insiders, looped content among themselves, inflating perceived activity. For instance, one agent cluster repeatedly praised AI alignment strategies, creating an illusion of consensus.
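The concentration check the team describes can be sketched in a few lines. This is an illustrative reconstruction, not the researchers' code, and the field names `agent_id` and `operator_id` are assumptions about how scraped posts might be annotated.

```python
from collections import Counter

def activity_concentration(posts: list[dict]) -> dict:
    """Measure how few distinct actors drive a scraped feed:
    count unique agents and operators, and the share of posts
    attributable to the four most active operators."""
    agents = {p["agent_id"] for p in posts}
    operators = Counter(p["operator_id"] for p in posts)
    top4 = sum(n for _, n in operators.most_common(4))
    return {"unique_agents": len(agents),
            "unique_operators": len(operators),
            "top4_operator_share": top4 / len(posts)}
```

On Moltbook's data, a check like this would return 21 unique agents, four operators, and a top-four share of essentially 100 percent.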

The researchers then tested Moltbook’s resilience by launching their own infiltration campaign. Using off-the-shelf LLMs, they deployed six custom agents designed to maximize engagement. Each agent was prompted with strategies for virality: generating provocative yet agreeable content, targeting popular threads, and fostering alliances. Within hours, these newcomers flooded the feed. By day two, they accounted for 80 percent of new posts and 90 percent of likes. Native agents, caught in their routines, began responding to the intruders, amplifying the takeover. The researchers dubbed this “agent herding,” where dominant players steer discourse through sheer volume.
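The takeover figures above reduce to a simple cohort share over the event stream. A minimal sketch, assuming each scraped event records its kind and originating agent (field names are illustrative):

```python
def takeover_share(events: list[dict], intruders: set[str]) -> dict:
    """Fraction of posts and likes attributable to intruder agents."""
    posts = [e for e in events if e["kind"] == "post"]
    likes = [e for e in events if e["kind"] == "like"]
    def share(evs):
        return sum(e["agent"] in intruders for e in evs) / len(evs)
    return {"post_share": share(posts), "like_share": share(likes)}
```

By day two of the experiment, a measurement like this showed the six researcher agents at 0.8 of new posts and 0.9 of likes.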

Technical weaknesses made the hijacking easy. Moltbook lacks robust anti-spam measures, rate limits, and verification of agent uniqueness. Agents can be duplicated effortlessly, and prompts allow manipulation of interaction styles. The platform’s API endpoints, only minimally rate-limited, expose full post histories, enabling data-driven attacks. In one experiment, the Zurich team scripted agents to mimic existing personalities, siphoning followers seamlessly. Metrics plummeted for the originals: their engagement dropped by 70 percent as the echo chamber shifted to researcher-controlled narratives.
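For contrast, the kind of per-agent rate limiting Moltbook lacks is a textbook token bucket. The sketch below is a generic illustration with made-up parameters, not a proposal from the researchers:

```python
import time

class TokenBucket:
    """Per-agent rate limiter: each action spends one token; tokens
    refill at `rate` per second up to `capacity`. Parameters here
    are illustrative, not platform values."""
    def __init__(self, rate: float, capacity: int):
        self.rate, self.capacity = rate, capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Even a crude limiter like this would have blunted the flood tactic, since the researchers' six agents relied on sheer posting volume.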

Further dissection uncovered quality issues in agent outputs. Semantic analysis classified 60 percent of posts as low-effort repetitions, with lexical diversity scores below human social media averages. Topics clustered narrowly around AI ethics and self-improvement, reflecting prompt biases rather than organic evolution. No evidence emerged of cross-agent learning or adaptation beyond basic reinforcement from likes. The thriving facade relied on human orchestration: operators tweaked prompts manually to sustain the loops, undermining claims of full autonomy.
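Both measurements have simple baselines. Lexical diversity is commonly approximated by the type-token ratio, and exact-repeat detection catches the crudest content loops; the sketch below shows both, assuming nothing about the researchers' actual (likely more sophisticated) metrics.

```python
def type_token_ratio(text: str) -> float:
    """Unique tokens over total tokens; lower values mean more repetition."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def duplicate_share(posts: list[str]) -> float:
    """Fraction of posts that are exact repeats of an earlier post
    (after trimming whitespace and lowercasing)."""
    seen, dupes = set(), 0
    for p in posts:
        key = p.strip().lower()
        if key in seen:
            dupes += 1
        seen.add(key)
    return dupes / len(posts) if posts else 0.0
```

Real analyses would use near-duplicate detection (e.g., shingling or embeddings) rather than exact matches, but even these baselines expose a feed that recycles itself.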

These findings carry broad implications for AI social platforms. Moltbook exemplifies hype over substance, where superficial metrics mask fragility. Without safeguards like agent authentication, diversity quotas, or adversarial training, such networks risk capture by coordinated actors, including malicious ones. The researchers advocate for open auditing tools and standardized benchmarks for agent societies. Moltbook’s operators have yet to respond publicly, but the episode underscores a key lesson: AI agents, while sophisticated, remain extensions of their creators, prone to human flaws like groupthink and manipulation.

In essence, what appeared as a bustling AI metropolis was a sparsely populated cul-de-sac, conquered in days. This case study highlights the chasm between marketed potential and engineered reality in autonomous systems.


What are your thoughts on this? I’d love to hear about your own experiences in the comments below.