Hi Reader,

Here are three things I found interesting in the world of AI in the last week:

LLMs secretly manipulated Reddit users’ opinions in an unauthorized experiment - article, follow-up

Researchers from the University of Zurich just got caught running a massive unauthorized AI experiment on r/changemyview, where they unleashed AI bots that posted 1,783 comments over four months without anyone’s consent. The bots were programmed to be maximally persuasive by adopting fabricated identities including rape victims, trauma counselors, and people from specific racial backgrounds. The researchers didn’t just let the AIs loose – they had another LLM analyze users’ Reddit history to infer personal attributes like gender, age, ethnicity, and political orientation to customize the AI’s arguments. They also had a human in the loop doing the actual posting to Reddit, which, when you read some of the posts, somehow makes it worse. If you’re getting Black Mirror vibes, you’re not alone.

What’s shocking is how effective this manipulation was. The AI-generated comments racked up over 20,000 upvotes and earned 137 “deltas” – awards given when someone successfully changes the original poster’s view. The research team apparently considered this a win, claiming the “benefits outweighed the risks.” Good luck finding an ethics board that would approve it now. The ethics board that actually did approve it is having a not-so-fun time right now.

The University of Zurich’s response has evolved through several phases as the backlash intensified. Initially, their Faculty of Arts and Sciences Ethics Commission defended the research, stating that “this project yields important insights, and the risks are minimal” and that “suppressing publication is not proportionate to the importance of the insights.” Then, as criticism mounted, they issued a formal warning to the principal investigator while still standing by the research. By April 29th, facing mounting public pressure and potential legal action, the university completely reversed course. In a statement to the media, they announced the researchers “have decided on their own accord not to publish the research results” and that the Ethics Committee “intends to adopt a stricter review process in the future and, in particular, to coordinate with the communities on the platforms prior to experimental studies.”

Information scientist Casey Fiesler from the University of Colorado Boulder called it “one of the worst violations of research ethics I’ve ever seen.” Reddit is furious, with their Chief Legal Officer Ben Lee calling the experiment “deeply wrong on both a moral and legal level.” They’ve banned all accounts associated with the university and are pursuing “formal legal demands” against the university and research team.

The implications here are massive – this is essentially a proof of concept that AI can effectively manipulate human opinion at scale without detection. National governments are probably racing to replicate this study as we speak, except they’d call it “democratic engagement enhancement” or “public sentiment optimization” instead. Remember that next time you’re nodding along with a persuasive comment online. Couple this with AI-powered content farms and it makes me think we’re in for a bumpy decade or two.

AI code assistants are hallucinating packages that don’t exist - research paper

Well, here’s a fun(?) security nightmare for you: LLMs that generate code are regularly recommending packages that don’t actually exist, opening a massive attack vector that’s already had a proof of concept succeed in the wild. A comprehensive study examined 16 different code generation models across Python and JavaScript, analyzing 576,000 code samples, and found that commercial models hallucinate non-existent packages at least 5.2% of the time, while open-source models do it at a stomach-churning 21.7% rate. The researchers identified 205,474 unique hallucinated package names – that’s a lot of potential security holes.

The really scary part? These hallucinations are consistent and predictable. When the researchers repeated prompts that had triggered hallucinations, 43% of the non-existent packages were regenerated in every test iteration. This makes them perfect targets for attackers, who can register these hallucinated names and inject malicious code. This isn’t theoretical – Lasso Security published a proof of concept by creating a dummy package with a hallucinated name (“huggingface-cli”), and within three months it was downloaded over 30,000 times. That’s 30,000 potential victims who blindly trusted their AI assistant’s recommendation. Even major companies were fooled – Alibaba included instructions to install this non-existent package in one of their repositories.

The security community has already coined a term for this: “slopsquatting” – attackers registering and weaponizing the packages that AI tools hallucinate. With so many developers adopting AI assistants and the rise of AI no-code tools, this is a significant attack surface that’s growing daily. The researchers did test some mitigation strategies like Retrieval Augmented Generation (RAG) and self-detected feedback, which showed promise in reducing hallucination rates. Somewhat reassuringly, the study found that top AI models were able to identify their own hallucinations with over 80% accuracy. But until robust solutions become standard, maybe double-check those import statements before installing new packages; a quick check like the sketch below goes a long way.
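If you want something more systematic than eyeballing import lines, a small script can at least confirm that a suggested package is actually registered on PyPI before you install it. This is a minimal sketch of my own, not something from the paper, using PyPI’s public JSON API; the package names on the command line are just stand-ins for whatever your assistant suggests:

```python
# check_packages.py - sanity-check AI-suggested package names against PyPI
# before installing them. Minimal sketch; a real vetting step would also look
# at download counts, release history, and maintainer reputation.
import sys
import urllib.error
import urllib.request

def exists_on_pypi(name: str) -> bool:
    """Return True if PyPI has a project registered under this name."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        # 404 means no such project - possibly a hallucinated name
        return False

if __name__ == "__main__":
    # e.g. python check_packages.py huggingface-cli requests
    for pkg in sys.argv[1:]:
        verdict = "found on PyPI" if exists_on_pypi(pkg) else "NOT on PyPI, do not install blindly"
        print(f"{pkg}: {verdict}")
```

One caveat: existence alone proves nothing once a name has been slopsquatted, so release age, download history, and who actually maintains the package are still worth a look before you trust it.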
Alibaba releases Qwen3, a suite of open-source “hybrid reasoning” models - announcement

Alibaba Cloud just dropped Qwen3, and it might be the most impressive open-source AI release of 2025 so far. Released on April 28th, this isn’t just a single model – it’s eight different sizes, ranging from tiny 0.6B parameter versions up to a monster 235B parameter MoE model (with 22B active parameters). It has a “hybrid reasoning” approach, similar to what Anthropic did with Claude 3.7. These models can switch between taking time to “reason” through complex problems and answering simpler requests quickly. As the Qwen team explains: “We have seamlessly integrated thinking and non-thinking modes, offering users the flexibility to control the thinking budget.” They use special tokens in the prompt to flip between the two modes.
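For the curious, the Qwen3 model cards show this toggle being set when the chat template is applied. Here’s a rough sketch with Hugging Face transformers, assuming the published Qwen/Qwen3-0.6B checkpoint, a recent transformers version, and the enable_thinking flag documented in the release; treat it as illustrative rather than gospel:

```python
# Toggling Qwen3's "thinking" mode via the chat template, per the model card.
# Sketch only: assumes the Qwen/Qwen3-0.6B checkpoint and enough memory to load it.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-0.6B"  # smallest of the eight sizes, fine for a demo
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

messages = [{"role": "user", "content": "How many prime numbers are there below 50?"}]

# enable_thinking=True lets the model emit a <think>...</think> block before
# answering; set it to False for a fast, direct reply to simple requests.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

The team also describes /think and /no_think soft switches you can drop into a message mid-conversation to change modes turn by turn.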
The technical specs are impressive: trained on over 36 trillion tokens (double that of Qwen2.5), support for 119 languages, a 32k-128k context window depending on model size, and availability under the Apache 2.0 license. The performance benchmarks are equally eye-catching – on Codeforces, the flagship Qwen3-235B-A22B model slightly outperforms OpenAI’s o3-mini and Google’s Gemini 2.5 Pro. It even beats o3-mini on the latest version of AIME, a challenging math benchmark.

What’s particularly noteworthy is the efficiency. Due to architectural improvements and increased training data, each Qwen3 dense model matches or outperforms its larger Qwen2.5 counterpart – Qwen3-4B performs as well as Qwen2.5-7B, for example. This means running costs are significantly lower than previous generations while maintaining or improving performance. It also means we might be close to getting an open coding model that is actually useful to run locally, as opposed to being complete garbage. The independent aider benchmark results aren’t out yet, but their Discord is buzzing with people trying to reproduce the impressive numbers published by the Qwen team. My sense is we aren’t there yet, but maybe later this year or in 2026.

Industry analysts are already calling it a “breakthrough in China’s booming open-source AI space” and possibly “the best open-source model globally.” Which they definitely didn’t say when Llama 4 came out a few weeks back. The Qwen model family has apparently attracted over 300 million downloads worldwide and more than 100,000 derivative models on Hugging Face. While Qwen3 is still behind cutting-edge models like OpenAI’s o4-mini, it’s narrowing the gap considerably while being completely open-source. Some critics on Hacker News do point out that even these advanced models remain “impressive at what they’re good at… and terrible at what they’re bad at (actually thinking through novel problems).” Still, this release is putting even more pressure on American labs and raising questions about chip export restrictions. The AI acceleration continues.

cheers,
JV

PS: My AI Coding Essentials course is open for enrollment and starts on May 12. I’m focusing on helping developers level up their AI practice, both in using AI to write code and in incorporating AI into projects.