No best model anymore, AI labs clash with the Pentagon, and smart glasses get creepy


Here are three things I found interesting in the world of AI in the last week:

1. The "best AI model" era is over - Every.to / Digital Applied comparison

OpenAI launched GPT-5.4 on March 5 to the usual amount of noise. Self-described Claude loyalists are excited. Augment Code made it their default model, calling it "a reliable orchestrator" that uses 18-20% fewer tokens on complex tasks. The headline number: 75% on OSWorld, the first frontier model to beat human experts (72.4%) at autonomous desktop navigation.

GPT-5.4 leads on computer use and knowledge work. Claude Opus 4.6 leads on production coding (80.8% SWE-Bench) and visual reasoning. Gemini 3.1 Pro leads on abstract reasoning (94.3% GPQA Diamond, 77.1% ARC-AGI-2) at roughly a fifteenth the price of GPT-5.4 Pro. On SWE-bench, all three score within a single point of each other at around 80%. The top Hacker News comment on the launch: performance is "becoming increasingly clustered rather than diverging."

Personally, I don't mind that at all. The main takeaway is that all three are very strong, and pretty close to each other in performance. If you really care about squeezing out the best results you'll gain an edge by choosing the right model for the right job, but for most use cases any of them will do the job.

In 3 months there will be more releases again, and the gap between early and late adopters will continue to get bigger.

2. Anthropic and OpenAI are stress-testing AI governance in public - FT / Lawfare

The Pentagon awarded major AI prototype contracts to Anthropic on classified work. Anthropic reportedly refused pressure to remove restrictions on domestic mass surveillance and fully autonomous weapons. The DoD threatened not just to cancel the contract but to label Anthropic a supply chain threat, which would have stopped a whole bunch of companies that supply the US military from using them at all. Trump tweeted childish garbage. Anthropic made a lot of noise in public and threatened to sue.

On Feb 27, OpenAI signed contracts to replace them, and on March 2 amended their post claiming they got exactly the exclusions Anthropic was asking for. Anthropic was back negotiating with the DoD on March 5.

Lots of popcorn emojis from people following the space, and none of us will know the real story. Smells like a lot of politics and spin to me.

The thing I found most interesting was how this spread into a lot of "Anthropic good, OpenAI bad" narratives. I saw more people joining the QuitGPT movement, and more organisations adding "we won't use OpenAI" as part of their AI policies.

I'm no fan of OpenAI's policies, but I think a lot of people missed the comments about Anthropic being totally up for helping research autonomous weapons with no human in the loop. Their only objection was that the current models weren't reliable enough.

It seems likely to me that OpenAI are currently dodgier than Anthropic, but no values stay intact in a big-money US corporate environment, and I think the ethical difference between OpenAI and Anthropic is smaller than most people think.

3. Smart glasses are becoming mainstream surveillance hardware - TechCrunch

Meta reportedly sold 7M+ Ray-Ban smart glasses in 2025, and the privacy harms are no longer hypothetical. Researchers have already shown real-time stranger identification with off-the-shelf tools. Investigations now describe intimate footage being reviewed by offshore contractors for AI training. Meanwhile, the little LED "recording indicator" can be bypassed with cheap mods.

The social norm of "camera visible means consent can be negotiated" breaks down when the camera is almost invisible.

Unfortunately that lovely cesspool of the internet known as the pick-up artist community has started using the glasses to record a whole lot of interactions with women without their knowledge. It makes the Google "glasshole" stuff look tame in comparison.

Combined with Meta's obnoxious approach to privacy we're going to see a pretty strong stigma associated with smart glasses. You only need to look at what teenage girls experience on social media to get a sense of how not-ready society is for this tech.

One dev has built an "any smart glasses active near me" app that scans for Bluetooth signals and alerts the user if it finds anything. Maybe facial-recognition-blocking masks and microphone jammers will become a fashion statement.
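The core of an app like that is surprisingly simple: scan for Bluetooth advertisements and match device names against known fingerprints. Here's a minimal sketch of the matching logic in Python. The actual scanning would need a BLE library (e.g. bleak) and real hardware, and the device-name prefixes below are illustrative guesses, not a verified fingerprint database.

```python
# Sketch of the matching logic for a "smart glasses nearby" alert.
# The prefixes are assumptions for illustration only; a real app
# would maintain a curated fingerprint list and scan with a BLE
# library such as bleak.

KNOWN_GLASSES_PREFIXES = (
    "Ray-Ban Meta",   # assumed advertised name, not verified
    "Meta Glasses",   # assumed advertised name, not verified
)

def looks_like_smart_glasses(advertised_name):
    """Return True if a BLE advertisement name matches a known prefix."""
    if not advertised_name:
        return False
    return advertised_name.startswith(KNOWN_GLASSES_PREFIXES)

def alert_on_glasses(nearby_names):
    """Filter a list of scanned device names down to suspected glasses."""
    return [name for name in nearby_names if looks_like_smart_glasses(name)]
```

The hard part in practice isn't the code, it's keeping the fingerprint list current as vendors change (or deliberately randomise) what their devices advertise.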

My gut feel is it will be a bumpy social and regulatory ride, but ultimately smart glasses with AI in your ear will prove too useful to ignore. I expect we'll need a bunch of new social norms and etiquette to emerge first.

Given that the popular strategy for dealing with a toxic internet is "don't let kids access social media" rather than fixing the core problems, I'm not holding my breath for smooth adoption.

Code With JV

Each week I share the three most interesting things I found in AI
