Hi Reader,

Welcome to the new year. I’ve been mulling over what was actually important in 2025 and what was just noise, and I keep coming back to three things.

1. Claude Code created a category, and everyone copied it

Terminal-based coding tools weren’t new, and Aider and others had been around for a while, but when Anthropic launched Claude Code in February they combined tool use with thinking models in a way that produced an agent that was actually useful. There were a bunch of projects experimenting with coding (and other) agents in 2023 and 2024, but they all sucked. Claude Code was a milestone in capability.

The terminal wasn’t just a developer aesthetic choice. A CLI tool meant you could pipe Claude Code into anything: email, GitHub, Slack, CI pipelines, whatever. Web-based AI couldn’t do that, and that flexibility turned out to be crucial for actually delegating tasks rather than just getting advice you still had to act on yourself.

The business model mattered as much as the tech. In April, Anthropic launched Max at $100 or $200/month for 5x or 20x the usage, with Claude Code bundled in. Before that, running an agentic CLI meant paying per call and watching API costs tick up unpredictably; fixed-cost subscriptions removed that friction and let you run the agent for hours without doing mental accounting. It was heavily subsidized to grab market share, and it worked. The numbers moved fast: 195 million lines of code weekly across 115,000 developers by July, and $1B+ in annualized revenue by December.

Google and OpenAI followed within months, but OpenAI in particular missed the mark initially. In May they launched Codex as a web-based agent that ran in a sandboxed cloud environment connected to your GitHub repos. The problem was that it only really worked if you had a strong continuous integration pipeline with automated tests that ran whenever code changed.
Most solo developers and small teams don’t have that, so the agent would make changes you had no way to verify without pulling them down and testing manually. OpenAI had actually launched Codex CLI a month earlier, in April, but it required paying per use with API tokens, which meant the same unpredictable costs that made agentic coding feel risky. It wasn’t until late August that they let ChatGPT subscribers sign in with their existing subscription and use the CLI without watching a meter tick. By then Anthropic had a four-month head start on the fixed-cost model.

Gemini CLI launched on June 25th with subscription tiers from day one, but Google AI products are still painful for Workspace users: if you have a Workspace account (the business version of Google), you still can’t use your subscription to access Gemini CLI and have to pay per API call instead, while personal Google AI Ultra subscribers get it bundled. Close on feature parity, and with stronger models, Google and OpenAI are nevertheless both behind on adoption and developer experience.

September’s Claude Code 2.0 showed where this is heading: checkpoints that auto-save before every change, subagents that spin up parallel workers for different parts of your codebase, built-in planning, bundled task lists, and non-blocking background tasks. Developers stopped writing code and started managing agents.

Just yesterday, Anthropic made Claude Code available to all users on Team plans, not just the premium seats. A year after launch, they’re still the number one provider in this category, with Codex, Gemini CLI, OpenCode, and others on their heels but still playing second fiddle.

2. Google closed the gap

Remember when Google looked flat-footed after ChatGPT launched? Sundar Pichai declared “code red” in December 2022, reorganized teams, and rushed Bard to market. In 2023 and 2024, Google’s AI offerings were a long way off the mark.
Three years later, Sam Altman used the exact same phrase, and the roles had reversed. Gemini 2.5 Pro launched in March with chain-of-thought reasoning and debuted at #1 on LMArena, then Gemini 3 followed in November and beat OpenAI’s o3 on SWE-bench (76.2% vs 71.7%) while leading on graduate-level physics benchmarks.

Image generation became a back-and-forth battle. In March, OpenAI’s GPT-4o image update went so viral with Ghibli-style art that Sam Altman said it was “melting” their GPUs. Google struck back in August with an image model codenamed Nano Banana that went viral for turning selfies into 3D figurines, helping Gemini jump from 450 million to 650 million monthly users by October. OpenAI responded again in December, three weeks after Nano Banana Pro launched.

Then Apple picked Google to power Siri: a $1 billion/year deal for a custom 1.2 trillion parameter Gemini model running on Apple’s Private Cloud Compute. OpenAI was the previous partner but couldn’t hang on to the deal, Anthropic was reportedly too expensive, and Google already had the Safari search relationship and the infrastructure to run inference at Apple scale. When the deal was announced, Google briefly crossed a $4 trillion market cap and surpassed Apple for the first time since 2019. Salesforce CEO Marc Benioff, who’d used ChatGPT daily for three years, said after trying Gemini 3: “I’m not going back.”

The model gap that analysts pegged at six months in 2024 was being called “potentially zero” by November. ChatGPT still has the strongest brand in AI and the most monthly active users, but that lead has shrunk every year since launch. In 2023, OpenAI was the only game in town. In 2024, competitors were catching up. In 2025, they caught up. The question for 2026 is whether OpenAI can stay ahead, or whether “the AI app” becomes a commodity.

3.
Vibe coding went mainstream

On February 2nd, Andrej Karpathy tweeted something that crystallized what a lot of people were already doing: “There’s a new kind of coding I call ‘vibe coding’, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists… I ‘Accept All’ always, I don’t read the diffs anymore.” The tweet got over 5 million views, and “vibe coding” became Collins Dictionary’s Word of the Year.

A whole category of vibe coding tools emerged: Lovable hit $100M ARR in 8 months, doubled to $200M four months later, and closed a $330M Series B at a $6.6B valuation in December. Bolt.new, Replit, and Vercel’s v0 all compete for the same space. Google AI Studio added vibe coding in October, and it’s free to start if you have a Google account. In Y Combinator’s Winter 2025 batch, 25% of startups were running on 95%+ AI-generated codebases.

This has already changed how I work with clients. People vibe code before they call me now, often showing up with a working prototype instead of a requirements doc. When they haven’t, we use these tools together to rapidly test ideas before committing to a product direction. Our entire greenfields workflow has pivoted: we either throw away the vibe-coded prototype and use it as a functional spec, or we knock it into shape and build on top of it. At Fundsorter, when I’m considering a new feature, I vibe code a fully functional UI first, test it with stakeholders, see if it’s even interesting or useful, and only then decide whether to make it real.

The value of front-end code is rapidly approaching zero. What used to take a week now takes an afternoon. An afternoon spent doing a bunch of other things in parallel, that is. The impact on the cost and speed of software development is hard to overstate.

You don’t need the vibe coding resellers, though. Lovable, Bolt, and the rest are putting a markup on AI credits and wrapping them in a UI.
If you’re willing to learn the basics, you can go straight to the source with Claude Code (and co) and get better results for less money. The entire focus of Learn to Code a Little Bit has shifted to help people make that leap, and to build more on their own without needing dev help.

My definition of vibe coding is “getting the AI to create code you don’t understand”: it’s great for prototyping or very small projects, but useless for anything you actually want to launch. Agentic coding is “getting the AI to create code you do understand”, and I have high conviction that it is the future of software development.

I’m writing this newsletter while seven agents work autonomously on coding, testing, and research tasks. I’ll stop for dinner soon and some of them will keep going. If I weren’t writing this newsletter I’d have twice that number running, and if my workflows were more robust I’d have even more. The limits on knowledge work are moving away from human hours and expertise toward the quality of agentic workflows and the amount of AI compute available. Human hours and expertise still matter, but nowhere near as much as they used to.

That’s my bet on where we’re going in 2026. We’ll see some stronger models, but the real growth will be in agentic orchestration and workflows. It’s an exciting time to be alive. Have a great year!

cheers,
Each week I share the three most interesting things I found in AI