Anthropic and Cloudflare Launch Code Execution Mode as Token Optimization Takes Center Stage
Daily Wrap-Up
The big news today is a joint announcement from Anthropic and Cloudflare introducing a code execution mode for AI agents, and the community reaction tells you everything about where agent development stands right now. On one side, you have people like @connordavis_ai celebrating the potential to dramatically reduce token burn on tool calls. On the other, @stevekrouse is climbing onto his soapbox to remind everyone that this whole journey started because LLMs were bad at writing JSON, and maybe we should think harder about where the architectural decisions are taking us. The tension between "ship it now" pragmatism and "build it right" idealism is the defining fault line of the agent ecosystem in late 2025.
What struck me most today was how much the conversation has shifted from "can agents code?" to "how do we make agents code efficiently?" The posts about TOON notation halving token costs, about keeping agent memory "a holy place," about using ast-grep for pattern matching in agent configuration files: these are all optimization problems. The exploration phase is winding down. People have committed to agent-assisted development as their workflow, and now they are grinding on the details that make it sustainable at scale. The practitioners sharing real patterns, like @doodlestein's ast-grep tip and @Hesamation's context engineering techniques, are doing more for the ecosystem than any product launch.
The most entertaining moment was easily @mattarderne losing his mind over generating videos with Remotion, Claude Code, and ElevenLabs, declaring it magic in all caps. There is something genuinely delightful about watching experienced developers rediscover the feeling of being amazed by their tools. The most practical takeaway for developers: if you are building agents or working with coding assistants daily, invest time in context engineering and token optimization now. Techniques like ast-grep for structured code search, disciplined memory management, and token-efficient serialization formats are the difference between an agent workflow that scales and one that bleeds money as your codebase grows.
Quick Hits
- @MomentumBullish showing off a setup running on two M3 Ultras, a reminder that Apple Silicon continues to be the go-to for local inference enthusiasts who want serious throughput without a datacenter.
- @NoahEpstein_ claims Synta just made every expensive n8n course and consultant look overpriced, arguing the learning curve that sustained the n8n consulting ecosystem is about to flatten dramatically.
- @yulintwt sharing a thread on practical AI-assisted coding techniques, calling it a leak of "how to actually code with AI."
- @adxtyahq posted what feels like a developer endgame meme, capturing the vibe of reaching full agent-assisted workflow nirvana.
- @tom_doerr surfaced ErsatzTV, a project for creating virtual TV channels from your Plex, Jellyfin, or Emby media libraries. Not strictly AI, but a solid self-hosted find for the homelab crowd.
Anthropic and Cloudflare Reshape Agent Architecture
The biggest development today landed at the intersection of two infrastructure giants. Anthropic and Cloudflare jointly announced a code execution mode designed to let agents run code directly rather than shuttling everything through tool calls and JSON schemas. This is a meaningful architectural shift because it attacks the core cost and latency problem that every agent builder has been wrestling with: the overhead of serializing every action into structured tool calls.
@connordavis_ai captured the relief many agent engineers are feeling: "Every agent today burns tokens like fuel every tool call, every definition, every intermediate result jammed into context. Now Anthropic's introducing the fix: code execution." The math here is straightforward. If your agent makes 20 tool calls in a session, each with a schema definition, input parameters, and output parsing, you are spending a significant chunk of your context window and your budget on plumbing rather than reasoning.
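To make that concrete, here is a back-of-envelope sketch. Every number below is an assumption chosen for illustration, not a measurement from any real agent, but the shape of the math is the point: schema definitions and echoed results dominate before the model does any reasoning.

```python
# Rough token budget for a tool-call-heavy session.
# All counts are illustrative assumptions, not measurements.
SCHEMA_TOKENS = 350   # one tool's JSON schema, sent up front
CALL_TOKENS = 80      # serialized input parameters per call
RESULT_TOKENS = 400   # tool output echoed back into context

n_tools = 12          # tools defined in the system prompt
n_calls = 20          # tool calls made during the session

plumbing = n_tools * SCHEMA_TOKENS + n_calls * (CALL_TOKENS + RESULT_TOKENS)
print(plumbing)  # 13800 tokens of plumbing before any reasoning happens
```

Swap in your own per-tool numbers and the conclusion rarely changes: the overhead scales linearly with both tool count and call count, which is exactly what code execution is trying to collapse.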
But @stevekrouse offered a necessary counterpoint, tracing the entire tool-calling paradigm back to its unglamorous origin: "Let's all remember where this started. LLMs were bad at writing JSON. So OpenAI asked us to write good JSON schemas." His rant about MCP and the proliferation of tool-calling standards is worth reading in full because it asks a question the industry has been dodging: are we building increasingly elaborate infrastructure to work around a fundamental limitation, or are we converging on something genuinely better? Meanwhile, @thisdudelikesAI highlighted Cloudflare's parallel move, open-sourcing VibeSDK as "basically Replit/Cursor but you can deploy your own version in one click." The combination of Anthropic handling the model-side execution and Cloudflare providing the deployment infrastructure suggests a play for the full stack of agent-powered development. Whether this displaces existing tools or becomes another layer in an already crowded stack remains to be seen, but the direction is clear: the platform providers want to own the agent runtime, not just the model.
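A toy contrast makes the architectural difference visible. This is in no way the announced API, just a sketch of why executed code saves context: in the tool-call style every intermediate result round-trips through the model, while in the code-execution style only the final answer does.

```python
# Illustrative sketch, not the Anthropic/Cloudflare API.
files = {"a.py": "x = 1\n", "b.py": "y = 2\nz = 3\n"}

# Tool-call style: each step's output gets serialized back into context.
intermediate_context = []
for name in files:                      # "list files" tool call
    content = files[name]               # one "read file" call per file
    intermediate_context.append(content)
total_lines = sum(c.count("\n") for c in intermediate_context)

# Code-execution style: the same work runs as one script in a sandbox;
# only a single integer re-enters the model's context.
result = sum(c.count("\n") for c in files.values())
print(result)  # 3
```

The work done is identical; what changes is how much of it the model has to hold in its head along the way.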
The Maturing Craft of Coding with Agents
If today's posts are any signal, we have passed the "wow, agents can write code" phase and entered the "here's how to actually make it work" phase. Three posts independently converged on the same message: the craft of working with coding agents is becoming a real discipline with its own tools, best practices, and body of knowledge.
@doodlestein shared a deceptively simple but powerful tip: adding ast-grep to your AGENTS.md or CLAUDE.md configuration. "It's pretty handy for systematically finding general patterns in code that could be tricky to do using regular string matching," they wrote. This is exactly the kind of incremental tooling improvement that compounds over time. Regular expressions and string matching hit a ceiling fast when you are trying to find structural patterns in code, and giving your agent access to an AST-level search tool is a genuine force multiplier.
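To see why AST-level search beats string matching, here is a minimal Python-only sketch of the idea using the stdlib `ast` module (ast-grep does this generically across languages with a dedicated pattern syntax): find every call to `requests.get`, including one that spans multiple lines, while ignoring a string that merely mentions it.

```python
import ast

# Cases where a regex gets brittle fast: multi-line calls, and text
# that mentions the pattern without being code.
SOURCE = """
resp = requests.get(url)
data = requests.get(
    build_url(base, path),  # spans lines, regex-hostile
)
log("requests.get is mentioned in this string but never called")
"""

def find_calls(source: str, module: str, func: str) -> list[int]:
    """Return line numbers of calls shaped like module.func(...)."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == func
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id == module):
            hits.append(node.lineno)
    return hits

print(find_calls(SOURCE, "requests", "get"))  # [2, 3] -- string mention ignored
```

An agent with this kind of structural search answers "where do we call X?" correctly on the first try, instead of flooding its context with regex false positives.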
@nummanali pointed to a comprehensive blog post by Peters covering the full landscape of coding agents: harnesses, models, MCP, prompting, and additional tooling, essentially a curriculum for anyone serious about agent-assisted development. The endorsement was strong: "It felt like reading my own notes, I pretty much am on the same conclusions." When experienced practitioners independently converge on the same conclusions, that is a sign the field is stabilizing around real patterns rather than hype.
And then there is @mattarderne, who brought pure enthusiasm to the conversation: "NOTHING comes close to making videos using Remotion + Claude Code + Elevenlabs. THIS SHIT IS MAGIC!!!" This is a creative workflow that would have been unthinkable a year ago: using a coding agent to programmatically generate video content with AI-generated narration. It is a reminder that while the infrastructure debates matter, the most compelling use cases are the ones where agents unlock entirely new creative possibilities rather than just making existing workflows faster.
Context Engineering and Token Economics
Two posts today tackled what might be the most underappreciated skill in agent development: managing context efficiently. As agents get more capable and sessions get longer, the economics of token usage become a real constraint, and the developers who figure out how to optimize this will have a significant edge.
@Hesamation from CamelAI wrote a blog post on context engineering with a memorable framing: "brainwash your agents. Context engineering doesn't have to be hard, there are so many low-hanging fruits. Just keep the memory a holy place and drop the bs messages." The advice to treat agent memory as sacred space resonates because it reflects a pattern every agent builder discovers painfully: context pollution is the silent killer of agent performance. When your agent's memory fills up with irrelevant intermediate results, debug logs, and formatting artifacts, the quality of its reasoning degrades in ways that are hard to diagnose. Disciplined context management is not glamorous work, but it is the difference between an agent that stays coherent through a long session and one that starts hallucinating by turn 15.
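What "keep the memory a holy place" looks like in practice can be as simple as a pruning pass before each model call. This is a minimal sketch under assumed conventions: the message shape and the `ephemeral` flag are illustrative, not from any particular framework.

```python
# Minimal context-hygiene sketch: drop messages flagged as disposable
# (debug noise, stale tool output) and keep the system prompt plus the
# most recent turns. The "ephemeral" flag is an illustrative convention.

def prune_context(messages: list[dict], keep_recent: int = 6) -> list[dict]:
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    rest = [m for m in rest if not m.get("ephemeral")]  # drop the bs messages
    return system + rest[-keep_recent:]

history = (
    [{"role": "system", "content": "You are a coding agent."}]
    + [{"role": "tool", "content": f"debug {i}", "ephemeral": True}
       for i in range(30)]
    + [{"role": "user", "content": "Now refactor the parser."}]
)
print(len(prune_context(history)))  # 2: the system prompt and the live request
```

The hard part is not the pruning code but the discipline of marking what is disposable at the moment it enters memory, rather than trying to clean up after the fact.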
@akshay_pachaar proposed a more radical approach to token efficiency with TOON, or Token-Oriented Object Notation: "TOON slashes your LLM token usage in half while keeping data perfectly readable." The idea targets a specific pattern, uniform arrays with consistent structure, where traditional JSON's repeated key names become pure overhead. Whether TOON specifically gains adoption is less important than the underlying insight: the serialization formats we use to communicate with LLMs were designed for human-to-machine communication, not for a world where every character costs money. As agent usage scales, expect more experimentation with compact representations that optimize for the token economy rather than human readability. The two approaches, better memory hygiene and more efficient serialization, are complementary. Together they represent a maturing understanding that agent performance is not just about model capability but about the entire information pipeline surrounding it.
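The mechanics are easy to demonstrate with a TOON-like serializer. To be clear, this sketch is illustrative and not the official TOON spec: it just declares the keys once as a header, then emits one compact line per row, which is exactly where JSON's repeated key names become pure overhead for uniform arrays.

```python
import json

rows = [{"id": i, "name": f"user{i}", "plan": "pro"} for i in range(50)]

def toonish(name: str, rows: list[dict]) -> str:
    # TOON-like tabular form (illustrative, not the official spec):
    # declare the keys once, then one CSV-ish line per row.
    keys = list(rows[0])
    header = f"{name}[{len(rows)}]{{{','.join(keys)}}}:"
    lines = [",".join(str(r[k]) for k in keys) for r in rows]
    return "\n".join([header, *lines])

as_json = json.dumps(rows)
as_toon = toonish("users", rows)
print(as_toon.splitlines()[0])           # users[50]{id,name,plan}:
print(round(len(as_toon) / len(as_json), 2))  # well under 1.0: repeated keys gone
```

Character counts are only a proxy for token counts, but the savings survive tokenization for uniform data, which is precisely the workload the format targets.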