AI Digest.

Pentagon Accepts OpenAI's Identical Safety Terms Hours After Blacklisting Anthropic

The biggest AI story of the year erupted as Anthropic was designated a "supply chain risk" for refusing Pentagon mass surveillance demands, only for OpenAI to swoop in with an identical safety framework. Meanwhile, Qwen 3.5's small-but-mighty models proved consumer GPUs can run frontier-grade coding agents, and Claude Code shipped new built-in skills for automated code review.

Daily Wrap-Up

Today was dominated by a single story that will define the AI industry's relationship with government for years to come. Anthropic refused to let Claude be used for mass surveillance or autonomous weapons, got blacklisted by the Pentagon with the same designation they gave Huawei, and then watched OpenAI sign a deal with identical safety terms hours later. The speed of events was staggering, and the implications for every company building on Claude are real. Whether you view Anthropic as principled or naive depends on your priors, but the fact that the Pentagon accepted the same red lines from a competitor makes the designation look more like retaliation than policy.

Away from the geopolitics, today reinforced a trend that keeps accelerating: local inference on consumer hardware is becoming genuinely viable for serious coding work. Qwen 3.5's 35B-A3B model running at 112 tokens per second on a single RTX 3090 is not a toy demo. People are building complete multi-file applications with procedural audio, particle systems, and boss fights in single prompts. The economics of Apple Silicon for memory-bound inference continue to embarrass NVIDIA's pricing in the personal computing segment. If you've been waiting for "good enough" local models to arrive, the wait is over.

The most entertaining moment was easily @NoahKingJr's take on the Iran situation: "Trump: Hey Siri, tell me how many miles I ran today. Siri: ok, sending missiles to Iran today." Dark humor for dark times. The most practical takeaway for developers: install Qwen 3.5-35B-A3B locally and point Claude Code or OpenCode at it via llama.cpp's Anthropic endpoint. You get Sonnet 4.5-grade coding ability on a $800 used GPU with zero API costs, and the open source harnesses are now reliable enough for sustained multi-file agent sessions.

Quick Hits

  • @EHuanglu shares that AI animation can now be keyframed per-second using text prompts, a significant step toward production-ready AI video tools.
  • @neural_avb drops a framework for building agentic systems, adding to the growing pile of agent orchestration options.
  • @doodlestein claims to be running AI-assisted development "at scale now for a massive number of projects" with the right tooling and workflows.
  • @sukh_saroy highlights the Financial Datasets MCP Server, giving Claude access to live stock prices, financial statements, and crypto data. Wall Street terminal functionality for free.
  • @theallinpod covers Claude's "hit list" of SaaS companies, the datacenter opposition movement, and SCOTUS striking down tariffs. They note the Anthropic/DoW fallout happened after recording and will be covered next week.
  • @michaeljburry launches a new series comparing historical newspaper coverage to today's AI hype, drawing parallels that should make boosters uncomfortable.
  • @morganlinton flags a "must read" from the founder of Cursor, though no details on the content.
  • @affaanmustafa streams a YC Browser Use Hackathon, continuing Y Combinator's heavy investment in browser automation agents.
  • @Full_Metal_QR suggests Anthropic should "just hire this little guy," context unclear but the sentiment resonates.

Anthropic vs. The Pentagon: AI's Biggest Political Crisis

The dominant story across the feed today was the collision between Anthropic and the U.S. Department of War, a sequence of events so compressed and consequential that @cryptopunk7213 called it "the fucking wildest 7 days in U.S. defense history." The core facts: Anthropic drew two hard lines on their Pentagon contract (no mass surveillance of Americans, no autonomous lethal weapons without human oversight), the Pentagon demanded those lines be removed, Anthropic refused, and the administration designated them a "supply chain risk" using the same framework applied to Huawei.

The deepest analysis came from @shanaka86, who surfaced a detail from Axios that changes the calculus entirely:

> "While Anthropic was being blacklisted for refusing to allow mass surveillance, the Pentagon's own 'compromise deal'... would have required Anthropic to allow the collection and analysis of Americans' geolocation data, web browsing history, and personal financial information purchased from data brokers."

This is not an abstract policy dispute. The contract language reportedly asked for access to location tracking, browsing history, and financial records of American citizens. Anthropic said no. Then, as @tedlieu pointed out with genuine bewilderment: "The Department of Defense just agreed to the same two conditions with OpenAI that Anthropic was asking for. Can someone explain? I genuinely don't understand."

Hours after the blacklisting, @sama announced OpenAI's deal with the DoW, carefully noting that "two of our most important safety principles are prohibitions on domestic mass surveillance and human responsibility for the use of force" and that "the DoW agrees with these principles." To OpenAI's credit, they also publicly pushed back on the designation itself, with @OpenAI stating: "We do not think Anthropic should be designated as a supply chain risk and we've made our position on this clear to the Department of War."

But as @markgadala noted, the optics are brutal: "Just a few hours ago he was on TV saying he stood by Anthropic. Then he undercuts them and takes the same contract that Anthropic just lost." The practical fallout extends far beyond the $200M Pentagon contract. @shanaka86 calculates that eight of the ten largest American companies use Claude, and the supply chain designation forces every general counsel with Pentagon exposure to reassess. Anthropic's expected $380B IPO is effectively frozen. @AnthropicAI has announced they will take the administration to court.

Local AI Hits an Inflection Point

Qwen 3.5's release of the 35B-A3B model (35 billion total parameters, only 3 billion active per inference) has kicked off a wave of genuinely impressive local AI demonstrations. @sudoingX provided the most concrete example, giving the model a single detailed spec and watching it produce a complete space shooter game:

> "One prompt. Ten files. 3,483 lines of code. Zero handholding... enemy types, particle systems, procedural audio, powerups, boss fights, ship upgrades, parallax backgrounds, everything in one message."

Running on a single RTX 3090 at 112 tokens per second with no API costs. @KSimback confirmed the broader trend: "Seeing many positive reports of running Qwen 35B-A3B locally on modest consumer hardware. No need for a $10k+ Mac Studio." And @cgtwts went further, claiming the model "outperforms all previous Qwen models, beats models that are 6x larger, smarter than Sonnet 4.5" at coding tasks.

On the hardware economics side, @alexocheema laid out why Apple Silicon dominates for local inference: M3 Ultra memory costs $18/GB versus $360/GB for B200 GPUs. "If DeepSeek V4 is >1T parameters, by far the cheapest way to run it will be Apple Silicon." The interesting wrinkle is the harness layer. @sudoingX found that Claude Code's tool-call error handling was the bottleneck, not the model, and switching to OpenCode with the same local model produced much more sustained autonomous coding sessions. The takeaway: model quality has caught up; now the orchestration layer is the differentiator.

AI Makes You More Productive, Then Burns You Out

A Berkeley research study tracking 200 employees over 8 months produced findings that challenge the simple "AI makes everyone more productive" narrative. @aakashgupta broke down the self-reinforcing cycle the researchers identified:

> "AI accelerated tasks, raised speed expectations, workers leaned harder on AI, scope expanded, wider scope created more work, more work demanded more AI. That loop has no natural stopping point. The company never installed one."

The key insight is not that AI failed, but that organizations failed to adapt. Individual capability went up, organizational design stayed frozen, and the gap created burnout. A separate NBER study found productivity gains of just 3% across thousands of workplaces, and 77% of employees in an Upwork survey said AI tools actually decreased their productivity. @harjtaggar captured the ground truth more concisely: "Everybody I know using AI is working more hours not less." Meanwhile, @johnrushx extrapolated Claude Code's usage to "40,000 full-time software developers working full time" and predicted 1 million developer-equivalents by 2027. The tension between these perspectives is the central question of AI adoption: are we building leverage, or just building more work?

Claude Code Ships /simplify and /batch

The Claude Code team announced two new built-in skills that automate post-coding cleanup. @bcherny revealed that "/simplify reviews your changed code for reuse, quality, and efficiency, then fixes any issues found," while /batch handles "straightforward, parallelizable code migrations." @dani_avila7 provided a hands-on look:

> "I ran it after finishing a PR review and noticed it spawned 3 parallel agents using Haiku 4.5 to do the analysis... fast and cheap."

This aligns with @addyosmani's broader argument that "the unsolved problem isn't generation but verification. That's where engineering judgment becomes your highest-leverage skill." The shift from writing code to orchestrating and verifying AI-generated code continues to accelerate, and built-in tools that handle the verification loop automatically represent a meaningful quality-of-life improvement for developers already living inside Claude Code.

Agent Communication Infrastructure Matures

The agent ecosystem is developing its own communication primitives. @mattshumer_ announced Agent Relay, describing it as "Slack for AI agents: channels + threads + DMs + realtime events + search + persistent history." @willwashburn co-announced the launch. Separately, @sukh_saroy highlighted OpenClaw Studio, a self-hosted agent dashboard with "live chat, approval gates, job scheduling, and full visibility."

The most thoughtful contribution came from @blader, who identified a gap in how long-running agent sessions maintain coherence:

> "Plans are high level and static. Session history is shallow and leads to ratholing. Theorist is a layer in between: a continuously updated mental model of the root cause, and the current theory of victory."

This resonates with anyone who has watched an agent lose the plot 30 minutes into a complex task. The infrastructure for multi-agent systems is moving from "can agents talk to each other" to "can agents maintain shared understanding over time," which is a much harder and more interesting problem.

Sources

J
jacob @jacobgrowth ·
This OpenClaw AI Agent Made Me $10K Clipping Pop Culture Content...
A
Ashpreet Bedi @ashpreetbedi ·
Agentic Software Engineering
Y
yenkel @yenkel ·
great intro to coding harnesses by @juliandeangeIis recommended if you’ve not used them or are early in your journey main takeaway: context is king
J juliandeangeIis @juliandeangeIis

The Coding Agent Harness: How to Actually Make AI Coding Agents Work at Scale

J
Jeffrey Emanuel @doodlestein ·
@iruletheworldmo I’ve been doing this kind of thing for like 4-5 months now. It’s all available today, 100% open-source and free, in a way that works with any model or agent provider at https://t.co/TKaKxRIw5x . No need to get trapped inside a walled garden that doesn’t even work as well.
M
Morgan @morganlinton ·
Great article, and yes, confirmed, the MCP hype is over.
M mfranz_on @mfranz_on

CLI Is All You Need