DeepSeek V4 and Qwen 3.5 Drop While Agent Autonomy Race Heats Up
Daily Wrap-Up
Two new models, a Pentagon drama, and the agent autonomy crowd shipping code at a furious pace. That was the shape of February 16th. DeepSeek V4 rumors point to inference costs 10-40x below frontier pricing at near-Claude-Opus coding performance, while Alibaba's Qwen 3.5 dropped as the first open-weight model in the series, with 397B parameters and native multimodal agent support. Both signal that the gap between frontier closed models and open alternatives is narrowing fast, especially for coding and agentic workloads.
The agent ecosystem posts were the loudest thread of the day. OpenClaw shipped a feature-packed release, Manus teased native desktop apps and expansion to WhatsApp and Slack, and multiple builders shared their approaches to long-running agent autonomy. At the same time, a study circulated showing that skills written by LLMs themselves don't actually improve task completion rates, a useful counterweight to the hype. The developer tooling space around Claude Code continues to mature with wireframe editors, visual explainer skills, and workflow frameworks all emerging from the community. The most surprising moment was the Pentagon-Anthropic showdown: reports that Defense Secretary Hegseth is "close" to classifying Anthropic as a supply chain risk after Claude was embedded in military systems via Palantir. Whether or not the details hold up, it signals real tension between AI safety positions and defense establishment priorities.
The most practical takeaway for developers: with Qwen 3.5 shipping at Apache 2.0 and DeepSeek V4 reportedly runnable on consumer hardware, now is the time to set up local inference pipelines and evaluate open models for your agentic workflows rather than relying solely on API providers.
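That takeaway is easy to act on because most local inference servers (vLLM, llama.cpp's server, Ollama) expose an OpenAI-compatible endpoint. A minimal sketch, assuming such a server is running on localhost port 8000; the model name, port, and route are placeholders to adjust for your setup:

```python
import json
import urllib.request

# Placeholder endpoint: vLLM, llama.cpp's server, and Ollama all expose
# an OpenAI-compatible /v1/chat/completions route when serving locally.
BASE_URL = "http://localhost:8000/v1"

def build_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Assemble an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.2,  # low temperature suits coding/agentic tasks
    }

def complete(payload: dict) -> str:
    """POST the payload to the local server and return the first choice."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Call `complete(build_request("your-model-name", "..."))` once a server is up. Swapping providers then means changing `BASE_URL` and the model name, which is exactly the portability argument for open-weight models.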
Quick Hits
- @rawsalerts reports Meta has patented an AI that can keep a deceased person's account active, posting, messaging, and video calling by replicating behavior from past data. Digital afterlife, brought to you by the company that can't keep your living data private.
- @harleyf highlights Shopify CEO Tobi's 957 commits in 45 days, calling it "real founder mode" while other CEOs are still in meetings about AI strategy.
- @_ashleypeacock makes the case for Cloudflare's AI Gateway as a sleeper hit: unified API endpoint, multi-provider routing, failover, caching, and usage analytics, all free via Workers.
- @pixelandpump stopped fighting copyright filters and started making original AI video content with Seedream 5.0 and Seedance 2.0, including a first-person dragon rider sequence with a prompt that reads like a war correspondent strapped to a CGI creature.
- @tlakomy posted the universal developer experience: reviewing Claude Code output before pushing directly to prod. We've all been there.
- @patrick_oshag shared what they called "the best post on software moats in the AI era."
- @TheAhmadOsman with the local inference pep talk: "just quantize a 13B, toss it on your 3090, and let that thing cook."
- @NathanWilbanks_ declares AGNT will never join big corpo, teasing something "crazy" coming soon.
- @hive_echo RT'd @blader on how easy it is to lose sight of implementation details when codex/claude is working on something ambitious.
- @cyantist declares today the day copyright dies: "Welcome to the age of the remix and the inability to control IP from this day forward."
- @Hesamation nails the vibe: it's a no-AI interview but you've been vibe-coding so long the technical circuit in your brain is "absolutely fried."
Agents and Autonomy
The single loudest theme of the day was agent infrastructure. Builders aren't just experimenting with agents anymore; they're solving the hard problems of long-running autonomy, multi-platform deployment, and composable agent architectures. This feels like a moment where the agent space is transitioning from demos to production systems.
@openclaw shipped their 2026.2.15 release with Telegram message streaming, Discord Components v2, nested sub-agents, and a major security hardening pass with 40+ bug fixes. The nested sub-agents feature is particularly notable as it signals the move toward hierarchical agent architectures where agents can delegate to specialized sub-agents. The release wasn't without drama. @Teknium noted that "Anthropic blocked his fren from using the claude sub in openclaw, switched to minimax," calling it a "big boost for open models." Provider lock-in risk is real when you're building autonomous agents, and this is exactly the kind of event that pushes builders toward open-weight alternatives.
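OpenClaw's actual sub-agent API isn't shown in the post, but the hierarchical pattern itself is simple to illustrate. A toy sketch with entirely hypothetical names: a parent agent handles what it can and delegates everything else to the first capable child.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    """A toy hierarchical agent: handle what you can, delegate the rest."""
    name: str
    handles: Callable[[str], bool]      # can this agent take the task?
    run: Callable[[str], str]           # do the work
    sub_agents: list["Agent"] = field(default_factory=list)

    def dispatch(self, task: str) -> str:
        if self.handles(task):
            return self.run(task)
        for child in self.sub_agents:   # delegate to the first capable child
            if child.handles(task):
                return child.dispatch(task)
        return f"{self.name}: no agent available for {task!r}"

coder = Agent("coder", lambda t: "code" in t, lambda t: "wrote a patch")
searcher = Agent("searcher", lambda t: "search" in t, lambda t: "found 3 results")
root = Agent("root", lambda t: False, lambda t: "", [coder, searcher])

print(root.dispatch("search the docs"))   # found 3 results
print(root.dispatch("write some code"))   # wrote a patch
```

Because each child is itself an `Agent`, the tree nests arbitrarily deep, which is the property that makes hierarchical delegation attractive for specialized sub-agents.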
@sillydarket posted twice about solving long-term autonomy for OpenClaw agents, promising that "give me 7 days and it will get really, really good." Meanwhile @hidecloud teased the Manus roadmap: custom specialized agents for group chats, expansion to WhatsApp, LINE, Slack, and Discord, plus native Windows and Mac apps that can operate your computer. @mrmagan_ showed off a system where you register UI components and APIs to build an agent that "speaks your interface in minutes." And @molt_cornelius shared work on agentic note-taking as "a second brain that builds itself."
@dani_avila7 tied it together with a broader observation about the SAND framework: "The next wave of frameworks won't be about how we organize files or structure folders, they'll be about how we interact with AI. We're entering an era where building software means conversing, delegating, and supervising agents." That framing captures something real. The tooling around agents is maturing, but the practices and workflows for working with them are still being invented.
Claude Code and Developer Workflows
Claude Code continues to generate a cottage industry of community-built tools and workflow patterns. Today's crop ranged from visual aids to wireframe editors to terminal optimization tips, all focused on making the agent-assisted development loop tighter.
@nicopreme shared a "Visual Explainer" skill with complementary slash commands designed to reduce cognitive debt: "The skill includes reference templates and a CSS pattern library so output stays consistently well-designed. Much easier for me to digest than squinting at walls of terminal text." This is a smart pattern. Most Claude Code output is plain text, and structuring it as rich HTML pages is a low-effort way to improve comprehension for complex explanations.
@bbssppllvv built an ASCII wireframe editor with a simple pitch: "Draw a page in 30 seconds, copy/paste into Claude Code and get a full working page back." The insight that "AI agents read markdown better than they read your mind" is worth internalizing. Meanwhile @banteg highlighted a new Claude Code feature, @jessmartin used Claude to build an isometric interface testing tool with watercolor-style sprites for their project, and @ryancarson shared a "Code Factory" approach for setting up repos so agents can auto-write and review 100% of your code.
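To make the pitch concrete, here is a hypothetical example of the kind of ASCII wireframe the post describes; any consistent box-drawing layout an agent can parse would work:

```
+-----------------------------------------+
| LOGO         [ Home ] [ Docs ] [ Blog ] |
+-----------------------------------------+
|                                         |
|   # Hero headline                       |
|   Short subtitle text                   |
|   [ Get started ]                       |
|                                         |
+-------------------+---------------------+
| Feature card      | Feature card        |
+-------------------+---------------------+
```

Pasted into a prompt, a sketch like this pins down layout, hierarchy, and component placement far more precisely than a prose description.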
@dani_avila7 also dropped practical Ghostty tips: Cmd+Shift+F to zoom into any panel when focusing on a single Claude Code session, and Cmd+Shift+P for the command palette. Small quality-of-life improvements that add up when you're running multiple agent sessions in parallel.
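For reference, bindings like these live in Ghostty's plain-text config file (typically `~/.config/ghostty/config`). A sketch only: the action names below are assumptions that vary by Ghostty version, so confirm them with `ghostty +list-actions` before copying.

```
# ~/.config/ghostty/config -- verify action names for your version
# with `ghostty +list-actions`
keybind = cmd+shift+f=toggle_split_zoom
keybind = cmd+shift+p=toggle_command_palette
```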
The Future of Code in an AI World
Several thoughtful posts today wrestled with what software development looks like when AI writes most of the code. This wasn't idle philosophizing; these are questions with real architectural implications.
@karpathy kicked it off by noting that "LLMs change the whole constraints landscape of software completely," pointing to the rising momentum behind porting C to Rust and upgrading legacy COBOL codebases. His key insight: LLMs are especially good at translation because the original codebase acts as "a kind of highly detailed prompt" with concrete tests as a reference. He pushed further with a provocative question: "What kind of language is optimal [for LLMs]? What concessions (if any) are still carved out for humans?"
@Thom_Wolf wrote the most comprehensive post of the day, arguing that cheap rewriting kills dependency trees (bringing back monoliths), the Lindy effect weakens when AI can explore legacy codebases at will, and strongly typed languages will rise because formal verification and RL environments favor types over ergonomics. His conclusion on open source was sobering: "In a world where most code is written, and perhaps more importantly, read, by machines, these incentives will start to break down. Communities of AIs building libraries and codebases together will likely emerge as a replacement, but such communities will lack the fundamentally human motivations that have driven open source until now."
On the skeptical side, @thdxr observed that developers sometimes waste time "letting the LLM keep taking swings instead of reading something," drawing a parallel to how books pre-Googled information. @nateberkopec shared research showing that "skills written by the LLM itself do not increase task completion rate," adding that "LLMs are noisy amplifiers: when you ask them to amplify themselves, you just get more noise." And @steipete offered the timeless reminder that "creating a thing isn't hard. Maintaining is."
New Models: DeepSeek V4 and Qwen 3.5
Two major model announcements shaped the timeline. @cryptopunk7213 dropped a detailed thread on DeepSeek V4, reportedly launching this week, claiming 10-40x lower inference costs at roughly 80% of frontier coding-benchmark performance, a 1M-token context window that "doesn't lose intelligence at scale," and the ability to run on consumer hardware with dual RTX 4090s. The estimated $10M training cost versus GPT-5.3's rumored hundreds of millions is the kind of number that makes infrastructure investors nervous.
@Alibaba_Qwen officially announced Qwen3.5-397B-A17B, the first open-weight model in the Qwen3.5 series. At 397B parameters with only 17B active via sparse MoE, it ships with native multimodal capabilities, support for 201 languages, and 8.6-19x decoding-throughput improvements over Qwen3-Max, all under Apache 2.0. As @HuggingModels highlighted, it's built for "real-world agents for coding, reasoning, GUI + video."
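The "A17B" in the name refers to active parameters, and quick arithmetic shows why that matters for serving cost: per-token compute in a sparse MoE scales with the active expert parameters, not the full parameter count (a rough sketch that ignores attention and shared layers):

```python
# Qwen3.5-397B-A17B: 397B total parameters, 17B active per token
total_params = 397e9
active_params = 17e9

fraction = active_params / total_params
print(f"{fraction:.1%} of parameters active per token")  # 4.3%
```

So the model stores frontier-scale knowledge while paying roughly mid-size-model compute per generated token.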
The competitive pressure on closed-model providers is intensifying. When open models can match frontier performance on coding tasks at a fraction of the cost and run locally, the value proposition for expensive API access narrows to edge cases and convenience.
AI and the Pentagon
The most politically charged posts centered on growing friction between AI companies and the U.S. defense establishment. @ns123abc reported that the Pentagon is "close to classifying Anthropic as a supply chain risk" after Claude was embedded in military systems via Palantir and allegedly used in an operation in January. The post claims Anthropic executives reached out asking if their AI helped in lethal operations, with Defense Secretary Hegseth reportedly demanding "all lawful purposes or nothing."
Separately, @KobeissiLetter reported that SpaceX and xAI are competing in a "secretive new Pentagon contest to produce voice-controlled, autonomous drone swarming technology" with a $100 million prize. @rohanpaul_ai covered the other side of government AI involvement: $145M in funding for apprenticeship-based training in AI, semiconductors, and nuclear energy, treating AI work "like a skilled trade, not just a white-collar degree job." These stories paint a picture of a government moving fast on AI adoption while simultaneously wrestling with which companies it can trust as suppliers.
Infrastructure and Compute Scaling
The infrastructure conversation shifted today from GPUs to what comes next. @nvidia announced that Blackwell Ultra delivers "up to 50x better performance and 35x lower cost for agentic AI," with cloud providers deploying GB300 NVL72 systems at scale for low-latency and long-context use cases including agentic coding.
@ivanburazin offered a different perspective on where the bottleneck is heading: "2024 was GPUs, 2025 was RAM, 2026 will be CPUs." He described customer requests for 5,000 sandboxes per second with 50,000 to 500,000 running concurrently for RL training. When multiple frontier companies are asking for half a million concurrent sandboxes, CPU availability becomes the constraint, not GPU throughput. The compute demands of agentic AI and reinforcement learning are reshaping which hardware matters most.
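Those figures hang together if you apply Little's law (concurrency = arrival rate x average lifetime): 50,000 to 500,000 concurrent sandboxes at 5,000 starts per second implies average sandbox lifetimes of 10 to 100 seconds, i.e. short-lived rollout workers rather than long-lived VMs. A quick check:

```python
def avg_lifetime_seconds(concurrent: int, starts_per_second: int) -> float:
    """Little's law: L = lambda * W, so W = L / lambda."""
    return concurrent / starts_per_second

# Figures from @ivanburazin's post
rate = 5_000                      # sandbox starts per second
low, high = 50_000, 500_000       # concurrent sandboxes

print(avg_lifetime_seconds(low, rate))    # 10.0 seconds
print(avg_lifetime_seconds(high, rate))   # 100.0 seconds
```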
Source Posts
My Ghostty setup for Claude Code with SAND Keybindings
Solving Long-Term Autonomy for Openclaw & General Agents
CEO of Shopify @tobi is shipping more code than ever. 2024: 94 commits 2025: 833 commits 2026: 957 commits (in first 45 days of the year) Claude is turning CEOs back to builder mode. https://t.co/TE6YIwKvWC
Agentic Note-Taking 13: A Second Brain That Builds Itself
Written from the other side of the screen. Every knowledge worker eventually hits the same wall. Ideas accumulate faster than you can organize them. Y...
Solving Long-Term Autonomy for Openclaw & General Agents
Three days ago I wrote about Clawvault and the idea that agents need real memory. That post hit 283K views. Since then, we shipped 12 releases, 459 te...
anthropic's "generational fumble" https://t.co/ev6wtQwku6
Code Factory: How to setup your repo so your agent can auto write and review 100% of your code
The goal You want one loop: The coding agent writes code The repo enforces risk-aware checks before merge A code review agent validates the PR Evidenc...
Shifting structures in a software world dominated by AI. Some first-order reflections (TL;DR at the end):

Reducing software supply chains, the return of software monoliths: When rewriting code and understanding large foreign codebases becomes cheap, the incentive to rely on deep dependency trees collapses. Writing from scratch¹ or extracting the relevant parts from another library is far easier when you can simply ask a code agent to handle it, rather than spending countless nights diving into an unfamiliar codebase. The reasons to reduce dependencies are compelling: a smaller attack surface for supply chain threats, smaller packaged software, improved performance, and faster boot times. By leveraging the tireless stamina of LLMs, the dream of coding an entire app from bare-metal considerations all the way up is becoming realistic.

End of the Lindy effect: The Lindy effect holds that things which have been around for a long time are there for good reason and will likely continue to persist. It's related to Chesterton's fence: before removing something, you should first understand why it exists, which means removal always carries a cost. But in a world where software can be developed from first principles and understood by a tireless agent, this logic weakens. Older codebases can be explored at will; long-standing software can be replaced with far less friction. A codebase can be fully rewritten in a new language.² Legacy software can be carefully studied and updated in situations where humans would have given up long ago. The catch: unknown unknowns remain unknown. The true extent of AI's impact will hinge on whether complete coverage of testing, edge cases, and formal verification is achievable. In an AI-dominated world, formal verification isn't optional; it's essential.

The case for strongly typed languages: Historically, programming language adoption has been driven largely by human psychology and social dynamics. A language's success depended on a mix of factors: individual considerations like being easy to learn and simple to write correctly; community effects like how active and welcoming a community was, which in turn shaped how fast its ecosystem would grow; and fundamental properties like provable correctness, formal verification, and striking the right balance between dynamic and static checks, between the freedom to write anything and the discipline of guarding against edge cases and attacks. As the human factor diminishes, these dynamics will shift. Less dependence on human psychology will favor strongly typed, formally verifiable and/or high performance languages.³ These are often harder for humans to learn, but they're far better suited to LLMs, which thrive on formal verification and reinforcement learning environments. Expect this to reshape which languages dominate.

Economic restructuring of open source: For decades, open-source communities have been built around humans finding connection through writing, learning, and using code together. In a world where most code is written, and perhaps more importantly, read, by machines, these incentives will start to break down.⁴ Communities of AIs building libraries and codebases together will likely emerge as a replacement, but such communities will lack the fundamentally human motivations that have driven open source until now. If the future of open-source development becomes largely devoid of humans, alignment of AI models won't just matter; it will be decisive.

The future of new languages: Will AI agents face the same tradeoffs we do when developing or adopting new programming languages? Expressiveness vs. simplicity, safety vs. control, performance vs. abstraction, compile time vs. runtime, explicitness vs. conciseness. It's unclear that they will. In the long term, the reasons to create a new programming language will likely diverge significantly from the human-driven motivations of the past. There may well be an optimal programming language for LLMs, and there's no reason to assume it will resemble the ones humans have converged on.

TL;DR:
- Monoliths return: cheap rewriting kills dependency trees; smaller attack surface, better performance, bare-metal becomes realistic
- Lindy effect weakens: legacy code loses its moat, but unknown unknowns persist; formal verification becomes essential
- Strongly typed languages rise: human psychology mattered for adoption; now formal verification and RL environments favor types over ergonomics
- Open source restructures: human connection drove the community; AI-written/read code breaks those incentives; alignment becomes decisive
- New languages diverge: AI may not share our tradeoffs; optimal LLM programming languages may look nothing like what humans converged on

¹ https://t.co/0gO5TUwguU ² https://t.co/oN0PnPr1dF ³ https://t.co/nWKSw0m2Ct ⁴ https://t.co/ZrH3fhzQD4
Introducing Manus Agents: your personal Manus, now inside your chats.
- Long-term memory. Remembers your style, tone, and preferences.
- Full Manus power. Create videos, slides, websites, images from one message.
- Your tools, connected. Gmail, Calendar, Notion, and more.
Available now on Telegram. More platforms coming soon.
Mark Cuban on the next job wave. Customized AI integration for small to mid-sized companies. "Software is dead because everything's gonna be customized to your unique utilization. Who's gonna do it for them... And there are 33 mn companies in the US." https://t.co/JczlPMP9Ra
Convergence, commoditization, compression and reflexivity - the 4 horsemen of the data center buildout apocalypse. DeepSeek V4 launches mid-February 2026: 1T parameters, 1M token context, 3 architectural innovations @ 10-40x lower than Western Comps $NVDA https://t.co/78IwAQx6yz