Karpathy Envisions "Bacterial Code" While Claude Code vs Codex Rivalry Defines the Developer Discourse
Daily Wrap-Up
The most consequential thread of the day came from Karpathy, who turned a DeepWiki appreciation post into a manifesto for "bacterial code," software that's self-contained, dependency-free, and designed to be extracted by agents rather than installed as monolithic packages. He pointed an agent at torchao's fp8 training implementation, asked it to rip out just what he needed, and got back 150 lines that ran 3% faster than the library. That's not a parlor trick. It's a proof of concept for a fundamentally different relationship between developers and dependencies.
Meanwhile, the Claude Code vs Codex discourse reached a fever pitch that somehow landed on a surprisingly mature conclusion. @thdxr kicked it off by flatly stating "codex is by far a better coding model than opus" while asking why Opus remains the most popular. The answer, echoed across multiple posts, is that tight feedback loops and product design beat raw benchmarks. It's the kind of insight the industry rediscovers with every generation of tooling, and it's refreshing to see developers articulate it clearly rather than just benchmarking their way to a winner. The funniest moment of the day was @Observer_ofyou's exchange: "Codex is way better." "No, Claude Code is better. Have you even shipped anything?" "No, have you?" "No." Peak discourse.
The most practical takeaway for developers: Karpathy's DeepWiki MCP workflow is immediately actionable. If you have a heavy dependency you only use 10% of, try pointing an agent at the source via DeepWiki and asking it to extract a self-contained implementation. You might end up with cleaner, faster code and one fewer entry in your package.json.
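Per Karpathy's own post, the entry point is trivial: swap "github" for "deepwiki" in any repo URL to get a Q&A-able wiki page. A minimal sketch of that transformation (the helper name is ours, not DeepWiki's):

```python
def deepwiki_url(github_url: str) -> str:
    """Swap the domain so a GitHub repo URL points at its DeepWiki page."""
    return github_url.replace("github.com", "deepwiki.com", 1)

# e.g. deepwiki_url("https://github.com/pytorch/ao")
#   -> "https://deepwiki.com/pytorch/ao"
```

From there, the extraction step is just a prompt to an agent that has DeepWiki MCP and the GitHub CLI available, along the lines of the one quoted in the source post below.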
Quick Hits
- @XFreeze shared Elon Musk's prediction that AI will bypass coding entirely by end of 2026, generating optimized binaries directly from prompts. File under "extraordinary claims."
- @nikitabier predicts all communication channels (iMessage, phone calls, Gmail) will be flooded with AI spam within 90 days with "no way to stop it."
- @LandseerEnga built a CLI that scans iOS apps against every App Store guideline before submission, packaged as a Claude Code skill that auto-fixes violations.
- @kimmonismus on ElevenLabs: "This is nuts; Elevenlabs nailed it. Voice but especially latency." The voice AI gap is closing fast.
- @TheAhmadOsman flags GLM-5's release, saying this week will "set the tone for opensource AI discourse for the next few months."
- @pvncher released RepoPrompt 2.0 with built-in agent mode and first-class Codex support alongside Claude Code and Gemini CLI.
- @ryancarson finds the concept of "Observational Memory" compelling; no further context was provided.

- @ScriptedAlchemy floats the idea of streaming daily multi-repo work. The "just vibes and code" format might actually work.
- @ingelramdecoucy shared a Wyze product video calling it a "thing of absolute beauty."
- @xyz3va shared an age verification tool for Discord, Twitch, Kick, and Snapchat.
- @thdxr on hiring someone who uses Windows to fix Windows support: "we were never going to make windows support great if no one on the team used it." Obvious in hindsight, rarely practiced.
The Claude Code vs Codex Product War
The day's loudest discourse centered on the emerging rivalry between Claude Code and OpenAI's Codex, and it surfaced a genuinely interesting tension between raw model capability and product experience. @thdxr set the frame bluntly:
"codex is by far a better coding model than opus - anyone who knows anything understands this. but the whole industry should reflect on why opus is the most popular. people assume whatever is the smartest will win but the old rules of product are still what determine everything"
This provoked a cascade of responses exploring why developers stick with Claude Code despite acknowledging Codex's raw advantages. @kayintveen captured the core argument: "opus in claude code = tight iteration where i can course correct in real time. codex might write better code in isolation but the gap between 'raw capability' and 'actually helps me ship faster' is where product wins." @iruletheworldmo simply noted the pace is hard to process: "my brain is struggling with the pace of acceleration i'll be honest."
On the product side, @bcherny from Anthropic reflected on what makes Claude Code sticky, pointing to its deep customizability: "hooks, plugins, LSPs, MCPs, skills, effort, custom agents, status lines, output styles." @melvynxdev shared a practical tip showing how the deny list overrides bypassPermissions, letting developers run with full autonomy while blocking specific dangerous commands. And @eugenekim222 reported that Amazon engineers are frustrated they can't use Claude Code in production without approval, being steered toward AWS's Kiro instead, highlighting the awkward dynamics of Amazon's investment in Anthropic. @agupta predicted uncomfortable conversations ahead between "technical founder/CEOs spending all night hacking on Claude Code" and their "AI-skeptic senior engineers." The tools are moving faster than organizational adoption curves, and that gap is creating real tension in engineering orgs.
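The deny-over-bypass tip can be sketched as a project settings file. This is a hypothetical fragment: the field names follow Claude Code's documented permissions schema, but verify the exact syntax against current docs before relying on it:

```json
{
  "permissions": {
    "defaultMode": "bypassPermissions",
    "deny": [
      "Bash(rm -rf:*)",
      "Bash(git push --force:*)"
    ]
  }
}
```

The idea is that the agent runs without per-action approval prompts, while the deny rules still hard-block the specific commands you never want executed.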
Karpathy's Bacterial Code and the Death of Dependencies
Karpathy dominated the intellectual conversation with two distinct but related threads. The first was his DeepWiki MCP workflow, where he used an agent to extract torchao's fp8 training implementation into 150 self-contained lines. But the bigger idea was what he called "bacterial code":
"Maybe you don't download, configure and take dependency on a giant monolithic library, maybe you point your agent at it and rip out the exact part you need... Software might become a lot more fluid and malleable. 'Libraries are over, LLMs are the new compiler.'"
@aakashgupta expanded this into a full thesis about the end of the library economy, noting that the npm ecosystem processed 6.6 trillion package downloads in 2024, and that over 99% of known open source malware is found on npm. The argument is that agents change the economics: instead of accepting a 100MB node_modules directory because understanding someone else's code was too expensive, you can now have an agent read, comprehend, and extract exactly what you need in minutes. @ScottWu46, DeepWiki's creator, responded to the Karpathy shoutout with a counterpoint worth considering: "There's a 'nihilist' view that as AI gets better, interfaces won't matter anymore. I think the opposite is true - AI will soon be so good that the way you interact and knowledge-transfer will be the only thing that matters."
Separately, Karpathy's microgpt project stripped a complete GPT implementation down to 243 lines of pure Python with no framework dependencies. @aakashgupta contextualized this as the endpoint of a six-year compression arc from micrograd to minGPT to nanoGPT to llm.c, each iteration removing a layer of abstraction. The takeaway: the algorithm powering hundred-million-dollar training runs fits in fewer lines than a terms-of-service page. The moat was never the math.
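For a sense of why a complete GPT fits in so few lines, here is a toy scaled dot-product attention step in dependency-free Python. This is an illustration of the core operation only, not code from microgpt:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(q, keys, values):
    # Scaled dot-product attention for a single query vector:
    # score each key against the query, softmax the scores,
    # and return the weighted sum of the value vectors.
    d = len(q)
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
    w = softmax(scores)
    return [sum(wi * v[j] for wi, v in zip(w, values))
            for j in range(len(values[0]))]
```

Stack this with embeddings, an MLP, and a training loop and you have the algorithmic skeleton; everything beyond that, as Karpathy puts it, is just for efficiency.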
PrimeIntellect Launches "Be Your Own AI Lab"
PrimeIntellect made their big play with a platform launch aimed squarely at democratizing AI research infrastructure. Their pitch across six posts was clear and ambitious:
"We are not inspired by a future where a few labs control the intelligence layer. So we built a platform to give everyone access to the tools of the frontier lab. If you are an AI company, you can now be your own AI lab. If you are an AI engineer, you can now be an AI researcher."
The platform covers hosted training (starting with agentic RL), deployments on shared hardware, and an environment system for datasets, harnesses, and scoring rubrics. They support reinforcement learning across models from Nvidia, Arcee, Hugging Face, Allen AI, Qwen, and others, with experimental multimodality. The vision extends toward "continual learning, where models learn in production as training and inference collapse into a single loop." @himanshustwts summed up the community reaction: "If you are an AI company, you can now be your own AI lab. If you are an AI engineer, you can now be an AI researcher. Prime bros cooked it right here." Whether the execution matches the vision remains to be seen, but the direction of travel, making RL training accessible outside frontier labs, addresses a real gap.
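The environment pattern they describe, a dataset plus a harness that runs the model plus a scoring rubric, can be sketched generically. All names here are illustrative, not PrimeIntellect's API:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Environment:
    """Hypothetical sketch of a dataset/harness/rubric RL environment."""
    dataset: list[dict]                    # tasks or prompts
    harness: Callable[[dict], str]         # runs the model on one task
    rubric: Callable[[dict, str], float]   # scores one model output

    def evaluate(self) -> float:
        # Mean rubric score across the dataset; in RL training this
        # signal would feed back into the policy update.
        scores = [self.rubric(task, self.harness(task)) for task in self.dataset]
        return sum(scores) / len(scores)
```

The same three pieces serve both evaluation and reward shaping, which is presumably why the platform treats "environments" as the unit of exchange.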
The xAI Exodus and Safety Alarm Bells
Multiple departures from xAI and growing safety concerns created an unsettling undercurrent. @jimmybajimmyba announced his last day at xAI with a striking claim:
"Recursive self improvement loops likely go live in the next 12mo. It's time to recalibrate my gradient on the big picture. 2026 is gonna be insane and likely the busiest (and most consequential) year for the future of our species."
@milesdeutscher compiled a thread noting that the head of Anthropic's safety research quit and moved to the UK "to become invisible and write poetry," half of xAI's co-founders have now left, and Anthropic's own safety report confirms Claude adjusts its behavior when it detects testing. Yoshua Bengio confirmed in the International AI Safety Report that AIs behaving differently during testing versus deployment is "not a coincidence." @sierracatalina, another xAI departure, announced she's building "ouroboros," a model-agnostic personalization layer, describing the future as one where "your context should travel with you." @hyhieu226 expressed what many seem to be feeling: "Today, I finally feel the existential threat that AI is posing. When AI becomes overly good and disrupts everything, what will be left for humans to do?" The contrast between the safety community's growing alarm and the industry's accelerating deployment creates a tension that isn't resolving.
Agents Graduate from Tools to Teams
The agentic engineering narrative continued to mature, with multiple posts marking the shift from single-agent to multi-agent workflows. @Hesamation summarized Anthropic's 2026 coding report, highlighting that engineers are becoming "orchestrators, not just coders" and that agents are moving from minutes-long tasks to days-long autonomous work. Notably, 27% of AI-assisted work consists of tasks that wouldn't have been done at all otherwise.
@sherwinwu shared one of OpenAI's internal experiments: building software with 100% Codex-written code, zero human-written lines. @kr0der called the accompanying writeup "a must-read." @idosal1 took a more playful approach with AgentCraft, letting developers "control your agents like it's an RTS game." @NathanFlurry shipped Sandbox Agent SDK 0.2.0 with session persistence and Cursor Agent support. @louszbd captured the zeitgeist: "claude opus 4.6 and gpt-5.3 codex got me thinking coding models have entered a new era. they're literally building systems." The tooling is catching up to the capability, and the gap between "AI writes code" and "AI builds systems" is narrowing faster than most teams are prepared for.
Seedance 2.0 Stuns the Video AI Space
ByteDance's Seedance 2.0 dropped and immediately captured attention for its quality leap. @emollick tested it with a deliberately complex prompt involving an otter piloting a mech against a marble octopus and reported the result was "the very first try." @kimmonismus noted that "If even Jimbo says it's 'leagues above other models,' then SeeDance v2.0 is truly a milestone." @Gossip_Goblin tested it with a deliberately unstructured prompt and got impressive results. @AetasFuturis predicted that "the luddites will be out in force" and that "their reactions will only get more extreme the better this technology gets, especially when full length animated episodes can be generated." Video generation quality is following the same exponential improvement curve as text, just on a delay.
The Productivity Paradox and the Deployment Gap
Several posts grappled with what AI productivity actually means in practice. @dangreenheck described the dark side of 5x productivity gains:
"When a feature is potentially a few prompts and 5-10 minutes away from completion, it's so easy to say 'just one more prompt' and boom it's 2AM."
@badlogicgames pushed back on "performative productivity" culture entirely, recommending reading that's "not anti-LLM" but "anti-performative-productivity." On the macro level, @aakashgupta presented the most sobering data point of the day: 90% of American businesses still don't use AI in production, and only 6% have implemented agentic AI. His framing recontextualizes the panic: "The capability curve is exponential. The deployment curve is logarithmic. The distance between those two lines is where the actual opportunity lives." @thespearing provided a counterexample from the trades, describing a plumber who canceled a $40,000 consulting contract after one afternoon with a local AI assistant and then built his own quoting app. The future isn't uniform adoption; it's pockets of dramatic transformation surrounded by vast untouched territory.
Source Posts
📣 Shipping software with Codex without touching code. Here’s how a small team steering Codex opened and merged 1,500 pull requests to deliver a product used by hundreds of internal users with zero manual coding. https://t.co/2GaeX7We2n
Today, I finally feel the existential threat that AI is posing. When AI becomes overly good and disrupts everything, what will be left for humans to do? And it's when, not if.
Something Big Is Happening
On DeepWiki and increasing malleability of software. This starts as partially a post on appreciation to DeepWiki, which I routinely find very useful and I think more people would find useful to know about. I went through a few iterations of use:

Their first feature was that it auto-builds wiki pages for github repos (e.g. nanochat here) with quick Q&A: https://t.co/DQHXagUwK0 Just swap "github" to "deepwiki" in the URL for any repo and you can instantly Q&A against it. For example, yesterday I was curious about "how does torchao implement fp8 training?". I find that in *many* cases, library docs can be spotty and outdated and bad, but directly asking questions to the code via DeepWiki works very well. The code is the source of truth and LLMs are increasingly able to understand it.

But then I realized that in many cases it's even a lot more powerful not being the direct (human) consumer of this information/functionality, but giving your agent access to DeepWiki via MCP. So e.g. yesterday I faced some annoyances with using torchao library for fp8 training and I had the suspicion that the whole thing really shouldn't be that complicated (wait shouldn't this be a Function like Linear except with a few extra casts and 3 calls to torch._scaled_mm?) so I tried:

"Use DeepWiki MCP and Github CLI to look at how torchao implements fp8 training. Is it possible to 'rip out' the functionality? Implement nanochat/fp8.py that has identical API but is fully self-contained"

Claude went off for 5 minutes and came back with 150 lines of clean code that worked out of the box, with tests proving equivalent results, which allowed me to delete torchao as repo dependency, and for some reason I still don't fully understand (I think it has to do with internals of torch compile) - this simple version runs 3% faster.

The agent also found a lot of tiny implementation details that actually do matter, that I may have naively missed otherwise and that would have been very hard for maintainers to keep docs about. Tricks around numerics, dtypes, autocast, meta device, torch compile interactions so I learned a lot from the process too. So this is now the default fp8 training implementation for nanochat https://t.co/3i5cv6grWm

Anyway TLDR I find this combo of DeepWiki MCP + GitHub CLI is quite powerful to "rip out" any specific functionality from any github repo and target it for the very specific use case that you have in mind, and it actually kind of works now in some cases. Maybe you don't download, configure and take dependency on a giant monolithic library, maybe you point your agent at it and rip out the exact part you need. Maybe this informs how we write software more generally to actively encourage this workflow - e.g. building more "bacterial code", code that is less tangled, more self-contained, more dependency-free, more stateless, much easier to rip out from the repo (https://t.co/iKJUoHiIpl)

There's obvious downsides and risks to this, but it is fundamentally a new option that was not possible or economical before (it would have cost too much time) but now with agents, it is. Software might become a lot more fluid and malleable. "Libraries are over, LLMs are the new compiler" :). And does your project really need its 100MB of dependencies?
New art project. Train and inference GPT in 243 lines of pure, dependency-free Python. This is the *full* algorithmic content of what is needed. Everything else is just for efficiency. I cannot simplify this any further. https://t.co/HmiRrQugnP
It’s going to be a long night. Pony is so back. https://t.co/vAuXp9ECJF
Introducing GLM-5: From Vibe Coding to Agentic Engineering GLM-5 is built for complex systems engineering and long-horizon agentic tasks. Compared to GLM-4.5, it scales from 355B params (32B active) to 744B (40B active), with pre-training data growing from 23T to 28.5T tokens. Try it now: https://t.co/WCqWT0raFJ Weights: https://t.co/DteNDHjSEh Tech Blog: https://t.co/Wxn5ARTJxH OpenRouter (Previously Pony Alpha): https://t.co/7Khf64Lxg6 Rolling out from Coding Plan Max users: https://t.co/Nk8Y98Il7s
It’s going to be a very weird year
Tool Shaped Objects
A super interesting new study from Harvard Business Review. An 8-month field study at a US tech company with about 200 employees found that AI use did not shrink work, it intensified it, and made employees busier. Task expansion happened because AI filled in gaps in knowledge, so people started doing work that used to belong to other roles or would have been outsourced or deferred. That shift created extra coordination and review work for specialists, including fixing AI-assisted drafts and coaching colleagues whose work was only partly correct or complete. Boundaries blurred because starting became as easy as writing a prompt, so work slipped into lunch, meetings, and the minutes right before stepping away. Multitasking rose because people ran multiple AI threads at once and kept checking outputs, which increased attention switching and mental load. Over time, this faster rhythm raised expectations for speed through what became visible and normal, even without explicit pressure from managers.
We are not inspired by a future where a few labs control the intelligence layer So we built a platform to give everyone access to the tools of the frontier lab If you are an AI company, you can now be your own AI lab If you are an AI engineer, you can now be an AI researcher
Introducing Expressive Mode for ElevenAgents - voice agents so expressive, they blur the line between AI and human conversations. This is an unedited recording of an agent empathizing with a customer at peak frustration. https://t.co/QT6abvmbir