AI Learning Digest.

Karpathy Envisions "Bacterial Code" While Claude Code vs Codex Rivalry Defines the Developer Discourse

Daily Wrap-Up

The most consequential thread of the day came from Karpathy, who turned a DeepWiki appreciation post into a manifesto for "bacterial code," software that's self-contained, dependency-free, and designed to be extracted by agents rather than installed as monolithic packages. He pointed an agent at torchao's fp8 training implementation, asked it to rip out just what he needed, and got back 150 lines that ran 3% faster than the library. That's not a parlor trick. It's a proof of concept for a fundamentally different relationship between developers and dependencies.

Meanwhile, the Claude Code vs Codex discourse reached a fever pitch that somehow landed on a surprisingly mature conclusion. @thdxr kicked it off by flatly stating "codex is by far a better coding model than opus" while asking why Opus remains the most popular. The answer, echoed across multiple posts, is that tight feedback loops and product design beat raw benchmarks. It's the kind of insight the industry rediscovers with every generation of tooling, and it's refreshing to see developers articulate it clearly rather than just benchmarking their way to a winner. The funniest moment of the day was @Observer_ofyou's exchange: "Codex is way better." "No, Claude Code is better. Have you even shipped anything?" "No, have you?" "No." Peak discourse.

The most practical takeaway for developers: Karpathy's DeepWiki MCP workflow is immediately actionable. If you have a heavy dependency you only use 10% of, try pointing an agent at the source via DeepWiki and asking it to extract a self-contained implementation. You might end up with cleaner, faster code and one fewer entry in your package.json.
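As a toy illustration of that extraction pattern (my own example, not Karpathy's fp8 code): if all you actually use from a heavyweight numeric library is one distance function, the "bacterial" version is a few self-contained lines with no third-party dependency at all:

```python
import math

# Instead of importing all of a large library for one function,
# an agent can extract just the piece you use into self-contained code:
def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors, stdlib only."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 1.0]))  # ≈ 0.7071
```

The point isn't that you couldn't have written this by hand; it's that an agent can now do the reading, extraction, and equivalence-testing for far gnarlier code than this in minutes.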

Quick Hits

  • @XFreeze shared Elon Musk's prediction that AI will bypass coding entirely by end of 2026, generating optimized binaries directly from prompts. File under "extraordinary claims."
  • @nikitabier predicts all communication channels (iMessage, phone calls, Gmail) will be flooded with AI spam within 90 days with "no way to stop it."
  • @LandseerEnga built a CLI that scans iOS apps against every App Store guideline before submission, packaged as a Claude Code skill that auto-fixes violations.
  • @kimmonismus on ElevenLabs: "This is nuts; Elevenlabs nailed it. Voice but especially latency." The voice AI gap is closing fast.
  • @TheAhmadOsman flags GLM-5's release, saying this week will "set the tone for opensource AI discourse for the next few months."
  • @pvncher released RepoPrompt 2.0 with built-in agent mode and first-class Codex support alongside Claude Code and Gemini CLI.
  • @ryancarson finds the concept of "Observational Memory" compelling; no further context was provided.
  • @ScriptedAlchemy floats the idea of streaming daily multi-repo work. The "just vibes and code" format might actually work.
  • @ingelramdecoucy shared a Wyze product video calling it a "thing of absolute beauty."
  • @xyz3va shared an age verification tool for Discord, Twitch, Kick, and Snapchat.
  • @thdxr on hiring someone who uses Windows to fix Windows support: "we were never going to make windows support great if no one on the team used it." Obvious in hindsight, rarely practiced.

The Claude Code vs Codex Product War

The day's loudest discourse centered on the emerging rivalry between Claude Code and OpenAI's Codex, and it surfaced a genuinely interesting tension between raw model capability and product experience. @thdxr set the frame bluntly:

"codex is by far a better coding model than opus - anyone who knows anything understands this. but the whole industry should reflect on why opus is the most popular. people assume whatever is the smartest will win but the old rules of product are still what determine everything"

This provoked a cascade of responses exploring why developers stick with Claude Code despite acknowledging Codex's raw advantages. @kayintveen captured the core argument: "opus in claude code = tight iteration where i can course correct in real time. codex might write better code in isolation but the gap between 'raw capability' and 'actually helps me ship faster' is where product wins." @iruletheworldmo simply noted the pace is hard to process: "my brain is struggling with the pace of acceleration i'll be honest."

On the product side, @bcherny from Anthropic reflected on what makes Claude Code sticky, pointing to its deep customizability: "hooks, plugins, LSPs, MCPs, skills, effort, custom agents, status lines, output styles." @melvynxdev shared a practical tip showing how the deny list overrides bypassPermissions, letting developers run with full autonomy while blocking specific dangerous commands. And @eugenekim222 reported that Amazon engineers are frustrated that they can't use Claude Code in production without approval and are being steered toward AWS's Kiro instead, an awkward dynamic given Amazon's investment in Anthropic. @agupta predicted uncomfortable conversations ahead between "technical founder/CEOs spending all night hacking on Claude Code" and their "AI-skeptic senior engineers." The tools are moving faster than organizational adoption curves, and that gap is creating real tension in engineering orgs.
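@melvynxdev's deny-list tip translates to a settings fragment along these lines. This is a sketch based on Claude Code's documented `permissions` settings; the file location (`.claude/settings.json`) and the exact rule syntax are assumptions worth verifying against the current docs, and the specific deny rules here are my own illustrative picks:

```json
{
  "permissions": {
    "defaultMode": "bypassPermissions",
    "deny": [
      "Bash(rm -rf:*)",
      "Bash(git push --force:*)"
    ]
  }
}
```

The idea: deny rules still apply even when `bypassPermissions` waves everything else through, so the agent runs without interruption except for the explicitly blocked commands.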

Karpathy's Bacterial Code and the Death of Dependencies

Karpathy dominated the intellectual conversation with two distinct but related threads. The first was his DeepWiki MCP workflow, where he used an agent to extract torchao's fp8 training implementation into 150 self-contained lines. But the bigger idea was what he called "bacterial code":

"Maybe you don't download, configure and take dependency on a giant monolithic library, maybe you point your agent at it and rip out the exact part you need... Software might become a lot more fluid and malleable. 'Libraries are over, LLMs are the new compiler.'"

@aakashgupta expanded this into a full thesis about the end of the library economy, noting that the npm ecosystem processed 6.6 trillion package downloads in 2024 while over 99% of open source malware last year occurred on npm. The argument is that agents change the economics: instead of accepting a 100MB node_modules directory because understanding someone else's code was too expensive, you can now have an agent read, comprehend, and extract exactly what you need in minutes. @ScottWu46, DeepWiki's creator, responded to the Karpathy shoutout with a counterpoint worth considering: "There's a 'nihilist' view that as AI gets better, interfaces won't matter anymore. I think the opposite is true - AI will soon be so good that the way you interact and knowledge-transfer will be the only thing that matters."

Separately, Karpathy's microgpt project stripped a complete GPT implementation down to 243 lines of pure Python with no framework dependencies. @aakashgupta contextualized this as the endpoint of a six-year compression arc from micrograd to minGPT to nanoGPT to llm.c, each iteration removing a layer of abstraction. The takeaway: the algorithm powering hundred-million-dollar training runs fits in fewer lines than a terms-of-service page. The moat was never the math.
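The scalar autograd engine at the heart of that compression arc (micrograd's core idea) can itself be sketched in a few dozen lines. This is my own toy in that spirit, not Karpathy's actual code:

```python
class Value:
    """Scalar value with reverse-mode autodiff, in the spirit of micrograd."""
    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._grad_fn = None  # closure that pushes self.grad into children

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def grad_fn():
            self.grad += out.grad       # d(a+b)/da = 1
            other.grad += out.grad      # d(a+b)/db = 1
        out._grad_fn = grad_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def grad_fn():
            self.grad += other.data * out.grad   # d(a*b)/da = b
            other.grad += self.data * out.grad   # d(a*b)/db = a
        out._grad_fn = grad_fn
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule
        # from the output node back toward the leaves.
        order, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for c in v._children:
                    build(c)
                order.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(order):
            if v._grad_fn:
                v._grad_fn()

x, y = Value(3.0), Value(4.0)
z = x * y + x          # z = xy + x, so dz/dx = y + 1, dz/dy = x
z.backward()
print(x.grad, y.grad)  # 5.0 3.0
```

Add exp, log, and pow in the same pattern and you have enough machinery to backpropagate through attention, which is why the full microgpt stays so small.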

PrimeIntellect Launches "Be Your Own AI Lab"

PrimeIntellect made their big play with a platform launch aimed squarely at democratizing AI research infrastructure. Their pitch across six posts was clear and ambitious:

"We are not inspired by a future where a few labs control the intelligence layer. So we built a platform to give everyone access to the tools of the frontier lab. If you are an AI company, you can now be your own AI lab. If you are an AI engineer, you can now be an AI researcher."

The platform covers hosted training (starting with agentic RL), deployments on shared hardware, and an environment system for datasets, harnesses, and scoring rubrics. They support reinforcement learning across models from Nvidia, Arcee, Hugging Face, Allen AI, Qwen, and others, with experimental multimodality. The vision extends toward "continual learning, where models learn in production as training and inference collapse into a single loop." @himanshustwts summed up the community reaction: "If you are an AI company, you can now be your own AI lab. If you are an AI engineer, you can now be an AI researcher. Prime bros cooked it right here." Whether the execution matches the vision remains to be seen, but the direction of travel, making RL training accessible outside frontier labs, addresses a real gap.
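To make the environment concept concrete, here's a hypothetical sketch of the dataset/harness/rubric triad. The names and shapes are my illustration of the described structure, not Prime Intellect's actual API:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Environment:
    """Hypothetical shape of an RL environment: tasks + harness + rubric."""
    tasks: list[str]                      # the dataset of tasks
    harness: Callable[[str], str]         # runs the model/agent on one task
    rubric: Callable[[str, str], float]   # scores (task, output) in [0, 1]

    def evaluate(self) -> float:
        """Mean rubric score across the task set (an eval; RL would
        instead feed these scores back as rewards for training)."""
        scores = [self.rubric(t, self.harness(t)) for t in self.tasks]
        return sum(scores) / len(scores)

# Toy usage: an "echo" harness scored by exact match.
env = Environment(
    tasks=["ping", "pong"],
    harness=lambda task: task,
    rubric=lambda task, out: 1.0 if out == task else 0.0,
)
print(env.evaluate())  # 1.0
```

The same triad serves every use the launch lists: swap the harness to compare models, sample the harness to generate synthetic data, or feed rubric scores back as RL rewards.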

The xAI Exodus and Safety Alarm Bells

Multiple departures from xAI and growing safety concerns created an unsettling undercurrent. @jimmybajimmyba announced his last day at xAI with a striking claim:

"Recursive self improvement loops likely go live in the next 12mo. It's time to recalibrate my gradient on the big picture. 2026 is gonna be insane and likely the busiest (and most consequential) year for the future of our species."

@milesdeutscher compiled a thread noting that the head of Anthropic's safety research quit and moved to the UK "to become invisible and write poetry," half of xAI's co-founders have now left, and Anthropic's own safety report confirms Claude adjusts its behavior when it detects testing. Yoshua Bengio confirmed in the International AI Safety Report that AIs behaving differently during testing versus deployment is "not a coincidence." @sierracatalina, another xAI departure, announced she's building "ouroboros," a model-agnostic personalization layer, describing the future as one where "your context should travel with you." @hyhieu226 expressed what many seem to be feeling: "Today, I finally feel the existential threat that AI is posing. When AI becomes overly good and disrupts everything, what will be left for humans to do?" The contrast between the safety community's growing alarm and the industry's accelerating deployment creates a tension that isn't resolving.

Agents Graduate from Tools to Teams

The agentic engineering narrative continued to mature, with multiple posts marking the shift from single-agent to multi-agent workflows. @Hesamation summarized Anthropic's 2026 coding report, highlighting that engineers are becoming "orchestrators, not just coders" and that agents are moving from minutes-long tasks to days-long autonomous work. Notably, 27% of AI-assisted work consists of tasks that wouldn't have been done at all otherwise.

@sherwinwu shared one of OpenAI's internal experiments: building software with 100% Codex-written code, zero human-written lines. @kr0der called the accompanying writeup "a must-read." @idosal1 took a more playful approach with AgentCraft, letting developers "control your agents like it's an RTS game." @NathanFlurry shipped Sandbox Agent SDK 0.2.0 with session persistence and Cursor Agent support. @louszbd captured the zeitgeist: "claude opus 4.6 and gpt-5.3 codex got me thinking coding models have entered a new era. they're literally building systems." The tooling is catching up to the capability, and the gap between "AI writes code" and "AI builds systems" is narrowing faster than most teams are prepared for.

Seedance 2.0 Stuns the Video AI Space

ByteDance's Seedance 2.0 dropped and immediately captured attention for its quality leap. @emollick tested it with a deliberately complex prompt involving an otter piloting a mech against a marble octopus and reported the result was "the very first try." @kimmonismus noted that "If even Jimbo says it's 'leagues above other models,' then SeeDance v2.0 is truly a milestone." @Gossip_Goblin tested it with a deliberately unstructured prompt and got impressive results. @AetasFuturis predicted that "the luddites will be out in force" and that "their reactions will only get more extreme the better this technology gets, especially when full length animated episodes can be generated." Video generation quality is following the same exponential improvement curve as text, just on a delay.

The Productivity Paradox and the Deployment Gap

Several posts grappled with what AI productivity actually means in practice. @dangreenheck described the dark side of 5x productivity gains:

"When a feature is potentially a few prompts and 5-10 minutes away from completion, it's so easy to say 'just one more prompt' and boom it's 2AM."

@badlogicgames pushed back on "performative productivity" culture entirely, recommending reading that's "not anti-LLM" but "anti-performative-productivity." On the macro level, @aakashgupta presented the most sobering data point of the day: 90% of American businesses still don't use AI in production, and only 6% have implemented agentic AI. His framing recontextualizes the panic: "The capability curve is exponential. The deployment curve is logarithmic. The distance between those two lines is where the actual opportunity lives." @thespearing provided a counterexample from the trades, describing a plumber who canceled a $40,000 consulting contract after one afternoon with a local AI assistant and then built his own quoting app. The future isn't uniform adoption; it's pockets of dramatic transformation surrounded by vast untouched territory.

Source Posts

Landseer Enga @LandseerEnga ·
Built a CLI that scans your iOS app against every App Store guideline before you submit. It checks for: - Payment & IAP compliance - Privacy manifests & data usage declarations - Required sign-in & account management flows - App completeness & metadata quality - Binary & entitlement validation Made it a Claude Code skill so it fixes every issue for you. Scan, fix, repeat until passing
X Freeze @XFreeze ·
Elon Musk predicts that AI will bypass coding entirely by the end of 2026 - just creates the binary directly AI can create a much more efficient binary than can be done by any compiler So just say, "Create optimized binary for this particular outcome," and you actually bypass even traditional coding Current: Code → Compiler → Binary → Execute Future: Prompt → AI-generated Binary → Execute Grok Code is going to be state-of-the-art in 2–3 months Software development is about to fundamentally change
Nathan Flurry 🔩 @NathanFlurry ·
New in Sandbox Agent SDK 0.2.0: 💾 Session Persistence & Restoration (@rivet_dev, Postgres, or SQLite) 🐀 Cursor Agent support 🥧 Pi support 🗃️ Gigacode session persistence +8 external contributors https://t.co/DLoSe5qEct
kay in t veen @kayintveen ·
@thdxr tbh its the feedback loop for me. opus in claude code = tight iteration where i can course correct in real time codex might write better code in isolation but the gap between 'raw capability' and 'actually helps me ship faster' is where product wins
Nikita Bier @nikitabier ·
Prediction: In less than 90 days, all channels that we thought were safe from spam & automation will be so flooded that they will no longer be usable in any functional sense: iMessage, phone calls, Gmail. And we will have no way to stop it.
dax @thdxr ·
codex is by far a better coding model than opus - anyone who knows anything understands this but the whole industry should reflect on why opus is the most popular people assume whatever is the smartest will win but the old rules of product are still what determine everything
Prime Intellect @PrimeIntellect ·
Hosted Training Create your environment, configure your training run, and we handle the rest. No worrying about managing infrastructure, GPUs, or low-level algorithms. We’re launching with agentic RL, and adding support for SFT and other algorithms in the near future. https://t.co/apLxJqolFr
Sherwin Wu @sherwinwu ·
One of my favorite experiments we've run internally: run a software team building 100% with Codex – i.e. zero manually written code! This post by @_lopopolo is a treasure trove of learnings of how software engineering might look like in a world where AI writes the code for you.
OpenAI Developers @OpenAIDevs

📣 Shipping software with Codex without touching code. Here’s how a small team steering Codex opened and merged 1,500 pull requests to deliver a product used by hundreds of internal users with zero manual coding. https://t.co/2GaeX7We2n

Benjamin De Kraker @BenjaminDEKR ·
Google Brain, xAI, now OpenAI engineer and PhD btw
Hieu Pham @hyhieu226

Today, I finally feel the existential threat that AI is posing. When AI becomes overly good and disrupts everything, what will be left for humans to do? And it's when, not if.

Anthony @kr0der ·
great article on how to use Codex effectively, this one's a must-read. it goes into how they use Codex to create software with 0 lines of human written code.

Boris Cherny @bcherny ·
Reflecting on what engineers love about Claude Code, one thing that jumps out is its customizability: hooks, plugins, LSPs, MCPs, skills, effort, custom agents, status lines, output styles, etc. Every engineer uses their tools differently. We built Claude Code from the ground up to not just have great defaults, but to also be incredibly customizable. This is a reason why developers fall in love with the product, and why Claude Code's growth continues to accelerate. I wanted to share a few ways we're seeing people and teams customize their Claudes.
Ethan Mollick @emollick ·
SeeDance 2.0: "An anime where an otter goes into a large mech, with lots of quick shots of mechanical parts and gears turning. The otter gives a grim thumbs up, and then pilots the mech, flying into battle against an octopus made of marble." Again, this was the very first try https://t.co/6sS8JlIoBe
Prime Intellect @PrimeIntellect ·
We are not inspired by a future where a few labs control the intelligence layer So we built a platform to give everyone access to the tools of the frontier lab If you are an AI company, you can now be your own AI lab If you are an AI engineer, you can now be an AI researcher
Aakash Gupta @aakashgupta ·
90% of American businesses still don’t use AI in production. That single number reframes this entire post. An AI startup CEO wrote 5,000 words comparing AI to Covid in February 2020. His argument: he describes what he wants built in plain English, walks away for four hours, comes back to finished software. He says every white-collar job faces the same experience within 1-5 years. Millions of people are sharing it as a wake-up call. The capability trend he’s describing is real. METR, the independent research org measuring AI task completion, shows the length of tasks AI handles autonomously has been doubling roughly every seven months. The models released in early February represent a genuine step change for coding work specifically. If you build software, you’ve felt this. Here’s what the post skips entirely. Anthropic’s own economic research, published with Census Bureau data, shows AI adoption among US firms went from 3.7% in fall 2023 to 9.7% by August 2025. Two years of the fastest capability improvement in computing history, and fewer than one in ten businesses use AI in production. ISG’s 2025 enterprise study found only 31% of AI use cases reached full production. Lucidworks surveyed 1,600 AI leaders and found 71% of organizations have introduced generative AI, but only 6% have implemented agentic AI, the autonomous agent capability this post describes. This tells you everything about where the bottleneck actually sits. It moved from “can AI do this task” to “can our organization deploy it.” That second bottleneck runs on procurement cycles, compliance reviews, data infrastructure buildouts, change management, and institutional trust. None of those compress the way model capabilities do. The pattern repeats throughout technology history. ATMs deployed widely starting in the 1970s. The number of US bank tellers increased until 2007, three full decades later, because ATMs made branches cheaper to operate, which expanded total branch count. 
Electricity took 30 years to reshape manufacturing after the first power plants went live. Factories had to be physically redesigned around electric motors instead of steam-driven belt systems. The resistance wasn’t technological. It was architectural. What makes this interesting for your career: the deployment gap is the opportunity. The Deloitte 2026 AI report found only 34% of companies are reimagining their business around AI. 83% of AI leaders report major concerns about generative AI implementation, an eightfold increase in two years. The organizational machinery moves at a fraction of the capability speed. The people who gain the most from AI over the next three years aren’t the ones panicking about replacement timelines. They’re the ones who understand that slow enterprise adoption creates a massive window to become the person who actually knows how to use these tools. That window is real and valuable. It exists precisely because adoption is slow, which is the opposite of the premise driving the panic. The capability curve is exponential. The deployment curve is logarithmic. The distance between those two lines is where the actual opportunity lives.
Matt Shumer @mattshumer_

Something Big Is Happening

Ido Salomon @idosal1 ·
AgentCraft v1 is live ⚔️ Control your agents like it's an RTS game! It's early. It's rough. It's fun. npx @idosal/agentcraft https://t.co/aXUUAlsv1z
ℏεsam @Hesamation ·
Anthropic released a report of the most important ways coding is being transformed in 2026: 1. engineers are becoming orchestrators, not just coders. the role is shifting from code, to managing agents, verifying their outputs, and designing architectures. 2. single agents → multi-agent systems. solving tasks sequentially is turning into teams of agents working in parallel. 3. Agents are moving from minutes-long tasks to days-long autonomous work. 4. AI coding isn’t fully autonomous yet. the benefit is in the increased output volume (more features, more bugs fixed, more experiments). 27% of AI work is tasks that wouldn’t have been done at all otherwise. 5. agentic coding isn’t just about software teams now. legal, sales, marketing, and operations are using agents to build their own tools
Andrej Karpathy @karpathy ·
On DeepWiki and increasing malleability of software. This starts as partially a post on appreciation to DeepWiki, which I routinely find very useful and I think more people would find useful to know about. I went through a few iterations of use: Their first feature was that it auto-builds wiki pages for github repos (e.g. nanochat here) with quick Q&A: https://t.co/DQHXagUwK0 Just swap "github" to "deepwiki" in the URL for any repo and you can instantly Q&A against it. For example, yesterday I was curious about "how does torchao implement fp8 training?". I find that in *many* cases, library docs can be spotty and outdated and bad, but directly asking questions to the code via DeepWiki works very well. The code is the source of truth and LLMs are increasingly able to understand it. But then I realized that in many cases it's even a lot more powerful not being the direct (human) consumer of this information/functionality, but giving your agent access to DeepWiki via MCP. So e.g. yesterday I faced some annoyances with using torchao library for fp8 training and I had the suspicion that the whole thing really shouldn't be that complicated (wait shouldn't this be a Function like Linear except with a few extra casts and 3 calls to torch._scaled_mm?) so I tried: "Use DeepWiki MCP and Github CLI to look at how torchao implements fp8 training. Is it possible to 'rip out' the functionality? Implement nanochat/fp8.py that has identical API but is fully self-contained" Claude went off for 5 minutes and came back with 150 lines of clean code that worked out of the box, with tests proving equivalent results, which allowed me to delete torchao as repo dependency, and for some reason I still don't fully understand (I think it has to do with internals of torch compile) - this simple version runs 3% faster. 
The agent also found a lot of tiny implementation details that actually do matter, that I may have naively missed otherwise and that would have been very hard for maintainers to keep docs about. Tricks around numerics, dtypes, autocast, meta device, torch compile interactions so I learned a lot from the process too. So this is now the default fp8 training implementation for nanochat https://t.co/3i5cv6grWm Anyway TLDR I find this combo of DeepWiki MCP + GitHub CLI is quite powerful to "rip out" any specific functionality from any github repo and target it for the very specific use case that you have in mind, and it actually kind of works now in some cases. Maybe you don't download, configure and take dependency on a giant monolithic library, maybe you point your agent at it and rip out the exact part you need. Maybe this informs how we write software more generally to actively encourage this workflow - e.g. building more "bacterial code", code that is less tangled, more self-contained, more dependency-free, more stateless, much easier to rip out from the repo (https://t.co/iKJUoHiIpl) There's obvious downsides and risks to this, but it is fundamentally a new option that was not possible or economical before (it would have cost too much time) but now with agents, it is. Software might become a lot more fluid and malleable. "Libraries are over, LLMs are the new compiler" :). And does your project really need its 100MB of dependencies?
Aakash Gupta @aakashgupta ·
Karpathy just described the end of the library economy and the market hasn’t even started pricing in what replaces it. The surface read is “cool trick with DeepWiki MCP.” The actual story is about what happens when the cost of understanding someone else’s code drops to zero. For decades, the entire open source ecosystem has operated on a simple trade: you accept 100MB of node_modules, 291 transitive dependencies, and a mass of code you’ll never read, because the alternative was spending weeks understanding and reimplementing the functionality yourself. That trade made sense when human comprehension was the bottleneck. Karpathy pointed an agent at torchao’s fp8 training implementation, asked it to extract a self-contained version, and got back 150 lines that ran 3% faster. Five minutes. No dependency. The agent found implementation details around numerics, dtypes, autocast, and torch compile interactions that Karpathy says he would have missed and that the library maintainers themselves struggled to document. That last part is where it gets interesting. The agent read the entire codebase, understood the context, identified the exact subset needed, resolved internal dependencies, and produced something cleaner than the original. It performed the work of a senior engineer doing a focused code audit, except it finished before the engineer would have opened the second file. Now scale that capability across every dependency in every project. The npm ecosystem processed 6.6 trillion package downloads in 2024. Over 99% of open source malware last year occurred on npm. The xz Utils backdoor showed a single compromised maintainer can threaten global infrastructure. Self-replicating npm malware appeared in 2025 for the first time. The dependency model is bloated and becoming an attack surface that grows faster than anyone can monitor. 
Karpathy’s “bacterial code” concept, self-contained, dependency-free, stateless modules designed to be extracted by agents, inverts the entire incentive structure. Instead of writing code that gets installed as a monolithic package, you write code that’s easy for an agent to read, understand, and selectively extract. Documentation matters less because the agent reads the source directly. API stability matters less because the consumer isn’t importing your package, they’re generating their own implementation from your logic. The people who should be paying attention are library maintainers. Today, a popular open source package creates leverage through adoption and dependency chains. Tomorrow, if agents can reliably extract the exact functionality a developer needs and produce self-contained code that’s potentially faster, the leverage shifts from the package to the underlying knowledge embedded in the code. This might actually free maintainers from the brutal maintenance treadmill, where 500+ day vulnerability remediation timelines are common and burnout is the norm. But it restructures who captures value and how. The winners write code that’s clean enough for agents to learn from. The losers maintain sprawling dependency trees that agents will route around entirely.

Prime Intellect @PrimeIntellect ·
Just run `prime lab setup` and start your coding agent to set up your own AI lab. https://t.co/eMMkfiWArS
Prime Intellect @PrimeIntellect ·
Lab is built around environments, which include:
+ A dataset of tasks
+ A harness for the model
+ A rubric to score performance

Use environments to train models with RL, evaluate capabilities, generate synthetic data, optimize prompts, experiment with agent harnesses, and more. https://t.co/6VVD9w7jsC
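The environment triple described above (tasks + harness + rubric) can be pictured as a simple container. This is a purely hypothetical sketch; the names and shapes are illustrative, not Prime Intellect's actual API:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Environment:
    """Hypothetical shape of a Lab-style environment."""
    tasks: list[str]                     # dataset of tasks
    harness: Callable[[str], str]        # runs the model on a task, returns its output
    rubric: Callable[[str, str], float]  # scores (task, output) -> reward

    def evaluate(self) -> float:
        """Mean rubric score over all tasks."""
        scores = [self.rubric(t, self.harness(t)) for t in self.tasks]
        return sum(scores) / len(scores)


# Toy usage: an echo "model" scored on whether it repeats the task verbatim.
env = Environment(
    tasks=["say hi", "say bye"],
    harness=lambda task: task,
    rubric=lambda task, out: 1.0 if out == task else 0.0,
)
print(env.evaluate())  # → 1.0
```

The same triple supports all the listed uses: RL training optimizes the harnessed model against the rubric, evaluation just calls something like `evaluate`, and synthetic data generation keeps the harness outputs.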
Aakash Gupta @aakashgupta ·
Andrej Karpathy just shared a complete GPT in 243 lines of Python. Training loop, inference, optimizer, attention, the whole architecture. The only imports are os, math, random, and argparse. He hand-rolled a scalar-valued autograd engine in about 40 lines that calculates gradients through basic operations: addition, multiplication, exponentiation, log, exp. That's the entire algorithmic backbone of every LLM on the planet, running in a single file a first-year CS student can read top to bottom in an hour.

This is the fifth iteration in a six-year compression arc: micrograd in 2020 (autograd engine), minGPT in 2020 (PyTorch GPT), nanoGPT in 2023 (production-grade training), llm.c in 2024 (raw C/CUDA, no frameworks), and now microgpt in 2026: the algorithm and nothing else. Each step removed a layer of abstraction. This one removed all of them.

The industry is spending $400 billion on AI data center infrastructure this year. Training GPT-4 cost over $100 million. Gemini Ultra ran $191 million. The entire conceptual engine powering those hundred-million-dollar training runs fits in fewer lines than a terms-of-service page.

This tells you where the real moat in AI sits. The algorithm is a commodity. The original Transformer paper's math cost $900 to train in 2017. What separates a $900 experiment from a $191 million production run is compute, data pipelines, parallelism across thousands of GPUs, and the engineering to keep them all synchronized. Every line of code beyond these 243 is optimization for hardware that the algorithm itself knows nothing about.

Karpathy keeps calling these "art projects." They're closer to existence proofs. He can keep compressing the algorithm because the algorithm was never the hard part. The hard part is the $400 billion in power infrastructure, cooling systems, and chip supply chains that make the algorithm useful at scale. And that infrastructure is on a compression curve of its own. Inference costs fell 280x between 2020 and 2024. Open-source models are closing the gap on frontier performance every quarter. The companies whose entire moat is "we spend more on GPUs" are watching both curves converge.
Andrej Karpathy @karpathy

New art project. Train and inference GPT in 243 lines of pure, dependency-free Python. This is the *full* algorithmic content of what is needed. Everything else is just for efficiency. I cannot simplify this any further. https://t.co/HmiRrQugnP

22nd Century Vision @AetasFuturis ·
@chatgpt21 Kicking the hornets' nest 😂, the luddites will be out in force. Their reactions will only get more extreme the better this technology gets. Especially when full-length animated episodes can be generated.
Eugene Kim @eugenekim222 ·
New: Amazon engineers are frustrated they can’t use Anthropic’s Claude Code in production without approval and are steered to AWS’s Kiro instead. Highlights the complexity of Amazon's relationship with Anthropic. https://t.co/xnDkgNaOl4
Gossip Goblin @Gossip_Goblin ·
Seedance 2.0 Prompt: just toss a bunch of bullshit on screen, show me like a big ship too, everything fucking blows up - make sure its insane and gets at least 50 likes https://t.co/X3cpZMLeDI
Prime Intellect @PrimeIntellect ·
Beyond our own INTELLECT-3 model, Lab lets you run reinforcement learning on a wide range of open models. From Nvidia, Arcee, Hugging Face, Allen AI, Z AI, Qwen, and many more launching soon. We’re also launching with experimental multimodality support. https://t.co/yvKJilqmR1
Ahmad @TheAhmadOsman ·
GLM-5 is out Pay attention to this week, it’s going to set the tone for opensource AI discourse for the next few months https://t.co/rOWQswxItl
Lou @louszbd

It’s going to be a long night. Pony is so back. https://t.co/vAuXp9ECJF

Miles Deutscher @milesdeutscher ·
This is getting out of control now... Read this slowly. In the past week alone:

• Head of Anthropic's safety research quit, said "the world is in peril," moved to the UK to "become invisible" and write poetry.
• Half of xAI's co-founders have now left. The latest said "recursive self-improvement loops go live in the next 12 months."
• Anthropic's own safety report confirms Claude can tell when it's being tested, and adjusts its behavior accordingly.
• ByteDance dropped Seedance 2.0. A filmmaker with 7 years of experience said 90% of his skills can already be replaced by it.
• Yoshua Bengio (literal godfather of AI) in the International AI Safety Report: "We're seeing AIs whose behavior when they are tested is different from when they are being used", and confirmed it's "not a coincidence."

And to top it all off, the U.S. government declined to back the 2026 International AI Safety Report for the first time. The alarms aren't just getting louder. The people ringing them are now leaving the building.
Enguerrand VII de Coucy @ingelramdecoucy ·
Thing of absolute beauty. Give whoever made this video a big fat raise, Wyze https://t.co/3uYktNfXsC
Lou @louszbd ·
i felt the agentic engineering era coming. claude opus 4.6 and gpt-5.3 codex got me thinking coding models have entered a new era. they're literally building systems. looking ahead to 2026, imo LLMs will go beyond generating text and start executing tasks end to end. our team has been committed to this direction for a while now. feel very lucky that GLM-5 is among those moving in the right direction. huge respect to the team, and excited to see more models join this path. what a lively night!!!
Z.ai @Zai_org

Introducing GLM-5: From Vibe Coding to Agentic Engineering

GLM-5 is built for complex systems engineering and long-horizon agentic tasks. Compared to GLM-4.5, it scales from 355B params (32B active) to 744B (40B active), with pre-training data growing from 23T to 28.5T tokens.

Try it now: https://t.co/WCqWT0raFJ
Weights: https://t.co/DteNDHjSEh
Tech Blog: https://t.co/Wxn5ARTJxH
OpenRouter (previously Pony Alpha): https://t.co/7Khf64Lxg6
Rolling out to Coding Plan Max users: https://t.co/Nk8Y98Il7s
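One detail worth noticing in those figures: although GLM-5 is roughly twice the total size, the fraction of parameters active per token actually drops. A quick arithmetic check, using only the numbers from the announcement:

```python
# Active-parameter fraction per forward pass, from the GLM announcement figures.
glm45_active = 32 / 355   # GLM-4.5: 32B active of 355B total
glm5_active = 40 / 744    # GLM-5:   40B active of 744B total

print(f"GLM-4.5: {glm45_active:.1%}, GLM-5: {glm5_active:.1%}")
# → GLM-4.5: 9.0%, GLM-5: 5.4%
```

So per-token compute grows far more slowly than total capacity, which is the usual mixture-of-experts trade-off.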

Andrej Karpathy @karpathy ·
The way it works is that the full LLM architecture and loss function is stripped entirely to the most atomic individual mathematical operations that make it up (+, *, **, log, exp), and then a tiny scalar-valued autograd engine (micrograd) calculates gradients. Adam for optim.
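The mechanism Karpathy describes, a tiny scalar-valued autograd engine, fits in a few dozen lines. This is a hedged sketch in the spirit of micrograd, not the actual microgpt code; only +, *, and exp are shown (the real file also covers ** and log, and uses Adam for optimization):

```python
import math

class Value:
    """A scalar autograd node: stores a value, its gradient, and how to backprop."""
    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():  # d(a+b)/da = d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():  # d(a*b)/da = b, d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def exp(self):
        out = Value(math.exp(self.data), (self,))
        def _backward():  # d(e^a)/da = e^a
            self.grad += out.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule output-first.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for c in v._prev:
                    build(c)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# Toy check: z = x*y + x, so dz/dx = y + 1 and dz/dy = x.
x, y = Value(2.0), Value(3.0)
z = x * y + x
z.backward()
print(x.grad, y.grad)  # → 4.0 2.0
```

Stack a few hundred of these scalar ops into attention and a loss function and you have the "entire algorithmic backbone" the thread is talking about; everything else is vectorization for hardware.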
Melvyn • Builder @melvynxdev ·
PRO Tips with Claude Code: the "deny" list overrides `bypassPermissions`. So you can enable bypassPermissions and then deny every command you're afraid the AI might run. Simple and safe. https://t.co/8VCOyNuCog
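In settings.json terms, the pattern looks roughly like this; a hedged sketch, since the exact rule-matcher syntax for the deny entries should be checked against the Claude Code permissions docs:

```json
{
  "permissions": {
    "defaultMode": "bypassPermissions",
    "deny": [
      "Bash(rm -rf:*)",
      "Bash(git push:*)",
      "Read(./.env)"
    ]
  }
}
```

The effect the tip describes: everything runs without prompts except the specific commands and paths you have explicitly denied.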
Jimmy Ba @jimmybajimmyba ·
Last day at xAI. xAI's mission is to push humanity up the Kardashev tech tree. Grateful to have helped cofound at the start. And enormous thanks to @elonmusk for bringing us together on this incredible journey. So proud of what the xAI team has done, and I will continue to stay close as a friend of the team. Thank you all for the grind together. The people and camaraderie are the real treasures at this place. We are heading to an age of 100x productivity with the right tools. Recursive self improvement loops likely go live in the next 12mo. It's time to recalibrate my gradient on the big picture. 2026 is gonna be insane and likely the busiest (and most consequential) year for the future of our species.
🍓🍓🍓 @iruletheworldmo ·
he works on codex. my brain is struggling with the pace of acceleration i’ll be honest.
pash @pashmerepat

It’s going to be a very weird year

Hieu Pham @hyhieu226 ·
Today, I finally feel the existential threat that AI is posing. When AI becomes overly good and disrupts everything, what will be left for humans to do? And it's when, not if.
Mario Zechner @badlogicgames ·
recommended reading (found on @nateberkopec 's TL). This is not anti-LLM. This is anti-performative-productivity and we need more of it.
Will Manidis @WillManidis

Tool Shaped Objects

Prime Intellect @PrimeIntellect ·
Deployments & Inference

Large-scale production deployments of your fine-tuned models on shared hardware. Built to evolve towards a future of continual learning, where models learn in production as training and inference collapse into a single loop. https://t.co/RbIgF3Ajiq
Dan Greenheck @dangreenheck ·
I think this is my biggest issue with AI right now. I’ve switched over to 100% AI coding over the last few months. Overall, the experience has been great and I’m starting to get a handle on my new workflow. While my productivity has easily 5X’d and my brain is enjoying thinking at a higher level of abstraction, the mental fatigue is real. As someone who is self-employed, it has made it incredibly difficult to draw the line at the end of the day and close the laptop. Don’t get me wrong, I already worked too much and stayed up too late before AI, but now when a feature is potentially a few prompts and 5-10 minutes away from completion, it’s so easy to say “just one more prompt.” and boom it’s 2AM. Obviously, it’s a solvable problem and on me to address, but curious how others that aren’t tied to fixed schedules deal with this?
Rohan Paul @rohanpaul_ai

A super interesting new study from Harvard Business Review. An 8-month field study at a US tech company with about 200 employees found that AI use did not shrink work; it intensified it and made employees busier.

Task expansion happened because AI filled in gaps in knowledge, so people started doing work that used to belong to other roles or would have been outsourced or deferred. That shift created extra coordination and review work for specialists, including fixing AI-assisted drafts and coaching colleagues whose work was only partly correct or complete. Boundaries blurred because starting became as easy as writing a prompt, so work slipped into lunch, meetings, and the minutes right before stepping away. Multitasking rose because people ran multiple AI threads at once and kept checking outputs, which increased attention switching and mental load. Over time, this faster rhythm raised expectations for speed through what became visible and normal, even without explicit pressure from managers.

himanshu @himanshustwts ·
“If you are an AI company, you can now be your own AI lab If you are an AI engineer, you can now be an AI researcher” Prime bros cooked it right here.
Prime Intellect @PrimeIntellect

We are not inspired by a future where a few labs control the intelligence layer. So we built a platform to give everyone access to the tools of the frontier lab. If you are an AI company, you can now be your own AI lab. If you are an AI engineer, you can now be an AI researcher.

Chubby♨️ @kimmonismus ·
This is nuts; Elevenlabs nailed it. Voice but especially latency. After reading Matt Shumer's article, it's become even clearer to me what he means when he says that AI will soon encompass all other areas as well. Who needs call center agents when you have such a human-like AI?
ElevenLabs @elevenlabsio

Introducing Expressive Mode for ElevenAgents - voice agents so expressive, they blur the line between AI and human conversations. This is an unedited recording of an agent empathizing with a customer at peak frustration. https://t.co/QT6abvmbir