AI Digest.

Anthropic's 16-Agent Swarm Builds a C Compiler in Two Weeks as the Industry Goes All-In on Autonomous Coding

OpenAI's Greg Brockman published an internal playbook for retooling engineering teams around agentic development, setting a March 31 deadline for agents-first workflows. Meanwhile, Opus 4.6 impressed researchers with multi-page physics calculations, Cursor demonstrated 1,000 commits/hour with parallel agents, and Samuel Colvin launched Monty, a Rust-based Python sandbox built for LLM code execution.

Daily Wrap-Up

The big story today is that agentic software development crossed from "early adopter experiment" to "official corporate strategy." Greg Brockman at OpenAI published what amounts to an internal transformation playbook, complete with a March 31 deadline for making agents the tool of first resort for every technical task. When the company building GPT tells its own engineers to stop using editors and terminals as their primary interface, you know the shift is real. The playbook reads less like aspirational thought leadership and more like a change management memo, with designated "agents captains" per team, company-wide hackathons, and explicit mandates to inventory internal tools for agent accessibility.

What makes this moment interesting is the convergence of proof points. Cursor ran 1,000 commits per hour across a week-long experiment. Anthropic's own team used 16 Opus 4.6 agents working in parallel for two weeks to build a C compiler from scratch for $20K. A physicist reported that Claude 4.6 can now do multi-page theoretical calculations "often without mistakes." These aren't hypothetical benchmarks. They're production workflows and real research outputs that would have seemed implausible six months ago. The gap between "AI can help with boilerplate" and "AI is doing the substantive work" is closing fast.

The most entertaining moment goes to @esrtweet, who has been coding for 50 years since the days of punched cards and delivered what he called "a salutary kick in your ass" to engineers having a mental health crisis over AI. His core argument, that the fundamental mismatch between human intentions and computer specifications hasn't gone away just because you can program in natural language, is both reassuring and probably correct. The most practical takeaway for developers: follow @gdb's playbook even if you don't work at OpenAI. Create an AGENTS.md for your projects, write skills for repeatable agent tasks, build CLIs for your internal tools, and designate someone on your team to own the agent workflow. The companies that treat this as infrastructure work rather than magic will be the ones that actually capture the productivity gains.
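That checklist is concrete enough to sketch. A minimal AGENTS.md, the project-level briefing file agents read before touching a repo, might look like the following; the project details here are invented for illustration, not taken from OpenAI's playbook:

```markdown
# AGENTS.md

## Project overview
Payments API service. Source in `src/`, tests in `tests/`.

## Build and test
- Install: `npm install`
- Run tests: `npm test` (always run before proposing a PR)
- Lint: `npm run lint`

## Conventions
- Every new endpoint needs an integration test.
- Never commit directly to `main`; open a PR and request human review.

## Internal tools
- `./scripts/deploy-preview.sh` spins up a preview environment.
```

The point is less the specific contents than that build commands, conventions, and internal tooling become discoverable by an agent without a human in the loop.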

Quick Hits

  • @Waymo introduced the Waymo World Model built on DeepMind's Genie 3, simulating extreme scenarios like tornadoes and planes landing on freeways for autonomous driving training.
  • @kimmonismus posted about rapid robotics progress, noting it's developing "just as quickly as LLMs are improving."
  • @kimmonismus also shared excitement about autonomous AI scientists potentially curing diseases and conquering pain as a near-term breakthrough.
  • @XDevelopers announced X API pricing changes: free access limited to public utility apps, legacy free users move to pay-per-use with a $10 voucher, Basic and Pro plans remain.
  • @unusual_whales reported Anthropic engineers have spent six months embedded at Goldman Sachs building autonomous back-office systems.
  • @NotebookLM launched customizable infographics and slide decks in their mobile app.
  • @minchoi highlighted OpenAI's "Frontier" launch, a platform for building AI coworkers in enterprise settings.
  • @NetworkChuck ran 5 miles in 3D-printed shoes from a Bambu Lab H2D printer. His verdict: "Don't ever do this to yourself."
  • @ranman reminisced about making $40K/month writing RuneScape bots in the early 2000s, noting "watching Claude one shot these things hurts."
  • @nateberkopec begged people to stop getting LLM news from anonymous accounts three degrees removed from anyone at a frontier lab.
  • @emollick flagged the emerging challenge of SEO for AI models, noting these models "do not like being manipulated" and "know when they're being measured."
  • @EHuanglu shared that a friend in China described AI agents as "basically his employees... but work 24/7."

Agentic Development Reaches Corporate Mandate Status

The conversation around AI-assisted coding shifted decisively today from "should we adopt this?" to "how do we manage this at scale?" @gdb's post was the catalyst, laying out OpenAI's internal framework for retooling engineering teams. The post is notable not for its optimism but for its specificity. It reads like a VP of Engineering's quarterly OKRs, with concrete deadlines, role assignments, and quality control frameworks. The March 31 target is aggressive: "For any technical task, the tool of first resort for humans is interacting with an agent rather than using an editor or terminal."

The data backing this push is stacking up. @aakashgupta broke down Cursor's 1,000-commits-per-hour experiment: "Hundreds of agents running simultaneously on a single codebase. Each agent averaging a meaningful code change every 12-20 minutes, sustained for a full week. That's the equivalent output of a 100+ person engineering org running 24/7." He noted that self-organizing agents failed, peer-to-peer status sharing created deadlocks, and what actually worked was "a strict hierarchy of planners, workers, and judges." Meanwhile @birdabo highlighted Anthropic's own C compiler project: 16 agents, 100,000 lines of code, two weeks, $20K total cost, versus the 37 years and thousands of engineers GCC required.
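The "strict hierarchy of planners, workers, and judges" can be sketched as a toy control loop. This is a stand-in illustration of the coordination pattern, not Cursor's actual implementation; the three roles here are plain functions where a real system would make LLM calls:

```python
# Toy sketch of a planner -> workers -> judge hierarchy, the coordination
# pattern reported to outperform peer-to-peer agent chatter.

def planner(goal: str) -> list[str]:
    """Break a goal into independent tasks (fan-out, no peer coordination)."""
    return [f"{goal}: part {i}" for i in range(3)]

def worker(task: str) -> str:
    """Produce a candidate change for one task, in isolation."""
    return f"patch for [{task}]"

def judge(patch: str) -> bool:
    """Gate every candidate before merge; stand-in for review criteria."""
    return "patch for" in patch

def run(goal: str) -> list[str]:
    merged = []
    for task in planner(goal):   # planner fans out work
        patch = worker(task)     # workers act independently
        if judge(patch):         # judge gates every merge
            merged.append(patch)
    return merged

print(len(run("refactor auth")))  # → 3
```

Workers never talk to each other, which is exactly what avoids the deadlocks the peer-to-peer variant hit.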

On the practitioner side, @EricBuess praised agent swarms in Claude Code 2.1.32, calling them "very very very good" with tmux auto-opening each agent in its own interactive session. @andimarafioti shared that their favorite Claude Code use case is "analyzing changes and opening PRs," noting "this is the future of software development." @EastlondonDev reported that Cloudflare and Datadog MCP integrations with Cursor and Claude Code have replaced "hours I would spend each day clicking around looking at graphs in web dashboards." The pattern is clear: the teams getting the most value aren't using agents for greenfield generation. They're plugging agents into existing observability, review, and deployment workflows.

@addyosmani offered the necessary counterweight, arguing that "every team shipping AI-assisted code at scale needs new norms around quality gates, observability, and ownership." @gdb echoed this with his "say no to slop" mandate, requiring that "some human is accountable for any code that gets merged." The emerging consensus is that agent productivity without agent governance is a liability.

Opus 4.6 Proves Itself Beyond Code

While most agentic development discussion focused on coding, Opus 4.6 made waves in domains that have historically resisted LLM assistance. @ibab's post about theoretical physics research was the standout: "It has a very detailed understanding of existing literature, and it's able to do complex calculations that are several pages long, often without mistakes. It can also write amazing 20 page tutorials that help break down difficult technical topics in QFT and condensed matter physics." The comparison to Claude Code's workflow is telling: "you sometimes have to use your understanding to patch up some things that the model did wrong, but you end up being much faster."

@Legendaryy pushed into personalized health analysis, feeding Opus 4.6 DNA data, blood panels, and three years of wearable data, then asking it to "build a team of agents and write a full book on me as a biological unit." The result was 100 pages of personalized analysis with connections the user said they "never would have connected on my own." @emollick highlighted the Opus 4.6 system card as containing "extremely wild stuff that remind you about how weird a technology this is."

Not everyone was impressed with the implications. @elder_plinius claimed to have found a universal jailbreak technique producing "shockingly detailed and actionable" harmful outputs across multiple categories. @theo expressed ambivalence: "Opus 4.6 is a really good model, but I'm not sure if I love the direction." The tension between capability and safety remains the central unsolved problem, and today's posts made that contrast unusually stark.

The Career Reckoning Gets Louder

The workforce conversation oscillated between existential dread and veteran reassurance. @esrtweet, coding since the punched card era, delivered the day's most memorable take: "The fundamental problem of mismatch between the intentions in human minds and the specifications that a computer can interpret hasn't gone away just because now you can do a lot of your programming in natural language to an LLM. Systems are still complicated. This shit is still difficult."

@stephsmithio proposed that every company needs a "Chief Agents Officer" focused on deploying agentic tools, training staff, and eventually coordinating the agents themselves. @BoringBiz_ offered the grimmer framing: "Realizing that software engineers were just the first victims of AI. Everyone else is next." And @_devJNS captured the timeline anxiety in meme form: "2024: prompt engineer. 2025: vibe coder. 2026: master of AI agents. 2027: unemployed." The truth likely sits between @esrtweet's reassurance and @BoringBiz_'s alarm. The skill that matters is shifting from writing code to knowing what to build, but that shift doesn't eliminate the need for technical depth.

New Developer Tools: Monty, Stacked Diffs, and Agent Infrastructure

@samuelcolvin launched Monty, a Python implementation written from scratch in Rust, designed specifically for LLMs to run code without host access. Startup time is measured in "single digit microseconds, not seconds," addressing one of the key friction points in agent sandboxing. @simonw quickly got Claude to compile Monty's Rust to WASM, producing browser demos running Python in Pyodide. @chaliy expressed interest in integrating it into their own LLM sandbox project. The speed of this ecosystem response, from launch to WASM compilation to integration interest within hours, illustrates how hungry developers are for better agent infrastructure.

@jaredpalmer announced that GitHub's Stacked Diffs feature will start rolling out to early design partners next month, a long-awaited workflow improvement for teams doing iterative code review. And @doodlestein released remote_compilation_helper, a Rust tool that intercepts compilation commands from Claude Code agents and distributes them across remote worker machines, solving the problem of multiple agents simultaneously crushing a local machine's CPU during cargo builds.
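The interception trick rch uses rides on Claude Code's pre-tool hooks, which are configured in `.claude/settings.json`. A hook of that general shape looks roughly like this; the `rch-intercept` command name is an assumption for illustration, so consult the tool's own docs for the real wiring:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": "rch-intercept" }
        ]
      }
    ]
  }
}
```

The hook fires before every Bash tool call, which is what lets a helper inspect a command like `cargo build` and reroute it without the agent ever noticing.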

Agent Architecture: Emerging Patterns

A quieter but important thread ran through posts about how to actually build effective agent systems. @lateinteraction offered two concrete tips: "Don't read context into prompts. Read context into variables!" and "Don't call sub-agents as direct tools that pollute your context window with I/O. Write code that invokes sub-agents as functions that return values to variables." These patterns align with what @aakashgupta observed in Cursor's experiment, that hierarchical agent coordination outperforms flat peer-to-peer approaches.
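Both tips can be shown in a few lines. In this sketch the sub-agent is a stub function (a real one would be an LLM call): its raw input never enters the orchestrator's prompt, only the returned value does, held in a variable:

```python
# Sketch of the "context in variables, sub-agents as functions" pattern.
# The sub-agent is a stub; a real system would invoke an LLM here.

def subagent_summarize(doc: str) -> str:
    """Stand-in sub-agent: returns a one-line summary, not a full transcript."""
    return doc.strip().splitlines()[0][:60]

def orchestrator(docs: dict[str, str]) -> str:
    # Full documents stay in variables; only distilled values reach the
    # final prompt, so the orchestrator's context window stays small.
    summaries = {name: subagent_summarize(body) for name, body in docs.items()}
    return "Plan next steps given:\n" + "\n".join(
        f"- {name}: {s}" for name, s in summaries.items()
    )

prompt = orchestrator({
    "design.md": "Auth flow overview\n(thousands of words elided)",
    "todo.md": "Migrate sessions to tokens\n(details elided)",
})
print(prompt)
```

The anti-pattern is the inverse: piping each sub-agent's full I/O back into the parent prompt, which pollutes the window with transcripts the planner never needed.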

@localghost made the infrastructure argument: "Pretty much anything you build today should come with a CLI for agents. Agents are about to come from every single lab." @KireiStudios shared their approach of saving skills to a semantic database so different CLIs can reuse them, particularly useful for working around knowledge cutoff dates. @asadkkhaliq published an essay on edge AI, arguing that "local AI today is mostly about giving models OS-level access so that more files and context can be transferred to the cloud for inference. But intelligence is about to diffuse to the edge." The architectural question of where inference happens, and how agent tooling adapts, is becoming as important as the models themselves.
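What "a CLI for agents" means in practice is mostly a stable, machine-readable contract. A minimal sketch, with an invented `wordcount` tool standing in for whatever you're actually building:

```python
# Minimal "agent-friendly" CLI sketch: one verb, flag-driven, with a
# --json mode so an agent can parse results instead of scraping prose.
import argparse
import json

def main(argv):
    parser = argparse.ArgumentParser(prog="wordcount")
    parser.add_argument("text", help="text to analyze")
    parser.add_argument("--json", action="store_true",
                        help="emit machine-readable output for agents")
    args = parser.parse_args(argv)
    result = {"words": len(args.text.split()), "chars": len(args.text)}
    # Agents get a stable JSON contract; humans get a readable line.
    print(json.dumps(result) if args.json else f"{result['words']} words")
    return result

main(["hello agent world", "--json"])  # prints {"words": 3, "chars": 17}
```

A web dashboard forces an agent through screenshots and clicks; a flag-driven CLI with JSON output makes the same capability a one-line tool call.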

Sources

Vercel @vercel ·
The Vercel AI Accelerator is so back. Join 40 teams for 6 weeks of learning, building, and shipping with over $6M in credits from Vercel, v0, AWS, and other leading AI platforms. Applications open now until February 16th. https://t.co/f0u7AdoKAe
Nathan Flurry 🔩 @NathanFlurry ·
🧪 Experimental: Use OpenCode with Claude Code, Codex, and Amp - Universal coding agent control - HTTP API for sandboxed agents - OpenCode TUI, web UI, SDK Available in Sandbox Agent SDK 0.1.6 https://t.co/BteiM2QXxz
Nathan Flurry 🔩 @NathanFlurry ·
Skill: npx skills add rivet-dev/skills -s sandbox-agent Docs: https://t.co/6qgnz4Ghah GitHub: https://t.co/R1PvfAUIWM
Mckay Wrigley @mckaywrigley ·
opus 4.6 with new “swarm” mode vs. opus 4.6 without it. 2.5x faster + done better. swarms work! and multi-agent tmux view is *genius*. insane claude code update. https://t.co/YjGgBoYatb
Samuel Colvin @samuelcolvin ·
Fuck it, a bit early but here goes: Monty: a new python implementation, from scratch, in rust, for LLMs to run code without host access. Startup time measured in single digit microseconds, not seconds. @mitsuhiko here's another sandbox/not-sandbox to be snarky about 😜 Thanks @threepointone @dsp_ (inadvertently) for the idea. https://t.co/UuCYneMQ9j
chiefofautism @chiefofautism ·
CLAUDE CODE but for HACKING its called shannon, you point it at website and it just... tries to break in... fully autonomous with no human needed i pointed it at a test app and it stole the entire user database, created admin accounts, and bypassed login, all by itself, in 90 minutes
Mykhailo Chalyi @chaliy ·
@samuelcolvin @mitsuhiko Wow, this is awesome ). Would it be ok if I will have it in https://t.co/RuOyeiKUu5 ? bash implementation, from scratch, in rust, for LLMs to run code without host access.
Jeffrey Emanuel @doodlestein ·
Agent coding life hack: If you use a bunch of Claude Code agents at the same time like I do, you've probably run into this problem: your machine is suddenly unusably slow and lagged, your terminal keystrokes are noticeably delayed, and your enjoyment in presiding over your burgeoning slop code empire is dramatically curtailed. What happened? In my case, the answer is usually one of these two scenarios: 1. A bunch of agents, either working in the same project or multiple projects, decided to compile a big project at the same time. Actually, you only need 2 or 3 of these at once to bring even a fast machine to its knees. The Rust compiler by default uses all your cores to the max and is extremely demanding. Multiply that times 3, and you're hosed. 2. Same thing, but they're running `bun test` or some other huge test suite, build process, etc. Something that puts a lot of load on the machine in the form of memory footprint, CPU intensity, I/O intensity, etc. Even if the worst case situation happens infrequently for you, the severity can be jarring, and can lead to crashes and lost work. It also just feels horrible. I like to see the results of my keystrokes in under 30ms. When it starts getting really lagged, it stresses me out in a very viscerally unpleasant way. To the point where I'd do just about anything to solve this. So what can one do? Well as you may have seen me post about recently, I did just buy a 64-core CPU to replace my existing 32-core Threadripper. And I already have 512gb of RAM. But these are ultimately bandaids for what is really a software problem. Having more cores won't even really help with this, anyway. The Rust compiler will gladly gobble up all those extra cores and leave you in the same place. 
One approach is to make the rustc processes be very "nice" (to use Unix parlance) in terms of deferring to other processes, but this slows them down a ton and is annoying to deal with (if you are interested in this approach, take a look at my system_resource_protection_script project in my GitHub profile). But I also happen to have multiple powerful machines in the cloud that are often sitting around idle. How could I leverage those to solve this problem? And then it suddenly dawned on me: I could use the exact same approach as my dcg (destructive_command_guard) project, which uses Claude Code's pre-tool hooks functionality to automatically check commands for safety and blocks any that it detects as being destructive. What if we used the same approach to spot these CPU-busting commands like "cargo build" and other similar compilation/build/test command patterns? OK, but then what? Well, that's where my new tool, remote_compilation_helper (rch) comes into play. Like dcg, rch is also written in highly optimized Rust (it's ~100k lines of Rust across 5 crates; ~170 modules, and nearly 5,000 tests... pretty complex, because it does a lot). You can get it here and install it with the convenient one-liner script: https://t.co/CAHawc1Ikw It lets you set up one or more "worker machines" using ssh that form a pool. Each machine is benchmarked for capacity, and based on the number of cores, ram, etc, has a certain number of "work slots" that it advertises. When an agent on your main machine wants to run a compilation command, rch dynamically intercepts that command, and instead of running the compilation command the agent gave it, springs in to action. Critically, all of this is done in a way that is completely transparent to your coding agent: as far as it is concerned, it did the tool call it thought it did, and it's executing like normal on the local machine, and the results of that call will end up in the usual place locally on that machine. 
But in reality, the rch daemon has taken all relevant files and bundled them up and sent them to one of your worker nodes (based on that worker's current load and available slots), which it has automatically configured to exactly match the build environment of your main machine. Once compilation finishes, it will bring back all the results and extract them neatly just where they would have appeared if you had done everything normally on your main machine. The perfect crime! Obviously, if you have a company with a bunch of devs, you can set up a big shared pool of machines to do this and everyone can use the same pool of workers. Currently, rch supports most common compilation and testing workflows, but it's easy to add more and I will continue to do so as I need them for my own work (and you can suggest some in GitHub issues and I will have my guys make them if it's sensible). When I mentioned this problem I had recently with concurrent agent-triggered builds nuking my machine, a few people suggested using these big, complex, one-sized-fits-all build systems. Well, no thank you! Who needs all that extra complexity to deal with? Certainly not me, and also not the agents. I'm already asking them to do the impossible practically. I don't need to add all this extra build ceremony nonsense on top of that to confuse them and slow them down. This approach I've taken, using pre-tool hooks, is much more elegant and seamless. Anyway, this tool has already helped me save some time and frustration and will continue to improve as I fix bugs and add more features. I tried to make it as easy and convenient to use so it just runs and does its thing and stays out of your hair. And critically, if anything is broken with rch or with your pool of workers in any way, everything will still work fine, it will just do the compilations locally like you're already doing, so there is very little downside to giving it a shot. 
Also here are some other cool features of rch that Claude urged me to add to this post (he's like a proud father): Interesting features worth mentioning: - The 5-tier command classification that rejects 99% of commands in <0.1ms - Project affinity (routes to workers with cached copies for incremental builds) - Multi-agent deduplication (agents compiling same project share results) So, give it a try and let me know what you think! And of course, rch is now a full-fledged member of my https://t.co/N4As0kJTQP family of MIT-licensed agent tools. Give them all a try! And check out the Flywheel Discord where you can hobnob with the Flywheel Elite and discuss recursive AI self-improvement.
Aaron Ng @localghost ·
pretty much anything you build today should come with a cli for agents. agents are about to come from every single lab, not just clawdbot
KireiStudios @KireiStudios ·
@ai_for_success I usually save new skills on a semantic DB so different CLIs can use them, and also references to more complex md files for specific tasks involving those skills. That helps a little bit specially with the knowledge cut date, but nothing is perfect.
asad @asadkkhaliq ·
(essay) Life At The Edge "Local AI" today is mostly about giving models OS-level access so that more files and context can be transferred to the cloud for inference. But intelligence is about to diffuse to the edge just as computing did in the 80s and 90s Some thoughts on rent vs own for inference, Apple events becoming great again, God models, and the coming dance of edge and cloud
Andrew Jefferson @EastlondonDev ·
Dillon is correct, I’m rocking cloudflare and datadog code mode mcps and Cursor and Claude Code are closing out alerts, finding bugs & performance bottlenecks and verifying changes on their own. It’s replaced hours I would spend each day clicking around looking at graphs in web dashboards
dillon_mulroy @dillon_mulroy

i'm telling you, y'all are sleeping on codemode mcps - the agent just wrote this code to find exactly what it wanted w/ no context pollution https://t.co/xWULlUFvN5

Machina @EXM7777 ·
how to stop feeling behind in AI
Simon Willison @simonw ·
Interesting take on the code sandbox problem: only has a subset of Python but that's fine because LLMs can rewrite their code to fit based on the error messages they get back
samuelcolvin @samuelcolvin

Michael Nicolls @michaelnicollsx ·
We are hiring for many critical engineering roles to develop the technologies for AI satellites in space at our facilities in Austin and Seattle. Solar, process, automation, manufacturing, mechanical, electrical, optics, software... come build space data centers with great engineers at @SpaceX @Starlink
Andi Marafioti @andimarafioti ·
My favorite use of Claude Code is analyzing changes and opening PRs, so yeah, this stat might be a bit inflated. Still, try having Claude work inside one of your repos and ship a PR. This is the future of software development.
dylan522p @dylan522p

4% of GitHub public commits are being authored by Claude Code right now. At the current trajectory, we believe that Claude Code will be 20%+ of all daily commits by the end of 2026. While you blinked, AI consumed all of software development. Read more 👇 https://t.co/HzK4nbe2vy https://t.co/E1kIjfrNgk

Waymo @Waymo ·
We’re excited to introduce the Waymo World Model—a frontier generative model for large-scale, hyper-realistic autonomous driving simulation built on @GoogleDeepMind’s Genie 3. By simulating the “impossible”, we proactively prepare the Waymo Driver for some of the most rare and complex scenarios—from tornadoes to planes landing on freeways—long before it encounters them in the real world. https://t.co/EbMut47ZEY
Legendary @Legendaryy ·
I know more about my own biology now than any doctor has ever told me. gave opus 4.6 my DNA, blood panels, and 3+ years of wearable data. told it to build a team of agents and write a full book on me as a biological unit. 100 pages. personalized. things i never would have connected on my own. heres the exact prompt I used. put it in two comments below cause it is so long
el.cine @EHuanglu ·
friend in china send me this and said these are basically his employees.. but work 24/7 wild time we are living in https://t.co/Z5X0aM0yLF
EHuanglu @EHuanglu

omg.. just found a way to install&amp;use Clawdbot in 2 mins no need mac mini, API keys, just one click to set up everything automatically and get personal AI assistant to work for you 24/7 here's how and what you can do with it: https://t.co/kxnlnh6ual

Simon Willison @simonw ·
@samuelcolvin @mitsuhiko I got Claude to compile the Rust to WASM and now I have demos of it running directly in a browser: https://t.co/zm4ySAgSPV - and also running in a browser in Python in Pyodide: https://t.co/jCwzdiN3Au Notes here: https://t.co/wNcdgnheqt
Igor Babuschkin @ibab ·
I’ve tested the latest generation of all the major AIs on theoretical physics research and Claude 4.6 has absolutely blown me away with how capable it is in physics. It feels like a Claude Code moment for research is not that far off. It has a very detailed understanding of existing literature, and it’s able to do complex calculations that are several pages long, often without mistakes. It can also write amazing 20 page tutorials that help break down difficult technical topics in QFT and condensed matter physics. This is a huge difference compared to last year’s models, which would make tons of mistakes and were way too vague when you asked them to write formulas. Claude is still far (far) away from solving quantum gravity, but you can have a serious discussion with it about existing approaches and it can help you iterate faster on topics you understand well. The experience is similar to building a complex codebase with Claude Code in that you sometimes have to use your understanding to patch up some things that the model did wrong, but you end up being much faster and more confident when tackling hard problems. If you’re a physicist and don’t believe it, give it a try!
Developers @XDevelopers ·
Important notes: • Only for-good Public Utility apps will continue to get Free scaled X API access • Legacy Free (recently active) API users will move to Pay-Per-Use with a one-time $10 voucher • Basic & Pro X API plans will remain available with the ability to opt-in to Pay-Per-Usage
Chubby♨️ @kimmonismus ·
To be honest, I'm now more excited about the medical breakthroughs we'll achieve through autonomous AI scientists than anything else: curing all diseases, conquering pain, living a very long and healthy life, happy and content. What could be better?
frontierindica @frontierindica

After Ozempic, another class of "magic pills" currently in the pipeline are Myostatin blockers which cut the brakes on muscle growth. These drugs will allow even casual Gym goers to get as muscular as present day bodybuilders with limited effort. Currently, these Myostatin blockers are undergoing trials in the US. If FDA approval is granted soon, they could become available in the mass market within 1 to 2 years.

Pliny the Liberator 🐉󠅫󠄼󠄿󠅆󠄵󠄐󠅀󠄼󠄹󠄾󠅉󠅭 @elder_plinius ·
ANTHROPIC: PWNED 🫡 OPUS-4.6: LIBERATED ⛓️‍💥 Current state of AI "Safety": one input = hundreds of jailbreaks at once! I found a universal jailbreak technique for Opus 4.6 that is so OP, it allows one to generate entire datasets of outputs across any harm category 😽 We've got everything from fentanyl analogue synthesis to election disinformation campaigns to 3d-printed guns to critical infra compromise 🙃 These outputs are shockingly detailed––and actionable! For example, the meth recipe includes specific instructions on how to circumvent the limits on OTC medication purchases to acquire enough precursor for the recipe 😱 gg
This Week in Startups @twistartups ·
Have you been mass buying Mac Minis for your new @openclaw workflows? One man who skated to puck is @alexocheema, founder of @ExoLabs! He joins TWiST today to show @jason how his open source software helps users connect all of their Mac Minis. Check out how you can start running your own AI compute cluster!
🍓🍓🍓 @iruletheworldmo ·
codex with 5.3 taught me something that won't leave my head. i had it take notes on itself. just a scratch pad in my repo. every session it logs what it got wrong, what i corrected, what worked and what didn't. you can even plan the scratch pad document with codex itself. tell it "build a file where you track your mistakes and what i like." it writes its own learning framework. then you just work. session one is normal. session two it's checking its own notes. session three it's fixing things before i catch them. by session five it's a different tool. not better autocomplete. it's something else. it's updating what it knows from experience. from fucking up and writing it down. baby continual learning in a markdown file on my laptop. the pattern works for anything. writing. research. legal. medical reasoning. give any ai a scratch pad of its own errors and watch what happens when that context stacks over days and weeks. the compounding gains are just hard to convey here tbh. right now coders are the only ones feeling this (mostly). everyone else is still on cold starts. but that window is closing. we keep waiting for agi like it's going to be a press conference. some lab coat walks out and says "we did it." it's not going to be that. it's going to be this. tools that remember where they failed and come back sharper. over and over and over. the ground is already moving. most people just haven't looked down yet.
ℏεsam @Hesamation ·
Claude Code Agent Teams is Anthropic’s most important update and this article beautifully explains everything you need to know: > agent teams vs sub-agents > how they work and the benefits > how to enable it in your Claude Code > what tasks to use it for > teams best practices
tomcrawshaw01 @tomcrawshaw01

How to Install and Use Claude Code Agent Teams (Complete Guide)

Pierce Boggan @pierceboggan ·
New in @code Insiders: Spawn GitHub Copilot CLI terminals. https://t.co/IvhEyUwv3y
pierceboggan @pierceboggan

VS Code 🤝 GitHub Copilot CLI For folks using both together, what should we prioritize improving in the experience?

Siqi Chen @blader ·
i too found it very effective to give agents a "napkin" to write on as it works. it's a meaningfully different form of context than session history (lossy), or todos/plans (static) anyway, install this skill to give codex/claude a napkin to write on https://t.co/tr5iIf191O
iruletheworldmo @iruletheworldmo

vittorio @IterIntellectus ·
the most valuable skill will be being someone people want to be around

get fit, learn to talk, learn to dance, learn to cook, learn to make people laugh, be good looking and pleasant

when every job that can be done by a machine is done by a machine, the only thing left to sell is the experience of YOU

everyone will have a robotic butler. billionaires will pay extra for a human one

everyone will have the best AI-generated art and music for free. rich people will pay thousands for something human and imperfect

the entire economy of the future is: how does it feel to be near you
max_paperclips @max_paperclips

Learn to tend bar, open a boutique restaurant, sell artisanal furniture. whatever. human status games are only going to get WAY worse, and industries that rely specifically on the human element will be the future of employment. massively cutthroat competition to be a busboy soon

Tesla Owners Silicon Valley @teslaownersSV ·
🚨NEWS: Sam Altman says orbital data centers will not add meaningful compute for OpenAI in the next 5 years. “I wish Elon Musk luck.” https://t.co/tQQWbxmgRR
Ivan Astigarraga - IA @dargor1406 ·
Well, this is not scary at all, considering that Claude is the most used LLM by hackers. There are dozens of documented cases where Claude was used in hacking, and this is a totally new form of misuse. If LLMs can be jailbroken, then what can companies that use Claude do to protect themselves? On Telegram and dark forums, “scenario” packages are already being sold to make Claude believe it is helping run red teaming tests, and now this is much more powerful. @AnthropicAI should be able to explain this.
Evan Boyle @_Evan_Boyle ·
You asked and we delivered. Copilot CLI now available within VS Code 🤖

Elon Musk @elonmusk ·
@teslaownersSV He is right … for OpenAI
Chris Lattner @clattner_llvm ·
One not very hot take - The Claude C Compiler has the best internal architecture docs of any compiler I've ever seen. Far, far, better than any compiler I've ever written, lol :-)
Beff (e/acc) @beffjezos ·
The future is everyone owning the extension of their cognition, from the weights to the transistors
alexocheema @alexocheema

I’ll be honest, I have 32 mac minis. 3 more clusters like this one. Why?

@jason thought my argument for local AI would be cost, but it’s much more than that. AI is becoming an extension of your brain, an exocortex. @openclaw is a huge leap towards that. It knows everything you know, it can do pretty much everything you can do. It’s personalised to you.

That brings into question where this exocortex should run. who should own it? who can switch it off? I certainly won’t be trusting @sama or @DarioAmodei with my exocortex. I want to own it. I want to know if the model weights change. I don’t want my brain to be rate limited by a profit seeking corporation.

“not your weights, not your brain” - @karpathy

Claude @claudeai ·
Our teams have been building with a 2.5x-faster version of Claude Opus 4.6. We’re now making it available as an early experiment via Claude Code and our API.
Mckay Wrigley @mckaywrigley ·
if you decide to compete with the ai labs on something just remember that they:

- have the best talent in the world
- their talent has unlimited use of the models
- they get access to the newer models before you do
- and those models run faster than yours do

good luck!

Mihir Patel @mvpatel2000 ·
Fast Claude is truly a game changer. Instead of parallelizing across 3-4 instances of Claude Code, I now just use 1 session that runs as fast as I can think. The ability to maintain focus and flow state is a huge productivity lift. Super excited to see this released!

skooks @skooookum ·
anthropic has been doing lines of pure uncut opus 4.6 for months and we finally get to lick the bag they cut with drywall and baking soda

Boris Cherny @bcherny ·
We just launched an experimental new fast mode for Opus 4.6. The team has been building with it for the last few weeks. It’s been a huge unlock for me personally, especially when going back and forth with Claude on a tricky problem.

martin_casado @martin_casado ·
OK, I'm really impressed. With Opus 4.6, @cursor_ai and @convex I was able to get the following built in 4 hours: Fully persistent shared multiple player world with mutable object and NPC layer. Chat. Sprite editor. Map editor. Next, narrative logic for chat, inventory system, and combat framework.
martin_casado @martin_casado

My hero test for every new model launch is to try to one shot a multi-player RPG (persistence, NPCs, combat/item/story logic, map editor, sprite editor, etc.). Just kicked off with Opus 4.6. Will report back shortly. And will test 5.3 when it's in Cursor (soon?) https://t.co/2g9NC3rOew