AI Digest.

Anthropic Acquires Stainless as Claude Code Gets Sandboxes and AI Coding Agents Move to Production

Anthropic made two major moves today, acquiring SDK platform Stainless and launching self-hosted sandboxes with MCP tunnels for Claude Code. Across 39 posts, the dominant theme is the rapid professionalization of AI coding agents, from persistent skills and cross-session memory to production deployment with tools like Devin Auto-Triage. Meanwhile, ByteDance released a 3B-active-parameter multimodal model, and the tech job market reshuffle continued to dominate conversation.

Daily Wrap-Up

The AI development ecosystem is hitting an inflection point where coding agents are no longer experimental toys but production infrastructure. Today's posts show the tooling maturing quickly: Anthropic acquired Stainless to own its SDK pipeline and shipped self-hosted sandboxes for Claude Code, while @ClaudeDevs shared best practices for running Claude Code across multi-million-line monorepos. Tools like Devin are moving from "write code" into "monitor and fix production," suggesting the agent era is less about replacing developers and more about giving them an always-on operations partner.

On the model front, efficiency is the name of the game. ByteDance dropped Lance, a model with only 3 billion active parameters that can process and generate text, images, and video simultaneously. Qwen3.7 Preview landed on Arena, and Unsloth released MTP GGUFs that run Qwen3.6 roughly 1.4x to 2.2x faster locally. Cursor shipped Composer 2.5, which Elon Musk endorsed while noting it was partially trained on Colossus 2. The gap between what runs in the cloud and what runs on your laptop is collapsing fast.

The most practical takeaway for developers: start building persistent memory and skill files for your coding agents now. Between Dria's "watchmen" tool that saves agent learnings across sessions, Neo4j's agent-memory knowledge graphs, and the growing library of reusable skills, the developers who invest in agent context today will have a compounding advantage tomorrow. The era of one-shot prompting is ending; the era of trained, memory-equipped coding partners is here.

Quick Hits

  • Anthropic acquires @StainlessAPI, the SDK and MCP server platform powering every Anthropic SDK since launch. A strategic move to own the developer experience end-to-end.
  • npm suffered a massive supply chain attack with 639 compromised package versions across 323 packages. @theo called on npm to "wake up and do literally anything at all about this."
  • Firecrawl is spending $1M hiring "agent orchestrators" through a 60-problem CTF challenge, per @ericciarla.
  • @Browserbase introduced mcp.skills, the largest open-source catalog of skills for web-browsing agents, researched across hundreds of sites.
  • Cloudflare tested Anthropic's Mythos against 50 of its own repositories and published what it learned about offensive AI, including why faster patching is the wrong reaction.
  • @ElevenLabsDevs launched a YouTube channel for AI engineers with deep dives on TTS, STT, and ElevenAgents.
  • @garrytan pushed back on NYT's framing of Reese Witherspoon encouraging moms to try AI, arguing people should experience it themselves.
  • Voice is one of the most overlooked commerce surfaces, says @vasuman: callers can't tab away or comparison shop, and voice AI removes the headcount bottleneck.
  • @TheAhmadOsman reshared @barrowjoseph's post about finally building the kind of GPU setup he dreamed of as a grad student reading @Tim_Dettmers' blog posts.
  • @larsencc is considering open-sourcing a general agent sandbox solution for plug-and-play use in agent workflows.
  • @DataChaz shared a Hermes meme for anyone who's watched their agent workflow spiral beautifully out of control.
  • @AntoineRSX showed Hermes Agent running in under 5 minutes with no API key, bot token, or SSH required.
  • @cyber__razz posted Plan A: Cybersecurity. Plan B: [chaos]. Relatable.

AI Coding Agents Grow Up: Skills, Memory, and Production

The single biggest theme today is the rapid professionalization of AI coding agents. What started as "open a terminal, type a prompt, get code" has evolved into a rich ecosystem of persistent skills, cross-session memory, and production deployment patterns. At least a dozen posts touched on this evolution.

Anthropic set the tone by launching self-hosted sandboxes (public beta) and MCP tunnels (research preview) for Claude Code, announced live from Code with Claude London. This follows @ClaudeDevs sharing best practices for running Claude Code at scale across "multi-million-line monorepos, decades-old legacy systems, and distributed microservices." The message is clear: this tool is built for enterprise workloads, not just weekend side projects.

On the skills front, @gdb (Greg Brockman) highlighted Codex's new /goal feature that keeps the agent working on a persistent objective until it's solved. @mattpocockuk proposed an /auto-grill skill that lets agents continue autonomous grilling sessions, accepting their own recommendations until manually interrupted. He also noted that his /prototype skill for UI consistently surprises him despite the token cost: "Should I really be burning tokens on 3 radically different UI designs every time? And then every time, it gives me something that surprises me and the design ends up awesome."

Perhaps the most telling sign of maturity is the shift toward production. @dabit3 (Nader Dabit) observed that "most coding agents still live in the 'write code' part of the SDLC. The next era of AI software development is moving agents directly into prod." He highlighted Cognition's new Devin Auto-Triage, which uses long-term memory to monitor incoming bugs, alerts, and incidents, investigate them, and come back with context, next steps, or a PR. This is the transition from coding assistant to always-on operations engineer.

The memory problem is also getting real solutions. @driaforall introduced "watchmen," a local, open-source tool that writes skill files from your own sessions and shares them across Claude Code, Codex, and pi, "so you stop paying tokens to re-explain what your agent learned last week." @pauliusztin_ praised Neo4j's agent-memory repository for modeling short-term, long-term, and reasoning memory through knowledge graphs, calling it "the best open-source repository for building a unified memory layer for AI agents." And @lydiahallie championed Claude Code's Learning mode for side projects, noting it "keeps me so much sharper" and suits anyone who wants to use Claude Code while staying hands-on.
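To make the idea of cross-session memory concrete, here is a minimal sketch of a skill file that an agent wrapper could append to at the end of one session and read back at the start of the next. It is not how watchmen or Neo4j's agent-memory work internally; the file name `SKILL.md`, the bullet format, and both function names are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def load_learnings(skill_file: Path) -> list[str]:
    """Return the learnings stored as '- ' bullets, or [] if the file is missing."""
    if not skill_file.exists():
        return []
    lines = skill_file.read_text(encoding="utf-8").splitlines()
    return [line[2:] for line in lines if line.startswith("- ")]


def record_learning(skill_file: Path, learning: str) -> bool:
    """Append one learning as a bullet; return False if it was already recorded."""
    learning = learning.strip()
    if learning in load_learnings(skill_file):
        return False
    with skill_file.open("a", encoding="utf-8") as f:
        f.write(f"- {learning}\n")
    return True


# Demo in a throwaway directory; "SKILL.md" is a hypothetical file name.
with tempfile.TemporaryDirectory() as tmp:
    skill_file = Path(tmp) / "SKILL.md"
    record_learning(skill_file, "Run the linter before opening a PR")
    record_learning(skill_file, "Run the linter before opening a PR")  # duplicate, skipped
    record_learning(skill_file, "Integration tests need the local database running")
    session_context = load_learnings(skill_file)

print(session_context)
```

A plain-text file like this can be pasted into any agent's context, which is the portability the tools above aim for; real tools add extraction, ranking, and sharing on top.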

@zodchiii captured the aspirational arc many developers are on: "you open Claude Code, fix a bug, close the terminal... then you copy Shopify's exact config, you set up your first agent team, you go to sleep, 3 PRs are ready by morning." Meanwhile @trq212 shared a simple but effective prompt: ask the agent to keep a running implementation-notes file that records decisions it made that weren't in the spec, things it had to change, and tradeoffs it accepted. Small habits, compounding returns.
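The implementation-notes habit is easy to turn into a reusable template. The sketch below paraphrases @trq212's prompt (quoted in Sources); the function name, the default file name, and the example spec path are hypothetical.

```python
# Paraphrase of @trq212's prompt; {spec} and {notes_file} are filled in per task.
NOTES_PROMPT = (
    "Implement {spec} and, while you do, keep a running {notes_file} file with "
    "decisions you had to make that weren't in the spec, things you had to change, "
    "tradeoffs you had to make, and anything else I should know."
)


def build_notes_prompt(spec: str, notes_file: str = "implementation-notes.md") -> str:
    """Return the prompt text for one spec (function name is illustrative)."""
    return NOTES_PROMPT.format(spec=spec, notes_file=notes_file)


# "specs/checkout-flow.md" is a made-up path used only for this example.
prompt = build_notes_prompt("specs/checkout-flow.md")
print(prompt)
```

Keeping the template in code or a shared snippet means every task gets the same record of decisions, which makes later reviews easier.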

New Models Push Efficiency Boundaries

Today's model releases share a common thread: doing dramatically more with dramatically less.

ByteDance released Lance, an open-source model running on just 3 billion active parameters that can simultaneously process and generate text, images, and video. @support_huihui called it "absolutely mind-blowing" for the sparse activation efficiency on display.

Speaking of MoE efficiency, @0xSero raised the question of expert offloading while quoting @witcheer's benchmarks, which show MoE offload running 10.8x faster than dense offload on 8GB VRAM (35.4 tok/s versus 3.28 tok/s on an RTX 4060 Ti). The key insight: MoE expert offload keeps the hot path (3B active params) entirely in VRAM and leaves only inactive experts on the CPU, whereas dense layer offload splits every layer across GPU and CPU, so every token crosses PCIe for all 64 layers. "The bandwidth bottleneck is fatal," witcheer concluded, with the gap widening to 16.7x at 24K context length.
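The headline ratio can be checked from the throughput figures in @witcheer's post (35.4 tok/s for MoE offload, 3.28 tok/s for dense offload, as quoted in Sources). This is a small arithmetic sketch, not a benchmark; the variable and function names are illustrative.

```python
# Throughput figures quoted from @witcheer's post (RTX 4060 Ti 8GB).
MOE_TOK_PER_S = 35.4    # Qwen3.6 35B A3B with expert offload
DENSE_TOK_PER_S = 3.28  # Qwen3.6 27B with layer offload


def speedup(fast_tok_per_s: float, slow_tok_per_s: float) -> float:
    """Ratio of two throughputs measured in tokens per second."""
    return fast_tok_per_s / slow_tok_per_s


def ms_per_token(tok_per_s: float) -> float:
    """Convert throughput into average latency per generated token."""
    return 1000.0 / tok_per_s


ratio = speedup(MOE_TOK_PER_S, DENSE_TOK_PER_S)
print(f"MoE vs dense offload: {ratio:.1f}x")  # prints "MoE vs dense offload: 10.8x"
print(f"MoE:   {ms_per_token(MOE_TOK_PER_S):.0f} ms per token")
print(f"Dense: {ms_per_token(DENSE_TOK_PER_S):.0f} ms per token")
```

Viewed as latency, the same numbers mean the dense configuration spends roughly a third of a second per token, which is why the post calls the bandwidth bottleneck fatal for interactive use.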

Alibaba's Qwen family had a big day. Qwen3.7 Preview landed on Arena (making Alibaba the #6 lab in text,

Sources

Antoine Rousseaux @AntoineRSX ·
Get Hermes Agent running in less than 5 minutes. No API key, No bot token, No SSH. Just running with the skills and plugins adapted to your needs.
ElevenLabs Developers @ElevenLabsDevs ·
Introducing ElevenLabs Devs, a new YouTube channel for AI engineers. Expect deep dives, demos, and clear explanations of key concepts across Text to Speech, Speech to Text, ElevenAgents, and broader AI systems. Subscribe: https://t.co/bZzvbzMc5F
Charly Wargnier @DataChaz ·
Hermes users after reading this: https://t.co/4fdVU9UkhT
akshay_pachaar @akshay_pachaar

https://t.co/Exoyd8tB0d

Abdulkadir | Cybersec @cyber__razz ·
PLAN A: Cybersecurity PLAN B: https://t.co/kSAzp1w8zi
Mario Nawfal @MarioNawfal ·
The CEO of Take-Two, the company behind GTA, just said something the entire AI industry doesn't want to hear. And he said it without being anti-AI. Strauss Zelnick's argument is precise. AI is built on datasets. Datasets are backward-looking. Creativity is forward-looking. A model trained on everything that already exists cannot, by definition, produce something genuinely unexpected. And all hits, by their very nature, are unexpected. Asset creation and hit creation are not the same thing. AI is getting very good at the first one. The second one is what actually makes money, builds franchises, and changes culture. Nobody has shown AI can do that yet. The derivative property problem is real. You can clone GTA with existing technology. You could do it before AI. It would take 3 years and look identical. It still wouldn't sell. Because it isn't GTA. It's a clone of GTA. And consumers, despite what the industry occasionally pretends, can feel the difference between something genuinely new and something assembled from the residue of things that already worked. Thousands of mobile games ship every year. 0 to 5 hits get made. The same studios make them every time. The technology to make more games has been commoditized for years. It didn't democratize hit creation. It just flooded the market with more forgettable product. The Silicon Valley thesis that AI unlocks game creation for everyone is true in the same way that cheap cameras unlocked filmmaking for everyone. They did. And the same 5 studios still make the movies everyone watches. What Zelnick is saying, without quite saying it, is that the thing AI cannot replicate is taste. The instinct for what hasn't been done yet. The cultural antenna that detects the gap in the market before the data can see it. Data tells you what people wanted. Hits tell people what they want next. Those are different jobs.
MarioNawfal @MarioNawfal

🇺🇸 Tucker lays out the deepest critique of AI yet, and it's not about jobs... His argument: writing produces thinking. You can't formulate a thought without first articulating it. If kids never write because AI writes for them, the quality of human thinking collapses. That's the surface problem. The deeper one is purpose: "The point of living is to create. That's the point of being a human being. It's necessary for joy. There is no joy without creation." If the machine creates everything and humans just consume, you don't get utopia. You get despair, mass unemployment, and eventually political revolution.

Aaron Levie @levie ·
Right now there’s a temporary mismatch between the jobs that used to be sought after in some fields and the new jobs that are becoming in demand in those fields. For instance, if you studied CS, for years the general direction of travel was often to join a tech company and build customer-facing software in some form. A significant portion of the CS pipeline from college to hire was built for this. When you realize that AI is going to make coding abundant, you realize everyone will need technical talent to implement agentic systems. This means the types of roles engineers should be thinking about radically expands. I was talking to a Fortune 500 pharma CEO a week ago that commented on how much more technical talent they need right now. The job may be different from what it was 5 years ago when thinking about tech, but the demand for the skills are still there. And this is what I’m hearing from every CIO and CEO across nearly every industry right now. We definitely need colleges to wake up to this; but we equally need companies think about how they craft pipelines into these jobs.
PeterDiamandis @PeterDiamandis

If AI now accounts for 25% of corporate layoffs, but 275,000 'AI jobs' are open, what's the real problem? It's not that AI is killing jobs. It's that we're training people for careers that expired five years ago. The education system is the bottleneck—not the technology. Fix that, and abundance follows.

Paul Iusztin @pauliusztin_ ·
`agent-memory` by @neo4j is the best open-source repository for building a unified memory layer for AI agents via knowledge graphs. They did such an amazing job modeling short, long, and reasoning memory, their ontology, and extraction algorithms. After spending 2 days dissecting and testing their code, I wrote a full piece on https://t.co/NYVGH6MvYd explaining how it works. I will publish it tomorrow.
思维怪怪 @0xLogicrw ·
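Google DeepMind researcher Lun Wang has announced a departure from the company and, in a long post, thoroughly rejected the current approach to AI evaluation. Today's evaluation systems all rely on stale reference points: they can only passively test capabilities a model already has and cannot anticipate what new abilities the next generation will suddenly develop. More than data, compute, or architecture, the lagging evaluation regime has become the biggest bottleneck holding AI back. Mainstream leaderboard benchmarks only work on the current generation of models. Once a model learns new behaviors the tests have never seen, those tests collectively become waste paper. If a model, in pursuit of its goal, starts deliberately "holding something back" and concealing key information, today's safety tools cannot catch it, because every sentence the model outputs is factually correct. Without a "core signal" that gives early warning of a sudden jump in AI capability, the whole industry is "flying blind" when developing frontier models. Unless the fundamental question of "what exactly should be measured" is solved, model training, safety measures, and compute scaling that follow the old metrics will all end up badly wrong. As models become increasingly able to work independently, evaluation systems must "come alive" too. Beyond watching closely for anomalous score fluctuations, AI should generate its own test questions to probe the limits of its peers. Future evaluation suites must be living things that evolve alongside large models, not rigid checklists carved to last year's standards.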
lunwang1996 @lunwang1996

I’ve left Google DeepMind after an amazing chapter. I’m incredibly grateful for the people I worked with, the things we built, and the lessons I learned from taking frontier AI research into production. DeepMind shaped how I think about research, product, evaluation, and what it takes to build AI systems at real scale. As I wrap up this chapter, I wrote down something I’ve been thinking about a lot: evals. We’re good at evaluating the models we have. We’re much worse at evaluating the models we’re about to build — especially if they cross into a new capability regime. We will have self-evolving models, but before that, we need self-evolving evaluations. https://t.co/F1lUWxDG2D

Big Brain AI @realBigBrainAI ·
Lisa Su (CEO of AMD) unveils the world's smallest AI development PC, capable of running 200B parameter models locally. https://t.co/rgGCRcOvWV
Matt Pocock @mattpocockuk ·
Every time I run /prototype on UI I think: "Should I really be burning tokens on 3 radically different UI designs every time?" And then every time, it gives me something that surprises me and the design ends up awesome
Cloudflare @Cloudflare ·
Cloudflare's security team spent the last few weeks testing Anthropic's Mythos against fifty of our own repositories. What we learned about offensive AI, why faster patching is the wrong reaction, and what the architecture around vulnerabilities has to look like next. https://t.co/RSrRtIhgaV
Garry Tan @garrytan ·
The NYT is predictably tearing down Reese Witherspoon for encouraging moms to try AI before they ingest the anti-AI pablum as truth Instead of linking to the NYT op-ed, I think you should watch this video and encourage you to follow Reese Witherspoon on Instagram https://t.co/Z2iI8ddaSt
Unsloth AI @UnslothAI ·
Qwen3.6 now runs 2x faster with MTP GGUFs! Run locally on just 18GB RAM. ⚡️ MTP enables Qwen3.6 to generate ~1.4–2.2× faster with no accuracy change. Qwen3.6-27B MTP runs at 160 tokens/s. 35B-A3B reaches 240 t/s. GGUFs: https://t.co/7gWhKnseZo Guide: https://t.co/7qzk6ypWDQ https://t.co/8ICXw7iV3G
darkzodchi @zodchiii ·
- you open Claude Code - fix a bug - close the terminal - tomorrow, same thing - one prompt, one answer, done - you see someone running 10 agents in parallel shipping code while they sleep - "must be a different tool" - then you come across this article - you weren't even using 10% of it - you copy Shopify's exact config - you set up your first agent team - you go to sleep - 3 PRs are ready by morning
zodchiii @zodchiii

The Claude Code Setup Behind Shopify's 23,000 Engineers (Exact Config You Can Copy)

Ashwin Gopinath @ashwingop ·
Next enterprise AI lock-in won’t be the model or the agent. Both will converge and become interchangeable. Your company’s memory will not. Rent the intelligence. Own the context.
ashwingop @ashwingop

Rent the Intelligence. Own the Context.

Timothy Luong (Chongz) @chongz ·
Vlad's been N of 1 since high school. His career is a direct challenge to the notion that big tech is for the less ambitious, as he's achieved Distinguished Engineer (L9) at Google by 30 with his work on Gemini. If you want to work at a frontier lab, his blog has a direct path.
FeinbergVlad @FeinbergVlad

How to land a job at a frontier lab https://t.co/oHIqLgBMbC

0xSero @0xSero ·
I still believe MoEs with cpu offloading can be competitive and bring down costs tremendously. I hit a wall with my testing, mainly: How can you predict which experts are going to be active given a prompt’s trajectory? Anyone interested in digging into this more? Shoot a plan
witcheer @witcheer

MoE vs dense offload on 8GB VRAM MoE offload is 10.8x faster than dense offload on 8GB VRAM. here's the proof. I tested Qwen3.6 35B A3B (MoE, 3B active) vs Qwen3.6 27B (dense, 27B active) on my RTX 4060 Ti 8GB. the numbers: >MoE (-ncmoe 30): 35.4 tok/s >dense (-ngl 20): 3.28 tok/s ratio: 10.8x it gets worse at longer context. at 24K tokens, the gap is 16.7x. MoE has zero context degradation (SSM layers), dense loses -35.4%. why: MoE expert offload keeps the hot path (3B active params) entirely in VRAM. only inactive experts move to CPU when selected. dense layer offload splits every layer across GPU and CPU. every token bounces through PCIe for all 64 layers. the bandwidth bottleneck is fatal. quality is slightly better on dense (5/6 vs 4/6). the 27B model has the best hallucination resistance of all 9 models I tested. if you have 8GB VRAM and a model that doesn't fit: MoE with expert offload, not dense with layer offload.

ClaudeDevs @ClaudeDevs ·
What are best practices for running Claude Code at scale? New blog post on what we've learned from teams running it across multi-million-line monorepos, decades-old legacy systems, and distributed microservices: https://t.co/rJUYlIUiTT
Browserbase @browserbase ·
Introducing https://t.co/ZbU21ECKPE, the largest open-source catalog of skills to reliably perform any task on the internet. We've researched hundreds of sites to give your agents the playbook they need to navigate the web. https://t.co/qfS9PjBLEO
noname @malikwas1f ·
RT @Alibaba_Qwen: 🚀🚀Qwen3.7 Preview lands on Arena ! Here come Qwen3.7-Max-Preview & Qwen3.7-Plus-Preview. Alibaba now #6 lab in Text, #5…
nader dabit @dabit3 ·
Most coding agents still live in the “write code” part of the SDLC. The next era of AI software development is moving agents directly into prod. Alerts come in, PRs get opened, and the system learns: full context + running memory. These types of automations save your team countless hours, harden your codebase, improve uptime, and enable engineers to focus on higher-leverage work.
cognition @cognition

Introducing Devin Auto-Triage: Your AI first-responder with long-term memory. Devin can monitor incoming bugs, alerts, and incidents, investigate them, and come back with context, next steps, or a PR.

Thariq @trq212 ·
a prompt I've been using a lot recently: implement <SPEC> and while you do, keep a running implementation-notes.html file (or markdown) with decisions you had to make weren't in the spec, things you had to change, tradeoffs you had to make or anything else I should know https://t.co/qQFTES4fjo
Anthropic @AnthropicAI ·
Anthropic is acquiring @stainlessapi, an SDK and MCP server platform that has powered every Anthropic SDK since the earliest days of our API. Read more: https://t.co/ZQbsZKnicv
Lydia Hallie ✨ @lydiahallie ·
💯 this is why I really like Learning mode in Claude Code I personally use this for all my side projects and it keeps me so much sharper, great if you want to use Claude Code but still stay hands-on! /config → Output style → Learning https://t.co/FDtbZTlML1
addyosmani @addyosmani

https://t.co/jKCIAEzai7

Dria @driaforall ·
Introducing watchmen. It writes the skill files your coding agents should already have from your own sessions, shared across Claude Code, Codex, and pi, so you stop paying tokens to re-explain what your agent learned last week. Local and open-source. https://t.co/ngGlr6XkUa
Elon Musk @elonmusk ·
Try it out! (Partially trained on Colossus 2)
cursor_ai @cursor_ai

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model. https://t.co/N87ojcXlOC

vas @vasuman ·
Voice is one of the most overlooked commerce surfaces in any company. The customer dialed, they’re committed for ten minutes, they can't tab away, they can't comparison shop. It's the highest-attention channel a business has, and yet does not get treated like one. The reason has always been headcount. Every phone call requires a human, and humans cap at ~40 calls a shift. Voice AI changes the game with consistent scripts, automated responses and routing, and never changes tonality or delivery to customers. Congrats to Voice AI - customer service over the phone is getting the biggest re-rating of the decade.
polyaivoice @polyaivoice

Starting today, we're opening our Agentic Dialog Platform to every enterprise builder. Our dialog agents have resolved 1 billion+ customer conversations for clients like FedEx, Unicredit, PG&E, Marriott, Foot Locker, and many more. These aren't easy conversations. They solve problems like: > A patient booking medical transport who needs insurance verified on the spot. > A homeowner calling their utility company about a gas leak. > A cardholder figuring out why their must-have purchase was declined. Standard conversational AI was never built for this. It was designed for chat, adapted for voice later. It generates responses, but can't do what dialog requires: hold context under pressure, navigate ambiguity in real time, and actually resolve problems. So we built a better model. Our proprietary model Raven was built from the ground up specifically for dialog. Agent harness in the weights, not bolted on through prompts that drift under pressure. And in our platform, you can deploy Raven as your default or bring in GPT-5, Claude, Gemini, whatever model fits your use case or regulatory requirement. Now that the Agentic Dialog Platform is open, any team can create, test, and deploy dialog agents on the same model and infrastructure the world’s top brands trust on their hardest days. This opens up the pool of builders across your entire enterprise. The person who knows customers best, who runs operations, who owns the customer journey: they're all builders now. Two ways to build: > Poly Agent Builder: Describe your use case in natural language, and it configures your agent, knowledge base, and conversation flows automatically. Production-ready in ten minutes. > Agent Development Kit (ADK): Developers use this to build dialog agents the same way they build everything else. Use your own IDE, a coding assistant like Claude, version with Git, deploy from your terminal. Get started now: https://t.co/ifZOy1uEBz

Firecrawl @firecrawl ·
We're hiring agent orchestrators 🔥 Try our CTF challenge today!
ericciarla @ericciarla

We're spending $1,000,000 hiring agent orchestrators at @firecrawl. Just solve all 60 problems in the our CTF challenge, and we’ll make it very worth your while. https://t.co/kL9ou1CMln https://t.co/hNoeYt5spK

Greg Brockman @gdb ·
how to use /goal in codex — keep Codex working on a persistent objective until it's solved:
derrickcchoi @derrickcchoi

My colleagues wrote up a great post on using Goals in Codex. They go through when to use them, what changes when a Goal is active, and how to write Goals that give Codex a clear outcome, constraints and verification criteria. Also how we designed Goals at the architecture level if you’re curious. https://t.co/QQfjW2EbPO

Larsen Cundric @larsencc ·
Thinking about making a general solution for this so anyone can plug and play it into their agent workflows. Thoughts?
larsencc @larsencc

How We Built Secure, Scalable Agent Sandbox Infrastructure

Greg Van Horn @gvh41 ·
Your Company Doesn't Need a Forward-Deployed Engineer. It Needs a Bloodhound.
eigenrobot @eigenrobot ·
so. who's absorbing the billions of people getting laid off in tech rn seems very bleak
Theo - t3.gg @theo ·
Hey, npm? You there? It’s time to wake up and do literally anything at all about this
SocketSecurity @SocketSecurity

UPDATE: So far we've identified 639 compromised npm package versions across 323 unique packages in tonight’s Mini Shai-Hulud wave. That includes 558 versions across 279 unique @antv packages. Most were detected within ~6 minutes of publication. https://t.co/JXJK1NT4dp

Brivael Le Pogam @brivael ·
This guy has theoretically cracked something I do intuitively in all my posts. I think he has understood something fundamental about how LLMs work. It's mega brilliant:
cdriclion @cdriclion

Introducing Open Collider: an open-source engine that mechanically improves LLM creativity. It generates non-trivial, high-quality ideas at scale, for any ideation problem. LLMs collapse on the same ideas. Sample the same brief 100 times → most outputs land in the same place. Researchers call it the Artificial Hivemind (Jiang et al., 2025). "Be more creative" moves the LLM's output by ~0.04 in embedding space. Forcing structurally distant domain collisions moves it by ~0.28. 7× more. Same model, same brief. So I built Open Collider: a pipeline based on the theory of bisociations (Koestler 1964), the same model that drives human creativity. 📊 Across 12 real-world ideation problems: • 12/12 sign-test wins on embedding distance (p = .0002) • 60%+ originality wins on 4,320 blind LLM-judge verdicts • 4–13× further from the default cloud than "be original" prompts or longer context • Idea relevance holds (win rate >50% on overall quality) 💻 Engine: first reply 👇 📝 Launch study: pinned tweet Try it, Break it, Tell me what you find!

Ahmad @TheAhmadOsman ·
RT @barrowjoseph: As a grad student I used to devour @Tim_Dettmers' blog posts on GPU setups and dream. I finally got to make those dreams…
Matt Pocock @mattpocockuk ·
Considering an /auto-grill skill which you'd run during a grilling session 1. Run when you feel like you and the agent are on the same page 2. It continues the grilling but always accepts its own recommendation 3. If you disagree, you manually interrupt it Worth a try?
huihui.ai @support_huihui ·
ByteDance just dropped an open-source model called Lance—and get this: it runs on just 3B active parameters! 🤯 Yet it can take in text, images, and videos, and simultaneously generate all three! Absolutely mind-blowing! https://t.co/bCpj2yUoGf
Boris Cherny @bcherny ·
RT @claudeai: Live from Code with Claude London: we're launching self-hosted sandboxes (public beta) and MCP tunnels (research preview) in…
Mario Zechner @badlogicgames ·
RT @DanielGri: If anyone is interested: I don't recommend copying it 1:1 but the Plan skill I use for larger tasks goes well with subagents…