Claude Code Spawns a Harness Ecosystem as Karpathy Programs an AI Research Org
Daily Wrap-Up
The big story today isn't any single announcement. It's the emerging pattern of developers building their own orchestration layers on top of Claude Code rather than waiting for Anthropic to ship features. We're seeing headless setups, custom harnesses, Obsidian integrations, and plugin ecosystems all materialize in parallel. This is what adoption looks like when the underlying model is good enough that the bottleneck shifts from capability to workflow integration. The Claude Code CLI has become a substrate, not a product, and today's posts make that unmistakably clear.
Karpathy's thread on running eight agents as a research org was the most technically interesting content of the day. His honest assessment that agents generate bad experimental ideas even at maximum intelligence is worth internalizing. They'll implement anything you describe precisely, but they won't design a rigorous ablation study or catch confounders in their own results. His framing of "programming an organization" with prompts, skills, and processes as the new source code is exactly the direction the agent harness community is moving, and it's encouraging to see someone of his caliber validating the approach while being transparent about how far it has to go.
On the workforce side, the rhetoric keeps escalating. Reports of YC founders planning to eliminate all engineering roles below staff level sit alongside claims about Anthropic's CEO predicting 50% displacement of lawyers, consultants, and finance professionals. Whether these claims are exaggerated or directionally correct, the signal is consistent: the professional class is waking up to the possibility that AI doesn't just automate blue-collar work. The most practical takeaway for developers: if you're not already building with agent harnesses and multi-agent workflows, start now. Karpathy's thread is a blueprint for the kind of experimental setup you should be running, even if the agents aren't fully autonomous yet. The value is in learning to program organizations, not just software.
Quick Hits
- @sama announced OpenAI has raised a $110 billion round from Amazon, NVIDIA, and SoftBank. That's not a typo. The capital concentration in frontier AI is now at sovereign wealth fund scale.
- @UnslothAI updated Qwen3.5 with improved tool-calling and coding performance. Qwen3.5-35B-A3B now runs on 22GB RAM, with benchmarks across Claude Code and Codex.
- @theo flagged that the government is trying to force Anthropic to remove Claude's safety guards, calling it "probably very bad." Policy pressure on AI safety continues to escalate.
- @cryptopunk7213 shared a story about someone spinning up an AI agent to lowball sellers on Facebook Marketplace, scoring a Jeep Wrangler for $1,500 and free PS5s and TVs. Marketplace arbitrage agents are here.
- @pvncher launched RepoPrompt 2.0 as a fully integrated agent with built-in oracle and context builder, showcasing how much better agents perform with good context engineering tools.
- @jackfriks captured the vibe of the moment perfectly: "cracks knuckles 'claude, read this article and implement all of its advice' retires"
- @nicdunz offered a philosophical take: "prompting LLMs is, in a way, similar to using the search bar on the library of babel website." It's a surprisingly apt metaphor for the retrieval-from-latent-space nature of generation.
- @TheBronxViking retweeted @BillyM2k's "how to run a company in 2026," which at this point probably involves fewer humans than a 2016 startup's founding team.
- @alancarroII posted the obligatory meme about plumbers and electricians watching AI replace everyone who went to college. The trades-vs-knowledge-work inversion narrative continues to gain traction.
Claude Code's Harness Ecosystem Takes Shape
Something interesting is happening in the Claude Code community: developers are increasingly treating the CLI as a foundation to build on rather than a finished product. Today's posts paint a picture of an ecosystem fragmenting in productive ways, with users building custom harnesses, plugins, and integrations that extend Claude Code's capabilities far beyond what ships in the box.
@alxfazio urged developers to be "headless claude maxxing," pointing to an article that apparently explains the pattern better than Anthropic's own docs. Running Claude Code headless, without the interactive terminal UI, unlocks programmatic orchestration that's impossible in the default interactive mode. Meanwhile, @Jaytel declared they're "done with Claude Code" entirely, finding that "building your own harness in Pi is addicting." This isn't a rejection of Claude as a model. It's a rejection of the default interface in favor of something custom-tailored.
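As a rough illustration of what headless orchestration can look like, the sketch below wraps a single non-interactive Claude Code call from Python. It assumes the `claude` binary is on PATH and relies on the CLI's print mode (`-p`) together with `--output-format`. The helper names `build_headless_command` and `run_headless` are hypothetical and are not taken from the article @alxfazio linked.

```python
import json
import shutil
import subprocess


def build_headless_command(prompt: str, output_format: str = "json") -> list[str]:
    """Build the argv for one non-interactive Claude Code run (hypothetical helper)."""
    if output_format not in {"text", "json"}:
        raise ValueError(f"unsupported output format: {output_format}")
    # -p is print mode: answer the prompt once and exit, with no terminal UI.
    return ["claude", "-p", prompt, "--output-format", output_format]


def run_headless(prompt: str, cwd: str = ".", timeout_s: int = 600) -> dict:
    """Run one headless call in `cwd` and return the parsed JSON payload."""
    if shutil.which("claude") is None:
        raise RuntimeError("claude CLI not found on PATH")
    proc = subprocess.run(
        build_headless_command(prompt),
        cwd=cwd,
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    # The JSON schema is not assumed here; callers should inspect the payload.
    return json.loads(proc.stdout)
```

A call such as `run_headless("Summarize the TODO comments in this repository.")` could then be driven from a scheduler or a larger harness. Keeping command construction separate from execution makes the orchestration logic testable without the CLI installed.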
The plugin side is evolving too. @affaanmustafa highlighted how easy it is to add Claude Code plugins through Cowork's interface, noting they use "a bit of everything at this point, mainly to check how things work across harnesses." And @noahvnct shared a guide on building an "AI Second Brain Using Obsidian + Claude Code," connecting the coding agent to a knowledge management system. The common thread is that power users want Claude Code integrated into their existing workflows, not the other way around.
@trq212 shared "Lessons from Building Claude Code: Seeing like an Agent," which frames the design philosophy from Anthropic's perspective. The title itself is revealing: the challenge isn't just making a good model, it's making the model see the world the way an effective agent needs to. As the harness ecosystem matures, the community is attacking that challenge from the other direction, building scaffolding that lets agents see the way developers actually work.
AI Reshapes Professional Work
The workforce disruption conversation took a sharper turn today, moving from abstract predictions to concrete reports of action. @jeffdfeng dropped what might be the most unsettling post of the day: "Spoke with several YC founders planning to lay off all engineers below staff/principal, basically everyone under L5. This only became viable after Opus 4.5 in December." He framed the Block layoffs as a signal that "the floor just collapsed" and advised early-career engineers that "your edge will be how well you integrate AI into the value you create."
Whether these specific claims hold up to scrutiny is less important than the sentiment they represent. By these accounts, the idea that AI coding agents can replace junior and mid-level engineers is now a planning assumption at some funded startups, not a thought experiment. Combined with @cgtwts sharing a prediction attributed to Anthropic CEO Dario Amodei that "AI will wipe out 50% of lawyers, consultants, and finance professionals within the next 12 months," the message is consistent across industries: the professional class is in the crosshairs.
But the picture isn't purely dystopian. Two posts from the legal world show professionals leaning into AI rather than being displaced by it. @garthwatson, a non-practicing lawyer who built a mobile app with Claude Code, called it "signal" for legal tech, noting his experience founding and scaling a legal tech company. And @zackbshapiro detailed how he's increasingly using Claude as his primary tool in legal practice, not specialized legal AI products like Harvey or CoCounsel, but "a general-purpose AI that I've taught how I practice law." This distinction matters. The professionals who survive the disruption won't be the ones waiting for industry-specific AI tools. They'll be the ones who learn to work directly with general-purpose models and shape them to their domain. The gap between "AI will take your job" and "AI will transform your job" often comes down to whether you're building your own workflows or waiting for someone else to build them for you.
Karpathy Programs an AI Research Organization
Andrej Karpathy shared the most technically substantive post of the day: a detailed account of running eight AI agents (four Claude, four Codex) as a research organization working on nanochat experiments. The setup is ambitious, with each agent getting a GPU, running on git branches with worktree isolation, communicating through simple files, and visible through tmux window grids. The goal: delete logit softcap from the model without regression.
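One way to picture the isolation scheme is sketched below: each agent gets its own branch and worktree, and agents exchange notes by appending to a plain file. This is not Karpathy's actual tooling. The agent names, the directory layout, and the `worktree_commands`, `post_message`, and `read_messages` helpers are assumptions made for illustration. The only external command it relies on is `git worktree add -b <branch> <path>`.

```python
import json
import time
from pathlib import Path


def worktree_commands(agents: list[str], root: str = "worktrees") -> list[list[str]]:
    """Return one `git worktree add` argv per agent, each on its own new branch."""
    return [
        ["git", "worktree", "add", "-b", f"agent/{name}", f"{root}/{name}"]
        for name in agents
    ]


def post_message(mailbox: Path, sender: str, text: str) -> None:
    """Append one JSON line to a shared file; the file is the whole 'protocol'."""
    mailbox.parent.mkdir(parents=True, exist_ok=True)
    record = {"ts": time.time(), "from": sender, "text": text}
    with mailbox.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def read_messages(mailbox: Path) -> list[dict]:
    """Read every message posted so far, oldest first."""
    if not mailbox.exists():
        return []
    with mailbox.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Inside a git repository, each argv from `worktree_commands` could be handed to `subprocess.run`. An append-only JSON-lines file keeps the messaging simple enough for any agent to read and write with ordinary file tools, at the cost of having no locking or delivery guarantees.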
The result? "The TLDR is that it doesn't work and it's a mess... but it's still very pretty to look at." What makes this post valuable is Karpathy's precise diagnosis of why it fails. The agents' ideas are "pretty bad out of the box, even at highest intelligence. They don't think carefully through experiment design, they run a bit non-sensical variations, they don't create strong baselines and ablate things properly." His example is perfect: an agent "discovered" that increasing hidden size improves validation loss, a totally spurious result that conflates model capacity with actual improvement when training time isn't controlled.
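The confound in that example can be made concrete with a small guard that refuses to report a comparison unless everything except the variable under study is held fixed. The sketch below is an illustration under stated assumptions, not part of Karpathy's setup: the `Run` fields and the `uncontrolled_factors` and `compare` helpers are hypothetical, and any numbers passed to them would be made up for the example.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Run:
    hidden_size: int
    learning_rate: float
    train_budget_s: float  # configured training-time budget, in seconds
    val_loss: float        # the outcome being compared


def uncontrolled_factors(a: Run, b: Run, studied: str) -> list[str]:
    """List settings, other than the studied one, that differ between two runs."""
    da, db = asdict(a), asdict(b)
    return [
        key for key in da
        if key not in (studied, "val_loss") and da[key] != db[key]
    ]


def compare(a: Run, b: Run, studied: str) -> str:
    """Report a validation-loss delta only when the comparison is controlled."""
    confounds = uncontrolled_factors(a, b, studied)
    if confounds:
        return f"inconclusive: also differs in {', '.join(confounds)}"
    delta = b.val_loss - a.val_loss
    return f"{studied} {getattr(a, studied)} -> {getattr(b, studied)}: val_loss {delta:+.4f}"
```

Comparing a baseline against a wider model that also trained on a larger budget would come back as inconclusive, which is the check the agents in the thread skipped. A guard like this only catches differences in recorded settings; it cannot replace a properly designed baseline and ablation.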
But the conceptual frame is where things get interesting. Karpathy describes the work as "programming an organization" where the source code is "the collection of prompts, skills, tools, etc. and processes that make it up. E.g. a daily standup in the morning is now part of the 'org code.'" The evaluation metric becomes: "given an arbitrary task, how quickly does your research org generate progress on it?" This maps directly onto what the Claude Code harness builders are doing at a smaller scale, except Karpathy is applying it to ML research rather than software engineering.
@nummanali picked up a related thread, highlighting Middleman, a tool from the creator of dev-browser that gives you "a single persistent manager agent per project." The pitch captures the emerging consensus: "You're not an IC anymore. You've become a project manager. You need a middle manager." Whether it's Karpathy orchestrating eight researchers or a solo developer managing a Middleman instance, the pattern is the same. The human's role is shifting from doing the work to designing the system that does the work. The question is whether that system can generate genuinely good ideas, or just execute the ones you hand it.
Source Posts
We've also created plugins across HR, design, engineering, ops, financial analysis, investment banking, equity research, private equity, and wealth management to help users see what's possible and start building their own.
Lessons from Building Claude Code: Seeing like an Agent
One of the hardest parts of building an agent harness is constructing its action space. Claude acts through Tool Calling, but there are a number of wa...
we're making @blocks smaller today. here's my note to the company.
today we're making one of the hardest decisions in the history of our company: we're reducing our organization by nearly half, from over 10,000 people to just under 6,000. that means over 4,000 of you are being asked to leave or entering into consultation. i'll be straight about what's happening, why, and what it means for everyone.
first off, if you're one of the people affected, you'll receive your salary for 20 weeks + 1 week per year of tenure, equity vested through the end of may, 6 months of health care, your corporate devices, and $5,000 to put toward whatever you need to help you in this transition (if you’re outside the U.S. you’ll receive similar support but exact details are going to vary based on local requirements). i want you to know that before anything else. everyone will be notified today, whether you're being asked to leave, entering consultation, or asked to stay.
we're not making this decision because we're in trouble. our business is strong. gross profit continues to grow, we continue to serve more and more customers, and profitability is improving. but something has changed. we're already seeing that the intelligence tools we’re creating and using, paired with smaller and flatter teams, are enabling a new way of working which fundamentally changes what it means to build and run a company. and that's accelerating rapidly.
i had two options: cut gradually over months or years as this shift plays out, or be honest about where we are and act on it now. i chose the latter. repeated rounds of cuts are destructive to morale, to focus, and to the trust that customers and shareholders place in our ability to lead. i'd rather take a hard, clear action now and build from a position we believe in than manage a slow reduction of people toward the same outcome.
a smaller company also gives us the space to grow our business the right way, on our own terms, instead of constantly reacting to market pressures. a decision at this scale carries risk. but so does standing still. we've done a full review to determine the roles and people we require to reliably grow the business from here, and we've pressure-tested those decisions from multiple angles. i accept that we may have gotten some of them wrong, and we've built in flexibility to account for that, and do the right thing for our customers.
we're not going to just disappear people from slack and email and pretend they were never here. communication channels will stay open through thursday evening (pacific) so everyone can say goodbye properly, and share whatever you wish. i'll also be hosting a live video session to thank everyone at 3:35pm pacific. i know doing it this way might feel awkward. i'd rather it feel awkward and human than efficient and cold.
to those of you leaving…i’m grateful for you, and i’m sorry to put you through this. you built what this company is today. that's a fact that i'll honor forever. this decision is not a reflection of what you contributed. you will be a great contributor to any organization going forward.
to those staying…i made this decision, and i'll own it. what i'm asking of you is to build with me. we're going to build this company with intelligence at the core of everything we do. how we work, how we create, how we serve our customers. our customers will feel this shift too, and we're going to help them navigate it: towards a future where they can build their own features directly, composed of our capabilities and served through our interfaces.
that's what i'm focused on now. expect a note from me tomorrow.
jack
The Codex App is still heavily slept on. If you aren't using ECC for Codex, you're missing out. It's super easy and pulls all the skills over. Most people's development-related openclaw automations can also just be run directly from Codex. I ported a lot of my automations over: https://t.co/oCZRV3cvKb
How come the NanoGPT speedrun challenge is not fully AI automated research by now?
Pi is the most interesting agent harness. Tiny core, able to write plugins for itself as you use it. It RLs itself into the agent you want. I was missing cc’s tasks system and told it to spawn claude in tmux and interrogate it about it and make an implementation for itself. It nailed it, including the UX. Clawdbot is based on it and now it makes sense why it feels so magical. Dawn of the age of malleable software.
The Claude-Native Law Firm