Factory AI Drops Agent Readiness Framework While Claude Code's Skills Ecosystem Goes Mainstream
Daily Wrap-Up
The throughline today is infrastructure. Not the servers-and-containers kind, but the organizational kind. The question has shifted from "can AI agents write code?" to "is your codebase ready for them to work in?" Factory AI's Agent Readiness framework gave that question a formal shape, and the response was immediate. Engineering leaders chimed in that this should be priority one. Meanwhile, the Claude Code community is building out the connective tissue that makes agents actually useful: visual feedback tools, skills marketplaces, date-time injection hooks, and CPU profiling output formatted for LLMs. It's plumbing season, and the people laying pipe are winning.
The most surprising moment came from @GergelyOrosz, who noted that inside Big Tech, the internal token usage leaderboards are dominated by distinguished engineers and even VPs. Not junior devs experimenting. Not the AI-curious middle. The most senior, most experienced people, the ones who rarely wrote code day-to-day before LLMs, are now the heaviest users. That says something profound about where AI coding tools deliver the most leverage: not replacing junior work, but unblocking senior people who have deep architectural knowledge but limited time to implement.
The most practical takeaway for developers: invest in making your repositories agent-friendly before investing in fancier agents. Add pre-commit hooks, document environment variables, make builds self-verifiable. As @matanSF put it, fast validation loops make every agent more effective. The tooling is maturing fast, but it can only move as fast as your codebase lets it.
Quick Hits
- @AnthropicAI published a new constitution for Claude, a detailed description of their vision for Claude's behavior and values, written primarily for Claude itself and used directly in training.
- @scaling01 notes Anthropic is "preparing for the singularity," linking to what appears to be internal planning docs.
- @mehulmpt declares "the end of ed-tech is near," a one-liner that hits different after Google launched free AI-powered SAT practice.
- @Abhigyawangoo published "Why your AI agents still don't work," addressing the gap between agent hype and agent reality.
- @__Talley__ made a Polymarket promo video in 30 minutes using only 4-5 prompts, adding to the growing pile of evidence that video editing workflows are being compressed dramatically.
- @TheRealMcCoy shared a breakdown of photonic computing for AI, where light-based matrix multiplication could deliver massive speed gains with lower energy use. Still early but worth watching.
- @theo pointed to the Claude Code ecosystem as the model for what good devrel looks like in 2026.
- @dweekly shared that a Fortune 100 company likes to declare itself on the "frontier of AI" even though only 1% of its employees have access to any form of it.
Claude Code: From Tool to Ecosystem
Today felt like a tipping point for Claude Code's surrounding ecosystem. The sheer volume of community-built tooling, guides, and workflow patterns suggests the product has crossed from "powerful CLI" into "platform with a developer community."
The headline launch was Agentation from @benjitaylor, a visual feedback tool that lets you click elements, add notes, and copy markdown that gives agents element paths, selectors, and positions. It's the kind of tool that solves a real friction point: agents struggling to understand visual layouts. @benjitaylor went further, noting he "was able to build the entire documentation site solely using Claude Code + Agentation, including all the animated demos." A tool for agents, built by agents.
@affaanmustafa dropped "The Longform Guide to Everything Claude Code," covering token optimization, memory persistence, verification loops, parallelization, and subagent orchestration. It hit 7,500 stars and 1,000 forks in under four days. That kind of traction for a guide, not a tool, signals real hunger for operational knowledge about how to use these systems effectively.
On the tips-and-tricks front, @alexhillman shared several battle-tested patterns. The most immediately useful: injecting date-time context via hooks. "Claude Code doesn't know what time it is. Or what time zone you are in. So when you do date time operations of ANY kind, things get weird fast." His solution is a pre-message hook that generates current datetime and timezone, injected silently into context. Three months battle-tested, and it works. He also pushed a broader philosophy: "If you ask your AI assistant more questions than it asks you, you're gonna have a bad time." His approach combines confidence scoring with interviewing workflows, where agents stop and interview you when they're below a confidence threshold. @rezzz expanded on this, describing how he had the AI interview him about his fears, preferences, and working style so "the system works for me and not the other way around."
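The pattern @alexhillman describes can be sketched as a small hook script. This is a hedged sketch, not his implementation: it assumes a hook mechanism like Claude Code's UserPromptSubmit event, where whatever the script prints to stdout is added to the model's context before each message.

```python
#!/usr/bin/env python3
"""Sketch of a datetime-injection hook for a coding agent.

Assumes a hook mechanism (e.g. Claude Code's UserPromptSubmit event)
that runs this script before each message and appends its stdout to
the model's context. The wiring and script name are illustrative,
not Alex's actual setup.
"""
from datetime import datetime, timezone


def current_time_context() -> str:
    """Build a small context blob with UTC time, local time, and timezone."""
    now_utc = datetime.now(timezone.utc)
    now_local = now_utc.astimezone()  # converts to the machine's local timezone
    return (
        f"Current datetime (UTC): {now_utc.isoformat(timespec='seconds')}\n"
        f"Current datetime (local): {now_local.isoformat(timespec='seconds')}\n"
        f"Local timezone: {now_local.tzname()}"
    )


if __name__ == "__main__":
    # Whatever this prints is silently injected into the agent's context.
    print(current_time_context())
```

Once wired up, every message the agent sees carries the current date, time, and timezone, which is what makes "three months battle-tested" date arithmetic stop getting weird.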
The Anthropic team shared their own war stories. @trq212 revealed they found a garbage collection issue in Claude Code's rendering pipeline that only surfaced in certain terminal/OS combinations, noting "some things you can't find until you ship." The underlying migration was massive: porting their entire rendering engine while keeping nothing user-facing broken, work that "could have taken on the order of 1-2 years for a single engineer" without Claude Code.
Other ecosystem moves: @simplifyinAI highlighted a new open-source library with 100+ pre-made agents, skills, and templates for Claude Code. @jakubkrcmar called out how @clawdbot is "quickly becoming the wet dream of leading AI companies." @jarredsumner announced Bun will support --cpu-prof-md, printing CPU profiles as Markdown so LLMs can read and grep them. Even runtime tooling is adapting to agent workflows. And @paraddox shared the simplest possible autonomous loop: a bash script running Claude 50 times with --dangerously-skip-permissions.
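@paraddox's loop is literally a bash for-loop; the same pattern translates to a few lines of Python. A hedged sketch: the claude flags mirror the post, but the subprocess wrapper and prompt text are illustrative, and the skip-permissions flag means this belongs in a sandboxed checkout only.

```python
#!/usr/bin/env python3
"""Sketch of @paraddox's autonomous loop, rewritten from bash.

The claude invocation in the comment below mirrors the post
(--dangerously-skip-permissions bypasses approval prompts, so only run
it in a sandboxed checkout); the wrapper and prompt are illustrative.
"""
import subprocess


def run_loop(cmd: list[str], iterations: int = 50) -> int:
    """Run cmd up to `iterations` times, stopping at the first failure.

    Returns the number of successful runs.
    """
    completed = 0
    for _ in range(iterations):
        if subprocess.run(cmd).returncode != 0:
            break
        completed += 1
    return completed


# The real thing, per the post (don't run this outside a sandbox):
# run_loop(["claude", "-p", "keep working through the plan",
#           "--dangerously-skip-permissions"])
```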
Agent Readiness and the Code Review Problem
Factory AI introduced Agent Readiness, a framework that measures how well a repository supports autonomous development across eight axes and five maturity levels. The framing is important: it's not about making better agents, it's about making better environments for agents to work in. @EnoReyes called it "the most essential focus area for a software organization looking to accelerate," warning that without it, "your adoption of AI will actively decelerate your org." @bentossell kept it simple: "all repos should be agent-ready."
@matanSF made the case concrete with examples that will resonate with anyone who's watched an agent flail: "No pre-commit hooks = agent waits 10 min for CI instead of 5 sec. Undocumented env vars = agent guesses, fails, guesses again. Build requires tribal knowledge from Slack = agent can't verify its own work." The pattern is clear. Agent effectiveness is bounded by repository quality.
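The "undocumented env vars" failure mode is also easy to lint for before an agent ever touches the repo. A minimal sketch: the `.env.example` convention and the `os.environ` regex are assumptions about the repo, not part of Factory AI's framework.

```python
#!/usr/bin/env python3
"""Sketch: flag env vars referenced in code but missing from .env.example.

Assumes Python sources that read os.environ["NAME"] / os.environ.get("NAME")
and a .env.example documenting expected variables -- common conventions,
not part of Factory AI's Agent Readiness framework.
"""
import re

# Matches os.environ["NAME"] and os.environ.get("NAME", ...)
ENV_REF = re.compile(r"""os\.environ(?:\.get)?\s*[\(\[]\s*['"]([A-Z0-9_]+)['"]""")


def referenced_vars(source: str) -> set[str]:
    """Env var names the code actually reads."""
    return set(ENV_REF.findall(source))


def documented_vars(env_example: str) -> set[str]:
    """Env var names declared in a .env.example-style file."""
    names = set()
    for line in env_example.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            names.add(line.split("=", 1)[0].strip())
    return names


def undocumented(source: str, env_example: str) -> set[str]:
    """Vars the code reads that the agent would have to guess at."""
    return referenced_vars(source) - documented_vars(env_example)
```

Anything this returns non-empty for is exactly the "agent guesses, fails, guesses again" scenario.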
The code review conversation ran in parallel. @ScottWu46 from Devin argued that current AI review tools focus on catching bugs at arm's length, but "until we reach the point where you can confidently hit 'Merge' on a 5000-line agent PR, you're still bottlenecked on reviewing the code yourself." The real question: would you rather have an AI that catches 80% of bugs, or an AI-powered review UX that makes you 5x faster? @walden_yan echoed this, saying "it felt pretty slop to say AI will review the code that it wrote. The key is going to be helping the HUMAN understand what they're merging." Meanwhile, @steveruizok offered a delightfully unorthodox approach: "The best code review tool I've come up with is asking Claude to reimplement the PR on a new branch in a narratively optimized perfect git history." @pauldix tied it together, arguing that "getting agents into a verification loop is the superpower for 2026."
Skills Discovery and the Context Layer
A potentially significant standards conversation emerged around how agents find and consume skills. @elithrar from Cloudflare proposed using the .well-known URI standard with an index of associated files, so agents can hit /.well-known/skills/index.json to discover related skills. "There's tools like add-skill to add skills, but you have to find them first." He acknowledged the tension around premature standardization but emphasized this is an RFC, not a standard, "big on the 'C' here."
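Since this is an RFC rather than a spec, the index schema below is a guess at the proposal's shape, not a published format: fetch /.well-known/skills/index.json and list what a site offers.

```python
#!/usr/bin/env python3
"""Sketch of .well-known skills discovery, per @elithrar's RFC.

The index schema assumed here (a top-level "skills" array with
name/description/files entries) is a guess at the proposal's shape,
not a published spec.
"""
import json
from urllib.parse import urljoin
from urllib.request import urlopen

WELL_KNOWN_PATH = "/.well-known/skills/index.json"


def parse_index(index_json: str) -> list[dict]:
    """Return the list of skill entries from a skills index document."""
    data = json.loads(index_json)
    return data.get("skills", [])


def discover(base_url: str) -> list[dict]:
    """Fetch and parse a site's skills index (makes a network call)."""
    with urlopen(urljoin(base_url, WELL_KNOWN_PATH)) as resp:
        return parse_index(resp.read().decode("utf-8"))
```

The appeal of the .well-known approach is exactly this: discovery becomes one HTTP GET at a predictable path, instead of "you have to find them first."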
Prefect went bigger. @jlowin announced Prefect Horizon, positioning it as "the context layer" where AI agents interface with your business. Built on top of FastMCP (which they created), Horizon adds managed hosting, a central registry, role-based access control, and an agentic interface for business users. The pitch: MCP tells you how to build a server, but not how to govern it at scale. Horizon aims to fill that gap. Whether the market needs an MCP platform layer this early is an open question, but the bet is clear. @LLMJunky demonstrated the power of the skills approach by generating a full promo video for CodexSkills from a single prompt, calling the result "cracked."
Products and Launches
Several product launches landed today. @tomkrcha launched Pencil, an infinite WebGL design canvas for Claude Code with parallel design agents and a git-native .pen file format. @theworldlabs opened the World API for building. @usekernel introduced Browser Pools with pre-configured logins, cookies, and extensions for agents, and @rfgarcia outlined use cases: spinning up browsers for QA, running evals on browser agents, and giving parallel subagents different research tasks without paying for standby CPU time.
Google entered the AI-for-education space by launching full-length, on-demand SAT practice exams in Gemini, grounded in content from Princeton Review, available at no cost. @Zai_org highlighted GLM Coding Plans paired with Kilo Code, focusing on the practical question of how much work you can get done without worrying about limits or cost.
AI and the Career Landscape
@hamptonism posted the meme of the day: "POV: driving to your $450k SWE job knowing it's just another 8 hours of having Claude do everything for you until you're eventually replaced entirely within 12 months." It's a joke, but @GergelyOrosz's observation about distinguished engineers and VPs dominating internal token leaderboards adds a serious dimension. The most experienced people are getting the most leverage, which suggests AI tools amplify expertise rather than replacing it.
@esrtweet invoked Vernor Vinge's Singularity concept, arguing we're living in it right now: "Nobody knows what to build that will still have value in 3 months." The practical implication for planning is real. Product roadmaps built on 12-month horizons are struggling when the underlying capabilities shift quarterly.
Local AI: Big Models on Small Hardware
@LiorOnAI highlighted AirLLM, which runs 70B parameter models on 4GB of VRAM by loading one layer at a time: compute, free memory, load the next. It can reportedly run Llama 3.1 405B on 8GB of VRAM. No quantization is required by default, and it exposes the same API as Hugging Face Transformers. The tradeoff is speed for memory, but for prototyping and testing, the accessibility is compelling.
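The layer-at-a-time trick itself is simple to illustrate. A toy sketch of the concept only, not AirLLM's actual API: each "layer" lives on disk until the moment it is needed, so peak memory is one layer rather than the whole model.

```python
#!/usr/bin/env python3
"""Toy illustration of layer-streaming inference, the trick behind AirLLM.

Each "layer" (here just a weight matrix) is kept on disk and loaded only
for the moment it is applied, so peak memory is one layer instead of the
whole model. This shows the concept only -- it is not AirLLM's API.
"""
import pickle
from pathlib import Path


def matvec(matrix, vector):
    """Plain-Python matrix-vector product (stands in for a layer's math)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def save_layers(layers, directory: Path) -> list[Path]:
    """Write each layer to its own file, as a model checkpoint would."""
    paths = []
    for i, layer in enumerate(layers):
        path = directory / f"layer_{i}.pkl"
        path.write_bytes(pickle.dumps(layer))
        paths.append(path)
    return paths


def streamed_forward(layer_paths, vector):
    """Forward pass holding only one layer in memory at a time."""
    for path in layer_paths:
        layer = pickle.loads(path.read_bytes())  # load just this layer
        vector = matvec(layer, vector)
        del layer  # free it before touching the next one
    return vector
```

Swap the lists for multi-gigabyte transformer blocks and the same load-compute-free loop is why a 70B model can fit through a 4GB window, at the cost of reloading weights every token.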
@DataChaz covered NVIDIA's PersonaPlex-7B, an open-source full-duplex conversational model released under MIT license. Unlike traditional ASR-to-LLM-to-TTS pipelines that force rigid turn-taking, PersonaPlex uses a dual-stream transformer to listen and speak simultaneously, enabling "instant back-channel responses, interruptions that feel human, real conversational rhythm." For anyone building voice agents or low-latency assistants, this is a meaningful step toward natural conversation.
Source Posts
Meet Devin Review: a reimagined interface for understanding complex PRs. Code review tools today don't actually make it easier to read code. Devin Review builds your comprehension and helps you stop slop. Try without an account: https://t.co/Zzu1a3gfKF More below: https://t.co/sYQLjwSk6s
Lunch w/ an exited founder who helps fortune 500 companies adopt AI. Insane reality check: Some of the biggest companies on earth use *zero* AI tools. Not even ChatGPT. Execs only recognize: ChatGPT, Copilot, Gemini (maybe Perplexity). Everyone feels behind. Nobody knows what to buy or how to plug it in. The "AI saturation" narrative is another example of what a bubble Silicon Valley is. Rest of the world hasn't started yet. We have to build for the 99%.
@theirongolddev @alexhillman What Alex did I thought was genius… I had it interview me for ergonomics. I had it ask me my fears, what I didn't like, what works for me, what I want, how I want to work/show up, and other things about me so the system works for me and not the other way around.
How I'm using Clawd.bot to change how I get things done.
100% convinced that how we work has now changed....and although this might still be fringe now. I think it will filter down to the masses over time. I...
Remotion now has Agent Skills - make videos just with Claude Code! $ npx skills add remotion-dev/skills This animation was created just by prompting: https://t.co/hadnkHlG6E
Now you can track your @opencode and @claudeai CLI coding sessions in one place. https://t.co/FLe8dRC8Pv provides searchable history, markdown export, and eval-ready datasets. See tool usage, token spend, and session activity across projects. Check out the demo. https://t.co/HGlZOOyugN
@souravbhar871 Itโs all stored locally in your .claude folder, you can ask Claude to read it and create scripts to help visualize it
We're launching full-length, on-demand practice exams for standardized tests in @GeminiApp, starting with the SAT, available now at no cost. Practice SATs are grounded in rigorously vetted content in partnership with @ThePrincetonRev, and Gemini will provide immediate feedback highlighting where you excelled and where you might need to study more. To try it out, tell Gemini, "I want to take a practice SAT test."
Build the machine that builds the machine
Why your AI agents still donโt work
Most agents are horrible at integrating with domain-specific knowledge and adapting to feedback. I know most people hear this and think, "Great, I'll ...
I'm starting to get worried. Did Anthropic solve continual learning? Is that the preparation for evolving agents? https://t.co/pcCoSM4gAr
The Longform Guide to Everything Claude Code
In "The Shorthand Guide to Everything Claude Code", I covered the foundational setup: skills and commands, hooks, subagents, MCPs, plugins, and the co...
Introducing Agent Readiness. AI coding agents are only as effective as the environment in which they operate. Agent Readiness is a framework to measure how well a repository supports autonomous development. Scores across eight axes place each repo at one of five maturity levels. https://t.co/9POPIY3hXr
v1 of my "reimplement this PR using an ideal commit history" command, actually works quite well. "What commits would I have made if I had perfect information about the desired end state?" https://t.co/5S4kCIo8bR
The Shorthand Guide to Everything Claude Code
Introducing Browser Pools โ instant browsers with the logins, cookies, and extensions your agents depend on. Designed to make using Kernel even faster. https://t.co/Gt6cc9awcd
Yeah this was 1,000% worth it. Separate Claude subscription + Clawd, managing Claude Code / Codex sessions I can kick off anywhere, autonomously running tests on my app and capturing errors through a sentry webhook then resolving them and opening PRs... The future is here.
My clawdbot sucks at days and time. It never seems to have any clue what the current day or time is.