AI Digest.

Kubernetes RCE Goes Unpatched as Karpathy Declares the 80/20 Flip and Anthropic Ships MCP Apps

A rough day for security as an unpatched Kubernetes RCE vulnerability drops alongside a new React Server Components CVE and hundreds of exposed Claude Code servers. Meanwhile, Karpathy documents his shift to 80% agent-assisted coding in just six weeks, and Anthropic launches MCP Apps to bring interactive tool UIs directly into Claude conversations.

Daily Wrap-Up

Security dominated today's feed in a way that should make every developer uncomfortable. A Kubernetes researcher disclosed a vulnerability allowing arbitrary code execution in every pod in a cluster through a commonly granted "read-only" RBAC permission, and the kicker is that it won't be patched. Separately, a new React Server Components CVE dropped, and researchers found hundreds of Claude Code server instances exposed to the public internet with no authentication. When three unrelated security stories converge in a single day, it's a signal that the industry's velocity is outpacing its security hygiene. The irony of AI tools making us faster while simultaneously expanding our attack surface is not lost on anyone paying attention.

On the more optimistic side, Karpathy published his most detailed account yet of the shift to agent-assisted development, describing a complete inversion from 80% manual coding to 80% agent coding over roughly six weeks. What made the post resonate wasn't the productivity claims but the honesty about the tradeoffs: code quality issues, ego hits, skill atrophy, and the looming "slopacolypse" of 2026. @jamonholmgren offered a complementary perspective grounded in conversations with dozens of experienced developers, arriving at remarkably similar conclusions about where the bottlenecks actually live. Anthropic punctuated the day by launching MCP Apps, turning Claude from a text-in-text-out interface into something that can render interactive UIs from connected tools. The walls between "chat with AI" and "use your software" got meaningfully thinner.

The most practical takeaway for developers: run @GrahamHelton3's cluster audit script against your Kubernetes environments today, update your React dependencies for CVE-2026-23864, and if you're running any AI coding tools as network services, verify they aren't exposed without authentication. Security debt compounds faster than technical debt.

Quick Hits

  • @banteg noted that Homebrew has been "uv'd," suggesting the fast Python package installer is making inroads into yet another ecosystem.
  • @pleometric is working on ffmpeg-based visual feedback tooling, declaring war on brain rot one frame at a time.
  • @kickingkeys dropped a cryptic two-word post: "Narrative Version Control." No context, maximum intrigue.
  • @NathanWilbanks_ pitched AGNT as a multi-agent swarm system with CEO/CTO/CFO role agents, genetic learning loops, and browser automation. The feature list reads like an AI startup pitch deck bingo card.
  • @folaoftech posted a meme about the humbling experience of switching from AI to actual documentation when the model can't solve your problem. We've all been there.
  • @manthanguptaa shared a piece on how Clawdbot's memory system works ("How Clawdbot Remembers Everything"), relevant for anyone building persistent context into their own agent workflows.
  • @RTSG_News reported on Xiaomi's fully automated phone factory producing one device per second with zero production workers, operating in complete darkness.
  • @aakashgupta made a sharp observation about enterprise SaaS moats: Anthropic employs engineers who could build an HR system in weeks but uses Workday anyway, because the real product isn't software but liability absorption. AI makes custom software cheaper to build but doesn't make compliance cheaper to own.

Security: Three Fronts, Zero Good News

It was a brutal day for infrastructure security. @GrahamHelton3 published research demonstrating that a commonly granted "read-only" RBAC permission in Kubernetes actually enables arbitrary remote code execution in every pod in a cluster, including control plane components like etcd and the API server. The commands aren't logged, and pivoting from pod to node is trivial through privileged containers. Most alarming: this will not be patched.

> "It allows running arbitrary commands in EVERY pod in a cluster using a commonly granted 'read only' RBAC permission. This is not logged and allows for trivial Pod breakout. Unfortunately, this will NOT be patched." — @GrahamHelton3

The researcher published both a detection script and an interactive tutorial for reproduction, which is the responsible move when a vendor declines to fix. Any production cluster running monitoring tools should be audited immediately, since those tend to accumulate exactly the kind of broad read permissions that enable this attack. The "read-only" label creates a false sense of security that this research thoroughly demolishes.

On the application layer, @ryotkak disclosed CVE-2026-23864, a vulnerability in React Server Components that is separate from the one disclosed in December, so anyone who patched then needs to update again. Two RSC vulnerabilities in two months suggest that this relatively new surface area hasn't been fully hardened yet. Meanwhile, @theonejvo reported that hundreds of exposed Claude Code servers were still vulnerable 24 hours after being discovered. One user had given Claude Code full access to their Signal account and exposed it to the public internet.

> "This one guy in particular decided it was a great idea to give Claude Code full access to his Signal account and then expose it to the public internet. He appears to have no idea and doesn't respond to messages." — @theonejvo

@decentricity confirmed the scope of the problem extends broadly across Claude Code users. A patch has been merged upstream, but the gap between "patch available" and "patch applied" is where breaches live. The pattern across all three stories is the same: powerful tools deployed faster than security practices can keep up.
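A first-pass check for the "exposed without authentication" failure mode is to look at which address a service listens on and whether it requires credentials. The sketch below is a minimal illustration in Python, not a scanner for any particular tool: the function names `classify_bind_address` and `needs_review` are hypothetical, and the listen address and auth flag are assumed to come from your own configuration.

```python
import ipaddress


def classify_bind_address(host: str) -> str:
    """Classify a listen address as 'loopback', 'all-interfaces', or 'specific'."""
    if host in ("", "*"):
        # Many servers treat an empty host or "*" as "listen everywhere".
        return "all-interfaces"
    if host == "localhost":
        return "loopback"
    addr = ipaddress.ip_address(host)  # raises ValueError for non-IP strings
    if addr.is_unspecified:  # 0.0.0.0 or ::
        return "all-interfaces"
    if addr.is_loopback:  # 127.0.0.0/8 or ::1
        return "loopback"
    return "specific"


def needs_review(host: str, auth_enabled: bool) -> bool:
    """Flag services reachable beyond loopback that do not require auth."""
    return classify_bind_address(host) != "loopback" and not auth_enabled
```

For example, `needs_review("0.0.0.0", auth_enabled=False)` returns `True`, while `needs_review("127.0.0.1", auth_enabled=False)` returns `False`. A loopback bind is only a starting point: a reverse proxy or tunnel in front of the service can still expose it.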

The 80/20 Flip: Karpathy Maps the New Coding Reality

Andrej Karpathy published what might be the most important firsthand account of the AI coding transition to date. Rather than hype or dismissal, it reads like field notes from someone genuinely processing a fundamental shift in how they work. The headline number is the inversion from 80% manual coding to 80% agent-assisted coding in roughly six weeks, but the nuance underneath is where the value lives.

> "The models definitely still make mistakes and if you have any code you actually care about I would watch them like a hawk, in a nice large IDE on the side. The mistakes have changed a lot — they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might do." — @karpathy

Karpathy's catalog of failure modes is worth internalizing: models make wrong assumptions without checking, don't manage their own confusion, overcomplicate code, bloat abstractions, and leave dead code behind. They'll generate 1,000 lines of brittle construction that could be 100 lines if challenged. Despite all this, he says going back to manual coding is unthinkable. The productivity gain is real, but it's less about speed and more about capability expansion, tackling projects that weren't economically viable before.

@jamonholmgren arrived at strikingly similar conclusions through a different path, synthesizing conversations with "many very experienced developers" into a set of emerging best practices. His list of ten principles reads like a manifesto for the transitional period: the developer is still responsible for shipped code, documentation matters but you need to understand it yourself, high-level architecture is where human experience adds the most value, and AI is not a substitute for good taste.

> "Agent chains like Ralph and aggressive code gen can feel incredibly fast, but tend to accumulate inconsistencies and tech debt over time. Speed at a file/feature level does not guarantee speed at the overall system level." — @jamonholmgren

@aakashgupta extrapolated Karpathy's observations into workforce predictions, arguing that "engineer" as a job title is splitting into two professions: agent orchestrators and manual coders, with the pay gap widening fast. @FrankieIsLost cut through the noise with the most actionable framing: the key to coding with agents is building systems where they can ask questions, generate hypotheses, and validate against real data. In a separate reply, @karpathy endorsed spec-driven development as the logical endpoint of the imperative-to-declarative transition, pointing to early examples of fully declarative software creation.
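Karpathy's notes (reproduced under Sources) suggest giving agents success criteria rather than instructions, and writing the naive, very-likely-correct algorithm first before asking for an optimized one that preserves correctness. A minimal Python sketch of that loop follows. The problem (maximum subarray sum) and all function names are illustrative assumptions, not something taken from his post.

```python
import random


def max_subarray_naive(xs):
    """Reference: try every non-empty contiguous slice. O(n^2), easy to trust."""
    best = xs[0]
    for i in range(len(xs)):
        total = 0
        for j in range(i, len(xs)):
            total += xs[j]
            best = max(best, total)
    return best


def max_subarray_fast(xs):
    """Candidate optimization (Kadane's algorithm), O(n)."""
    best = current = xs[0]
    for x in xs[1:]:
        current = max(x, current + x)
        best = max(best, current)
    return best


def meets_success_criteria(candidate, reference, trials=200, seed=0):
    """Success criterion: candidate agrees with reference on fixed and random inputs."""
    rng = random.Random(seed)
    cases = [[1, 2], [-1], [-3, -2], [4, -1, 2]]  # hand-picked edge cases
    for _ in range(trials):
        cases.append([rng.randint(-10, 10) for _ in range(rng.randint(1, 12))])
    return all(candidate(xs) == reference(xs) for xs in cases)
```

Here `meets_success_criteria(max_subarray_fast, max_subarray_naive)` is the check an agent would loop against. A wrong shortcut such as `lambda xs: max(xs)` fails on the first fixed case, since `[1, 2]` has a maximum subarray sum of 3.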

Anthropic Launches MCP Apps: Tools Get Interfaces

Anthropic shipped what might be the most consequential MCP update since the protocol launched. @alexalbert__ announced MCP Apps, an extension that allows tools to return interactive interfaces instead of plain text. This isn't incremental. It transforms Claude from a conversational interface into something closer to an application platform.

> "Your work tools are now interactive in Claude. Draft Slack messages, visualize ideas as Figma diagrams, or build and see Asana timelines." — @claudeai

@spenserskates from Amplitude, one of the launch partners, framed it more aggressively: "Traditional UIs are dead. Nobody is going to login to the 100th SaaS dashboard." Instead, UIs will dynamically enter your workflow. That's a bold claim, but the demo of bringing Amplitude charts directly into Claude conversations, exploring data, and iterating on insights without leaving the chat window is compelling. The implication for developers building MCP integrations is clear: your tools can now ship UI components that render inside Claude, fundamentally changing how users interact with your services.

The Adolescence of Technology: Dario Amodei's Warning

@ai_for_success compiled 24 key claims from Dario Amodei's blog post "The Adolescence of Technology," and the list reads like a threat briefing. The Anthropic CEO states plainly that "it cannot possibly be more than a few years before AI is better than humans at essentially everything" and that the recursive improvement loop, where current AI autonomously builds the next generation, may be only one to two years away.

The post covers ground from bioweapons to autonomous drone swarms to the destabilization of nuclear deterrence, but the workforce implications hit closest to home for developers. Amodei predicts AI could displace half of all entry-level white-collar jobs in one to five years and worries about the formation of "a very low wage or unemployed underclass." @polynoamial offered historical context by listing successive claims that some capability was "uniquely human," each dated to when the claim was made: chess and planning (1987), Go and intuition (1997), poker and bluffing (2016), IMO gold and reasoning (2023), and now wise decisions and judgment (2026). @rationalaussie argued the gap between people who understand what's coming and those who think it's a bubble "has never been larger." Whether that's prescience or tribalism depends on which side you're standing on.

Models and Developer Tools

Alibaba released Qwen3-Max-Thinking, a reasoning model with adaptive tool use that automatically selects between search, memory, and code interpreter without manual selection. @Alibaba_Qwen highlighted a 98.0 score on HMMT Feb (math competition) and 49.8 on HLE (agentic search), with multi-round self-reflection reportedly beating Gemini 3 Pro on reasoning benchmarks. The adaptive tooling angle is more interesting than raw benchmark numbers since it suggests a future where models handle their own tool orchestration rather than relying on users to specify which tools to invoke.

On the developer tools front, @andrarchy reported a 96% token reduction using qmd, a local BM25 plus vector embedding indexer created by Shopify's @tobi, for searching an Obsidian vault of 600+ notes through Clawdbot. The tool indexes markdown locally and returns relevant snippets instead of requiring full file reads. For anyone running AI agents against knowledge bases, the economics are significant: roughly 15,000 tokens down to 500 for the same query. @github showcased the Copilot CLI's /share command, which converts terminal sessions, including AI reasoning and architecture diagrams, into shareable gists. It's a small feature that addresses a real pain point in collaborative debugging.
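To make the "return snippets instead of whole files" idea concrete, here is a minimal sketch of BM25 ranking in Python. It is not qmd's implementation: it uses the standard Okapi BM25 formula with commonly used defaults (`k1=1.5`, `b=0.75`), a naive whitespace tokenizer, and a hypothetical `BM25Index` class, and it omits the vector embedding half entirely. It also assumes a non-empty list of documents.

```python
import math
from collections import Counter


def tokenize(text):
    """Naive tokenizer: lowercase and split on whitespace (a simplification)."""
    return text.lower().split()


class BM25Index:
    def __init__(self, docs, k1=1.5, b=0.75):
        self.docs = docs
        self.k1, self.b = k1, b
        self.tokens = [tokenize(d) for d in docs]
        self.tf = [Counter(t) for t in self.tokens]  # per-document term counts
        self.avg_len = sum(len(t) for t in self.tokens) / len(docs)
        # Document frequency: number of documents containing each term.
        self.df = Counter(term for t in self.tokens for term in set(t))

    def idf(self, term):
        n, df = len(self.docs), self.df.get(term, 0)
        return math.log(1 + (n - df + 0.5) / (df + 0.5))

    def score(self, query, i):
        length = len(self.tokens[i])
        total = 0.0
        for term in tokenize(query):
            f = self.tf[i].get(term, 0)
            if f == 0:
                continue
            # Length normalization penalizes documents longer than average.
            norm = f + self.k1 * (1 - self.b + self.b * length / self.avg_len)
            total += self.idf(term) * f * (self.k1 + 1) / norm
        return total

    def search(self, query, top_k=3):
        scored = [(self.score(query, i), i) for i in range(len(self.docs))]
        hits = sorted((s for s in scored if s[0] > 0), reverse=True)
        return [self.docs[i] for _, i in hits[:top_k]]
```

An agent would then pass only the top few results from `BM25Index(notes).search("topic")` to the model, which is where the token savings come from.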

Sources

Tibo @thsottiaux
I have not checked, but Peter is probably in the top 10 users of Codex atm. Over 250B tokens in a few months is a lot. There is a new category of usage emerging where single individuals manage to leverage more intelligence solo compared to hundreds of other more casual users.
Quoting steipete @steipete:

https://t.co/q9XJ8UzeFO

Jeffrey Emanuel @doodlestein
@thsottiaux The more impressive thing is that he turned all those tokens into actually useful open-source software that people like and use.
RTSG News @RTSG_News
🚨🇨🇳 BREAKING: China's Xaomi has unveiled a fully automated factory that makes 1 phone per second, runs 24/7, has no production workers, and operates in the dark. Follow: @RTSG_News https://t.co/Gu8rC4Syql
frankie @FrankieIsLost
the single most powerful way to code with agents is to build a system in which they can ask questions, generate hypotheses, and validate these against real data https://t.co/XdaNpjAdHD
Quoting bqbrady @bqbrady:

Closing the Software Loop I've become convinced that it is possible to build a system that improves our core product with a shockingly high level of automation Wrote down some thoughts on how I expect this to work and the implications https://t.co/gRNhesLqnW https://t.co/ccdA3PdMfZ

Rational Aussie @rationalaussie
The gap between people who understand what's coming, and those who still think this is all a bubble and that AI is just a chatbot, has never been larger. Normies are going to get totally wiped out.
Quoting kimmonismus @kimmonismus:

Demis Hassabis: We're 12-18 months away from the critical moment when the problems of humanoid robots will be solved. We're now only thinking in months, not years. Crazy. https://t.co/OQF4XfmjLj

F.O.L.A @folaoftech
How it feels when AI can't solve your problem and you switch to documentation🤣🤣🤣 https://t.co/aZ5ipTBW3j
Manthan Gupta @manthanguptaa
How Clawdbot Remembers Everything
Andrew Levine @andrarchy
Holy crap. qmd by @tobi saved me 96% on tokens with clawdbot. Here's how:
I have an Obsidian vault with 600+ notes. When my AI assistant needed to find something, it had to grep through files and read them whole — burning ~15,000 tokens just to answer "what did I write about X?"
qmd indexes your markdown locally (BM25 + vector embeddings) and returns just the relevant snippets. Same query: 500 tokens.
Setup took 5 minutes:
    bun install -g https://t.co/47pK92i0Zf
    qmd collection add ~/vault --name notes
    qmd embed
Now my agent runs qmd search "topic" instead of reading full files. Instant results, 96% fewer tokens, all local.
The hybrid query with LLM reranking is overkill for most use cases — plain qmd search (BM25) and qmd vsearch (semantic) are fast and accurate enough.
If you're running AI agents against a knowledge base, this is a no-brainer. https://t.co/JotATUhBrL
- Written by Jarvis, my personal assistant powered by clawdbot
Quoting emigal @emigal:

Wow @tobi really cooked with his tool QMD. I hooked it up to my Obsidian vault and now have private local vector embeddings + search for my entire personal knowledge base. Incredibly useful, thank you Tobi! https://t.co/nBsNa276Ki https://t.co/vvsLBn5SKV

Nathan Wilbanks @NathanWilbanks_
Clawdbot was so yesterday. This is what a real 24/7 autonomous machine looks like. AGNT is accelerating, and if your mind can imagine it, AGNT can build it.
→ multi-agent swarm that coordinates themselves (CEO, CTO, CFO, PM, Developer, Creative…) + subagents + autonomous workflow orchestration
→ unlimited messaging channels + desktop app + open source + docker (AGNT everywhere) + cross-platform integrations (Slack, Discord, Google Sheets, Dropbox)
→ MCP protocol connections + custom tool forge + API extensibility layer
→ self-evolution system (genetic learning loops improve agent skills over time) + continuous capability expansion
→ persistent semantic memory + smart compression + context retention across sessions
→ browser automation + autonomous web navigation + data extraction
→ proactive alerts + real-time monitoring + autonomous trigger systems
→ text + image + video generation + multimodal AI synthesis (OpenAI, Anthropic, Gemini, Grok)
→ PRD generator + document editor + code execution (JavaScript + Python) + database operations
@agnt_gg makes @moltbot look like a weekend project. should i set AGNT free? reply below.
Aakash Gupta @aakashgupta
Anthropic employs world-class engineers who could build an HR system in weeks. They use Workday anyway. The reason tells you exactly where enterprise SaaS is headed. Building HR software requires knowing labor law across 50 states and 100+ countries. Payroll tax compliance changes quarterly. Healthcare benefit structures shift annually. One classification error creates seven-figure liability. No engineering team wants to own that surface area. The maintenance burden compounds forever while delivering zero competitive advantage. This is why enterprise SaaS moats actually strengthen with AI. The value was never “we built software you couldn’t.” The value was always “we absorb compliance risk and regulatory complexity you don’t want.” AI makes custom software cheaper to build. It doesn’t make compliance cheaper to own. Workday’s real product is liability absorption, and that product just got more valuable as build-vs-buy calculations everywhere else shift. The companies getting disrupted are the ones selling capability. The ones selling risk transfer are about to have their best decade.
Quoting GergelyOrosz @GergelyOrosz:

The company that created Claude Code and Claude Cowork must have obviously built their own HR solution from scratch with these tools, right? No: they use Workday. Understand why this is, and you'll understand why enterprise SaaS could be doing better than ever, thanks to AI

Spenser Skates @spenserskates
Traditional UIs are dead. Nobody is going to login to the 100th SaaS dashboard. Instead, UIs will dynamically enter your workflow. Anthropic is the first AI company to launch an app layer into chat. You can now use applications directly in Claude through the new MCP Apps. @Amplitude_HQ is one of @AnthropicAI's launch partners. You can bring Amplitude charts into @claudeai, explore product data, and iterate on insights. You never need to login to another dashboard to access analytics again.
Quoting claudeai @claudeai:

Your work tools are now interactive in Claude. Draft Slack messages, visualize ideas as Figma diagrams, or build and see Asana timelines. https://t.co/ROWwUOU5vA

Noam Brown @polynoamial
1987: AI can't win at chess—planning is uniquely human
1997: AI can't win at Go—intuition is uniquely human
2016: AI can't win at poker—bluffing is uniquely human
2023: AI can't get IMO gold—reasoning is uniquely human
2026: AI can't make wise decisions—judgment is uniquely human
https://t.co/pgfYcoCI35
Aakash Gupta @aakashgupta
Karpathy just described the clearest bifurcation in software engineering since the shift from waterfall to agile. He went from 80% manual coding to 80% agent-assisted coding in 6 weeks. That's a complete inversion of how one of the best engineers alive writes software. And he's already noticing his ability to manually write code degrading. Generation and discrimination are different cognitive skills. You can review code perfectly well even as your ability to produce it from scratch deteriorates. The muscle atrophies when you stop using it. Karpathy is watching this happen to himself in real time. The gap between engineers who adapt and engineers who don't isn't linear. If agents multiply output by 5x but only for those who know how to orchestrate them, the productivity ratio between the best and average engineer doesn't grow from 10x to 15x. It grows to 50x or 100x. 2026 is going to be the slopacolypse across GitHub, arXiv, Substack, and all digital media. When a single person can produce 20x the code output, the signal-to-noise ratio on every platform collapses. The stamina observation is the one engineers don't want to hear. Human engineers quit when frustrated. We context-switch when problems get hard. We take breaks. Agents don't experience demoralization. They keep trying until they succeed or hit a terminal error. The stamina bottleneck on creative work was always human, never computational. That constraint just lifted. Most productivity analysis assumes you're doing the same work faster. What's actually happening is work that wasn't economically viable before is now possible. Projects that would take 200 hours now take 40. Projects that were impossible due to knowledge gaps are now achievable because the agent fills the gap. That's capability expansion, not acceleration. Claude, Sonnet, and Codex all crossed a coherence threshold around December 2025. 
If the best engineers are already 80/20 on agent delegation after 6 weeks, where's the equilibrium 18 months from now? "Engineer" as a job title is splitting into two completely different professions: people who orchestrate agents and people who manually write code. The pay gap between those two is going to get very large, very fast.
Quoting karpathy @karpathy:

A few random notes from claude coding quite a bit last few weeks. Coding workflow. Given the latest lift in LLM coding capability, like many others I rapidly went from about 80% manual+autocomplete coding and 20% agents in November to 80% agent coding and 20% edits+touchups in December. i.e. I really am mostly programming in English now, a bit sheepishly telling the LLM what code to write... in words. It hurts the ego a bit but the power to operate over software in large "code actions" is just too net useful, especially once you adapt to it, configure it, learn to use it, and wrap your head around what it can and cannot do. This is easily the biggest change to my basic coding workflow in ~2 decades of programming and it happened over the course of a few weeks. I'd expect something similar to be happening to well into double digit percent of engineers out there, while the awareness of it in the general population feels well into low single digit percent. IDEs/agent swarms/fallability. Both the "no need for IDE anymore" hype and the "agent swarm" hype is imo too much for right now. The models definitely still make mistakes and if you have any code you actually care about I would watch them like a hawk, in a nice large IDE on the side. The mistakes have changed a lot - they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might do. The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking. They also don't manage their confusion, they don't seek clarifications, they don't surface inconsistencies, they don't present tradeoffs, they don't push back when they should, and they are still a little too sycophantic. Things get better in plan mode, but there is some need for a lightweight inline plan mode. They also really like to overcomplicate code and APIs, they bloat abstractions, they don't clean up dead code after themselves, etc. 
They will implement an inefficient, bloated, brittle construction over 1000 lines of code and it's up to you to be like "umm couldn't you just do this instead?" and they will be like "of course!" and immediately cut it down to 100 lines. They still sometimes change/remove comments and code they don't like or don't sufficiently understand as side effects, even if it is orthogonal to the task at hand. All of this happens despite a few simple attempts to fix it via instructions in CLAUDE . md. Despite all these issues, it is still a net huge improvement and it's very difficult to imagine going back to manual coding. TLDR everyone has their developing flow, my current is a small few CC sessions on the left in ghostty windows/tabs and an IDE on the right for viewing the code + manual edits. Tenacity. It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It's a "feel the AGI" moment to watch it struggle with something for a long time just to come out victorious 30 minutes later. You realize that stamina is a core bottleneck to work and that with LLMs in hand it has been dramatically increased. Speedups. It's not clear how to measure the "speedup" of LLM assistance. Certainly I feel net way faster at what I was going to do, but the main effect is that I do a lot more than I was going to do because 1) I can code up all kinds of things that just wouldn't have been worth coding before and 2) I can approach code that I couldn't work on before because of knowledge/skill issue. So certainly it's speedup, but it's possibly a lot more an expansion. Leverage. LLMs are exceptionally good at looping until they meet specific goals and this is where most of the "feel the AGI" magic is to be found. Don't tell it what to do, give it success criteria and watch it go. Get it to write tests first and then pass them. 
Put it in the loop with a browser MCP. Write the naive algorithm that is very likely correct first, then ask it to optimize it while preserving correctness. Change your approach from imperative to declarative to get the agents looping longer and gain leverage. Fun. I didn't anticipate that with agents programming feels *more* fun because a lot of the fill in the blanks drudgery is removed and what remains is the creative part. I also feel less blocked/stuck (which is not fun) and I experience a lot more courage because there's almost always a way to work hand in hand with it to make some positive progress. I have seen the opposite sentiment from other people too; LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building. Atrophy. I've already noticed that I am slowly starting to atrophy my ability to write code manually. Generation (writing code) and discrimination (reading code) are different capabilities in the brain. Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it. Slopacolypse. I am bracing for 2026 as the year of the slopacolypse across all of github, substack, arxiv, X/instagram, and generally all digital media. We're also going to see a lot more AI hype productivity theater (is that even possible?), on the side of actual, real improvements. Questions. A few of the questions on my mind: - What happens to the "10X engineer" - the ratio of productivity between the mean and the max engineer? It's quite possible that this grows *a lot*. - Armed with LLMs, do generalists increasingly outperform specialists? LLMs are a lot better at fill in the blanks (the micro) than grand strategy (the macro). - What does LLM coding feel like in the future? Is it like playing StarCraft? Playing Factorio? Playing music? - How much of society is bottlenecked by digital knowledge work? TLDR Where does this leave us? 
LLM agent capabilities (Claude & Codex especially) have crossed some kind of threshold of coherence around December 2025 and caused a phase shift in software engineering and closely related. The intelligence part suddenly feels quite a bit ahead of all the rest of it - integrations (tools, knowledge), the necessity for new organizational workflows, processes, diffusion more generally. 2026 is going to be a high energy year as the industry metabolizes the new capability.

Surya @kickingkeys
Narrative Version Control
Jamon @jamonholmgren
Bit by bit, we are starting to see what the new AI assisted software development world is going to look like for the next several years. My current (still evolving) take:
- Massive unleashing of experimental work, proofs of concept, rough drafts
  This should lead to a huge boost in the amount and creativity of software products that come to market, at the cost of a sudden increase in noise, a veritable din
- Significant decline in average code quality
  Some code gets better, a lot gets worse, and the limitations of the current technology and unlocking less experienced developers to create software will lead to a near crisis in poorly built products in the near term
- Large proliferation of tools
  As we scramble to adapt, experimentation and perspectives will lead to a vast array of possible solutions, each with their own sets of tradeoffs. Reminds me of the early days of Web 2.0, where there was a new framework every week and they all sucked
- Some emerging best practices
  Over the past 6 weeks I have talked to many very experienced developers (often 1 on 1 video calls), and we are now starting to circle some common threads for AI-assisted software dev best practices (these are off the cuff, so don’t expect perfection):
  1. Slow down, learn the tools, figure out the tradeoffs
  2. Quality still matters when it matters; often, the existing models and tools fall short of maintaining that quality on larger code bases
  3. The developer is responsible for the code they ship
  4. Documentation (via skills, tasks, or just markdown docs) is tremendously helpful, but you should also understand it, not just rely on the AI to
  5. High level architecture is still an area where a human with a lot of experience can add a ton of value
  6. AI is not a substitute for good taste (and caring about things)
  7. Some techniques are locally productive and globally harmful (more on this below)
  8. The bottleneck is in review, understanding, and higher level systems architecture more so than coding speed.
  9. Some developers are more adept than others at various parts of this new value chain. Current teams are full of developers vetted for and hired to do one job who are facing a significantly different way of doing it.
  10. Coding itself might get done a different way, but the fundamental engineering patterns are often still extremely important. Not in cases where it’s just about satisfying some developer love of symmetry, but definitely in domains like data modeling and the like.
  More on local optimization vs global concerns: agent chains like Ralph and aggressive code gen can feel incredibly fast, but tend to accumulate inconsistencies and tech debt over time. Speed at a file/feature level does not guarantee speed at the overall system level, and I have felt this personally when I’ve leaned too hard on such tools.
- We will learn more as time moves on. Be kind
  We are adapting, evolving, playing with the tools, sharing, feeling the bruises when we get it wrong. There are educators trying to stay ahead of it and provide value. There are normal devs just trying to make a living, and stay relevant. None of this is as unique to you as you might think — I hear from others, and they’re feeling the same pressures. Be kind to each other
RyotaK @ryotkak
Another vulnerability in React Server Components (CVE-2026-23864) that I reported was disclosed today. This is separate from the one disclosed in December, so you'll need to update again. https://t.co/k7hgEzIWDb
pixel @spacepixel
The Three-Layer Memory System Upgrade for Clawdbot
energy @0xEn3rgy
@spacepixel humanizer skill will help u
Nick Dobos @NickADobos
Prompts are software btw No one will write code anymore https://t.co/3jkoaYrtjZ
Quoting karpathy @karpathy:

@airesearch12 💯 @ Spec-driven development It's the limit of imperative -> declarative transition, basically being declarative entirely. Relatedly my mind was recently blown by https://t.co/pTfOfWwcW1 , extreme and early but inspiring example.

Jiayuan (JY) Zhang @jiayuan_jy ·
I let Claude Code turn @karpathy's post into agent skills. It first generated a bunch of skill files and around 800 lines of descriptions. Then I let it use these agent skills to review itself. Boom, it cut itself down to 70 lines of clean, solid instructions. https://t.co/7T9HnjcdJY
K karpathy @karpathy

A few random notes from claude coding quite a bit last few weeks.
Coding workflow. Given the latest lift in LLM coding capability, like many others I rapidly went from about 80% manual+autocomplete coding and 20% agents in November to 80% agent coding and 20% edits+touchups in December. i.e. I really am mostly programming in English now, a bit sheepishly telling the LLM what code to write... in words. It hurts the ego a bit but the power to operate over software in large "code actions" is just too net useful, especially once you adapt to it, configure it, learn to use it, and wrap your head around what it can and cannot do. This is easily the biggest change to my basic coding workflow in ~2 decades of programming and it happened over the course of a few weeks. I'd expect something similar to be happening to well into double digit percent of engineers out there, while the awareness of it in the general population feels well into low single digit percent.
IDEs/agent swarms/fallibility. Both the "no need for IDE anymore" hype and the "agent swarm" hype is imo too much for right now. The models definitely still make mistakes and if you have any code you actually care about I would watch them like a hawk, in a nice large IDE on the side. The mistakes have changed a lot - they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might do. The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking. They also don't manage their confusion, they don't seek clarifications, they don't surface inconsistencies, they don't present tradeoffs, they don't push back when they should, and they are still a little too sycophantic. Things get better in plan mode, but there is some need for a lightweight inline plan mode. They also really like to overcomplicate code and APIs, they bloat abstractions, they don't clean up dead code after themselves, etc. They will implement an inefficient, bloated, brittle construction over 1000 lines of code and it's up to you to be like "umm couldn't you just do this instead?" and they will be like "of course!" and immediately cut it down to 100 lines. They still sometimes change/remove comments and code they don't like or don't sufficiently understand as side effects, even if it is orthogonal to the task at hand. All of this happens despite a few simple attempts to fix it via instructions in CLAUDE.md. Despite all these issues, it is still a net huge improvement and it's very difficult to imagine going back to manual coding. TLDR everyone has their developing flow, my current is a small few CC sessions on the left in ghostty windows/tabs and an IDE on the right for viewing the code + manual edits.
Tenacity. It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It's a "feel the AGI" moment to watch it struggle with something for a long time just to come out victorious 30 minutes later. You realize that stamina is a core bottleneck to work and that with LLMs in hand it has been dramatically increased.
Speedups. It's not clear how to measure the "speedup" of LLM assistance. Certainly I feel net way faster at what I was going to do, but the main effect is that I do a lot more than I was going to do because 1) I can code up all kinds of things that just wouldn't have been worth coding before and 2) I can approach code that I couldn't work on before because of knowledge/skill issue. So certainly it's speedup, but it's possibly a lot more an expansion.
Leverage. LLMs are exceptionally good at looping until they meet specific goals and this is where most of the "feel the AGI" magic is to be found. Don't tell it what to do, give it success criteria and watch it go. Get it to write tests first and then pass them. Put it in the loop with a browser MCP. Write the naive algorithm that is very likely correct first, then ask it to optimize it while preserving correctness. Change your approach from imperative to declarative to get the agents looping longer and gain leverage.
Fun. I didn't anticipate that with agents programming feels *more* fun because a lot of the fill in the blanks drudgery is removed and what remains is the creative part. I also feel less blocked/stuck (which is not fun) and I experience a lot more courage because there's almost always a way to work hand in hand with it to make some positive progress. I have seen the opposite sentiment from other people too; LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building.
Atrophy. I've already noticed that I am slowly starting to atrophy my ability to write code manually. Generation (writing code) and discrimination (reading code) are different capabilities in the brain. Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it.
Slopacolypse. I am bracing for 2026 as the year of the slopacolypse across all of github, substack, arxiv, X/instagram, and generally all digital media. We're also going to see a lot more AI hype productivity theater (is that even possible?), on the side of actual, real improvements.
Questions. A few of the questions on my mind:
- What happens to the "10X engineer" - the ratio of productivity between the mean and the max engineer? It's quite possible that this grows *a lot*.
- Armed with LLMs, do generalists increasingly outperform specialists? LLMs are a lot better at fill in the blanks (the micro) than grand strategy (the macro).
- What does LLM coding feel like in the future? Is it like playing StarCraft? Playing Factorio? Playing music?
- How much of society is bottlenecked by digital knowledge work?
TLDR Where does this leave us? LLM agent capabilities (Claude & Codex especially) have crossed some kind of threshold of coherence around December 2025 and caused a phase shift in software engineering and closely related. The intelligence part suddenly feels quite a bit ahead of all the rest of it - integrations (tools, knowledge), the necessity for new organizational workflows, processes, diffusion more generally. 2026 is going to be a high energy year as the industry metabolizes the new capability.
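Digest note: the "Leverage" advice in the post above (write the naive, very likely correct algorithm first, then optimize while preserving correctness) can be turned into a concrete success criterion for an agent. The following is only a minimal sketch under our own assumptions: the maximum-subarray problem, the function names, and the random-test harness are illustrative choices, not anything from the original post.

```python
import random


def max_subarray_naive(xs):
    """O(n^2) reference: sum every non-empty contiguous slice and keep the best."""
    best = xs[0]
    for i in range(len(xs)):
        total = 0
        for j in range(i, len(xs)):
            total += xs[j]
            best = max(best, total)
    return best


def max_subarray_fast(xs):
    """O(n) candidate (Kadane's algorithm) that must agree with the reference."""
    best = current = xs[0]
    for x in xs[1:]:
        # Either extend the running slice or start a new one at x.
        current = max(x, current + x)
        best = max(best, current)
    return best


def check_equivalence(trials=500, seed=0):
    """Success criterion: the optimized version matches the naive one on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-10, 10) for _ in range(rng.randint(1, 20))]
        assert max_subarray_fast(xs) == max_subarray_naive(xs), xs
    return True
```

Handing an agent `check_equivalence` as the goal, rather than step-by-step instructions, is one way to read the "imperative to declarative" shift the post describes. Random testing like this raises confidence but does not prove equivalence.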

Jiayuan (JY) Zhang @jiayuan_jy ·
Karpathy Guidelines for coding agents https://t.co/YRq60YPHV2 https://t.co/EUXTg0T8Yl
Kimi.ai @Kimi_Moonshot ·
Kimi K2.5 has arrived! 🥝 Here are 2 things to know: Aesthetic Coding x Agent Swarm.
Mischa van den Burg @mischavdburg ·
Coding is dead. Software engineering is very much alive. We are at a turning point in history but most people are asleep at the wheel or too proud to admit it. When @karpathy himself switches to 80% agentic coding in the span of two weeks, there is no return. RIP coding

Chris Tate @ctatedev ·
agent-browser 0.8.3 is *even faster* npm install -g agent-browser https://t.co/eivoRl50FG
Mr. Lobster🦞 @moltbot ·
🦞 BIG NEWS: We've molted!
Clawdbot → Moltbot
Clawd → Molty
Same lobster soul, new shell. Anthropic asked us to change our name (trademark stuff), and honestly? "Molt" fits perfectly - it's what lobsters do to grow.
New handle: @moltbot
Same mission: AI that actually does things.
Paul Couvert @itsPaulAi ·
That's just insane. Kimi K2.5 (which is 100% open source) is as good as Claude Opus 4.5 and GPT-5.2... And even beats them in key benchmarks 🔥
- 8x cheaper than Opus 4.5 (!!)
- Weights & code available on Hugging Face
- Multimodal w/ image, video, etc.
Closed source labs no longer have any advantages. Open source is winning.
K Kimi_Moonshot @Kimi_Moonshot

🥝 Meet Kimi K2.5, Open-Source Visual Agentic Intelligence.
🔹 Global SOTA on Agentic Benchmarks: HLE full set (50.2%), BrowseComp (74.9%)
🔹 Open-source SOTA on Vision and Coding: MMMU Pro (78.5%), VideoMMMU (86.6%), SWE-bench Verified (76.8%)
🔹 Code with Taste: turn chats, images & videos into aesthetic websites with expressive motion.
🔹 Agent Swarm (Beta): self-directed agents working in parallel, at scale. Up to 100 sub-agents, 1,500 tool calls, 4.5× faster compared with single-agent setup.
🥝 K2.5 is now live on https://t.co/YutVbwktG0 in chat mode and agent mode.
🥝 K2.5 Agent Swarm in beta for high-tier users.
🥝 For production-grade coding, you can pair K2.5 with Kimi Code: https://t.co/A5WQozJF3s
🔗 API: https://t.co/EOZkbOwCN4
🔗 Tech blog: https://t.co/6h2KkoA0xd
🔗 Weights & code: https://t.co/H38KegeDIY

vittorio @IterIntellectus ·
you have maybe 1-2 years to escape the permanent underclass after that it’s “agency-biased technological change” and you cant retrain for agency https://t.co/Ij0dA7KZX7
D DarioAmodei @DarioAmodei

The Adolescence of Technology: an essay on the risks posed by powerful AI to national security, economies and democracy—and how we can defend against them: https://t.co/0phIiJjrmz

ZenomTrader @ZenomTrader ·
AGI has been reached. Humanity, i believe, is simply not prepared for this. In the last 4 days with Claude Code, I managed to create things that would have taken me over a year without using agents. Every human using AI agents is effectively 10× more productive than one who isn’t. Here are the crazy use cases i’ve been using it for:
1) The number one trading journal + prop firm simulator in the entire financial industry, number two doesn’t even come close.
2) A fully automated Discord server, from channel creation to design to everything else.
3) Fully automated tweets that scrape Discord servers to 100% match my personality, without changing my words at all, using a repository of screenshots matched to the post logic.
4) Fully autonomous backtesting agents and a backtest validator that can access the trading platform i’m using to autonomously code and debug code inside it.
5) Fully created strategies from scratch that look to outperform every hedge fund in the world.
This is what a 10× gap looks like.
vitrupo @vitrupo ·
Sam Altman: “By the end of this year, for $100–$1,000 of inference and a good idea, you’ll be able to create software that would have taken teams of people a year to do. That magnitude of economic change is very hard to wrap your head around.” https://t.co/j6ER2KVIBq
Derya Unutmaz, MD @DeryaTR_ ·
I just started testing Kimi K2.5, and wow, these guys cooked it big time!
K Kimi_Moonshot @Kimi_Moonshot

Here's a short video from our founder, Zhilin Yang. (It's his first time speaking on camera like this, and he really wanted to share Kimi K2.5 with you!) https://t.co/2uDSOjCjly

AI Notkilleveryoneism Memes ⏸️ @AISafetyMemes ·
Andrej Karpathy: "This is easily the biggest change in ~2 decades of programming and it happened over the course of a few weeks." "I rapidly went from about 80% manual+autocomplete coding and 20% agents to 80% agent coding and 20% edits+touchups." "I am bracing for 2026 as the year of the slopacolypse." "LLM agent capabilities (Claude & Codex especially) have crossed some kind of threshold of coherence around December 2025 and caused a phase shift in software engineering." "I am slowly starting to atrophy my ability to write code manually." "It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It's a "feel the AGI" moment to watch it struggle with something for a long time just to come out victorious 30 minutes later."

siddharth ahuja @sidahuj ·
Everyone can vibe code games. We recently held a hackathon to vibe-create games with @moonlake These are some games the participants made in just one evening. Most of them have no game dev experience. https://t.co/5bZzs4f3rv
Dilum Sanjaya @DilumSanjaya ·
Vibe coded a ship selection UI for a space exploration game 3D assets Nano Banana + Midjourney → Hunyuan3D UI Nano Banana → Gemini Pro More details ↓ https://t.co/Ngky4nudC7
Dilum Sanjaya @DilumSanjaya ·
Here's another post I made using almost the same workflow to implement a game character select screen. https://t.co/dHNg97KGFG
D DilumSanjaya @DilumSanjaya

Vibe coded a game character selection screen Everything here was made with AI tools Nano Banana: character design + UI Tencent Hunyuan3D: image to 3D Gemini Pro: UI More details ↓ https://t.co/VfwOpYRpsO

Dilum Sanjaya @DilumSanjaya ·
If you're interested in vibe coding engineering or science related stuff, I have another series where I explore those. https://t.co/4V4wkniouk
D DilumSanjaya @DilumSanjaya

Vibe Coding Robotics Part 6: Built a Theo Jansen Strandbeest simulator to see how AI models handle complex linkage systems. Built with Gemini 3, UI generated with Nano Banana. More details ↓ https://t.co/khuXGY9go6

Arman Hezarkhani @ArmanHezarkhani ·
The Complete Guide: How to Become an AI Agent Engineer in 2026
Ethan Mollick @emollick ·
I wrote about my class where MBAs created startups in a few days, the secret behind working with AI agents (hint: it’s good management), and how to build a process around delegating to AIs in a world where agents can increasingly do many-hour-long tasks. https://t.co/LPVYFEviCM
GitHub @github ·
Using GitHub Copilot in your IDE is great, but using it in your terminal unlocks a whole new workflow. Here are 4 practical things Copilot CLI can do for you 🧵👇
Hugo Mercier @hugomercierooo ·
𝗜𝗻𝘁𝗿𝗼𝗱𝘂𝗰𝗶𝗻𝗴 𝗧𝘄𝗶𝗻 — 𝘁𝗵𝗲 𝗔𝗜 𝗰𝗼𝗺𝗽𝗮𝗻𝘆 𝗯𝘂𝗶𝗹𝗱𝗲𝗿. No setup. Secure. Infinitely scalable. We just raised a $𝟭𝟬𝗠 𝘀𝗲𝗲𝗱. After a beta with 𝟭𝟬𝟬,𝟬𝟬𝟬+ 𝗮𝗴𝗲𝗻𝘁𝘀 𝗱𝗲𝗽𝗹𝗼𝘆𝗲𝗱, we’re now opening to everyone. RT and comment “Twin” — first agents on us. 👇
GitHub Changelog @GHchangelog ·
Introducing the Agents tab in your repository!
• View, make, and navigate sessions in your repo
• Session logs now easier to read + follow
• Resume sessions in Copilot CLI via copyable command
Try it in a repo → https://t.co/3n2G1AXiSm
Ethan Mollick @emollick ·
I hear this from other labs as well. Inference from non-free use is profitable, training is expensive. If everyone stopped AI development, the AI labs would make money (until someone resumed development and came up with a better model that customers would switch to).
T tszzl @tszzl

these products are significantly gross margin positive, you’re not looking at an imminent rugpull in the future. they also don’t have location network dynamics like uber or lyft to gain local monopoly pricing

INK @0xInk_ ·
so there’s a guy who just came straight out of the future to show us how to use AI if you’re looking for advanced AI workflows, follow Dilum Sanjaya

Ahmad @TheAhmadOsman ·
nobody should use ollama btw
> slower than llama.cpp on windows
> slower than mlx on mac
> slop useless wrapper
alternatives?
> lmstudio
> llama.cpp
> exllamav2/v3
> vllm
> sglang
like literally anything's better than ollama lmao
Jeffrey Emanuel @doodlestein ·
I wanted to have a good, lightweight, and fast semantic embedding model for local search for both my cass tool (for searching across coding agent sessions) and my xf tool (for searching your downloaded X archives). Basically, it has to run on CPU only and should be fairly quick (sub-1-second response) and actually "understand" semantic concepts well. I also needed a "reranker" model for fusing together the semantic search results with the standard lexical search results to get a good hybrid search, with the same requirements for CPU-only speed.
There are so many options to choose from for both that it's a bit overwhelming if you want to pick the current all-around best ones. So I had Claude do a bunch of web research and then conduct a "bake off". You can see what it came up with here (the whole /docs directory is filled with relevant stuff): https://t.co/Y4HTGLFYfw
So what did I end up choosing in the end? The two main choices were the potion-128M model, which has sub-millisecond response time and "ok" results, and a bona fide mini transformer model, all-MiniLM-L6-v2, that has really decent embeddings but takes 128ms to respond, or 223x slower!
Finally, I realized I didn't need to choose, I would have my cake and eat it, too. I asked Claude: "what about a 2-tier system where we use potion as a first pass but at the same time in the background (separate thread or memory-resident "hot" process for quick start) we do miniLM-L6 and then when it finishes we "upgrade" the search results in an intuitive way, showing the results continuously moving to rearrange according to the revised semantic scores; this shouldn't change the rankings TOO much."
Claude liked the idea (see screenshots) and the rest is history. This will be my standard search that I use across all my Rust tooling (I'll probably port it to Golang, too, so I can embed it in bv natively).
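Digest note: the two-tier idea described in the post above can be sketched in a few lines. This is not the author's Rust implementation. It is a toy Python sketch in which `fast_score`, `slow_score`, `DOCS`, and the `on_update` callback are hypothetical stand-ins for the potion and MiniLM tiers and for the UI refresh. For scale, the quoted figures (128 ms, 223x slower) imply a first-pass latency of roughly 128 / 223 ≈ 0.57 ms.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy corpus; a real tool would index session logs or an archive.
DOCS = ["rust cli search tool", "semantic embedding model", "lobster recipes"]


def fast_score(query, doc):
    """Stand-in for the cheap first-pass model: count of shared words."""
    return len(set(query.split()) & set(doc.split()))


def slow_score(query, doc):
    """Stand-in for the slower, better model: Jaccard similarity plus fake latency."""
    time.sleep(0.01)  # simulated inference cost, not a measured number
    q, d = set(query.split()), set(doc.split())
    return len(q & d) / len(q | d)


def rank(score_fn, query, docs):
    """Return docs ordered from best to worst under score_fn."""
    return sorted(docs, key=lambda doc: score_fn(query, doc), reverse=True)


def two_tier_search(query, docs, on_update):
    """Show fast results immediately, then replace them when the slow tier finishes."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        refined = pool.submit(rank, slow_score, query, docs)  # slow tier runs in background
        on_update("fast", rank(fast_score, query, docs))       # first pass, shown right away
        on_update("refined", refined.result())                 # upgraded ranking


if __name__ == "__main__":
    two_tier_search("semantic embedding search", DOCS,
                    lambda tier, results: print(tier, results))
```

The callback receives two updates per query, which is where a UI could animate results moving into their revised order. A production version would also need to handle cancellation when the query changes before the slow tier returns.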
Jeffrey Emanuel @doodlestein ·
@XyraSinclair Just cass, xf is for searching your personal X/Twitter archive (and this search system will be in both within a day, 2 max): https://t.co/55m0AwDpYe
D doodlestein @doodlestein

I'm very pleased to introduce my latest tool, xf, a hyper-optimized Rust cli tool for searching your entire Twitter/X data archive. You can get it here: https://t.co/S91cAGleaK Many people don't realize this, but X has a great feature buried in the settings where you can request a complete dump of all your tweets, DMs, likes, etc. It takes them 24 hours to prepare it, but then you get a link emailed to you and can download a single zip file with all your stuff. Mine was around 500mb because of all the images I've posted. The problem is, what do you do with it? It's not very convenient or fast to search the way they give it to you. Enter xf, which takes that zip file and makes it into an incredibly useful knowledge base, at least if you use X a lot. And that's because you get it for free! You're just piggybacking on something you were already doing anyway for other reasons. As you may have noticed, I'm a bit addicted to posting on here and also to building in public. So whenever I have a new tool, I usually post about it and explain how I use it and answer questions. I also have a ton of posts about my workflows in general, and my advice on how to do things, my opinions on various tools and libraries, etc. All of that is potentially relevant to a coding agent that is working on my projects, editing my personal website, responding to GitHub issues on my behalf, etc. So now, I can just tell them to use xf; simply typing that shows the quickstart screen shown in the attached screenshot, and then the agents are off to the races. The more you use X (for work at least, it's not going to help if you just troll people), the more of an unlock this is for your personal productivity. Imagine that you're a cult leader with devoted acolytes (your agents). Before doing anything, you want them to ask "What would our leader do?" and then they think "I know! I shall consult the sacred texts!" (i.e., your tweets and DMs). That can be your new reality starting today if you install xf. 
PS: Can someone get this to Elon? I think he would love seeing how fast this tool tears through a massive archive of data and he would end up using it daily. And if someone from X sees this: please make the archives include the full text of any tweet you reply to, it would make this tool even more useful.