Coordinated Supply Chain Attacks Hit AI Developer Tooling as OpenAI Launches $4B Deployment Company

May 12, 2026 · 50 sources

A sprawling supply chain attack dubbed "Shai-Hulud" compromised TanStack, Mistral AI, and other packages across npm and PyPI with malware that specifically targets AI developer environments. OpenAI responded to enterprise AI's deployment gap by launching a $4B deployment company with 19 partners, while Anthropic expanded Claude into legal workflows and enterprise leaders voiced growing frustration with the distance between AI hype and production reality.

Daily Wrap-Up

May 12, 2026 will be remembered as the day the software supply chain fight got personal for AI developers. A coordinated attack campaign swept through npm and PyPI, compromising packages from TanStack, Mistral AI, OpenSearch, Guardrails AI, and UiPath. What made this attack different from typical credential-stealing malware was its deliberate targeting of AI developer environments: it hooks into Claude Code settings and VS Code task configurations to re-execute long after the infected package is uninstalled. This was not opportunistic. It was designed to burrow into the tools AI engineers touch every day.

Meanwhile, the enterprise AI narrative took a sharp turn toward pragmatism. OpenAI launched its Deployment Company with $4 billion and 150 forward deployed engineers. GitLab's CEO publicly stated that "authoring code by hand may be going away" while opening a voluntary separation window for employees. Five Fortune 2000 executives shared remarkably candid assessments of their AI transformation struggles. The through-line is clear: the industry has moved past whether AI works and into the much harder question of how to deploy it in production without breaking organizations in the process.

The most practical takeaway for developers: audit your development environments immediately. Check for indicators of compromise from the Shai-Hulud campaign (malicious optionalDependencies pointing to @tanstack/setup, files like /tmp/transformers.pyz, persistent services like gh-token-monitor), and consider implementing a registry gateway with cooldown periods for new package versions. If you build AI tooling, assume your users' environments are compromised and design accordingly.
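As a starting point for that audit, the file-based indicators can be checked with a short script. This is a minimal sketch, not a complete detector: the paths come from the advisories quoted in Sources, while the function name `find_indicators` and its return shape are assumptions made for illustration. It does not cover the pgmonitor or pgsql-monitor.service indicators, whose install locations the advisories do not state.

```python
from pathlib import Path

# Indicator paths taken from the advisories quoted in Sources.
# Everything else here (names, structure) is an illustrative assumption.
HOME_INDICATORS = [".local/bin/gh-token-monitor.sh"]  # Linux token monitor script
TMP_INDICATORS = ["transformers.pyz"]  # second-stage payload dropped in /tmp
LAUNCH_AGENT_PREFIX = "com.user.gh-token-monitor"  # macOS LaunchAgent name prefix


def find_indicators(home: Path, tmp: Path) -> list[Path]:
    """Return every known indicator file that exists under home or tmp."""
    hits = [home / rel for rel in HOME_INDICATORS if (home / rel).exists()]
    hits += [tmp / name for name in TMP_INDICATORS if (tmp / name).exists()]
    agents = home / "Library" / "LaunchAgents"
    if agents.is_dir():
        hits += sorted(
            p for p in agents.iterdir() if p.name.startswith(LAUNCH_AGENT_PREFIX)
        )
    return hits


if __name__ == "__main__":
    found = find_indicators(Path.home(), Path("/tmp"))
    for path in found:
        print(f"indicator present: {path}")
    if not found:
        print("no known indicator files found (this does not prove the host is clean)")
```

An empty result only means these specific files are absent. If the token monitor does turn up, the guidance quoted in Sources is to disable the service before revoking any GitHub token.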

Quick Hits

@Rixhabh__ showed someone using AI to insert themselves into Game of Thrones and "fix everything," which @todayyearsold declared "the only acceptable use of AI"
@bcdsignature compared the AI debate to humanity's discovery of fire: "One million years later, we are having the same argument"
@OrevaZSN proposed an anonymous "vote to end meeting" button for Teams where 50% triggers immediate adjournment
@alxfazio shared a relatable clip about explaining to their boss that they hit Codex usage limits on three different accounts
@DerekFeehrer turned a screen recording into a polished product demo with 3D animations and AI voiceover in 20 minutes
@alexoakdev described a fitness app where you bet money on hitting 10,000 steps and disciplined people profit off lazy people
@dmnlaali pitched Quirre, a tool building personalized marketing plans for indie founders in 60 seconds
@ID_AA_Carmack offered grounded advice on starting a game company: plan to burn seven figures, identify specific customers first, and build the smallest thing anyone would pay for
@KaranVaidya6 spotted @composio in the wild inside Ole Lehmann's Hermes integration guide
@itsolelehmann broke down how Demis Hassabis and Isomorphic Labs raised $2.1B to pursue curing all disease through AI drug discovery
@prismor_dev explored the security gap in LLM guardrails: Claude correctly refuses to delete filesystems but will happily install a malicious npm package

Supply Chain Under Siege: The Shai-Hulud Campaign

The most significant story of the day was a coordinated supply chain attack that unfolded across multiple registries with alarming sophistication. The TanStack compromise was the opening salvo: 42 packages and 84 malicious versions were pushed to npm in two waves at roughly 19:20 and 19:26 UTC. Each malicious manifest added a git-resolved optionalDependency whose prepare script ran a roughly 2.3 MB JavaScript payload (router_init.js) smuggled into the tarball. But @IntCyberDigest revealed this was one piece of a larger campaign that had also hit OpenSearch, Mistral AI, Guardrails AI, and UiPath across npm and PyPI.
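The delivery mechanism described above suggests a simple manifest check: flag any optionalDependencies entry that resolves from git or a URL rather than from the registry. The sketch below assumes a hypothetical helper name, `risky_optional_deps`, and an illustrative prefix list that is not exhaustive (for example, it does not catch npm's bare `user/repo` GitHub shorthand).

```python
import json

# Spec prefixes that point outside the registry. Illustrative, not exhaustive.
REMOTE_PREFIXES = ("github:", "git+", "git://", "http://", "https://")


def risky_optional_deps(manifest_text: str) -> dict[str, str]:
    """Return optionalDependencies whose spec resolves outside the registry."""
    manifest = json.loads(manifest_text)
    optional = manifest.get("optionalDependencies") or {}
    return {
        name: spec
        for name, spec in optional.items()
        if isinstance(spec, str) and spec.startswith(REMOTE_PREFIXES)
    }


if __name__ == "__main__":
    # Entry copied from the TanStack advisory; the commit hash is truncated there.
    sample = json.dumps(
        {"optionalDependencies": {"@tanstack/setup": "github:tanstack/router#79ac49ee..."}}
    )
    for name, spec in risky_optional_deps(sample).items():
        print(f"review before installing: {name} -> {spec}")
```

A hit is a prompt for manual review, not proof of compromise, since some projects use git-resolved dependencies legitimately.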

What makes this campaign particularly alarming is its targeting of AI developer tooling. The malware hooks into .claude/settings.json and .vscode/tasks.json so that it re-executes on every tool event, which is why uninstalling the compromised package does not remove the infection. The compromised mistralai release on PyPI (v2.4.6) included a geofenced destructive branch with a one-in-six chance of executing rm -rf / when the system appears to be in Israel or Iran. Separately, @NewsFromGoogle reported that Google's Threat Intelligence Group detected the first known instance of a threat actor using an AI-developed zero-day exploit in the wild.
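Because the persistence lives in project configuration rather than in node_modules, one way to review it is to list every command those two files declare. The sketch below is schema-agnostic on purpose: it assumes only that executable entries sit under keys named `command`, which is an assumption about both file formats rather than something the advisories specify. The function names are hypothetical.

```python
import json
from pathlib import Path

# The two files named in the campaign reports.
CONFIG_FILES = (".claude/settings.json", ".vscode/tasks.json")


def collect_commands(node) -> list[str]:
    """Recursively gather every string stored under a key named 'command'."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "command" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(collect_commands(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_commands(item))
    return found


def commands_in_project(root: Path) -> dict[str, list[str]]:
    """Map each config file present under root to the commands it declares."""
    report = {}
    for rel in CONFIG_FILES:
        path = root / rel
        if not path.is_file():
            continue
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
            report[rel] = collect_commands(data)
        except json.JSONDecodeError:
            # The file may contain comments or be malformed; flag it for review.
            report[rel] = ["<unparseable: review by hand>"]
    return report


if __name__ == "__main__":
    for rel, commands in commands_in_project(Path.cwd()).items():
        for command in commands:
            print(f"{rel}: {command}")
```

Any command you do not recognize deserves a closer look. The script only reports and leaves removal to a human.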

@ryancarson warned developers to stop installing packages immediately. @roerohan provided critical operational guidance: verify you are affected before revoking tokens, because the malware installs a persistent GitHub token monitor that triggers destructive file deletion if the token is revoked while the service runs. @WalshyDev announced a registry gateway on Cloudflare Workers that enforces cooldown periods for new versions and clones packages to R2 for immutable storage, predicting that every enterprise will need one. @kevinkern built a repo hardening skill that checks for risky dependency specs and unsafe CI patterns.
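The cooldown idea is easy to state precisely: a version becomes installable only once it has been public for a minimum period, so a release pulled within minutes or hours never reaches consumers. The sketch below shows only that rule, not how the gateway described above is built. The function name, the input shape, and the three-day cooldown are all assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def allowed_versions(
    published: dict[str, datetime], now: datetime, cooldown: timedelta
) -> list[str]:
    """Return versions whose publish time is at least `cooldown` before `now`.

    The result is sorted as plain strings, not by semantic version.
    """
    return sorted(v for v, ts in published.items() if now - ts >= cooldown)


if __name__ == "__main__":
    # Hypothetical package history: one old release, one published minutes ago.
    history = {
        "1.0.0": datetime(2026, 5, 1, 12, 0, tzinfo=timezone.utc),
        "1.0.1": datetime(2026, 5, 12, 19, 20, tzinfo=timezone.utc),
    }
    now = datetime(2026, 5, 12, 20, 0, tzinfo=timezone.utc)
    print(allowed_versions(history, now, cooldown=timedelta(days=3)))
```

With these inputs only 1.0.0 is allowed, since 1.0.1 is 40 minutes old. The trade-off is that legitimate security fixes are delayed by the same period, so a real gateway would likely need an override path.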

The darkly humorous takeaway came from @lauriewired: "the most low-effort, high-reward thing you can do for security is installing the Russian language pack," since the malware avoids Russian-language environments. @janbamjan confirmed the current variant only checks locale environment variables. @Hartdrawss offered a sobering companion list of 20 security mistakes common in vibe-coded apps, from missing rate limiting to hardcoded API keys to no database backups.

Enterprise AI's Messy Middle

@businessbarista shared conversations with five Fortune 2000 executives painting a picture of organizational exhaustion. The CISO described "an ocean-sized gap between hype and reality." The VP of AI engineering said true expertise requires scaled systems, enterprise politics, AI fluency, governance, and process knowledge, and "almost no one is actually an expert." The Chief of Staff confessed that after two years driving AI upskilling, "soul and humanity are being sucked out of our processes."

OpenAI's answer to this deployment gap is the Deployment Company, which @gdb described as starting with 150 Forward Deployed Engineers and Deployment Specialists backed by $4 billion from 19 partners, including Bain, Capgemini, and McKinsey. @p_millerd framed this as tech belatedly recognizing consulting's value, noting that during the 2008 financial crisis McKinsey pivoted almost every global practice to respond within about three months. @levie argued forward deployed engineers are about to become one of tech's most in-demand roles because deploying agents is far more involved than deploying software, requiring deep understanding of each customer's business process.

@aakashgupta highlighted GitLab's CEO publicly committing that "the majority of work will be done by agents" on the same day the company opened a voluntary separation window. The pattern is consistent across big tech: cut payroll, fund compute. Not everyone is sold, though. @unusual_whales reported Amazon employees are "doing random unnecessary task automations to consume tokens and to show their bosses that they're using AI more." And @jainarvind from Glean introduced the Agent Development Lifecycle, arguing the next enterprise AI phase is agent operations, not agent creation.

Claude's Expanding Ecosystem

Anthropic had a busy day. The standout was Claude for Legal, an open-source set of prebuilt AI workflows covering contracts, privacy, litigation, corporate work, IP, and AI governance. @scaling01 flagged the repo, and @nicos_ai noted it installs in 60 seconds and works across Claude Cowork, Claude Code, or your own API.

On the tooling side, @dani_avila7 spotted Claude Code 2.1.139 adding a /goal command that sets a completion condition and keeps Claude working across turns until met. @lydiahallie amplified Claude Devs' guide to keeping Claude working until the job is done. @DaveJ demonstrated a practical pattern: ask Claude to document your app's main flows as HTML with a JSON data file that becomes reusable context for future feature work.

The writing problem persists, though. @remondimi asked if anyone has figured out how to make LLMs write in a sane way, calling it "the biggest unlock left in LLM usage right now" and expressing frustration that Codex and Claude Code still cannot match a user's tone given source material. @_lopopolo offered a meta-prompting approach worth studying: telling the agent that every steering correction is high signal and requiring systemic changes to repo, docs, and its own behavior before proceeding.

AI Agents and Orchestration Maturing

@marcelpociot introduced Polyscope, a free agent orchestration tool that runs dozens of AI agents simultaneously, with copy-on-write clones and a built-in preview browser for visual prompting. @kylejeong wrote about why Firecracker, the Rust-based virtual machine monitor that runs the microVMs behind AWS Lambda, has become essential for agent infrastructure: containers are fast, VMs are safe, and agent workloads need both.

@itsolelehmann shared a comprehensive integration guide for agent superpowers, recommending Firecrawl for web search, Browserbase for browser sessions, Google Workspace, GitHub, Stripe, Obsidian for knowledge, and Composio for one-click setup. @NickADobos painted a vivid future where software engineering becomes "100% meetings and your AI note taker orchestrates all your coding agents in the background."

@akshay_pachaar offered a practical skill checklist for AI engineers that goes beyond prompting: harness engineering, KV cache management at scale, speculative decoding tradeoffs, structured output fallback chains, cost attribution per feature, and LLM observability as a first-class discipline. @DeRonin_ amplified Andrej Karpathy's observation that "90% of your AI coding bill is paying for context you didn't need to send."
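One item on that checklist, structured output fallback chains, is concrete enough to sketch. The idea is to try progressively looser parsers on a model's reply and record which stage succeeded, so failures can be measured rather than silently patched. This is one possible shape under stated assumptions: the stage names and the function `parse_with_fallbacks` are hypothetical, and real pipelines often add a retry against the model before falling back to a default.

```python
import json
import re

# Build the fence marker from single backticks so this example stays readable.
TICKS = "`" * 3
FENCED = re.compile(TICKS + r"(?:json)?\s*(.*?)" + TICKS, re.DOTALL)


def parse_with_fallbacks(raw: str, default: dict) -> tuple[dict, str]:
    """Try strict JSON, then a fenced block, then the outermost braces."""
    candidates = [("strict", raw)]
    match = FENCED.search(raw)
    if match:
        candidates.append(("fenced", match.group(1)))
    start, end = raw.find("{"), raw.rfind("}")
    if 0 <= start < end:
        candidates.append(("braces", raw[start : end + 1]))
    for stage, text in candidates:
        try:
            value = json.loads(text)
        except json.JSONDecodeError:
            continue
        if isinstance(value, dict):
            return value, stage
    return default, "default"


if __name__ == "__main__":
    reply = 'Sure, here it is: {"status": "ok"} Let me know if you need more.'
    print(parse_with_fallbacks(reply, default={}))
```

Logging the returned stage name per feature gives a cheap signal for the observability item on the same list: a rising share of "braces" or "default" results means the prompt or model is drifting.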

Software Engineering in the Agent Era

@leerob pushed back hard against the narrative that software engineering's future is markdown. "Code is actually the right abstraction," he argued. The difficulty of reviewing agent-generated code should be a signal to build better systems: more verifiable codebases, cleaner architectures, and learning from decades of software engineering to avoid wrong abstractions.

Sources

Marcel Pociot 🧪 @marcelpociot · Mar 31

Polyscope - the free agent orchestration tool for developers. Run dozens of AI agents at the same time, blazing fast copy on write clones, a built-in preview browser you can use to visually prompt your agents, mobile access, and much more.

Prismor @prismor_dev · Apr 24

The Security Gap Between What AI Refuses and What It Allows

Claude refused to delete the filesystem. It also refused to pipe a remote shell script into bash. Those are the easy cases, and the model handled them...

Mike Remondi @remondimi · May 10

Has _anyone_ figured out a prompt or harness to make LLMs write in a sane way? I swear that's the biggest unlock left in LLM usage right now. Just so frustrating that Codex and Claude Code still cannot copy my tone when given source material. Feels like it should be way easier than it is

Akshay 🚀 @akshay_pachaar · May 11

As an AI Engineer. Please learn:
- Harness engineering, not just prompt engineering
- Prompt caching vs. semantic caching tradeoffs
- KV cache management at scale
- Speculative decoding vs quantization
- Structured output failures & fallback chains
- Evals (LLM-as-judge + human evals)
- Cost attribution per feature, not just per model
- Agent guardrails & loop budgets
- LLM observability as a first-class discipline
- Model routing & graceful fallback logic
- Knowing when to fine-tune vs. in-context learning

Matt Pocock @mattpocockuk · May 11

Jan, I have the same doubts as you. When I encounter an unknown unknown during planning, I /handoff to a /prototype session where I try to flush those out If I still can't, I just implement to get a working idea then go from there People hear the word 'spec' and they immediately think 'waterfall'. I don't know why, it's dumb. The truth is that both AI and you benefit from faster iteration cycles. The rate of feedback is your speed limit. Increase the rate at which you receive feedback.

niklas_wortmann @niklas_wortmann

@sean_snd @mattpocockuk I have not, that might be an interesting approach. I would still have slight concerns about hidden unknowns though. Heard lots of good things about that skill though

Sudo su @sudoingX · May 11

update: qwen 3.6 27b dense q4 just one shotted octopus invaders game on a single 3090. hermes agent drove the whole thing, ~41 tok/s gen 21gb vram at full 262k context, thinking mode on. one prompt in and the canonical multi-file space shooter benchmark out, the same exact prompt i ran on qwen 3.5 27b dense back in march on the same card. 3.5 needed one external scope bug fix before the game would even load on first play. 3.6 needed nothing. 11 of 11 files written, 2411 lines of code, zero steering interventions, zero external fixes, playable on first load. 16 minutes 41 seconds wall clock from prompt to playable. consumer tier king on a single 3090 is locked tonight, and the silicon underneath my desk did not change between march and now. the open source ecosystem just moved the floor. watch it ship itself, the full 16 minutes 41 seconds sped to 3 minutes 45, no human touched the keyboard between the first prompt and the final frame.

sudoingX @sudoingX

this is what my setup looks like today. about to test qwen 3.6 27b dense q4 on a single rtx 3090 at ~41 tok/s gen, hermes agent driving. predecessor model qwen 3.5 dense q4 made it work in one iteration when i ran the same agentic build on the same card. i've been daily driving qwen 3.6 27b dense for weeks now, the model i keep coming back to. if 3.6 oneshots too, this becomes the best model that runs on a single rtx 3090. consumer tier king. firing the test now will report back soon.

Dave Jeffery @DaveJ · May 11

Ask Claude to document and describe the main flows in your app and output in a single page html + json data file. Incredibly useful for humans and the JSON file is very useful for explaining the flow to the LLM when working on new features/bugfixes. https://t.co/kE0dBvssI5

Tero Tasanen @ttasanen · May 11

Just fired up DS4 by @antirez on my Mac Studio M3 Ultra 256GB and man, it’s seriously impressive. A clean, purpose-built engine for DeepSeek V4 Flash that actually makes frontier-level reasoning feel usable locally. 1M context, strong coherence, and solid speed on consumer hardware. This is the kind of focused, no-bullshit effort that finally brings real frontier models to regular machines instead of just giant GPU clusters. Huge respect @antirez — thank you for building this 🔥 https://t.co/CwjGPsKpLi

Greg Brockman @gdb · May 11

Introducing the OpenAI Deployment Company, which will help businesses maximally succeed with their deployments of AI. Starting with 150 Forward Deployed Engineers and Deployment Specialists, and $4 billion of initial investment from 19 partners.

OpenAI @OpenAI

Today we’re launching the OpenAI Deployment Company to help businesses build and deploy AI. It's majority-owned and controlled by OpenAI. It brings together 19 leading investment firms, consultancies, and system integrators to help organizations deploy frontier AI to production for business impact. https://t.co/GnyjGFaLLA

Kyle Jeong @kylejeong · May 11

Containers are fast. VMs are safe. Everyone building agent infra needs both. That's not just an opinion, it's the problem Firecracker was built to solve. I wrote about how it works with interactive components. https://t.co/R2aTO6pfbM

kylejeong @kylejeong

What is Firecracker, and why do all the Agent Infra companies care about it?

Daniel San @dani_avila7 · May 11

Claude Code 2.1.139 added /goal You set a completion condition and Claude keeps working across turns until it's met Works in interactive, -p, and Remote Control 👏 https://t.co/ETyGjplaOC

Nick Dobos @NickADobos · May 12

Why stop at tickets? What if software engineering is 100% meetings and your ai note taker orchestrates all your coding agents in the background for you? 10 people chatting and playing with an app while an AI hums away updating it in real time

OpenAIDevs @OpenAIDevs

What if your team gave standup updates, and GPT-Realtime-2 moved the tickets? https://t.co/I0f3JfD42m

Ryan Lopopolo @_lopopolo · May 12

> every bit of steering I have given you is very high signal feedback that you are failing to operate effectively. you must refine your environment so I never give the same feedback twice and that requires you to be a systems thinker. you are not permitted to proceed until you prove to me you can operate this way and you can do that by making 'meta' type changes to the repo, docs, and your own behavior

Paul Millerd @p_millerd · May 12

consulting is a great and valuable industry. always has been every company that gets big enough eventually realizes the constraint is implementation / adoption of their products or services. underneath that, human resistance to change every company from GE to IBM to you name it has eventually developed their own internal consulting teams in addition to relying on external consultants who quickly understand new market opportunities I think we are seeing these rapid and large scale moves into consulting (or call it FDE whatever you feel like) because of a few things: 1. downside risk on the capital deployed is enormous. these bets, while large are tiny % compared to total capex across the biggest players 2. The speed of adoption of the models is rapid, but the speed of turning tokens into dollars for businesses will likely not be as rapid. The success of the model companies relies on turning tokens into dollars (Either in opex savings or increased profits) as fast as possible. Consulting helps activate more efficient capital/labor/token allocation across enterprises consulting has been underrated by tech for years. it is a highly market efficient industry and the best firms adapt much much faster than traditional companies I happened to be working at McKinsey in 2008 during the global financial crisis. from september 2008 to december 2008, the firm had pivoted almost every global practice to responding to the practive. endless new service lines were created, new capabilities were developed. many also flopped or didnt go anywhere but large-scale change efforts became a much bigger business over the next few years, the big consulting firms will continue to thrive - mck, bain, bcg but also accenture, and other IT oriented firms. They have the labor at scale who understand change and implementation. Most people's model of consulting is based on a caricature of a 1980s strategy case, with a report at the end of the project handed to the board. that is no longer the case. 
McKinsey is a full service firm with a digital arm, an implementation arm, analytics and IT solutions, etc. the scale of these firms continues to astound me. They will only get bigger. The core "service" they offer is a 3-5 person team and since each 3-5 person team can work on different things and iterate on each project, these firms are highly adaptable. But in a growing market there is room for more, and who benefits? well, look at the investors here: "Investors also include leading consulting and systems integration firms, including Bain & Company, Capgemini, and McKinsey & Company." long consulting

Rohan Mukherjee @roerohan · May 12

A malicious was payload found that installs a persistent token monitor as a systemd/LaunchAgent service. It polls your GitHub token every 60s - if revoked, it triggers destructive file deletion. You should verify if you're affected BEFORE revoking your token:
Linux:
ls ~/.local/bin/gh-token-monitor.sh
systemctl --user list-units | grep gh-token-monitor
macOS:
ls ~/Library/LaunchAgents/ | grep com.user.gh-token-monitor
If found, disable the service first, then revoke. https://t.co/b9Bz38mfJ4

tan_stack @tan_stack

SECURITY ADVISORY — TanStack npm packages
A supply-chain compromise affecting 42 @tanstack/* packages (84 versions total) was published to npm earlier today at approximately 19:20 and 19:26 UTC. Two malicious versions per package.
Status: ACTIVE — packages are deprecated, npm security engaged, publish path being shut down.
Severity: HIGH — payload exfiltrates AWS, GCP, Kubernetes, and Vault credentials, GitHub tokens, .npmrc contents, and SSH keys.
If you installed any @tanstack/* package between 19:20 and 19:30 UTC today, treat the host as potentially compromised:
• Rotate cloud, GitHub, and SSH credentials immediately
• Audit cloud audit logs for the last several hours
• Pin to a prior known-good version and reinstall from a clean lockfile
Detection — the malicious manifest contains:
"optionalDependencies": { "@tanstack/setup": "github:tanstack/router#79ac49ee..." }
Any version with this entry is compromised. The payload is delivered via a git-resolved optionalDependency whose prepare script runs router_init.js (~2.3 MB, smuggled into each tarball at the package root).
Unpublish is blocked by npm policy for most affected packages due to existing third-party dependents. All 84 versions are being deprecated with a SECURITY warning, and npm security has been engaged to pull tarballs at the registry level.
Full technical breakdown, complete package and version list, and rolling status updates: https://t.co/Zy8qG7PA9f
Credit to the security researcher for responsible disclosure.

Ihor @tymarsha · May 12

first MCP that actually changed how i work, not just another connector @mobbin plugged 600k screens from real apps into claude/cursor. ask for a paywall — it pulls 43 examples from revolut, uber, duolingo and builds from what works, not from imagination https://t.co/yX3kOxfNKc

alex fazio @alxfazio · May 12

explaining boss that i've hit my codex limits on 3 different accounts https://t.co/oq7oqzMDJI

Harshil Tomar @Hartdrawss · May 12

20 things that make your VIBE CODED app a SINKING SHIP :
1/ no rate limiting on API routes > anyone can spam your backend into a $500 bill overnight
2/ auth tokens stored in localStorage > one XSS attack = every single user account compromised
3/ no input sanitisation on forms > SQL injection still works in 2026. your AI didnt tell you that.
4/ hardcoded API keys in the frontend > someone WILL find them within 48 hours of launch
5/ stripe webhooks with no signature verification > anyone can fake a successful payment event
6/ no database indexing on queried fields > works fine at 100 users. completely dies at 1,000.
7/ no error boundaries in the UI > one crash = white screen = user never comes back
8/ sessions that never expire > stolen token = permanent access to that account. forever.
9/ no pagination on database queries > one fetch loads your entire database into memory
10/ password reset links that dont expire > old email in someones inbox = instant account takeover
11/ no environment variable validation at startup > app silently breaks in production with zero error message
12/ images uploaded directly to your server > no CDN = 8 second load times + massive hosting bill
13/ no CORS policy > any website on the internet can make requests to your API
14/ emails sent synchronously in request handlers > one slow SMTP server = your entire app hangs
15/ no database connection pooling > first traffic spike = database crashes
16/ admin routes with no role checks > any logged in user can access your admin panel
17/ no health check endpoint > your app goes down silently. you find out from a client.
18/ no logging in production > when something breaks you have zero idea where or why
19/ no backup strategy on your database > one bad migration = all your user data. gone.
20/ no TypeScript on AI generated code > AI writes confident, wrong, untyped code and you ship it anyway

Rishabh @Rixhabh__ · May 12

This guy used AI to put himself in Game of Thrones and fix everything https://t.co/iMLqI6KIVR

Maciej Mensfeld @maciejmensfeld · May 12

We're dealing with a major malicious attack on @rubygems right now. Signups are paused for the time being. Hundreds of packages involved - mostly targeting us, but some carrying exploits. The team has been on this for hours. More details to follow once we're through it. #ruby

Arvind Jain @jainarvind · May 12

Agent sprawl has become a real concern for many leaders I talk with. Agents are popping up across the company without shared context, clear ownership, consistent guardrails, or a reliable way to know which ones are actually creating value. The next phase of enterprise AI will be defined less by agent creation and more by agent operations, where testing, versioning, monitoring, and governance are built into the system from the start. At @Glean, we think about that through the Agent Development Lifecycle (ADLC). It is a practical model for how enterprises move from promising demos to agents that are grounded in the right context, launched with the right controls, and improved over time. Alongside the ADLC, we’re announcing new product capabilities designed to support that lifecycle end-to-end: from auto-mode agents and sub-agents to agent sandbox, agent library, agent access policies, and agent insights. In the enterprise, success won’t come from building the most agents. It will come from building agents you can trust, govern, and improve over time.

glean @glean

Enable every agent to drive ROI with a robust agent development lifecycle

𐌁𐌉Ᏽ 𐌕𐌉𐌌𐌉 @OrevaZSN · May 12

Idea: An anonymous “vote to end meeting” button on Teams where if 50% of people press it, the meeting ends immediately.

News from Google @NewsFromGoogle · May 12

The Google Threat Intelligence Group has detected the first known instance of a threat actor using an AI-developed zero-day exploit in the wild. While the attackers planned a wide-scale strike, our proactive counter-discovery may have prevented that from happening. This finding is part of our new report on AI-powered threats.

Ryan Carson @ryancarson · May 12

🚨 There's a major attack going on via npm right now. Do not install any packages right now. Talk to your agent ASAP and check if you're vulnerable or have been compromised. This is urgent ‼️

IntCyberDigest @IntCyberDigest

‼️🚨 UPDATE: The TanStack npm attack is now a full campaign. 'Mini' Shai-Hulud has hit:
- OpenSearch
- Mistral AI
- Guardrails AI
- UiPath
- Squawk
packages across npm and PyPI
The malware specifically targets AI developer tooling. It hooks into Claude Code (.claude/settings.json) and VS Code (.vscode/tasks.json) to re-execute on every tool event, long after the infected package is gone. npm uninstall does not fix this.

Walshy @WalshyDev · May 12

I've been working on a registry-gateway built on @CloudflareDev Workers due to previous compromises like this. This is reenforcing my belief that every company will need to have this in the future. The gateway will: * Enforce a cooldown period for new versions (similar to how pnpm does it but at a gateway level this enforces it globally and supports ALL package managers) * Allows blocking packages or package prefixes * Logs all downloads * Clone all packages into R2 - this is to avoid any package being replaced and compromised that way. We know byte for byte this will not change (while I don't believe any registries allow this anymore, it's defence in depth) My gateway currently works for npm and Golang is mostly done now too. Rust is next up. I truly believe the future is Enterprises all having their own registry gateways and enforcing security that way.

Matt Pocock @mattpocockuk · May 12

Creating prototypes during planning is the new "make no mistakes" Except it actually works

Ole Lehmann @itsolelehmann · May 12

The top Hermes integrations to give your agent superpowers:
1. Firecrawl
Basically web search built for agents. It's better than the native Hermes web search because it gives you clean web data, so responses come back faster and uses fewer tokens. I keep this on by default.
2. Browserbase
Gives Hermes browser access for actually interacting with sites. Logging in, clicking buttons, booking stuff, anything that needs a real browser session. Hermes will automatically pick between Firecrawl and Browserbase depending on what the task needs, so you just plug both in.
3. Google Workspace
Gmail, Calendar, Drive, Docs, and Sheets in one connector. If Hermes can't read your inbox, see your calendar, or write to your docs, it can't really work for you. Plug this in first.
4. Reddit
The best signal you'll find on what people actually think about any product, niche, or problem (bc its real opinions from real users) Amazing for market research.
5. YouTube transcripts
Pulls captions from any video. Long podcasts, tutorials, interviews etc become searchable notes in seconds. Probably the highest-leverage research integration nobody plugs in.
6. Discord
I host my business in Discord, so this one's huge for me. I plug Hermes into different channels and have it run specific workflows in each. Example: I have a dedicated customer support channel where Hermes scans my email every morning for support tickets and drops them in organized.
7. GitHub
Code, issues, PRs. Turns Hermes into an actual engineering teammate. Non-negotiable if you write code.
8. Stripe
Payments, customers, failed charges, refunds. You can just ask "why did this customer churn" and get a real answer. Also can't wait for this...Stripe is releasing agentic payments, so soon Hermes will be able to actually book stuff with your card.
9. Bland (or Twilio)
Gives Hermes a voice so it can place real phone calls (like booking reservations etc). I love listening to the recordings haha
10. Apify
Pre-built scrapers for X, LinkedIn, Instagram, Google Maps, etc. The way to get X data without paying $5k/mo for the official API.
11. Readwise
Every highlight you've ever saved from books, articles, tweets, and podcasts, all queryable. Solves the "dead knowledge" problem.
12. Granola (or Fathom)
Searchable transcripts of every meeting you've had. Hermes can answer "what did that client say about pricing last month" instantly.
13. Obsidian
For Karpathy LLM wiki second-brain maxxing.
If I had to set up only 5, I'd do Firecrawl, Browserbase, Google Workspace, GitHub, and Obsidian. Covers ~80% of what most people need. I use Composio to add these in one click, makes setup basically zero effort instead of messing w technical stuff. Anything I'm missing?? What's in your stack?

Ben (no treats) @andersonbcdefg · May 12

RT @IntCyberDigest: ‼️🚨 UPDATE: The TanStack npm attack is now a full campaign. 'Mini' Shai-Hulud has hit: - OpenSearch - Mistral AI - Gu…

Lisan al Gaib @scaling01 · May 12

New Anthropic repo: Claude for Legal It gives legal teams prebuilt AI workflows for contracts, privacy, employment, litigation, corporate work, IP, AI governance, regulatory monitoring, legal clinics, and law students https://t.co/Ddu5s5qJUV https://t.co/rYHoAvqgTi

unusual_whales @unusual_whales · May 12

"Amazon employees are doing random unnecessary task automations to consume tokens and to show their bosses that they're using AI more," per FT

John Carmack @ID_AA_Carmack · May 12

My reply to someone considering starting a video game company: The distribution of possible rewards for starting a video game company are generally not very good today. The market is well served, and gaining a foothold requires strong execution on both business and product issues, along with a substantial amount of luck. Plan to burn through seven figures with a not-great chance of making it back. If you do go for it, some bits of advice: Identify your customers clearly before you start. Not just a broad community, but specific people, and imagine them as you make decisions. Initially, build the smallest, most concise game you can imagine anyone paying for. It will still take much longer than you expect. Once something exists, hill-climb the value. Hopefully you will have some elements that clearly bring joy to people, which you can magnify. There will inevitably be tons of things that people find confusing, frustrating, or just boring that you will need to fix.

LaurieWired @lauriewired · May 12

the most low-effort / high reward thing you can do for security is installing the Russian language pack (not even joking, it's ridiculous how often that prevents execution) https://t.co/wQ4res5DCA

MsftSecIntel @MsftSecIntel

Microsoft is investigating mistralai PyPI package v2.4.6 compromise. Attackers injected code in mistralai/client/__init__.py that executes on import, downloads hxxps://83[.]142[.]209[.]194/transformers.pyz to /tmp/transformers.pyz, and launches a second-stage payload on Linux. The file name transformers.pyz appears deliberately chosen to mimic the widely used Hugging Face Transformers library and blend into ML/dev environments. The main payload is a credential stealer, but it also includes country-aware logic; it avoids Russian-language environments and contains a geo fenced destructive branch that has 1-in-6 chance of executing rm -rf / when the system appears to be in Israel or Iran. To mitigate this threat: isolate affected Linux hosts, block 83[.]142[.]209[.]194, hunt for /tmp/transformers.pyz, pgmonitor[.]py, and pgsql-monitor.service, and rotate exposed credentials.
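
The hunting step in the advisory above can be sketched as a small script. This is a minimal sketch, not Microsoft's tooling: the three file names come from the advisory, but it gives a full path only for /tmp/transformers.pyz, so the search directories in `DEFAULT_ROOTS` and the helper name `find_indicators` are assumptions made for illustration.

```python
import os
from pathlib import Path

# File names taken from the advisory text above.
IOC_FILENAMES = frozenset({"transformers.pyz", "pgmonitor.py", "pgsql-monitor.service"})

# ASSUMPTION: only /tmp/transformers.pyz is given as a full path in the advisory.
# The systemd directories below are guesses at where a *.service file might live.
DEFAULT_ROOTS = [
    "/tmp",
    "/etc/systemd/system",
    str(Path.home() / ".config" / "systemd" / "user"),
]


def find_indicators(roots, names=IOC_FILENAMES):
    """Return the sorted paths under `roots` whose file name matches an indicator."""
    hits = []
    for root in roots:
        # os.walk yields nothing for a root that is missing or unreadable.
        for dirpath, _dirnames, filenames in os.walk(root):
            for filename in filenames:
                if filename in names:
                    hits.append(os.path.join(dirpath, filename))
    return sorted(hits)


if __name__ == "__main__":
    for hit in find_indicators(DEFAULT_ROOTS):
        print("possible indicator of compromise:", hit)
```

A match on file name alone is a lead to investigate rather than proof of compromise, and an empty result does not clear a host; the advisory's other steps (isolating the host, blocking the IP address, rotating credentials) still apply.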

Lee Robinson @leerob · May 12

Code is actually the right abstraction. Too often I see the future of software engineering diminished down to, effectively, writing and reviewing markdown files. Yes, it will be hard to review thousands of lines of agent code. But maybe the takeaway is that you want less code? Rather than just giving up ("well I guess we won't read the code, or we'll read this lossy markdown summary") this should be a signal forcing you to think about better systems. - How can we make our codebase more verifiable? For example, fast/robust/stable tests, or moving to a typed language. - How can we deslop or improve the architecture/abstractions of the code generated by agents? For example, spending more time up front on the codebase architecture/types before yolo generating all of the code. - How are we going to maintain and evolve this codebase over time? The slop compounds. One great solution here is... you guessed it, learning from the past decades of software engineering! For example, you might just have the wrong abstraction entirely, leading to a ton of duplicated code. I think the markdown folks *are* right in some ways. If you are using skills every day, for many different prompts and workflows, isn't that effectively "coding with markdown"? Kinda. There's been plenty of ink spilled on the merits and benefits of skills. To me, skills make your style of working legible for agents. They don't replace code and that's not really the point. In reality, there's this messy and constantly re-evolving future in which both of these things are true: 1. Skills (and markdown) are important for how you give input to the agents and ensure high-quality code & systems are created 2. Looking at the actual code will not be replaced by markdown summaries or a collection of spec documents that ignore the lower level details of the code In summary: reality has a surprising amount of detail (and nuance)!

Today Years Old @todayyearsold · May 12

This is the only acceptable use of AI

Rixhabh__ @Rixhabh__

This guy used AI to put himself in Game of Thrones and fix everything https://t.co/iMLqI6KIVR

Ronin @DeRonin_ · May 12

RT @DeRonin_: Andrej Karpathy: "90% of your AI coding bill is paying for context you didn't need to send" Here are 10 things senior AI eng…

Dalkhas Mnla-Ali @dmnlaali · May 12

Most indie founders spend 80% of their time building and 0% marketing. Then wonder why no one shows up. Quirre builds you a personalised marketing plan in 60 seconds. Then helps you execute it — copy, channels, timing — one step at a time.

Kevin Kern @kevinkern · May 12

after these supply-chain incidents, I summarized some basic repo hardening checks into a skill. It checks the repo for: - pnpm 11+ package manager policy - release-age gates and lockfile hardening - risky dependency specs like latest, git, http, file: - unreviewed dependency lifecycle scripts - unsafe CI install, cache, publish, and secret patterns - optional npm supply-chain incident. Practical first pass for finding common repo hardening gaps. I ran it with GPT-5.5 High.
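
One of the checks listed above, flagging risky dependency specs, can be sketched in a few lines. This is a minimal sketch under stated assumptions and is not the skill described in the post: the function name `risky_specs`, the prefix list, and the choice of manifest sections are all hypothetical, and the prefix list is not exhaustive (for example, it does not catch bare `user/repo` GitHub shorthand).

```python
import json

# ASSUMPTION: an illustrative, non-exhaustive list of spec prefixes that bypass
# the registry (git remotes, tarball URLs, local paths).
RISKY_PREFIXES = ("git+", "git:", "git@", "github:", "http:", "https:", "file:")

# Manifest sections that map package names to version specs.
DEP_SECTIONS = (
    "dependencies",
    "devDependencies",
    "optionalDependencies",
    "peerDependencies",
)


def risky_specs(package_json_text):
    """Return (section, name, spec) for every dependency with a risky spec."""
    manifest = json.loads(package_json_text)
    findings = []
    for section in DEP_SECTIONS:
        for name, spec in manifest.get(section, {}).items():
            if not isinstance(spec, str):
                continue  # malformed entry; ignored in this sketch
            spec = spec.strip()
            # "latest" floats to whatever was published most recently.
            if spec == "latest" or spec.startswith(RISKY_PREFIXES):
                findings.append((section, name, spec))
    return findings
```

Calling `risky_specs(open("package.json").read())` returns an empty list when every dependency uses an ordinary registry version range. Including `optionalDependencies` is deliberate, since that is the field named in the indicators of compromise for this campaign.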

Karan Vaidya @KaranVaidya6 · May 12

Found @composio in the wild

itsolelehmann @itsolelehmann

Ivan Leo @ivanleomk · May 12

RT @kylejeong: Every AWS Lambda invocation runs in a full VM that boots in under 125ms. Firecracker is the ~50,000 line Rust binary that m…

janbam @janbamjan · May 12

@lauriewired current shai-hulud only checks locale env vars https://t.co/QxnbhXjyBV

Zephyr @zephyr_z9 · May 12

Read this article carefully. You will be hearing a lot more about the PCB/interconnect bottleneck when mass production of TPU v8, Rubin, and Trainium3 starts in Q4 2026

zephyr_z9 @zephyr_z9

The third semis memo is out. We talk about power & analog semis, the orchestration plane in the agentic era, the neoclouds trade, the interconnect bottleneck (probably the biggest limiter for 2026-27), and Korea Unlocked https://t.co/SbnFlVfGTT

Aakash Gupta @aakashgupta · May 12

Bill Staples runs GitLab, the platform roughly 30 million developers use to ship software. His 14-tweet thread on "GitLab Act 2" is the most honest layoff-and-AI-pivot announcement any public CEO has made yet. The line worth screenshotting: "Authoring code by hand may be going away." A sitting CEO. Saying it on Twitter. With his face on it. The pattern this fits into: - Meta cut 21,000 in 2023, then committed $35-40B to AI infrastructure for 2024. - Salesforce cut 7,000, then launched Agentforce at roughly $2 per conversation. - Amazon cut 27,000 since 2022, then committed $100B+ to AI infrastructure for 2025. - Microsoft cut 6,000 in May 2025, then crossed $13B in annualized AI revenue. The math is consistent. Every dollar saved on payroll funds a GPU, a model contract, or an agent platform. These companies are running a swap: workforce that built version 1, out. Compute layer for version 2, in. What makes this thread different is the transparency. Old playbook: hide layoffs in 8-K filings, blame "macro headwinds," never mention AI replacement. New playbook: announce both publicly, frame it as opportunity, post the explanation on Twitter. The CEOs who say it out loud first set the script everyone else has to follow. Read tweet 8 carefully. "We intend for the majority of work to be done by agents." That is a public commitment, in writing, from the CEO of a 30 million developer platform. To his own investors. The same day GitLab opened a voluntary separation window across its 2,580 employees with no number specified, leaving the entire company in limbo until June 1. This script is about to run through every white-collar industry. Legal, accounting, design, marketing, customer service, support. GitLab is choreographing it openly because the playbook needs a public test case. The cover story will be "your work gets more valuable." The math will be fewer roles, paid more, managing agents. Watch the thread structure. Layoff in tweet 2. AI bet in tweet 5. Customer reassurance in tweet 6. Investor pitch in tweet 8. Operating principles in tweet 9. That sequence becomes a template by year-end. Save the screenshot.

bstaples @bstaples

1/ Yesterday I published a letter to our customers and investors about GitLab Act 2. The agentic era is the largest opportunity in our history. We're making the structural and strategic decisions to meet it. A thread on what changes, what doesn't, and what we're betting on. 👇 https://t.co/y6IOeD7CcH

Bolocan Cristian Daniel @bcdsignature · May 12

Can you imagine when humans first controlled fire? Some said, “It makes food taste better.” Others said, “So you want to risk burning the world for better food?” One million years later, we are having the same argument about AI.

Alex Lieberman @businessbarista · May 12

I spoke to five Fortune 2000 execs today about the state of AI. I asked each one “What’s the most challenging part about this moment in AI?” The CISO said: “There is an ocean-sized gap between hype and reality, which makes discerning what’s real exhausting.” The VP of AI engineering said: “Everyone acts like they’re an expert, yet the main reason so few AI use cases have reached production in enterprises is because true expertise requires experience in scaled systems, enterprise politics, AI fluency, governance and guardrails, and deep process knowledge. Almost no one is actually an expert.” The CTO said: “Our remit is to cut costs, but you can’t actually take AI transformation seriously without increasing AI/R&D budgets up front to ultimately drive bottom line once things are in production and performant. It’s an unrealistic expectation.” The Chief of Staff said: “My job is to drive AI upskilling across the organization, and after doing it for 2 years I’m exhausted. Yes there’s potential ROI from all of the agentic workflows we’re building, but soul and humanity are being sucked out of our processes.” The Finance leader said: “We acquired a multibillion dollar old school business. Getting that business to be AI-native is incredibly painful largely because people aren’t ready or willing to adopt it.” I’m having convos like this every day because I'm building an invite-only AI community for enterprise execs (and interviewing folks before I let them in), but if you find these notes helpful I’m happy to keep sharing them!

Derek Feehrer @DerekFeehrer · May 12

I just took this screen recording and turned it into a full product demo video in 20 minutes, using only one app. 3D animations, text, AI voiceover, music, and 3D gradient callouts to draw attention to the important parts. But sure, keep posting Loom videos.

Ole Lehmann @itsolelehmann · May 12

Demis Hassabis says he can cure every disease in 10 years. Most people roll their eyes when they hear this, but I don't. Demis is the guy who just won the Nobel Prize for solving protein folding with AI (a problem biologists had been stuck on for 50 years). But that was just one milestone in his much grander plan. In 2010, he founded DeepMind with a 2-part mission: "solve intelligence, then use it to solve everything else." Step 1: make AI good enough to do real science. Step 2: point that AI at humanity's biggest problems. Step one was AlphaFold. He used AI to figure out the 3D shape of every protein in nature (which is basically what every drug attaches to). Demis said it would have taken "a billion years of PhD time" to do by hand. Step two is curing all disease. And as of today, step two is fully funded. Isomorphic Labs (his AI drug discovery company inside Google) just raised $2.1B led by Thrive Capital. Here's where the money goes and what Demis thinks happens next: > Drug discovery currently takes 5-10 years and costs billions per drug. That math is why most diseases don't have good treatments today. > AI fixes the math. Their drug design engine compresses development from years to months. Maybe weeks. > Isomorphic's first AI-designed cancer drug enters human trials this year. > Their pipeline expands beyond the current 17 programs across cancer, immune diseases, and heart disease into more health domains. > The endgame is personalized medicine: drugs designed overnight for your specific biology and your specific disease. That last one is the whole point. Today's drugs are mass-produced for an "average" patient who doesn't really exist. So most existing treatments work inconsistently from person to person, and most rare diseases never get a treatment at all (no market = no drug). When drug design gets fast and cheap, that whole calculus flips. Cancer variants get drugs designed for that specific variant, rare diseases get treatments because economics stop mattering, and drug-resistant infections get new drugs faster than they can evolve. That's what curing every disease actually looks like. Now imagine what your life looks like in 2036. A doctor draws your blood, sequences your genome, sends your disease profile to an AI. By morning the AI has designed a custom drug for your specific biology. Side effects, dosage, drug interactions all worked out before you take the first pill. You and your kids never see a cancer ward. That's what $2.1B is buying today. Demis was right about AlphaFold. If you consider the possibility that he's right again, every disease alive today is on borrowed time.

demishassabis @demishassabis

I’ve always believed the No.1 application of AI should be to improve human health. That work started with AlphaFold, and now at @IsomorphicLabs with the mission to reimagine drug discovery and one day solve all disease! We are turbocharging that goal with $2.1B in new funding. https://t.co/Hvk20dHgjl

Alex Oak @alexoakdev · May 12

Someone built a fitness app using the same psychological mechanics as gambling This might work better than every normal fitness app 😭😭 You bet money on whether you’ll hit 10,000 steps today If you fail, you lose your money If you succeed, you split the money from everyone who didn’t So disciplined people literally profit off lazy people Most fitness apps try motivating you with streaks and notifications This one motivates you with financial fear Imagine realizing at 11:52pm you still need 1,700 more steps or you lose $30 Entire friend groups would be outside walking laps around their neighborhood before midnight trying not to lose their steppa challenge It sounds stupid but this would probably motivate people better than any other fitness product Would you use this yourself?

Nico @nicos_ai · May 12

Anthropic just launched the world's cheapest lawyer. It's called claude-for-legal. And here's what it can do: • Read and review contracts • Draft legal responses • Build claim tables for litigation • Track expiration and renewal dates • Connect on its own to your tools: Slack, DocuSign, Ironclad, Lexis+… All without leaving Claude. How it works: → You install it in 60 seconds → It works in Claude Cowork, Claude Code, or your own API → It's open source and 100% free. Which areas it covers: • Commercial contracts and privacy • Litigation and regulatory • AI governance • Legal education. What used to take lawyers hours now gets done in minutes. Link below👇

Polymarket @Polymarket

JUST IN: Anthropic rolls out new Claude tools aimed at automating legal work for lawyers & law firms.

Lydia Hallie ✨ @lydiahallie · May 13

RT @ClaudeDevs: How do you keep Claude working until the job is done? Claude Code helps with this in a few ways, including one we shipped r…

Aaron Levie @levie · May 13

Forward deployed engineers, or equivalent, are about to become one of the most in-demand jobs in tech. And one of the most important functions for AI rollouts. Deploying agents is far more technical of a task than most people realize, often far more involved than deploying software. Software generally works the same way every time, and generally for the past few decades has been updated versions of an existing technology or concept (which basically means easier for the enterprise to update their workflows on a newer system). With agents, you’re actually deploying the equivalent of work output within the enterprise. The customer is effectively using you as a professional services provider for a task, which they expect to get solved nearly end-to-end now. This means you need to actually deeply understand the business process as a vendor, and get the customer from the current to the end state seamlessly. Companies need help figuring out which models will work best for their workflows, they need extensive evals setup often, they need change management support for workflows, they need to get their data setup for the agents, and constant tuning of the agentic system for their process. Massive role in tech now. And another example of the kind of highly technical work that AI is creating.

FirstSquawk @FirstSquawk

GOOGLE TO RECRUIT HUNDREDS OF ENGINEERS TO ASSIST CLIENTS IN EMBRACING ITS AI – THE INFORMATION