AI Learning Digest

AI Learnings - December 30, 2025

Overview

Discussions spanning Claude Code & Workflows, AI Agents & Orchestration, and Models & Capabilities.

Claude Code & Workflows

  • @rahulgs: "yes things are changing fast, but also I see companies (even faang) way behind the frontier for no reason"

AI Agents & Orchestration

  • @lorden_eth: "For those who thought about building AI bots to trade on Polymarket"
  • @camsoft2000: "My global https://t.co/Hcy1DQ68nR file encourages the agent to work on self-improvement when it sees a common pattern or improvement it can make"
  • @mitchellh: "Slop drives me crazy and it feels like 95+% of bug reports, but man, AI code analysis is getting really good"

Models & Capabilities

  • @0xSero: "MiniMax-M2.1 running fully local in AWQ-4Bit with full context window (170 GB VRAM w full context)"
  • @DzambhalaHODL: "Once again, I am pounding the table on using Gemini to analyze your genes"

Other Highlights

  • @AlexReibman: "Simple trick to get Claude to run for 4-5 hours at a time"
  • @orphcorp: "claude, grant me the serenity to accept the things I cannot change, the courage to change the things I can, and the wisdom to know the difference"
  • @BrianRoemmele: "I am using Nash Equilibrium on the attention head of an LLM"

Key Takeaways

1. Claude Code continues to reshape how developers approach coding

2. Agent orchestration patterns are maturing with new tools and frameworks

---

Curated from 9 posts

Source Posts

Lorden @lorden_eth
For those who thought about building AI bots to trade on Polymarket: you should check the official materials on GitHub before doing anything. They've literally told you how to trade autonomously on Polymarket using AI agents. https://t.co/yb3jU29KNT https://t.co/8LqIlHjhqs
Steven Lubka ☀️ @DzambhalaHODL
Once again, I am pounding the table on using Gemini to analyze your genes. Get a basic Ancestry DNA test, opt into their privacy options, and once you get your results, log in and download your "raw DNA file". Ask Gemini to give you the identifiers to search for high-impact genes, then use it to understand your own data and suggest interventions for the ones with a detrimental impact. It's legitimately life-changing.
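The workflow above boils down to looking up specific variant IDs (rsIDs) in a raw DNA export. A minimal sketch of that lookup step, assuming a tab-separated Ancestry-style layout (rsid, chromosome, position, allele1, allele2) with `#` comment lines; the rsIDs and file contents below are illustrative placeholders, not medical guidance:

```python
import csv
import io

def load_genotypes(text):
    """Parse raw DNA export text into {rsid: genotype}."""
    genotypes = {}
    rows = csv.reader(
        (line for line in io.StringIO(text) if not line.startswith("#")),
        delimiter="\t",
    )
    for row in rows:
        if len(row) >= 5:
            rsid, _chrom, _pos, allele1, allele2 = row[:5]
            genotypes[rsid] = allele1 + allele2
    return genotypes

# Two made-up rows standing in for a real export file.
sample = """# AncestryDNA raw data (illustrative)
rs0000001\t1\t12345\tA\tG
rs0000002\t2\t67890\tC\tC
"""
print(load_genotypes(sample)["rs0000001"])  # AG
```

With a dict like this, the remaining work is exactly what the post describes: asking the model which rsIDs are worth checking, then reading off your genotypes for them.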
Mitchell Hashimoto @mitchellh
Slop drives me crazy and it feels like 95+% of bug reports, but man, AI code analysis is getting really good. There are users out there reporting bugs that don't know ANYTHING about our stack, but are great AI drivers and producing some high-quality issue reports.

This person (linked below) was experiencing Ghostty crashes and took it upon themselves to use AI to write a Python script that can decode our crash files, match them up with our dSYM files, and analyze the codebase to attempt to find the root cause, and extracted that into an Agent Skill. They then came into Discord, warned us they don't know Zig at all, don't know macOS dev at all, don't know terminals at all, and that they used AI, but that they thought critically about the issues, believed they were real, and asked if we'd accept them.

I took a look at one, was impressed, and said send them all. This fixed 4 real crashing cases that I was able to manually verify and write a fix for, from someone who -- on paper -- had no fucking clue what they were talking about. And yet, they drove an AI with expert skill.

I want to call out that in addition to driving AI with expert skill, they navigated the terrain with expert skill as well. They didn't just toss slop up on our repo. They came to Discord as a human, reached out as a human, and talked to other humans about what they've done. They were careful and thoughtful about the process.

People like this give me hope for what is possible. But it really, really depends on high-quality people like this. Most today -- to continue the analogy -- are unfortunately driving like a teenager who has only driven toy go-karts. Examples: https://t.co/n8xCcPYSjw
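The core of the reporter's script is symbolication: mapping raw crash addresses back to function names via debug symbols. A toy sketch of that step, assuming a pre-sorted symbol table with invented addresses and names (a real tool would extract these from the dSYM, e.g. via `atos`):

```python
import bisect

# Hypothetical (start_address, symbol_name) pairs, sorted by address.
SYMBOLS = [
    (0x1000, "main"),
    (0x1400, "terminal_read"),
    (0x1900, "render_frame"),
]

def symbolicate(addr):
    """Return 'symbol+offset' for the symbol whose range contains addr."""
    starts = [start for start, _ in SYMBOLS]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return "<unknown>"
    start, name = SYMBOLS[i]
    return f"{name}+0x{addr - start:x}"

print(symbolicate(0x1450))  # terminal_read+0x50
```

Once crash addresses resolve to symbols like this, an agent can cross-reference the named functions against the codebase to hunt for the root cause, which is roughly what the Agent Skill described above automates.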
rahul @rahulgs
yes things are changing fast, but also I see companies (even FAANG) way behind the frontier for no reason. You are guaranteed to lose if you fall behind. The no-unforced-errors AI leader playbook:

For your team:

  • use coding agents. Give all engineers their pick of harnesses, models, and background agents: Claude Code, Cursor, Devin, with closed/open models. Hearing Meta engineers are forced to use Llama 4. Opus 4.5 is the baseline now.
  • give your agents tools to ALL dev tooling: Linear, GitHub, Datadog, Sentry, any internal tooling. If agents are being held back because of lack of context, that's your fault.
  • invest in your codebase-specific agent docs. Stop saying "doesn't do X well". If that's an issue, try better prompting, https://t.co/SOjpn47yxo, linting, and code rules. Tell it how you want things. Every manual edit you make is an opportunity for https://t.co/S1ZvtYQwta improvement.
  • invest in robust background-agent infra: get a full development stack working on VMs/sandboxes. Yes, it's hard to set up, but it will be worth it; your engineers can run multiple agents in parallel. Code review will be the bottleneck soon.
  • figure out security issues. Stop being risk-averse and do what is needed to unblock access to tools.

In your product:

  • always use the latest-generation models in your features (move things off last-gen models ASAP, unless robust evals indicate otherwise). Requires changes every 1-2 weeks. E.g.: GitHub Copilot mobile still offers code review with GPT-4.1 and Sonnet 3.5 @jaredpalmer. You are leaving money on the table by being on Sonnet 4 or GPT-4o.
  • use embedding semantic search instead of fuzzy search. Any general embedding model will do better than Levenshtein / fuzzy heuristics.
  • leave no form unfilled. Use structured outputs and whatever context you have on the user to do a best-effort pre-fill.
  • allow unstructured inputs on all product surfaces -- must accept freeform text and documents. Forms are dead.
  • custom finetuning is dead. Stop wasting time on it. The frontier is moving too fast to invest 8 weeks into finetuning, and costs are dropping too quickly for price to matter. Better prompting will take you very far, and this will only become more true as instruction following improves.
  • build evals to make quick model-upgrade decisions. They don't need to be perfect, but they at least need to allow you to compare models relative to each other. Most decisions become clear on a Pareto cost vs. benchmark-performance plot.
  • encourage all engineers to build with AI: build primitives to call models from all codebases: structured output, semantic-similarity endpoints, sandboxed code execution, etc.

What else am I missing?
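The "embedding search beats Levenshtein" point above can be sketched as ranking documents by cosine similarity of vectors. In production this would call a learned embedding model; here, character-trigram counts stand in purely to keep the sketch self-contained and runnable:

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: character-trigram counts."""
    padded = f"  {text.lower()}  "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)  # Counter returns 0 for missing keys
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query, docs):
    """Return the document most similar to the query."""
    q = embed(query)
    return max(docs, key=lambda d: cosine(q, embed(d)))

docs = ["reset a user password", "export billing data", "rotate API keys"]
print(search("password reset", docs))  # reset a user password
```

Note that the query and its best match share no common prefix, so pure edit-distance heuristics would struggle; vector similarity handles the reordered words naturally, and a real embedding model additionally matches on meaning rather than shared characters.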
0xSero @0xSero
MiniMax-M2.1 running fully local in AWQ-4Bit with full context window (170 GB VRAM w full context):

  • 1000~ to 16,000~ tps prefill
  • 100~ tps generation speeds
  • Opencode

It's doing real work, updating my blog with little steering or specificity. The problem with local LLMs is that they require too much steering, which means babysitting I don't have the time to do. MiniMax cracked the cost, intelligence, and speed challenge; I would say this is a top-tier model. I run frontier models like Gemini and it just fails to call tools, in this year lol...

I think glm-4.?-air is still needed. We need a viable model at each hardware entry point; a Mac M1 Ultra 192GB? is relatively cheap at 5k, and being able to run this model at 40 tps is a huge societal unlock. Smaller models can be good, but size matters :p
orph @orphcorp
claude, grant me the serenity to accept the things I cannot change, the courage to change the things I can, and the wisdom to know the difference. do not make mistakes.
camsoft2000 @camsoft2000
My global https://t.co/Hcy1DQ68nR file encourages the agent to work on self-improvement when it sees a common pattern or an improvement it can make. I allow it to maintain its own section in the file, as well as dump ideas and improvements into a folder on my file system. That way I can just ask a new agent session to read those files and propose changes to the OSS that I maintain. While I'm still exploring this as an idea, I feel like giving agents persisted memory and the ability to change themselves should unlock a superpower.
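The "idea dump" half of this workflow can be sketched as gathering the notes an agent left in a scratch folder so a fresh session can review them. The folder layout and filenames here are assumptions for illustration, not the author's actual setup:

```python
import tempfile
from pathlib import Path

def collect_ideas(folder):
    """Concatenate markdown notes from the ideas folder, oldest first."""
    notes = sorted(Path(folder).glob("*.md"), key=lambda p: p.stat().st_mtime)
    return "\n\n".join(f"## {p.name}\n{p.read_text().strip()}" for p in notes)

# Demo with a throwaway directory standing in for the real ideas folder.
demo = Path(tempfile.mkdtemp())
(demo / "dedupe-helpers.md").write_text("Common pattern: extract shared retry logic.")
print(collect_ideas(demo))
```

A new agent session can then be pointed at the concatenated output (or the folder itself) and asked to turn the accumulated observations into concrete patches, which matches the review-then-propose loop described above.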
Brian Roemmele @BrianRoemmele
BOOM! It works on LLMs! I am using Nash Equilibrium on the attention head of an LLM! I may be the first to do this at this level. I am achieving a 50-70% effective size reduction on a quantization of 4-bit weights, shrinking the model and enabling on-device inference for smaller LLMs, e.g. 70B params! This allows for a nice LLM on high-end phones and low-end laptops. But my goal is an individual LLM module for each motor on robots, connected in a mesh-network nervous system. This would make reaction times and exactness superior to anything we have ever seen. I'll test it when I scrape up enough coffee money: https://t.co/ctXLWrs5Pj More soon!
Alex Reibman 🖇️ @AlexReibman
Simple trick to get Claude to run for 4-5 hours at a time: get it to play Saw. https://t.co/pdbjd8yytp