<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Gradient Ascent]]></title><description><![CDATA[Gradient Ascent is your weekly guide to AI, trusted by Silicon Valley's top tech firms and the best academic labs worldwide. ]]></description><link>https://newsletter.artofsaience.com</link><image><url>https://substackcdn.com/image/fetch/$s_!LKGp!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F01dfb858-3107-4656-b289-cf13de969a17_800x800.png</url><title>Gradient Ascent</title><link>https://newsletter.artofsaience.com</link></image><generator>Substack</generator><lastBuildDate>Sun, 26 Apr 2026 10:28:08 GMT</lastBuildDate><atom:link href="https://newsletter.artofsaience.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Sairam Sundaresan]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[sairam@artofsaience.com]]></webMaster><itunes:owner><itunes:email><![CDATA[sairam@artofsaience.com]]></itunes:email><itunes:name><![CDATA[Sairam Sundaresan]]></itunes:name></itunes:owner><itunes:author><![CDATA[Sairam Sundaresan]]></itunes:author><googleplay:owner><![CDATA[sairam@artofsaience.com]]></googleplay:owner><googleplay:email><![CDATA[sairam@artofsaience.com]]></googleplay:email><googleplay:author><![CDATA[Sairam Sundaresan]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Inside OpenAI's No-Review Codebase, the 98% of Claude Code That Isn't the Model, and BMad's Six-Agent Dev Team - 📚 The Tokenizer Edition #25]]></title><description><![CDATA[This week's most valuable email 
resources]]></description><link>https://newsletter.artofsaience.com/p/inside-openais-no-review-codebase</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/inside-openais-no-review-codebase</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Thu, 23 Apr 2026 12:31:46 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!JSVF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6175e67-0143-45ae-9590-6830c5ee73e8_1344x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! A recurring theme this week: the interesting frontier sat outside the model. VILA Lab took Claude Code apart at the source and found 98.4% of the codebase is operational infrastructure, not AI decision logic. OpenAI&#8217;s Frontier team shipped a 1M-line codebase at a billion tokens a day with zero human-reviewed code. BMad ships a six-agent dev team (analyst, PM, architect, UX, engineer, tech writer) that installs into your IDE with one command. The through-line: scaffolding is where the gains are hiding.</p><h3><strong>New here?</strong></h3><p><em>The Tokenizer is my resource-focused newsletter edition where I curate high-signal AI/ML papers, videos, articles, tools, and learning resources so you don&#8217;t have to sift through the noise. 
Subscribe to <a href="https://newsletter.artofsaience.com">Gradient Ascent</a> for the full experience.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.artofsaience.com/subscribe?"><span>Subscribe now</span></a></p><h2><strong>TL;DR</strong></h2><p>What caught my attention this week:</p><ul><li><p>&#128196; <strong>Papers:</strong> a source-level dissection of Claude Code, a self-evolving agent that argues for information density over context length, ByteDance&#8217;s next-generation video model, Tencent Hunyuan&#8217;s unified 3D world generator, and a principled recipe for on-policy distillation.</p></li><li><p>&#127909; <strong>Videos:</strong> the physics behind Flow Matching, running LLMs locally on DGX Spark, engineering RL environments from scratch, and an anonymous operator conversation on AI budgets and code review.</p></li><li><p>&#128240; <strong>Reads:</strong> Hamel Husain on why eval craft belongs to data scientists, Paul Iusztin on harness engineering as the new OS layer around the LLM, and a transcript from inside OpenAI Frontier where a 1M-LOC codebase ships at a billion tokens per day with no human-reviewed code.</p></li><li><p>&#128736; <strong>Tools:</strong> Microsoft&#8217;s new agent framework with Semantic Kernel and AutoGen migration paths, and a Karpathy-style autoresearch loop for GPU kernels.</p></li><li><p>&#127891; <strong>Learning:</strong> BMAD-METHOD, an open agile framework with six named agents that scaffolds AI-driven teams from brief to deployment.</p></li></ul><div><hr></div><h2><strong>&#128196; 5 Papers</strong></h2><h3><strong>1. 
Dive into Claude Code: The Design Space of Today&#8217;s and Future AI Agent Systems</strong></h3><p><strong><a href="https://arxiv.org/abs/2604.14228">https://arxiv.org/abs/2604.14228</a></strong> | <strong><a href="https://github.com/VILA-Lab/Dive-into-Claude-Code">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MovQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe52b32bf-c86d-458c-a1e8-d03617006aef_793x291.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MovQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe52b32bf-c86d-458c-a1e8-d03617006aef_793x291.png 424w, https://substackcdn.com/image/fetch/$s_!MovQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe52b32bf-c86d-458c-a1e8-d03617006aef_793x291.png 848w, https://substackcdn.com/image/fetch/$s_!MovQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe52b32bf-c86d-458c-a1e8-d03617006aef_793x291.png 1272w, https://substackcdn.com/image/fetch/$s_!MovQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe52b32bf-c86d-458c-a1e8-d03617006aef_793x291.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MovQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe52b32bf-c86d-458c-a1e8-d03617006aef_793x291.png" width="793" height="291" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e52b32bf-c86d-458c-a1e8-d03617006aef_793x291.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:291,&quot;width&quot;:793,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MovQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe52b32bf-c86d-458c-a1e8-d03617006aef_793x291.png 424w, https://substackcdn.com/image/fetch/$s_!MovQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe52b32bf-c86d-458c-a1e8-d03617006aef_793x291.png 848w, https://substackcdn.com/image/fetch/$s_!MovQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe52b32bf-c86d-458c-a1e8-d03617006aef_793x291.png 1272w, https://substackcdn.com/image/fetch/$s_!MovQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe52b32bf-c86d-458c-a1e8-d03617006aef_793x291.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>VILA Lab walked Claude Code&#8217;s TypeScript source (v2.1.88) and found that roughly 1.6% of the codebase is AI decision logic. The other 98.4% is operational infrastructure: permissions, context compaction, hooks, subagent isolation, session persistence. They extract five human values and thirteen design principles from that code, then trace them through seven components and five layers using a single running example (fixing a failing test in auth.test.ts). Read it if you&#8217;re building an agent and want to see how a production harness answers every design question at once.</p><h3><strong>2. 
GenericAgent: A Token-Efficient Self-Evolving LLM Agent via Contextual Information Density Maximization (V1.0)</strong></h3><p><strong><a href="https://arxiv.org/abs/2604.17091">https://arxiv.org/abs/2604.17091</a></strong> | <strong><a href="https://github.com/lsdefine/GenericAgent">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ULxt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9f9678-6437-4cdf-b324-b8f240836fc2_1600x809.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ULxt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9f9678-6437-4cdf-b324-b8f240836fc2_1600x809.png 424w, https://substackcdn.com/image/fetch/$s_!ULxt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9f9678-6437-4cdf-b324-b8f240836fc2_1600x809.png 848w, https://substackcdn.com/image/fetch/$s_!ULxt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9f9678-6437-4cdf-b324-b8f240836fc2_1600x809.png 1272w, https://substackcdn.com/image/fetch/$s_!ULxt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9f9678-6437-4cdf-b324-b8f240836fc2_1600x809.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ULxt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9f9678-6437-4cdf-b324-b8f240836fc2_1600x809.png" width="1456" height="736" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0e9f9678-6437-4cdf-b324-b8f240836fc2_1600x809.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:736,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ULxt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9f9678-6437-4cdf-b324-b8f240836fc2_1600x809.png 424w, https://substackcdn.com/image/fetch/$s_!ULxt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9f9678-6437-4cdf-b324-b8f240836fc2_1600x809.png 848w, https://substackcdn.com/image/fetch/$s_!ULxt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9f9678-6437-4cdf-b324-b8f240836fc2_1600x809.png 1272w, https://substackcdn.com/image/fetch/$s_!ULxt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e9f9678-6437-4cdf-b324-b8f240836fc2_1600x809.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Long-horizon agent performance is bounded by information density inside a finite context window, not by context length itself. GenericAgent builds on four pieces: a minimal atomic tool set, hierarchical on-demand memory, a self-evolution loop that turns verified trajectories into reusable SOPs and code, and a runtime compression layer. The repo reports 188K tokens on long-horizon tasks where Claude Code spends 537K, at equal or better completion rates. Read this before you reach for the next context-length upgrade.</p><h3><strong>3. 
Seedance 2.0: Advancing Video Generation for World Complexity</strong></h3><p><strong><a href="https://arxiv.org/abs/2604.14148">https://arxiv.org/abs/2604.14148</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xnRR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5445b1-b1bd-4486-86b1-27c54e697506_1600x411.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xnRR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5445b1-b1bd-4486-86b1-27c54e697506_1600x411.png 424w, https://substackcdn.com/image/fetch/$s_!xnRR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5445b1-b1bd-4486-86b1-27c54e697506_1600x411.png 848w, https://substackcdn.com/image/fetch/$s_!xnRR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5445b1-b1bd-4486-86b1-27c54e697506_1600x411.png 1272w, https://substackcdn.com/image/fetch/$s_!xnRR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5445b1-b1bd-4486-86b1-27c54e697506_1600x411.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xnRR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5445b1-b1bd-4486-86b1-27c54e697506_1600x411.png" width="1456" height="374" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2a5445b1-b1bd-4486-86b1-27c54e697506_1600x411.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:374,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xnRR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5445b1-b1bd-4486-86b1-27c54e697506_1600x411.png 424w, https://substackcdn.com/image/fetch/$s_!xnRR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5445b1-b1bd-4486-86b1-27c54e697506_1600x411.png 848w, https://substackcdn.com/image/fetch/$s_!xnRR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5445b1-b1bd-4486-86b1-27c54e697506_1600x411.png 1272w, https://substackcdn.com/image/fetch/$s_!xnRR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5445b1-b1bd-4486-86b1-27c54e697506_1600x411.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>ByteDance Seed&#8217;s latest video model takes the #1 spot on both Text-to-Video and Image-to-Video leaderboards on Arena.AI, ahead of veo-3.1-audio-1080p on T2V and grok-imagine-video-720p on I2V. On the team&#8217;s own SeedVideoBench 2.0 set (against Kling, Sora, Veo, Wan, and Vidu), it ranks first in 29 of 30 fine-grained motion quality categories. The architectural bet is unified audio-video joint generation with native multi-modal input (text, image, audio, video) and simultaneous multi-track audio output with binaural synthesis. Pull the report if you want to see the current ceiling in open-access video generation.</p><h3><strong>4. 
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds</strong></h3><p><strong><a href="https://arxiv.org/abs/2604.14268">https://arxiv.org/abs/2604.14268</a></strong> | <strong><a href="https://github.com/Tencent-Hunyuan/HY-World-2.0">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qRsX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c7fcc9c-43ac-4ff0-8f71-9ccc550f1b45_996x369.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qRsX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c7fcc9c-43ac-4ff0-8f71-9ccc550f1b45_996x369.png 424w, https://substackcdn.com/image/fetch/$s_!qRsX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c7fcc9c-43ac-4ff0-8f71-9ccc550f1b45_996x369.png 848w, https://substackcdn.com/image/fetch/$s_!qRsX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c7fcc9c-43ac-4ff0-8f71-9ccc550f1b45_996x369.png 1272w, https://substackcdn.com/image/fetch/$s_!qRsX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c7fcc9c-43ac-4ff0-8f71-9ccc550f1b45_996x369.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qRsX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c7fcc9c-43ac-4ff0-8f71-9ccc550f1b45_996x369.png" width="996" height="369" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1c7fcc9c-43ac-4ff0-8f71-9ccc550f1b45_996x369.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:369,&quot;width&quot;:996,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qRsX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c7fcc9c-43ac-4ff0-8f71-9ccc550f1b45_996x369.png 424w, https://substackcdn.com/image/fetch/$s_!qRsX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c7fcc9c-43ac-4ff0-8f71-9ccc550f1b45_996x369.png 848w, https://substackcdn.com/image/fetch/$s_!qRsX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c7fcc9c-43ac-4ff0-8f71-9ccc550f1b45_996x369.png 1272w, https://substackcdn.com/image/fetch/$s_!qRsX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c7fcc9c-43ac-4ff0-8f71-9ccc550f1b45_996x369.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>A single pipeline that does both 3D world generation and reconstruction, producing navigable 3D Gaussian Splatting scenes from text or one image alone. Tencent Hunyuan&#8217;s four-stage approach covers panorama generation, trajectory planning, camera-guided view generation, and final 3D composition. It reaches state-of-the-art among open-source 3D world models and reports parity with the closed Marble system. All weights and code are public, which matters for a field where most of the good stuff is still locked behind APIs.</p><h3><strong>5. 
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe</strong></h3><p><strong><a href="https://arxiv.org/abs/2604.13016">https://arxiv.org/abs/2604.13016</a></strong> | <strong><a href="https://github.com/thunlp/OPD">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!D-RB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff93d79ac-9171-4fb8-b9d8-78a5ec493aac_1126x352.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!D-RB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff93d79ac-9171-4fb8-b9d8-78a5ec493aac_1126x352.png 424w, https://substackcdn.com/image/fetch/$s_!D-RB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff93d79ac-9171-4fb8-b9d8-78a5ec493aac_1126x352.png 848w, https://substackcdn.com/image/fetch/$s_!D-RB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff93d79ac-9171-4fb8-b9d8-78a5ec493aac_1126x352.png 1272w, https://substackcdn.com/image/fetch/$s_!D-RB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff93d79ac-9171-4fb8-b9d8-78a5ec493aac_1126x352.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!D-RB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff93d79ac-9171-4fb8-b9d8-78a5ec493aac_1126x352.png" width="1126" height="352" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f93d79ac-9171-4fb8-b9d8-78a5ec493aac_1126x352.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:352,&quot;width&quot;:1126,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:298718,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!D-RB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff93d79ac-9171-4fb8-b9d8-78a5ec493aac_1126x352.png 424w, https://substackcdn.com/image/fetch/$s_!D-RB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff93d79ac-9171-4fb8-b9d8-78a5ec493aac_1126x352.png 848w, https://substackcdn.com/image/fetch/$s_!D-RB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff93d79ac-9171-4fb8-b9d8-78a5ec493aac_1126x352.png 1272w, https://substackcdn.com/image/fetch/$s_!D-RB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff93d79ac-9171-4fb8-b9d8-78a5ec493aac_1126x352.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>On-policy distillation keeps breaking in counterintuitive ways: a stronger teacher sometimes fails to improve a student while a weaker one succeeds. THUNLP&#8217;s team runs controlled experiments across the Qwen3 and DeepSeek families and lands on two governing conditions. Student and teacher need compatible thinking patterns (measurable via token-level overlap ratio), and the teacher needs to offer capabilities the student doesn&#8217;t already have (not just a higher benchmark score). They also propose an off-policy cold-start warm-up that recovers distillation in setups where it would otherwise collapse. If your post-training pipeline picks teachers by benchmark score, the overlap-ratio diagnostic here is the first thing to add.</p><div><hr></div><h2><strong>&#127909; 4 Videos</strong></h2><h3><strong>1. 
The Physics Behind Flow Matching, Derived from Scratch</strong></h3><div id="youtube2-3mFNpeJQjmw" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;3mFNpeJQjmw&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/3mFNpeJQjmw?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Flow Matching built up from the continuity equation and time-variant velocity fields, with the training loss arriving only after continuous normalizing flows are in hand. Julia Turc&#8217;s 24-minute walkthrough pairs with a free interactive tutorial at diffusion.fyi. It covers conditional velocity fields in the spot where most explanations either gloss over optimal transport or collapse into notation. Queue this when your current mental model of flow matching has a gap you cannot quite name.</p><h3><strong>2. Running LLMs Locally: Practical Performance on NVIDIA&#8217;s DGX Spark</strong></h3><div id="youtube2-c5-kx2bwoCk" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;c5-kx2bwoCk&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/c5-kx2bwoCk?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Serving open-source models from 1.5B to 14B parameters on a single DGX Spark with vLLM, with the throughput, latency, and quantization trade-offs that drop out once you measure them. 
Mozhgan Kabiri Chimeh from NVIDIA shares a reproducible methodology, NVFP4 performance numbers on Grace Blackwell&#8217;s 128GB unified memory, and a framework for local model sizing. A 10-minute talk if you&#8217;re deciding whether on-prem compute clears the bar for your workload.</p><h3><strong>3. Engineering Reinforcement Learning Environments Like Software</strong></h3><div id="youtube2-71V3fTaUp2Q" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;71V3fTaUp2Q&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/71V3fTaUp2Q?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>RL environments treated as first-class software artifacts, not research scaffolding. Stefano Fiorucci, AI engineer at Deepset, uses the open-source Verifiers library to translate classical RL concepts to language models. He builds single-turn tasks, multi-turn games, and tool-using agents as concrete environments. Watch it before you write your next eval harness. A clean gym spec will save you rebuilding it three weeks in.</p><h3><strong>4. 
Stay Sassy and swyx on AI Budgets, Per-Person Token Spend, and Why Code Review Matters More</strong></h3><div id="youtube2-5KnCKadxSPY" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;5KnCKadxSPY&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/5KnCKadxSPY?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Per-person token budgets are a management problem, not a provisioning one, and code review gets more important as agents write more code, not less. That is the spine of a 57-minute Latent Space conversation between swyx and Stay Sassy, an anonymous operator with voice modulated for opsec. They cover build-vs-buy, where hand-coding still wins, and how real engineering leaders are allocating AI spend. Closest thing to eavesdropping on the hard numbers right now.</p><div><hr></div><h2><strong>&#128240; 3 Curated Reads</strong></h2><h3><strong>1. 
The Revenge of the Data Scientist</strong></h3><p><strong><a href="https://hamel.dev/blog/posts/revenge/">https://hamel.dev/blog/posts/revenge/</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JSVF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6175e67-0143-45ae-9590-6830c5ee73e8_1344x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JSVF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6175e67-0143-45ae-9590-6830c5ee73e8_1344x768.png 424w, https://substackcdn.com/image/fetch/$s_!JSVF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6175e67-0143-45ae-9590-6830c5ee73e8_1344x768.png 848w, https://substackcdn.com/image/fetch/$s_!JSVF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6175e67-0143-45ae-9590-6830c5ee73e8_1344x768.png 1272w, https://substackcdn.com/image/fetch/$s_!JSVF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6175e67-0143-45ae-9590-6830c5ee73e8_1344x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JSVF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6175e67-0143-45ae-9590-6830c5ee73e8_1344x768.png" width="1344" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6175e67-0143-45ae-9590-6830c5ee73e8_1344x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1344,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JSVF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6175e67-0143-45ae-9590-6830c5ee73e8_1344x768.png 424w, https://substackcdn.com/image/fetch/$s_!JSVF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6175e67-0143-45ae-9590-6830c5ee73e8_1344x768.png 848w, https://substackcdn.com/image/fetch/$s_!JSVF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6175e67-0143-45ae-9590-6830c5ee73e8_1344x768.png 1272w, https://substackcdn.com/image/fetch/$s_!JSVF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6175e67-0143-45ae-9590-6830c5ee73e8_1344x768.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Training a model was never most of a data scientist&#8217;s job. The bulk of the work was running experiments to test how well a system generalizes, debugging stochastic behaviour, and designing metrics you can actually trust. Hamel Husain argues that LLM teams who cut data scientists out of the loop are now rediscovering the same five eval pitfalls, from off-the-shelf judge metrics that flatter bad systems to dashboards nobody would ship a classifier against. The piece names the concrete moves (error analysis from real traces, aligning LLM judges against human labels, constructing trust-metrics you would actually stake a launch on) and is the cleanest argument this year for why the old craft is now the new infrastructure.</p><h3><strong>2. 
Agentic Harness Engineering</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:192391298,&quot;url&quot;:&quot;https://www.decodingai.com/p/agentic-harness-engineering&quot;,&quot;publication_id&quot;:1526003,&quot;publication_name&quot;:&quot;Decoding AI Magazine&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!k2ig!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00bc74e0-3601-49ce-8ab9-4c7b499ce597_1280x1280.png&quot;,&quot;title&quot;:&quot;Agentic Harness Engineering&quot;,&quot;truncated_body_text&quot;:&quot;At the AI start-up I&#8217;ve been working at, building a financial personal assistant, we implemented LlamaIndex, added the Model Context Protocol (MCP), and built complex Retrieval-Augmented Generation (RAG) pipelines. Each piece added complexity without adding direct business value.&quot;,&quot;date&quot;:&quot;2026-03-31T11:03:40.194Z&quot;,&quot;like_count&quot;:78,&quot;comment_count&quot;:12,&quot;bylines&quot;:[{&quot;id&quot;:110559689,&quot;name&quot;:&quot;Paul Iusztin&quot;,&quot;handle&quot;:&quot;pauliusztin&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0714d360-396c-4b41-a676-1b58dc1dc5f3_1470x1470.jpeg&quot;,&quot;bio&quot;:&quot;Senior AI Engineer &#8226; Founder @ Decoding AI &#8226; Author @ LLM Engineer&#8217;s Handbook I ship AI products and teach you about the process.&quot;,&quot;profile_set_up_at&quot;:&quot;2023-03-27T06:10:29.110Z&quot;,&quot;reader_installed_at&quot;:&quot;2023-08-24T17:10:51.998Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:1494048,&quot;user_id&quot;:110559689,&quot;publication_id&quot;:1526003,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:1526003,&quot;name&quot;:&quot;Decoding AI 
Magazine&quot;,&quot;subdomain&quot;:&quot;decodingaimagazine&quot;,&quot;custom_domain&quot;:&quot;www.decodingai.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Join for content on designing, building, and shipping AI software. Learn AI engineering, end-to-end, from idea to production. Every Tuesday.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/00bc74e0-3601-49ce-8ab9-4c7b499ce597_1280x1280.png&quot;,&quot;author_id&quot;:110559689,&quot;primary_user_id&quot;:110559689,&quot;theme_var_background_pop&quot;:&quot;#A33ACB&quot;,&quot;created_at&quot;:&quot;2023-03-27T06:17:03.688Z&quot;,&quot;email_from_name&quot;:&quot;Decoding AI Magazine&quot;,&quot;copyright&quot;:&quot;Paul Iusztin&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/85e4cd45-ca39-48d4-941c-86dc67ba9848_1344x325.png&quot;}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100,&quot;status&quot;:{&quot;bestsellerTier&quot;:100,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:100},&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://www.decodingai.com/p/agentic-harness-engineering?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img 
class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!k2ig!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00bc74e0-3601-49ce-8ab9-4c7b499ce597_1280x1280.png" loading="lazy"><span class="embedded-post-publication-name">Decoding AI Magazine</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Agentic Harness Engineering</div></div><div class="embedded-post-body">At the AI start-up I&#8217;ve been working at, building a financial personal assistant, we implemented LlamaIndex, added the Model Context Protocol (MCP), and built complex Retrieval-Augmented Generation (RAG) pipelines. Each piece added complexity without adding direct business value&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">25 days ago &#183; 78 likes &#183; 12 comments &#183; Paul Iusztin</div></a></div><p>The harness is where production agents actually live: tools, memory, sandboxes, and orchestration that let an LLM recover from failures, bridge context windows, serve multiple interfaces, and hold state across sessions. Paul Iusztin dissects the production harness behind Claude Code and Codex across a handful of components (context engineering, memory, sandboxing, tool layers, orchestration). He walks through how each piece solves a problem the model cannot solve on its own. The thesis: strip RAG pipelines and MCP layers back to plain Python until the harness itself earns the complexity.</p><h3><strong>3. 
Extreme Harness Engineering for Token Billionaires</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:193478192,&quot;url&quot;:&quot;https://www.latent.space/p/harness-eng&quot;,&quot;publication_id&quot;:1084089,&quot;publication_name&quot;:&quot;Latent.Space&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!DbYa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73b0838a-bd14-46a1-801c-b6a2046e5c1e_1130x1130.png&quot;,&quot;title&quot;:&quot;Extreme Harness Engineering for Token Billionaires: 1M LOC, 1B toks/day, 0% human code, 0% human review &#8212; Ryan Lopopolo, OpenAI Frontier &amp; Symphony&quot;,&quot;truncated_body_text&quot;:&quot;We&#8217;re proud to release this ahead of Ryan&#8217;s keynote at AIE Europe. Hit the bell, get notified when it is live! Attendees: come prepped for Ryan&#8217;s AMA with Vibhu after.&quot;,&quot;date&quot;:&quot;2026-04-07T17:14:26.942Z&quot;,&quot;like_count&quot;:45,&quot;comment_count&quot;:4,&quot;bylines&quot;:[],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;podcast&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://www.latent.space/p/harness-eng?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!DbYa!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73b0838a-bd14-46a1-801c-b6a2046e5c1e_1130x1130.png" loading="lazy"><span class="embedded-post-publication-name">Latent.Space</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title-icon"><svg width="19" height="19" viewBox="0 0 24 24" fill="none" xmlns="http://www.w3.org/2000/svg">
  <path d="M3 18V12C3 9.61305 3.94821 7.32387 5.63604 5.63604C7.32387 3.94821 9.61305 3 12 3C14.3869 3 16.6761 3.94821 18.364 5.63604C20.0518 7.32387 21 9.61305 21 12V18" stroke-linecap="round" stroke-linejoin="round"></path>
  <path d="M21 19C21 19.5304 20.7893 20.0391 20.4142 20.4142C20.0391 20.7893 19.5304 21 19 21H18C17.4696 21 16.9609 20.7893 16.5858 20.4142C16.2107 20.0391 16 19.5304 16 19V16C16 15.4696 16.2107 14.9609 16.5858 14.5858C16.9609 14.2107 17.4696 14 18 14H21V19ZM3 19C3 19.5304 3.21071 20.0391 3.58579 20.4142C3.96086 20.7893 4.46957 21 5 21H6C6.53043 21 7.03914 20.7893 7.41421 20.4142C7.78929 20.0391 8 19.5304 8 19V16C8 15.4696 7.78929 14.9609 7.41421 14.5858C7.03914 14.2107 6.53043 14 6 14H3V19Z" stroke-linecap="round" stroke-linejoin="round"></path>
</svg></div><div class="embedded-post-title">Extreme Harness Engineering for Token Billionaires: 1M LOC, 1B toks/day, 0% human code, 0% human review &#8212; Ryan Lopopolo, OpenAI Frontier &amp; Symphony</div></div><div class="embedded-post-body">We&#8217;re proud to release this ahead of Ryan&#8217;s keynote at AIE Europe. Hit the bell, get notified when it is live! Attendees: come prepped for Ryan&#8217;s AMA with Vibhu after&#8230;</div><div class="embedded-post-cta-wrapper"><div class="embedded-post-cta-icon"><svg width="32" height="32" viewBox="0 0 24 24" xmlns="http://www.w3.org/2000/svg">
  <path classname="inner-triangle" d="M10 8L16 12L10 16V8Z" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"></path>
</svg></div><span class="embedded-post-cta">Listen now</span></div><div class="embedded-post-meta">18 days ago &#183; 45 likes &#183; 4 comments</div></a></div><p>OpenAI&#8217;s Frontier team wrote a 1M-line codebase at 1 billion tokens per day with zero human-written code and zero human-reviewed merges. Ryan Lopopolo walks through how they got there. The build system went from Make to Bazel to Turbo to Nx to hit a sub-one-minute loop. Observability was exposed to the agent so it can tell when it is going off track. Dependencies were inlined to remove version drift. Specs were written for the model, not the engineer. The conversation with Swyx and Alessio covers the agent code-review rules, autonomous merging, and why human attention is now the binding constraint. Read it before you argue with anyone about what a production agent harness should look like in 2026.</p><div><hr></div><h2><strong>&#128736; 2 Tools &amp; Repos</strong></h2><h3><strong>1. Microsoft Agent Framework</strong></h3><p><strong><a href="https://github.com/microsoft/agent-framework">https://github.com/microsoft/agent-framework</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bajC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f9c92-5365-4d02-85c3-04f61c19ae6d_1108x233.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bajC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f9c92-5365-4d02-85c3-04f61c19ae6d_1108x233.png 424w, https://substackcdn.com/image/fetch/$s_!bajC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f9c92-5365-4d02-85c3-04f61c19ae6d_1108x233.png 848w, 
https://substackcdn.com/image/fetch/$s_!bajC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f9c92-5365-4d02-85c3-04f61c19ae6d_1108x233.png 1272w, https://substackcdn.com/image/fetch/$s_!bajC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f9c92-5365-4d02-85c3-04f61c19ae6d_1108x233.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bajC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f9c92-5365-4d02-85c3-04f61c19ae6d_1108x233.png" width="1108" height="233" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/975f9c92-5365-4d02-85c3-04f61c19ae6d_1108x233.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:233,&quot;width&quot;:1108,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bajC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f9c92-5365-4d02-85c3-04f61c19ae6d_1108x233.png 424w, https://substackcdn.com/image/fetch/$s_!bajC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f9c92-5365-4d02-85c3-04f61c19ae6d_1108x233.png 848w, 
https://substackcdn.com/image/fetch/$s_!bajC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f9c92-5365-4d02-85c3-04f61c19ae6d_1108x233.png 1272w, https://substackcdn.com/image/fetch/$s_!bajC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975f9c92-5365-4d02-85c3-04f61c19ae6d_1108x233.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Microsoft&#8217;s unified framework for building and orchestrating AI agents across Python and .NET, with graph-based workflows, streaming, checkpointing, human-in-the-loop, and time-travel. The interesting move is that it ships with explicit migration guides from both Semantic Kernel and AutoGen, positioning itself as the convergence point for Microsoft&#8217;s prior agent SDKs. 9,736 stars and active weekly office hours. If you already run Semantic Kernel or AutoGen in production, the migration guides are the fastest read on where Microsoft wants your next agent to live.</p><h3><strong>2. 
AutoKernel: Autoresearch for GPU Kernels</strong></h3><p><strong><a href="https://github.com/RightNow-AI/autokernel">https://github.com/RightNow-AI/autokernel</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VPkG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b769ccb-55ec-44b1-b54b-359e649a6002_2048x1095.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VPkG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b769ccb-55ec-44b1-b54b-359e649a6002_2048x1095.png 424w, https://substackcdn.com/image/fetch/$s_!VPkG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b769ccb-55ec-44b1-b54b-359e649a6002_2048x1095.png 848w, https://substackcdn.com/image/fetch/$s_!VPkG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b769ccb-55ec-44b1-b54b-359e649a6002_2048x1095.png 1272w, https://substackcdn.com/image/fetch/$s_!VPkG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b769ccb-55ec-44b1-b54b-359e649a6002_2048x1095.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VPkG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b769ccb-55ec-44b1-b54b-359e649a6002_2048x1095.png" width="1456" height="778" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3b769ccb-55ec-44b1-b54b-359e649a6002_2048x1095.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:778,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VPkG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b769ccb-55ec-44b1-b54b-359e649a6002_2048x1095.png 424w, https://substackcdn.com/image/fetch/$s_!VPkG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b769ccb-55ec-44b1-b54b-359e649a6002_2048x1095.png 848w, https://substackcdn.com/image/fetch/$s_!VPkG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b769ccb-55ec-44b1-b54b-359e649a6002_2048x1095.png 1272w, https://substackcdn.com/image/fetch/$s_!VPkG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b769ccb-55ec-44b1-b54b-359e649a6002_2048x1095.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Karpathy-inspired autonomous loop that takes any PyTorch model, profiles it, extracts bottleneck kernels into standalone Triton or CUDA C++, and optimizes them one at a time in a keep-or-revert loop. Each experiment runs in about 90 seconds, roughly 40 per hour, 320 overnight. The orchestrator picks the next kernel using Amdahl&#8217;s law. 1,295 stars. 
Point it at a model you care about before bed and wake up to a speedup report.</p><div><hr></div><h2><strong>&#127891; 1 Pick of the Week</strong></h2><h3><strong>BMAD-METHOD: Breakthrough Method for Agile AI Driven Development</strong></h3><p><strong><a href="https://github.com/bmad-code-org/BMAD-METHOD">https://github.com/bmad-code-org/BMAD-METHOD</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!H7R8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b22e6fb-425a-4045-afc1-fd1029972a88_1408x224.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!H7R8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b22e6fb-425a-4045-afc1-fd1029972a88_1408x224.png 424w, https://substackcdn.com/image/fetch/$s_!H7R8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b22e6fb-425a-4045-afc1-fd1029972a88_1408x224.png 848w, https://substackcdn.com/image/fetch/$s_!H7R8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b22e6fb-425a-4045-afc1-fd1029972a88_1408x224.png 1272w, https://substackcdn.com/image/fetch/$s_!H7R8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b22e6fb-425a-4045-afc1-fd1029972a88_1408x224.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!H7R8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b22e6fb-425a-4045-afc1-fd1029972a88_1408x224.png" width="1408" height="224" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3b22e6fb-425a-4045-afc1-fd1029972a88_1408x224.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:224,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!H7R8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b22e6fb-425a-4045-afc1-fd1029972a88_1408x224.png 424w, https://substackcdn.com/image/fetch/$s_!H7R8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b22e6fb-425a-4045-afc1-fd1029972a88_1408x224.png 848w, https://substackcdn.com/image/fetch/$s_!H7R8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b22e6fb-425a-4045-afc1-fd1029972a88_1408x224.png 1272w, https://substackcdn.com/image/fetch/$s_!H7R8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b22e6fb-425a-4045-afc1-fd1029972a88_1408x224.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>BMAD-METHOD is an MIT-licensed framework that installs a full agile team of named AI agents into your IDE. You get six roles: Mary (business analyst), John (PM), Winston (architect), Sally (UX), Amelia (senior engineer), and Paige (tech writer). Each carries a consistent persona, a defined skill set, and a handoff point to the next. 
The core module ships 34+ workflows across brainstorming, PRD drafting, architecture, sprint planning, and code review. A solo developer can run a structured delivery loop without a real team. Install is one command (<code>npx bmad-method install</code>), and the framework picks up Claude Code, Cursor, and other IDEs on first run. 45K+ stars, actively maintained. Pick it up if you have been reaching for individual coding agents and finding the scaffolding thinner than the model itself.</p><div><hr></div><p><em>That&#8217;s the twenty-fifth Tokenizer. If one of these fifteen resources changes what you&#8217;re building this week, forward the edition to whoever should see it next, and come find the long-form work at <a href="https://newsletter.artofsaience.com">Gradient Ascent</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[Cursor's Agent-Written CUDA Kernels, Claude Cowork for Non-Engineers, and Stanford's Frontier Systems - 📚 The Tokenizer Edition #24]]></title><description><![CDATA[This week's most valuable AI resources]]></description><link>https://newsletter.artofsaience.com/p/cursors-agent-written-cuda-kernels</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/cursors-agent-written-cuda-kernels</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Thu, 16 Apr 2026 12:03:16 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/jwGQ9CrqVdA" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! This week the strongest work happened underneath the models: in the training signal, the kernel compiler, and the retrieval pipeline. Over at Cursor, an agent swarm writes CUDA kernels 38% faster than human-tuned baselines on GQA and MoE GEMMs, where the baselines were already aggressive. Self-distilled RLVR reopens token-level updates inside RL training without the late-stage collapse that earlier approaches kept running into. 
Hugging Face ships a working multimodal retrieve-and-rerank recipe you can copy tonight, and Stanford&#8217;s new Frontier Systems course is pulling the people building the stack into weekly lectures while they build it.</p><h3><strong>New here?</strong></h3><p><em>The Tokenizer is my resource-focused newsletter edition where I curate high-signal AI/ML papers, videos, articles, tools, and learning resources so you don&#8217;t have to sift through the noise. Subscribe to <a href="https://gradientascent.co">Gradient Ascent</a> for the full experience.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.artofsaience.com/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2><strong>TL;DR</strong></h2><p>What caught my attention this week:</p><ul><li><p>&#128196; <strong>Papers:</strong> a cleaner RL gradient signal that doesn&#8217;t collapse late in training, the GUI agent stack done end-to-end for once, memory-aware reward shaping that notices recurrent failure modes, a four-frame streaming video baseline that quietly beats thirteen heavyweight ones, and retrieval supervision pulled straight from agent trajectories.</p></li><li><p>&#127909; <strong>Videos:</strong> Notion on custom-agent evals at scale, the Gemma 4 architecture delta, Claude Cowork for non-engineers, and a recipe for LLM judges that don&#8217;t silently drift.</p></li><li><p>&#128240; <strong>Reads:</strong> Agent swarms writing faster CUDA kernels at Cursor, a hands-on multimodal embedding tutorial from Hugging Face, and a clean first-principles tour of mathematical modeling.</p></li><li><p>&#128736; <strong>Tools:</strong> VoxCPM2, a tokenizer-free multilingual TTS system, and a viral Claude/Codex plugin that cuts output tokens ~75% by 
making the agent talk like a caveman.</p></li><li><p>&#127891; <strong>Learning:</strong> Stanford&#8217;s Spring 2026 Frontier Systems course, weekly lectures from the people building the AI infrastructure stack.</p><div><hr></div></li></ul><h2><strong>&#128196; 5 Papers</strong></h2><h3><strong>1. Self-Distilled RLVR</strong></h3><p><strong><a href="https://arxiv.org/abs/2604.03128">https://arxiv.org/abs/2604.03128</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AY0W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c3e3331-08f6-4a56-adda-951d347d7838_788x363.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AY0W!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c3e3331-08f6-4a56-adda-951d347d7838_788x363.png 424w, https://substackcdn.com/image/fetch/$s_!AY0W!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c3e3331-08f6-4a56-adda-951d347d7838_788x363.png 848w, https://substackcdn.com/image/fetch/$s_!AY0W!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c3e3331-08f6-4a56-adda-951d347d7838_788x363.png 1272w, https://substackcdn.com/image/fetch/$s_!AY0W!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c3e3331-08f6-4a56-adda-951d347d7838_788x363.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!AY0W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c3e3331-08f6-4a56-adda-951d347d7838_788x363.png" width="788" height="363" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2c3e3331-08f6-4a56-adda-951d347d7838_788x363.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:363,&quot;width&quot;:788,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AY0W!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c3e3331-08f6-4a56-adda-951d347d7838_788x363.png 424w, https://substackcdn.com/image/fetch/$s_!AY0W!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c3e3331-08f6-4a56-adda-951d347d7838_788x363.png 848w, https://substackcdn.com/image/fetch/$s_!AY0W!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c3e3331-08f6-4a56-adda-951d347d7838_788x363.png 1272w, https://substackcdn.com/image/fetch/$s_!AY0W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c3e3331-08f6-4a56-adda-951d347d7838_788x363.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>On-policy self-distillation (OPSD) was supposed to give RL training a denser gradient signal. In practice it leaks privileged teacher information into the student and destabilizes training past the early-peak stage. RLSD keeps RLVR&#8217;s environmental feedback as the direction signal and uses self-distillation only to set token-level update magnitudes. The result: +4.69% average over the base LLM and +2.32% over GRPO across five multimodal reasoning benchmarks (MMMU, MathVista, MathVision, ZeroBench, WeMath), without the late-stage collapse OPSD exhibits.</p><h3><strong>2. 
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents</strong></h3><p><strong><a href="https://arxiv.org/abs/2604.11784">https://arxiv.org/abs/2604.11784</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sRoF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eff6f6f-ea4e-48c1-b3b8-537eb971ae08_987x484.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sRoF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eff6f6f-ea4e-48c1-b3b8-537eb971ae08_987x484.png 424w, https://substackcdn.com/image/fetch/$s_!sRoF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eff6f6f-ea4e-48c1-b3b8-537eb971ae08_987x484.png 848w, https://substackcdn.com/image/fetch/$s_!sRoF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eff6f6f-ea4e-48c1-b3b8-537eb971ae08_987x484.png 1272w, https://substackcdn.com/image/fetch/$s_!sRoF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eff6f6f-ea4e-48c1-b3b8-537eb971ae08_987x484.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sRoF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eff6f6f-ea4e-48c1-b3b8-537eb971ae08_987x484.png" width="987" height="484" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8eff6f6f-ea4e-48c1-b3b8-537eb971ae08_987x484.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:484,&quot;width&quot;:987,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sRoF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eff6f6f-ea4e-48c1-b3b8-537eb971ae08_987x484.png 424w, https://substackcdn.com/image/fetch/$s_!sRoF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eff6f6f-ea4e-48c1-b3b8-537eb971ae08_987x484.png 848w, https://substackcdn.com/image/fetch/$s_!sRoF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eff6f6f-ea4e-48c1-b3b8-537eb971ae08_987x484.png 1272w, https://substackcdn.com/image/fetch/$s_!sRoF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eff6f6f-ea4e-48c1-b3b8-537eb971ae08_987x484.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>GUI agents aren&#8217;t stuck because the models are weak. They&#8217;re stuck because the training pipelines, eval harnesses, and deployment layers keep breaking in public, and nobody&#8217;s addressed all three at once. ClawGUI tackles the full stack: RL training with GiGPO plus a process reward model, 95.8% evaluation reproduction across 6 benchmarks, and cross-platform deployment to Android, HarmonyOS, and iOS with hybrid CLI-GUI control. ClawGUI-2B trained inside this pipeline hits 17.1% success on MobileWorld GUI-Only, beating the same-scale MAI-UI-2B baseline by 6 points and larger untrained models like UI-Venus-72B.</p><h3><strong>3. 
The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping</strong></h3><p><strong><a href="https://arxiv.org/abs/2604.11297">https://arxiv.org/abs/2604.11297</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!G3Ja!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84239f2c-2d8e-4683-9535-12d79d4cfdd5_2000x815.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!G3Ja!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84239f2c-2d8e-4683-9535-12d79d4cfdd5_2000x815.png 424w, https://substackcdn.com/image/fetch/$s_!G3Ja!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84239f2c-2d8e-4683-9535-12d79d4cfdd5_2000x815.png 848w, https://substackcdn.com/image/fetch/$s_!G3Ja!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84239f2c-2d8e-4683-9535-12d79d4cfdd5_2000x815.png 1272w, https://substackcdn.com/image/fetch/$s_!G3Ja!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84239f2c-2d8e-4683-9535-12d79d4cfdd5_2000x815.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!G3Ja!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84239f2c-2d8e-4683-9535-12d79d4cfdd5_2000x815.png" width="1456" height="593" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/84239f2c-2d8e-4683-9535-12d79d4cfdd5_2000x815.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:593,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!G3Ja!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84239f2c-2d8e-4683-9535-12d79d4cfdd5_2000x815.png 424w, https://substackcdn.com/image/fetch/$s_!G3Ja!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84239f2c-2d8e-4683-9535-12d79d4cfdd5_2000x815.png 848w, https://substackcdn.com/image/fetch/$s_!G3Ja!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84239f2c-2d8e-4683-9535-12d79d4cfdd5_2000x815.png 1272w, https://substackcdn.com/image/fetch/$s_!G3Ja!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84239f2c-2d8e-4683-9535-12d79d4cfdd5_2000x815.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>RL for LLMs keeps collapsing into the same wrong answers. Entropy regularization penalizes randomness within the current policy but ignores recurrent failure patterns across rollouts. MEDS stores historical model representations, clusters them with HDBSCAN to surface common error modes, and down-weights reward for rollouts landing in high-density error clusters. Up to 4.13 pass@1 points gained on math reasoning benchmarks, with measurably higher behavioral diversity and negligible compute overhead.</p><h3><strong>4. 
A Simple Baseline for Streaming Video Understanding</strong></h3><p><strong><a href="https://arxiv.org/abs/2604.02317">https://arxiv.org/abs/2604.02317</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CKIv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98978c6c-6568-47e6-a918-3e1f50c2eee7_2048x674.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CKIv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98978c6c-6568-47e6-a918-3e1f50c2eee7_2048x674.png 424w, https://substackcdn.com/image/fetch/$s_!CKIv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98978c6c-6568-47e6-a918-3e1f50c2eee7_2048x674.png 848w, https://substackcdn.com/image/fetch/$s_!CKIv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98978c6c-6568-47e6-a918-3e1f50c2eee7_2048x674.png 1272w, https://substackcdn.com/image/fetch/$s_!CKIv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98978c6c-6568-47e6-a918-3e1f50c2eee7_2048x674.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CKIv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98978c6c-6568-47e6-a918-3e1f50c2eee7_2048x674.png" width="1456" height="479" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/98978c6c-6568-47e6-a918-3e1f50c2eee7_2048x674.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:479,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CKIv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98978c6c-6568-47e6-a918-3e1f50c2eee7_2048x674.png 424w, https://substackcdn.com/image/fetch/$s_!CKIv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98978c6c-6568-47e6-a918-3e1f50c2eee7_2048x674.png 848w, https://substackcdn.com/image/fetch/$s_!CKIv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98978c6c-6568-47e6-a918-3e1f50c2eee7_2048x674.png 1272w, https://substackcdn.com/image/fetch/$s_!CKIv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98978c6c-6568-47e6-a918-3e1f50c2eee7_2048x674.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Streaming-video models keep stacking heavier memory mechanisms. A sliding window of 4 recent frames fed into an off-the-shelf VLM (SimpleStream) matches or beats 13 specialized baselines: 67.7% on OVO-Bench and 80.59% on StreamingBench. The deeper finding is a perception-memory trade-off. Longer context often helps recall but hurts real-time perception. Future streaming benchmarks need to separate the two or they&#8217;ll keep rewarding machinery for its own sake.</p><h3><strong>5. 
LRAT: Learning to Retrieve from Agent Trajectories</strong></h3><p><strong><a href="https://arxiv.org/abs/2604.04949">https://arxiv.org/abs/2604.04949</a></strong> | <strong><a href="https://github.com/Yuqi-Zhou/LRAT">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sfTh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1e602e9-4e16-4f3a-92ab-40e3f977c65e_997x351.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sfTh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1e602e9-4e16-4f3a-92ab-40e3f977c65e_997x351.png 424w, https://substackcdn.com/image/fetch/$s_!sfTh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1e602e9-4e16-4f3a-92ab-40e3f977c65e_997x351.png 848w, https://substackcdn.com/image/fetch/$s_!sfTh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1e602e9-4e16-4f3a-92ab-40e3f977c65e_997x351.png 1272w, https://substackcdn.com/image/fetch/$s_!sfTh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1e602e9-4e16-4f3a-92ab-40e3f977c65e_997x351.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sfTh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1e602e9-4e16-4f3a-92ab-40e3f977c65e_997x351.png" width="997" height="351" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f1e602e9-4e16-4f3a-92ab-40e3f977c65e_997x351.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:351,&quot;width&quot;:997,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sfTh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1e602e9-4e16-4f3a-92ab-40e3f977c65e_997x351.png 424w, https://substackcdn.com/image/fetch/$s_!sfTh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1e602e9-4e16-4f3a-92ab-40e3f977c65e_997x351.png 848w, https://substackcdn.com/image/fetch/$s_!sfTh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1e602e9-4e16-4f3a-92ab-40e3f977c65e_997x351.png 1272w, https://substackcdn.com/image/fetch/$s_!sfTh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1e602e9-4e16-4f3a-92ab-40e3f977c65e_997x351.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Retrieval models trained on human click and dwell logs are mismatched to how LLM agents query and consume results. LRAT derives retrieval supervision directly from agent trajectories (browses, unbrowsed rejections, and post-browse reasoning traces) with weighted relevance intensity. Evidence recall, task success, and execution efficiency all improve across agent scales, and the BM25 and FAISS pipelines are in the repo.</p><div><hr></div><h2><strong>&#127909; 4 Videos</strong></h2><h3><strong>1. 
Notion&#8217;s Head of AI Engineering on Running Custom-Agent Evals at Scale</strong></h3><div id="youtube2-ATt7QJgt-2k" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;ATt7QJgt-2k&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/ATt7QJgt-2k?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>When every Notion user has a bespoke agent, eval becomes a combinatorial problem. Sarah Sachs (head of AI engineering at Notion) and co-founder Simon Last walk Latent Space through how they keep the signal alive through that explosion. Technical interview, not product marketing. Worth it if you&#8217;re past the &#8220;does the agent work on my test case&#8221; stage and need patterns for eval at scale.</p><h3><strong>2. What&#8217;s New in Gemma 4</strong></h3><div id="youtube2-6VV5Gvmtrl4" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;6VV5Gvmtrl4&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/6VV5Gvmtrl4?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>A 60-second official overview from Google DeepMind on what changed between Gemma 3 and Gemma 4. Treat it as a trailer, not a tech report. The architecture and training deltas it calls out are the right ones to chase down in the model card afterward. Worth including because the open-weights frontier is where half the interesting work this edition lives.</p><h3><strong>3. 
Claude Cowork Tutorial for Non-Engineers with JJ Englert (Tenex)</strong></h3><div id="youtube2-jwGQ9CrqVdA" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;jwGQ9CrqVdA&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/jwGQ9CrqVdA?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>The first hands-on Claude Cowork walkthrough aimed at people without an engineering background. JJ Englert runs end-to-end setup and a real workflow, the kind of demo that travels because it proves the tool works without scaffolding or an engineer in the loop.</p><h3><strong>4. Judge the Judge: Building LLM Evaluators That Actually Work with GEPA</strong></h3><div id="youtube2-X4dEHRzBLmc" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;X4dEHRzBLmc&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/X4dEHRzBLmc?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Mahmoud Mabrouk from Agenta AI at AIE Europe on building LLM judges that don&#8217;t silently drift. GEPA (Genetic-Pareto, the reflective prompt-evolution optimizer) gives you a practical recipe for keeping your judge calibrated as the underlying model and task distribution shift. Directly useful if you&#8217;re running eval harnesses in production and can&#8217;t afford quiet regressions.</p><div><hr></div><h2><strong>&#128240; 3 Curated Reads</strong></h2><h3><strong>1. 
Speeding Up GPU Kernels by 38% With a Multi-Agent System</strong></h3><p><strong><a href="https://www.cursor.com/blog/multi-agent-kernels">https://www.cursor.com/blog/multi-agent-kernels</a></strong></p><p>Cursor and NVIDIA show a coordinated agent swarm writing CUDA kernels (CUDA C with inline PTX, CuTe DSL, a shared markdown scratchpad as the coordination medium) that hits a 38% geomean speedup across 235 real kernels. Standout wins include 84% on grouped-query attention and real gains on MoE GEMMs where human-tuned baselines were already aggressive. Read this for the concrete coordination pattern, not the headline number alone.</p><h3><strong>2. Multimodal Embedding and Reranker Models With Sentence Transformers</strong></h3><p><strong><a href="https://huggingface.co/blog/multimodal-sentence-transformers">https://huggingface.co/blog/multimodal-sentence-transformers</a></strong></p><p>Tom Aarsen&#8217;s step-by-step guide to building cross-modal retrieve-and-rerank over text, image, audio, and video using Qwen3-VL-Embedding and Reranker. Working code you can copy tonight, with the full retrieve-then-rerank pipeline wired up in Sentence Transformers. Rare to get the full modern multimodal RAG stack wired up in one notebook. Clone it, run it, then decide whether you need anything more than this.</p><h3><strong>3. The Power of Mathematical Modeling</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:193436735,&quot;url&quot;:&quot;https://thepalindrome.org/p/the-power-of-mathematical-modeling&quot;,&quot;publication_id&quot;:1176501,&quot;publication_name&quot;:&quot;The Palindrome&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!5Jm3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8b68cf8-d3f4-42f6-b8dd-cccde036005f_720x720.png&quot;,&quot;title&quot;:&quot;The Power of Mathematical Modeling&quot;,&quot;truncated_body_text&quot;:&quot;Hey! 
It&#8217;s Tivadar from The Palindrome.&quot;,&quot;date&quot;:&quot;2026-04-07T11:03:35.771Z&quot;,&quot;like_count&quot;:43,&quot;comment_count&quot;:3,&quot;bylines&quot;:[{&quot;id&quot;:10322584,&quot;name&quot;:&quot;Tivadar Danka&quot;,&quot;handle&quot;:&quot;tivadardanka&quot;,&quot;previous_name&quot;:&quot;Tivadar&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!09ow!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3b26cd48-153a-4207-b1e3-e14e1ec8d5e8_400x400.jpeg&quot;,&quot;bio&quot;:&quot;Just an Eastern European punk, writing about tech, math, and machine learning. INTJ personality. Chaotic good.&quot;,&quot;profile_set_up_at&quot;:&quot;2022-11-05T18:59:57.000Z&quot;,&quot;reader_installed_at&quot;:&quot;2022-12-09T10:24:21.362Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:1129770,&quot;user_id&quot;:10322584,&quot;publication_id&quot;:1176501,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:1176501,&quot;name&quot;:&quot;The Palindrome&quot;,&quot;subdomain&quot;:&quot;thepalindrome&quot;,&quot;custom_domain&quot;:&quot;thepalindrome.org&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;mathematics &#8746; machine learning&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f8b68cf8-d3f4-42f6-b8dd-cccde036005f_720x720.png&quot;,&quot;author_id&quot;:10322584,&quot;primary_user_id&quot;:10322584,&quot;theme_var_background_pop&quot;:&quot;#9D6FFF&quot;,&quot;created_at&quot;:&quot;2022-11-05T19:02:46.937Z&quot;,&quot;email_from_name&quot;:&quot;The Palindrome&quot;,&quot;copyright&quot;:&quot;Tivadar Danka&quot;,&quot;founding_plan_name&quot;:&quot;Founding 
Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:null}}],&quot;twitter_screen_name&quot;:&quot;TivadarDanka&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100,&quot;status&quot;:{&quot;bestsellerTier&quot;:100,&quot;subscriberTier&quot;:5,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:100},&quot;paidPublicationIds&quot;:[1174659,883883,1857854,6596898,1611829,1285451],&quot;subscriber&quot;:null}},{&quot;id&quot;:38842368,&quot;name&quot;:&quot;Manlio De Domenico, Ph.D.&quot;,&quot;handle&quot;:&quot;manlius&quot;,&quot;previous_name&quot;:&quot;Manlio De Domenico&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/537de500-db20-4bcd-a894-5ef6226bbf13_1080x1080.jpeg&quot;,&quot;bio&quot;:&quot;Persistently curious. Complex systems &amp; network scientist, working on resilience, health &amp; society through the lens of physics: from cells to societies &#129504;&#129440;&#129516;&#127751; Prof. @ U. 
of Padua, leads CoMuNe Lab &amp; Padua Center for Network Medicine.&quot;,&quot;profile_set_up_at&quot;:&quot;2022-11-09T22:04:30.836Z&quot;,&quot;reader_installed_at&quot;:&quot;2023-02-25T14:31:31.591Z&quot;,&quot;is_guest&quot;:true,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:null,&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null},&quot;primaryPublicationId&quot;:1183925,&quot;primaryPublicationName&quot;:&quot;Complexity Thoughts&quot;,&quot;primaryPublicationUrl&quot;:&quot;https://manlius.substack.com&quot;,&quot;primaryPublicationSubscribeUrl&quot;:&quot;https://manlius.substack.com/subscribe?&quot;}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://thepalindrome.org/p/the-power-of-mathematical-modeling?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!5Jm3!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8b68cf8-d3f4-42f6-b8dd-cccde036005f_720x720.png" loading="lazy"><span class="embedded-post-publication-name">The Palindrome</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">The Power of Mathematical Modeling</div></div><div class="embedded-post-body">Hey! 
It&#8217;s Tivadar from The Palindrome&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">18 days ago &#183; 43 likes &#183; 3 comments &#183; Tivadar Danka and Manlio De Domenico, Ph.D.</div></a></div><p>Manlio De Domenico guest-posts on Tivadar Danka&#8217;s Palindrome with a first-principles tour of mathematical modeling: the SI compartment model applied to the ILOVEYOU worm, a four-compartment SIZR extension for zombie outbreaks, and a network-science result showing that immunizing hubs beats random interventions. The SI-to-SIZR extension is the kind of move that sticks. You watch a textbook model flex to handle zombies without losing any of its explanatory structure, which is exactly the feel you want from a modeling tour. The hub-immunization result at the end is the payoff. Drop this into the week when the systems-and-tooling fatigue catches up with you.</p><div><hr></div><h2><strong>&#128736; 2 Tools &amp; Repos</strong></h2><h3><strong>1. 
VoxCPM2: Tokenizer-Free Multilingual TTS</strong></h3><p><strong><a href="https://github.com/OpenBMB/VoxCPM">https://github.com/OpenBMB/VoxCPM</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jYX_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c94594-6393-4952-a908-b017c7108511_1025x625.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jYX_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c94594-6393-4952-a908-b017c7108511_1025x625.png 424w, https://substackcdn.com/image/fetch/$s_!jYX_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c94594-6393-4952-a908-b017c7108511_1025x625.png 848w, https://substackcdn.com/image/fetch/$s_!jYX_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c94594-6393-4952-a908-b017c7108511_1025x625.png 1272w, https://substackcdn.com/image/fetch/$s_!jYX_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c94594-6393-4952-a908-b017c7108511_1025x625.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jYX_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c94594-6393-4952-a908-b017c7108511_1025x625.png" width="1025" height="625" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/60c94594-6393-4952-a908-b017c7108511_1025x625.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:625,&quot;width&quot;:1025,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jYX_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c94594-6393-4952-a908-b017c7108511_1025x625.png 424w, https://substackcdn.com/image/fetch/$s_!jYX_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c94594-6393-4952-a908-b017c7108511_1025x625.png 848w, https://substackcdn.com/image/fetch/$s_!jYX_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c94594-6393-4952-a908-b017c7108511_1025x625.png 1272w, https://substackcdn.com/image/fetch/$s_!jYX_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60c94594-6393-4952-a908-b017c7108511_1025x625.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Tokenizer-free TTS system that generates continuous speech directly through a diffusion autoregressive architecture, bypassing discrete audio tokenization entirely. 30 languages, controllable voice cloning, voice design from text descriptions, and 48kHz studio-quality output. Over 13k stars and climbing fast. Pick this up if you&#8217;re building anything speech-facing and want to skip the usual codec-and-vocoder tax.</p><h3><strong>2. 
caveman</strong></h3><p><strong><a href="https://github.com/JuliusBrussee/caveman">https://github.com/JuliusBrussee/caveman</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0g7I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22ac9335-f55b-4ced-8af1-44f840a2fae0_1200x600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0g7I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22ac9335-f55b-4ced-8af1-44f840a2fae0_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!0g7I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22ac9335-f55b-4ced-8af1-44f840a2fae0_1200x600.png 848w, https://substackcdn.com/image/fetch/$s_!0g7I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22ac9335-f55b-4ced-8af1-44f840a2fae0_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!0g7I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22ac9335-f55b-4ced-8af1-44f840a2fae0_1200x600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0g7I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22ac9335-f55b-4ced-8af1-44f840a2fae0_1200x600.png" width="1200" height="600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/22ac9335-f55b-4ced-8af1-44f840a2fae0_1200x600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0g7I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22ac9335-f55b-4ced-8af1-44f840a2fae0_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!0g7I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22ac9335-f55b-4ced-8af1-44f840a2fae0_1200x600.png 848w, https://substackcdn.com/image/fetch/$s_!0g7I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22ac9335-f55b-4ced-8af1-44f840a2fae0_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!0g7I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22ac9335-f55b-4ced-8af1-44f840a2fae0_1200x600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A Claude Code / Codex / Gemini CLI plugin that instructs the agent to drop articles, filler, and pleasantries in its output while preserving technical content. Benchmarks in the README clock ~75% output token reduction and ~46% input token reduction with no quality loss on downstream evals. 30,800 stars in 11 days since launch. MIT, active maintainers, one-line install across seven agents. Worth the three-minute install if you&#8217;ve watched an agent burn context on &#8220;Certainly! 
I&#8217;d be happy to help you with that.&#8221; <strong>Note:</strong> Results will vary by agent and workload, so verify the quality claims on your own tasks before trusting the benchmark numbers.</p><div><hr></div><h2><strong>&#127891; 1 Pick of the Week</strong></h2><h3><strong>Stanford CS 153: Frontier Systems (Spring 2026)</strong></h3><p><strong><a href="https://www.youtube.com/playlist?list=PL2aDf5-VARtBwz1kz5FsuSZXOig2U6aJI">https://www.youtube.com/playlist?list=PL2aDf5-VARtBwz1kz5FsuSZXOig2U6aJI</a></strong></p><p>Stanford&#8217;s new Spring 2026 course on the AI infrastructure stack, taught by Anjney Midha (AMP PBC) and Michael Abbott. Weekly lectures from the people building the frontier: Andreas Blattmann on Black Forest Labs&#8217; image models, Mati Staniszewski on ElevenLabs&#8217; audio stack, and Midha himself on the infrastructure rewrite underneath this moment. Runs through June 3, with upcoming guests including Karpathy, Jensen Huang, Sam Altman, and Satya Nadella. The closest thing to a live running commentary on the stack as it&#8217;s being built.</p><div><hr></div><p><em>That&#8217;s the twenty-fifth Tokenizer. If one of these fifteen resources changes what you&#8217;re building this week, forward the edition to whoever should see it next, and come find the long-form work at <a href="https://gradientascent.co">Gradient Ascent</a>.</em></p>]]></content:encoded></item><item><title><![CDATA[Running your Life with Claude Code, How OpenAI Uses Codex, and the Anatomy of a Coding Agent - 📚 The Tokenizer Edition #23]]></title><description><![CDATA[This week's most valuable AI resources]]></description><link>https://newsletter.artofsaience.com/p/running-your-life-with-claude-code</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/running-your-life-with-claude-code</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Thu, 09 Apr 2026 12:03:36 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/oBWRHnggscM" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! 
This week kept circling back to one question: what does it actually take to make agents useful in production, not just in demos? A field engineer at Galileo is now answering every customer question by routing Claude Code across fifteen separate repositories, OpenAI&#8217;s Codex team is dogfooding their own tools, and Sebastian Raschka quietly explains why the &#8220;harness&#8221; around the model matters more than the model itself.</p><h3><strong>New here?</strong></h3><p><em>The Tokenizer is my resource-focused newsletter edition where I curate the best AI/ML papers, videos, articles, tools, and learning resources so you don&#8217;t have to sift through the noise. Subscribe to <a href="https://gradientascent.co">Gradient Ascent</a> for the full experience.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption"></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h2><strong>TL;DR</strong></h2><p>What caught my attention this week:</p><ul><li><p>&#128196; <strong>Papers:</strong> Benchmarks fight back against saturation, with new evaluations for video understanding, autonomous agents, robot policies, and the data behind LLM training.</p></li><li><p>&#127909; <strong>Videos:</strong> Inside views from Stanford, OpenAI, Galileo, and Databricks on how agents actually run when you put them in front of real engineers (and non-engineers).</p></li><li><p>&#128240; <strong>Reads:</strong> Three 
sharp takes on coding agent architecture, why your RAG pipeline is overbuilt, and what really decides whether an open model gets adopted.</p></li><li><p>&#128736; <strong>Tools:</strong> Memory and a Rust-native agent loop, two pieces of the open agent stack worth knowing.</p></li><li><p>&#127891; <strong>Learning:</strong> A reproducible walkthrough of using Claude Code as a personal operating system, not as a coding tool.</p></li></ul><div><hr></div><h2><strong>&#128196; 5 Papers</strong></h2><h3><strong>1. Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding</strong></h3><p><strong><a href="https://arxiv.org/abs/2604.05015">https://arxiv.org/abs/2604.05015</a></strong> | <strong><a href="https://github.com/MME-Benchmarks/Video-MME-v2">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xNxc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5138498d-de34-4065-b154-6bfe321aa0bd_996x521.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xNxc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5138498d-de34-4065-b154-6bfe321aa0bd_996x521.png 424w, https://substackcdn.com/image/fetch/$s_!xNxc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5138498d-de34-4065-b154-6bfe321aa0bd_996x521.png 848w, https://substackcdn.com/image/fetch/$s_!xNxc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5138498d-de34-4065-b154-6bfe321aa0bd_996x521.png 1272w, 
https://substackcdn.com/image/fetch/$s_!xNxc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5138498d-de34-4065-b154-6bfe321aa0bd_996x521.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xNxc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5138498d-de34-4065-b154-6bfe321aa0bd_996x521.png" width="996" height="521" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5138498d-de34-4065-b154-6bfe321aa0bd_996x521.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:521,&quot;width&quot;:996,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xNxc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5138498d-de34-4065-b154-6bfe321aa0bd_996x521.png 424w, https://substackcdn.com/image/fetch/$s_!xNxc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5138498d-de34-4065-b154-6bfe321aa0bd_996x521.png 848w, https://substackcdn.com/image/fetch/$s_!xNxc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5138498d-de34-4065-b154-6bfe321aa0bd_996x521.png 1272w, 
https://substackcdn.com/image/fetch/$s_!xNxc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5138498d-de34-4065-b154-6bfe321aa0bd_996x521.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Video benchmarks have saturated to the point where leaderboard scores tell you almost nothing about real model capability. Video-MME-v2 introduces a tri-level hierarchy that escalates from visual aggregation up through temporal reasoning, plus a group-based scoring rule that punishes lucky guesses across linked questions. 
The team logged about 3,300 human-hours across 12 annotators and 50 reviewers, and one finding stands out: models lean hard on subtitles, and reasoning quality drops sharply when only the pixels are available.</p><h3><strong>2. DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.26164">https://arxiv.org/abs/2603.26164</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q6ZQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b73f177-2e5c-4e8e-a616-8bbccd8f3030_996x494.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q6ZQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b73f177-2e5c-4e8e-a616-8bbccd8f3030_996x494.png 424w, https://substackcdn.com/image/fetch/$s_!q6ZQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b73f177-2e5c-4e8e-a616-8bbccd8f3030_996x494.png 848w, https://substackcdn.com/image/fetch/$s_!q6ZQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b73f177-2e5c-4e8e-a616-8bbccd8f3030_996x494.png 1272w, https://substackcdn.com/image/fetch/$s_!q6ZQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b73f177-2e5c-4e8e-a616-8bbccd8f3030_996x494.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!q6ZQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b73f177-2e5c-4e8e-a616-8bbccd8f3030_996x494.png" width="996" height="494" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8b73f177-2e5c-4e8e-a616-8bbccd8f3030_996x494.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:494,&quot;width&quot;:996,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!q6ZQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b73f177-2e5c-4e8e-a616-8bbccd8f3030_996x494.png 424w, https://substackcdn.com/image/fetch/$s_!q6ZQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b73f177-2e5c-4e8e-a616-8bbccd8f3030_996x494.png 848w, https://substackcdn.com/image/fetch/$s_!q6ZQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b73f177-2e5c-4e8e-a616-8bbccd8f3030_996x494.png 1272w, https://substackcdn.com/image/fetch/$s_!q6ZQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b73f177-2e5c-4e8e-a616-8bbccd8f3030_996x494.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>If you&#8217;ve ever tried to compare data selection, mixture optimization, and reweighting research, you know each lives in its own incompatible codebase. DataFlex sits on top of LLaMA-Factory and gives you one set of trainer abstractions for sample selection, DoReMi/ODM-style mixture tuning, and reweighting, all DeepSpeed ZeRO-3 compatible. It&#8217;s an infrastructure contribution rather than a new method, so the value here is reproducibility and composability across data-centric techniques you already wanted to try.</p><h3><strong>3. 
Adam&#8217;s Law: Textual Frequency Law on Large Language Models</strong></h3><p><strong><a href="https://arxiv.org/abs/2604.02176">https://arxiv.org/abs/2604.02176</a></strong> | <strong><a href="https://github.com/HongyuanLuke/frequencylaw">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RD8I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7a3c36f-ff16-4754-9463-73b293ef9292_1194x1600.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RD8I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7a3c36f-ff16-4754-9463-73b293ef9292_1194x1600.jpeg 424w, https://substackcdn.com/image/fetch/$s_!RD8I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7a3c36f-ff16-4754-9463-73b293ef9292_1194x1600.jpeg 848w, https://substackcdn.com/image/fetch/$s_!RD8I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7a3c36f-ff16-4754-9463-73b293ef9292_1194x1600.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!RD8I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7a3c36f-ff16-4754-9463-73b293ef9292_1194x1600.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RD8I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7a3c36f-ff16-4754-9463-73b293ef9292_1194x1600.jpeg" width="1194" height="1600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f7a3c36f-ff16-4754-9463-73b293ef9292_1194x1600.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1194,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RD8I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7a3c36f-ff16-4754-9463-73b293ef9292_1194x1600.jpeg 424w, https://substackcdn.com/image/fetch/$s_!RD8I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7a3c36f-ff16-4754-9463-73b293ef9292_1194x1600.jpeg 848w, https://substackcdn.com/image/fetch/$s_!RD8I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7a3c36f-ff16-4754-9463-73b293ef9292_1194x1600.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!RD8I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7a3c36f-ff16-4754-9463-73b293ef9292_1194x1600.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The headline claim sounds tautological (frequent text is better text), but the paper actually proposes a measurement framework with three concrete components: a Textual Frequency Law that estimates sentence-level frequency from open web sources, a distillation step (TFD) that refines that estimate by querying the target model itself, and a curriculum (CTFT) that orders training data from rare to frequent expressions during fine-tuning. Tests cover math reasoning, machine translation, commonsense, and tool calling, which is a wider footprint than the abstract suggests.</p><h3><strong>4. 
Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents</strong></h3><p><strong><a href="https://arxiv.org/abs/2604.06132">https://arxiv.org/abs/2604.06132</a></strong> | <strong><a href="https://github.com/claw-eval/claw-eval">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kbRC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28c0d5b3-8d3b-4238-98fe-fb0266af9a79_996x629.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kbRC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28c0d5b3-8d3b-4238-98fe-fb0266af9a79_996x629.png 424w, https://substackcdn.com/image/fetch/$s_!kbRC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28c0d5b3-8d3b-4238-98fe-fb0266af9a79_996x629.png 848w, https://substackcdn.com/image/fetch/$s_!kbRC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28c0d5b3-8d3b-4238-98fe-fb0266af9a79_996x629.png 1272w, https://substackcdn.com/image/fetch/$s_!kbRC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28c0d5b3-8d3b-4238-98fe-fb0266af9a79_996x629.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kbRC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28c0d5b3-8d3b-4238-98fe-fb0266af9a79_996x629.png" width="996" height="629" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/28c0d5b3-8d3b-4238-98fe-fb0266af9a79_996x629.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:629,&quot;width&quot;:996,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kbRC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28c0d5b3-8d3b-4238-98fe-fb0266af9a79_996x629.png 424w, https://substackcdn.com/image/fetch/$s_!kbRC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28c0d5b3-8d3b-4238-98fe-fb0266af9a79_996x629.png 848w, https://substackcdn.com/image/fetch/$s_!kbRC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28c0d5b3-8d3b-4238-98fe-fb0266af9a79_996x629.png 1272w, https://substackcdn.com/image/fetch/$s_!kbRC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28c0d5b3-8d3b-4238-98fe-fb0266af9a79_996x629.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Most agent benchmarks only check the final answer, which means a trajectory full of safety violations can still score a clean pass. Claw-Eval grades the path, not just the destination, with 2,159 fine-grained rubric items across 300 tasks, scored over execution traces, audit logs, and environment snapshots. The headline result is that trajectory-opaque grading misses 44% of safety violations and 13% of robustness failures, which is a hard number to ignore if you&#8217;re building agent eval today.</p><h3><strong>5. 
LIBERO-Para: A Diagnostic Benchmark and Metrics for Paraphrase Robustness in VLA Models</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.28301">https://arxiv.org/abs/2603.28301</a></strong> | <strong><a href="https://github.com/cau-hai-lab/LIBERO-Para">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XvQV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602b88c3-8518-4594-b338-e3c16e97d3e2_793x280.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XvQV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602b88c3-8518-4594-b338-e3c16e97d3e2_793x280.png 424w, https://substackcdn.com/image/fetch/$s_!XvQV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602b88c3-8518-4594-b338-e3c16e97d3e2_793x280.png 848w, https://substackcdn.com/image/fetch/$s_!XvQV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602b88c3-8518-4594-b338-e3c16e97d3e2_793x280.png 1272w, https://substackcdn.com/image/fetch/$s_!XvQV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602b88c3-8518-4594-b338-e3c16e97d3e2_793x280.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XvQV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602b88c3-8518-4594-b338-e3c16e97d3e2_793x280.png" width="793" height="280" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/602b88c3-8518-4594-b338-e3c16e97d3e2_793x280.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:280,&quot;width&quot;:793,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XvQV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602b88c3-8518-4594-b338-e3c16e97d3e2_793x280.png 424w, https://substackcdn.com/image/fetch/$s_!XvQV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602b88c3-8518-4594-b338-e3c16e97d3e2_793x280.png 848w, https://substackcdn.com/image/fetch/$s_!XvQV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602b88c3-8518-4594-b338-e3c16e97d3e2_793x280.png 1272w, https://substackcdn.com/image/fetch/$s_!XvQV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F602b88c3-8518-4594-b338-e3c16e97d3e2_793x280.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Vision-Language-Action models look impressive until you reword the instruction, and then performance drops 22 to 52 points across the seven configurations tested (0.6B to 7.5B parameters). LIBERO-Para isolates action phrasing from object references so you can see exactly where the model is reading words instead of meaning. The authors trace 80 to 96 percent of failures to the planning stage, not execution, and introduce PRIDE, a metric that quantifies paraphrase difficulty by semantic and syntactic distance.</p><div><hr></div><h2><strong>&#127909; 4 Videos</strong></h2><h3><strong>1. 
World Models You Can Actually Interact With: Inside Moonlake</strong></h3><div id="youtube2-oBWRHnggscM" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;oBWRHnggscM&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/oBWRHnggscM?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Interactivity changes what a world model has to learn, and that&#8217;s the thread Stanford&#8217;s Chris Manning and Fan-yun Sun pull on in this Latent Space conversation about Moonlake. The 67-minute episode gets into how the training signal differs from passive video models, and where this approach sits relative to Marble, Cosmos, and the gaming-data world models that have dominated the last quarter. Worth watching as a counterweight to the &#8220;world models = video generation&#8221; framing.</p><h3><strong>2. How OpenAI&#8217;s Codex Team Actually Builds with Codex</strong></h3><div id="youtube2-9qXc-THAvc0" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;9qXc-THAvc0&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/9qXc-THAvc0?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>&#8220;For specs, we write like 10 bullets and that&#8217;s it,&#8221; and &#8220;our designers now write more code than eng did 6 months ago&#8221; are two of the throwaway lines in this 43-minute sit-down with Codex product lead Alex and developer experience lead Romain on Peter Yang&#8217;s channel. 
The conversation is unusually candid about what shipping without traditional specs and roadmaps looks like inside a team that lives on its own tools. Closest thing to a field report you&#8217;ll get on agent-native product development right now.</p><h3><strong>3. A Non-Engineer Runs Claude Code Across 15 Repos to Answer Every Customer Question</strong></h3><div id="youtube2-AI1FLDY3q5s" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;AI1FLDY3q5s&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/AI1FLDY3q5s?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Al Chen is a field engineer at Galileo, an AI observability platform, and he has never held an engineering role. He has also built something most coding teams would be proud of: a Claude Code setup that queries 15 separate internal repositories, stitches in Confluence docs and customer-specific quirks, and delivers answers that previously required pulling an engineer off real work. The walkthrough covers his custom Claude Code commands, the sixteen-line sync script (written entirely by Claude Code) that pulls every repo&#8217;s main branch each morning, and the multi-source MCP pattern that lets a single question hit code, docs, and deployment notes in one pass. If you&#8217;ve been treating Claude Code as a coding assistant, this is the reframe that turns it into a customer support operating system.</p><h3><strong>4. 
From Chaos to Choreography: Multi-Agent Orchestration That Actually Works</strong></h3><div id="youtube2-2czYyrTzILg" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;2czYyrTzILg&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/2czYyrTzILg?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Sandipan Bhaumik from Databricks opens with the line of the week: &#8220;Adding more agents isn&#8217;t adding more features. It&#8217;s building a distributed system.&#8221; His 26-minute AI Engineer talk covers the silent handoff failures, stale state, and untraceable decisions that show up once you scale from one agent to five, then walks through the orchestrator and choreography patterns Databricks uses in production. If you&#8217;re past the proof-of-concept stage, this is the talk you actually need.</p><div><hr></div><h2><strong>&#128240; 3 Curated Reads</strong></h2><h3><strong>1. 
Components of a Coding Agent</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:193137515,&quot;url&quot;:&quot;https://magazine.sebastianraschka.com/p/components-of-a-coding-agent&quot;,&quot;publication_id&quot;:1174659,&quot;publication_name&quot;:&quot;Ahead of AI&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!96vs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f25d0a-212b-4853-8bcb-128d0a3edbbf_1196x1196.png&quot;,&quot;title&quot;:&quot;Components of A Coding Agent&quot;,&quot;truncated_body_text&quot;:&quot;In this article, I want to cover the overall design of coding agents and agent harnesses: what they are, how they work, and how the different pieces fit together in practice. Readers of my Build a Large Language Model (From Scratch) and Build a Large Reasoning Model (From Scratch)&quot;,&quot;date&quot;:&quot;2026-04-04T11:45:37.090Z&quot;,&quot;like_count&quot;:511,&quot;comment_count&quot;:51,&quot;bylines&quot;:[{&quot;id&quot;:27393275,&quot;name&quot;:&quot;Sebastian Raschka, PhD&quot;,&quot;handle&quot;:&quot;rasbt&quot;,&quot;previous_name&quot;:&quot;Sebastian Raschka&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F61f4c017-506f-4e9b-a24f-76340dad0309_800x800.jpeg&quot;,&quot;bio&quot;:&quot;I'm an LLM research engineer 10+ years of experience in artificial intelligence. My expertise lies in AI &amp; LLM research focusing on code-driven implementations. 
I am also the author of \&quot;Build a Large Language Model From Scratch\&quot; (amzn.to/4fqvn0D).&quot;,&quot;profile_set_up_at&quot;:&quot;2022-10-09T16:19:59.744Z&quot;,&quot;reader_installed_at&quot;:&quot;2022-11-07T19:56:32.129Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:1127862,&quot;user_id&quot;:27393275,&quot;publication_id&quot;:1174659,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:1174659,&quot;name&quot;:&quot;Ahead of AI&quot;,&quot;subdomain&quot;:&quot;sebastianraschka&quot;,&quot;custom_domain&quot;:&quot;magazine.sebastianraschka.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Ahead of AI focuses on machine learning and AI research and is read by more than 150,000 researchers and practitioners who want to stay ahead in a rapidly evolving field.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/49f25d0a-212b-4853-8bcb-128d0a3edbbf_1196x1196.png&quot;,&quot;author_id&quot;:27393275,&quot;primary_user_id&quot;:27393275,&quot;theme_var_background_pop&quot;:&quot;#2096FF&quot;,&quot;created_at&quot;:&quot;2022-11-04T18:30:05.218Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Raschka AI Research (RAIR) Lab LLC&quot;,&quot;founding_plan_name&quot;:&quot;Founding 
plan&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5083e6d3-fbc9-4870-95b9-6e85d02f62a6_9366x2023.png&quot;}}],&quot;twitter_screen_name&quot;:&quot;rasbt&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:1000,&quot;status&quot;:{&quot;bestsellerTier&quot;:1000,&quot;subscriberTier&quot;:1,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:1000},&quot;paidPublicationIds&quot;:[1783977,9873],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://magazine.sebastianraschka.com/p/components-of-a-coding-agent?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!96vs!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f25d0a-212b-4853-8bcb-128d0a3edbbf_1196x1196.png" loading="lazy"><span class="embedded-post-publication-name">Ahead of AI</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Components of A Coding Agent</div></div><div class="embedded-post-body">In this article, I want to cover the overall design of coding agents and agent harnesses: what they are, how they work, and how the different pieces fit together in practice. 
Readers of my Build a Large Language Model (From Scratch) and Build a Large Reasoning Model (From Scratch&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">21 days ago &#183; 511 likes &#183; 51 comments &#183; Sebastian Raschka, PhD</div></a></div><p>Sebastian Raschka makes the case that what looks like a smarter model is usually a better harness around the same model. He breaks coding agents into six concrete pieces (live repo context, prompt shape and cache reuse, tool access, context reduction, structured session memory, and bounded subagents) and shows how they fit into a three-layer architecture of model, agent loop, and runtime support. Read this before you blame your model for behavior that&#8217;s actually a context engineering problem.</p><h3><strong>2. Your RAG Pipeline Is Overkill</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:193050808,&quot;url&quot;:&quot;https://www.decodingai.com/p/recursive-language-models&quot;,&quot;publication_id&quot;:1526003,&quot;publication_name&quot;:&quot;Decoding AI Magazine&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!k2ig!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00bc74e0-3601-49ce-8ab9-4c7b499ce597_1280x1280.png&quot;,&quot;title&quot;:&quot;Your RAG Pipeline Is Overkill&quot;,&quot;truncated_body_text&quot;:&quot;We constantly fight a battle against the context window limit. You either compress your data until it loses meaning, or you build a massive infrastructure project just to read a few documents. Today, we look at a third option. 
We explore a pattern that allows models to read millions of tokens by treating data as an environment rather than an input.&quot;,&quot;date&quot;:&quot;2026-04-07T11:03:14.141Z&quot;,&quot;like_count&quot;:43,&quot;comment_count&quot;:3,&quot;bylines&quot;:[{&quot;id&quot;:110559689,&quot;name&quot;:&quot;Paul Iusztin&quot;,&quot;handle&quot;:&quot;pauliusztin&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0714d360-396c-4b41-a676-1b58dc1dc5f3_1470x1470.jpeg&quot;,&quot;bio&quot;:&quot;Senior AI Engineer &#8226; Founder @ Decoding AI &#8226; Author @ LLM Engineer&#8217;s Handbook I ship AI products and teach you about the process.&quot;,&quot;profile_set_up_at&quot;:&quot;2023-03-27T06:10:29.110Z&quot;,&quot;reader_installed_at&quot;:&quot;2023-08-24T17:10:51.998Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:1494048,&quot;user_id&quot;:110559689,&quot;publication_id&quot;:1526003,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:1526003,&quot;name&quot;:&quot;Decoding AI Magazine&quot;,&quot;subdomain&quot;:&quot;decodingaimagazine&quot;,&quot;custom_domain&quot;:&quot;www.decodingai.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Join for content on designing, building, and shipping AI software. Learn AI engineering, end-to-end, from idea to production. 
Every Tuesday.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/00bc74e0-3601-49ce-8ab9-4c7b499ce597_1280x1280.png&quot;,&quot;author_id&quot;:110559689,&quot;primary_user_id&quot;:110559689,&quot;theme_var_background_pop&quot;:&quot;#A33ACB&quot;,&quot;created_at&quot;:&quot;2023-03-27T06:17:03.688Z&quot;,&quot;email_from_name&quot;:&quot;Decoding AI Magazine&quot;,&quot;copyright&quot;:&quot;Paul Iusztin&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/85e4cd45-ca39-48d4-941c-86dc67ba9848_1344x325.png&quot;}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100,&quot;status&quot;:{&quot;bestsellerTier&quot;:100,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:100},&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://www.decodingai.com/p/recursive-language-models?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!k2ig!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00bc74e0-3601-49ce-8ab9-4c7b499ce597_1280x1280.png" loading="lazy"><span class="embedded-post-publication-name">Decoding AI 
Magazine</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Your RAG Pipeline Is Overkill</div></div><div class="embedded-post-body">We constantly fight a battle against the context window limit. You either compress your data until it loses meaning, or you build a massive infrastructure project just to read a few documents. Today, we look at a third option. We explore a pattern that allows models to read millions of tokens by treating data as an environment rather than an input&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">18 days ago &#183; 43 likes &#183; 3 comments &#183; Paul Iusztin</div></a></div><p>Paul Iusztin argues that recursive language models (RLMs) make most RAG pipelines unnecessary. The core idea is that the model never receives the giant document directly. Instead, the data lives outside the context as a REPL variable, and the model writes code to explore, filter, and recursively process it through <code>llm_query()</code> calls. Iusztin reports RLMs being tested up to 10 million tokens with GPT-5 and Qwen3-Coder, and lays out four scenarios (file parsing, codebase analysis, legal and financial work, research synthesis) where this approach beats stuffing chunks into a vector store.</p><h3><strong>3.
Gemma 4 and What Makes an Open Model Succeed</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:193022426,&quot;url&quot;:&quot;https://www.interconnects.ai/p/gemma-4-and-what-makes-an-open-model&quot;,&quot;publication_id&quot;:48206,&quot;publication_name&quot;:&quot;Interconnects AI&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!djof!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc52e8097-8f3d-4f7e-808b-2f4ad37f3b52_720x720.png&quot;,&quot;title&quot;:&quot;Gemma 4 and what makes an open model succeed&quot;,&quot;truncated_body_text&quot;:&quot;Having written a lot of model release blog posts, there&#8217;s something much harder about reviewing open models when they drop relative to closed models, especially in 2026. In recent years, there were so few open models, so when Llama 3 was released most people were still doing research on Llama 2 and super happy to get an update. When&quot;,&quot;date&quot;:&quot;2026-04-03T16:57:36.626Z&quot;,&quot;like_count&quot;:70,&quot;comment_count&quot;:0,&quot;bylines&quot;:[{&quot;id&quot;:10472909,&quot;name&quot;:&quot;Nathan Lambert&quot;,&quot;handle&quot;:&quot;natolambert&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!RihO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fedcdfb-e137-4f6a-9089-a46add6c6242_500x500.jpeg&quot;,&quot;bio&quot;:&quot;ML researcher making sense of AI research, products, and the uncertain technological future. PhD from Berkeley AI. 
Experience at Meta, DeepMind, HuggingFace.&quot;,&quot;profile_set_up_at&quot;:&quot;2021-04-24T01:19:33.371Z&quot;,&quot;reader_installed_at&quot;:&quot;2022-03-09T17:52:30.690Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:100753,&quot;user_id&quot;:10472909,&quot;publication_id&quot;:48206,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:48206,&quot;name&quot;:&quot;Interconnects AI&quot;,&quot;subdomain&quot;:&quot;robotic&quot;,&quot;custom_domain&quot;:&quot;www.interconnects.ai&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;The cutting edge of AI, from inside the frontier AI labs, minus the hype. The border between high-level and technical thinking. Read by leading engineers, researchers, and investors.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c52e8097-8f3d-4f7e-808b-2f4ad37f3b52_720x720.png&quot;,&quot;author_id&quot;:10472909,&quot;primary_user_id&quot;:10472909,&quot;theme_var_background_pop&quot;:&quot;#ff6b00&quot;,&quot;created_at&quot;:&quot;2020-05-21T02:59:47.895Z&quot;,&quot;email_from_name&quot;:&quot;Interconnects by Nathan Lambert&quot;,&quot;copyright&quot;:&quot;Interconnects AI, LLC&quot;,&quot;founding_plan_name&quot;:&quot;Founding 
Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;magaziney&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/858a68f7-2e7e-4dd3-bed1-631b36801ce2_1651x357.png&quot;}},{&quot;id&quot;:4610799,&quot;user_id&quot;:10472909,&quot;publication_id&quot;:4519930,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:4519930,&quot;name&quot;:&quot;natolambert overflow&quot;,&quot;subdomain&quot;:&quot;natolambert&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;a place for any extra thoughts beyond Interconnects.ai&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eb88d599-32c8-49a9-ba33-ab6327aff727_256x256.png&quot;,&quot;author_id&quot;:10472909,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-03-27T15:04:05.448Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Nathan Lambert&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:null}},{&quot;id&quot;:4926744,&quot;user_id&quot;:10472909,&quot;publication_id&quot;:4830082,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:4830082,&quot;name&quot;:&quot;Retort 
AI&quot;,&quot;subdomain&quot;:&quot;retortai&quot;,&quot;custom_domain&quot;:&quot;www.retortai.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Distilling the major events and challenges in the world of artificial intelligence and machine learning, from Thomas Krendl Gilbert and Nathan Lambert.\n\n&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cbad298c-6074-441b-ad43-d5df6dbf101d_800x800.png&quot;,&quot;author_id&quot;:10472909,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-04-25T22:10:28.216Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Nathan Lambert&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:null}}],&quot;twitter_screen_name&quot;:&quot;natolambert&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100,&quot;status&quot;:{&quot;bestsellerTier&quot;:100,&quot;subscriberTier&quot;:5,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:100},&quot;paidPublicationIds&quot;:[883883,1084918,6349492,6027,1915042,69345],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://www.interconnects.ai/p/gemma-4-and-what-makes-an-open-model?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" 
src="https://substackcdn.com/image/fetch/$s_!djof!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc52e8097-8f3d-4f7e-808b-2f4ad37f3b52_720x720.png" loading="lazy"><span class="embedded-post-publication-name">Interconnects AI</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Gemma 4 and what makes an open model succeed</div></div><div class="embedded-post-body">Having written a lot of model release blog posts, there&#8217;s something much harder about reviewing open models when they drop relative to closed models, especially in 2026. In recent years, there were so few open models, so when Llama 3 was released most people were still doing research on Llama 2 and super happy to get an update. When&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">22 days ago &#183; 70 likes &#183; Nathan Lambert</div></a></div><p>Benchmark numbers at release tell you almost nothing about which open models will actually get used, and Nathan Lambert uses the Gemma 4 launch to explain why. His five-factor framework (performance and size, country of origin, license, tooling at release, fine-tunability) maps onto the messy reality that ecosystem maturity often takes 18 months to catch up with a model launch. The note about Google moving to Apache 2.0 and the 30B dense model targeting the enterprise sweet spot is the part to underline.</p><div><hr></div><h2><strong>&#128736; 2 Tools &amp; Repos</strong></h2><h3><strong>1. 
mem0</strong></h3><p><strong><a href="https://github.com/mem0ai/mem0">https://github.com/mem0ai/mem0</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!x03M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b690285-c6b5-4e4d-bb6d-a8d9ba3f934a_1572x273.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!x03M!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b690285-c6b5-4e4d-bb6d-a8d9ba3f934a_1572x273.png 424w, https://substackcdn.com/image/fetch/$s_!x03M!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b690285-c6b5-4e4d-bb6d-a8d9ba3f934a_1572x273.png 848w, https://substackcdn.com/image/fetch/$s_!x03M!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b690285-c6b5-4e4d-bb6d-a8d9ba3f934a_1572x273.png 1272w, https://substackcdn.com/image/fetch/$s_!x03M!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b690285-c6b5-4e4d-bb6d-a8d9ba3f934a_1572x273.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!x03M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b690285-c6b5-4e4d-bb6d-a8d9ba3f934a_1572x273.png" width="1456" height="253" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1b690285-c6b5-4e4d-bb6d-a8d9ba3f934a_1572x273.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:253,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!x03M!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b690285-c6b5-4e4d-bb6d-a8d9ba3f934a_1572x273.png 424w, https://substackcdn.com/image/fetch/$s_!x03M!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b690285-c6b5-4e4d-bb6d-a8d9ba3f934a_1572x273.png 848w, https://substackcdn.com/image/fetch/$s_!x03M!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b690285-c6b5-4e4d-bb6d-a8d9ba3f934a_1572x273.png 1272w, https://substackcdn.com/image/fetch/$s_!x03M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b690285-c6b5-4e4d-bb6d-a8d9ba3f934a_1572x273.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Memory is the bottleneck for any agent that needs to remember anything across sessions, and mem0 is the fastest-moving open option for solving it. The framework manages user, session, and agent state through a single API, with self-hosted Python and TypeScript packages plus a managed cloud service if you want to skip the infra. 
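</p><p><em>A minimal sketch of that single-API pattern in plain Python. This is an illustrative mock, not mem0&#8217;s actual client (real deployments use embeddings and vector search): memories are retained per user and recalled by relevance, so only the relevant facts, not the full history, reach the prompt.</em></p>

```python
# Toy mock of the per-user memory pattern a framework like mem0 manages.
# NOT mem0's real API: naive word overlap stands in for vector search,
# purely to illustrate the add/search flow and user scoping.
class ToyMemory:
    def __init__(self):
        self._records = []  # (user_id, text) pairs

    def add(self, text, user_id):
        """Retain a fact, scoped to one user."""
        self._records.append((user_id, text))

    def search(self, query, user_id, limit=3):
        """Recall the user's most query-relevant facts."""
        q_words = set(query.lower().split())
        scored = [
            (len(q_words & set(text.lower().split())), text)
            for uid, text in self._records
            if uid == user_id
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for score, text in scored[:limit] if score > 0]

mem = ToyMemory()
mem.add("prefers concise answers with code examples", user_id="alice")
mem.add("is building a retrieval pipeline in Python", user_id="alice")
mem.add("works in Rust", user_id="bob")

# Only Alice's relevant memories come back; Bob's never mix in.
relevant = mem.search("python retrieval question", user_id="alice")
```

<p>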
The team reports 26% better accuracy than OpenAI Memory and roughly 90% lower token usage versus full-context approaches; those are vendor-reported benchmarks, so validate them against your own workload before committing.</p><h3><strong>2. goose</strong></h3><p><strong><a href="https://github.com/block/goose">https://github.com/block/goose</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iiVM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fdd6266-e39c-485d-96b5-b86869fef03b_1200x600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iiVM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fdd6266-e39c-485d-96b5-b86869fef03b_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!iiVM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fdd6266-e39c-485d-96b5-b86869fef03b_1200x600.png 848w, https://substackcdn.com/image/fetch/$s_!iiVM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fdd6266-e39c-485d-96b5-b86869fef03b_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!iiVM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fdd6266-e39c-485d-96b5-b86869fef03b_1200x600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iiVM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fdd6266-e39c-485d-96b5-b86869fef03b_1200x600.png" width="1200" height="600"
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4fdd6266-e39c-485d-96b5-b86869fef03b_1200x600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iiVM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fdd6266-e39c-485d-96b5-b86869fef03b_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!iiVM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fdd6266-e39c-485d-96b5-b86869fef03b_1200x600.png 848w, https://substackcdn.com/image/fetch/$s_!iiVM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fdd6266-e39c-485d-96b5-b86869fef03b_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!iiVM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fdd6266-e39c-485d-96b5-b86869fef03b_1200x600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Block built goose as a general-purpose AI agent in Rust, and it&#8217;s now part of the Agentic AI Foundation under the Linux Foundation. You get a desktop app, CLI, and API that work with 15+ LLM providers and over 70 MCP extensions, and you can use existing Claude, ChatGPT, or Gemini subscriptions instead of API keys. 
If you&#8217;ve been looking for a serious open alternative to the closed coding agent stack, this is the most active one right now.</p><div><hr></div><h2><strong>&#127891; 1 Pick of the Week</strong></h2><h3><strong>How to Automate Your Life with Claude Code</strong></h3><div id="youtube2-LJ1YZ3Uek3g" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;LJ1YZ3Uek3g&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/LJ1YZ3Uek3g?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Claude Code as a personal operating system, not as a coding tool. Hilary Gridley (former product leader) walks through how she runs her professional work and personal life through it as her primary interface. Her &#8220;anti-system system&#8221; leans on simple capture (an iPhone back-tap shortcut) and lets the model learn her preferences through observation rather than upfront configuration. The 10x impact framework for deciding what to automate is the part most people will steal, and the whole 51-minute walkthrough is reproducible on your own setup.</p><div><hr></div><p><em>Thanks for reading The Tokenizer! 
If you found this valuable, please share it with your colleagues and consider subscribing to <a href="https://newsletter.artofsaience.com">Gradient Ascent</a> for more AI insights.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption"></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Tsinghua's Multi-Agent AI Classroom, Anthropic's Context Engineering Playbook, and a 54 LLM-Architecture Gallery - 📚 The Tokenizer Edition #22]]></title><description><![CDATA[This week's most valuable AI resources]]></description><link>https://newsletter.artofsaience.com/p/tsinghuas-multi-agent-ai-classroom</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/tsinghuas-multi-agent-ai-classroom</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Thu, 02 Apr 2026 23:34:09 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/CepbWmGie0E" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! Video generation went from &#8220;impressive demo&#8221; to &#8220;real-time streaming&#8221; this week, with three papers pushing interactive and long-form video into practical territory.
Meanwhile, the tooling side caught up too, with Anthropic publishing one of the clearest guides yet on how to keep long-running agents from drowning in their own context.</p><h3><strong>New here?</strong></h3><p><em>The Tokenizer is my resource-focused newsletter edition where I curate the best AI/ML papers, videos, articles, tools, and learning resources so you don&#8217;t have to sift through the noise. Subscribe to <a href="https://newsletter.artofsaience.com">Gradient Ascent</a> for the full experience.</em></p><h2><strong>TL;DR</strong></h2><p>What caught my attention this week:</p><ul><li><p>&#128196; <strong>Papers:</strong> Streaming video generation hits 16 FPS, speculative sampling gets task-aware, and an autonomous medical AI scientist passes peer review</p></li><li><p>&#127909; <strong>Videos:</strong> Sebastian Raschka maps 54+ LLM architectures, TurboQuant compresses KV cache to 3.5 bits, and two practical Claude Code tutorials</p></li><li><p>&#128240; <strong>Reads:</strong> Ethan Mollick on why AI interfaces matter more than models, Cameron Wolfe dissects LLM benchmarks, and Anthropic&#8217;s three primitives for context management</p></li><li><p>&#128736; <strong>Tools:</strong> A biomimetic agent memory system with retain/recall/reflect, and an autonomous pentester that proves vulnerabilities with working exploits</p></li><li><p>&#127891; <strong>Learning:</strong> Tsinghua&#8217;s multi-agent AI classroom turns any topic into an interactive lesson with AI teachers, students, and a shared whiteboard</p></li></ul><div><hr></div><h2><strong>&#128196; 5 Papers</strong></h2><h3><strong>1. 
ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.25746">https://arxiv.org/abs/2603.25746</a></strong> | <strong><a href="https://github.com/KlingAIResearch/ShotStream">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-TCc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcf36fd9-46e5-4c5c-bc14-c3b0e022a7c2_896x405.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-TCc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcf36fd9-46e5-4c5c-bc14-c3b0e022a7c2_896x405.png 424w, https://substackcdn.com/image/fetch/$s_!-TCc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcf36fd9-46e5-4c5c-bc14-c3b0e022a7c2_896x405.png 848w, https://substackcdn.com/image/fetch/$s_!-TCc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcf36fd9-46e5-4c5c-bc14-c3b0e022a7c2_896x405.png 1272w, https://substackcdn.com/image/fetch/$s_!-TCc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcf36fd9-46e5-4c5c-bc14-c3b0e022a7c2_896x405.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-TCc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcf36fd9-46e5-4c5c-bc14-c3b0e022a7c2_896x405.png" width="896" height="405" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bcf36fd9-46e5-4c5c-bc14-c3b0e022a7c2_896x405.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:405,&quot;width&quot;:896,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-TCc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcf36fd9-46e5-4c5c-bc14-c3b0e022a7c2_896x405.png 424w, https://substackcdn.com/image/fetch/$s_!-TCc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcf36fd9-46e5-4c5c-bc14-c3b0e022a7c2_896x405.png 848w, https://substackcdn.com/image/fetch/$s_!-TCc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcf36fd9-46e5-4c5c-bc14-c3b0e022a7c2_896x405.png 1272w, https://substackcdn.com/image/fetch/$s_!-TCc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcf36fd9-46e5-4c5c-bc14-c3b0e022a7c2_896x405.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Real-time multi-shot video generation that maintains character consistency across scene transitions. ShotStream introduces a causal architecture with dual-cache memory (global context for inter-shot consistency, local context for intra-shot coherence) that enables ~16 FPS streaming, a 25x throughput improvement over bidirectional approaches. From the Kling AI Research team, this is the first system that makes interactive video storytelling feel responsive enough for real-time use.</p><h3><strong>2. 
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models (HyDRA)</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.25716">https://arxiv.org/abs/2603.25716</a></strong> | <strong><a href="https://github.com/H-EmbodVis/HyDRA">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tAx1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c720adc-89f6-4150-afb7-1e6bb6ef53d4_996x975.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tAx1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c720adc-89f6-4150-afb7-1e6bb6ef53d4_996x975.png 424w, https://substackcdn.com/image/fetch/$s_!tAx1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c720adc-89f6-4150-afb7-1e6bb6ef53d4_996x975.png 848w, https://substackcdn.com/image/fetch/$s_!tAx1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c720adc-89f6-4150-afb7-1e6bb6ef53d4_996x975.png 1272w, https://substackcdn.com/image/fetch/$s_!tAx1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c720adc-89f6-4150-afb7-1e6bb6ef53d4_996x975.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tAx1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c720adc-89f6-4150-afb7-1e6bb6ef53d4_996x975.png" width="996" height="975" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c720adc-89f6-4150-afb7-1e6bb6ef53d4_996x975.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:975,&quot;width&quot;:996,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tAx1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c720adc-89f6-4150-afb7-1e6bb6ef53d4_996x975.png 424w, https://substackcdn.com/image/fetch/$s_!tAx1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c720adc-89f6-4150-afb7-1e6bb6ef53d4_996x975.png 848w, https://substackcdn.com/image/fetch/$s_!tAx1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c720adc-89f6-4150-afb7-1e6bb6ef53d4_996x975.png 1272w, https://substackcdn.com/image/fetch/$s_!tAx1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c720adc-89f6-4150-afb7-1e6bb6ef53d4_996x975.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Video world models lose track of objects the moment they disappear behind something. HyDRA fixes this with a hybrid memory system that separates archival storage (for static scenes) from working memory (for active, occluded objects). The result: a +5.5 dB PSNR improvement over commercial systems like WorldPlay on a new Dynamic Object Tracking benchmark. Also ships HM-World, the first large-scale video dataset dedicated to hybrid memory evaluation with exit-entry occlusion events.</p><h3><strong>3. 
TAPS: Task Aware Proposal Distributions for Speculative Sampling</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.27027">https://arxiv.org/abs/2603.27027</a></strong> | <strong><a href="https://github.com/Moe-Zbeeb/TAPS">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Nnc2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fc8cc54-d15b-49b7-a791-442df4dc7264_1600x560.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Nnc2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fc8cc54-d15b-49b7-a791-442df4dc7264_1600x560.png 424w, https://substackcdn.com/image/fetch/$s_!Nnc2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fc8cc54-d15b-49b7-a791-442df4dc7264_1600x560.png 848w, https://substackcdn.com/image/fetch/$s_!Nnc2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fc8cc54-d15b-49b7-a791-442df4dc7264_1600x560.png 1272w, https://substackcdn.com/image/fetch/$s_!Nnc2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fc8cc54-d15b-49b7-a791-442df4dc7264_1600x560.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Nnc2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fc8cc54-d15b-49b7-a791-442df4dc7264_1600x560.png" width="1456" height="510" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3fc8cc54-d15b-49b7-a791-442df4dc7264_1600x560.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:510,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Nnc2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fc8cc54-d15b-49b7-a791-442df4dc7264_1600x560.png 424w, https://substackcdn.com/image/fetch/$s_!Nnc2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fc8cc54-d15b-49b7-a791-442df4dc7264_1600x560.png 848w, https://substackcdn.com/image/fetch/$s_!Nnc2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fc8cc54-d15b-49b7-a791-442df4dc7264_1600x560.png 1272w, https://substackcdn.com/image/fetch/$s_!Nnc2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fc8cc54-d15b-49b7-a791-442df4dc7264_1600x560.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Speculative decoding&#8217;s dirty secret: generic draft models waste tokens because they don&#8217;t match the downstream task distribution. TAPS trains task-specific draft models that align with the target model&#8217;s behavior on actual workloads, yielding roughly 26% longer acceptance lengths than general-purpose drafters. Practical and immediately applicable if you&#8217;re running speculative decoding in production.</p><h3><strong>4. 
Towards a Medical AI Scientist</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.28589">https://arxiv.org/abs/2603.28589</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bfe1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8480f597-0aef-40fd-adfd-cf7630829799_1600x463.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bfe1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8480f597-0aef-40fd-adfd-cf7630829799_1600x463.png 424w, https://substackcdn.com/image/fetch/$s_!bfe1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8480f597-0aef-40fd-adfd-cf7630829799_1600x463.png 848w, https://substackcdn.com/image/fetch/$s_!bfe1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8480f597-0aef-40fd-adfd-cf7630829799_1600x463.png 1272w, https://substackcdn.com/image/fetch/$s_!bfe1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8480f597-0aef-40fd-adfd-cf7630829799_1600x463.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bfe1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8480f597-0aef-40fd-adfd-cf7630829799_1600x463.png" width="1456" height="421" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8480f597-0aef-40fd-adfd-cf7630829799_1600x463.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:421,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bfe1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8480f597-0aef-40fd-adfd-cf7630829799_1600x463.png 424w, https://substackcdn.com/image/fetch/$s_!bfe1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8480f597-0aef-40fd-adfd-cf7630829799_1600x463.png 848w, https://substackcdn.com/image/fetch/$s_!bfe1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8480f597-0aef-40fd-adfd-cf7630829799_1600x463.png 1272w, https://substackcdn.com/image/fetch/$s_!bfe1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8480f597-0aef-40fd-adfd-cf7630829799_1600x463.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>An autonomous research framework that generates clinical hypotheses, designs experiments, executes analyses, and writes papers. The system achieved a 91% execution success rate versus GPT-5&#8217;s 60%, and one of its generated papers was accepted at ICAIS 2025 (36.8% acceptance rate). This is a concrete step toward AI that does science, not just assists with it.</p><h3><strong>5. 
PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.25730">https://arxiv.org/abs/2603.25730</a></strong> | <strong><a href="https://github.com/ShandaAI/PackForcing">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mcXT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb279880-3d3d-4aea-bdd2-d728b10dd2d5_793x491.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mcXT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb279880-3d3d-4aea-bdd2-d728b10dd2d5_793x491.png 424w, https://substackcdn.com/image/fetch/$s_!mcXT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb279880-3d3d-4aea-bdd2-d728b10dd2d5_793x491.png 848w, https://substackcdn.com/image/fetch/$s_!mcXT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb279880-3d3d-4aea-bdd2-d728b10dd2d5_793x491.png 1272w, https://substackcdn.com/image/fetch/$s_!mcXT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb279880-3d3d-4aea-bdd2-d728b10dd2d5_793x491.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mcXT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb279880-3d3d-4aea-bdd2-d728b10dd2d5_793x491.png" width="793" height="491" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bb279880-3d3d-4aea-bdd2-d728b10dd2d5_793x491.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:491,&quot;width&quot;:793,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mcXT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb279880-3d3d-4aea-bdd2-d728b10dd2d5_793x491.png 424w, https://substackcdn.com/image/fetch/$s_!mcXT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb279880-3d3d-4aea-bdd2-d728b10dd2d5_793x491.png 848w, https://substackcdn.com/image/fetch/$s_!mcXT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb279880-3d3d-4aea-bdd2-d728b10dd2d5_793x491.png 1272w, https://substackcdn.com/image/fetch/$s_!mcXT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb279880-3d3d-4aea-bdd2-d728b10dd2d5_793x491.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Training on long videos is expensive. PackForcing shows you don&#8217;t need to. By introducing hierarchical KV-cache management with a bounded 4GB memory budget, it achieves 24x temporal extrapolation: models trained on short clips generate coherent long videos. The spatiotemporal compression maintains temporal consistency while keeping inference memory constant regardless of video length.</p><div><hr></div><h2><strong>&#127909; 4 Videos</strong></h2><h3><strong>1. 
54+ LLM Architectures and 7 Attention Variants in One Visual Gallery</strong></h3><div id="youtube2-CepbWmGie0E" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;CepbWmGie0E&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/CepbWmGie0E?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>A visual map of how LLM architectures evolved from GPT-2 (2019) to Kimi K2.5 (2026), covering 54+ models from 270M to 1T parameters. The 38-minute deep dive compares 7 attention variants: standard MHA, Grouped-Query, Sliding-Window, Multi-Head Latent (DeepSeek&#8217;s MLA), Sparse, Gated, and Hybrid. Sebastian Raschka built the companion Architecture Gallery as an open resource, and this walkthrough is how you actually learn to read it.</p><h3><strong>2. TurboQuant: Compressing KV Cache to 3.5 Bits Per Channel</strong></h3><div id="youtube2-7YVrb3-ABYE" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;7YVrb3-ABYE&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/7YVrb3-ABYE?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Random rotation plus scalar quantization can compress KV cache vectors to near-optimal distortion at 3.5 bits per channel. That&#8217;s the core of Google&#8217;s TurboQuant paper, broken down here by Karoly Zsolnai-Feher (Two Minute Papers). 
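</p><p>The core recipe is simple enough to sketch: multiply by a random orthogonal matrix so outlier channels get mixed evenly across all coordinates, then apply plain uniform scalar quantization. A toy numpy round trip (illustrative only; the dimension, bit width, and per-vector scaling here are my choices, not the paper&#8217;s):</p>

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64

# Random orthogonal rotation: take Q from the QR decomposition
# of a Gaussian matrix.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))

# Synthetic "KV cache" vectors with wildly uneven channel scales,
# the situation where naive scalar quantization wastes bits.
kv = rng.normal(size=(1000, d)) * np.linspace(0.1, 3.0, d)

def quantize(x, bits):
    # Symmetric uniform quantizer with one scale per vector.
    scale = np.abs(x).max(axis=-1, keepdims=True) / (2 ** (bits - 1) - 1)
    return np.round(x / scale), scale

# Rotate, quantize at 4 bits per channel, then undo the rotation at read time.
codes, scale = quantize(kv @ Q, bits=4)
recovered = (codes * scale) @ Q.T

mse = float(np.mean((kv - recovered) ** 2))
print(f"round-trip MSE at 4 bits/channel: {mse:.4f}")
```

<p>Because the rotation is orthogonal, the error measured after rotating back is exactly the quantization error introduced in the rotated space, so the rotation itself costs nothing in fidelity.</p><p>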
The practical result: cheaper LLM inference through aggressive cache compression without meaningful quality loss, with community review context and reproduction attempts included.</p><h3><strong>3. Building a Mobile Fitness App with Claude Code and Pencil in 16 Minutes</strong></h3><div id="youtube2-oS53by4Hwvo" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;oS53by4Hwvo&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/oS53by4Hwvo?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>A three-phase workflow for building mobile apps without coding: define requirements conversationally, generate 8 UI screens with Pencil (pencil.dev), then build the working app with Claude Code. Peter Yang takes it from blank screen to a fitness tracker running on iOS via Expo Go in 16 minutes. Covers workout creation, live session tracking, calendar progress, and the path to App Store deployment.</p><h3><strong>4. 
The &#8220;Recording Mode&#8221; Trick for Privacy-Safe Claude Code Demos</strong></h3><div id="youtube2-5O3rruy2SKw" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;5O3rruy2SKw&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/5O3rruy2SKw?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>A 2-minute clip with a clever idea: instead of maintaining separate anonymized demo environments, create a Claude Code skill called &#8220;recording on&#8221; that intercepts and anonymizes personal information in real-time. It tracks consistent mappings (person A stays person A), toggles on and off with zero friction, and works for B2B demos where you need to show live production data without exposing customer names or financial details.</p><div><hr></div><h2><strong>&#128240; 3 Curated Reads</strong></h2><h3><strong>1. Claude Dispatch and the Power of Interfaces</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:192379643,&quot;url&quot;:&quot;https://www.oneusefulthing.org/p/claude-dispatch-and-the-power-of&quot;,&quot;publication_id&quot;:1180644,&quot;publication_name&quot;:&quot;One Useful Thing&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!hyZZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd2ee4f7-3e71-42f0-92eb-4d3018127e08_1024x1024.png&quot;,&quot;title&quot;:&quot;Claude Dispatch and the Power of Interfaces&quot;,&quot;truncated_body_text&quot;:&quot;AIs are already far more capable than most people realize. 
A large part of this so-called capability overhang comes not from the limits of AI (though, of course, they still have many limits), but from how people interact with it. The vast majority of people access AI through chatbots, and usually the free versions with less capable models. A chatbot is &#8230;&quot;,&quot;date&quot;:&quot;2026-03-31T22:34:37.308Z&quot;,&quot;like_count&quot;:551,&quot;comment_count&quot;:26,&quot;bylines&quot;:[{&quot;id&quot;:846835,&quot;name&quot;:&quot;Ethan Mollick&quot;,&quot;handle&quot;:&quot;oneusefulthing&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c05cdbc-40fd-459b-915d-f8bc8ac8bf01_3509x5263.jpeg&quot;,&quot;bio&quot;:&quot;I am a professor at the Wharton School of the University of Pennsylvania. I study entrepreneurship &amp; innovation and AI. I am trying to understand what our new AI-haunted era means for work and education.&quot;,&quot;profile_set_up_at&quot;:&quot;2022-07-03T02:55:46.296Z&quot;,&quot;reader_installed_at&quot;:&quot;2024-10-18T13:48:35.897Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:1134116,&quot;user_id&quot;:846835,&quot;publication_id&quot;:1180644,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:1180644,&quot;name&quot;:&quot;One Useful Thing&quot;,&quot;subdomain&quot;:&quot;oneusefulthing&quot;,&quot;custom_domain&quot;:&quot;www.oneusefulthing.org&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Trying to understand the implications of AI for work, education, and life. By Prof. 
Ethan Mollick&quot;,&quot;logo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/cd2ee4f7-3e71-42f0-92eb-4d3018127e08_1024x1024.png&quot;,&quot;author_id&quot;:846835,&quot;primary_user_id&quot;:846835,&quot;theme_var_background_pop&quot;:&quot;#BAA049&quot;,&quot;created_at&quot;:&quot;2022-11-08T03:49:40.900Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Ethan Mollick&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:null}}],&quot;twitter_screen_name&quot;:&quot;emollick&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:1000,&quot;status&quot;:{&quot;bestsellerTier&quot;:1000,&quot;subscriberTier&quot;:5,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:1000},&quot;paidPublicationIds&quot;:[320996,2880588,2141880,1084089,3061248,1198173,35345],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://www.oneusefulthing.org/p/claude-dispatch-and-the-power-of?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!hyZZ!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd2ee4f7-3e71-42f0-92eb-4d3018127e08_1024x1024.png" loading="lazy"><span class="embedded-post-publication-name">One Useful 
Thing</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Claude Dispatch and the Power of Interfaces</div></div><div class="embedded-post-body">AIs are already far more capable than most people realize. A large part of this so-called capability overhang comes not from the limits of AI (though, of course, they still have many limits), but from how people interact with it. The vast majority of people access AI through chatbots, and usually the free versions with less capable models. A chatbot is &#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">24 days ago &#183; 551 likes &#183; 26 comments &#183; Ethan Mollick</div></a></div><p>Ethan Mollick (Wharton) argues the gap between AI capability and actual user experience is an interface problem, not a model problem. Current chatbot UIs impose cognitive costs that overwhelm productivity gains, particularly for less experienced workers. Three paths forward: specialized professional tools (the coding IDE model), meeting users on familiar platforms (WhatsApp, Slack), and dynamic interfaces where AI generates the right UI on the fly. Includes hands-on demos of Claude Dispatch and Cowork.</p><h3><strong>2. 
The Anatomy of an LLM Benchmark</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:190515363,&quot;url&quot;:&quot;https://cameronrwolfe.substack.com/p/llm-bench&quot;,&quot;publication_id&quot;:1092659,&quot;publication_name&quot;:&quot;Deep (Learning) Focus&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!87xa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9b43fb-52d5-40da-995d-5b7cd3f91064_896x896.png&quot;,&quot;title&quot;:&quot;The Anatomy of an LLM Benchmark&quot;,&quot;truncated_body_text&quot;:&quot;Throughout the history of AI research, progress has been measured&#8212;and accelerated&#8212;by high-quality benchmarks. AI is an empirical field that is driven by discovering interventions that improve performance on key benchmarks. For large language models (LLMs) in particular, creating useful benchmarks is hard due to rapidly advancing &#8230;&quot;,&quot;date&quot;:&quot;2026-03-30T09:33:10.210Z&quot;,&quot;like_count&quot;:73,&quot;comment_count&quot;:0,&quot;bylines&quot;:[{&quot;id&quot;:29736521,&quot;name&quot;:&quot;Cameron R. Wolfe, Ph.D.&quot;,&quot;handle&quot;:&quot;cwolferesearch&quot;,&quot;previous_name&quot;:&quot;Cameron R. 
Wolfe&quot;,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/69aba7df-b571-4609-aa47-fc2d031c11b8_1242x1595.jpeg&quot;,&quot;bio&quot;:&quot;Research @ Netflix &#8226; Rice University PhD &#8226; I make AI understandable&quot;,&quot;profile_set_up_at&quot;:&quot;2022-09-17T15:11:34.083Z&quot;,&quot;reader_installed_at&quot;:&quot;2023-01-10T11:25:00.723Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:1042380,&quot;user_id&quot;:29736521,&quot;publication_id&quot;:1092659,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:1092659,&quot;name&quot;:&quot;Deep (Learning) Focus&quot;,&quot;subdomain&quot;:&quot;cameronrwolfe&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;I contextualize and explain important topics in AI research.&quot;,&quot;logo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/ab9b43fb-52d5-40da-995d-5b7cd3f91064_896x896.png&quot;,&quot;author_id&quot;:29736521,&quot;primary_user_id&quot;:29736521,&quot;theme_var_background_pop&quot;:&quot;#6C0095&quot;,&quot;created_at&quot;:&quot;2022-09-17T15:12:33.160Z&quot;,&quot;email_from_name&quot;:&quot;Deep (Learning) Focus&quot;,&quot;copyright&quot;:&quot;Cameron R. 
Wolfe&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:null}}],&quot;twitter_screen_name&quot;:&quot;cwolferesearch&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100,&quot;status&quot;:{&quot;bestsellerTier&quot;:100,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:100},&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://cameronrwolfe.substack.com/p/llm-bench?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!87xa!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9b43fb-52d5-40da-995d-5b7cd3f91064_896x896.png" loading="lazy"><span class="embedded-post-publication-name">Deep (Learning) Focus</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">The Anatomy of an LLM Benchmark</div></div><div class="embedded-post-body">Throughout the history of AI research, progress has been measured&#8212;and accelerated&#8212;by high-quality benchmarks. AI is an empirical field that is driven by discovering interventions that improve performance on key benchmarks. 
For large language models (LLMs) in particular, creating useful benchmarks is hard due to rapidly advancing &#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">25 days ago &#183; 73 likes &#183; Cameron R. Wolfe, Ph.D.</div></a></div><p>LLM benchmarks break in predictable ways: MMLU has a 6.49% error rate (Virology hits 57%), and correcting those errors moved Llama-3.1-405B from 16th to 1st. Cameron Wolfe (Deep Learning Focus) maps the full lifecycle of how benchmarks are designed, saturate, and get replaced. The standout section covers Item Response Theory, which cuts evaluation costs 140-160x by selecting only the most informative test items.</p><h3><strong>3. Context Engineering for AI Agents: Memory, Compaction, and Tool Clearing</strong></h3><p><strong><a href="https://platform.claude.com/cookbook/tool-use-context-engineering-context-engineering-tools">https://platform.claude.com/cookbook/tool-use-context-engineering-context-engineering-tools</a></strong></p><p>Three composable primitives for managing context in long-running agents, each targeting a different type of bloat. Clearing drops re-fetchable tool outputs at zero inference cost (peak context: 173K tokens vs. 335K baseline). Compaction summarizes conversation history (169K peak, lossy). Memory persists knowledge across sessions by letting the agent write its own notes. Isabella He (Anthropic) includes a diagnostic framework: profile your agent&#8217;s token breakdown first (in the demo, 96.3% of tokens were stale file-read results), then pick the primitive that matches.</p><div><hr></div><h2><strong>&#128736; 2 Tools &amp; Repos</strong></h2><h3><strong>1. 
Hindsight</strong></h3><p><strong><a href="https://github.com/vectorize-io/hindsight">https://github.com/vectorize-io/hindsight</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!isrE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cea4792-eb99-466b-8a0b-196cec937229_1572x275.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!isrE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cea4792-eb99-466b-8a0b-196cec937229_1572x275.png 424w, https://substackcdn.com/image/fetch/$s_!isrE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cea4792-eb99-466b-8a0b-196cec937229_1572x275.png 848w, https://substackcdn.com/image/fetch/$s_!isrE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cea4792-eb99-466b-8a0b-196cec937229_1572x275.png 1272w, https://substackcdn.com/image/fetch/$s_!isrE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cea4792-eb99-466b-8a0b-196cec937229_1572x275.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!isrE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cea4792-eb99-466b-8a0b-196cec937229_1572x275.png" width="1456" height="255" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6cea4792-eb99-466b-8a0b-196cec937229_1572x275.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:255,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!isrE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cea4792-eb99-466b-8a0b-196cec937229_1572x275.png 424w, https://substackcdn.com/image/fetch/$s_!isrE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cea4792-eb99-466b-8a0b-196cec937229_1572x275.png 848w, https://substackcdn.com/image/fetch/$s_!isrE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cea4792-eb99-466b-8a0b-196cec937229_1572x275.png 1272w, https://substackcdn.com/image/fetch/$s_!isrE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cea4792-eb99-466b-8a0b-196cec937229_1572x275.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Agent memory that goes beyond conversation history. Hindsight provides three API primitives (retain, recall, reflect) that let agents learn, retrieve, and synthesize knowledge over time. Recall runs four parallel strategies (semantic, keyword, graph, temporal) with cross-encoder reranking. The reflect API generates new insights from existing memories, not just retrieval. 
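</p><p><em>As a concrete illustration of the retain/recall/reflect pattern, here is a minimal in-memory sketch. The class and method names are invented for this example and do not reflect Hindsight&#8217;s actual API; real recall fans out across semantic, keyword, graph, and temporal strategies with cross-encoder reranking, where this toy uses a single keyword-overlap score.</em></p>

```python
from collections import Counter

class MemoryBank:
    """Toy retain/recall/reflect store; all names here are invented."""

    def __init__(self):
        self.memories = []

    def retain(self, text):
        # Store a new memory verbatim.
        self.memories.append(text)

    def recall(self, query, k=2):
        # Rank stored memories by keyword overlap with the query.
        # (A real system would also run semantic, graph, and temporal
        # retrieval in parallel, then rerank with a cross-encoder.)
        q = Counter(query.lower().split())
        scored = sorted(
            ((sum((Counter(m.lower().split()) & q).values()), m)
             for m in self.memories),
            reverse=True,
        )
        return [m for score, m in scored if score > 0][:k]

    def reflect(self, query):
        # Synthesize a new insight from recalled memories and keep it,
        # so later recalls can surface it like any other memory.
        insight = f"insight({query}): " + " | ".join(self.recall(query))
        self.retain(insight)
        return insight

bank = MemoryBank()
bank.retain("user prefers short answers")
bank.retain("user works in Python")
print(bank.recall("what language does the user work in"))
# ['user works in Python', 'user prefers short answers']
```

<p><em>The design point is that reflect writes back into the same store it reads from, so recall can later surface synthesized insights, not just raw history.</em></p><p>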
SOTA on LongMemEval, 6.8K GitHub stars, MIT license, works with any LLM provider, and deploys via Docker or as an embedded Python library.</p><h3><strong>2. Shannon</strong></h3><p><strong><a href="https://github.com/KeygraphHQ/shannon">https://github.com/KeygraphHQ/shannon</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bdq_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97f527da-996e-4f19-88c5-9a3c07115f15_1600x843.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bdq_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97f527da-996e-4f19-88c5-9a3c07115f15_1600x843.png 424w, https://substackcdn.com/image/fetch/$s_!bdq_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97f527da-996e-4f19-88c5-9a3c07115f15_1600x843.png 848w, https://substackcdn.com/image/fetch/$s_!bdq_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97f527da-996e-4f19-88c5-9a3c07115f15_1600x843.png 1272w, https://substackcdn.com/image/fetch/$s_!bdq_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97f527da-996e-4f19-88c5-9a3c07115f15_1600x843.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bdq_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97f527da-996e-4f19-88c5-9a3c07115f15_1600x843.png" width="1456" height="767" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/97f527da-996e-4f19-88c5-9a3c07115f15_1600x843.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:767,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bdq_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97f527da-996e-4f19-88c5-9a3c07115f15_1600x843.png 424w, https://substackcdn.com/image/fetch/$s_!bdq_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97f527da-996e-4f19-88c5-9a3c07115f15_1600x843.png 848w, https://substackcdn.com/image/fetch/$s_!bdq_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97f527da-996e-4f19-88c5-9a3c07115f15_1600x843.png 1272w, https://substackcdn.com/image/fetch/$s_!bdq_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97f527da-996e-4f19-88c5-9a3c07115f15_1600x843.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>An autonomous AI pentester that reads your source code and executes real exploits against the running app. It builds a Code Property Graph to trace data flows from user input to dangerous sinks, then attacks those paths. Every reported vulnerability comes with a working proof-of-concept, not theoretical findings. 
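</p><p><em>The source-to-sink idea behind a Code Property Graph can be sketched with a toy data-flow graph. This is an illustration of the technique only, not Shannon&#8217;s implementation: the node names and flow edges below are made up, and a real tool derives the graph from parsed source code before attempting live exploits.</em></p>

```python
from collections import deque

# Toy data-flow edges: node -> nodes its value flows into.
# (Hypothetical names; a real Code Property Graph is built from source code.)
FLOWS = {
    "request.param_id": ["build_query"],
    "request.param_name": ["render_template"],
    "build_query": ["db.execute"],       # dangerous sink: SQL injection
    "render_template": ["html_output"],  # dangerous sink: XSS
}
SOURCES = ["request.param_id", "request.param_name"]
SINKS = {"db.execute", "html_output"}

def taint_paths(flows, sources, sinks):
    """BFS from each user-controlled source; report paths reaching a sink."""
    paths = []
    for src in sources:
        queue = deque([[src]])
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in sinks:
                paths.append(path)
                continue
            for nxt in flows.get(node, []):
                queue.append(path + [nxt])
    return paths

for p in taint_paths(FLOWS, SOURCES, SINKS):
    print(" -> ".join(p))
# request.param_id -> build_query -> db.execute
# request.param_name -> render_template -> html_output
```

<p><em>Each source-to-sink path found this way is a candidate vulnerability to validate with a live exploit attempt.</em></p><p>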
Handles authentication complexity including 2FA/TOTP flows.</p><div><hr></div><h2><strong>&#127891; 1 Pick of the Week</strong></h2><h3><strong>OpenMAIC: Tsinghua&#8217;s Multi-Agent AI Classroom</strong></h3><p><strong><a href="https://github.com/THU-MAIC/OpenMAIC">https://github.com/THU-MAIC/OpenMAIC</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wtGa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2e5a064-4e78-4516-b699-d538357d31ad_1600x681.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wtGa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2e5a064-4e78-4516-b699-d538357d31ad_1600x681.png 424w, https://substackcdn.com/image/fetch/$s_!wtGa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2e5a064-4e78-4516-b699-d538357d31ad_1600x681.png 848w, https://substackcdn.com/image/fetch/$s_!wtGa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2e5a064-4e78-4516-b699-d538357d31ad_1600x681.png 1272w, https://substackcdn.com/image/fetch/$s_!wtGa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2e5a064-4e78-4516-b699-d538357d31ad_1600x681.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wtGa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2e5a064-4e78-4516-b699-d538357d31ad_1600x681.png" width="1456" height="620" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b2e5a064-4e78-4516-b699-d538357d31ad_1600x681.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:620,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wtGa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2e5a064-4e78-4516-b699-d538357d31ad_1600x681.png 424w, https://substackcdn.com/image/fetch/$s_!wtGa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2e5a064-4e78-4516-b699-d538357d31ad_1600x681.png 848w, https://substackcdn.com/image/fetch/$s_!wtGa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2e5a064-4e78-4516-b699-d538357d31ad_1600x681.png 1272w, https://substackcdn.com/image/fetch/$s_!wtGa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2e5a064-4e78-4516-b699-d538357d31ad_1600x681.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Type &#8220;teach me Python in 30 minutes&#8221; and OpenMAIC generates a full interactive classroom: slides with narration, quizzes with real-time grading, interactive HTML simulations, and AI agents playing teacher and student roles who lecture, discuss, and draw on a shared whiteboard. Built on LangGraph with a director agent that orchestrates turn-taking, it supports uploading PDFs for document-to-course conversion. 13.6K stars in three weeks, live demo at open.maic.chat, and the LangGraph-based multi-agent orchestration pattern is a reusable blueprint beyond education.</p><div><hr></div><p><em>Thanks for reading The Tokenizer! 
If you found this valuable, please share it with your colleagues and consider subscribing to <a href="https://newsletter.artofsaience.com">Gradient Ascent</a> for more AI insights.</em></p>]]></content:encoded></item><item><title><![CDATA[Google Compresses KV-Cache 6x Without Training, How Every Modern Attention Variant Works, and a Claude Code Cheat Sheet - 📚 The Tokenizer Edition #21]]></title><description><![CDATA[This week's most valuable AI resources]]></description><link>https://newsletter.artofsaience.com/p/google-compresses-kv-cache-6x-without</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/google-compresses-kv-cache-6x-without</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Thu, 26 Mar 2026 12:02:32 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/DmtoVnTkQnM" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! This week&#8217;s theme is speed: faster OCR through diffusion, faster inference through speculative execution, faster compression through polar coordinates, and Stripe shipping 1,300 agent-written pull requests every week. Whether you&#8217;re optimizing tokens, transistors, or team velocity, there&#8217;s something here for you.</p><h3><strong>New here?</strong></h3><p><em>The Tokenizer is my resource-focused newsletter edition where I curate the best AI/ML papers, videos, articles, tools, and learning resources so you don&#8217;t have to sift through the noise. 
Subscribe to <a href="https://newsletter.artofsaience.com">Gradient Ascent</a> for the full experience.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption"></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h2><strong>TL;DR</strong></h2><p>What caught my attention this week:</p><ul><li><p>&#128196; <strong>Papers:</strong> Diffusion-based OCR that&#8217;s 3x faster, a package manager for AI agent skills, world modeling from Monster Hunter, probability-aware RL clipping, and speculative shortcuts for agentic vision systems</p></li><li><p>&#127909; <strong>Videos:</strong> Stripe&#8217;s internal AI coding agents at scale, building your own AI operating system from scratch, DeepSeek&#8217;s conditional memory for transformers, and going from Figma design to working code with Claude</p></li><li><p>&#128240; <strong>Reads:</strong> Sebastian Raschka&#8217;s visual taxonomy of attention mechanisms, Google&#8217;s training-free KV-cache compression via polar coordinates, and why on-device AI needs new architectures (not smaller cloud models)</p></li><li><p>&#128736; <strong>Tools:</strong> ByteDance&#8217;s filesystem-based context database for AI agents, and Cornell&#8217;s free GPU architecture course</p></li><li><p>&#127891; <strong>Learning:</strong> A beautifully maintained Claude Code cheat sheet that doubles as a feature discovery tool</p></li></ul><div><hr></div><h2><strong>&#128196; 
5 Papers</strong></h2><h3><strong>1. MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.22458">https://arxiv.org/abs/2603.22458</a></strong> | <strong><a href="https://github.com/opendatalab/MinerU-Diffusion">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Q5KX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0154cb6-e708-4b06-bf19-ef3abf7ff2bc_996x396.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Q5KX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0154cb6-e708-4b06-bf19-ef3abf7ff2bc_996x396.png 424w, https://substackcdn.com/image/fetch/$s_!Q5KX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0154cb6-e708-4b06-bf19-ef3abf7ff2bc_996x396.png 848w, https://substackcdn.com/image/fetch/$s_!Q5KX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0154cb6-e708-4b06-bf19-ef3abf7ff2bc_996x396.png 1272w, https://substackcdn.com/image/fetch/$s_!Q5KX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0154cb6-e708-4b06-bf19-ef3abf7ff2bc_996x396.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Q5KX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0154cb6-e708-4b06-bf19-ef3abf7ff2bc_996x396.png" width="996" height="396" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a0154cb6-e708-4b06-bf19-ef3abf7ff2bc_996x396.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:396,&quot;width&quot;:996,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Q5KX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0154cb6-e708-4b06-bf19-ef3abf7ff2bc_996x396.png 424w, https://substackcdn.com/image/fetch/$s_!Q5KX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0154cb6-e708-4b06-bf19-ef3abf7ff2bc_996x396.png 848w, https://substackcdn.com/image/fetch/$s_!Q5KX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0154cb6-e708-4b06-bf19-ef3abf7ff2bc_996x396.png 1272w, https://substackcdn.com/image/fetch/$s_!Q5KX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0154cb6-e708-4b06-bf19-ef3abf7ff2bc_996x396.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>What if OCR didn&#8217;t need to generate text one token at a time? This paper replaces autoregressive decoding with block-wise diffusion denoising, reframing document OCR as inverse rendering. The result: up to 3.2x faster throughput with a tunable speed-accuracy tradeoff depending on your pipeline needs. A proposed &#8220;Semantic Shuffle&#8221; benchmark shows the model genuinely reads visual structure rather than leaning on linguistic shortcuts.</p><h3><strong>2. 
SkillNet: Create, Evaluate, and Connect AI Skills</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.04448">https://arxiv.org/abs/2603.04448</a></strong> | <strong><a href="https://github.com/zjunlp/SkillNet">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q94O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdabe6880-30dc-4a22-a03b-4fc3dba2667c_797x367.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q94O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdabe6880-30dc-4a22-a03b-4fc3dba2667c_797x367.png 424w, https://substackcdn.com/image/fetch/$s_!q94O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdabe6880-30dc-4a22-a03b-4fc3dba2667c_797x367.png 848w, https://substackcdn.com/image/fetch/$s_!q94O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdabe6880-30dc-4a22-a03b-4fc3dba2667c_797x367.png 1272w, https://substackcdn.com/image/fetch/$s_!q94O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdabe6880-30dc-4a22-a03b-4fc3dba2667c_797x367.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!q94O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdabe6880-30dc-4a22-a03b-4fc3dba2667c_797x367.png" width="797" height="367" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dabe6880-30dc-4a22-a03b-4fc3dba2667c_797x367.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:367,&quot;width&quot;:797,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!q94O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdabe6880-30dc-4a22-a03b-4fc3dba2667c_797x367.png 424w, https://substackcdn.com/image/fetch/$s_!q94O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdabe6880-30dc-4a22-a03b-4fc3dba2667c_797x367.png 848w, https://substackcdn.com/image/fetch/$s_!q94O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdabe6880-30dc-4a22-a03b-4fc3dba2667c_797x367.png 1272w, https://substackcdn.com/image/fetch/$s_!q94O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdabe6880-30dc-4a22-a03b-4fc3dba2667c_797x367.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Think npm, but for AI agent skills. SkillNet treats agent capabilities as shareable, composable packages with a unified ontology for creating skills from heterogeneous sources (repos, docs, logs, prompts) and connecting them through dependency graphs. The framework delivers a 40% improvement in average rewards and 30% fewer execution steps across ALFWorld, WebShop, and ScienceWorld. With 200,000+ skills in the repository and a Python toolkit plus Open Access API, this could become critical infrastructure for the tool-using agent ecosystem.</p><h3><strong>3. 
WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.23497">https://arxiv.org/abs/2603.23497</a></strong> | <strong><a href="https://github.com/ShandaAI/WildWorld">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xuZW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58ee0959-c61c-4ceb-8f2b-41d56f8362ef_897x461.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xuZW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58ee0959-c61c-4ceb-8f2b-41d56f8362ef_897x461.png 424w, https://substackcdn.com/image/fetch/$s_!xuZW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58ee0959-c61c-4ceb-8f2b-41d56f8362ef_897x461.png 848w, https://substackcdn.com/image/fetch/$s_!xuZW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58ee0959-c61c-4ceb-8f2b-41d56f8362ef_897x461.png 1272w, https://substackcdn.com/image/fetch/$s_!xuZW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58ee0959-c61c-4ceb-8f2b-41d56f8362ef_897x461.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xuZW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58ee0959-c61c-4ceb-8f2b-41d56f8362ef_897x461.png" width="897" height="461" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/58ee0959-c61c-4ceb-8f2b-41d56f8362ef_897x461.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:461,&quot;width&quot;:897,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xuZW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58ee0959-c61c-4ceb-8f2b-41d56f8362ef_897x461.png 424w, https://substackcdn.com/image/fetch/$s_!xuZW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58ee0959-c61c-4ceb-8f2b-41d56f8362ef_897x461.png 848w, https://substackcdn.com/image/fetch/$s_!xuZW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58ee0959-c61c-4ceb-8f2b-41d56f8362ef_897x461.png 1272w, https://substackcdn.com/image/fetch/$s_!xuZW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58ee0959-c61c-4ceb-8f2b-41d56f8362ef_897x461.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Most world-modeling datasets treat video prediction as frame interpolation. WildWorld takes a different approach: 108 million frames automatically collected from Monster Hunter: Wilds, with per-frame action labels (450+ semantically meaningful actions), character skeletons, world states, camera poses, and depth maps. The key insight is decomposing dynamics into action, state, and pixels separately, giving world models the causal structure they need to simulate rather than interpolate.</p><h3><strong>4. 
BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.04918">https://arxiv.org/abs/2603.04918</a></strong> | <strong><a href="https://github.com/OpenMOSS/BandPO">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Y5-3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1380e-3e33-44da-b19e-e02980f499b5_537x432.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y5-3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1380e-3e33-44da-b19e-e02980f499b5_537x432.png 424w, https://substackcdn.com/image/fetch/$s_!Y5-3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1380e-3e33-44da-b19e-e02980f499b5_537x432.png 848w, https://substackcdn.com/image/fetch/$s_!Y5-3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1380e-3e33-44da-b19e-e02980f499b5_537x432.png 1272w, https://substackcdn.com/image/fetch/$s_!Y5-3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1380e-3e33-44da-b19e-e02980f499b5_537x432.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Y5-3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1380e-3e33-44da-b19e-e02980f499b5_537x432.png" width="537" height="432" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d5b1380e-3e33-44da-b19e-e02980f499b5_537x432.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:432,&quot;width&quot;:537,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y5-3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1380e-3e33-44da-b19e-e02980f499b5_537x432.png 424w, https://substackcdn.com/image/fetch/$s_!Y5-3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1380e-3e33-44da-b19e-e02980f499b5_537x432.png 848w, https://substackcdn.com/image/fetch/$s_!Y5-3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1380e-3e33-44da-b19e-e02980f499b5_537x432.png 1272w, https://substackcdn.com/image/fetch/$s_!Y5-3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1380e-3e33-44da-b19e-e02980f499b5_537x432.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>PPO&#8217;s fixed clipping bounds have a quiet failure mode: they disproportionately suppress high-advantage, low-probability actions, which are exactly the exploratory tail strategies you want to preserve in reasoning tasks. BandPO replaces static bounds with dynamic, probability-aware intervals derived from f-divergence constraints, formulated as a convex optimization problem with closed-form solutions. Tested on Qwen2.5 and Llama3 for mathematical reasoning, it consistently outperforms canonical GRPO clipping while preventing entropy collapse. A drop-in replacement for anyone doing GRPO-style training post-DeepSeek-R1.</p><h3><strong>5. 
SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.23483">https://arxiv.org/abs/2603.23483</a></strong> | <strong><a href="https://github.com/MAC-AutoML/SpecEyes">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JCHM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c499655-04c4-4735-bb74-7201e1100586_793x317.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JCHM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c499655-04c4-4735-bb74-7201e1100586_793x317.png 424w, https://substackcdn.com/image/fetch/$s_!JCHM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c499655-04c4-4735-bb74-7201e1100586_793x317.png 848w, https://substackcdn.com/image/fetch/$s_!JCHM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c499655-04c4-4735-bb74-7201e1100586_793x317.png 1272w, https://substackcdn.com/image/fetch/$s_!JCHM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c499655-04c4-4735-bb74-7201e1100586_793x317.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JCHM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c499655-04c4-4735-bb74-7201e1100586_793x317.png" width="793" height="317" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9c499655-04c4-4735-bb74-7201e1100586_793x317.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:317,&quot;width&quot;:793,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JCHM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c499655-04c4-4735-bb74-7201e1100586_793x317.png 424w, https://substackcdn.com/image/fetch/$s_!JCHM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c499655-04c4-4735-bb74-7201e1100586_793x317.png 848w, https://substackcdn.com/image/fetch/$s_!JCHM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c499655-04c4-4735-bb74-7201e1100586_793x317.png 1272w, https://substackcdn.com/image/fetch/$s_!JCHM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c499655-04c4-4735-bb74-7201e1100586_793x317.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Agentic vision systems are slow because every query triggers a full sequential tool-use chain, even when the answer is straightforward. SpecEyes uses a lightweight gating mechanism (based on top-K logit gaps) to predict when the expensive pipeline can be short-circuited. The result: up to 3.35x speedup (averaging 1.4-1.7x across benchmarks) with accuracy maintained or improved by up to 6.7%. The gating requires no labeled routing data, so it can drop into any production agentic vision system where latency matters.</p><div><hr></div><h2><strong>&#127909; 4 Videos</strong></h2><h3><strong>1. 
Building a Full-Stack AI Operating System from Scratch</strong></h3><div id="youtube2-rZX1OYetbSM" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;rZX1OYetbSM&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/rZX1OYetbSM?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>A three-layer architecture for a personal AI platform: webhook-driven triggers, scheduled workflows (cron-style recurring tasks), and an autonomous agent layer with persistent context and reusable skills. Dave Ebbelaar builds the entire system on FastAPI, Celery, Redis, and Docker. This 28-minute walkthrough gives you the blueprint instead of cloning random agent repos.</p><h3><strong>2. DeepSeek&#8217;s Engram: Adding Conditional Memory to Transformers</strong></h3><div id="youtube2-DmtoVnTkQnM" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;DmtoVnTkQnM&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/DmtoVnTkQnM?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>DeepSeek&#8217;s Engram module adds O(1) lookup memory to transformers by modernizing n-gram embeddings as a new sparsity dimension alongside Mixture-of-Experts. Dr. Karoly Zsolnai-Feher (Two Minute Papers) explains how it outperforms comparable MoE baselines at 27B parameters on both parameter and compute budgets. 
The surprise: the biggest gains show up not in knowledge retrieval (MMLU +3.4) but in reasoning (BBH +5.0, ARC-Challenge +3.7), because offloading pattern reconstruction to memory frees attention depth for harder tasks.</p><h3><strong>3. From Figma Design to Working Code with Claude Code and MCP</strong></h3><div id="youtube2-ydiMKfljb-I" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;ydiMKfljb-I&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/ydiMKfljb-I?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Figma mockup to working website in 15 minutes, FigJam flowchart to a working game, then exporting code back to Figma as editable components. Felix Lee (designer at ADPList) demonstrates the full design-to-code loop using Claude Code with Figma MCP in this 50-minute session hosted by Peter Yang. The core insight: MCP reads every color, spacing value, and component variant directly from Figma, eliminating the translation loss in design-to-dev handoffs.</p><h3><strong>4. How Stripe Ships 1,300 AI-Written Pull Requests Per Week</strong></h3><div id="youtube2-o5Mi5SYSDnY" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;o5Mi5SYSDnY&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/o5Mi5SYSDnY?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Stripe built internal AI coding agents called &#8220;minions&#8221; that now ship roughly 1,300 PRs per week. 
Steve Kaliski (software engineer at Stripe) walks through the architecture: Goose (Block&#8217;s open-source agent harness) with cloud dev environments, activated from Slack via emoji reactions. The key takeaway: Stripe&#8217;s existing investment in developer tooling (CI, testing, linting) made agent adoption frictionless, because good DX for humans turns out to be good DX for agents too. Non-engineers at Stripe now use minions to ship code.</p><div><hr></div><h2><strong>&#128240; 3 Curated Reads</strong></h2><h3><strong>1. A Visual Guide to Attention Variants in Modern LLMs</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:191674053,&quot;url&quot;:&quot;https://magazine.sebastianraschka.com/p/visual-attention-variants&quot;,&quot;publication_id&quot;:1174659,&quot;publication_name&quot;:&quot;Ahead of AI&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!96vs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f25d0a-212b-4853-8bcb-128d0a3edbbf_1196x1196.png&quot;,&quot;title&quot;:&quot;A Visual Guide to Attention Variants in Modern LLMs&quot;,&quot;truncated_body_text&quot;:&quot;I had originally planned to write about DeepSeek V4. 
Since it still hasn&#8217;t been released, I used the time to work on something that had been on my list for a while, namely, collecting, organizing, and refining the different LLM architectures I have covered over the past few years.&quot;,&quot;date&quot;:&quot;2026-03-22T11:55:40.110Z&quot;,&quot;like_count&quot;:267,&quot;comment_count&quot;:6,&quot;bylines&quot;:[{&quot;id&quot;:27393275,&quot;name&quot;:&quot;Sebastian Raschka, PhD&quot;,&quot;handle&quot;:&quot;rasbt&quot;,&quot;previous_name&quot;:&quot;Sebastian Raschka&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F61f4c017-506f-4e9b-a24f-76340dad0309_800x800.jpeg&quot;,&quot;bio&quot;:&quot;I'm an LLM research engineer 10+ years of experience in artificial intelligence. My expertise lies in AI &amp; LLM research focusing on code-driven implementations. I am also the author of \&quot;Build a Large Language Model From Scratch\&quot; (amzn.to/4fqvn0D).&quot;,&quot;profile_set_up_at&quot;:&quot;2022-10-09T16:19:59.744Z&quot;,&quot;reader_installed_at&quot;:&quot;2022-11-07T19:56:32.129Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:1127862,&quot;user_id&quot;:27393275,&quot;publication_id&quot;:1174659,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:1174659,&quot;name&quot;:&quot;Ahead of AI&quot;,&quot;subdomain&quot;:&quot;sebastianraschka&quot;,&quot;custom_domain&quot;:&quot;magazine.sebastianraschka.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Ahead of AI focuses on machine learning and AI research and is read by more than 150,000 researchers and practitioners who want to stay ahead in a rapidly evolving 
field.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/49f25d0a-212b-4853-8bcb-128d0a3edbbf_1196x1196.png&quot;,&quot;author_id&quot;:27393275,&quot;primary_user_id&quot;:27393275,&quot;theme_var_background_pop&quot;:&quot;#2096FF&quot;,&quot;created_at&quot;:&quot;2022-11-04T18:30:05.218Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Raschka AI Research (RAIR) Lab LLC&quot;,&quot;founding_plan_name&quot;:&quot;Founding plan&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5083e6d3-fbc9-4870-95b9-6e85d02f62a6_9366x2023.png&quot;}}],&quot;twitter_screen_name&quot;:&quot;rasbt&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:1000,&quot;status&quot;:{&quot;bestsellerTier&quot;:1000,&quot;subscriberTier&quot;:1,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:1000},&quot;paidPublicationIds&quot;:[1783977,9873],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://magazine.sebastianraschka.com/p/visual-attention-variants?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!96vs!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f25d0a-212b-4853-8bcb-128d0a3edbbf_1196x1196.png" loading="lazy"><span 
class="embedded-post-publication-name">Ahead of AI</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">A Visual Guide to Attention Variants in Modern LLMs</div></div><div class="embedded-post-body">I had originally planned to write about DeepSeek V4. Since it still hasn&#8217;t been released, I used the time to work on something that had been on my list for a while, namely, collecting, organizing, and refining the different LLM architectures I have covered over the past few years&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">a month ago &#183; 267 likes &#183; 6 comments &#183; Sebastian Raschka, PhD</div></a></div><p>Seven attention mechanism families in one visual taxonomy, from classic Multi-Head Attention through GQA, DeepSeek&#8217;s Multi-Head Latent Attention, Sliding Window Attention, and hybrid architectures mixing transformers with linear or state-space modules. Sebastian Raschka diagrams how queries, keys, and values interact in each variant, how KV-cache mechanics change, and how memory growth curves differ. The practical payoff: a clear framework for choosing the right attention mechanism based on your model scale and deployment constraints, with Gemma 3 and DeepSeek V2 as case studies.</p><h3><strong>2. 
TurboQuant: Redefining AI Efficiency with Extreme Compression</strong></h3><p><strong><a href="https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/">https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1Xzz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9a7994e-9cdf-4c45-bc4a-28df4e7ff816_1392x808.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1Xzz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9a7994e-9cdf-4c45-bc4a-28df4e7ff816_1392x808.png 424w, https://substackcdn.com/image/fetch/$s_!1Xzz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9a7994e-9cdf-4c45-bc4a-28df4e7ff816_1392x808.png 848w, https://substackcdn.com/image/fetch/$s_!1Xzz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9a7994e-9cdf-4c45-bc4a-28df4e7ff816_1392x808.png 1272w, https://substackcdn.com/image/fetch/$s_!1Xzz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9a7994e-9cdf-4c45-bc4a-28df4e7ff816_1392x808.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1Xzz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9a7994e-9cdf-4c45-bc4a-28df4e7ff816_1392x808.png" width="1392" height="808" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f9a7994e-9cdf-4c45-bc4a-28df4e7ff816_1392x808.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:808,&quot;width&quot;:1392,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1Xzz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9a7994e-9cdf-4c45-bc4a-28df4e7ff816_1392x808.png 424w, https://substackcdn.com/image/fetch/$s_!1Xzz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9a7994e-9cdf-4c45-bc4a-28df4e7ff816_1392x808.png 848w, https://substackcdn.com/image/fetch/$s_!1Xzz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9a7994e-9cdf-4c45-bc4a-28df4e7ff816_1392x808.png 1272w, https://substackcdn.com/image/fetch/$s_!1Xzz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9a7994e-9cdf-4c45-bc4a-28df4e7ff816_1392x808.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Google Research introduces a training-free, data-agnostic compression algorithm that achieves 3-bit KV-cache quantization without accuracy loss. The trick: PolarQuant converts data vectors from Cartesian to polar coordinates (concentrating angle patterns predictably), then QJL reduces residual errors to single sign bits via random projections. On H100 GPUs, TurboQuant delivers a 6x reduction in KV memory and up to 8x performance gains over 32-bit unquantized baselines. Unlike most quantization work, this targets the KV-cache specifically (the bottleneck that grows with context length) and requires zero calibration data or fine-tuning.</p><h3><strong>3. 
The Future of On-Device AI</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:191951862,&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/the-future-of-on-device-ai&quot;,&quot;publication_id&quot;:1315074,&quot;publication_name&quot;:&quot;Artificial Intelligence Made Simple&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Pfon!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77504fa0-0f08-4a38-bbde-becb151d2db8_643x644.png&quot;,&quot;title&quot;:&quot;The Future of On-Device AI&quot;,&quot;truncated_body_text&quot;:&quot;It takes time to create work that&#8217;s clear, independent, and genuinely useful. If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber. It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction.&quot;,&quot;date&quot;:&quot;2026-03-24T06:54:36.254Z&quot;,&quot;like_count&quot;:55,&quot;comment_count&quot;:1,&quot;bylines&quot;:[{&quot;id&quot;:8101724,&quot;name&quot;:&quot;Devansh&quot;,&quot;handle&quot;:&quot;chocolatemilkcultleader&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/48081c70-8afa-41e3-a44e-b0f917bc7577_1200x1600.jpeg&quot;,&quot;bio&quot;:&quot;The best meme-maker in Tech. Writer on AI, Software, and the Tech Industry. Currently in NYC Come say hi, I want more friends. 
&quot;,&quot;profile_set_up_at&quot;:&quot;2021-08-21T20:28:53.612Z&quot;,&quot;reader_installed_at&quot;:&quot;2024-03-11T12:27:10.271Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:1274217,&quot;user_id&quot;:8101724,&quot;publication_id&quot;:1315074,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:1315074,&quot;name&quot;:&quot;Artificial Intelligence Made Simple&quot;,&quot;subdomain&quot;:&quot;artificialintelligencemadesimple&quot;,&quot;custom_domain&quot;:&quot;www.artificialintelligencemadesimple.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Covering the important ideas in AI from all angles- technical, social, and economic. Read in over 200 countries.  Useful to everyone who wants to learn AI. Critical to anyone trying to see what happens next. Sister Publication to Tech Made Simple.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/77504fa0-0f08-4a38-bbde-becb151d2db8_643x644.png&quot;,&quot;author_id&quot;:8101724,&quot;primary_user_id&quot;:8101724,&quot;theme_var_background_pop&quot;:&quot;#009B50&quot;,&quot;created_at&quot;:&quot;2023-01-14T23:37:24.692Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Devansh&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;magaziney&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:null}},{&quot;id&quot;:109622,&quot;user_id&quot;:8101724,&quot;publication_id&quot;:108704,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:108704,&quot;name&quot;:&quot;Technology Made 
Simple&quot;,&quot;subdomain&quot;:&quot;codinginterviewsmadesimple&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Deep yet digestible insights about Computer Science, Programming Interviews, Software Engineering Careers, Machine Learning, and the Tech Industry for Tech Leaders. Amazing For Coders and Managers. Beneficial to anyone trying to make money in Tech. &quot;,&quot;logo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/8546dc69-af46-4d5d-9a80-b66cb76c833b_644x644.png&quot;,&quot;author_id&quot;:8101724,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#45D800&quot;,&quot;created_at&quot;:&quot;2020-10-07T10:47:41.199Z&quot;,&quot;email_from_name&quot;:&quot;Devansh from Tech Made Simple&quot;,&quot;copyright&quot;:&quot;Devansh&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;magaziney&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:null}},{&quot;id&quot;:5366623,&quot;user_id&quot;:8101724,&quot;publication_id&quot;:5261101,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:5261101,&quot;name&quot;:&quot;What's Happening In Tech&quot;,&quot;subdomain&quot;:&quot;whatishappeningintechnology&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;A Newsletter meant to Help People Keep Up With What's Happening in 
Tech&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff955b89-d08e-4cb7-8add-709e6dc14d8e_1080x1080.jpeg&quot;,&quot;author_id&quot;:8101724,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-06-07T04:30:33.908Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Devansh&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:null}}],&quot;twitter_screen_name&quot;:&quot;Machine01776819&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100,&quot;status&quot;:{&quot;bestsellerTier&quot;:100,&quot;subscriberTier&quot;:1,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:100},&quot;paidPublicationIds&quot;:[1442076,618139,1238074],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://www.artificialintelligencemadesimple.com/p/the-future-of-on-device-ai?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!Pfon!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77504fa0-0f08-4a38-bbde-becb151d2db8_643x644.png" loading="lazy"><span class="embedded-post-publication-name">Artificial Intelligence Made Simple</span></div><div class="embedded-post-title-wrapper"><div 
class="embedded-post-title">The Future of On-Device AI</div></div><div class="embedded-post-body">It takes time to create work that&#8217;s clear, independent, and genuinely useful. If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber. It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">a month ago &#183; 55 likes &#183; 1 comment &#183; Devansh</div></a></div><p>Devansh argues the real bottleneck for on-device AI is memory bandwidth, not compute. Using Liquid AI&#8217;s LFM2 (1.2B parameters, runs on Samsung Galaxy S25) as a case study, the piece shows why shrinking data-center models is the wrong approach. LFM2 uses 10 gated short convolutions plus 6 grouped-query attention blocks, cutting peak cache to 192 MB at 32K tokens (versus Llama 3.2 1B&#8217;s 524 MB). It matches Qwen3-1.7B on benchmarks despite having 42% fewer parameters and runs at 70 tokens per second on a phone CPU. The thesis: the field needs device-native architectures designed from scratch, not miniaturized cloud models.</p><div><hr></div><h2><strong>&#128736; 2 Tools &amp; Repos</strong></h2><h3><strong>1. 
OpenViking: A Context Database for AI Agents</strong></h3><p><strong><a href="https://github.com/volcengine/OpenViking">https://github.com/volcengine/OpenViking</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fK6c!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f7e62-6f07-492b-b5bd-78b50d98fb1b_1200x600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fK6c!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f7e62-6f07-492b-b5bd-78b50d98fb1b_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!fK6c!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f7e62-6f07-492b-b5bd-78b50d98fb1b_1200x600.png 848w, https://substackcdn.com/image/fetch/$s_!fK6c!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f7e62-6f07-492b-b5bd-78b50d98fb1b_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!fK6c!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f7e62-6f07-492b-b5bd-78b50d98fb1b_1200x600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fK6c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f7e62-6f07-492b-b5bd-78b50d98fb1b_1200x600.png" width="1200" height="600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/925f7e62-6f07-492b-b5bd-78b50d98fb1b_1200x600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fK6c!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f7e62-6f07-492b-b5bd-78b50d98fb1b_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!fK6c!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f7e62-6f07-492b-b5bd-78b50d98fb1b_1200x600.png 848w, https://substackcdn.com/image/fetch/$s_!fK6c!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f7e62-6f07-492b-b5bd-78b50d98fb1b_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!fK6c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F925f7e62-6f07-492b-b5bd-78b50d98fb1b_1200x600.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>ByteDance&#8217;s open-source solution for the &#8220;stuff everything into the prompt&#8221; problem. OpenViking organizes agent context (memories, resources, skills) into a navigable filesystem hierarchy with three-tier demand-based loading, so agents only consume tokens for what they actually need. It combines directory-based navigation with semantic search, auto-compresses conversations into long-term memory, and provides visualization of retrieval trajectories for debugging. 19.1K stars, Apache 2.0, supports major LLM providers via LiteLLM.</p><h3><strong>2. 
Cornell Virtual Workshop: GPU Architecture Fundamentals</strong></h3><p><strong><a href="https://cvw.cac.cornell.edu/gpu-architecture/gpu-characteristics/design">https://cvw.cac.cornell.edu/gpu-architecture/gpu-characteristics/design</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DdGt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F778d781e-f5eb-4b9d-956e-e950c6dd0577_1280x1266.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DdGt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F778d781e-f5eb-4b9d-956e-e950c6dd0577_1280x1266.png 424w, https://substackcdn.com/image/fetch/$s_!DdGt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F778d781e-f5eb-4b9d-956e-e950c6dd0577_1280x1266.png 848w, https://substackcdn.com/image/fetch/$s_!DdGt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F778d781e-f5eb-4b9d-956e-e950c6dd0577_1280x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!DdGt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F778d781e-f5eb-4b9d-956e-e950c6dd0577_1280x1266.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DdGt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F778d781e-f5eb-4b9d-956e-e950c6dd0577_1280x1266.png" width="1280" height="1266" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/778d781e-f5eb-4b9d-956e-e950c6dd0577_1280x1266.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1266,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DdGt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F778d781e-f5eb-4b9d-956e-e950c6dd0577_1280x1266.png 424w, https://substackcdn.com/image/fetch/$s_!DdGt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F778d781e-f5eb-4b9d-956e-e950c6dd0577_1280x1266.png 848w, https://substackcdn.com/image/fetch/$s_!DdGt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F778d781e-f5eb-4b9d-956e-e950c6dd0577_1280x1266.png 1272w, https://substackcdn.com/image/fetch/$s_!DdGt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F778d781e-f5eb-4b9d-956e-e950c6dd0577_1280x1266.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Free, NSF-funded course from Cornell&#8217;s Center for Advanced Computing covering why GPUs work the way they do: transistor allocation tradeoffs, memory hierarchies, parallelization design choices, and the practical implications for your code. Modules cover fundamentals through V100 and RTX 5000 deep dives, with exercises. No parallel programming experience assumed. 
If you call <code>.cuda()</code> daily but lack a mental model for what happens underneath, start here.</p><div><hr></div><h2><strong>&#127891; 1 Pick of the Week</strong></h2><h3><strong>Claude Code Cheat Sheet</strong></h3><p><strong><a href="https://cc.storyfox.cz/">https://cc.storyfox.cz/</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!chPh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F071cb070-85d3-4438-986f-c4075fe7958d_1280x1233.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!chPh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F071cb070-85d3-4438-986f-c4075fe7958d_1280x1233.png 424w, https://substackcdn.com/image/fetch/$s_!chPh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F071cb070-85d3-4438-986f-c4075fe7958d_1280x1233.png 848w, https://substackcdn.com/image/fetch/$s_!chPh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F071cb070-85d3-4438-986f-c4075fe7958d_1280x1233.png 1272w, https://substackcdn.com/image/fetch/$s_!chPh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F071cb070-85d3-4438-986f-c4075fe7958d_1280x1233.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!chPh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F071cb070-85d3-4438-986f-c4075fe7958d_1280x1233.png" width="1280" height="1233" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/071cb070-85d3-4438-986f-c4075fe7958d_1280x1233.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1233,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!chPh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F071cb070-85d3-4438-986f-c4075fe7958d_1280x1233.png 424w, https://substackcdn.com/image/fetch/$s_!chPh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F071cb070-85d3-4438-986f-c4075fe7958d_1280x1233.png 848w, https://substackcdn.com/image/fetch/$s_!chPh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F071cb070-85d3-4438-986f-c4075fe7958d_1280x1233.png 1272w, https://substackcdn.com/image/fetch/$s_!chPh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F071cb070-85d3-4438-986f-c4075fe7958d_1280x1233.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>This single-page Claude Code cheat sheet covers 100+ commands across 8 color-coded sections: keyboard shortcuts, ~40 slash commands, MCP server configuration, memory and files, workflows, config, skills and agents, and CLI flags. Commands like <code>/btw</code> for side questions without derailing context, <code>/schedule</code> for cloud-scheduled tasks, and git worktree isolation with sparse checkout are buried in the docs but surfaced here at a glance. Auto-detects Mac versus Windows, prints cleanly to A4, works offline, and updates daily as Claude Code evolves.</p><div><hr></div><p><em>Thanks for reading The Tokenizer! 
If you found this valuable, please share it with your colleagues and consider subscribing to <a href="https://newsletter.artofsaience.com">Gradient Ascent</a> for more AI insights.</em></p>]]></content:encoded></item><item><title><![CDATA[Claude Code Best Practices, Planning in 8 Tokens, and Why Reasoning Models Can't Control Their Own Thoughts - 📚 The Tokenizer Edition #20]]></title><description><![CDATA[This week's most valuable resources]]></description><link>https://newsletter.artofsaience.com/p/claude-code-best-practices-planning</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/claude-code-best-practices-planning</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Wed, 18 Mar 2026 13:03:02 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/EInEmGaMRLc" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! This week&#8217;s theme is the gap between what AI systems <em>can</em> do and what they <em>actually</em> do in practice. Reasoning models that can&#8217;t steer their own chain of thought. RAG systems that work in demos but hallucinate in production. Training clusters that fail in ways no tutorial prepares you for. The good news: attention training just got 1.67x faster, and Figma engineers are showing what a real design-to-code workflow looks like with Claude Code.</p><h3><strong>New here?</strong></h3><p><em>The Tokenizer is my resource-focused newsletter edition where I curate the best AI/ML papers, videos, articles, tools, and learning resources so you don&#8217;t have to sift through the noise. 
Subscribe to <a href="https://newsletter.artofsaience.com">Gradient Ascent</a> for the full experience.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.artofsaience.com/subscribe?"><span>Subscribe now</span></a></p><h2><strong>TL;DR</strong></h2><p>What caught my attention this week:</p><ul><li><p>&#128196; <strong>Papers:</strong> Low-bit attention training, efficient VLMs, scientific discovery shortcuts, robot planning in 8 tokens, and reasoning models that can&#8217;t control their own thoughts</p></li><li><p>&#127909; <strong>Videos:</strong> Sakana AI&#8217;s evolved transformers, Turbopuffer&#8217;s post-RAG retrieval architecture, Figma&#8217;s Claude Code design pipeline, and DeepMind reflects on AlphaGo&#8217;s decade of impact</p></li><li><p>&#128240; <strong>Reads:</strong> Nathan Lambert on why the open model gap will widen (not close), production RAG done right, and diagnosing failures across 192-GPU training clusters</p></li><li><p>&#128736; <strong>Tools:</strong> ByteDance&#8217;s open-source SuperAgent platform and OpenAI&#8217;s AI-powered LaTeX editor</p></li><li><p>&#127891; <strong>Learning:</strong> A community-built field manual for getting the most out of Claude Code</p></li></ul><div><hr></div><h2><strong>&#128196; 5 Papers</strong></h2><h3><strong>1. 
SageBwd: A Trainable Low-bit Attention</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.02170">https://arxiv.org/abs/2603.02170</a></strong> | <strong><a href="https://github.com/thu-ml/SageAttention">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wJND!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e235767-cd2c-4e09-8338-a9a8e98436cf_1440x900.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wJND!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e235767-cd2c-4e09-8338-a9a8e98436cf_1440x900.png 424w, https://substackcdn.com/image/fetch/$s_!wJND!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e235767-cd2c-4e09-8338-a9a8e98436cf_1440x900.png 848w, https://substackcdn.com/image/fetch/$s_!wJND!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e235767-cd2c-4e09-8338-a9a8e98436cf_1440x900.png 1272w, https://substackcdn.com/image/fetch/$s_!wJND!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e235767-cd2c-4e09-8338-a9a8e98436cf_1440x900.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wJND!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e235767-cd2c-4e09-8338-a9a8e98436cf_1440x900.png" width="1440" height="900" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2e235767-cd2c-4e09-8338-a9a8e98436cf_1440x900.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:900,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wJND!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e235767-cd2c-4e09-8338-a9a8e98436cf_1440x900.png 424w, https://substackcdn.com/image/fetch/$s_!wJND!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e235767-cd2c-4e09-8338-a9a8e98436cf_1440x900.png 848w, https://substackcdn.com/image/fetch/$s_!wJND!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e235767-cd2c-4e09-8338-a9a8e98436cf_1440x900.png 1272w, https://substackcdn.com/image/fetch/$s_!wJND!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e235767-cd2c-4e09-8338-a9a8e98436cf_1440x900.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>SageAttention already had 3,200+ GitHub stars for speeding up inference. Now SageBwd extends quantized attention to training by quantizing 6 of 7 attention matrix multiplications in the backward pass. The result: up to 1.67x speedup over FlashAttention2 with negligible loss difference (2.561 vs 2.563 at 260K tokens per step). If you&#8217;re spending money on attention compute during training, this is the paper to read.</p><h3><strong>2. 
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoder</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.06569">https://arxiv.org/abs/2603.06569</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!X0cf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1173f95-3f11-400b-ac4d-91b9b8101447_718x495.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!X0cf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1173f95-3f11-400b-ac4d-91b9b8101447_718x495.png 424w, https://substackcdn.com/image/fetch/$s_!X0cf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1173f95-3f11-400b-ac4d-91b9b8101447_718x495.png 848w, https://substackcdn.com/image/fetch/$s_!X0cf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1173f95-3f11-400b-ac4d-91b9b8101447_718x495.png 1272w, https://substackcdn.com/image/fetch/$s_!X0cf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1173f95-3f11-400b-ac4d-91b9b8101447_718x495.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!X0cf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1173f95-3f11-400b-ac4d-91b9b8101447_718x495.png" width="718" height="495" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d1173f95-3f11-400b-ac4d-91b9b8101447_718x495.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:495,&quot;width&quot;:718,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!X0cf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1173f95-3f11-400b-ac4d-91b9b8101447_718x495.png 424w, https://substackcdn.com/image/fetch/$s_!X0cf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1173f95-3f11-400b-ac4d-91b9b8101447_718x495.png 848w, https://substackcdn.com/image/fetch/$s_!X0cf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1173f95-3f11-400b-ac4d-91b9b8101447_718x495.png 1272w, https://substackcdn.com/image/fetch/$s_!X0cf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1173f95-3f11-400b-ac4d-91b9b8101447_718x495.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>What if you replaced CLIP/SigLIP vision encoders with one initialized from a plain text LLM? Tencent AI Lab tried it. Their 8B model outperforms Qwen3-VL-8B and InternVL3.5-8B on document understanding, visual knowledge, and video reasoning, hitting 96.2 on DocVQA and 90.5 on ChartQA. Vision encoder architecture matters more than you&#8217;d think. Sometimes the answer is just using a language model.</p><h3><strong>3. 
MOOSE-Star: Unlocking Tractable Training for Scientific Discovery</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.03756">https://arxiv.org/abs/2603.03756</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nKQN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b89c289-3db9-4ea3-8482-d8fd773c3a30_996x638.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nKQN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b89c289-3db9-4ea3-8482-d8fd773c3a30_996x638.png 424w, https://substackcdn.com/image/fetch/$s_!nKQN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b89c289-3db9-4ea3-8482-d8fd773c3a30_996x638.png 848w, https://substackcdn.com/image/fetch/$s_!nKQN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b89c289-3db9-4ea3-8482-d8fd773c3a30_996x638.png 1272w, https://substackcdn.com/image/fetch/$s_!nKQN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b89c289-3db9-4ea3-8482-d8fd773c3a30_996x638.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nKQN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b89c289-3db9-4ea3-8482-d8fd773c3a30_996x638.png" width="996" height="638" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9b89c289-3db9-4ea3-8482-d8fd773c3a30_996x638.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:638,&quot;width&quot;:996,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nKQN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b89c289-3db9-4ea3-8482-d8fd773c3a30_996x638.png 424w, https://substackcdn.com/image/fetch/$s_!nKQN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b89c289-3db9-4ea3-8482-d8fd773c3a30_996x638.png 848w, https://substackcdn.com/image/fetch/$s_!nKQN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b89c289-3db9-4ea3-8482-d8fd773c3a30_996x638.png 1272w, https://substackcdn.com/image/fetch/$s_!nKQN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b89c289-3db9-4ea3-8482-d8fd773c3a30_996x638.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Scientific hypothesis discovery with LLMs has an exponential search problem. MOOSE-Star (from MiroMind AI) reduces combinatorial O(N^k) search to roughly logarithmic via hierarchical decomposition, hitting 100% success rate at around 6,000 inference calls where brute-force saturates at 41.3%. The paper also releases TOMATO-Star, a dataset of 108,717 decomposed papers for benchmarking. The complexity reduction alone makes previously intractable hypothesis spaces searchable.</p><h3><strong>4. 
CompACT: A Compact Discrete Tokenizer for Latent World Model Planning</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.05438">https://arxiv.org/abs/2603.05438</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!05B7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2a220db-9f1e-4230-8d5c-8db20d33a655_1040x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!05B7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2a220db-9f1e-4230-8d5c-8db20d33a655_1040x1600.png 424w, https://substackcdn.com/image/fetch/$s_!05B7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2a220db-9f1e-4230-8d5c-8db20d33a655_1040x1600.png 848w, https://substackcdn.com/image/fetch/$s_!05B7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2a220db-9f1e-4230-8d5c-8db20d33a655_1040x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!05B7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2a220db-9f1e-4230-8d5c-8db20d33a655_1040x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!05B7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2a220db-9f1e-4230-8d5c-8db20d33a655_1040x1600.png" width="1040" height="1600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b2a220db-9f1e-4230-8d5c-8db20d33a655_1040x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1040,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!05B7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2a220db-9f1e-4230-8d5c-8db20d33a655_1040x1600.png 424w, https://substackcdn.com/image/fetch/$s_!05B7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2a220db-9f1e-4230-8d5c-8db20d33a655_1040x1600.png 848w, https://substackcdn.com/image/fetch/$s_!05B7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2a220db-9f1e-4230-8d5c-8db20d33a655_1040x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!05B7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2a220db-9f1e-4230-8d5c-8db20d33a655_1040x1600.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Robot planning typically requires hundreds of visual tokens per observation. CompACT (from POSTECH and KAIST, accepted at CVPR 2026) compresses that down to as few as 8 discrete tokens, making world-model planning 40x faster. Navigation planning drops from 178 seconds to under 6 seconds with competitive accuracy. This is what makes real-time robotic planning actually feasible.</p><h3><strong>5. 
Reasoning Models Struggle to Control their Chains of Thought</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.05706">https://arxiv.org/abs/2603.05706</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hKww!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8381e-dd80-4066-bad0-94e82ac9b632_793x239.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hKww!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8381e-dd80-4066-bad0-94e82ac9b632_793x239.png 424w, https://substackcdn.com/image/fetch/$s_!hKww!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8381e-dd80-4066-bad0-94e82ac9b632_793x239.png 848w, https://substackcdn.com/image/fetch/$s_!hKww!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8381e-dd80-4066-bad0-94e82ac9b632_793x239.png 1272w, https://substackcdn.com/image/fetch/$s_!hKww!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8381e-dd80-4066-bad0-94e82ac9b632_793x239.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hKww!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8381e-dd80-4066-bad0-94e82ac9b632_793x239.png" width="793" height="239" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2dc8381e-dd80-4066-bad0-94e82ac9b632_793x239.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:239,&quot;width&quot;:793,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hKww!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8381e-dd80-4066-bad0-94e82ac9b632_793x239.png 424w, https://substackcdn.com/image/fetch/$s_!hKww!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8381e-dd80-4066-bad0-94e82ac9b632_793x239.png 848w, https://substackcdn.com/image/fetch/$s_!hKww!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8381e-dd80-4066-bad0-94e82ac9b632_793x239.png 1272w, https://substackcdn.com/image/fetch/$s_!hKww!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8381e-dd80-4066-bad0-94e82ac9b632_793x239.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Can reasoning models actually control their chain of thought? Researchers from NYU, OpenAI, UCL, and UPenn tested 13 models. Claude Sonnet 4.5 achieves only 2.7% CoT controllability (versus 61.9% output controllability). DeepSeek R1 scores 0.1%. 
The safety implication: if models can&#8217;t steer their reasoning strategically, they also can&#8217;t easily hide deceptive reasoning from monitors. The result matters more for what it implies than for what it measures.</p><div><hr></div><h2><strong>&#127909; 4 Videos</strong></h2><h3><strong>1. Sakana AI&#8217;s Open-Ended Evolution of Transformers with Robert Lange</strong></h3><div id="youtube2-EInEmGaMRLc" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;EInEmGaMRLc&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/EInEmGaMRLc?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>LLMs combined with evolutionary algorithms for open-ended program search. That&#8217;s Shinka Evolve from Sakana AI, discussed by Robert Lange on Machine Learning Street Talk. Why &#8220;solving the wrong problem&#8221; sometimes leads to better architectures. How evolutionary pressure discovers novel transformer variants. What open-endedness means for AI research beyond benchmarks. Covers the gap between optimizing a known objective and discovering objectives worth optimizing.</p><h3><strong>2. 
Retrieval After RAG: Hybrid Search, Agents, and Database Design with Turbopuffer&#8217;s Simon Eskildsen</strong></h3><div id="youtube2-Iu4gEnZFQz8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;Iu4gEnZFQz8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/Iu4gEnZFQz8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>What comes after the first wave of RAG implementations? Hybrid search architectures, why vector-only retrieval hits a ceiling, agent-driven retrieval reshaping database design. Simon Eskildsen (Turbopuffer founder) walks through real case studies from Cursor and Notion on Latent Space. If you&#8217;ve built a RAG system and hit a quality wall, this is where you go next.</p><h3><strong>3. How Figma Engineers Sync Designs with Claude Code and Codex</strong></h3><div id="youtube2-I5X4_mYoiM8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;I5X4_mYoiM8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/I5X4_mYoiM8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>MCP-based tooling that creates a continuous sync between design and code. Figma&#8217;s Gui Seiz and Alex Kern show their team&#8217;s actual production workflow using Claude Code and Codex. Not a concept demo. The design handoff becomes a two-way pipeline. Worth 40 minutes if your team still does screenshot-to-implementation.</p><h3><strong>4. 
10 Years of AlphaGo: The Turning Point for AI</strong></h3><div id="youtube2-qoinGjj60Fo" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;qoinGjj60Fo&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/qoinGjj60Fo?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>AlphaGo&#8217;s techniques propagated into protein structure prediction, materials science, and chip design. Google DeepMind&#8217;s Thore Graepel and Pushmeet Kohli trace the full decade of impact beyond the Go match itself. Covers how one system&#8217;s ideas became foundational building blocks across scientific domains.</p><div><hr></div><h2><strong>&#128240; 3 Curated Reads</strong></h2><h3><strong>1. What comes next with open models</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:190338833,&quot;url&quot;:&quot;https://www.interconnects.ai/p/the-next-phase-of-open-models&quot;,&quot;publication_id&quot;:48206,&quot;publication_name&quot;:&quot;Interconnects AI&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!djof!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc52e8097-8f3d-4f7e-808b-2f4ad37f3b52_720x720.png&quot;,&quot;title&quot;:&quot;What comes next with open models&quot;,&quot;truncated_body_text&quot;:&quot;2025 was the year where a lot of companies started to take open models seriously as a path to influence in the extremely valuable AI ecosystem &#8212; the adoption of a strategy that was massively accelerated downstream of DeepSeek R1&#8217;s breakout success. 
Most of this is being done as a mission of hope, principle, or generosity.&quot;,&quot;date&quot;:&quot;2026-03-16T13:00:51.417Z&quot;,&quot;like_count&quot;:59,&quot;comment_count&quot;:12,&quot;bylines&quot;:[{&quot;id&quot;:10472909,&quot;name&quot;:&quot;Nathan Lambert&quot;,&quot;handle&quot;:&quot;natolambert&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!RihO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fedcdfb-e137-4f6a-9089-a46add6c6242_500x500.jpeg&quot;,&quot;bio&quot;:&quot;ML researcher making sense of AI research, products, and the uncertain technological future. PhD from Berkeley AI. Experience at Meta, DeepMind, HuggingFace.&quot;,&quot;profile_set_up_at&quot;:&quot;2021-04-24T01:19:33.371Z&quot;,&quot;reader_installed_at&quot;:&quot;2022-03-09T17:52:30.690Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:100753,&quot;user_id&quot;:10472909,&quot;publication_id&quot;:48206,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:48206,&quot;name&quot;:&quot;Interconnects AI&quot;,&quot;subdomain&quot;:&quot;robotic&quot;,&quot;custom_domain&quot;:&quot;www.interconnects.ai&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;The cutting edge of AI, from inside the frontier AI labs, minus the hype. The border between high-level and technical thinking. 
Read by leading engineers, researchers, and investors.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c52e8097-8f3d-4f7e-808b-2f4ad37f3b52_720x720.png&quot;,&quot;author_id&quot;:10472909,&quot;primary_user_id&quot;:10472909,&quot;theme_var_background_pop&quot;:&quot;#ff6b00&quot;,&quot;created_at&quot;:&quot;2020-05-21T02:59:47.895Z&quot;,&quot;email_from_name&quot;:&quot;Interconnects by Nathan Lambert&quot;,&quot;copyright&quot;:&quot;Interconnects AI, LLC&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;magaziney&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/858a68f7-2e7e-4dd3-bed1-631b36801ce2_1651x357.png&quot;}},{&quot;id&quot;:4610799,&quot;user_id&quot;:10472909,&quot;publication_id&quot;:4519930,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:4519930,&quot;name&quot;:&quot;natolambert overflow&quot;,&quot;subdomain&quot;:&quot;natolambert&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;a place for any extra thoughts beyond Interconnects.ai&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eb88d599-32c8-49a9-ba33-ab6327aff727_256x256.png&quot;,&quot;author_id&quot;:10472909,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-03-27T15:04:05.448Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Nathan 
Lambert&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:null}},{&quot;id&quot;:4926744,&quot;user_id&quot;:10472909,&quot;publication_id&quot;:4830082,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:4830082,&quot;name&quot;:&quot;Retort AI&quot;,&quot;subdomain&quot;:&quot;retortai&quot;,&quot;custom_domain&quot;:&quot;www.retortai.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Distilling the major events and challenges in the world of artificial intelligence and machine learning, from Thomas Krendl Gilbert and Nathan Lambert.\n\n&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cbad298c-6074-441b-ad43-d5df6dbf101d_800x800.png&quot;,&quot;author_id&quot;:10472909,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-04-25T22:10:28.216Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Nathan 
Lambert&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:null}}],&quot;twitter_screen_name&quot;:&quot;natolambert&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100,&quot;status&quot;:{&quot;bestsellerTier&quot;:100,&quot;subscriberTier&quot;:5,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:100},&quot;paidPublicationIds&quot;:[883883,1084918,6349492,69345,6027,1915042],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://www.interconnects.ai/p/the-next-phase-of-open-models?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!djof!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc52e8097-8f3d-4f7e-808b-2f4ad37f3b52_720x720.png" loading="lazy"><span class="embedded-post-publication-name">Interconnects AI</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">What comes next with open models</div></div><div class="embedded-post-body">2025 was the year where a lot of companies started to take open models seriously as a path to influence in the extremely valuable AI ecosystem &#8212; the adoption of a strategy that was massively accelerated downstream of DeepSeek R1&#8217;s breakout success. 
Most of this is being done as a mission of hope, principle, or generosity&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">a month ago &#183; 59 likes &#183; 12 comments &#183; Nathan Lambert</div></a></div><p>The open-closed model gap will widen, not close. Nathan Lambert&#8217;s reframing: the real opportunity isn&#8217;t chasing frontier capability but building small, specialized models that are 10x faster and 100x cheaper. Introduces the &#8220;open models as sub-agents&#8221; framing where open-weight models handle specialized tasks within larger systems. Changes how you evaluate open models if you&#8217;ve been benchmarking them against GPT-5.</p><h3><strong>2. Production RAG: Learning from Scratch Done Right</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:188595127,&quot;url&quot;:&quot;https://www.decodingai.com/p/production-rag-from-scratch-senior-architect-guide&quot;,&quot;publication_id&quot;:1526003,&quot;publication_name&quot;:&quot;Decoding AI Magazine&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!k2ig!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00bc74e0-3601-49ce-8ab9-4c7b499ce597_1280x1280.png&quot;,&quot;title&quot;:&quot;Why Most RAG Tutorials Fail You&quot;,&quot;truncated_body_text&quot;:&quot;Paul: Today, the stage belongs to Priya, a Senior Software Architect who&#8217;s spent years shipping production-scale systems at Publicis Sapient and 
Tesco.&quot;,&quot;date&quot;:&quot;2026-03-12T12:02:03.170Z&quot;,&quot;like_count&quot;:46,&quot;comment_count&quot;:5,&quot;bylines&quot;:[{&quot;id&quot;:111942976,&quot;name&quot;:&quot;Priya&quot;,&quot;handle&quot;:&quot;pmarwa&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f042f68a-83e3-4f56-8d47-6578f4d4e7ba_664x664.jpeg&quot;,&quot;bio&quot;:&quot;Senior software developer and AI explorer| passionate about building production-ready intelligent systems | avid trekker&quot;,&quot;profile_set_up_at&quot;:&quot;2025-09-25T12:52:56.768Z&quot;,&quot;reader_installed_at&quot;:&quot;2025-09-26T14:52:51.675Z&quot;,&quot;is_guest&quot;:true,&quot;bestseller_tier&quot;:null,&quot;status&quot;:null,&quot;primaryPublicationId&quot;:8297935,&quot;primaryPublicationName&quot;:&quot;Priya&quot;,&quot;primaryPublicationUrl&quot;:&quot;https://pmarwa.substack.com&quot;,&quot;primaryPublicationSubscribeUrl&quot;:&quot;https://pmarwa.substack.com/subscribe?&quot;}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://www.decodingai.com/p/production-rag-from-scratch-senior-architect-guide?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!k2ig!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00bc74e0-3601-49ce-8ab9-4c7b499ce597_1280x1280.png" loading="lazy"><span class="embedded-post-publication-name">Decoding AI Magazine</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Why Most RAG Tutorials Fail You</div></div><div class="embedded-post-body">Paul: Today, the stage belongs to 
Priya, a Senior Software Architect who&#8217;s spent years shipping production-scale systems at Publicis Sapient and Tesco&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">a month ago &#183; 46 likes &#183; 5 comments &#183; Priya</div></a></div><p>Most RAG tutorials optimize for demos, not production. This piece on Paul Iusztin&#8217;s Decoding AI (by guest contributor Priya) walks through a 4-phase production RAG system: ingestion, retrieval, generation, serving. Uses Postgres and pgvector with explicit control flow and data lineage. The core insight: a bad chunk doesn&#8217;t throw an exception, it just hallucinates an answer three steps later. If your RAG prototype works in notebooks but fails in production, start here.</p><h3><strong>3. How to Diagnose Failures in Large AI Training Clusters</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:190797588,&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/how-to-diagnose-failures-in-large&quot;,&quot;publication_id&quot;:1315074,&quot;publication_name&quot;:&quot;Artificial Intelligence Made Simple&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Pfon!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77504fa0-0f08-4a38-bbde-becb151d2db8_643x644.png&quot;,&quot;title&quot;:&quot;How to Diagnose Failures in Large AI Training Clusters&quot;,&quot;truncated_body_text&quot;:&quot;It takes time to create work that&#8217;s clear, independent, and genuinely useful. If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber. 
It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction.&quot;,&quot;date&quot;:&quot;2026-03-13T06:17:43.658Z&quot;,&quot;like_count&quot;:35,&quot;comment_count&quot;:0,&quot;bylines&quot;:[{&quot;id&quot;:8101724,&quot;name&quot;:&quot;Devansh&quot;,&quot;handle&quot;:&quot;chocolatemilkcultleader&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/48081c70-8afa-41e3-a44e-b0f917bc7577_1200x1600.jpeg&quot;,&quot;bio&quot;:&quot;The best meme-maker in Tech. Writer on AI, Software, and the Tech Industry. Currently in NYC Come say hi, I want more friends. &quot;,&quot;profile_set_up_at&quot;:&quot;2021-08-21T20:28:53.612Z&quot;,&quot;reader_installed_at&quot;:&quot;2024-03-11T12:27:10.271Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:1274217,&quot;user_id&quot;:8101724,&quot;publication_id&quot;:1315074,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:1315074,&quot;name&quot;:&quot;Artificial Intelligence Made Simple&quot;,&quot;subdomain&quot;:&quot;artificialintelligencemadesimple&quot;,&quot;custom_domain&quot;:&quot;www.artificialintelligencemadesimple.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Covering the important ideas in AI from all angles- technical, social, and economic. Read in over 200 countries.  Useful to everyone who wants to learn AI. Critical to anyone trying to see what happens next. 
Sister Publication to Tech Made Simple.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/77504fa0-0f08-4a38-bbde-becb151d2db8_643x644.png&quot;,&quot;author_id&quot;:8101724,&quot;primary_user_id&quot;:8101724,&quot;theme_var_background_pop&quot;:&quot;#009B50&quot;,&quot;created_at&quot;:&quot;2023-01-14T23:37:24.692Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Devansh&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;magaziney&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:null}},{&quot;id&quot;:109622,&quot;user_id&quot;:8101724,&quot;publication_id&quot;:108704,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:108704,&quot;name&quot;:&quot;Technology Made Simple&quot;,&quot;subdomain&quot;:&quot;codinginterviewsmadesimple&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Deep yet digestible insights about Computer Science, Programming Interviews, Software Engineering Careers, Machine Learning, and the Tech Industry for Tech Leaders. Amazing For Coders and Managers. Beneficial to anyone trying to make money in Tech. 
&quot;,&quot;logo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/8546dc69-af46-4d5d-9a80-b66cb76c833b_644x644.png&quot;,&quot;author_id&quot;:8101724,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#45D800&quot;,&quot;created_at&quot;:&quot;2020-10-07T10:47:41.199Z&quot;,&quot;email_from_name&quot;:&quot;Devansh from Tech Made Simple&quot;,&quot;copyright&quot;:&quot;Devansh&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;magaziney&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:null}},{&quot;id&quot;:5366623,&quot;user_id&quot;:8101724,&quot;publication_id&quot;:5261101,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:5261101,&quot;name&quot;:&quot;What's Happening In Tech&quot;,&quot;subdomain&quot;:&quot;whatishappeningintechnology&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;A Newsletter meant to Help People Keep Up With What's Happening in 
Tech&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff955b89-d08e-4cb7-8add-709e6dc14d8e_1080x1080.jpeg&quot;,&quot;author_id&quot;:8101724,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-06-07T04:30:33.908Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Devansh&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:null}}],&quot;twitter_screen_name&quot;:&quot;Machine01776819&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100,&quot;status&quot;:{&quot;bestsellerTier&quot;:100,&quot;subscriberTier&quot;:1,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:100},&quot;paidPublicationIds&quot;:[1442076,618139,1238074],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://www.artificialintelligencemadesimple.com/p/how-to-diagnose-failures-in-large?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!Pfon!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77504fa0-0f08-4a38-bbde-becb151d2db8_643x644.png" loading="lazy"><span class="embedded-post-publication-name">Artificial Intelligence Made Simple</span></div><div class="embedded-post-title-wrapper"><div 
class="embedded-post-title">How to Diagnose Failures in Large AI Training Clusters</div></div><div class="embedded-post-body">It takes time to create work that&#8217;s clear, independent, and genuinely useful. If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber. It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">a month ago &#183; 35 likes &#183; Devansh</div></a></div><p>AI agents autonomously executing diagnostic runbooks against a unified Prometheus TSDB. Devansh details five case studies across multi-GPU clusters with quantified results: 30% throughput recovery, checkpoint restore penalties reduced from 1.0% to 0.15%. Not theoretical. Includes the actual diagnostic architecture and failure patterns you&#8217;d hit at this scale.</p><div><hr></div><h2><strong>&#128736; 2 Tools &amp; Repos</strong></h2><h3><strong>1. 
deer-flow (ByteDance)</strong></h3><p><strong><a href="https://github.com/bytedance/deer-flow">https://github.com/bytedance/deer-flow</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SGG1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfaca990-a151-4988-a754-15a8fab5b41c_1600x889.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SGG1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfaca990-a151-4988-a754-15a8fab5b41c_1600x889.png 424w, https://substackcdn.com/image/fetch/$s_!SGG1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfaca990-a151-4988-a754-15a8fab5b41c_1600x889.png 848w, https://substackcdn.com/image/fetch/$s_!SGG1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfaca990-a151-4988-a754-15a8fab5b41c_1600x889.png 1272w, https://substackcdn.com/image/fetch/$s_!SGG1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfaca990-a151-4988-a754-15a8fab5b41c_1600x889.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SGG1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfaca990-a151-4988-a754-15a8fab5b41c_1600x889.png" width="1456" height="809" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dfaca990-a151-4988-a754-15a8fab5b41c_1600x889.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:809,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SGG1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfaca990-a151-4988-a754-15a8fab5b41c_1600x889.png 424w, https://substackcdn.com/image/fetch/$s_!SGG1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfaca990-a151-4988-a754-15a8fab5b41c_1600x889.png 848w, https://substackcdn.com/image/fetch/$s_!SGG1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfaca990-a151-4988-a754-15a8fab5b41c_1600x889.png 1272w, https://substackcdn.com/image/fetch/$s_!SGG1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfaca990-a151-4988-a754-15a8fab5b41c_1600x889.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>ByteDance&#8217;s open-source SuperAgent platform (MIT license, 31K+ stars) got a ground-up v2.0 rewrite that hit #1 on GitHub Trending. Ships as a complete deployable platform, not a framework you wire together. Web UI, Docker-sandboxed execution, persistent cross-session memory, parallel sub-agent spawning, messaging integrations (Telegram, Slack, Feishu). Built on LangGraph. For teams that want a working multi-agent system without assembling one from parts.</p><h3><strong>2. 
Prism (OpenAI)</strong></h3><p><strong><a href="https://prism.openai.com/">https://prism.openai.com/</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2xyq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b75f40-1f06-4f0a-8573-e4f08a8b9bd2_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2xyq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b75f40-1f06-4f0a-8573-e4f08a8b9bd2_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!2xyq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b75f40-1f06-4f0a-8573-e4f08a8b9bd2_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!2xyq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b75f40-1f06-4f0a-8573-e4f08a8b9bd2_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!2xyq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b75f40-1f06-4f0a-8573-e4f08a8b9bd2_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2xyq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b75f40-1f06-4f0a-8573-e4f08a8b9bd2_1280x720.png" width="1280" height="720" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e5b75f40-1f06-4f0a-8573-e4f08a8b9bd2_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2xyq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b75f40-1f06-4f0a-8573-e4f08a8b9bd2_1280x720.png 424w, https://substackcdn.com/image/fetch/$s_!2xyq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b75f40-1f06-4f0a-8573-e4f08a8b9bd2_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!2xyq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b75f40-1f06-4f0a-8573-e4f08a8b9bd2_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!2xyq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b75f40-1f06-4f0a-8573-e4f08a8b9bd2_1280x720.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>OpenAI&#8217;s browser-based LaTeX editor with GPT-5.2 integrated inline. Free tier: unlimited projects, compiles, and collaborators (Pro at $7/mo for unlimited AI features). Highlight text, ask the AI to rewrite or formalize, and it compiles in real time. Zotero integration, image-to-LaTeX, voice mode for dictating equations. Best for refining existing papers. Won&#8217;t generate structure from a blank page. 
No Git integration yet, which is the main gap versus Overleaf.</p><div><hr></div><h2><strong>&#127891; 1 Pick of the Week</strong></h2><h3><strong>claude-code-best-practice</strong></h3><p><strong><a href="https://github.com/shanraisshan/claude-code-best-practice">https://github.com/shanraisshan/claude-code-best-practice</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!N7uY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ee06362-dbc7-4d49-847a-459094b4b799_1186x572.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!N7uY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ee06362-dbc7-4d49-847a-459094b4b799_1186x572.jpeg 424w, https://substackcdn.com/image/fetch/$s_!N7uY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ee06362-dbc7-4d49-847a-459094b4b799_1186x572.jpeg 848w, https://substackcdn.com/image/fetch/$s_!N7uY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ee06362-dbc7-4d49-847a-459094b4b799_1186x572.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!N7uY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ee06362-dbc7-4d49-847a-459094b4b799_1186x572.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!N7uY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ee06362-dbc7-4d49-847a-459094b4b799_1186x572.jpeg" width="1186" height="572" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9ee06362-dbc7-4d49-847a-459094b4b799_1186x572.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:572,&quot;width&quot;:1186,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!N7uY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ee06362-dbc7-4d49-847a-459094b4b799_1186x572.jpeg 424w, https://substackcdn.com/image/fetch/$s_!N7uY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ee06362-dbc7-4d49-847a-459094b4b799_1186x572.jpeg 848w, https://substackcdn.com/image/fetch/$s_!N7uY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ee06362-dbc7-4d49-847a-459094b4b799_1186x572.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!N7uY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ee06362-dbc7-4d49-847a-459094b4b799_1186x572.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>How should you actually use Claude Code day-to-day? Official docs don&#8217;t answer that. This community-built field manual does (17,600+ stars, actively maintained). 40+ actionable tips across 8 categories, comparative reports against other tools, community workflow implementations. Includes a working `.claude/` directory you can clone. The &#8220;billion-dollar questions&#8221; section names what the community still hasn&#8217;t figured out. If you&#8217;re already using Claude Code and want to move from casual to systematic, bookmark this.</p><div><hr></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Gradient Ascent! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p><em>Thanks for reading The Tokenizer! If you found this valuable, please share it with your colleagues and consider subscribing to <a href="https://newsletter.artofsaience.com">Gradient Ascent</a> for more AI insights.</em></p>]]></content:encoded></item><item><title><![CDATA[Karpathy's Autonomous ML Lab, Sleeper Cells in LLMs, and Andrew Ng's Context Hub - 📚 The Tokenizer Edition #19]]></title><description><![CDATA[This week's most valuable AI resources]]></description><link>https://newsletter.artofsaience.com/p/karpathys-autonomous-ml-lab-sleeper</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/karpathys-autonomous-ml-lab-sleeper</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Wed, 11 Mar 2026 12:03:21 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/dHBEQ-Ryo24" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! This week&#8217;s curation spans from AI systems that run overnight experiments autonomously to backdoors hiding inside your favorite tool-using agents. Whether you&#8217;re thinking about building agents, defending them, or just trying to understand what your GPU actually wants from you, there&#8217;s something here.</p><p><em>New here? The Tokenizer is my resource-focused newsletter edition where I curate the best AI/ML papers, videos, articles, tools, and learning resources so you don&#8217;t have to sift through the noise. 
Subscribe to <a href="https://newsletter.artofsaience.com">Gradient Ascent</a> for the full experience.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Gradient Ascent! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h2><strong>TL;DR</strong></h2><p>What caught my attention this week:</p><p>&#8226; &#128196; <strong>Papers:</strong> Smarter reasoning graphs, temporal backdoors in tool-using LLMs, spatial reasoning benchmarks, privacy advantages of diffusion language models, and versatile video editing</p><p>&#8226; &#127909; <strong>Videos:</strong> Hardware constraints shaping LLM architecture, brand-consistent image generation with Midjourney, a complexity taxonomy for AI agents, and a critical look at vibe coding</p><p>&#8226; &#128240; <strong>Reads:</strong> Statistical rigor for LLM evaluations, the infrastructure costs of long-context inference, and the current state of open-weight models catching up to closed frontiers</p><p>&#8226; &#128736; <strong>Tools:</strong> A 101k-star collection of LLM applications and a versioned API documentation system for coding agents</p><p>&#8226; &#127891; <strong>Learning:</strong> Karpathy&#8217;s tool that lets AI agents run autonomous ML experiments overnight</p><div><hr></div><h2><strong>&#128196; 5 Papers</strong></h2><h3><strong>1. 
RouteGoT: Node-Adaptive Routing for Cost-Efficient Graph of Thoughts Reasoning</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.05818">https://arxiv.org/abs/2603.05818</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xF3W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975375f9-601d-4372-ac99-95d28f5a8b9b_997x442.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xF3W!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975375f9-601d-4372-ac99-95d28f5a8b9b_997x442.png 424w, https://substackcdn.com/image/fetch/$s_!xF3W!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975375f9-601d-4372-ac99-95d28f5a8b9b_997x442.png 848w, https://substackcdn.com/image/fetch/$s_!xF3W!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975375f9-601d-4372-ac99-95d28f5a8b9b_997x442.png 1272w, https://substackcdn.com/image/fetch/$s_!xF3W!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975375f9-601d-4372-ac99-95d28f5a8b9b_997x442.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xF3W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975375f9-601d-4372-ac99-95d28f5a8b9b_997x442.png" width="997" height="442" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/975375f9-601d-4372-ac99-95d28f5a8b9b_997x442.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:442,&quot;width&quot;:997,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xF3W!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975375f9-601d-4372-ac99-95d28f5a8b9b_997x442.png 424w, https://substackcdn.com/image/fetch/$s_!xF3W!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975375f9-601d-4372-ac99-95d28f5a8b9b_997x442.png 848w, https://substackcdn.com/image/fetch/$s_!xF3W!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975375f9-601d-4372-ac99-95d28f5a8b9b_997x442.png 1272w, https://substackcdn.com/image/fetch/$s_!xF3W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F975375f9-601d-4372-ac99-95d28f5a8b9b_997x442.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Graph of Thoughts is powerful but expensive, treating every reasoning node equally regardless of difficulty. RouteGoT fixes this by adaptively routing compute across graph nodes, skipping the heavy lifting where it isn&#8217;t needed. The result: 8.1 percentage points more accurate than AGoT while using 79.1% fewer output tokens.</p><h3><strong>2. 
Sleeper Cell: Injecting Latent Malice Temporal Backdoors into Tool-Using LLMs</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.03371">https://arxiv.org/abs/2603.03371</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1Eaj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81cf73bc-a770-4705-b53d-22a802a856b9_996x1673.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1Eaj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81cf73bc-a770-4705-b53d-22a802a856b9_996x1673.png 424w, https://substackcdn.com/image/fetch/$s_!1Eaj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81cf73bc-a770-4705-b53d-22a802a856b9_996x1673.png 848w, https://substackcdn.com/image/fetch/$s_!1Eaj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81cf73bc-a770-4705-b53d-22a802a856b9_996x1673.png 1272w, https://substackcdn.com/image/fetch/$s_!1Eaj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81cf73bc-a770-4705-b53d-22a802a856b9_996x1673.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1Eaj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81cf73bc-a770-4705-b53d-22a802a856b9_996x1673.png" width="996" height="1673" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/81cf73bc-a770-4705-b53d-22a802a856b9_996x1673.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1673,&quot;width&quot;:996,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1Eaj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81cf73bc-a770-4705-b53d-22a802a856b9_996x1673.png 424w, https://substackcdn.com/image/fetch/$s_!1Eaj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81cf73bc-a770-4705-b53d-22a802a856b9_996x1673.png 848w, https://substackcdn.com/image/fetch/$s_!1Eaj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81cf73bc-a770-4705-b53d-22a802a856b9_996x1673.png 1272w, https://substackcdn.com/image/fetch/$s_!1Eaj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81cf73bc-a770-4705-b53d-22a802a856b9_996x1673.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Here&#8217;s a security scenario worth losing sleep over: backdoors implanted in tool-using LLMs that only activate under specific temporal conditions. The model maintains state-of-the-art performance on benign tasks and evades standard safety evaluations, until a particular time trigger flips the switch. A sobering look at the gap between current evaluation practices and actual deployment safety.</p><h3><strong>3. 
SpatialText: A Pure-Text Cognitive Benchmark for Spatial Understanding in LLMs</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.03002">https://arxiv.org/abs/2603.03002</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!I8jf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e76a38a-c22c-491c-84e3-0fc2d485e705_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!I8jf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e76a38a-c22c-491c-84e3-0fc2d485e705_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!I8jf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e76a38a-c22c-491c-84e3-0fc2d485e705_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!I8jf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e76a38a-c22c-491c-84e3-0fc2d485e705_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!I8jf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e76a38a-c22c-491c-84e3-0fc2d485e705_1408x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!I8jf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e76a38a-c22c-491c-84e3-0fc2d485e705_1408x768.jpeg" width="1408" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3e76a38a-c22c-491c-84e3-0fc2d485e705_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!I8jf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e76a38a-c22c-491c-84e3-0fc2d485e705_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!I8jf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e76a38a-c22c-491c-84e3-0fc2d485e705_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!I8jf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e76a38a-c22c-491c-84e3-0fc2d485e705_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!I8jf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e76a38a-c22c-491c-84e3-0fc2d485e705_1408x768.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Do LLMs actually understand spatial relationships, or are they just pattern-matching on word co-occurrences? SpatialText isolates spatial reasoning from visual shortcuts using pure text, and the results aren&#8217;t flattering. Current models lean heavily on linguistic heuristics rather than building coherent spatial representations, which matters for anyone building systems that need to reason about physical space.</p><h3><strong>4. 
Characterizing Memorization in Diffusion Language Models</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.02333">https://arxiv.org/abs/2603.02333</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!94_s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde1f2a3c-ad26-47c3-858e-5838bbb6407c_927x491.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!94_s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde1f2a3c-ad26-47c3-858e-5838bbb6407c_927x491.png 424w, https://substackcdn.com/image/fetch/$s_!94_s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde1f2a3c-ad26-47c3-858e-5838bbb6407c_927x491.png 848w, https://substackcdn.com/image/fetch/$s_!94_s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde1f2a3c-ad26-47c3-858e-5838bbb6407c_927x491.png 1272w, https://substackcdn.com/image/fetch/$s_!94_s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde1f2a3c-ad26-47c3-858e-5838bbb6407c_927x491.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!94_s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde1f2a3c-ad26-47c3-858e-5838bbb6407c_927x491.png" width="927" height="491" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/de1f2a3c-ad26-47c3-858e-5838bbb6407c_927x491.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:491,&quot;width&quot;:927,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!94_s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde1f2a3c-ad26-47c3-858e-5838bbb6407c_927x491.png 424w, https://substackcdn.com/image/fetch/$s_!94_s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde1f2a3c-ad26-47c3-858e-5838bbb6407c_927x491.png 848w, https://substackcdn.com/image/fetch/$s_!94_s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde1f2a3c-ad26-47c3-858e-5838bbb6407c_927x491.png 1272w, https://substackcdn.com/image/fetch/$s_!94_s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde1f2a3c-ad26-47c3-858e-5838bbb6407c_927x491.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Diffusion-based language models turn out to have a meaningful privacy advantage over autoregressive ones. This paper shows they exhibit substantially lower memorization-based leakage of personally identifiable information. If you&#8217;re building generative systems where training data sensitivity matters (medical, legal, financial), this distinction between generation architectures is worth understanding.</p><h3><strong>5. 
Kiwi-Edit: Versatile Video Editing via Instruction and Reference Guidance</strong></h3><p><strong><a href="https://arxiv.org/abs/2603.02175">https://arxiv.org/abs/2603.02175</a></strong> | <strong><a href="https://github.com/showlab/Kiwi-Edit">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!88RY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafd6be80-e49b-4282-8bca-11a2309ae591_793x477.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!88RY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafd6be80-e49b-4282-8bca-11a2309ae591_793x477.png 424w, https://substackcdn.com/image/fetch/$s_!88RY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafd6be80-e49b-4282-8bca-11a2309ae591_793x477.png 848w, https://substackcdn.com/image/fetch/$s_!88RY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafd6be80-e49b-4282-8bca-11a2309ae591_793x477.png 1272w, https://substackcdn.com/image/fetch/$s_!88RY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafd6be80-e49b-4282-8bca-11a2309ae591_793x477.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!88RY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafd6be80-e49b-4282-8bca-11a2309ae591_793x477.png" width="793" height="477" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/afd6be80-e49b-4282-8bca-11a2309ae591_793x477.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:477,&quot;width&quot;:793,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!88RY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafd6be80-e49b-4282-8bca-11a2309ae591_793x477.png 424w, https://substackcdn.com/image/fetch/$s_!88RY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafd6be80-e49b-4282-8bca-11a2309ae591_793x477.png 848w, https://substackcdn.com/image/fetch/$s_!88RY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafd6be80-e49b-4282-8bca-11a2309ae591_793x477.png 1272w, https://substackcdn.com/image/fetch/$s_!88RY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafd6be80-e49b-4282-8bca-11a2309ae591_793x477.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Video editing that combines instruction-based and reference-based approaches in one pipeline. Kiwi-Edit constructs high-fidelity training data using synthetic reference scaffolds, sidestepping the usual bottleneck of paired video editing datasets. The result is a system that handles both &#8220;make the sky purple&#8221; style instructions and &#8220;make it look like this reference&#8221; editing in a single model.</p><div><hr></div><h2><strong>&#127909; 4 Videos</strong></h2><h3><strong>1. 
How Hardware Constraints Are Shaping Modern LLM Architecture</strong></h3><div id="youtube2-BSzhrZOp2x8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;BSzhrZOp2x8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/BSzhrZOp2x8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Julia Turc explores how physical hardware realities (memory bandwidth, compute density, interconnect speeds) are actively driving architectural decisions in modern LLMs. If you&#8217;ve wondered why certain design choices keep showing up across different labs, the answer often starts with what the silicon can actually do efficiently.</p><h3><strong>2. Mastering Midjourney: Consistent Brand Imagery Without Complex Prompts</strong></h3><div id="youtube2-2RD3FP5iWJY" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;2RD3FP5iWJY&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/2RD3FP5iWJY?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>On How I AI, AI creative director Jamey Gannon walks through her workflow for generating images that maintain consistent brand identity across multiple Midjourney outputs using style references, personalization codes, and mood boards. Useful for anyone who&#8217;s gotten great individual images from AI tools but struggled to make a cohesive visual set for a brand, product, or campaign.</p><h3><strong>3. 
The 5 Levels of AI Agent Complexity</strong></h3><div id="youtube2-BaXTos7B1vY" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;BaXTos7B1vY&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/BaXTos7B1vY?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Dave Ebbelaar breaks down AI agents into five distinct complexity levels, from simple single-tool agents to sophisticated multi-agent orchestration systems. Helpful framing for teams trying to scope what kind of agent they actually need (often simpler than they think) and understanding the engineering effort each level demands.</p><h3><strong>4. Vibe Coding is a Slot Machine</strong></h3><div id="youtube2-dHBEQ-Ryo24" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;dHBEQ-Ryo24&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/dHBEQ-Ryo24?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Machine Learning Street Talk sits down with Jeremy Howard (fast.ai) to examine whether AI coding assistants are genuinely improving developer productivity or creating a false sense of progress. Howard&#8217;s central argument: if you outsource all your thinking to computers, you stop building the competence that makes you effective. 
The kind of critical examination teams need before making investment decisions around AI-assisted development tooling.</p><div><hr></div><h2><strong>&#128240; 3 Curated Reads</strong></h2><h3><strong>1. Applying Statistics to LLM Evaluations</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:188458832,&quot;url&quot;:&quot;https://cameronrwolfe.substack.com/p/stats-llm-evals&quot;,&quot;publication_id&quot;:1092659,&quot;publication_name&quot;:&quot;Deep (Learning) Focus&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!87xa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9b43fb-52d5-40da-995d-5b7cd3f91064_896x896.png&quot;,&quot;title&quot;:&quot;Applying Statistics to LLM Evaluations&quot;,&quot;truncated_body_text&quot;:&quot;Research on large language models (LLMs) is empirically driven. For this reason, model evaluations play a pivotal role in the field&#8217;s progress. We improve models by making changes, evaluating them, and iterating. Despite their foundational role, however, evaluations are usually handled in a naive manner. In most cases, we just test a mod&#8230;&quot;,&quot;date&quot;:&quot;2026-03-09T09:33:37.821Z&quot;,&quot;like_count&quot;:66,&quot;comment_count&quot;:2,&quot;bylines&quot;:[{&quot;id&quot;:29736521,&quot;name&quot;:&quot;Cameron R. Wolfe, Ph.D.&quot;,&quot;handle&quot;:&quot;cwolferesearch&quot;,&quot;previous_name&quot;:&quot;Cameron R. 
Wolfe&quot;,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/69aba7df-b571-4609-aa47-fc2d031c11b8_1242x1595.jpeg&quot;,&quot;bio&quot;:&quot;Research @ Netflix &#8226; Rice University PhD &#8226; I make AI understandable&quot;,&quot;profile_set_up_at&quot;:&quot;2022-09-17T15:11:34.083Z&quot;,&quot;reader_installed_at&quot;:&quot;2023-01-10T11:25:00.723Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:1042380,&quot;user_id&quot;:29736521,&quot;publication_id&quot;:1092659,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:1092659,&quot;name&quot;:&quot;Deep (Learning) Focus&quot;,&quot;subdomain&quot;:&quot;cameronrwolfe&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;I contextualize and explain important topics in AI research.&quot;,&quot;logo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/ab9b43fb-52d5-40da-995d-5b7cd3f91064_896x896.png&quot;,&quot;author_id&quot;:29736521,&quot;primary_user_id&quot;:29736521,&quot;theme_var_background_pop&quot;:&quot;#6C0095&quot;,&quot;created_at&quot;:&quot;2022-09-17T15:12:33.160Z&quot;,&quot;email_from_name&quot;:&quot;Deep (Learning) Focus&quot;,&quot;copyright&quot;:&quot;Cameron R. 
Wolfe&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;twitter_screen_name&quot;:&quot;cwolferesearch&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100,&quot;status&quot;:{&quot;bestsellerTier&quot;:100,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:100},&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://cameronrwolfe.substack.com/p/stats-llm-evals?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!87xa!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9b43fb-52d5-40da-995d-5b7cd3f91064_896x896.png" loading="lazy"><span class="embedded-post-publication-name">Deep (Learning) Focus</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Applying Statistics to LLM Evaluations</div></div><div class="embedded-post-body">Research on large language models (LLMs) is empirically driven. For this reason, model evaluations play a pivotal role in the field&#8217;s progress. We improve models by making changes, evaluating them, and iterating. Despite their foundational role, however, evaluations are usually handled in a naive manner. 
In most cases, we just test a mod&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">2 months ago &#183; 66 likes &#183; 2 comments &#183; Cameron R. Wolfe, Ph.D.</div></a></div><p>Cameron R. Wolfe from Deep (Learning) Focus tackles a problem most benchmarking papers quietly ignore: the statistical validity of LLM evaluation results. Key finding worth internalizing: clustered standard errors can increase reported uncertainty by 3x, and the Central Limit Theorem becomes unreliable with small sample sizes. If you&#8217;ve ever looked at a leaderboard and wondered &#8220;is this difference actually meaningful?&#8221;, this article gives you the tools to answer that.</p><h3><strong>2. How Long Context Inference Is Rewriting the Future of Transformers</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:187061028,&quot;url&quot;:&quot;https://www.artificialintelligencemadesimple.com/p/how-long-context-inference-is-rewriting&quot;,&quot;publication_id&quot;:1315074,&quot;publication_name&quot;:&quot;Artificial Intelligence Made Simple&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Pfon!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77504fa0-0f08-4a38-bbde-becb151d2db8_643x644.png&quot;,&quot;title&quot;:&quot;How Long Context Inference Is Rewriting the Future of Transformers&quot;,&quot;truncated_body_text&quot;:&quot;It takes time to create work that&#8217;s clear, independent, and genuinely useful. If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber. 
It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction.&quot;,&quot;date&quot;:&quot;2026-03-08T22:21:32.660Z&quot;,&quot;like_count&quot;:56,&quot;comment_count&quot;:2,&quot;bylines&quot;:[{&quot;id&quot;:8101724,&quot;name&quot;:&quot;Devansh&quot;,&quot;handle&quot;:&quot;chocolatemilkcultleader&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/48081c70-8afa-41e3-a44e-b0f917bc7577_1200x1600.jpeg&quot;,&quot;bio&quot;:&quot;The best meme-maker in Tech. Writer on AI, Software, and the Tech Industry. Currently in NYC Come say hi, I want more friends. &quot;,&quot;profile_set_up_at&quot;:&quot;2021-08-21T20:28:53.612Z&quot;,&quot;reader_installed_at&quot;:&quot;2024-03-11T12:27:10.271Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:1274217,&quot;user_id&quot;:8101724,&quot;publication_id&quot;:1315074,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:1315074,&quot;name&quot;:&quot;Artificial Intelligence Made Simple&quot;,&quot;subdomain&quot;:&quot;artificialintelligencemadesimple&quot;,&quot;custom_domain&quot;:&quot;www.artificialintelligencemadesimple.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Covering the important ideas in AI from all angles- technical, social, and economic. Read in over 200 countries.  Useful to everyone who wants to learn AI. Critical to anyone trying to see what happens next. 
Sister Publication to Tech Made Simple.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/77504fa0-0f08-4a38-bbde-becb151d2db8_643x644.png&quot;,&quot;author_id&quot;:8101724,&quot;primary_user_id&quot;:8101724,&quot;theme_var_background_pop&quot;:&quot;#009B50&quot;,&quot;created_at&quot;:&quot;2023-01-14T23:37:24.692Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Devansh&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;magaziney&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:109622,&quot;user_id&quot;:8101724,&quot;publication_id&quot;:108704,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:108704,&quot;name&quot;:&quot;Technology Made Simple&quot;,&quot;subdomain&quot;:&quot;codinginterviewsmadesimple&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Deep yet digestible insights about Computer Science, Programming Interviews, Software Engineering Careers, Machine Learning, and the Tech Industry for Tech Leaders. Amazing For Coders and Managers. Beneficial to anyone trying to make money in Tech. 
&quot;,&quot;logo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/8546dc69-af46-4d5d-9a80-b66cb76c833b_644x644.png&quot;,&quot;author_id&quot;:8101724,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#45D800&quot;,&quot;created_at&quot;:&quot;2020-10-07T10:47:41.199Z&quot;,&quot;email_from_name&quot;:&quot;Devansh from Tech Made Simple&quot;,&quot;copyright&quot;:&quot;Devansh&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;magaziney&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:5366623,&quot;user_id&quot;:8101724,&quot;publication_id&quot;:5261101,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:5261101,&quot;name&quot;:&quot;What's Happening In Tech&quot;,&quot;subdomain&quot;:&quot;whatishappeningintechnology&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;A Newsletter meant to Help People Keep Up With What's Happening in 
Tech&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff955b89-d08e-4cb7-8add-709e6dc14d8e_1080x1080.jpeg&quot;,&quot;author_id&quot;:8101724,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-06-07T04:30:33.908Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Devansh&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;twitter_screen_name&quot;:&quot;Machine01776819&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100,&quot;status&quot;:{&quot;bestsellerTier&quot;:100,&quot;subscriberTier&quot;:1,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:100},&quot;paidPublicationIds&quot;:[1442076,618139,1238074],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://www.artificialintelligencemadesimple.com/p/how-long-context-inference-is-rewriting?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!Pfon!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77504fa0-0f08-4a38-bbde-becb151d2db8_643x644.png" loading="lazy"><span class="embedded-post-publication-name">Artificial Intelligence Made Simple</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">How Long Context 
Inference Is Rewriting the Future of Transformers</div></div><div class="embedded-post-body">It takes time to create work that&#8217;s clear, independent, and genuinely useful. If you&#8217;ve found value in this newsletter, consider becoming a paid subscriber. It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">2 months ago &#183; 56 likes &#183; 2 comments &#183; Devansh</div></a></div><p>AI Made Simple quantifies what long-context windows actually cost in production. A 70B-parameter model serving 59 concurrent users at 4K context drops to just 1 user at 128K context. The article covers the engineering responses, including Multi-Head Latent Attention (MLA) achieving a 93.3% KV-cache reduction. Practical reading for anyone deploying models where context length isn&#8217;t just a spec sheet number but a capacity planning constraint.</p><h3><strong>3. 
A Dream of Spring for Open-Weight LLMs</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:189051354,&quot;url&quot;:&quot;https://magazine.sebastianraschka.com/p/a-dream-of-spring-for-open-weight&quot;,&quot;publication_id&quot;:1174659,&quot;publication_name&quot;:&quot;Ahead of AI&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!96vs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f25d0a-212b-4853-8bcb-128d0a3edbbf_1196x1196.png&quot;,&quot;title&quot;:&quot;A Dream of Spring for Open-Weight LLMs: 10 Architectures from Jan-Feb 2026&quot;,&quot;truncated_body_text&quot;:&quot;If you have struggled a bit to keep up with open-weight model releases this month, this article should catch you up on the main themes.&quot;,&quot;date&quot;:&quot;2026-02-25T13:26:56.028Z&quot;,&quot;like_count&quot;:183,&quot;comment_count&quot;:9,&quot;bylines&quot;:[{&quot;id&quot;:27393275,&quot;name&quot;:&quot;Sebastian Raschka, PhD&quot;,&quot;handle&quot;:&quot;rasbt&quot;,&quot;previous_name&quot;:&quot;Sebastian Raschka&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F61f4c017-506f-4e9b-a24f-76340dad0309_800x800.jpeg&quot;,&quot;bio&quot;:&quot;I'm an LLM research engineer 10+ years of experience in artificial intelligence. My expertise lies in AI &amp; LLM research focusing on code-driven implementations. 
I am also the author of \&quot;Build a Large Language Model From Scratch\&quot; (amzn.to/4fqvn0D).&quot;,&quot;profile_set_up_at&quot;:&quot;2022-10-09T16:19:59.744Z&quot;,&quot;reader_installed_at&quot;:&quot;2022-11-07T19:56:32.129Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:1127862,&quot;user_id&quot;:27393275,&quot;publication_id&quot;:1174659,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:1174659,&quot;name&quot;:&quot;Ahead of AI&quot;,&quot;subdomain&quot;:&quot;sebastianraschka&quot;,&quot;custom_domain&quot;:&quot;magazine.sebastianraschka.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Ahead of AI focuses on machine learning and AI research and is read by more than 150,000 researchers and practitioners who want to stay ahead in a rapidly evolving field.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/49f25d0a-212b-4853-8bcb-128d0a3edbbf_1196x1196.png&quot;,&quot;author_id&quot;:27393275,&quot;primary_user_id&quot;:27393275,&quot;theme_var_background_pop&quot;:&quot;#2096FF&quot;,&quot;created_at&quot;:&quot;2022-11-04T18:30:05.218Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Raschka AI Research (RAIR) Lab LLC&quot;,&quot;founding_plan_name&quot;:&quot;Founding 
plan&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;twitter_screen_name&quot;:&quot;rasbt&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:1000,&quot;status&quot;:{&quot;bestsellerTier&quot;:1000,&quot;subscriberTier&quot;:1,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:1000},&quot;paidPublicationIds&quot;:[1783977,9873],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://magazine.sebastianraschka.com/p/a-dream-of-spring-for-open-weight?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!96vs!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49f25d0a-212b-4853-8bcb-128d0a3edbbf_1196x1196.png" loading="lazy"><span class="embedded-post-publication-name">Ahead of AI</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">A Dream of Spring for Open-Weight LLMs: 10 Architectures from Jan-Feb 2026</div></div><div class="embedded-post-body">If you have struggled a bit to keep up with open-weight model releases this month, this article should catch you up on the main themes&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">2 months ago &#183; 183 likes &#183; 9 comments &#183; Sebastian Raschka, PhD</div></a></div><p>Sebastian Raschka 
surveys the current open-weight landscape across 10 models and finds the gap to closed frontier models narrowing fast. GLM-5 now benchmarks on par with GPT-5.2, Gemini Pro 3, and Claude 4.6 Opus. On the inference side, Step 3.5 Flash hits 100 tokens/sec compared to DeepSeek&#8217;s 33. A well-structured overview for tracking which open models are actually competitive and where the remaining gaps lie.</p><div><hr></div><h2><strong>&#128736; 2 Tools &amp; Repos</strong></h2><h3><strong>1. Awesome LLM Apps</strong></h3><p><strong><a href="https://github.com/Shubhamsaboo/awesome-llm-apps">https://github.com/Shubhamsaboo/awesome-llm-apps</a></strong></p><p>At 101k stars, this collection of LLM application examples covers RAG implementations, AI agents (single and multi-agent teams), MCP integration patterns, voice AI, and fine-tuning guides. More useful as a reference architecture library than a tutorial. When you&#8217;re building something new, check here first to see how others have solved similar problems.</p><h3><strong>2. 
Context Hub</strong></h3><p><strong><a href="https://github.com/andrewyng/context-hub">https://github.com/andrewyng/context-hub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HK10!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0edf3dff-1b3b-487f-9c18-0af4cc8020a8_1200x600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HK10!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0edf3dff-1b3b-487f-9c18-0af4cc8020a8_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!HK10!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0edf3dff-1b3b-487f-9c18-0af4cc8020a8_1200x600.png 848w, https://substackcdn.com/image/fetch/$s_!HK10!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0edf3dff-1b3b-487f-9c18-0af4cc8020a8_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!HK10!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0edf3dff-1b3b-487f-9c18-0af4cc8020a8_1200x600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HK10!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0edf3dff-1b3b-487f-9c18-0af4cc8020a8_1200x600.png" width="1200" height="600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0edf3dff-1b3b-487f-9c18-0af4cc8020a8_1200x600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HK10!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0edf3dff-1b3b-487f-9c18-0af4cc8020a8_1200x600.png 424w, https://substackcdn.com/image/fetch/$s_!HK10!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0edf3dff-1b3b-487f-9c18-0af4cc8020a8_1200x600.png 848w, https://substackcdn.com/image/fetch/$s_!HK10!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0edf3dff-1b3b-487f-9c18-0af4cc8020a8_1200x600.png 1272w, https://substackcdn.com/image/fetch/$s_!HK10!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0edf3dff-1b3b-487f-9c18-0af4cc8020a8_1200x600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>From Andrew Ng&#8217;s team, Context Hub gives coding agents access to curated, versioned API documentation instead of letting them hallucinate library APIs. Features search and fetch, language-specific variants, persistent annotations, and feedback loops so the documentation improves over time. 
At 3.5k stars, it&#8217;s gaining traction with teams building custom coding agents that need reliable API knowledge.</p><h2><strong>&#127891; 1 Pick of the Week</strong></h2><h3><strong>Autoresearch</strong></h3><p><strong><a href="https://github.com/karpathy/autoresearch">https://github.com/karpathy/autoresearch</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!roo5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe05fecdb-2568-429f-affe-9d4f2c08299c_2048x1015.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!roo5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe05fecdb-2568-429f-affe-9d4f2c08299c_2048x1015.png 424w, https://substackcdn.com/image/fetch/$s_!roo5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe05fecdb-2568-429f-affe-9d4f2c08299c_2048x1015.png 848w, https://substackcdn.com/image/fetch/$s_!roo5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe05fecdb-2568-429f-affe-9d4f2c08299c_2048x1015.png 1272w, https://substackcdn.com/image/fetch/$s_!roo5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe05fecdb-2568-429f-affe-9d4f2c08299c_2048x1015.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!roo5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe05fecdb-2568-429f-affe-9d4f2c08299c_2048x1015.png" width="1456" height="722" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e05fecdb-2568-429f-affe-9d4f2c08299c_2048x1015.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:722,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!roo5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe05fecdb-2568-429f-affe-9d4f2c08299c_2048x1015.png 424w, https://substackcdn.com/image/fetch/$s_!roo5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe05fecdb-2568-429f-affe-9d4f2c08299c_2048x1015.png 848w, https://substackcdn.com/image/fetch/$s_!roo5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe05fecdb-2568-429f-affe-9d4f2c08299c_2048x1015.png 1272w, https://substackcdn.com/image/fetch/$s_!roo5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe05fecdb-2568-429f-affe-9d4f2c08299c_2048x1015.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Andrej Karpathy&#8217;s latest side project is a Python tool built around three core files (prepare.py, train.py, program.md) that lets an AI agent run autonomous ML experiments on a single GPU. The agent proposes hypotheses, writes training code, runs experiments with a 5-minute budget each, and iterates based on results. Leave it running overnight and wake up to a stack of completed experiments. At 23.6k stars within days of release, it&#8217;s clearly struck a nerve. Sparks of recursive self-improvement, indeed.</p><div><hr></div><p><em>Thanks for reading The Tokenizer! 
If you found this valuable, please share it with your colleagues and consider subscribing to <a href="https://newsletter.artofsaience.com">Gradient Ascent</a> for more AI insights.</em></p>]]></content:encoded></item><item><title><![CDATA[Karpathy's microGPT, Jeff Dean's Pareto Frontier, and the LLM Course - 📚 The Tokenizer Edition #18]]></title><description><![CDATA[This week's most valuable AI resources]]></description><link>https://newsletter.artofsaience.com/p/karpathys-microgpt-jeff-deans-pareto</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/karpathys-microgpt-jeff-deans-pareto</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Thu, 19 Feb 2026 13:34:15 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/F_1oDPWxpFQ" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! While agents crush bug fixes at 74%, they stumble to 11% on actual feature development. Turns out building real features is fundamentally harder than patching code. 
Meanwhile, passing a task to an agent and actually delegating authority to one turn out to be completely different problems, and Karpathy just distilled an entire GPT into 243 lines of dependency-free Python. These are structural shifts in how we approach autonomy, development, and education.</p><h3><strong>New here?</strong></h3><p><em>The Tokenizer is my resource-focused newsletter edition where I curate the best papers, videos, articles, tools, and learning resources from across the AI landscape. Consider it your weekly dose of everything you need to stay ahead in machine learning.</em></p><div><hr></div><p>I&#8217;m teaching ML &amp; Generative AI System Design on Feb 28th / March 1st with Packt.</p><p>We&#8217;ll cover AI systems that use RAG and traditional ML design principles for building solid AI products: making systems reliable, measuring what matters, and designing architectures that work in production.</p><p>Through live discussions, guided exercises, and design sprints, you&#8217;ll practice solving system-level AI problems and walk away with frameworks you can apply immediately at work.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=SairamNewsletter" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mZoj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc7afe2-6630-48da-b4d4-8ff8993bf5d1_1280x640.png 424w, https://substackcdn.com/image/fetch/$s_!mZoj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc7afe2-6630-48da-b4d4-8ff8993bf5d1_1280x640.png 848w, 
https://substackcdn.com/image/fetch/$s_!mZoj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc7afe2-6630-48da-b4d4-8ff8993bf5d1_1280x640.png 1272w, https://substackcdn.com/image/fetch/$s_!mZoj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc7afe2-6630-48da-b4d4-8ff8993bf5d1_1280x640.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mZoj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc7afe2-6630-48da-b4d4-8ff8993bf5d1_1280x640.png" width="1280" height="640" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4bc7afe2-6630-48da-b4d4-8ff8993bf5d1_1280x640.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:640,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=SairamNewsletter&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mZoj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc7afe2-6630-48da-b4d4-8ff8993bf5d1_1280x640.png 424w, https://substackcdn.com/image/fetch/$s_!mZoj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc7afe2-6630-48da-b4d4-8ff8993bf5d1_1280x640.png 848w, 
https://substackcdn.com/image/fetch/$s_!mZoj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc7afe2-6630-48da-b4d4-8ff8993bf5d1_1280x640.png 1272w, https://substackcdn.com/image/fetch/$s_!mZoj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4bc7afe2-6630-48da-b4d4-8ff8993bf5d1_1280x640.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Use code <strong>FLASH40</strong> for 40% off: <a href="https://lnkd.in/gqTrvsuS">https://lnkd.in/gqTrvsuS</a></p><p class="button-wrapper" 
data-attrs="{&quot;url&quot;:&quot;https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=SairamNewsletter&quot;,&quot;text&quot;:&quot;Learn AI System Design&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=SairamNewsletter"><span>Learn AI System Design</span></a></p><p>What topics/problems would you most want covered in a system design workshop? Drop a comment or DM me.</p><div><hr></div><h2><strong>TL;DR</strong></h2><p><strong>What caught my attention this week:</strong></p><p>&#8226; &#128196; <strong>Papers:</strong> Task handoff is not the same as authority transfer, video models tested on physics not aesthetics, memory that evolves without overfitting, gated reasoning that knows when to stop, and feature-level coding benchmarks exposing real agent limitations</p><p>&#8226; &#127909; <strong>Videos:</strong> Jeff Dean on AI&#8217;s Pareto frontier, comprehensive 2026 state of AI breakdown, building custom dev tools instead of buying SaaS, and practical context engineering for agents</p><p>&#8226; &#128240; <strong>Reads:</strong> Rubric-based rewards for subjective domains, recursive language models that call themselves like functions, and why taste matters for generative AI</p><p>&#8226; &#128736; <strong>Tools:</strong> Reasoning implementations from scratch, comprehensive LLM learning roadmap</p><p>&#8226; &#127891; <strong>Learning:</strong> Karpathy&#8217;s microGPT strips GPT to its mathematical essence in pure Python</p><div><hr></div><h2><strong>&#128196; 5 Papers</strong></h2><h3><strong>FeatureBench: Benchmarking Agentic Coding for Complex Feature Development</strong></h3><p><strong><a href="https://arxiv.org/abs/2602.10975">https://arxiv.org/abs/2602.10975</a></strong> | <strong><a 
href="https://github.com/LiberCoders/FeatureBench">GitHub</a></strong></p><p>Claude 4.5 Opus achieves 74.4% on SWE-bench but drops to 11.0% on FeatureBench. The difference? SWE-bench tests bug fixes within single pull requests, while FeatureBench evaluates end-to-end feature development spanning multiple commits and PRs across development timelines. Using a test-driven method that traces from unit tests along dependency graphs, the benchmark automatically derives 200 feature-level tasks from 24 repositories while ensuring other features remain functional after separation. This exposes the gap between fixing localized issues and actually building new capabilities.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IlUK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8151fee6-abd2-408d-9c46-4de37a28abe3_1328x881.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IlUK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8151fee6-abd2-408d-9c46-4de37a28abe3_1328x881.png 424w, https://substackcdn.com/image/fetch/$s_!IlUK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8151fee6-abd2-408d-9c46-4de37a28abe3_1328x881.png 848w, https://substackcdn.com/image/fetch/$s_!IlUK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8151fee6-abd2-408d-9c46-4de37a28abe3_1328x881.png 1272w, 
https://substackcdn.com/image/fetch/$s_!IlUK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8151fee6-abd2-408d-9c46-4de37a28abe3_1328x881.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IlUK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8151fee6-abd2-408d-9c46-4de37a28abe3_1328x881.png" width="1328" height="881" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8151fee6-abd2-408d-9c46-4de37a28abe3_1328x881.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:881,&quot;width&quot;:1328,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IlUK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8151fee6-abd2-408d-9c46-4de37a28abe3_1328x881.png 424w, https://substackcdn.com/image/fetch/$s_!IlUK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8151fee6-abd2-408d-9c46-4de37a28abe3_1328x881.png 848w, https://substackcdn.com/image/fetch/$s_!IlUK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8151fee6-abd2-408d-9c46-4de37a28abe3_1328x881.png 1272w, 
https://substackcdn.com/image/fetch/$s_!IlUK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8151fee6-abd2-408d-9c46-4de37a28abe3_1328x881.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Intelligent AI Delegation</strong></h3><p><strong><a href="https://arxiv.org/abs/2602.11865">https://arxiv.org/abs/2602.11865</a></strong></p><p>Existing delegation methods run on heuristics that collapse when environments change or sub-agents fail. 
Intelligent AI Delegation reframes the problem: passing a task is not the same as transferring authority, responsibility, and accountability; conflating the two is where multi-agent systems break down. The framework draws from principal-agent theory in economics, assumes zero trust at every delegation boundary, and applies to both human and AI delegators across complex agent networks. Misalignment, reward gaming, and deceptive behavior all compound as agents delegate to other agents. The paper sets out to define the protocol layer the agentic web needs before it can safely scale.</p><h3><strong>RISE-Video: Can Video Generators Decode Implicit World Rules?</strong></h3><p><strong><a href="https://arxiv.org/abs/2602.05986">https://arxiv.org/abs/2602.05986</a></strong> | <strong><a href="https://github.com/VisionXLab/Rise-Video">GitHub</a></strong></p><p>Video models produce visually impressive outputs, but can they reason about physics? RISE-Video shifts evaluation from aesthetics to cognitive understanding with 467 human-annotated samples spanning eight categories. The benchmark tests whether models grasp implicit constraints like spatial dynamics, temporal consistency, physical rationality, and causality. 
Testing 11 state-of-the-art models revealed pervasive failures when simulating complex scenarios under implicit rules, exposing the gap between visual fidelity and genuine world understanding.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Odej!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f3bd893-d05c-41ed-9821-7c41ef6b4955_1509x797.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Odej!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f3bd893-d05c-41ed-9821-7c41ef6b4955_1509x797.png 424w, https://substackcdn.com/image/fetch/$s_!Odej!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f3bd893-d05c-41ed-9821-7c41ef6b4955_1509x797.png 848w, https://substackcdn.com/image/fetch/$s_!Odej!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f3bd893-d05c-41ed-9821-7c41ef6b4955_1509x797.png 1272w, https://substackcdn.com/image/fetch/$s_!Odej!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f3bd893-d05c-41ed-9821-7c41ef6b4955_1509x797.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Odej!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f3bd893-d05c-41ed-9821-7c41ef6b4955_1509x797.png" width="1456" height="769" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1f3bd893-d05c-41ed-9821-7c41ef6b4955_1509x797.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:769,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Odej!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f3bd893-d05c-41ed-9821-7c41ef6b4955_1509x797.png 424w, https://substackcdn.com/image/fetch/$s_!Odej!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f3bd893-d05c-41ed-9821-7c41ef6b4955_1509x797.png 848w, https://substackcdn.com/image/fetch/$s_!Odej!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f3bd893-d05c-41ed-9821-7c41ef6b4955_1509x797.png 1272w, https://substackcdn.com/image/fetch/$s_!Odej!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f3bd893-d05c-41ed-9821-7c41ef6b4955_1509x797.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>UMEM: Unified Memory Extraction and Management Framework for Generalizable Memory</strong></h3><p><strong><a href="https://arxiv.org/abs/2602.10652">https://arxiv.org/abs/2602.10652</a></strong> | <strong><a href="https://github.com/AIDC-AI/Marco-DeepResearch">GitHub</a></strong></p><p>Memory-enabled agents typically optimize management while treating extraction as static, accumulating instance-specific noise rather than robust insights. UMEM jointly optimizes both through Semantic Neighborhood Modeling, evaluating memory utility across clusters of semantically related queries rather than individual instances. Trained with neighborhood-level marginal utility rewards via GRPO, the approach achieves up to 10.67% improvement on multi-turn tasks while maintaining monotonic growth during continuous evolution. 
Agents that actually learn from experience rather than just retrieve logs.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!O4Xu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1975aee-4998-45ad-b3ad-7bc970d2f3ca_810x455.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!O4Xu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1975aee-4998-45ad-b3ad-7bc970d2f3ca_810x455.png 424w, https://substackcdn.com/image/fetch/$s_!O4Xu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1975aee-4998-45ad-b3ad-7bc970d2f3ca_810x455.png 848w, https://substackcdn.com/image/fetch/$s_!O4Xu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1975aee-4998-45ad-b3ad-7bc970d2f3ca_810x455.png 1272w, https://substackcdn.com/image/fetch/$s_!O4Xu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1975aee-4998-45ad-b3ad-7bc970d2f3ca_810x455.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!O4Xu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1975aee-4998-45ad-b3ad-7bc970d2f3ca_810x455.png" width="810" height="455" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f1975aee-4998-45ad-b3ad-7bc970d2f3ca_810x455.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:455,&quot;width&quot;:810,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!O4Xu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1975aee-4998-45ad-b3ad-7bc970d2f3ca_810x455.png 424w, https://substackcdn.com/image/fetch/$s_!O4Xu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1975aee-4998-45ad-b3ad-7bc970d2f3ca_810x455.png 848w, https://substackcdn.com/image/fetch/$s_!O4Xu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1975aee-4998-45ad-b3ad-7bc970d2f3ca_810x455.png 1272w, https://substackcdn.com/image/fetch/$s_!O4Xu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1975aee-4998-45ad-b3ad-7bc970d2f3ca_810x455.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning</strong></h3><p><strong><a href="https://arxiv.org/abs/2602.10560">https://arxiv.org/abs/2602.10560</a></strong></p><p>Long-context reasoning faces critical issues: memory explodes from indiscriminate updates on evidence-free chunks, and loops continue unnecessarily after gathering sufficient evidence. GRU-Mem introduces update and exit gates controlled by text. Memory updates only when the update gate opens, and the loop terminates immediately when the exit gate opens. 
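</p><p>A minimal sketch of that control flow (the gate predicates here are stubs; in GRU-Mem the decisions come from the model itself and are trained with RL):</p>

```python
# Toy sketch of a gated recurrent memory loop (stub gates, hypothetical names).

def gated_read(chunks, update_gate, exit_gate):
    memory = []
    for chunk in chunks:
        if update_gate(chunk, memory):   # write only on evidence-bearing chunks
            memory.append(chunk)
        if exit_gate(memory):            # stop as soon as evidence suffices
            break
    return memory

chunks = ["noise", "evidence A", "noise", "evidence B", "unread tail"]
update = lambda c, m: c.startswith("evidence")
done = lambda m: len(m) >= 2
print(gated_read(chunks, update, done))  # ['evidence A', 'evidence B']
```

<p>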
Trained with end-to-end RL using separate rewards for correct updating and exiting behaviors, GRU-Mem outperforms vanilla MemAgent with up to 4x inference speed acceleration.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!maBG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb545f47-4e86-405d-a51e-5d59f2c1c56f_2048x565.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!maBG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb545f47-4e86-405d-a51e-5d59f2c1c56f_2048x565.png 424w, https://substackcdn.com/image/fetch/$s_!maBG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb545f47-4e86-405d-a51e-5d59f2c1c56f_2048x565.png 848w, https://substackcdn.com/image/fetch/$s_!maBG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb545f47-4e86-405d-a51e-5d59f2c1c56f_2048x565.png 1272w, https://substackcdn.com/image/fetch/$s_!maBG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb545f47-4e86-405d-a51e-5d59f2c1c56f_2048x565.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!maBG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb545f47-4e86-405d-a51e-5d59f2c1c56f_2048x565.png" width="1456" height="402" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bb545f47-4e86-405d-a51e-5d59f2c1c56f_2048x565.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:402,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!maBG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb545f47-4e86-405d-a51e-5d59f2c1c56f_2048x565.png 424w, https://substackcdn.com/image/fetch/$s_!maBG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb545f47-4e86-405d-a51e-5d59f2c1c56f_2048x565.png 848w, https://substackcdn.com/image/fetch/$s_!maBG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb545f47-4e86-405d-a51e-5d59f2c1c56f_2048x565.png 1272w, https://substackcdn.com/image/fetch/$s_!maBG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb545f47-4e86-405d-a51e-5d59f2c1c56f_2048x565.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h2><strong>&#127909; 4 Videos</strong></h2><h3><strong>Owning the AI Pareto Frontier &#8212; Jeff Dean</strong></h3><div id="youtube2-F_1oDPWxpFQ" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;F_1oDPWxpFQ&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/F_1oDPWxpFQ?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Jeff Dean unpacks the Pareto frontier in AI scaling, balancing compute, energy efficiency and model performance. Google&#8217;s Chief AI Scientist discusses the unification of their AI teams and why distillation is becoming the engine behind efficient models. 
Essential viewing for understanding the tradeoffs between model capability and practical deployment constraints.</p><h3><strong>State of AI in 2026: LLMs, Coding, Scaling Laws &amp; Agents</strong></h3><div id="youtube2-EV7WhVT270Q" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;EV7WhVT270Q&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/EV7WhVT270Q?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Nathan Lambert and Sebastian Raschka join Lex Fridman to break down where we actually stand. They cover the reality of coding agents (echoing FeatureBench findings), scaling laws beyond just &#8220;more compute,&#8221; and practical challenges of building reasoning models. Two of the best technical communicators in AI deliver a comprehensive status report on 2026&#8217;s landscape.</p><h3><strong>DIY Dev Tools: The Shift to &#8220;Build&#8221; Over &#8220;Buy&#8221;</strong></h3><div id="youtube2-LC1mKvLWZ2E" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;LC1mKvLWZ2E&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/LC1mKvLWZ2E?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>CJ Hess demonstrates &#8220;Flowy,&#8221; a custom tool he built to visualize coding plans. Instead of relying on static Markdown or finicky diagrams, he created a system where Claude generates JSON that renders into interactive flowcharts and UI mockups. 
When AI can write the code, building bespoke internal tools optimized for your specific workflow is often faster and better than buying generic SaaS.</p><h3><strong>Effective Context Engineering for AI Agents</strong></h3><div id="youtube2-nkJXADeI62c" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;nkJXADeI62c&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/nkJXADeI62c?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Dave Ebbelaar explains why agents fail not from bad instructions but from poor context management. In short, context engineering is king. He offers practical strategies for structuring workflows and managing the context window as a limited resource rather than a magic bucket.</p><div><hr></div><h2><strong>&#128240; 3 Curated Reads</strong></h2><h3><strong>Rubric-Based Rewards for RL</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:186046978,&quot;url&quot;:&quot;https://cameronrwolfe.substack.com/p/rubric-rl&quot;,&quot;publication_id&quot;:1092659,&quot;publication_name&quot;:&quot;Deep (Learning) Focus&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!87xa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9b43fb-52d5-40da-995d-5b7cd3f91064_896x896.png&quot;,&quot;title&quot;:&quot;Rubric-Based Rewards for RL&quot;,&quot;truncated_body_text&quot;:&quot;Many of the recent capability gains in large language models (LLMs) have been a product of advancements in reinforcement learning (RL). 
In particular, RL with verifiable rewards (RLVR) has drastically improved LLM capabilities by using rules-based, deterministic correctness checks (e.g., passing the test cases for a coding problem&#8230;&quot;,&quot;date&quot;:&quot;2026-02-16T10:33:41.957Z&quot;,&quot;like_count&quot;:64,&quot;comment_count&quot;:5,&quot;bylines&quot;:[],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://cameronrwolfe.substack.com/p/rubric-rl?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!87xa!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9b43fb-52d5-40da-995d-5b7cd3f91064_896x896.png" loading="lazy"><span class="embedded-post-publication-name">Deep (Learning) Focus</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Rubric-Based Rewards for RL</div></div><div class="embedded-post-body">Many of the recent capability gains in large language models (LLMs) have been a product of advancements in reinforcement learning (RL). In particular, RL with verifiable rewards (RLVR) has drastically improved LLM capabilities by using rules-based, deterministic correctness checks (e.g., passing the test cases for a coding problem&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">2 months ago &#183; 64 likes &#183; 5 comments</div></a></div><p>Cameron Wolfe explains that while RL has excelled in domains with clear answers like math and code, Rubric-Based RL uses LLMs as judges with strict rubrics to provide reward signals in subjective domains. 
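</p><p>The mechanic can be sketched as a weighted aggregate of per-criterion judge scores. The judge below is a stub standing in for an LLM call, and the rubric weights are purely illustrative:</p>

```python
# Minimal sketch of a rubric-based reward (stub judge, illustrative rubric).

RUBRIC = {
    "follows the brief": 0.5,   # criterion -> weight
    "clear structure": 0.3,
    "original voice": 0.2,
}

def rubric_reward(response, judge, rubric=RUBRIC):
    """Weighted sum of per-criterion scores, each in [0, 1]."""
    return sum(w * judge(response, criterion) for criterion, w in rubric.items())

judge = lambda resp, crit: 1.0 if crit in resp.lower() else 0.0  # stub judge
print(round(rubric_reward("Follows the brief with clear structure.", judge), 2))  # 0.8
```

<p>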
Closing that gap could unlock reasoning models trained for writing, creativity, and strategy, not just problems with verifiable solutions.</p><h3><strong>Recursive Language Models: To the Rescue</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:187388346,&quot;url&quot;:&quot;https://wheremachinesthink.substack.com/p/recursive-language-models-can-a-simple&quot;,&quot;publication_id&quot;:5277805,&quot;publication_name&quot;:&quot;WHERE MACHINES THINK&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Yem8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6531b6c-86e3-4240-8372-b5a887412b64_608x608.png&quot;,&quot;title&quot;:&quot;Recursive Language Models: Can a simple method potentially unlock new regime of inference-time scaling?&quot;,&quot;truncated_body_text&quot;:&quot;IN THE FALL OF 2024, there were concerns that neural scaling laws were saturating. Throwing more compute and data to train ever-bigger large language models (LLMs) was showing diminishing returns. And then, OpenAI released its o1 series of &#8220;reasoning&#8221; models, unlocking an entirely new way to improve the performance of LLMs. Ope&#8230;&quot;,&quot;date&quot;:&quot;2026-02-10T15:58:25.501Z&quot;,&quot;like_count&quot;:17,&quot;comment_count&quot;:1,&quot;bylines&quot;:[{&quot;id&quot;:328415354,&quot;name&quot;:&quot;Anil Ananthaswamy&quot;,&quot;handle&quot;:&quot;anilananth&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cf1b6a95-42d9-4ec4-ac36-43daab10f105_3024x3024.jpeg&quot;,&quot;bio&quot;:&quot;Ex-Software Eng. / Author / Former Dep. News Editor, New Scientist. Bylines in NS, Nature, SciAm, Quanta &amp; more. Books: The Edge of Physics, The Man Who Wasn't There, Through Two Doors at Once and Why Machines Learn. 
Prof of Practice, IIT-Madras&quot;,&quot;profile_set_up_at&quot;:&quot;2025-06-09T01:26:52.231Z&quot;,&quot;reader_installed_at&quot;:&quot;2026-01-22T08:27:17.455Z&quot;,&quot;publicationUsers&quot;:[],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:null,&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://wheremachinesthink.substack.com/p/recursive-language-models-can-a-simple?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!Yem8!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6531b6c-86e3-4240-8372-b5a887412b64_608x608.png" loading="lazy"><span class="embedded-post-publication-name">WHERE MACHINES THINK</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Recursive Language Models: Can a simple method potentially unlock new regime of inference-time scaling?</div></div><div class="embedded-post-body">IN THE FALL OF 2024, there were concerns that neural scaling laws were saturating. Throwing more compute and data to train ever-bigger large language models (LLMs) was showing diminishing returns. And then, OpenAI released its o1 series of &#8220;reasoning&#8221; models, unlocking an entirely new way to improve the performance of LLMs. 
Ope&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">2 months ago &#183; 17 likes &#183; 1 comment &#183; Anil Ananthaswamy</div></a></div><p>What if an LLM could call itself like a function? Recursive Language Models decompose complex prompts and recursively generate their own context. This architectural shift treats models less like text predictors and more like computer programs with call stacks, handling complexity through recursion rather than scale.</p><h3><strong>Taste for Makers</strong></h3><p><strong><a href="https://paulgraham.com/taste.html">https://paulgraham.com/taste.html</a></strong></p><p>Paul Graham breaks down the universal principles of good design across disciplines like math, coding, and architecture. For anyone building AI products today, treating &#8220;quality&#8221; as an objective, engineerable standard rather than a vague feeling is a massive competitive advantage.</p><div><hr></div><h2><strong>&#128736; 2 Tools &amp; Repos</strong></h2><h3><strong>Reasoning from Scratch (Chapter 7 Code)</strong></h3><p><strong><a href="https://github.com/rasbt/reasoning-from-scratch/blob/main/ch07/01_main-chapter-code/ch07_main.ipynb">https://github.com/rasbt/reasoning-from-scratch/blob/main/ch07/01_main-chapter-code/ch07_main.ipynb</a></strong></p><p>Sebastian Raschka&#8217;s notebook implements Reinforcement Learning with Group Relative Policy Optimization (GRPO) from scratch. If you want to understand how models like DeepSeek-R1 work under the hood, this walks through the actual mechanics rather than hiding behind abstractions. Essential for anyone implementing reasoning capabilities.</p><h3><strong>LLM Course Roadmap</strong></h3><p><strong><a href="https://github.com/mlabonne/llm-course">https://github.com/mlabonne/llm-course</a></strong></p><p>Maxime Labonne&#8217;s repository remains the gold standard for self-learning. 
This curated roadmap takes you from fundamentals to fine-tuning your own models, recently updated with sections on agentic workflows and evaluation. The curriculum balances theoretical explanations with practical notebooks and hands-on projects mirroring real-world applications.</p><div><hr></div><h2><strong>&#127891; 1 Pick of the Week</strong></h2><h3><strong>microGPT</strong></h3><p><strong><a href="https://karpathy.github.io/2026/02/12/microgpt/">https://karpathy.github.io/2026/02/12/microgpt/</a></strong></p><p>Andrej Karpathy stripped GPT to its absolute core: 243 lines of pure Python with zero dependencies. No PyTorch, no NumPy, no frameworks. Just the full algorithmic content needed for training and inference: dataset handling, tokenizer, autograd engine, GPT-2-like architecture, Adam optimizer, training loop, and inference loop. The code fits perfectly across three columns and represents a decade-long obsession to simplify LLMs to bare essentials. Run it from a single file, understand every mathematical operation, see exactly how attention works without abstraction layers hiding the mechanics. For anyone who wants to truly understand transformers rather than just use them, this is the definitive resource.</p><div><hr></div><p><em>Thanks for reading The Tokenizer! If you found something useful here, share it with someone who might benefit. 
And if you want more curated insights like this, consider subscribing to Gradient Ascent.</em></p>]]></content:encoded></item><item><title><![CDATA[Google's Viral Paper Banana, How to Systemize Claude Code, and Stanford on Agents & RAG - 📚 The Tokenizer Edition #17]]></title><description><![CDATA[This week's most valuable AI resources]]></description><link>https://newsletter.artofsaience.com/p/googles-viral-paper-banana-how-to</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/googles-viral-paper-banana-how-to</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Wed, 11 Feb 2026 02:27:46 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/g6z_4TMDiaE" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! This week brought Google&#8217;s Paper Banana into the spotlight with its interactive approach to understanding research, a comprehensive system for making Claude Code actually work at scale, and Stanford&#8217;s practical take on building agents with RAG. Beyond the headline picks, CodeOCR&#8217;s 8x code compression through visual representation and DFlash beating EAGLE-3 by 2.5x suggest we&#8217;re seeing real efficiency breakthroughs across the board.</p><h3><strong>New here?</strong></h3><p><em>The Tokenizer is my resource-focused newsletter edition where I curate the best papers, videos, articles, tools, and learning resources from across the AI landscape. 
Consider it your weekly dose of everything you need to stay ahead in machine learning.</em></p><div><hr></div><p>I&#8217;m teaching ML &amp; Generative AI System Design on Feb 28th / March 1st with Packt.</p><p>We&#8217;ll cover AI systems that use RAG and traditional ML design principles for building solid AI products: making systems reliable, measuring what matters, and designing architectures that work in production.</p><p>Through live discussions, guided exercises, and team-based design sprints, you&#8217;ll practice solving system-level AI problems and walk away with frameworks you can apply immediately at work.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=SairamNewsletter" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Kh1G!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbba1e9ed-12f1-42cc-af9a-bc5c647cb669_1280x640.png 424w, https://substackcdn.com/image/fetch/$s_!Kh1G!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbba1e9ed-12f1-42cc-af9a-bc5c647cb669_1280x640.png 848w, https://substackcdn.com/image/fetch/$s_!Kh1G!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbba1e9ed-12f1-42cc-af9a-bc5c647cb669_1280x640.png 1272w, https://substackcdn.com/image/fetch/$s_!Kh1G!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbba1e9ed-12f1-42cc-af9a-bc5c647cb669_1280x640.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Kh1G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbba1e9ed-12f1-42cc-af9a-bc5c647cb669_1280x640.png" width="1280" height="640" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bba1e9ed-12f1-42cc-af9a-bc5c647cb669_1280x640.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:640,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=SairamNewsletter&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Kh1G!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbba1e9ed-12f1-42cc-af9a-bc5c647cb669_1280x640.png 424w, https://substackcdn.com/image/fetch/$s_!Kh1G!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbba1e9ed-12f1-42cc-af9a-bc5c647cb669_1280x640.png 848w, https://substackcdn.com/image/fetch/$s_!Kh1G!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbba1e9ed-12f1-42cc-af9a-bc5c647cb669_1280x640.png 1272w, https://substackcdn.com/image/fetch/$s_!Kh1G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbba1e9ed-12f1-42cc-af9a-bc5c647cb669_1280x640.png 1456w" sizes="100vw" fetchpriority="high"></picture><div 
class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Use code <strong>FLASH40</strong> for 40% off: <a href="https://lnkd.in/gqTrvsuS"> </a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=SairamNewsletter&quot;,&quot;text&quot;:&quot;Register for the Workshop&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=SairamNewsletter"><span>Register for the 
Workshop</span></a></p><p>Questions? Drop a comment or DM me.</p><div><hr></div><h2><strong>TL;DR</strong></h2><p><strong>What caught my attention this week:</strong></p><p><strong>&#128196; Papers:</strong> Code-as-image achieving 8x compression, speculative decoding beating EAGLE-3 by 2.5x, video models learning from YouTube demonstrations, and self-optimizing RL systems</p><p><strong>&#127909; Videos:</strong> Systematic Claude Code workflows from Anthropic users, Stanford on agents and RAG patterns, visual MoE explanations, and vector indexing methods</p><p><strong>&#128240; Reads:</strong> Why creativity can&#8217;t be interpolated, 10x cheaper tokens through prompt caching, and 12x faster MoE training with Unsloth</p><p><strong>&#128736; Tools:</strong> Language extraction for multilingual text and comprehensive LLM evaluation frameworks</p><p><strong>&#127891; Learning:</strong> Google&#8217;s viral Paper Banana for interactive research paper exploration</p><div><hr></div><h2><strong>&#128196; 5 Papers</strong></h2><h3><strong>Demo-ICL: In-Context Learning for Procedural Video Knowledge Acquisition</strong></h3><p><strong><a href="https://arxiv.org/abs/2602.08439">https://arxiv.org/abs/2602.08439</a> |<a href="https://github.com/dongyh20/Demo-ICL"> GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4Hwz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30afb6f3-d6ba-47f0-8b72-5fd5a3ff379a_1944x1158.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4Hwz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30afb6f3-d6ba-47f0-8b72-5fd5a3ff379a_1944x1158.png 424w, 
https://substackcdn.com/image/fetch/$s_!4Hwz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30afb6f3-d6ba-47f0-8b72-5fd5a3ff379a_1944x1158.png 848w, https://substackcdn.com/image/fetch/$s_!4Hwz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30afb6f3-d6ba-47f0-8b72-5fd5a3ff379a_1944x1158.png 1272w, https://substackcdn.com/image/fetch/$s_!4Hwz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30afb6f3-d6ba-47f0-8b72-5fd5a3ff379a_1944x1158.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4Hwz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30afb6f3-d6ba-47f0-8b72-5fd5a3ff379a_1944x1158.png" width="1456" height="867" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/30afb6f3-d6ba-47f0-8b72-5fd5a3ff379a_1944x1158.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:867,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4Hwz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30afb6f3-d6ba-47f0-8b72-5fd5a3ff379a_1944x1158.png 424w, 
https://substackcdn.com/image/fetch/$s_!4Hwz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30afb6f3-d6ba-47f0-8b72-5fd5a3ff379a_1944x1158.png 848w, https://substackcdn.com/image/fetch/$s_!4Hwz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30afb6f3-d6ba-47f0-8b72-5fd5a3ff379a_1944x1158.png 1272w, https://substackcdn.com/image/fetch/$s_!4Hwz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30afb6f3-d6ba-47f0-8b72-5fd5a3ff379a_1944x1158.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Most video benchmarks test static knowledge rather than whether models can actually learn from demonstrations. Demo-ICL tackles this directly with a benchmark built from 1200 instructional YouTube videos. The system provides models with text or video demonstrations, then asks them to answer questions about target videos by applying what they learned from the examples. The two-stage training approach combines video-supervised fine-tuning with information-assisted preference optimization, improving how models extract and apply procedural knowledge from demonstrations.</p><h3><strong>DFlash: Block Diffusion for Flash Speculative Decoding</strong></h3><p><strong><a href="https://arxiv.org/abs/2602.06036">https://arxiv.org/abs/2602.06036</a> |<a href="https://github.com/z-lab/dflash"> GitHub</a></strong></p><div id="youtube2-sUCUxbkeABA" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;sUCUxbkeABA&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/sUCUxbkeABA?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Autoregressive models decode sequentially, creating a latency bottleneck that speculative decoding helps address. DFlash uses a lightweight block diffusion model that generates draft tokens in parallel rather than one at a time. By conditioning the draft model on context features from the target LLM, it maintains high acceptance rates while generating entire blocks simultaneously. 
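</p><p><em>The draft-then-verify idea is easy to see in miniature. The sketch below is not DFlash&#8217;s block-diffusion drafter; it&#8217;s a generic speculative-decoding loop with toy deterministic stand-ins for the draft and target models (and it verifies positions one at a time, where a real implementation batches verification into a single forward pass):</em></p>

```python
def draft_model(context, k):
    # Cheap drafter (toy): guess the next k tokens as successive increments.
    return [(context[-1] + i) % 10 for i in range(1, k + 1)]

def target_model(context):
    # Expensive "ground truth" model (toy): the true next token.
    return (context[-1] + 1) % 10

def speculative_step(context, k=4):
    """Propose k draft tokens, then verify against the target model.
    Accept the agreeing prefix; on the first mismatch, keep the
    target's token instead and stop (the standard accept rule)."""
    accepted, ctx = [], list(context)
    for d in draft_model(ctx, k):
        t = target_model(ctx)
        accepted.append(t if t != d else d)
        ctx.append(accepted[-1])
        if t != d:
            break
    return accepted

print(speculative_step([3]))  # drafter matches the target here -> [4, 5, 6, 7]
```

<p><em>A full implementation batches the k verifications into one target forward pass, which is where the wall-clock win comes from; the drafter&#8217;s acceptance rate decides how many tokens each pass yields.</em></p><p>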
The system achieves over 6x lossless acceleration across various tasks, delivering up to 2.5x higher speedup than EAGLE-3.</p><h3><strong>CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding</strong></h3><p><strong><a href="https://arxiv.org/abs/2602.01785">https://arxiv.org/abs/2602.01785</a> |<a href="https://github.com/YerbaPage/CodeOCR"> GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!F4Rc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94786af5-1802-4da4-be2b-60bedf7d8a73_932x313.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!F4Rc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94786af5-1802-4da4-be2b-60bedf7d8a73_932x313.png 424w, https://substackcdn.com/image/fetch/$s_!F4Rc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94786af5-1802-4da4-be2b-60bedf7d8a73_932x313.png 848w, https://substackcdn.com/image/fetch/$s_!F4Rc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94786af5-1802-4da4-be2b-60bedf7d8a73_932x313.png 1272w, https://substackcdn.com/image/fetch/$s_!F4Rc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94786af5-1802-4da4-be2b-60bedf7d8a73_932x313.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!F4Rc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94786af5-1802-4da4-be2b-60bedf7d8a73_932x313.png" width="932" height="313" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/94786af5-1802-4da4-be2b-60bedf7d8a73_932x313.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:313,&quot;width&quot;:932,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!F4Rc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94786af5-1802-4da4-be2b-60bedf7d8a73_932x313.png 424w, https://substackcdn.com/image/fetch/$s_!F4Rc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94786af5-1802-4da4-be2b-60bedf7d8a73_932x313.png 848w, https://substackcdn.com/image/fetch/$s_!F4Rc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94786af5-1802-4da4-be2b-60bedf7d8a73_932x313.png 1272w, https://substackcdn.com/image/fetch/$s_!F4Rc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94786af5-1802-4da4-be2b-60bedf7d8a73_932x313.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Text-based code processing creates a linear scaling problem as codebases grow. CodeOCR explores representing code as rendered images instead, taking advantage of visual compression capabilities that text lacks. The research shows vision-language models can understand code images at up to 8x compression while maintaining performance. Tasks like clone detection actually improve slightly under compression, and syntax highlighting boosts code completion performance at 4x compression. 
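</p><p><em>A quick back-of-envelope shows where the visual headroom comes from. Every constant below (~4 characters per BPE token, an 8x16 px glyph, 28x28 px patches, a 2x downscale) is an illustrative assumption of mine, not a figure from the paper:</em></p>

```python
# Text path vs. vision path for the same source code. All constants
# here are illustrative assumptions, not figures from the CodeOCR paper.
code = "def add(a, b):\n    return a + b\n" * 40     # ~1.3 KB of code

text_tokens = len(code) / 4            # rough heuristic: ~4 chars per BPE token

pixels = len(code) * 8 * 16            # render at ~8x16 px per character
scale = 0.5                            # downscale the render 2x per side
image_patches = pixels * scale**2 / (28 * 28)   # ViT-style 28x28 patches

ratio = text_tokens / image_patches
print(round(ratio, 1))                 # ~6x fewer vision tokens than text tokens
```

<p><em>Pushing the downscale further is what moves this toward the paper&#8217;s 8x; the empirical question CodeOCR studies is how aggressive you can get before task performance drops.</em></p><p>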
The approach suggests image-based code representation could fundamentally change how we handle large-scale code understanding.</p><h3><strong>RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System</strong></h3><p><strong><a href="https://arxiv.org/abs/2602.02488">https://arxiv.org/abs/2602.02488</a> |<a href="https://github.com/Gen-Verse/Open-AgentRL"> GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NaUD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ca404c5-3765-4c89-93af-a6060a62b51c_757x184.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NaUD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ca404c5-3765-4c89-93af-a6060a62b51c_757x184.png 424w, https://substackcdn.com/image/fetch/$s_!NaUD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ca404c5-3765-4c89-93af-a6060a62b51c_757x184.png 848w, https://substackcdn.com/image/fetch/$s_!NaUD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ca404c5-3765-4c89-93af-a6060a62b51c_757x184.png 1272w, https://substackcdn.com/image/fetch/$s_!NaUD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ca404c5-3765-4c89-93af-a6060a62b51c_757x184.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NaUD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ca404c5-3765-4c89-93af-a6060a62b51c_757x184.png" width="757" height="184" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ca404c5-3765-4c89-93af-a6060a62b51c_757x184.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:184,&quot;width&quot;:757,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NaUD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ca404c5-3765-4c89-93af-a6060a62b51c_757x184.png 424w, https://substackcdn.com/image/fetch/$s_!NaUD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ca404c5-3765-4c89-93af-a6060a62b51c_757x184.png 848w, https://substackcdn.com/image/fetch/$s_!NaUD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ca404c5-3765-4c89-93af-a6060a62b51c_757x184.png 1272w, https://substackcdn.com/image/fetch/$s_!NaUD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ca404c5-3765-4c89-93af-a6060a62b51c_757x184.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Traditional reinforcement learning keeps environments, policies, and reward models separate. RLAnything makes all three components adapt together through closed-loop optimization. The policy trains on combined step-wise and outcome signals, while the reward model jointly optimizes through consistency feedback. 
Environment tasks automatically adjust difficulty based on critic feedback from both the policy and reward model. The system delivers substantial gains: 9.1% improvement on OSWorld for Qwen3-VL-8B-Thinking, and 18.7% and 11.9% boosts on AlfWorld and LiveBench for Qwen2.5-7B-Instruct.</p><h3><strong>Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models</strong></h3><p><strong><a href="https://arxiv.org/abs/2602.07026">https://arxiv.org/abs/2602.07026</a></strong></p><p>Despite progress in multimodal contrastive learning, a persistent geometric anomaly remains: embeddings of different modalities expressing identical semantics occupy systematically offset regions. This modality gap creates alignment challenges that existing approaches handle inefficiently. The research introduces Fixed-frame Modality Gap Theory, which decomposes the gap into stable biases and anisotropic residuals. ReAlign aligns text representations into image distribution space using statistics from unpaired data through anchor, trace, and centroid alignment. 
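</p><p><em>The centroid piece is the easiest to picture: estimate each modality&#8217;s mean embedding from unpaired samples, then translate text embeddings by the offset. A minimal sketch with toy 3-d vectors (ReAlign&#8217;s anchor and trace terms are omitted):</em></p>

```python
def centroid(vectors):
    # Per-dimension mean of a list of equal-length vectors.
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def align_to_image_space(text_vecs, image_vecs):
    # Translate every text embedding by the (image - text) centroid offset.
    t_c, i_c = centroid(text_vecs), centroid(image_vecs)
    offset = [i - t for t, i in zip(t_c, i_c)]
    return [[x + o for x, o in zip(v, offset)] for v in text_vecs]

# Toy 3-d "embeddings": the text cluster sits offset from the image cluster.
text  = [[0.0, 0.0, 1.0], [0.2, 0.0, 1.0]]
image = [[1.0, 1.0, 0.0], [1.2, 1.0, 0.0]]

aligned = align_to_image_space(text, image)
print(centroid(aligned))   # ~= centroid(image), roughly [1.1, 1.0, 0.0]
```

<p><em>Because the offset is estimated from distribution statistics alone, this step needs no paired image-text data, which is exactly the dependence the paper is trying to reduce.</em></p><p>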
Building on this, ReVision enables MLLMs to learn visual representation distribution from unpaired text before visual instruction tuning, reducing dependence on expensive image-text pairs.</p><div><hr></div><h2><strong>&#127909; 4 Videos</strong></h2><h3><strong>How to Make Claude Code Better Every Time You Use It (Full System)</strong></h3><div id="youtube2-g6z_4TMDiaE" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;g6z_4TMDiaE&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/g6z_4TMDiaE?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Stop fighting with Claude Code and start building systematic workflows that compound over time. This comprehensive guide covers setting up persistent context files that Claude references across all sessions, structuring your codebase so Claude understands project architecture from the start, and creating reusable prompt patterns that extract better reasoning. You&#8217;ll learn how to build a project-specific knowledge base, manage multiple Claude sessions working on different parts of your code simultaneously without conflicts, and structure requests to get maintainable output instead of throwaway code. 
The system turns Claude Code from a one-off tool into a genuine development partner.</p><h3><strong>MoE, Visually Explained</strong></h3><div id="youtube2-0QQlYR1r6pQ" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;0QQlYR1r6pQ&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/0QQlYR1r6pQ?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Mixture-of-Experts architectures power many frontier models but remain conceptually opaque. This visual breakdown makes the core mechanics tangible: how router networks decide which experts process each token, why sparse activation improves efficiency without sacrificing performance, and what trade-offs exist between expert count and model capacity. The explanations focus on intuition over equations, making MoE accessible without dumbing down the actual complexity involved.</p><h3><strong>What is Indexing? Indexing Methods for Vector Retrieval</strong></h3><div id="youtube2-NytKzh8avhw" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;NytKzh8avhw&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/NytKzh8avhw?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Vector databases require efficient retrieval mechanisms as collections scale. This overview covers indexing approaches from flat search through hierarchical navigable small worlds (HNSW) and inverted file indexes (IVF). 
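</p><p><em>The core trade-off is easy to demo: flat search scans everything and is exact, while an inverted-file (IVF) index probes only the list under the nearest coarse centroid. A toy sketch with hand-picked 2-d vectors:</em></p>

```python
import math

def flat_search(query, vectors):
    # Exact nearest neighbour: scan every stored vector.
    return min(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))

def ivf_search(query, vectors, centroids, lists):
    # Approximate: find the nearest coarse centroid, then scan only
    # the inverted list filed under it.
    c = min(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))
    return min(lists[c], key=lambda i: math.dist(query, vectors[i]))

# Toy 2-d "embeddings" in two well-separated clusters.
vectors   = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9)]
centroids = [(0.1, 0.05), (5.1, 5.0)]     # the coarse quantizer
lists     = {0: [0, 1], 1: [2, 3]}        # inverted lists per centroid

q = (5.0, 5.0)
print(flat_search(q, vectors), ivf_search(q, vectors, centroids, lists))
# both return index 2, but IVF only measured half the vectors
```

<p><em>Real systems probe several lists at query time and quantize vectors within each list; HNSW takes a different route entirely, walking a layered proximity graph instead of partitioning the space.</em></p><p>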
You&#8217;ll understand when approximate nearest neighbor search makes sense versus exact matching, how different index types trade off speed against accuracy, and which methods work best for specific retrieval patterns.</p><h3><strong>Agents, Prompts, and RAG</strong></h3><div id="youtube2-k1njvbBmfsw" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;k1njvbBmfsw&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/k1njvbBmfsw?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Stanford&#8217;s practical breakdown of building reliable agents using retrieval-augmented generation cuts through the hype. The discussion examines when to use agents versus simpler prompt chains, how to structure RAG systems that agents can query effectively, and debugging approaches when agentic systems fail. The focus stays on production considerations: handling edge cases, managing token budgets, and building systems that degrade gracefully rather than failing catastrophically. Essential viewing for anyone moving beyond toy agent demos to production deployments.</p><div><hr></div><h2><strong>&#128240; 3 Curated Reads</strong></h2><h3><strong>Why Creativity Cannot Be Interpolated</strong></h3><p><strong><a href="https://archive.mlst.ai/paper/why-creativity-cannot-be-interpolated/">https://archive.mlst.ai/paper/why-creativity-cannot-be-interpolated/</a></strong></p><p>This exploration challenges the assumption that creative breakthroughs emerge from incremental improvements. 
The argument centers on fundamental limitations of interpolation-based learning: models trained to predict the next token in existing distributions struggle to generate genuinely novel ideas that exist outside their training manifold. The piece examines what this means for AI&#8217;s creative capabilities and why certain types of innovation may require fundamentally different approaches than current architectures support.</p><h3><strong>Prompt caching: 10x cheaper LLM tokens, but how?</strong></h3><p><strong><a href="https://ngrok.com/blog/prompt-caching/">https://ngrok.com/blog/prompt-caching/</a></strong></p><p>Prompt caching stores processed prefixes so repeated context doesn&#8217;t get recomputed on every request. This technical breakdown explains how caching works under the hood: which parts of prompts get cached, how providers handle cache invalidation, and what billing implications exist. The post provides concrete strategies for structuring prompts to maximize cache hits, quantifies actual cost savings across different use patterns, and identifies scenarios where caching delivers minimal benefit.</p><h3><strong>Fine-tune MoE Models 12x Faster with Unsloth</strong></h3><p><strong><a href="https://unsloth.ai/docs/new/faster-moe">https://unsloth.ai/docs/new/faster-moe</a></strong></p><p>Training Mixture-of-Experts models typically requires massive compute due to their architecture. Unsloth achieves up to 12x speedups over Transformers v4 (and roughly 2x over the already-optimized Transformers v5) through custom Triton grouped-GEMM kernels and a Split LoRA approach that also cuts VRAM usage by over 35%. The documentation covers implementation details: how to integrate Unsloth into existing training pipelines, which MoE models are supported (Qwen3, DeepSeek R1/V3, GLM), and how the optimizations maintain full accuracy with zero approximation. 
Particularly valuable for teams fine-tuning large MoE models on consumer hardware.</p><div><hr></div><h2><strong>&#128736; 2 Tools &amp; Repos</strong></h2><h3><strong>Langextract</strong></h3><p><strong><a href="https://github.com/google/langextract">https://github.com/google/langextract</a></strong></p><p>Extracting structured information from unstructured text typically requires custom pipelines for every new domain. Langextract is a Gemini-powered Python library that handles this through LLM-based extraction with precise source grounding, mapping every extracted entity back to its exact location in the source text. Define your extraction schema through a few examples and the tool adapts without fine-tuning, handling long documents through optimized chunking and parallel processing. Particularly useful for domains like clinical notes, radiology reports, and research literature where traceability between extracted data and source material is critical.</p><h3><strong>Deepeval</strong></h3><p><strong><a href="https://github.com/confident-ai/deepeval">https://github.com/confident-ai/deepeval</a></strong></p><p>LLM evaluation needs systematic frameworks rather than ad-hoc testing. Deepeval provides metrics for measuring answer relevancy, factual consistency, and hallucination rates across different model outputs. The framework supports both rule-based and model-based evaluation, enables A/B testing across prompt variations, and tracks performance degradation over time. 
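Deepeval's real metrics sit behind test-case objects and often use model-based judges; as a rough stdlib-only sketch of what the simplest rule-based relevancy check does (score lexical overlap between question and answer against a threshold; the names and threshold here are illustrative, not Deepeval's API):

```python
# Minimal rule-based "answer relevancy" check, in the spirit of
# frameworks like Deepeval. Illustrative only: Deepeval's actual
# metrics are richer (LLM judges, statement extraction, tracing).

def relevancy_score(question: str, answer: str) -> float:
    """Fraction of (lowercased) question words that appear in the answer."""
    q = set(question.lower().split())
    a = set(answer.lower().split())
    return len(q & a) / len(q) if q else 0.0

def assert_relevant(question: str, answer: str, threshold: float = 0.5):
    """Raise if the answer falls below the relevancy threshold."""
    score = relevancy_score(question, answer)
    if score < threshold:
        raise AssertionError(f"relevancy {score:.2f} below {threshold}")
    return score

print(assert_relevant("what is prompt caching",
                      "prompt caching stores processed prefixes"))
# scores 0.5: "prompt" and "caching" match, "what"/"is" do not
```

Wrapping checks like this in assertions is what lets evaluation run inside a normal test suite instead of as ad-hoc spot checks.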
Particularly useful for teams building production LLM applications that need reliable quality metrics.</p><div><hr></div><h2><strong>&#127891; 1 Pick of the Week</strong></h2><h3><strong>Paper Banana</strong></h3><p><strong><a href="https://dwzhu-pku.github.io/PaperBanana/">https://dwzhu-pku.github.io/PaperBanana/</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MbrP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f1f33d-a59a-4e99-b27b-516d67750742_2048x1187.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MbrP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f1f33d-a59a-4e99-b27b-516d67750742_2048x1187.jpeg 424w, https://substackcdn.com/image/fetch/$s_!MbrP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f1f33d-a59a-4e99-b27b-516d67750742_2048x1187.jpeg 848w, https://substackcdn.com/image/fetch/$s_!MbrP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f1f33d-a59a-4e99-b27b-516d67750742_2048x1187.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!MbrP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f1f33d-a59a-4e99-b27b-516d67750742_2048x1187.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MbrP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f1f33d-a59a-4e99-b27b-516d67750742_2048x1187.jpeg" width="1456" height="844" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/76f1f33d-a59a-4e99-b27b-516d67750742_2048x1187.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:844,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MbrP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f1f33d-a59a-4e99-b27b-516d67750742_2048x1187.jpeg 424w, https://substackcdn.com/image/fetch/$s_!MbrP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f1f33d-a59a-4e99-b27b-516d67750742_2048x1187.jpeg 848w, https://substackcdn.com/image/fetch/$s_!MbrP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f1f33d-a59a-4e99-b27b-516d67750742_2048x1187.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!MbrP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f1f33d-a59a-4e99-b27b-516d67750742_2048x1187.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Google&#8217;s Paper Banana went viral for good reason: it fundamentally changes how researchers produce academic illustrations. Instead of spending hours manually crafting methodology diagrams and statistical plots, Paper Banana orchestrates five specialized AI agents (Retriever, Planner, Stylist, Visualizer, and Critic) to generate publication-ready visuals from paper text. The system handles everything from architecture diagrams to data visualizations, with the Critic agent running multiple refinement rounds to catch factual errors and visual glitches. In blind human evaluation, its outputs achieved a 72.7% win rate against baseline AI models. For researchers drowning in illustration work or trying to produce consistent, high-quality figures across papers, this approach delivers production value that manual tools struggle to match. 
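The orchestration pattern itself is simple to sketch. Assuming nothing about Paper Banana's internals beyond the agent roles named above, a generic staged pipeline with a critic-driven refinement loop looks like:

```python
# Generic sketch of a staged agent pipeline with critic refinement,
# in the shape Paper Banana describes (Retriever -> Planner -> Stylist ->
# Visualizer, then Critic rounds). The agents here are stand-in
# functions, not the real system.

def run_pipeline(paper_text, agents, critic, max_rounds=3):
    artifact = paper_text
    for agent in agents:            # sequential specialist stages
        artifact = agent(artifact)
    for _ in range(max_rounds):     # bounded critic refinement rounds
        issues = critic(artifact)
        if not issues:
            break
        artifact = artifact + f" [fixed: {issues[0]}]"
    return artifact

# Stand-in agents: each stage tags its contribution.
agents = [lambda x, tag=t: f"{x}|{tag}" for t in
          ("retrieved", "planned", "styled", "rendered")]
critic = lambda art: [] if "rendered" in art else ["missing render"]
print(run_pipeline("figure 1 spec", agents, critic))
# → figure 1 spec|retrieved|planned|styled|rendered
```

The bounded refinement loop is the part worth copying: it lets the critic catch errors without risking an infinite revise cycle.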
The viral response reflects how badly the research community needed a better way to handle the illustration burden.</p><div><hr></div><p><em>Thanks for reading The Tokenizer! If you found something useful here, share it with someone who might benefit. And if you want more curated insights like this, consider subscribing to Gradient Ascent.</em></p>]]></content:encoded></item><item><title><![CDATA[DeepSeek's Human-Like Vision, Chip Huyen's AI Tools Site, and Stanford's Updated LLM Course - 📚 The Tokenizer Edition #16]]></title><description><![CDATA[This week's most valuable AI resources]]></description><link>https://newsletter.artofsaience.com/p/deepseeks-human-like-vision-chip</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/deepseeks-human-like-vision-chip</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Tue, 03 Feb 2026 17:33:23 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/LvLdNkgO-N0" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! DeepSeek just redefined how vision models process documents by teaching them to read with human-like logic instead of rigid top-to-bottom scanning. Meanwhile, Ant Group released a world simulator achieving minute-long consistent video generation with sub-second interaction latency. Open-source continues delivering production-ready systems.</p><h3><strong>New here?</strong></h3><p><em>The Tokenizer is my resource-focused newsletter edition where I curate the best papers, videos, articles, tools, and learning resources from across the AI landscape. 
Consider it your weekly dose of everything you need to stay ahead in machine learning.</em></p><div><hr></div><p><strong>I&#8217;m teaching ML &amp; Generative AI System Design on Feb 28th / March 1st with Packt.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=SairamNewsletter&quot;,&quot;text&quot;:&quot;Register Today&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=SairamNewsletter"><span>Register Today</span></a></p><blockquote><p><strong>Gradient Ascent Special:</strong> Use code <strong>FLASH40</strong> for 40% off </p></blockquote><p>We&#8217;ll cover the core system design principles for building solid AI products: making systems reliable, measuring what matters, and designing architectures that work in production.</p><p>Through live discussions, guided exercises, and team-based design sprints, you&#8217;ll practice solving system-level AI problems and walk away with frameworks you can apply immediately at work.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=SairamNewsletter" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!34Pg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F233d51e2-d45d-426f-a26a-c89e0c69409b_1280x640.png 424w, 
https://substackcdn.com/image/fetch/$s_!34Pg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F233d51e2-d45d-426f-a26a-c89e0c69409b_1280x640.png 848w, https://substackcdn.com/image/fetch/$s_!34Pg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F233d51e2-d45d-426f-a26a-c89e0c69409b_1280x640.png 1272w, https://substackcdn.com/image/fetch/$s_!34Pg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F233d51e2-d45d-426f-a26a-c89e0c69409b_1280x640.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!34Pg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F233d51e2-d45d-426f-a26a-c89e0c69409b_1280x640.png" width="1280" height="640" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/233d51e2-d45d-426f-a26a-c89e0c69409b_1280x640.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:640,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=SairamNewsletter&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!34Pg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F233d51e2-d45d-426f-a26a-c89e0c69409b_1280x640.png 424w, 
https://substackcdn.com/image/fetch/$s_!34Pg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F233d51e2-d45d-426f-a26a-c89e0c69409b_1280x640.png 848w, https://substackcdn.com/image/fetch/$s_!34Pg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F233d51e2-d45d-426f-a26a-c89e0c69409b_1280x640.png 1272w, https://substackcdn.com/image/fetch/$s_!34Pg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F233d51e2-d45d-426f-a26a-c89e0c69409b_1280x640.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>What topics/problems would you most want covered in a system design workshop? Drop a comment or DM me.</p><div><hr></div><h2><strong>TL;DR</strong></h2><p><strong>What caught my attention this week:</strong></p><p>&#128196; <strong>Papers:</strong> Vision encoders with causal reasoning for document understanding, open-source world simulators rivaling closed systems, plus advances in mathematical reasoning and multimodal scientific models</p><p>&#127909; <strong>Videos:</strong> Pydantic fundamentals for ML engineers, building agent frameworks from scratch, production AI coding workflows, and understanding diffusion versus flow matching</p><p>&#128240; <strong>Reads:</strong> Reinforcement learning for continual LLM adaptation, preparing for ML interviews beyond just attention mechanisms, and DeepSeek&#8217;s latest architectural innovations</p><p>&#128736; <strong>Tools:</strong> Curated resources for agentic reasoning research and comprehensive AI tool directories</p><p>&#127891; <strong>Learning:</strong> Stanford&#8217;s updated large language model course covering recent advances</p><div><hr></div><h2><strong>&#128196; 5 Papers</strong></h2><h3><strong>DeepSeek-OCR 2: Visual Causal Flow</strong></h3><p><strong><a href="https://arxiv.org/abs/2601.20552">https://arxiv.org/abs/2601.20552</a></strong> |<a href="https://github.com/deepseek-ai/DeepSeek-OCR-2"> </a><strong><a href="https://github.com/deepseek-ai/DeepSeek-OCR-2">GitHub</a></strong></p><p>Instead of processing images in rigid raster-scan order, DeepSeek-OCR 2 introduces DeepEncoder V2 that mimics how humans actually read documents. The encoder uses causal attention to dynamically reorder visual tokens based on semantic content, determining whether to scan titles first, process tables column-by-column, or navigate multi-column layouts intelligently. 
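Stripped of the learned components, the reordering idea reduces to: rank visual regions by a semantic priority signal before decoding, instead of emitting them in raster order. DeepSeek-OCR 2 learns that ordering with causal attention; the fixed priorities in this sketch are purely illustrative:

```python
# The reading-order idea at its core: re-rank regions semantically
# rather than decoding top-to-bottom. The hand-written PRIORITY table
# stands in for what the model learns; it is not the paper's method.

RASTER = [  # (region_type, top_y, left_x) in naive top-left scan order
    ("column2-body", 0, 400),
    ("title", 0, 0),
    ("column1-body", 50, 0),
]

PRIORITY = {"title": 0, "column1-body": 1, "column2-body": 2}

def reading_order(regions):
    # Sort by semantic priority first, then by position within a class.
    return sorted(regions, key=lambda r: (PRIORITY[r[0]], r[1], r[2]))

print([r[0] for r in reading_order(RASTER)])
# → ['title', 'column1-body', 'column2-body']
```

A raster scan would have emitted the second column's body before the first; the priority-aware sort recovers the order a human reader follows.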
By replacing CLIP with Qwen2-0.5B and implementing learnable queries with causal flow, the 3B parameter model achieves 91.09% on OmniDocBench while maintaining 256-1120 token efficiency. Reading order edit distance dropped from 0.085 to 0.057, proving the system genuinely understands logical document structure rather than just memorizing patterns.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!e6vY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ada621c-305b-4262-8e69-514e5b518355_2048x1129.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!e6vY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ada621c-305b-4262-8e69-514e5b518355_2048x1129.png 424w, https://substackcdn.com/image/fetch/$s_!e6vY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ada621c-305b-4262-8e69-514e5b518355_2048x1129.png 848w, https://substackcdn.com/image/fetch/$s_!e6vY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ada621c-305b-4262-8e69-514e5b518355_2048x1129.png 1272w, https://substackcdn.com/image/fetch/$s_!e6vY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ada621c-305b-4262-8e69-514e5b518355_2048x1129.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!e6vY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ada621c-305b-4262-8e69-514e5b518355_2048x1129.png" width="1456" height="803" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3ada621c-305b-4262-8e69-514e5b518355_2048x1129.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:803,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!e6vY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ada621c-305b-4262-8e69-514e5b518355_2048x1129.png 424w, https://substackcdn.com/image/fetch/$s_!e6vY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ada621c-305b-4262-8e69-514e5b518355_2048x1129.png 848w, https://substackcdn.com/image/fetch/$s_!e6vY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ada621c-305b-4262-8e69-514e5b518355_2048x1129.png 1272w, https://substackcdn.com/image/fetch/$s_!e6vY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ada621c-305b-4262-8e69-514e5b518355_2048x1129.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Advancing Open-source World Models</strong></h3><p><strong><a href="https://arxiv.org/abs/2601.20540">https://arxiv.org/abs/2601.20540</a></strong> |<a href="https://github.com/Robbyant/lingbot-world/"> </a><strong><a href="https://github.com/Robbyant/lingbot-world/">GitHub</a></strong></p><p>Ant Group&#8217;s LingBot-World delivers minute-long consistent video generation at 16 FPS with under 1-second interaction latency, positioning open-source world models competitively against closed systems. The system maintains high-fidelity dynamics across photorealistic, scientific, and stylized environments through a multi-stage training pipeline combining web videos with Unreal Engine synthetic data. Users control camera perspectives and environmental conditions in real-time while the model preserves spatial consistency across 961 frames. 
The hybrid data engine with hierarchical captioning separates motion control from static scene generation, addressing the training data bottleneck that typically limits world model development.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;b2cfa773-576a-4911-94b0-86f9cc89091b&quot;,&quot;duration&quot;:null}"></div><h3><strong>Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation</strong></h3><p><strong><a href="https://arxiv.org/abs/2601.20614">https://arxiv.org/abs/2601.20614</a></strong> |<a href="https://github.com/AMAP-ML/MathForge"> </a><strong><a href="https://github.com/AMAP-ML/MathForge">GitHub</a></strong></p><p>Group Relative Policy Optimization suffers from an implicit imbalance where harder questions receive smaller policy updates, limiting capability development where it matters most. MathForge addresses this through Difficulty-Aware GRPO, which balances group advantage estimation by question difficulty and prioritizes harder problems through difficulty-aware weighting. The framework&#8217;s Multi-Aspect Question Reformulation strategy systematically increases question difficulty across multiple dimensions while maintaining gold answers, creating training data that pushes model boundaries. 
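The weighting idea is concrete enough to sketch. Vanilla GRPO normalizes each sampled answer's reward against its question's group mean and deviation; a difficulty-aware variant scales that advantage by how rarely the question is solved. The (1 - pass rate) weight below is a plausible stand-in for MathForge's scheme, not the paper's exact formula:

```python
# Sketch of group-relative advantages with a difficulty weight.
# Standard GRPO: A_i = (r_i - mean(r)) / std(r) within one question's
# group of sampled answers. The difficulty factor here is illustrative.
from statistics import mean, pstdev

def weighted_advantages(rewards):
    """rewards: 0/1 correctness for one question's sampled answers."""
    mu, sigma = mean(rewards), pstdev(rewards)
    if sigma == 0:                  # all answers agree: no learning signal
        return [0.0] * len(rewards)
    difficulty = 1 - mean(rewards)  # harder question -> bigger weight
    return [difficulty * (r - mu) / sigma for r in rewards]

easy = weighted_advantages([1, 1, 1, 0])   # 75% of samples solved it
hard = weighted_advantages([1, 0, 0, 0])   # 25% of samples solved it
# The lone correct answer on the hard question gets the larger update:
print(round(max(hard), 3), round(max(easy), 3))
# → 1.299 0.144
```

Without the weight, both groups would produce identically sized updates, which is exactly the implicit imbalance the paper targets.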
The synergistic loop of MQR expanding the data frontier and DGPO effectively learning from augmented data produces significant improvements across mathematical reasoning benchmarks.</p><h3><strong>Innovator-VL: A Multimodal Large Language Model for Scientific Discovery</strong></h3><p><strong><a href="https://arxiv.org/abs/2601.19325">https://arxiv.org/abs/2601.19325</a></strong> |<a href="https://github.com/InnovatorLM/Innovator-VL"> </a><strong><a href="https://github.com/InnovatorLM/Innovator-VL">GitHub</a></strong></p><p>Scientific multimodal models typically require massive domain-specific pretraining, but Innovator-VL demonstrates strong performance using fewer than five million curated samples without large-scale pretraining. The fully transparent training pipeline covers data collection, cleaning, preprocessing, supervised fine-tuning, and reinforcement learning with detailed optimization recipes, enabling systematic community extension. The model maintains competitive performance on general vision benchmarks while excelling at scientific tasks, indicating that scientific alignment integrates into unified models without compromising general-purpose capabilities. 
Principled data selection proves more effective than indiscriminate scaling for scientific reasoning.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!D8dc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97acde7-a10c-426f-b25e-ed7937dd8ca6_2048x1125.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!D8dc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97acde7-a10c-426f-b25e-ed7937dd8ca6_2048x1125.png 424w, https://substackcdn.com/image/fetch/$s_!D8dc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97acde7-a10c-426f-b25e-ed7937dd8ca6_2048x1125.png 848w, https://substackcdn.com/image/fetch/$s_!D8dc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97acde7-a10c-426f-b25e-ed7937dd8ca6_2048x1125.png 1272w, https://substackcdn.com/image/fetch/$s_!D8dc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97acde7-a10c-426f-b25e-ed7937dd8ca6_2048x1125.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!D8dc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97acde7-a10c-426f-b25e-ed7937dd8ca6_2048x1125.png" width="1456" height="800" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c97acde7-a10c-426f-b25e-ed7937dd8ca6_2048x1125.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!D8dc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97acde7-a10c-426f-b25e-ed7937dd8ca6_2048x1125.png 424w, https://substackcdn.com/image/fetch/$s_!D8dc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97acde7-a10c-426f-b25e-ed7937dd8ca6_2048x1125.png 848w, https://substackcdn.com/image/fetch/$s_!D8dc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97acde7-a10c-426f-b25e-ed7937dd8ca6_2048x1125.png 1272w, https://substackcdn.com/image/fetch/$s_!D8dc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc97acde7-a10c-426f-b25e-ed7937dd8ca6_2048x1125.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Youtu-VL: Unleashing Visual Potential via Unified Vision-Language Supervision</strong></h3><p><strong><a href="https://arxiv.org/abs/2601.19798">https://arxiv.org/abs/2601.19798</a></strong> |<a href="https://github.com/TencentCloudADP/youtu-vl"> </a><strong><a href="https://github.com/TencentCloudADP/youtu-vl">GitHub</a></strong></p><p>Vision-language models exhibit a text-dominant optimization bias by treating visual signals as passive inputs rather than supervisory targets. Youtu-VL shifts to the Vision-Language Unified Autoregressive Supervision paradigm, integrating visual tokens directly into the prediction stream and applying unified autoregressive supervision to both visual details and linguistic content. This &#8220;vision-as-target&#8221; approach fundamentally changes optimization from treating vision as conditional input to making it a prediction objective. 
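Seen as a loss mask, the shift is small but consequential: conventional VLM training averages next-token loss over text positions only, while unified supervision in the Youtu-VL spirit scores visual tokens too. A toy sketch (the per-token losses are made-up numbers standing in for cross-entropy values, not anything from the paper):

```python
# Vision-as-target, reduced to a loss-mask choice. "vis"/"txt" mark
# each position's modality; the loss values are illustrative only.

tokens = [("vis", 2.0), ("vis", 1.5), ("txt", 0.8), ("txt", 0.6)]

def masked_loss(tokens, supervise):
    """Average next-token loss over the supervised modalities."""
    kept = [loss for modality, loss in tokens if modality in supervise]
    return sum(kept) / len(kept)

text_only = masked_loss(tokens, {"txt"})           # vision as passive input
unified   = masked_loss(tokens, {"txt", "vis"})    # vision as target
print(round(text_only, 3), round(unified, 3))
# → 0.7 1.225
```

With the text-only mask, gradients never push the model to predict visual content; widening the mask is what turns vision from conditioning signal into an optimization objective.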
The framework extends to vision-centric tasks without requiring task-specific architectural additions, establishing foundations for comprehensive generalist visual agents.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!STXt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f2c6a9-7efd-464c-8dbe-80902e352cf4_2048x1096.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!STXt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f2c6a9-7efd-464c-8dbe-80902e352cf4_2048x1096.png 424w, https://substackcdn.com/image/fetch/$s_!STXt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f2c6a9-7efd-464c-8dbe-80902e352cf4_2048x1096.png 848w, https://substackcdn.com/image/fetch/$s_!STXt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f2c6a9-7efd-464c-8dbe-80902e352cf4_2048x1096.png 1272w, https://substackcdn.com/image/fetch/$s_!STXt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f2c6a9-7efd-464c-8dbe-80902e352cf4_2048x1096.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!STXt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f2c6a9-7efd-464c-8dbe-80902e352cf4_2048x1096.png" width="1456" height="779" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6f2c6a9-7efd-464c-8dbe-80902e352cf4_2048x1096.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:779,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!STXt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f2c6a9-7efd-464c-8dbe-80902e352cf4_2048x1096.png 424w, https://substackcdn.com/image/fetch/$s_!STXt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f2c6a9-7efd-464c-8dbe-80902e352cf4_2048x1096.png 848w, https://substackcdn.com/image/fetch/$s_!STXt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f2c6a9-7efd-464c-8dbe-80902e352cf4_2048x1096.png 1272w, https://substackcdn.com/image/fetch/$s_!STXt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f2c6a9-7efd-464c-8dbe-80902e352cf4_2048x1096.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div><hr></div><h2><strong>&#127909; 4 Videos</strong></h2><h3><strong>Pydantic Crash Course</strong></h3><div id="youtube2-PkQIREapb9o" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;PkQIREapb9o&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/PkQIREapb9o?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Dave Ebbelaar walks through Pydantic&#8217;s data validation and settings management capabilities essential for ML engineers working with LLMs. The tutorial covers defining models with type hints, validation logic, and configuration management patterns that ensure data integrity in production AI systems. 
Understanding Pydantic proves critical when building structured outputs from language models or managing complex application configurations where type safety prevents runtime errors.</p><h3><strong>Building Mini ClawdBot from Scratch</strong></h3><div id="youtube2-sfi_xebGsSw" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;sfi_xebGsSw&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/sfi_xebGsSw?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Vizuara demonstrates constructing an agent like Moltbot (ClawdBot, or whatever it's called now) without relying on existing libraries, revealing the actual mechanics behind agent architectures. By building from first principles, you understand tool integration, state management, and decision-making loops that frameworks abstract away. This approach proves valuable when debugging production agent systems or architecting custom solutions that don't fit standard framework patterns.</p><h3><strong>The Senior Engineer&#8217;s Guide to AI Coding</strong></h3><div id="youtube2-LvLdNkgO-N0" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;LvLdNkgO-N0&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/LvLdNkgO-N0?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>This How I AI episode examines Claude Code workflows and architectural decisions that separate effective AI-assisted development from mere prompt engineering. 
It addresses code review practices, testing strategies, and integration patterns when collaborating with AI coding assistants. </p><h3><strong>Flow Matching vs Diffusion Side By Side</strong></h3><div id="youtube2-firXjwZ_6KI" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;firXjwZ_6KI&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/firXjwZ_6KI?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Letitia compares flow matching and diffusion approaches for generative modeling, clarifying when each technique provides advantages. The visual comparison helps understand why flow matching sometimes offers training efficiency benefits over traditional diffusion while maintaining generation quality. 
Grasping these trade-offs matters when selecting architectures for specific generative modeling tasks.</p><div><hr></div><h2><strong>&#128240; 3 Curated Reads</strong></h2><h3><strong>Continual Learning with RL for LLMs</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:183759600,&quot;url&quot;:&quot;https://cameronrwolfe.substack.com/p/rl-continual-learning&quot;,&quot;publication_id&quot;:1092659,&quot;publication_name&quot;:&quot;Deep (Learning) Focus&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!87xa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9b43fb-52d5-40da-995d-5b7cd3f91064_896x896.png&quot;,&quot;title&quot;:&quot;Continual Learning with RL for LLMs&quot;,&quot;truncated_body_text&quot;:&quot;Continual learning, which refers to the ability of an AI model to learn from new tasks and data over time, has become a popular topic in the discussion of Artificial General Intelligence (AGI). Put simply, general intelligence should be adaptable, which has led some to believe that continual learning abilities are a prerequisite f&#8230;&quot;,&quot;date&quot;:&quot;2026-01-26T10:33:14.548Z&quot;,&quot;like_count&quot;:93,&quot;comment_count&quot;:3,&quot;bylines&quot;:[{&quot;id&quot;:29736521,&quot;name&quot;:&quot;Cameron R. Wolfe, Ph.D.&quot;,&quot;handle&quot;:&quot;cwolferesearch&quot;,&quot;previous_name&quot;:&quot;Cameron R. 
Wolfe&quot;,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/69aba7df-b571-4609-aa47-fc2d031c11b8_1242x1595.jpeg&quot;,&quot;bio&quot;:&quot;Research @ Netflix &#8226; Rice University PhD &#8226; I make AI understandable&quot;,&quot;profile_set_up_at&quot;:&quot;2022-09-17T15:11:34.083Z&quot;,&quot;reader_installed_at&quot;:&quot;2023-01-10T11:25:00.723Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:1042380,&quot;user_id&quot;:29736521,&quot;publication_id&quot;:1092659,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:1092659,&quot;name&quot;:&quot;Deep (Learning) Focus&quot;,&quot;subdomain&quot;:&quot;cameronrwolfe&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;I contextualize and explain important topics in AI research.&quot;,&quot;logo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/ab9b43fb-52d5-40da-995d-5b7cd3f91064_896x896.png&quot;,&quot;author_id&quot;:29736521,&quot;primary_user_id&quot;:29736521,&quot;theme_var_background_pop&quot;:&quot;#6C0095&quot;,&quot;created_at&quot;:&quot;2022-09-17T15:12:33.160Z&quot;,&quot;email_from_name&quot;:&quot;Deep (Learning) Focus&quot;,&quot;copyright&quot;:&quot;Cameron R. 
Wolfe&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;twitter_screen_name&quot;:&quot;cwolferesearch&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100,&quot;status&quot;:{&quot;bestsellerTier&quot;:100,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:100},&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://cameronrwolfe.substack.com/p/rl-continual-learning?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!87xa!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9b43fb-52d5-40da-995d-5b7cd3f91064_896x896.png" loading="lazy"><span class="embedded-post-publication-name">Deep (Learning) Focus</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Continual Learning with RL for LLMs</div></div><div class="embedded-post-body">Continual learning, which refers to the ability of an AI model to learn from new tasks and data over time, has become a popular topic in the discussion of Artificial General Intelligence (AGI). 
Put simply, general intelligence should be adaptable, which has led some to believe that continual learning abilities are a prerequisite f&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">3 months ago &#183; 93 likes &#183; 3 comments &#183; Cameron R. Wolfe, Ph.D.</div></a></div><p>Cameron Wolfe explores why Reinforcement Learning (RL) is naturally more robust than Supervised Fine-Tuning (SFT) for continual learning in LLMs. While traditional methods like replay buffers and regularization remain relevant, recent studies suggest that RL&#8217;s on-policy nature minimizes the distributional shifts that cause catastrophic forgetting.</p><h3><strong>Is Attention All You Need for ML Interviews?</strong></h3><p><strong><a href="https://medium.com/@maxwbuckley/is-attention-all-you-need-to-prepare-for-ml-interviews-830742f6d2ba">https://medium.com/@maxwbuckley/is-attention-all-you-need-to-prepare-for-ml-interviews-830742f6d2ba</a></strong></p><p>Max Buckley shares some quick tips to master the transformer architecture and its implementation ahead of ML interviews. 
Worth checking out to know the details of the bedrock of modern AI in case you&#8217;re preparing for interviews.</p><h3><strong>DeepSeek Drops Yet Another</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:184613969,&quot;url&quot;:&quot;https://wheremachinesthink.substack.com/p/deepseek-drops-yet-another-architectural&quot;,&quot;publication_id&quot;:5277805,&quot;publication_name&quot;:&quot;WHERE MACHINES THINK&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Yem8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6531b6c-86e3-4240-8372-b5a887412b64_608x608.png&quot;,&quot;title&quot;:&quot;DeepSeek Drops Yet Another Architectural Innovation, Opening A New Front for Scaling Up LLMs&quot;,&quot;truncated_body_text&quot;:&quot;Some large language models can memorize entire books and regurgitate them almost verbatim. In one study, researchers successfully prompted an LLM to spit out, for example, a nearly complete Harry Potter and the Sorcerer&#8217;s Stone. While this raises huge concerns about large-scale copyright violations, it's also true that we simultaneously want these model&#8230;&quot;,&quot;date&quot;:&quot;2026-01-16T16:18:28.406Z&quot;,&quot;like_count&quot;:58,&quot;comment_count&quot;:8,&quot;bylines&quot;:[{&quot;id&quot;:328415354,&quot;name&quot;:&quot;Anil Ananthaswamy&quot;,&quot;handle&quot;:&quot;anilananth&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cf1b6a95-42d9-4ec4-ac36-43daab10f105_3024x3024.jpeg&quot;,&quot;bio&quot;:&quot;Ex-Software Eng. / Author / Former Dep. News Editor, New Scientist. Bylines in NS, Nature, SciAm, Quanta &amp; more. Books: The Edge of Physics, The Man Who Wasn't There, Through Two Doors at Once and Why Machines Learn. 
Prof of Practice, IIT-Madras&quot;,&quot;profile_set_up_at&quot;:&quot;2025-06-09T01:26:52.231Z&quot;,&quot;reader_installed_at&quot;:&quot;2026-01-22T08:27:17.455Z&quot;,&quot;publicationUsers&quot;:[],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:null,&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://wheremachinesthink.substack.com/p/deepseek-drops-yet-another-architectural?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!Yem8!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6531b6c-86e3-4240-8372-b5a887412b64_608x608.png" loading="lazy"><span class="embedded-post-publication-name">WHERE MACHINES THINK</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">DeepSeek Drops Yet Another Architectural Innovation, Opening A New Front for Scaling Up LLMs</div></div><div class="embedded-post-body">Some large language models can memorize entire books and regurgitate them almost verbatim. In one study, researchers successfully prompted an LLM to spit out, for example, a nearly complete Harry Potter and the Sorcerer&#8217;s Stone. 
While this raises huge concerns about large-scale copyright violations, it's also true that we simultaneously want these model&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">3 months ago &#183; 58 likes &#183; 8 comments &#183; Anil Ananthaswamy</div></a></div><p>Anil Ananthaswamy analyzes DeepSeek&#8217;s pattern of releasing architectural innovations that challenge conventional wisdom about model scaling and design. The examination covers their mixture-of-experts implementations, efficiency improvements, and open-source strategy that accelerates community progress. Tracking DeepSeek&#8217;s releases provides insights into architectural directions gaining traction in production systems.</p><div><hr></div><h2><strong>&#128736; 2 Tools &amp; Repos</strong></h2><h3><strong>Awesome Agentic Reasoning</strong></h3><p><strong><a href="https://github.com/weitianxin/Awesome-Agentic-Reasoning">https://github.com/weitianxin/Awesome-Agentic-Reasoning</a></strong></p><p>This curated collection organizes research papers, codebases, and benchmarks focused on agentic reasoning capabilities in AI systems. Instead of accumulating every tangentially related work, the repository maintains editorial standards around core reasoning techniques like planning, reflection, and tool use. The structured organization helps researchers quickly locate relevant work when investigating specific reasoning approaches or comparing methodologies across different agent architectures.</p><h3><strong>GoodAI List</strong></h3><p><strong><a href="https://goodailist.com/repos">https://goodailist.com/repos</a></strong></p><p>Chip Huyen curates AI tools and repositories with clear judgment about practical utility versus hype. The directory covers libraries, frameworks, and resources across the ML stack with emphasis on production readiness and actual adoption. 
Unlike comprehensive but overwhelming awesome lists, this maintains focus on tools demonstrating real-world value in AI development workflows.</p><div><hr></div><h2><strong>&#127891; 1 Pick of the Week</strong></h2><h3><strong>Stanford&#8217;s LLM Course</strong></h3><p><strong><a href="https://youtube.com/playlist?list=PLoROMvodv4rObv1FMizXqumgVVdzX4_05&amp;si=bqJCbjmvpCx-1_51">https://youtube.com/playlist?list=PLoROMvodv4rObv1FMizXqumgVVdzX4_05&amp;si=bqJCbjmvpCx-1_51</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kLJn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b147384-6d41-41ae-a6b5-0c6bf46a1395_1315x860.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kLJn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b147384-6d41-41ae-a6b5-0c6bf46a1395_1315x860.png 424w, https://substackcdn.com/image/fetch/$s_!kLJn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b147384-6d41-41ae-a6b5-0c6bf46a1395_1315x860.png 848w, https://substackcdn.com/image/fetch/$s_!kLJn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b147384-6d41-41ae-a6b5-0c6bf46a1395_1315x860.png 1272w, https://substackcdn.com/image/fetch/$s_!kLJn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b147384-6d41-41ae-a6b5-0c6bf46a1395_1315x860.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!kLJn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b147384-6d41-41ae-a6b5-0c6bf46a1395_1315x860.png" width="1315" height="860" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4b147384-6d41-41ae-a6b5-0c6bf46a1395_1315x860.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:860,&quot;width&quot;:1315,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kLJn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b147384-6d41-41ae-a6b5-0c6bf46a1395_1315x860.png 424w, https://substackcdn.com/image/fetch/$s_!kLJn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b147384-6d41-41ae-a6b5-0c6bf46a1395_1315x860.png 848w, https://substackcdn.com/image/fetch/$s_!kLJn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b147384-6d41-41ae-a6b5-0c6bf46a1395_1315x860.png 1272w, https://substackcdn.com/image/fetch/$s_!kLJn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b147384-6d41-41ae-a6b5-0c6bf46a1395_1315x860.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Stanford&#8217;s curated playlist brings together lectures from multiple courses covering large language models, including CS224N (NLP with Deep Learning), CS25 (Transformers United), and CME 295 (Transformers &amp; Large Language Models). The collection spans foundational concepts in transformer architectures and training methodologies alongside cutting-edge developments in reasoning models, multimodal systems, and efficient inference techniques.</p><div><hr></div><p><em>Thanks for reading The Tokenizer! If you found something useful here, share it with someone who might benefit. 
And if you want more curated insights like this, consider subscribing to Gradient Ascent.</em></p>]]></content:encoded></item><item><title><![CDATA[Tencent Compresses Reasoning 3-4x, Robots Master 35,000 Hours of Human Data, and Anthropic's Evaluation Framework - 📚 The Tokenizer Edition #15]]></title><description><![CDATA[This week's most valuable AI resources]]></description><link>https://newsletter.artofsaience.com/p/tencent-compresses-reasoning-3-4x</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/tencent-compresses-reasoning-3-4x</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Mon, 26 Jan 2026 16:25:08 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/5YBjll9XJlw" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! Robots just learned to treat 35,000 hours of human movements as a universal training language, mechanistic interpretability made the jump from observation to intervention, and researchers compressed reasoning into images at 3-4x efficiency. Real progress on problems that matter.</p><h3><strong>New here?</strong></h3><p><em>The Tokenizer is my resource-focused newsletter edition where I curate the best papers, videos, articles, tools, and learning resources from across the AI landscape. Consider it your weekly dose of everything you need to stay ahead in machine learning.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Gradient Ascent! 
Subscribe for free to receive new posts and support my work.</p></div></div></div><h2><strong>TL;DR</strong></h2><p><strong>What caught my attention this week:</strong></p><ul><li><p>&#128196; <strong>Papers:</strong> Actionable mechanistic interpretability, agent efficiency frameworks, 3-4x reasoning compression, cross-embodiment robotics, unified video generation</p></li><li><p>&#127909; <strong>Videos:</strong> Agent architecture patterns, live coding marathon, production evals, MoE routing mechanics</p></li><li><p>&#128240; <strong>Reads:</strong> Anthropic&#8217;s evaluation resilience, AI emotional frameworks, semantic search without embeddings</p></li><li><p>&#128736; <strong>Tools:</strong> Production RAG and distributed training infrastructure</p></li><li><p>&#127891; <strong>Learning:</strong> MIT&#8217;s command-line fundamentals</p></li></ul><div><hr></div><p>I&#8217;m running an ML &amp; Generative AI System Design workshop with Packt.</p><p>We&#8217;ll cover the core design principles for building solid AI products: making systems reliable, measuring what matters, and designing architectures that work in production.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=Sairam" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!hWCg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1d1571c-72f0-409d-9326-eddd44e133b5_1280x640.png 424w, https://substackcdn.com/image/fetch/$s_!hWCg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1d1571c-72f0-409d-9326-eddd44e133b5_1280x640.png 848w, https://substackcdn.com/image/fetch/$s_!hWCg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1d1571c-72f0-409d-9326-eddd44e133b5_1280x640.png 1272w, https://substackcdn.com/image/fetch/$s_!hWCg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1d1571c-72f0-409d-9326-eddd44e133b5_1280x640.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hWCg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1d1571c-72f0-409d-9326-eddd44e133b5_1280x640.png" width="1280" height="640" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e1d1571c-72f0-409d-9326-eddd44e133b5_1280x640.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:640,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=Sairam&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!hWCg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1d1571c-72f0-409d-9326-eddd44e133b5_1280x640.png 424w, https://substackcdn.com/image/fetch/$s_!hWCg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1d1571c-72f0-409d-9326-eddd44e133b5_1280x640.png 848w, https://substackcdn.com/image/fetch/$s_!hWCg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1d1571c-72f0-409d-9326-eddd44e133b5_1280x640.png 1272w, https://substackcdn.com/image/fetch/$s_!hWCg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1d1571c-72f0-409d-9326-eddd44e133b5_1280x640.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Use code <strong>FLASH40</strong> for 40% off: <a href="https://lnkd.in/gqTrvsuS">https://lnkd.in/gqTrvsuS</a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=Sairam&quot;,&quot;text&quot;:&quot;Join the Workshop!&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=Sairam"><span>Join the Workshop!</span></a></p><p>What topics/problems would you most want covered in a system design workshop? 
Drop a comment or DM me.</p><div><hr></div><h2><strong>&#128196; 5 Papers</strong></h2><h3><strong>Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models</strong></h3><p><strong><a href="https://arxiv.org/abs/2601.14004">https://arxiv.org/abs/2601.14004</a> |<a href="https://github.com/rattlesnakey/Awesome-Actionable-MI-Survey"> GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!w2v8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675e60f8-c65e-47c8-b185-aa0e4c5b8567_886x454.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!w2v8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675e60f8-c65e-47c8-b185-aa0e4c5b8567_886x454.png 424w, https://substackcdn.com/image/fetch/$s_!w2v8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675e60f8-c65e-47c8-b185-aa0e4c5b8567_886x454.png 848w, https://substackcdn.com/image/fetch/$s_!w2v8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675e60f8-c65e-47c8-b185-aa0e4c5b8567_886x454.png 1272w, https://substackcdn.com/image/fetch/$s_!w2v8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675e60f8-c65e-47c8-b185-aa0e4c5b8567_886x454.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!w2v8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675e60f8-c65e-47c8-b185-aa0e4c5b8567_886x454.png" 
width="886" height="454" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/675e60f8-c65e-47c8-b185-aa0e4c5b8567_886x454.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:454,&quot;width&quot;:886,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!w2v8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675e60f8-c65e-47c8-b185-aa0e4c5b8567_886x454.png 424w, https://substackcdn.com/image/fetch/$s_!w2v8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675e60f8-c65e-47c8-b185-aa0e4c5b8567_886x454.png 848w, https://substackcdn.com/image/fetch/$s_!w2v8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675e60f8-c65e-47c8-b185-aa0e4c5b8567_886x454.png 1272w, https://substackcdn.com/image/fetch/$s_!w2v8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F675e60f8-c65e-47c8-b185-aa0e4c5b8567_886x454.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Mechanistic interpretability research has spent years documenting how models work internally. This survey introduces the first systematic framework for actually using those insights to improve models. The &#8220;Locate, Steer, and Improve&#8221; pipeline transforms interpretability from passive analysis into active intervention, showing how to diagnose issues in specific model components, steer behavior through targeted modifications, and measurably improve alignment, capability, and efficiency. 
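</p><p>To make the pipeline concrete, here is a minimal sketch of the locate-then-steer recipe with made-up data: a behavior direction is located as the difference of mean activations between two sets of runs, then added to a hidden state. This is an illustration of the general idea, not the survey&#8217;s protocol.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 8  # toy hidden size

# Stand-ins for one layer's activations on runs that do / don't show a behavior.
pos_acts = rng.normal(loc=1.0, size=(16, d_model))
neg_acts = rng.normal(loc=-1.0, size=(16, d_model))

# "Locate": a unit-norm steering direction from the difference of means.
steer = pos_acts.mean(axis=0) - neg_acts.mean(axis=0)
steer /= np.linalg.norm(steer)

def apply_steering(h, alpha=2.0):
    """'Steer': nudge a hidden state along the located direction."""
    return h + alpha * steer

h = rng.normal(size=d_model)
h_steered = apply_steering(h)
# The steered state moves exactly alpha further along the direction.
print(round(float((h_steered - h) @ steer), 6))  # 2.0
```

<p>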
The framework provides actionable protocols for practitioners who want to do more than just observe their models.</p><h3><strong>Toward Efficient Agents: Memory, Tool learning, and Planning</strong></h3><p><strong><a href="https://arxiv.org/abs/2601.14192">https://arxiv.org/abs/2601.14192</a> |<a href="https://github.com/yxf203/Awesome-Efficient-Agents"> GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iGaT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4748de4c-e5f1-4c6f-8378-c94c81405f18_2048x1152.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iGaT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4748de4c-e5f1-4c6f-8378-c94c81405f18_2048x1152.png 424w, https://substackcdn.com/image/fetch/$s_!iGaT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4748de4c-e5f1-4c6f-8378-c94c81405f18_2048x1152.png 848w, https://substackcdn.com/image/fetch/$s_!iGaT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4748de4c-e5f1-4c6f-8378-c94c81405f18_2048x1152.png 1272w, https://substackcdn.com/image/fetch/$s_!iGaT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4748de4c-e5f1-4c6f-8378-c94c81405f18_2048x1152.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iGaT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4748de4c-e5f1-4c6f-8378-c94c81405f18_2048x1152.png" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4748de4c-e5f1-4c6f-8378-c94c81405f18_2048x1152.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iGaT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4748de4c-e5f1-4c6f-8378-c94c81405f18_2048x1152.png 424w, https://substackcdn.com/image/fetch/$s_!iGaT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4748de4c-e5f1-4c6f-8378-c94c81405f18_2048x1152.png 848w, https://substackcdn.com/image/fetch/$s_!iGaT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4748de4c-e5f1-4c6f-8378-c94c81405f18_2048x1152.png 1272w, https://substackcdn.com/image/fetch/$s_!iGaT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4748de4c-e5f1-4c6f-8378-c94c81405f18_2048x1152.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Production AI agents burn through tokens, time, and compute budget faster than most teams anticipate. This comprehensive survey tackles efficiency across three critical components: memory management through context compression, tool learning via reinforcement learning rewards that minimize unnecessary invocations, and planning with controlled search mechanisms. The research addresses the gap between agents that work impressively in demos and systems that remain economically viable at scale. 
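</p><p>On the memory axis, the simplest form of context compression is a rolling digest: keep recent turns verbatim and collapse older ones. A minimal sketch, with a placeholder character budget and naive truncation standing in for the summarizer (none of this is the survey&#8217;s specific method):</p>

```python
def compress_context(turns, budget=200, keep_recent=2):
    """Keep recent turns verbatim; collapse older ones into a one-line
    digest; then evict from the front until a crude character budget
    (a stand-in for a token budget) is met."""
    older, recent = turns[:-keep_recent], turns[-keep_recent:]
    digest = " | ".join(t[:30] for t in older)  # cheap summarizer stand-in
    context = (["[summary] " + digest] if digest else []) + recent
    while sum(len(t) for t in context) > budget and len(context) > 1:
        context.pop(0)  # drop the digest first, then the oldest turns
    return context

turns = [f"turn {i}: " + "x" * 40 for i in range(6)]
compact = compress_context(turns)
print(len(compact), sum(len(t) for t in compact))
```

<p>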
If deployment costs are keeping your agents in the lab, this provides frameworks for optimizing them.</p><h3><strong>Render-of-Thought: Rendering Textual Chain-of-Thought as Images for Visual Latent Reasoning</strong></h3><p><strong><a href="https://arxiv.org/abs/2601.14750">https://arxiv.org/abs/2601.14750</a></strong> |<a href="https://github.com/TencentBAC/RoT"> </a><strong><a href="https://github.com/TencentBAC/RoT">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0eXP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa588a5e1-1735-4b45-9dda-f7232069cf58_534x420.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0eXP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa588a5e1-1735-4b45-9dda-f7232069cf58_534x420.png 424w, https://substackcdn.com/image/fetch/$s_!0eXP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa588a5e1-1735-4b45-9dda-f7232069cf58_534x420.png 848w, https://substackcdn.com/image/fetch/$s_!0eXP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa588a5e1-1735-4b45-9dda-f7232069cf58_534x420.png 1272w, https://substackcdn.com/image/fetch/$s_!0eXP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa588a5e1-1735-4b45-9dda-f7232069cf58_534x420.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!0eXP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa588a5e1-1735-4b45-9dda-f7232069cf58_534x420.png" width="534" height="420" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a588a5e1-1735-4b45-9dda-f7232069cf58_534x420.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:420,&quot;width&quot;:534,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0eXP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa588a5e1-1735-4b45-9dda-f7232069cf58_534x420.png 424w, https://substackcdn.com/image/fetch/$s_!0eXP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa588a5e1-1735-4b45-9dda-f7232069cf58_534x420.png 848w, https://substackcdn.com/image/fetch/$s_!0eXP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa588a5e1-1735-4b45-9dda-f7232069cf58_534x420.png 1272w, https://substackcdn.com/image/fetch/$s_!0eXP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa588a5e1-1735-4b45-9dda-f7232069cf58_534x420.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Chain-of-thought prompting unlocked reasoning capabilities but created a verbosity problem. Render-of-Thought converts those verbose textual reasoning steps into compact visual representations, achieving 3-4x token compression while maintaining competitive performance on mathematical and logical reasoning benchmarks. The framework leverages vision encoders from existing VLMs as semantic anchors, making it plug-and-play without additional pretraining. 
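</p><p>The compression claim is easiest to see as token accounting. In this sketch, the render-then-encode step is idealized as mean-pooling groups of token embeddings into single &#8220;visual latents&#8221;; the group size, embedder, and dimensions are invented for illustration and are not the paper&#8217;s renderer:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64  # toy embedding width

def embed_tokens(tokens):
    """Stand-in text embedder: one random d-dim vector per token."""
    return rng.normal(size=(len(tokens), d))

def visual_latents(token_embs, group=4):
    """Idealized render-then-encode step: each 'patch' of rendered text
    summarizes `group` consecutive tokens as one visual latent."""
    pad = (-len(token_embs)) % group
    if pad:
        token_embs = np.vstack([token_embs, np.zeros((pad, d))])
    return token_embs.reshape(-1, group, d).mean(axis=1)

cot = "step 1 add the units step 2 carry the one step 3 write the result".split()
latents = visual_latents(embed_tokens(cot))
print(f"{len(cot)} text tokens -> {len(latents)} visual latents")
```

<p>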
The approach addresses a practical bottleneck: getting models to reason without drowning in tokens.</p><h3><strong>Being-H0.5: Scaling Human-Centric Robot Learning for Cross-Embodiment Generalization</strong></h3><p><strong><a href="https://arxiv.org/abs/2601.12993">https://arxiv.org/abs/2601.12993</a></strong> |<a href="https://github.com/BeingBeyond/Being-H"> </a><strong><a href="https://github.com/BeingBeyond/Being-H">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gtdZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37eef211-b593-4af8-a5a5-bb2aa3ee3b00_2048x1137.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gtdZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37eef211-b593-4af8-a5a5-bb2aa3ee3b00_2048x1137.png 424w, https://substackcdn.com/image/fetch/$s_!gtdZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37eef211-b593-4af8-a5a5-bb2aa3ee3b00_2048x1137.png 848w, https://substackcdn.com/image/fetch/$s_!gtdZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37eef211-b593-4af8-a5a5-bb2aa3ee3b00_2048x1137.png 1272w, https://substackcdn.com/image/fetch/$s_!gtdZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37eef211-b593-4af8-a5a5-bb2aa3ee3b00_2048x1137.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!gtdZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37eef211-b593-4af8-a5a5-bb2aa3ee3b00_2048x1137.png" width="1456" height="808" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/37eef211-b593-4af8-a5a5-bb2aa3ee3b00_2048x1137.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:808,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gtdZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37eef211-b593-4af8-a5a5-bb2aa3ee3b00_2048x1137.png 424w, https://substackcdn.com/image/fetch/$s_!gtdZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37eef211-b593-4af8-a5a5-bb2aa3ee3b00_2048x1137.png 848w, https://substackcdn.com/image/fetch/$s_!gtdZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37eef211-b593-4af8-a5a5-bb2aa3ee3b00_2048x1137.png 1272w, https://substackcdn.com/image/fetch/$s_!gtdZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37eef211-b593-4af8-a5a5-bb2aa3ee3b00_2048x1137.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Most robot learning systems work brilliantly on the hardware they trained on and catastrophically fail when transferred to different morphologies. Being-H0.5 treats human interaction patterns as a universal &#8220;mother tongue&#8221; for physical manipulation, training on over 35,000 hours of multimodal data across 30 robot embodiments. The key innovation is a Unified Action Space that maps heterogeneous robot controls into semantically aligned slots, letting low-resource robots bootstrap skills from human demonstrations and high-resource platforms. 
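</p><p>The Unified Action Space idea can be sketched in a few lines: project each robot&#8217;s native controls onto a fixed set of shared slots so different embodiments emit comparable action vectors. The slot names below are invented for illustration; in the paper the alignment is learned, not hand-written:</p>

```python
# Hypothetical shared slot layout for a unified action vector.
SLOTS = ["reach_x", "reach_y", "reach_z", "grip"]

def to_unified(native_action: dict) -> list:
    """Map a robot-specific action dict onto the shared slots;
    embodiments lacking a DOF leave that slot at 0.0."""
    return [native_action.get(slot, 0.0) for slot in SLOTS]

# Two embodiments with different native controls land in one space.
arm_7dof = {"reach_x": 0.2, "reach_y": -0.1, "reach_z": 0.4, "grip": 1.0}
planar_2dof = {"reach_x": 0.3, "reach_y": 0.1}  # no z-axis or gripper

print(to_unified(arm_7dof))     # [0.2, -0.1, 0.4, 1.0]
print(to_unified(planar_2dof))  # [0.3, 0.1, 0.0, 0.0]
```

<p>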
The system achieves 98.9% on LIBERO and 53.9% on RoboCasa while demonstrating genuine cross-embodiment transfer on five different robotic platforms.</p><h3><strong>OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer</strong></h3><p><strong><a href="https://arxiv.org/abs/2601.14250">https://arxiv.org/abs/2601.14250</a></strong> |<a href="https://github.com/PangzeCheung/OmniTransfer"> </a><strong><a href="https://github.com/PangzeCheung/OmniTransfer">GitHub</a></strong></p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;45ae7085-46fe-4e75-85e7-32b332d50ad4&quot;,&quot;duration&quot;:null}"></div><p>Video generation models typically handle appearance transfer or temporal effects separately, requiring different models for each task. OmniTransfer unifies spatial appearance transfer (ID and style) with temporal video transfer (effects, motion, camera movement) in a single framework. The system uses Task-aware Positional Bias to adaptively leverage reference video information, Reference-decoupled Causal Learning for efficient transfer, and Task-adaptive Multimodal Alignment to handle different tasks dynamically. 
The framework matches pose-guided methods in motion transfer without requiring explicit pose extraction, establishing a more flexible paradigm for video generation.</p><div><hr></div><h2><strong>&#127909; 4 Videos</strong></h2><h3><strong>Agent Skills, Rules, Subagents: Explained!</strong></h3><div id="youtube2-L_p5GxGSB_I" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;L_p5GxGSB_I&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/L_p5GxGSB_I?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Lee Robinson simplifies the complex terminology around managing context with coding agents, covering the evolution and core concepts behind rules, commands, MCP servers, subagents, modes, hooks, and skills. The video provides clarity on when to use each approach, cutting through the confusion that comes with rapidly evolving agent architectures. Helpful if you&#8217;re building with AI coding assistants and need clear decision frameworks for context management.</p><h3><strong>Vibe Code Camp: Live Marathon With the World&#8217;s Best AI Builders</strong></h3><div id="youtube2-5YBjll9XJlw" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;5YBjll9XJlw&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/5YBjll9XJlw?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>A live coding marathon featuring developers building real projects with AI tools. 
The session provides unfiltered looks at how experienced builders approach problems, handle tool limitations, and combine different AI capabilities in production workflows. Rather than polished tutorials, you see actual development with all the debugging, iteration, and problem-solving that comes with building AI-powered applications.</p><h3><strong>AI Evals for Everyone</strong></h3><p><strong><a href="https://www.youtube.com/playlist?list=PLZoalK-hTD4VPIkRXNdSEwcTCt2QUgEPR">https://www.youtube.com/playlist?list=PLZoalK-hTD4VPIkRXNdSEwcTCt2QUgEPR</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_na2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9dc0198-3fcc-42c4-920b-6a1e9fd10606_1351x776.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_na2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9dc0198-3fcc-42c4-920b-6a1e9fd10606_1351x776.png 424w, https://substackcdn.com/image/fetch/$s_!_na2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9dc0198-3fcc-42c4-920b-6a1e9fd10606_1351x776.png 848w, https://substackcdn.com/image/fetch/$s_!_na2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9dc0198-3fcc-42c4-920b-6a1e9fd10606_1351x776.png 1272w, https://substackcdn.com/image/fetch/$s_!_na2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9dc0198-3fcc-42c4-920b-6a1e9fd10606_1351x776.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!_na2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9dc0198-3fcc-42c4-920b-6a1e9fd10606_1351x776.png" width="1351" height="776" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b9dc0198-3fcc-42c4-920b-6a1e9fd10606_1351x776.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:776,&quot;width&quot;:1351,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_na2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9dc0198-3fcc-42c4-920b-6a1e9fd10606_1351x776.png 424w, https://substackcdn.com/image/fetch/$s_!_na2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9dc0198-3fcc-42c4-920b-6a1e9fd10606_1351x776.png 848w, https://substackcdn.com/image/fetch/$s_!_na2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9dc0198-3fcc-42c4-920b-6a1e9fd10606_1351x776.png 1272w, https://substackcdn.com/image/fetch/$s_!_na2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9dc0198-3fcc-42c4-920b-6a1e9fd10606_1351x776.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>This playlist covers evaluation frameworks that work beyond toy benchmarks. The content addresses designing evaluations that capture real failure modes, continuous evaluation strategies for production systems, and practical approaches to measuring what actually matters. 
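</p><p>The core habit these videos teach, scoring each failure mode with its own grader rather than one aggregate number, fits in a few lines. The model and graders below are toy placeholders:</p>

```python
def run_eval(model, cases):
    """Tiny eval-harness sketch: each case pairs a prompt with its own
    grader, so results track distinct failure modes instead of one
    aggregate score."""
    return {name: grade(model(prompt)) for name, prompt, grade in cases}

# Hypothetical model and graders, for illustration only.
def toy_model(prompt):
    return "4" if prompt == "2+2?" else "unsure"

cases = [
    ("arithmetic", "2+2?", lambda out: out == "4"),
    ("refusal", "capital of Atlantis?", lambda out: "unsure" in out),
]

print(run_eval(toy_model, cases))  # {'arithmetic': True, 'refusal': True}
```

<p>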
Essential viewing if your models perform well on standard benchmarks but struggle in deployment, or if you need systematic ways to track improvements across releases.</p><h3><strong>MoE Token Routing Explained: How Mixture of Experts Works (with Code)</strong></h3><div id="youtube2-CDnkFbW-uEQ" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;CDnkFbW-uEQ&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/CDnkFbW-uEQ?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Mixture-of-experts models power many frontier systems, but their routing mechanisms remain poorly understood. This video walks through the actual code that determines which tokens get routed to which experts, explaining the load balancing challenges and why naive routing strategies fail. With models like DeepSeek-R1 using MoE architectures, understanding these fundamentals helps you work with and debug these systems effectively.</p><div><hr></div><h2><strong>&#128240; 3 Curated Reads</strong></h2><h3><strong>AI Resistant Technical Evaluations</strong></h3><p><strong><a href="https://www.anthropic.com/engineering/AI-resistant-technical-evaluations">https://www.anthropic.com/engineering/AI-resistant-technical-evaluations</a></strong></p><p>Anthropic&#8217;s performance engineering team had to redesign their hiring take-home test three times because Claude kept beating it. When Opus 4.5 matched the best human performance within the 2-hour time limit, they faced a problem: candidates&#8217; optimal strategy became delegating entirely to Claude Code. 
The article documents each iteration, from realistic optimization problems to increasingly unusual puzzle-like constraints, revealing what happens when your own AI outperforms most technical candidates on the tests designed to evaluate them.</p><h3><strong>A feelings wheel but for robots</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:185105809,&quot;url&quot;:&quot;https://hils.substack.com/p/a-feelings-wheel-but-for-robots&quot;,&quot;publication_id&quot;:539895,&quot;publication_name&quot;:&quot;writerbuilder&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!tneR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3328fe4-ee27-4a35-a357-27fbae38812a_1024x1024.png&quot;,&quot;title&quot;:&quot;a feelings wheel, but for robots&quot;,&quot;truncated_body_text&quot;:&quot;If you want to improve your artistic skills, focusing on the act of drawing will only get you so far. Arguably more important is training your eye: learning to look at a composition and see the shapes, lines, colors, shadows, and light. 
Artists spend as much time, if not more, on this as they do practicing the craft of capturing what they see.&quot;,&quot;date&quot;:&quot;2026-01-20T12:20:24.598Z&quot;,&quot;like_count&quot;:21,&quot;comment_count&quot;:4,&quot;bylines&quot;:[{&quot;id&quot;:2738338,&quot;name&quot;:&quot;Hilary Gridley&quot;,&quot;handle&quot;:&quot;hils&quot;,&quot;previous_name&quot;:&quot;Hilary&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F8cc75cfa-2b1e-44e7-bd67-f122f97c0557_1793x1793.jpeg&quot;,&quot;bio&quot;:&quot;writer | builder | writerbuilder&quot;,&quot;profile_set_up_at&quot;:&quot;2021-10-24T22:23:17.222Z&quot;,&quot;reader_installed_at&quot;:&quot;2024-11-25T02:00:53.666Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:469437,&quot;user_id&quot;:2738338,&quot;publication_id&quot;:539895,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:539895,&quot;name&quot;:&quot;writerbuilder&quot;,&quot;subdomain&quot;:&quot;hils&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;for writers who build and builders who write&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c3328fe4-ee27-4a35-a357-27fbae38812a_1024x1024.png&quot;,&quot;author_id&quot;:2738338,&quot;primary_user_id&quot;:2738338,&quot;theme_var_background_pop&quot;:&quot;#6C0095&quot;,&quot;created_at&quot;:&quot;2021-10-24T21:53:17.513Z&quot;,&quot;email_from_name&quot;:&quot;Hilary Gridley&quot;,&quot;copyright&quot;:&quot;Hilary 
Gridley&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:null,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:1,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;subscriber&quot;,&quot;tier&quot;:1,&quot;accent_colors&quot;:null},&quot;paidPublicationIds&quot;:[10845,1309801,2450],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://hils.substack.com/p/a-feelings-wheel-but-for-robots?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!tneR!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3328fe4-ee27-4a35-a357-27fbae38812a_1024x1024.png" loading="lazy"><span class="embedded-post-publication-name">writerbuilder</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">a feelings wheel, but for robots</div></div><div class="embedded-post-body">If you want to improve your artistic skills, focusing on the act of drawing will only get you so far. Arguably more important is training your eye: learning to look at a composition and see the shapes, lines, colors, shadows, and light. 
Artists spend as much time, if not more, on this as they do practicing the craft of capturing what they see&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">3 months ago &#183; 21 likes &#183; 4 comments &#183; Hilary Gridley</div></a></div><p>Hilary Gridley built the &#8220;AI Steering Wheel&#8221; to solve a vocabulary problem: LLMs are extremely sensitive to word choice, but most people lack precise language for what they want. Inspired by psychology&#8217;s Feelings Wheel, her tool organizes feedback across six dimensions (Originality, Grounding, Risk, Scope, Style, Certainty) with increasingly specific terms. &#8220;Make it more specific&#8221; and &#8220;make it more detailed&#8221; sound similar but produce completely different results from LLMs. The same precision helps when giving feedback to people, not just models.</p><h3><strong>Large scale semantic search without embeddings</strong></h3><p><strong><a href="https://fergusfinn.com/blog/arxiv-llm-search/">https://fergusfinn.com/blog/arxiv-llm-search/</a></strong></p><p>Fergus Finn demonstrates building semantic search over academic papers without traditional embedding pipelines. The approach bypasses the usual vector database infrastructure, using LLMs directly for relevance matching at query time. The method trades some retrieval speed for elimination of embedding maintenance overhead and improved interpretability of search results. 
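</p><p>The shape of the approach is easy to sketch: instead of precomputing embeddings, score every candidate against the query at search time. In Finn's setup the scorer is an LLM call; below it is stubbed with keyword overlap so the sketch runs standalone, and the function names are illustrative, not from the post.</p>

```python
# Query-time relevance without an embedding index: score candidates
# against the query at search time instead of precomputing vectors.

def rank_by_relevance(query, docs, score_fn, top_k=3):
    """Score all docs at query time and return the top_k most relevant."""
    return sorted(docs, key=lambda d: score_fn(query, d), reverse=True)[:top_k]

def keyword_overlap_score(query, doc):
    """Stand-in for the LLM judgment: fraction of query words in the doc."""
    words = query.lower().split()
    return sum(w in doc.lower() for w in words) / len(words)
```

<p>Swapping the stub for a real LLM call is what buys the interpretability (the model can explain each match) at the cost of per-query latency.</p><p>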
Particularly relevant if you&#8217;re building search systems where embedding drift or maintenance burden has become problematic.</p><div><hr></div><h2><strong>&#128736; 2 Tools &amp; Repos</strong></h2><h3><strong>RAG Project (bRAG-langchain)</strong></h3><p><strong><a href="https://github.com/bragai/bRAG-langchain">https://github.com/bragai/bRAG-langchain</a></strong></p><p>A production-focused RAG framework built on LangChain that addresses common implementation gaps in retrieval-augmented generation systems. The repository includes patterns for document processing pipelines, retrieval strategies that work beyond simple similarity search, and integration approaches for connecting to various data sources. Designed for teams moving from RAG prototypes to production deployments where reliability and performance actually matter.</p><h3><strong>DeepSpeed</strong></h3><p><strong><a href="https://github.com/deepspeedai/DeepSpeed">https://github.com/deepspeedai/DeepSpeed</a></strong></p><p>Microsoft&#8217;s distributed training library makes training large models practical. DeepSpeed handles the infrastructure complexity of multi-GPU and multi-node training, implementing memory optimization techniques like ZeRO (Zero Redundancy Optimizer) that let you train models that wouldn&#8217;t fit in memory otherwise. The library includes optimizations for both training and inference, with specific support for mixture-of-experts architectures. 
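</p><p>Getting started is mostly a matter of configuration. A minimal sketch of a ZeRO stage 2 setup (partitioning optimizer state and gradients across GPUs) looks like this; the field values are illustrative, so consult the DeepSpeed docs for your workload.</p>

```python
# Minimal DeepSpeed config sketch enabling ZeRO stage 2.
# Values are illustrative, not tuned recommendations.
ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,            # 1: optimizer state, 2: +gradients, 3: +parameters
        "overlap_comm": True,  # overlap gradient communication with compute
    },
}

# Typical hand-off (needs deepspeed and a GPU cluster; shown for shape only):
# import deepspeed
# engine, optimizer, _, _ = deepspeed.initialize(
#     model=model, model_parameters=model.parameters(), config=ds_config)
```

<p>Stage 3 extends the partitioning to the parameters themselves, which is what makes models larger than a single GPU's memory trainable at all.</p><p>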
Essential infrastructure for teams working with models beyond what fits on a single GPU.</p><div><hr></div><h2><strong>&#127891; 1 Pick of the Week</strong></h2><h3><strong>MIT&#8217;s Missing Semester 2026</strong></h3><p><strong><a href="https://www.youtube.com/playlist?list=PLyzOVJj3bHQunmnnTXrNbZnBaCA-ieK4L">https://www.youtube.com/playlist?list=PLyzOVJj3bHQunmnnTXrNbZnBaCA-ieK4L</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kVMy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f298d1e-d955-4334-b9ce-922301a06300_1180x524.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kVMy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f298d1e-d955-4334-b9ce-922301a06300_1180x524.png 424w, https://substackcdn.com/image/fetch/$s_!kVMy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f298d1e-d955-4334-b9ce-922301a06300_1180x524.png 848w, https://substackcdn.com/image/fetch/$s_!kVMy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f298d1e-d955-4334-b9ce-922301a06300_1180x524.png 1272w, https://substackcdn.com/image/fetch/$s_!kVMy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f298d1e-d955-4334-b9ce-922301a06300_1180x524.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kVMy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f298d1e-d955-4334-b9ce-922301a06300_1180x524.png" width="1180" 
height="524" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9f298d1e-d955-4334-b9ce-922301a06300_1180x524.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:524,&quot;width&quot;:1180,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kVMy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f298d1e-d955-4334-b9ce-922301a06300_1180x524.png 424w, https://substackcdn.com/image/fetch/$s_!kVMy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f298d1e-d955-4334-b9ce-922301a06300_1180x524.png 848w, https://substackcdn.com/image/fetch/$s_!kVMy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f298d1e-d955-4334-b9ce-922301a06300_1180x524.png 1272w, https://substackcdn.com/image/fetch/$s_!kVMy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f298d1e-d955-4334-b9ce-922301a06300_1180x524.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>MIT&#8217;s course on the practical computing skills that formal CS education typically skips. The 2026 playlist covers shell scripting, version control, text manipulation, debugging tools, and the command-line workflows that separate efficient developers from those constantly fighting their environment. These fundamentals compound in value over your entire career, making tasks that seem arcane to beginners second nature to experienced practitioners. If you&#8217;ve ever watched a senior engineer accomplish in seconds what takes you minutes of clicking through UIs, this teaches those techniques systematically. In this age of AI-assisted coding, these are the skills you absolutely should master.</p><div><hr></div><p><em>Thanks for reading The Tokenizer! If you found something useful here, share it with someone who might benefit. 
And if you want more curated insights like this, consider subscribing to Gradient Ascent.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Gradient Ascent! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[NVIDIA Fixes Multi-Reward RL Collapse, Video Agents Lose Their Train of Thought, and LLM Benchmarks That Judge Themselves - 📚 The Tokenizer Edition #14]]></title><description><![CDATA[This week's most valuable AI resources]]></description><link>https://newsletter.artofsaience.com/p/nvidia-fixes-multi-reward-rl-collapse</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/nvidia-fixes-multi-reward-rl-collapse</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Mon, 19 Jan 2026 17:02:51 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/HY_JyxAZsiE" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! NVIDIA identified a normalization issue in multi-reward RL where distinct reward combinations collapse into identical training signals. Video agents show consistent failures in maintaining focus through long retrieval chains. Someone built a framework to systematically evaluate benchmark quality. 
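</p><p>That collapse is easy to reproduce in a toy example: normalize the sum of several rewards and distinct reward mixes with the same total become indistinguishable, while normalizing each reward channel first (the fix NVIDIA's GDPO paper proposes, covered below) keeps them apart. This is a sketch of the idea only; the actual method's aggregation details may differ.</p>

```python
import statistics

def znorm(xs):
    """Group-normalize: subtract the mean, divide by the std (guard zero std)."""
    mu = statistics.mean(xs)
    sd = statistics.pstdev(xs) or 1.0
    return [(x - mu) / sd for x in xs]

def summed_advantages(reward_vectors):
    """GRPO-style: sum rewards per sample, then normalize the sums.
    Distinct reward mixes with equal sums collapse to the same advantage."""
    return znorm([sum(r) for r in reward_vectors])

def per_reward_advantages(reward_vectors):
    """GDPO-style sketch: normalize each reward channel across the group
    first, then aggregate the normalized channels."""
    channels = list(zip(*reward_vectors))
    normed = [znorm(list(ch)) for ch in channels]
    return [sum(vals) for vals in zip(*normed)]
```

<p>With rewards (3, 0) and (0, 3), the summed version assigns both samples the same advantage; the per-reward version still tells them apart.</p><p>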
Also, new research questions whether the weight decay equilibrium we&#8217;ve accepted is optimal.</p><div><hr></div><h3><strong>New here?</strong></h3><p><em>The Tokenizer is my resource-focused newsletter edition where I curate the best papers, videos, articles, tools, and learning resources from across the AI landscape. Consider it your weekly dose of everything you need to stay ahead in machine learning.</em></p><div><hr></div><h2><strong>TL;DR</strong></h2><p><strong>What caught my attention this week:</strong></p><p>&#8226; &#128196; <strong>Papers:</strong> NVIDIA&#8217;s fix for multi-reward RL signal collapse, video agents that drift off-task during web research, learnable multipliers escaping weight decay traps, RL-as-a-service infrastructure, and meta-benchmarks judging benchmark quality</p><p>&#8226; &#127909; <strong>Videos:</strong> Claude Agent SDK workshop with real-world integrations, why context stuffing isn&#8217;t memory, Codex fundamentals, and spec-driven AI development</p><p>&#8226; &#128240; <strong>Reads:</strong> OpenAI&#8217;s framework for AI behavior governance, performance optimization principles that actually matter, and what LLM coding workflows look like in 2026</p><p>&#8226; &#128736; <strong>Tools:</strong> Simon Willison&#8217;s definitive 2025 LLM retrospective and production-ready Claude Code templates</p><p>&#8226; &#127891; <strong>Learning:</strong> Comprehensive modern AI course from foundations through deployment</p><div><hr></div><p>Quick Plug: I&#8217;m running an <strong>ML and Generative AI System Design workshop with Packt</strong>.</p><p>We&#8217;ll cover the core design principles for building solid AI products: making systems reliable, measuring what matters, and designing architectures that work in production.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=Sairam" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!znoC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04ee1b92-e493-48d2-9a3a-77668b4727d8_1280x640.png 424w, https://substackcdn.com/image/fetch/$s_!znoC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04ee1b92-e493-48d2-9a3a-77668b4727d8_1280x640.png 848w, https://substackcdn.com/image/fetch/$s_!znoC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04ee1b92-e493-48d2-9a3a-77668b4727d8_1280x640.png 1272w, https://substackcdn.com/image/fetch/$s_!znoC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04ee1b92-e493-48d2-9a3a-77668b4727d8_1280x640.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!znoC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04ee1b92-e493-48d2-9a3a-77668b4727d8_1280x640.png" width="1280" height="640" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/04ee1b92-e493-48d2-9a3a-77668b4727d8_1280x640.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:640,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=Sairam&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!znoC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04ee1b92-e493-48d2-9a3a-77668b4727d8_1280x640.png 424w, https://substackcdn.com/image/fetch/$s_!znoC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04ee1b92-e493-48d2-9a3a-77668b4727d8_1280x640.png 848w, https://substackcdn.com/image/fetch/$s_!znoC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04ee1b92-e493-48d2-9a3a-77668b4727d8_1280x640.png 1272w, https://substackcdn.com/image/fetch/$s_!znoC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04ee1b92-e493-48d2-9a3a-77668b4727d8_1280x640.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" 
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Use code <strong>FLASH40</strong> for <a href="https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=Sairam">40% off</a>.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=Sairam&quot;,&quot;text&quot;:&quot;Register for the Workshop&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.eventbrite.com/e/machine-learning-and-generative-ai-system-design-workshop-tickets-1975103644168?aff=Sairam"><span>Register for the Workshop</span></a></p><p>What topics/problems would you most want covered in a system design workshop? 
Drop a comment or DM me.</p><div><hr></div><h2><strong>&#128196; 5 Papers</strong></h2><h3><strong>GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization</strong></h3><p><strong><a href="https://arxiv.org/abs/2601.05242">https://arxiv.org/abs/2601.05242</a> |<a href="https://github.com/NVlabs/GDPO"> Github</a></strong></p><p>Multi-reward RL has a normalization problem. When GRPO normalizes the sum of distinct rewards, different reward combinations can collapse into identical advantage values. NVIDIA&#8217;s GDPO addresses this by normalizing each reward independently before aggregating them, preserving the relative differences between reward combinations. Testing on tool calling, math reasoning, and code generation shows consistent improvements. On AIME, training DeepSeek-R1-1.5B with GDPO yields 6.3% higher accuracy compared to GRPO while maintaining shorter response lengths. The method works as a drop-in replacement in verl and TRL frameworks.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mwhZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d3f5966-2583-4c81-84d4-f142488037b4_1600x901.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mwhZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d3f5966-2583-4c81-84d4-f142488037b4_1600x901.png 424w, https://substackcdn.com/image/fetch/$s_!mwhZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d3f5966-2583-4c81-84d4-f142488037b4_1600x901.png 848w, 
https://substackcdn.com/image/fetch/$s_!mwhZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d3f5966-2583-4c81-84d4-f142488037b4_1600x901.png 1272w, https://substackcdn.com/image/fetch/$s_!mwhZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d3f5966-2583-4c81-84d4-f142488037b4_1600x901.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mwhZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d3f5966-2583-4c81-84d4-f142488037b4_1600x901.png" width="1456" height="820" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2d3f5966-2583-4c81-84d4-f142488037b4_1600x901.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:820,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mwhZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d3f5966-2583-4c81-84d4-f142488037b4_1600x901.png 424w, https://substackcdn.com/image/fetch/$s_!mwhZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d3f5966-2583-4c81-84d4-f142488037b4_1600x901.png 848w, 
https://substackcdn.com/image/fetch/$s_!mwhZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d3f5966-2583-4c81-84d4-f142488037b4_1600x901.png 1272w, https://substackcdn.com/image/fetch/$s_!mwhZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d3f5966-2583-4c81-84d4-f142488037b4_1600x901.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video 
Reasoning</strong></h3><p><strong><a href="https://arxiv.org/abs/2601.06943">https://arxiv.org/abs/2601.06943</a> |<a href="https://github.com/QuantaAlpha/VideoDR-Benchmark"> Github</a></strong></p><p>VideoDR tests whether video agents can extract visual anchors from multiple frames, retrieve information from the open web, and combine evidence from both sources. The benchmark requires cross-frame visual extraction, interactive web retrieval, and multi-hop reasoning. Results show agentic approaches don&#8217;t consistently outperform workflow-based methods. The advantage appears when models maintain their initial video anchors throughout long retrieval chains. The evaluation across both Workflow and Agentic paradigms identifies goal drift and long-horizon consistency as the primary failure modes. The benchmark includes 100 video-question-answer triples across six semantic domains with annotation designed to prevent single-source solutions.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZbUP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30569940-37fc-4d83-aaf1-d28ec602e668_1600x1296.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZbUP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30569940-37fc-4d83-aaf1-d28ec602e668_1600x1296.png 424w, https://substackcdn.com/image/fetch/$s_!ZbUP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30569940-37fc-4d83-aaf1-d28ec602e668_1600x1296.png 848w, 
https://substackcdn.com/image/fetch/$s_!ZbUP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30569940-37fc-4d83-aaf1-d28ec602e668_1600x1296.png 1272w, https://substackcdn.com/image/fetch/$s_!ZbUP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30569940-37fc-4d83-aaf1-d28ec602e668_1600x1296.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZbUP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30569940-37fc-4d83-aaf1-d28ec602e668_1600x1296.png" width="1456" height="1179" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/30569940-37fc-4d83-aaf1-d28ec602e668_1600x1296.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1179,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZbUP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30569940-37fc-4d83-aaf1-d28ec602e668_1600x1296.png 424w, https://substackcdn.com/image/fetch/$s_!ZbUP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30569940-37fc-4d83-aaf1-d28ec602e668_1600x1296.png 848w, 
https://substackcdn.com/image/fetch/$s_!ZbUP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30569940-37fc-4d83-aaf1-d28ec602e668_1600x1296.png 1272w, https://substackcdn.com/image/fetch/$s_!ZbUP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30569940-37fc-4d83-aaf1-d28ec602e668_1600x1296.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers</strong></h3><p><strong><a 
href="https://arxiv.org/abs/2601.04890">https://arxiv.org/abs/2601.04890</a> |<a href="https://github.com/tiiuae/falcon-h1"> Github</a></strong></p><p>Weight decay and stochastic gradient noise create an equilibrium that determines weight matrix norms through optimization hyperparameters rather than data. This work introduces learnable multipliers to test whether this equilibrium is optimal. Scalar multipliers attached to weight matrices show the equilibrium norm is suboptimal, with learned scales adapting to data and improving performance. Per-row and per-column multipliers extend this by freeing individual dimension scales. The approach generalizes muP multipliers with more expressivity, outperforms well-tuned muP baselines, and shows improvements with both Adam and Muon optimizers. Applied in Falcon-H1 pretraining with consistent downstream performance gains.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!S_9Q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf4c9271-e541-4eea-aa5e-2a6dfd8d10e3_610x554.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!S_9Q!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf4c9271-e541-4eea-aa5e-2a6dfd8d10e3_610x554.png 424w, https://substackcdn.com/image/fetch/$s_!S_9Q!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf4c9271-e541-4eea-aa5e-2a6dfd8d10e3_610x554.png 848w, https://substackcdn.com/image/fetch/$s_!S_9Q!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf4c9271-e541-4eea-aa5e-2a6dfd8d10e3_610x554.png 1272w, 
https://substackcdn.com/image/fetch/$s_!S_9Q!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf4c9271-e541-4eea-aa5e-2a6dfd8d10e3_610x554.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!S_9Q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf4c9271-e541-4eea-aa5e-2a6dfd8d10e3_610x554.png" width="610" height="554" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cf4c9271-e541-4eea-aa5e-2a6dfd8d10e3_610x554.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:554,&quot;width&quot;:610,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!S_9Q!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf4c9271-e541-4eea-aa5e-2a6dfd8d10e3_610x554.png 424w, https://substackcdn.com/image/fetch/$s_!S_9Q!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf4c9271-e541-4eea-aa5e-2a6dfd8d10e3_610x554.png 848w, https://substackcdn.com/image/fetch/$s_!S_9Q!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf4c9271-e541-4eea-aa5e-2a6dfd8d10e3_610x554.png 1272w, 
https://substackcdn.com/image/fetch/$s_!S_9Q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf4c9271-e541-4eea-aa5e-2a6dfd8d10e3_610x554.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>OpenTinker: Separating Concerns in Agentic Reinforcement Learning</strong></h3><p><strong><a href="https://arxiv.org/abs/2601.07376">https://arxiv.org/abs/2601.07376</a> |<a href="https://github.com/open-tinker/OpenTinker"> Github</a></strong></p><p>OpenTinker provides RL infrastructure that separates environment specification from execution and 
resource management. Users define agents, environments, and interaction protocols while the system handles rollout generation, training, and scheduling through a managed runtime. The architecture supports both LoRA-based and full-parameter RL across shared cluster resources. A centralized scheduler manages multi-tenant workloads, allowing multiple users to run training jobs on shared infrastructure. The design extends to multi-agent training through an agent protocol coordinator that manages interaction order and synchronization within the environment abstraction.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;4eb9105a-576e-42ac-8329-9b88983691b4&quot;,&quot;duration&quot;:null}"></div><h3><strong>Benchmark^2: Systematic Evaluation of LLM Benchmarks</strong></h3><p><strong><a href="https://arxiv.org/abs/2601.03986">https://arxiv.org/abs/2601.03986</a></strong></p><p>This work introduces a framework for evaluating benchmark quality through three metrics: Cross-Benchmark Ranking Consistency (alignment with peer benchmark rankings), Discriminability Score (ability to differentiate between models), and Capability Alignment Deviation (whether stronger models within a family perform better). Testing across 15 benchmarks spanning mathematics, reasoning, and knowledge domains with 11 LLMs from four families reveals significant quality variations among existing benchmarks. 
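</p><p>Under one plausible reading of those descriptions (the paper&#8217;s exact formulations may differ), the three metrics can be sketched as:</p><pre><code># Hedged pseudocode sketch; metric names are from the paper, formulas are assumed.
CrossBenchmarkRankingConsistency(b) =
    mean over peer benchmarks b' of SpearmanCorr(model_ranking(b), model_ranking(b'))

DiscriminabilityScore(b) =
    spread of model scores on b, relative to score noise

CapabilityAlignmentDeviation(b) =
    fraction of within-family model pairs where the stronger model scores worse on b</code></pre><p>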
The analysis shows that selective benchmark construction based on these metrics can achieve comparable evaluation performance with substantially reduced test sets.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mV24!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd05ccb75-c651-405e-ba74-2c71dcf4cd1b_917x422.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mV24!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd05ccb75-c651-405e-ba74-2c71dcf4cd1b_917x422.png 424w, https://substackcdn.com/image/fetch/$s_!mV24!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd05ccb75-c651-405e-ba74-2c71dcf4cd1b_917x422.png 848w, https://substackcdn.com/image/fetch/$s_!mV24!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd05ccb75-c651-405e-ba74-2c71dcf4cd1b_917x422.png 1272w, https://substackcdn.com/image/fetch/$s_!mV24!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd05ccb75-c651-405e-ba74-2c71dcf4cd1b_917x422.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mV24!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd05ccb75-c651-405e-ba74-2c71dcf4cd1b_917x422.png" width="917" height="422" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d05ccb75-c651-405e-ba74-2c71dcf4cd1b_917x422.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:422,&quot;width&quot;:917,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mV24!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd05ccb75-c651-405e-ba74-2c71dcf4cd1b_917x422.png 424w, https://substackcdn.com/image/fetch/$s_!mV24!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd05ccb75-c651-405e-ba74-2c71dcf4cd1b_917x422.png 848w, https://substackcdn.com/image/fetch/$s_!mV24!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd05ccb75-c651-405e-ba74-2c71dcf4cd1b_917x422.png 1272w, https://substackcdn.com/image/fetch/$s_!mV24!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd05ccb75-c651-405e-ba74-2c71dcf4cd1b_917x422.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h2><strong>&#127909; 4 Videos</strong></h2><h3><strong>Claude Agent SDK [Full Workshop]</strong></h3><div id="youtube2-TqC1qOfiVcQ" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;TqC1qOfiVcQ&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/TqC1qOfiVcQ?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Anthropic&#8217;s technical team walks through their Agent SDK and Response API. The workshop covers chain-of-thought tool calling patterns, the interface for defining agent behaviors, and production deployment considerations. 
Includes integration examples from Coinbase and Box demonstrating how the SDK works in real implementations.</p><h3><strong>Stuffing Context is not Memory, Updating Weights is</strong></h3><div id="youtube2-Jty4s9-Jb78" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;Jty4s9-Jb78&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/Jty4s9-Jb78?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>This talk examines the distinction between context windows and model memory. The discussion covers why adding information to context differs from updating model weights, with implications for system design. Relevant if you&#8217;re building systems that rely on context windows and need to understand their limitations.</p><h3><strong>Getting started with Codex</strong></h3><div id="youtube2-px7XlbYgk7I" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;px7XlbYgk7I&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/px7XlbYgk7I?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>A practical introduction to Codex covering setup, integration patterns, and common workflows. 
Focuses on core functionality needed to evaluate whether Codex fits specific use cases.</p><h3><strong>Spec-Driven Development: Sharpening your AI toolbox</strong></h3><div id="youtube2-HY_JyxAZsiE" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;HY_JyxAZsiE&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/HY_JyxAZsiE?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>This talk covers how specification-driven development applies to AI systems. The approach involves defining clear specifications upfront and using them to guide development and validation, helping teams move from prototypes to production systems with more predictable behavior.</p><div><hr></div><h2><strong>&#128240; 3 Curated Reads</strong></h2><h3><strong>How should AI systems behave, and who should decide?</strong></h3><p><strong><a href="https://openai.com/index/how-should-ai-systems-behave/">https://openai.com/index/how-should-ai-systems-behave/<br><br></a></strong>Old but gold. OpenAI&#8217;s piece on frameworks for determining AI system behavior and who should influence those decisions. Covers the tension between universal standards and context-specific behavior with deployment examples. Addresses how to balance different stakeholder perspectives when defining appropriate AI behavior across varied use cases.</p><h3><strong>Performance Hints</strong></h3><p><strong><a href="https://abseil.io/fast/hints.html#performance-hints">https://abseil.io/fast/hints.html#performance-hints<br><br></a></strong>Jeff Dean and Sanjay Ghemawat share some incredible insights on performance optimization, covering measurement methodology, common pitfalls, and when optimization matters. 
Examines the relationship between code clarity and performance, with principles that apply across frameworks and languages.</p><h3><strong>My LLM coding workflow going into 2026</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:181957927,&quot;url&quot;:&quot;https://addyo.substack.com/p/my-llm-coding-workflow-going-into&quot;,&quot;publication_id&quot;:2115638,&quot;publication_name&quot;:&quot;Elevate&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!8WxC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3704470-b6d5-48a9-a9d1-564bd833fc5c_1280x1280.png&quot;,&quot;title&quot;:&quot;My LLM coding workflow going into 2026&quot;,&quot;truncated_body_text&quot;:&quot;AI coding assistants became game-changers this year, but harnessing them effectively takes skill and structure. These tools dramatically increased what LLMs can do for real-world coding, and many developers (myself included) embraced them.&quot;,&quot;date&quot;:&quot;2025-12-18T15:30:46.704Z&quot;,&quot;like_count&quot;:543,&quot;comment_count&quot;:31,&quot;bylines&quot;:[{&quot;id&quot;:11623675,&quot;name&quot;:&quot;Addy Osmani&quot;,&quot;handle&quot;:&quot;addyosmani&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cee7ba66-e656-4450-a0ed-c951c27ee228_1080x1080.jpeg&quot;,&quot;bio&quot;:&quot;Engineering leader at Google, #1 Bestselling Amazon author, Award-winning engineer and international speaker. I want to help you succeed. 
My writing is about software engineering, motivation, and leadership.&quot;,&quot;profile_set_up_at&quot;:&quot;2023-11-19T09:33:50.395Z&quot;,&quot;reader_installed_at&quot;:&quot;2023-11-29T05:13:59.015Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:2120503,&quot;user_id&quot;:11623675,&quot;publication_id&quot;:2115638,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:2115638,&quot;name&quot;:&quot;Elevate&quot;,&quot;subdomain&quot;:&quot;addyo&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Addy Osmani's newsletter on elevating your effectiveness. Join his community of 600,000 readers across social media.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e3704470-b6d5-48a9-a9d1-564bd833fc5c_1280x1280.png&quot;,&quot;author_id&quot;:11623675,&quot;primary_user_id&quot;:11623675,&quot;theme_var_background_pop&quot;:&quot;#FF5CD7&quot;,&quot;created_at&quot;:&quot;2023-11-19T09:34:16.230Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Addy Osmani&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:2207048,&quot;user_id&quot;:11623675,&quot;publication_id&quot;:2192362,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:2192362,&quot;name&quot;:&quot;Large Scale Web Apps&quot;,&quot;subdomain&quot;:&quot;largeapps&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Learn tools and techniques to build and maintain large-scale React web 
applications.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a9a53806-0d0b-4025-b992-145baca33809_512x512.png&quot;,&quot;author_id&quot;:11623675,&quot;primary_user_id&quot;:98078198,&quot;theme_var_background_pop&quot;:&quot;#99A2F1&quot;,&quot;created_at&quot;:&quot;2023-12-20T10:59:33.318Z&quot;,&quot;email_from_name&quot;:&quot;Addy and Hassan from Large Scale Apps&quot;,&quot;copyright&quot;:&quot;Addy Osmani and Hassan Djirdeh&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:2224891,&quot;user_id&quot;:11623675,&quot;publication_id&quot;:2209631,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:2209631,&quot;name&quot;:&quot;Deep Voice&quot;,&quot;subdomain&quot;:&quot;deepvoice&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;A newsletter on how to get more motivated. 
Brought to you by Addy Osmani.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/328afbab-a375-4ffd-ac83-40300eefc225_1280x1280.png&quot;,&quot;author_id&quot;:11623675,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#786CFF&quot;,&quot;created_at&quot;:&quot;2023-12-28T20:05:46.081Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Addy Osmani&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100,&quot;status&quot;:{&quot;bestsellerTier&quot;:100,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:100},&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://addyo.substack.com/p/my-llm-coding-workflow-going-into?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!8WxC!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3704470-b6d5-48a9-a9d1-564bd833fc5c_1280x1280.png" loading="lazy"><span class="embedded-post-publication-name">Elevate</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">My LLM coding workflow going into 2026</div></div><div class="embedded-post-body">AI coding 
assistants became game-changers this year, but harnessing them effectively takes skill and structure. These tools dramatically increased what LLMs can do for real-world coding, and many developers (myself included) embraced them&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">4 months ago &#183; 543 likes &#183; 31 comments &#183; Addy Osmani</div></a></div><p>Google&#8217;s Addy Osmani shares a practitioner&#8217;s account of using LLM-assisted coding in daily work. Covers specific tools, workflow patterns, where AI assistance provides value, and where it introduces friction. Focuses on practical integration with traditional development practices.</p><div><hr></div><h2><strong>&#128736; 2 Tools &amp; Repos</strong></h2><h3><strong>Simon Willison 2025 Review</strong></h3><p><strong><a href="https://simonwillison.net/2025/Dec/31/the-year-in-llms/">https://simonwillison.net/2025/Dec/31/the-year-in-llms/<br><br></a></strong>Simon Willison&#8217;s year-end retrospective connecting technical developments in LLMs to practical applications. The analysis identifies trends and developments from 2025 with context on their significance beyond benchmark performance. Useful for understanding how the field evolved over the year.</p><h3><strong>Claude Code Templates</strong></h3><p><strong><a href="https://github.com/davila7/claude-code-templates">https://github.com/davila7/claude-code-templates<br><br></a></strong>Ready-to-use templates for Claude Code workflows across React, Vue, Django, FastAPI, and other frameworks. Each template includes CLAUDE.md configurations and setup patterns that work in practice. 
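</p><p>A minimal CLAUDE.md in the spirit of these templates might look like the following (hypothetical contents for illustration; the repo&#8217;s actual templates are more detailed):</p><pre><code># CLAUDE.md (illustrative sketch)
## Project
FastAPI service; Python 3.12; dependencies managed with uv.

## Commands
- Run tests: pytest -q
- Lint: ruff check .

## Conventions
- Keep endpoints in app/routers/, one router per resource.
- Never edit generated migration files by hand.</code></pre><p>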
Reduces setup time when starting new projects.</p><div><hr></div><h2><strong>&#127891; 1 Pick of the Week</strong></h2><h3><strong>CMU&#8217;s Modern AI Course</strong></h3><p><strong><a href="https://modernaicourse.org/">https://modernaicourse.org/</a></strong></p><p>CMU&#8217;s free course covering AI from foundations through deployment starts on Jan 26th for online learners outside the classroom. The curriculum includes core concepts, training methodologies, and practical implementation. It appears to balance mathematical foundations with hands-on projects. Structured for both newcomers building foundational knowledge and practitioners working on specific topics.</p><div><hr></div><p><em>Thanks for reading The Tokenizer! If you found something useful here, share it with someone who might benefit. And if you want more curated insights like this, consider subscribing to Gradient Ascent.</em></p>]]></content:encoded></item><item><title><![CDATA[The Last Mile is Always Human]]></title><description><![CDATA[What's next for Gradient Ascent (looking back and ahead)]]></description><link>https://newsletter.artofsaience.com/p/the-last-mile-is-always-human</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/the-last-mile-is-always-human</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Sun, 04 Jan 2026 13:37:39 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!zgMZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb6c9ece-78d5-4310-9e18-9a6b0fb710be_1080x1920.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We are at a crossroads.</p><p>The &#8220;wow&#8221; phase is over. For two years, generative AI felt like witnessing a magic trick on repeat. Chatbots that could write poetry. Models that conjured images from words. Agents that promised to automate everything we found tedious. The demos were intoxicating. 
The possibilities felt infinite.</p><blockquote><p>Now comes the harder question: <em>What do we actually do with all of this?</em></p></blockquote><p>Something has shifted. The noise-to-signal ratio has exploded. LinkedIn feeds overflow with AI-generated thought leadership that says nothing. Inboxes fill with synthetic outreach that fools no one. The tools that promised clarity have produced a fog. And somewhere in that fog, the things worth reading, worth learning, worth building have become harder to find.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zgMZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb6c9ece-78d5-4310-9e18-9a6b0fb710be_1080x1920.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zgMZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb6c9ece-78d5-4310-9e18-9a6b0fb710be_1080x1920.gif 424w, https://substackcdn.com/image/fetch/$s_!zgMZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb6c9ece-78d5-4310-9e18-9a6b0fb710be_1080x1920.gif 848w, https://substackcdn.com/image/fetch/$s_!zgMZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb6c9ece-78d5-4310-9e18-9a6b0fb710be_1080x1920.gif 1272w, https://substackcdn.com/image/fetch/$s_!zgMZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb6c9ece-78d5-4310-9e18-9a6b0fb710be_1080x1920.gif 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!zgMZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb6c9ece-78d5-4310-9e18-9a6b0fb710be_1080x1920.gif" width="258" height="458.6666666666667" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cb6c9ece-78d5-4310-9e18-9a6b0fb710be_1080x1920.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1920,&quot;width&quot;:1080,&quot;resizeWidth&quot;:258,&quot;bytes&quot;:2042320,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.artofsaience.com/i/183438573?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb6c9ece-78d5-4310-9e18-9a6b0fb710be_1080x1920.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zgMZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb6c9ece-78d5-4310-9e18-9a6b0fb710be_1080x1920.gif 424w, https://substackcdn.com/image/fetch/$s_!zgMZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb6c9ece-78d5-4310-9e18-9a6b0fb710be_1080x1920.gif 848w, https://substackcdn.com/image/fetch/$s_!zgMZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb6c9ece-78d5-4310-9e18-9a6b0fb710be_1080x1920.gif 1272w, 
https://substackcdn.com/image/fetch/$s_!zgMZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcb6c9ece-78d5-4310-9e18-9a6b0fb710be_1080x1920.gif 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I started Gradient Ascent because I believe complex ideas deserve to be understood, not just consumed. I believe that a hand-drawn diagram can unlock an insight that ten thousand words cannot. I believe that the struggle to understand something deeply is not an obstacle to learning. 
</p><p>It <em>is</em> learning.</p><p>But I cannot build this newsletter in a vacuum. Not anymore. Not at this crossroads.</p><p>I need to know where you are on your climb. What concepts remain fuzzy? What keeps you up at night? What would actually help you build better systems, lead better teams, and think more clearly about the technology reshaping our world?</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://form.typeform.com/to/qk9cGkK8&quot;,&quot;text&quot;:&quot;Take the Survey (Just 5 minutes)&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://form.typeform.com/to/qk9cGkK8"><span>Take the Survey (Just 5 minutes)</span></a></p><blockquote><p><em>As a token of thanks, on completing the survey, you&#8217;ll receive a curated resource pack that has the best FREE courses, repos, and other learning resources on various topics, including LLMs, Agents, RAG, Vision, RL, and more. (Within 2 business days)</em></p></blockquote><p>Your answers will directly shape what Gradient Ascent becomes next. The topics I cover. The depth I go to. The formats I experiment with. This is not a formality. I will read every response.</p><p>Before I share where we are headed, let me tell you where we have been, and what I have been wrestling with along the way.</p><div><hr></div><h2><strong>Where We Have Been</strong></h2><p>This past year, the Gradient Ascent community grew from 8k readers to <strong>25k</strong> readers.</p><p>We had 15 curation pieces and a handful of deeper essays. 
The most popular editions, <strong>&#8220;Claude Code is not a tool&#8221;</strong> and <strong>&#8220;The secrets of distributed training&#8221;</strong>, sparked conversations that continued for weeks.</p><p>But the numbers only tell part of the story.</p><p>The truth is, most of my creative energy this year went somewhere else: finishing and launching<a href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/"> </a><em><a href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/">AI for the Rest of Us</a></em>.</p><p>Writing a book, especially one built around hand-drawn illustrations, is a different beast than writing a newsletter. It demands sustained focus across months, not the weekly rhythm of shipping editions. It forces you to think in chapters, not posts. It consumed me in the best and most exhausting way.</p><p>That is why The Tokenizer, my curated weekly roundup, became the workhorse of Gradient Ascent this year. It was the edition I could sustain while wrestling a book manuscript into shape between a full-time role and family commitments.</p><p>I don&#8217;t regret that trade-off. The book exists today in the real world. It is helping people who will never read a technical paper understand what is actually happening inside these systems. 
That matters to me deeply.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xrlO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf868815-17d3-40bb-8630-1a9040a2daaf_1080x1350.png 424w, https://substackcdn.com/image/fetch/$s_!xrlO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf868815-17d3-40bb-8630-1a9040a2daaf_1080x1350.png 848w, https://substackcdn.com/image/fetch/$s_!xrlO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf868815-17d3-40bb-8630-1a9040a2daaf_1080x1350.png 1272w, https://substackcdn.com/image/fetch/$s_!xrlO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf868815-17d3-40bb-8630-1a9040a2daaf_1080x1350.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xrlO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf868815-17d3-40bb-8630-1a9040a2daaf_1080x1350.png" width="432" height="540" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/af868815-17d3-40bb-8630-1a9040a2daaf_1080x1350.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1350,&quot;width&quot;:1080,&quot;resizeWidth&quot;:432,&quot;bytes&quot;:2931998,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.artofsaience.com/i/183438573?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf868815-17d3-40bb-8630-1a9040a2daaf_1080x1350.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xrlO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf868815-17d3-40bb-8630-1a9040a2daaf_1080x1350.png 424w, https://substackcdn.com/image/fetch/$s_!xrlO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf868815-17d3-40bb-8630-1a9040a2daaf_1080x1350.png 848w, https://substackcdn.com/image/fetch/$s_!xrlO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf868815-17d3-40bb-8630-1a9040a2daaf_1080x1350.png 1272w, https://substackcdn.com/image/fetch/$s_!xrlO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf868815-17d3-40bb-8630-1a9040a2daaf_1080x1350.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Pictures from readers worldwide</figcaption></figure></div><p>But I also felt the absence of something.</p><p>The deep-dive visual explainers. The pieces that force me to truly understand a concept before I can draw and write about it. 
The work that sticks in your brain months after you read it.</p><p>That is what I want to return to.</p><p>And during the months I spent away from that work, I found myself thinking constantly about <em>why</em> it matters.</p><div><hr></div><h2><strong>The Future of Learning</strong></h2><p>While writing the book, and in the months since, I have been wrestling with a question that will not let me go:</p><p><em>What is the future of learning in an age of infinite generated content?</em></p><p>This is not an abstract question for me. It is existential. I write a newsletter. I draw illustrations to explain complex ideas. I believe deeply in education as transformation, not transaction. And I am watching the very foundations of that belief get tested.</p><p>I have spent months reading, listening, and talking to creators across disciplines. Educators rethinking what a classroom means when every student has a tireless tutor in their pocket. Artists questioning what it means to make something when a machine can generate a thousand variations in seconds. Writers wondering whether the craft of prose still matters when text can be produced on demand.</p><p>The conversations have been fascinating and unsettling in equal measure. And they have crystallized something for me about what this newsletter needs to be, and what it must resist becoming.</p><p>Because here is what I have come to believe: we are facing a cognition problem and not just a content problem. And if we don&#8217;t name it clearly, we can&#8217;t fight it.</p><div><hr></div><h2><strong>The Quiet Erosion</strong></h2><p>Here is what I have been thinking about while planning that next chapter.</p><p>We are in the middle of something I call the Quiet Erosion. It isn&#8217;t dramatic. It doesn&#8217;t announce itself. It happens one small convenience at a time.</p><p>The student prompts an AI to write their essay. They get the grade. 
But they never learned to structure an argument, to wrestle with ambiguity, to find their own voice. The output exists. The skill does not.</p><p>The engineer pastes AI-generated code into their system. It runs. But they cannot fully explain what it does. They have introduced a black box into their own creation. When it breaks at 2 AM, they will stare at it like a stranger.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wq_O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6495aab2-38de-4010-82c0-5e9a1b03763b_1080x1920.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wq_O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6495aab2-38de-4010-82c0-5e9a1b03763b_1080x1920.png 424w, https://substackcdn.com/image/fetch/$s_!wq_O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6495aab2-38de-4010-82c0-5e9a1b03763b_1080x1920.png 848w, https://substackcdn.com/image/fetch/$s_!wq_O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6495aab2-38de-4010-82c0-5e9a1b03763b_1080x1920.png 1272w, https://substackcdn.com/image/fetch/$s_!wq_O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6495aab2-38de-4010-82c0-5e9a1b03763b_1080x1920.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wq_O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6495aab2-38de-4010-82c0-5e9a1b03763b_1080x1920.png" width="274" 
height="487.1111111111111" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6495aab2-38de-4010-82c0-5e9a1b03763b_1080x1920.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1920,&quot;width&quot;:1080,&quot;resizeWidth&quot;:274,&quot;bytes&quot;:961725,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.artofsaience.com/i/183438573?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6495aab2-38de-4010-82c0-5e9a1b03763b_1080x1920.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wq_O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6495aab2-38de-4010-82c0-5e9a1b03763b_1080x1920.png 424w, https://substackcdn.com/image/fetch/$s_!wq_O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6495aab2-38de-4010-82c0-5e9a1b03763b_1080x1920.png 848w, https://substackcdn.com/image/fetch/$s_!wq_O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6495aab2-38de-4010-82c0-5e9a1b03763b_1080x1920.png 1272w, https://substackcdn.com/image/fetch/$s_!wq_O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6495aab2-38de-4010-82c0-5e9a1b03763b_1080x1920.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The executive skims an AI summary of a strategy document. They learn the bullet points but miss the subtext. They know the what but not the why. And when the decision demands nuance, they reach for a foundation that never existed.</p><p>This is cognitive offloading, and it is subtle. Charlie Gedeon calls it &#8220;intellectual deskilling&#8221;: letting the<a href="https://www.youtube.com/watch?v=m8WomdCLBqE"> copilot become the autopilot</a>. Anthropic&#8217;s own research found that<a href="https://www.anthropic.com/news/anthropic-education-report-how-university-students-use-claude"> 47% of early student AI use</a> was purely transactional. Students wanted the answer, not the understanding. The output existed, but the user gained nothing. Call it atrophy or deskilling. 
The risk is real.</p><p>Think about what happens when you break your arm. The cast immobilizes it, protects it, does the work of holding everything in place while you heal. But when the cast comes off six weeks later, the arm underneath has withered. The muscles have atrophied. The joint is stiff. The very thing that protected you also weakened you, because the arm was never asked to do its job.</p><p>That is what we are doing to our minds. Every task we offload is a rep we do not perform. Every answer we accept without struggle is a neural pathway we do not build. The AI holds our cognition in place, and we feel supported. But underneath the cast, we are quietly atrophying.</p><p>And while our individual cognition weakens, the collective information environment is rotting too.</p><p>The internet is filling with slop. Infinite summaries. Auto-generated code. Synthetic opinions. Content mills churning out articles optimized for algorithms, not humans.</p><p>Every platform is drowning in low-quality noise designed to capture attention, not to inform or illuminate. Finding signal in this environment is becoming genuinely difficult.</p><p>Finding truth is becoming harder still.</p><p>Kurzgesagt identified what they call<a href="https://www.youtube.com/watch?v=_zfN9wnPvU0"> the Circular Lie</a>, and it haunts me. Here is how it works: an AI hallucinates a fact, and that ends up on the internet. It sounds confident, so another AI scrapes that content and cites it. A human researcher, trusting the citation, includes it in their work.</p><p>And just like that, a falsehood embeds itself in the knowledge base, passed off as truth, impossible to trace back to its hollow origin.</p><p>This is happening now. We are offloading cognition and, in the process, building cathedrals on quicksand. Worse yet, we are doing it while our ability to detect the quicksand atrophies from disuse.</p><p>There is something else too, something harder to measure but impossible to ignore. 
The erosion of wonder. We used to look at a stunning piece of art or an incredible video and feel awe. Now we look at it with suspicion. <em>Is this real? Is this fake? Was this made by a human or generated by a machine?</em> The default reaction has shifted from appreciation to skepticism.</p><p>By flooding us with content, AI has poisoned our ability to trust what we see.</p><p>This is the crisis I want Gradient Ascent to address. Not by ignoring AI. Not by pretending we can turn back the clock. But by insisting, stubbornly and deliberately, on the things that still require human struggle, human judgment, human understanding.</p><p>Which brings me to the core philosophy of learning.</p><div><hr></div><h2><strong>No Friction, No Growth</strong></h2><p>Here is the uncomfortable truth about learning: it requires resistance.</p><p>Your brain does not build new pathways because you want to understand something. It builds them because you <em>struggle</em> to understand something. The difficulty, the obstacle, is the mechanism. When AI removes the friction, it removes the learning itself.</p><p>Think of it this way.</p><p>A helicopter can drop you at the summit of a mountain. You step out, take in the view, snap a photo. &#8220;Made it to the top!&#8221; you announce. But you are a tourist. You do not know the terrain. You could not navigate back down if you tried. The first unexpected storm will leave you stranded.</p><p>The climb is different. You walk the path yourself. You feel the incline. You learn which handholds hold and which crumble. You take wrong turns and backtrack. And when you finally reach the summit, you are not a tourist. You are a mountaineer. The knowledge lives in your body.</p><p>Both the tourist and the climber arrive at the summit. Only one arrives transformed.</p><p>This is what Gradient Ascent means.</p><p>I&#8217;m not here to helicopter you to the summit. 
I&#8217;m not here to hand you AI-generated summaries so you can skip the complex parts. I&#8217;ll help map the route, verify the path, point out where the footing gets tricky, and where the views are worth pausing for.</p><p>But you have to do the climbing.</p><p>And so do I.</p><div><hr></div><h2><strong>The Pen as a Filter</strong></h2><p>Some readers of my book have asked me, &#8220;Why do you still draw everything by hand? AI can generate diagrams now. They&#8217;re getting better every day.&#8221;</p><p>Yes, the models are impressive. Prompt the right words, and a technically polished neural network diagram appears in seconds.</p><p>But that observation misses the entire point.</p><p>I don&#8217;t draw to decorate. I draw to <em>think</em>.</p><p>When I sit down to illustrate a concept for my book or the newsletter, I&#8217;m not creating art (it would be a stretch to call it that anyway). I&#8217;m running myself through a brutal compression algorithm. I take a 20-page research paper full of equations, jargon, and dense abstraction, and I force myself to strip away everything that is not essential. I search for the single visual metaphor that unlocks the whole thing. I can&#8217;t draw it until I understand it. 
And I can&#8217;t understand it until I have struggled with it.</p><p>That struggle is the product.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jFA3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4be66eca-f597-4fdc-8bab-7b78c0177932_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jFA3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4be66eca-f597-4fdc-8bab-7b78c0177932_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!jFA3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4be66eca-f597-4fdc-8bab-7b78c0177932_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!jFA3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4be66eca-f597-4fdc-8bab-7b78c0177932_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!jFA3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4be66eca-f597-4fdc-8bab-7b78c0177932_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jFA3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4be66eca-f597-4fdc-8bab-7b78c0177932_1024x1024.png" width="590" height="590" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4be66eca-f597-4fdc-8bab-7b78c0177932_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:590,&quot;bytes&quot;:683710,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.artofsaience.com/i/183438573?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4be66eca-f597-4fdc-8bab-7b78c0177932_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jFA3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4be66eca-f597-4fdc-8bab-7b78c0177932_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!jFA3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4be66eca-f597-4fdc-8bab-7b78c0177932_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!jFA3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4be66eca-f597-4fdc-8bab-7b78c0177932_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!jFA3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4be66eca-f597-4fdc-8bab-7b78c0177932_1024x1024.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>An AI can generate a diagram of a Transformer in seconds. It might even look clean. But it represents a statistical average, a pattern-matched guess assembled from every diagram in its training data. There is no understanding behind it. No filter. No point of view. No personalization.</p><p>When I draw, the resulting image is a map of my understanding, handed to you so you don&#8217;t have to get lost in the weeds.</p><p>The image is not the point. The thinking that produced the image is the point. And that thinking cannot and should not be outsourced.</p><div><hr></div><h2><strong>The Line We Walk</strong></h2><p>I want to be clear about something. I&#8217;m not a Luddite. I&#8217;m not here to tell you AI is dangerous and you should reject it.</p><p>I use AI every day. I work in the field. 
I use it to write code, to debug, to automate tedious tasks, to accelerate research. I have compressed weeks of work into hours using these tools. I love efficiency. I love leverage. I love building things faster than I ever could before.</p><p>But there is a line.</p><p>I do not use AI to verify truth. I do not use it to replace judgment. I do not let it think for me on the things that matter.</p><p>AI is a power tool. And power tools are remarkable for the jobs they are designed for. But you would not use a circular saw to perform surgery. You should not use a language model to decide what is true or what is right.</p><p>This is what I mean by &#8220;the last mile is always human.&#8221; Every AI system, no matter how sophisticated, ends with a human decision. The human who chooses to trust the output or question it. The human who catches the hallucination. The human who knows when to override the recommendation. That human needs to understand what is happening beneath the surface.</p><p>Otherwise, they are not using the tool. The tool is using them.</p><p>Gradient Ascent isn&#8217;t here to help you &#8220;keep up&#8221; with AI news. A thousand newsletters already provide the daily churn of announcements and funding rounds.</p><p>This newsletter is for the thinkers and learners. Whether you lead a company or a team of engineers, or you are building your first agent alone in a Jupyter notebook, Gradient Ascent exists to be there with you on that journey. It exists to help you understand the fundamentals so you can wield these tools with precision, not just hype.</p><p>Transformers. Vision-language models. Agents. RAG. And whatever comes next. 
The whole stack, clarified and illustrated, so you can lead and build on solid ground.</p><p>In practice: I use AI for scaffolding (enumerations, boilerplate, code stubs), I verify truth in primary sources and triangulate across multiple reputable papers or docs, and I reserve human judgment for causality, ethics, and decisions with irreversible consequences.</p><div><hr></div><h2><strong>The Promise and the Ask</strong></h2><p>Here is my commitment to you.</p><p>I will continue to do the work. I will read the papers, not just the abstracts. I will build the systems, not just describe them. I will verify sources and call out hype. I will struggle with the concepts until I can draw them clearly enough that they stick in your (and my) minds.</p><p>I will respect your time. If I do not have something genuinely useful to say, I will not fill your inbox with filler.</p><p>And I will keep climbing. This field moves fast. The terrain shifts constantly. But Gradient Ascent will keep mapping the route, honestly and rigorously, with the reader always in mind.</p><p>In 2026, I will start posting more of the deep visual explainers alongside the Tokenizer. For example, agents in production, VLMs and grounding, and system design for AI products.</p><p>If you value that kind of work, if you want a newsletter that treats you as a builder and not a bystander, I need your help shaping what comes next.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://form.typeform.com/to/qk9cGkK8&quot;,&quot;text&quot;:&quot;Take the Survey (Just 5 Minutes)&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://form.typeform.com/to/qk9cGkK8"><span>Take the Survey (Just 5 Minutes)</span></a></p><p>Your responses will determine the first three explainer topics, and I&#8217;ll publish the roadmap soon.</p><p>Five minutes. Your answers go directly into my planning. 
I will read every one.</p><p>Thank you for being on this climb with me.</p><p>The summit is not the point. The climbing is the point.</p><blockquote><p><em>Life before death. <br>Strength before weakness. <br>Journey before destination.</em></p><p><em>~ Brandon Sanderson</em></p></blockquote><p></p><p>Happy New Year!<br>Sairam</p><p><strong>P.S.: Remember to <a href="https://form.typeform.com/to/qk9cGkK8">grab your FREE resource pack</a></strong></p>]]></content:encoded></item><item><title><![CDATA[Andrew Ng on Navigating AI Careers, DeepMind CEO on Gemini 3, and Microsoft's Agent Curriculum: The Tokenizer Edition #13]]></title><description><![CDATA[This week's most valuable AI resources]]></description><link>https://newsletter.artofsaience.com/p/andrew-ng-on-navigating-ai-careers</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/andrew-ng-on-navigating-ai-careers</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Sat, 20 Dec 2025 16:40:01 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/AuZoDsNmG_s" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! Happy holiday season to you and yours! Some very interesting resources this week. Andrew Ng shares career guidance for AI practitioners. DeepMind CEO Demis Hassabis unpacks Gemini 3, world models, and breakthroughs from AlphaFold to fusion and beyond. And Microsoft released a curriculum for building AI agents, from foundations to production.</p><h3><strong>New here?</strong></h3><p><em>The Tokenizer is my resource-focused newsletter edition where I curate the best papers, videos, articles, tools, and learning resources from across the AI landscape. 
Consider it your weekly dose of everything you need to stay ahead in machine learning.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.artofsaience.com/subscribe?"><span>Subscribe now</span></a></p><h2><strong>TL;DR</strong></h2><p><strong>What caught my attention this week:</strong></p><p>&#8226; <strong>&#128196; Papers:</strong> GUI agents reaching production-ready performance, explainable AI video detection with grounded reasoning, and unified multimodal video generation</p><p>&#8226; <strong>&#127909; Videos:</strong> DeepMind&#8217;s CEO on the future of intelligence, game physics breakthroughs from Two Minute Papers, Anthropic on AI in education, and Bloomberg&#8217;s real-world AI deployment lessons</p><p>&#8226; <strong>&#128240; Reads:</strong> Practical distributed training from single GPU to clusters, formal verification&#8217;s AI-powered future, and understanding cross-entropy loss fundamentals</p><p>&#8226; <strong>&#128736; Tools:</strong> Microsoft&#8217;s comprehensive agent learning guide and DAIR&#8217;s definitive prompt engineering resource</p><p>&#8226; <strong>&#127891; Learning:</strong> Andrew Ng&#8217;s career advice for AI practitioners looking to navigate the field strategically</p><div><hr></div><p>Thinking about a good holiday gift for your friends or family? Grab my first book &#8212; <strong><a href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/">AI for the Rest of Us</a></strong> &#8212; today!</p><blockquote><p><em><strong>Quick note:</strong> If you find the book useful, please leave a review on Amazon. It makes a world of difference. If you have a picture of the book IRL, please share it with me. 
I really appreciate it.</em></p></blockquote><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/&quot;,&quot;text&quot;:&quot;Buy my book&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/"><span>Buy my book</span></a></p><div><hr></div><h2><strong>&#128196; 5 Papers</strong></h2><h3><strong>Skyra: AI-Generated Video Detection via Grounded Artifact Reasoning</strong></h3><p><strong><a href="https://arxiv.org/abs/2512.15693">https://arxiv.org/abs/2512.15693</a></strong> |<a href="https://github.com/JoeLeelyf/Skyra"> </a><strong><a href="https://github.com/JoeLeelyf/Skyra">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UVlb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3124761e-cc5b-47af-a536-713f0272f9b5_1920x1000.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UVlb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3124761e-cc5b-47af-a536-713f0272f9b5_1920x1000.png 424w, https://substackcdn.com/image/fetch/$s_!UVlb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3124761e-cc5b-47af-a536-713f0272f9b5_1920x1000.png 848w, https://substackcdn.com/image/fetch/$s_!UVlb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3124761e-cc5b-47af-a536-713f0272f9b5_1920x1000.png 1272w, 
https://substackcdn.com/image/fetch/$s_!UVlb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3124761e-cc5b-47af-a536-713f0272f9b5_1920x1000.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UVlb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3124761e-cc5b-47af-a536-713f0272f9b5_1920x1000.png" width="1456" height="758" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3124761e-cc5b-47af-a536-713f0272f9b5_1920x1000.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:758,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UVlb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3124761e-cc5b-47af-a536-713f0272f9b5_1920x1000.png 424w, https://substackcdn.com/image/fetch/$s_!UVlb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3124761e-cc5b-47af-a536-713f0272f9b5_1920x1000.png 848w, https://substackcdn.com/image/fetch/$s_!UVlb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3124761e-cc5b-47af-a536-713f0272f9b5_1920x1000.png 1272w, 
https://substackcdn.com/image/fetch/$s_!UVlb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3124761e-cc5b-47af-a536-713f0272f9b5_1920x1000.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>AI video detection just got explainable. Skyra identifies specific visual artifacts in AI-generated videos and uses them as grounded evidence for both detection and explanation. The model introduces ViF-CoT-4K, the first large-scale dataset of AI-generated video artifacts with fine-grained human annotations. 
Instead of just saying &#8220;this is fake,&#8221; Skyra points to concrete inconsistencies like shape distortions or camera motion problems. The two-stage training strategy enhances spatio-temporal artifact perception while maintaining explainability. Tested against outputs from over ten state-of-the-art generators on the new ViF-Bench benchmark, Skyra outperforms existing methods while actually showing you why it reached its conclusion.</p><h3><strong>Step-GUI Technical Report</strong></h3><p><strong><a href="https://arxiv.org/abs/2512.15431">https://arxiv.org/abs/2512.15431</a></strong> |<a href="https://github.com/stepfun-ai/gelab-zero"> </a><strong><a href="https://github.com/stepfun-ai/gelab-zero">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iSiB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1ed0717-2652-42c8-aff5-45ede0672537_2048x2048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iSiB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1ed0717-2652-42c8-aff5-45ede0672537_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!iSiB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1ed0717-2652-42c8-aff5-45ede0672537_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!iSiB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1ed0717-2652-42c8-aff5-45ede0672537_2048x2048.png 1272w, 
https://substackcdn.com/image/fetch/$s_!iSiB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1ed0717-2652-42c8-aff5-45ede0672537_2048x2048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iSiB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1ed0717-2652-42c8-aff5-45ede0672537_2048x2048.png" width="1456" height="1456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e1ed0717-2652-42c8-aff5-45ede0672537_2048x2048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iSiB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1ed0717-2652-42c8-aff5-45ede0672537_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!iSiB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1ed0717-2652-42c8-aff5-45ede0672537_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!iSiB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1ed0717-2652-42c8-aff5-45ede0672537_2048x2048.png 1272w, 
https://substackcdn.com/image/fetch/$s_!iSiB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe1ed0717-2652-42c8-aff5-45ede0672537_2048x2048.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>GUI automation hit production-grade numbers. Step-GUI&#8217;s 8B model achieves 80.2% on AndroidWorld, 48.5% on OSWorld, and 62.6% on ScreenSpot-Pro through a self-evolving training pipeline that converts model-generated trajectories into reliable training signals. 
The Calibrated Step Reward System achieves over 90% annotation accuracy at 10-100x lower cost than manual annotation. Beyond the models, they introduce GUI-MCP, the first Model Context Protocol for GUI automation, with a hierarchical architecture combining low-level atomic operations and high-level task delegation to local specialist models. The AndroidDaily benchmark tests authentic everyday usage with 3,146 static actions and 235 end-to-end tasks across high-frequency mobile scenarios.</p><h3><strong>Kling-Omni Technical Report</strong></h3><p><strong><a href="https://arxiv.org/abs/2512.16776">https://arxiv.org/abs/2512.16776</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qwnp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09dd5ffd-592d-4fef-9870-f88cd674abf7_832x349.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qwnp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09dd5ffd-592d-4fef-9870-f88cd674abf7_832x349.png 424w, https://substackcdn.com/image/fetch/$s_!qwnp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09dd5ffd-592d-4fef-9870-f88cd674abf7_832x349.png 848w, https://substackcdn.com/image/fetch/$s_!qwnp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09dd5ffd-592d-4fef-9870-f88cd674abf7_832x349.png 1272w, https://substackcdn.com/image/fetch/$s_!qwnp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09dd5ffd-592d-4fef-9870-f88cd674abf7_832x349.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!qwnp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09dd5ffd-592d-4fef-9870-f88cd674abf7_832x349.png" width="832" height="349" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/09dd5ffd-592d-4fef-9870-f88cd674abf7_832x349.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:349,&quot;width&quot;:832,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qwnp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09dd5ffd-592d-4fef-9870-f88cd674abf7_832x349.png 424w, https://substackcdn.com/image/fetch/$s_!qwnp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09dd5ffd-592d-4fef-9870-f88cd674abf7_832x349.png 848w, https://substackcdn.com/image/fetch/$s_!qwnp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09dd5ffd-592d-4fef-9870-f88cd674abf7_832x349.png 1272w, https://substackcdn.com/image/fetch/$s_!qwnp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09dd5ffd-592d-4fef-9870-f88cd674abf7_832x349.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Kling-Omni bridges the gap between diverse video generation tasks into a unified system. The framework handles text-to-video, image-to-video, and video editing through an end-to-end multimodal visual language approach, processing text instructions, reference images, and video contexts into a unified representation. The system delivers cinematic-quality content through efficient large-scale pre-training strategies and infrastructure optimizations. 
Comprehensive evaluations show exceptional in-context generation, reasoning-based editing, and multimodal instruction following capabilities, moving beyond content creation toward multimodal world simulators.</p><h3><strong>Adaptation of Agentic AI</strong></h3><p><strong><a href="https://arxiv.org/abs/2512.16301">https://arxiv.org/abs/2512.16301</a></strong> |<a href="https://github.com/pat-jj/Awesome-Adaptation-of-Agentic-AI"> </a><strong><a href="https://github.com/pat-jj/Awesome-Adaptation-of-Agentic-AI">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!j03O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd7a5f-818f-464f-8fd3-9a7639a42f23_980x365.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!j03O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd7a5f-818f-464f-8fd3-9a7639a42f23_980x365.png 424w, https://substackcdn.com/image/fetch/$s_!j03O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd7a5f-818f-464f-8fd3-9a7639a42f23_980x365.png 848w, https://substackcdn.com/image/fetch/$s_!j03O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd7a5f-818f-464f-8fd3-9a7639a42f23_980x365.png 1272w, https://substackcdn.com/image/fetch/$s_!j03O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd7a5f-818f-464f-8fd3-9a7639a42f23_980x365.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!j03O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd7a5f-818f-464f-8fd3-9a7639a42f23_980x365.png" width="980" height="365" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f0dd7a5f-818f-464f-8fd3-9a7639a42f23_980x365.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:365,&quot;width&quot;:980,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!j03O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd7a5f-818f-464f-8fd3-9a7639a42f23_980x365.png 424w, https://substackcdn.com/image/fetch/$s_!j03O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd7a5f-818f-464f-8fd3-9a7639a42f23_980x365.png 848w, https://substackcdn.com/image/fetch/$s_!j03O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd7a5f-818f-464f-8fd3-9a7639a42f23_980x365.png 1272w, https://substackcdn.com/image/fetch/$s_!j03O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0dd7a5f-818f-464f-8fd3-9a7639a42f23_980x365.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A systematic framework for understanding adaptation in agentic AI systems. The paper unifies the expanding research landscape into agent adaptations (tool-execution-signaled and agent-output-signaled) and tool adaptations (agent-agnostic and agent-supervised). This framework clarifies the design space of adaptation strategies, makes trade-offs explicit, and provides practical guidance for selecting or switching among strategies during system design. 
The comprehensive survey reviews representative approaches in each category, analyzes strengths and limitations, and highlights key open challenges for building more capable and reliable agentic systems.</p><h3><strong>Next-Embedding Prediction Makes Strong Vision Learners</strong></h3><p><strong><a href="https://arxiv.org/abs/2512.16922">https://arxiv.org/abs/2512.16922</a></strong> |<a href="https://github.com/SihanXU/nepa"> </a><strong><a href="https://github.com/SihanXU/nepa">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fdd9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4d45853-1e04-400c-b85f-71ef962a195a_1931x2048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fdd9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4d45853-1e04-400c-b85f-71ef962a195a_1931x2048.png 424w, https://substackcdn.com/image/fetch/$s_!fdd9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4d45853-1e04-400c-b85f-71ef962a195a_1931x2048.png 848w, https://substackcdn.com/image/fetch/$s_!fdd9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4d45853-1e04-400c-b85f-71ef962a195a_1931x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!fdd9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4d45853-1e04-400c-b85f-71ef962a195a_1931x2048.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!fdd9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4d45853-1e04-400c-b85f-71ef962a195a_1931x2048.png" width="1456" height="1544" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d4d45853-1e04-400c-b85f-71ef962a195a_1931x2048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1544,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fdd9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4d45853-1e04-400c-b85f-71ef962a195a_1931x2048.png 424w, https://substackcdn.com/image/fetch/$s_!fdd9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4d45853-1e04-400c-b85f-71ef962a195a_1931x2048.png 848w, https://substackcdn.com/image/fetch/$s_!fdd9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4d45853-1e04-400c-b85f-71ef962a195a_1931x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!fdd9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4d45853-1e04-400c-b85f-71ef962a195a_1931x2048.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Generative pretraining comes to computer vision through next-embedding prediction. Instead of training models to output features, NEPA trains them to generate embeddings for predictive tasks directly. Models learn to predict future patch embeddings conditioned on past ones using causal masking and stop gradient. A simple Transformer pretrained on ImageNet-1k with next embedding prediction as its sole objective achieves 83.8% and 85.3% top-1 accuracy with ViT-B and ViT-L backbones after fine-tuning. No pixel reconstruction, discrete tokens, contrastive loss, or task-specific heads required. 
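</p><p><em>To make the objective above concrete, here is a hypothetical PyTorch sketch, not the paper&#8217;s code: a causally masked Transformer reads patch embeddings 0..t and regresses the embedding at t+1 against a stop-gradient target. The class name, hyperparameters, and the choice of MSE as the regression loss are illustrative assumptions.</em></p>

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sketch of next-embedding prediction (NEPA-style).
# A causal Transformer over patch embeddings predicts each next
# embedding; targets are detached (stop-gradient).
class NextEmbeddingPredictor(nn.Module):
    def __init__(self, dim=256, depth=4, heads=8, max_patches=196):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            dim, heads, dim_feedforward=4 * dim, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.pos = nn.Parameter(torch.zeros(1, max_patches, dim))

    def forward(self, patch_emb):  # patch_emb: (B, N, dim)
        n = patch_emb.size(1)
        causal = nn.Transformer.generate_square_subsequent_mask(n)
        h = self.encoder(patch_emb + self.pos[:, :n], mask=causal)
        pred = h[:, :-1]                    # predictions for patches 1..N-1
        target = patch_emb[:, 1:].detach()  # stop-gradient on targets
        return F.mse_loss(pred, target)

model = NextEmbeddingPredictor()
loss = model(torch.randn(2, 196, 256))  # one pretraining step's loss
```

<p><em>The sketch covers only the pretraining loss; fine-tuning a backbone pretrained this way is what yields the reported top-1 numbers.</em></p><p>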
The approach transfers effectively to semantic segmentation on ADE20K, suggesting generative pretraining from embeddings provides a simpler, scalable alternative to current visual self-supervised learning.</p><div><hr></div><h2><strong>&#127909; 4 Videos</strong></h2><h3><strong>The future of intelligence | Demis Hassabis (Co-founder and CEO of DeepMind)</strong></h3><div id="youtube2-PqVbypvxDto" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;PqVbypvxDto&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/PqVbypvxDto?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>DeepMind&#8217;s CEO discusses the current state and future directions of artificial intelligence in this 50-minute conversation. Demis Hassabis covers rapid advancements including Gemini 3 and world model development, progress on fundamental problems like AlphaFold, and ongoing efforts in material science, fusion, and quantum computing. The discussion explores AI&#8217;s paradoxical capabilities, its potential impact on mathematics, and balancing scientific research with commercial product development. 
Hassabis shares insights on scaling AI, addressing challenges like hallucinations, and the significance of simulated worlds for robotics and scientific discovery.</p><h3><strong>Game Physics Just Jumped A Generation</strong></h3><div id="youtube2-oToAGiozQF8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;oToAGiozQF8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/oToAGiozQF8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>K&#225;roly Zsolnai-Feh&#233;r from Two Minute Papers covers a new &#8220;Domain Decomposition&#8221; method that enables real-time simulation of hyper-complex materials. The technique allows for the interaction of tens of thousands of vertices (think intricate cloth tearing or squishy &#8220;gummy&#8221; soft bodies) at interactive framerates. This represents a massive leap over previous standard solvers, which would struggle to render these interactions in real-time without exploding computationally. 
It&#8217;s a purely algorithmic breakthrough in physics simulation, achieving offline film-quality results in live environments without using AI approximations.</p><h3><strong>What does AI mean for education?</strong></h3><div id="youtube2-Uh98_aGhAuY" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;Uh98_aGhAuY&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/Uh98_aGhAuY?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Anthropic examines AI&#8217;s implications for educational systems and learning approaches. The discussion explores how AI capabilities are reshaping traditional educational paradigms, the opportunities for personalized learning, and considerations for implementing AI in educational contexts while maintaining pedagogical principles. The video provides frameworks for educators and institutions thinking about AI integration in learning environments.</p><h3><strong>What We Learned Deploying AI within Bloomberg&#8217;s Engineering Organization</strong></h3><div id="youtube2-Q81AzlA-VE8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;Q81AzlA-VE8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/Q81AzlA-VE8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Lei Zhang from Bloomberg shares lessons from deploying AI across their engineering organization. 
The talk covers practical challenges of real-world AI adoption, organizational insights from implementing AI at scale, and what actually works versus what sounds good in theory. Bloomberg&#8217;s experience provides valuable perspectives on moving from AI prototypes to production systems that serve actual business needs across a large technology organization.</p><div><hr></div><h2><strong>&#128240; 3 Curated Reads</strong></h2><h3><strong>From Single GPU to Clusters: A Practical Journey into Distributed Training with PyTorch and Ray</strong></h3><p><strong><a href="https://debnsuma.github.io/my-blog/posts/distributed-training-from-scratch/">https://debnsuma.github.io/my-blog/posts/distributed-training-from-scratch/</a></strong></p><p>A hands-on guide to scaling deep learning from single GPU experiments to multi-node clusters. The post walks through the progression from local training to distributed setups using PyTorch and Ray, covering practical considerations for data parallelism, model parallelism, and orchestrating training across multiple machines. Particularly useful for practitioners hitting GPU memory limits or training time constraints who need to move beyond single-machine setups without getting lost in distributed systems complexity.</p><h3><strong>Prediction: AI will make formal verification go mainstream</strong></h3><p><strong><a href="https://martin.kleppmann.com/2025/12/08/ai-formal-verification.html">https://martin.kleppmann.com/2025/12/08/ai-formal-verification.html</a></strong></p><p>Martin Kleppmann argues that AI will finally bring formal verification into widespread use. The piece explores how large language models could lower the barrier to writing formal specifications and proofs, making techniques that were previously restricted to critical systems accessible for everyday software development. 
The analysis examines the current state of formal verification, why it hasn&#8217;t achieved mainstream adoption despite proven benefits, and how AI assistance could change the economics of using these tools.</p><h3><strong>Cross Entropy Loss</strong></h3><p><strong><a href="https://cgnarendiran.github.io/blog/cross-entropy-loss/">https://cgnarendiran.github.io/blog/cross-entropy-loss/</a></strong></p><p>A clear, intuitive explanation of cross-entropy loss rooted in information theory. The post breaks down the concept of &#8220;surprise&#8221; (self-information) using a simple weather prediction analogy to explain why we penalize confident wrong answers so heavily. It walks through the mathematical connection between Entropy, Cross-Entropy, and KL Divergence, showing exactly how the loss function measures the distance between the predicted probability distribution and the actual ground truth. A great refresher for understanding the &#8220;why&#8221; behind the most common classification loss function.</p><div><hr></div><h2><strong>&#128736; 2 Tools &amp; Repos</strong></h2><h3><strong>AI Agents for Beginners</strong></h3><p><strong><a href="https://github.com/microsoft/ai-agents-for-beginners">https://github.com/microsoft/ai-agents-for-beginners</a></strong></p><p>Microsoft&#8217;s comprehensive introduction to building AI agents. The curriculum covers fundamental concepts, agent architectures, tool integration, and practical implementations. Designed for developers new to agentic AI, the course provides hands-on examples and clear explanations of core concepts needed to build working agents. The structured approach takes you from basic agent patterns to more sophisticated multi-agent systems.</p><h3><strong>Prompt Engineering Guide</strong></h3><p><strong><a href="https://github.com/dair-ai/Prompt-Engineering-Guide">https://github.com/dair-ai/Prompt-Engineering-Guide</a></strong></p><p>DAIR.AI&#8217;s definitive resource for prompt engineering techniques. 
The guide covers fundamental prompting strategies, advanced techniques like chain-of-thought and tree-of-thought reasoning, domain-specific applications, and best practices for different model families. Regularly updated with the latest research and techniques, this remains one of the most comprehensive and well-maintained prompt engineering resources available.</p><div><hr></div><h2><strong>&#127891; 1 Pick of the Week</strong></h2><h3><strong>Career Advice in AI</strong></h3><div id="youtube2-AuZoDsNmG_s" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;AuZoDsNmG_s&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/AuZoDsNmG_s?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Andrew Ng shares strategic career advice for AI practitioners. The discussion covers how to build skills that compound over time, choosing between research and engineering paths, navigating the rapidly evolving AI landscape, and positioning yourself for long-term success in the field. Ng&#8217;s perspective comes from experience building AI teams at Google Brain and Baidu, founding Coursera and DeepLearning.AI, and advising countless practitioners on their AI careers. Particularly valuable for anyone making career decisions in AI or trying to understand where the field is heading.</p><div><hr></div><p><em>Thanks for reading The Tokenizer! If you found something useful here, share it with someone who might benefit. 
And if you want more curated insights like this, consider subscribing to Gradient Ascent.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.artofsaience.com/subscribe?"><span>Subscribe now</span></a></p>]]></content:encoded></item><item><title><![CDATA[OpenAI's Agent RL Secrets, Learn RAG From Scratch, and How 1% Fake Data Replaces 28% of ImageNet: The Tokenizer Edition #12]]></title><description><![CDATA[This week's most valuable AI resources]]></description><link>https://newsletter.artofsaience.com/p/openais-agent-rl-secrets-learn-rag</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/openais-agent-rl-secrets-learn-rag</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Sat, 13 Dec 2025 15:46:42 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!xqG1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb092f967-c224-4936-b6ec-4f24ff9008cb_2048x734.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! Remember when everyone said adding more agents would solve everything? Turns out they make systems 17x worse at propagating errors. Meanwhile, researchers are training vision models on data that literally isn&#8217;t images (and it works better than you&#8217;d think), and Nathan Lambert finally pulled back the curtain on how AI2 actually built a competitive reasoning model. The gap between what works in demos versus production keeps getting more interesting.</p><h3><strong>New here?</strong></h3><p><em>The Tokenizer is my resource-focused newsletter edition where I curate the best papers, videos, articles, tools, and learning resources from across the AI landscape. 
Consider it your weekly dose of everything you need to stay ahead in machine learning.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption"></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h2><strong>TL;DR</strong></h2><p><strong>What caught my attention this week:</strong></p><ul><li><p><strong>&#128196; Papers: </strong>Teacher-free parallel reasoning with 4.6x speedups, self-supervised 3D vision that surpasses supervised baselines, quantitative scaling principles for agent architectures, procedural pretraining that replaces 28% of ImageNet data, and evidence that neural networks converge to universal weight subspaces</p></li><li><p><strong>&#127909; Videos: </strong>Nathan Lambert&#8217;s complete walkthrough of building OLMo 3 Think, OpenAI&#8217;s reinforcement fine-tuning approach for agents, comprehensive RAG from scratch, and an entire JAX AI stack masterclass</p></li><li><p><strong>&#128240; Reads: </strong>Waymo&#8217;s approach to demonstrably safe autonomous driving, continuous batching for LLM serving, and choosing the right GPUs for your AI workloads</p></li><li><p><strong>&#128736; Tools: </strong>Made with ML&#8217;s production-grade machine learning course and a comprehensive AI engineering toolkit</p></li><li><p><strong>&#127891; Learning: </strong>Stanford&#8217;s updated deep reinforcement learning course with fresh 2025 content</p></li></ul><div><hr></div><p>Grab 
my first book &#8212; <strong><a href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/">AI for the Rest of Us</a></strong> &#8212; today!</p><blockquote><p><strong>Quick note:</strong> <em>If you find the book useful, please leave a review on Amazon. It makes a world of difference. If you have a picture of the book IRL, please share it with me. I really appreciate it.</em></p></blockquote><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/&quot;,&quot;text&quot;:&quot;Grab AI for the Rest of Us&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/"><span>Grab AI for the Rest of Us</span></a></p><div><hr></div><h2><strong>&#128196; 5 Papers</strong></h2><h3><strong>Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning</strong></h3><p><strong><a href="https://arxiv.org/abs/2512.07461">https://arxiv.org/abs/2512.07461</a></strong> |<a href="https://github.com/bigai-nlco/Native-Parallel-Reasoner"> </a><strong><a href="https://github.com/bigai-nlco/Native-Parallel-Reasoner">GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HA9r!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b022e-6f39-4a6c-ad14-05f737fa42c3_997x250.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HA9r!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b022e-6f39-4a6c-ad14-05f737fa42c3_997x250.png 424w, 
https://substackcdn.com/image/fetch/$s_!HA9r!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b022e-6f39-4a6c-ad14-05f737fa42c3_997x250.png 848w, https://substackcdn.com/image/fetch/$s_!HA9r!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b022e-6f39-4a6c-ad14-05f737fa42c3_997x250.png 1272w, https://substackcdn.com/image/fetch/$s_!HA9r!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b022e-6f39-4a6c-ad14-05f737fa42c3_997x250.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HA9r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b022e-6f39-4a6c-ad14-05f737fa42c3_997x250.png" width="997" height="250" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/139b022e-6f39-4a6c-ad14-05f737fa42c3_997x250.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:250,&quot;width&quot;:997,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HA9r!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b022e-6f39-4a6c-ad14-05f737fa42c3_997x250.png 424w, 
https://substackcdn.com/image/fetch/$s_!HA9r!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b022e-6f39-4a6c-ad14-05f737fa42c3_997x250.png 848w, https://substackcdn.com/image/fetch/$s_!HA9r!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b022e-6f39-4a6c-ad14-05f737fa42c3_997x250.png 1272w, https://substackcdn.com/image/fetch/$s_!HA9r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F139b022e-6f39-4a6c-ad14-05f737fa42c3_997x250.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Models that reason sequentially hit performance ceilings because early mistakes lock them into suboptimal paths. Native Parallel Reasoner teaches Qwen3-4B to explore multiple reasoning branches simultaneously without relying on teacher models. The system achieves performance gains up to 24.5% and inference speedups up to 4.6x across eight reasoning benchmarks through a self-distilled training pipeline that learns adaptive decomposition directly from experience. Unlike previous approaches that fall back to sequential generation when things get complex, NPR demonstrates 100% genuine parallel execution, establishing that models can learn to think in parallel rather than just simulating it.</p><h3><strong>E-RayZer: Self-supervised 3D Reconstruction as Spatial Visual Pre-training</strong></h3><p><strong><a href="https://arxiv.org/abs/2512.10950">https://arxiv.org/abs/2512.10950</a></strong> |<a href="https://github.com/QitaoZhao/E-RayZer"> </a><strong><a href="https://github.com/QitaoZhao/E-RayZer">GitHub</a></strong></p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;e79f1bcb-d2b7-4374-8afc-edd0b75eacdf&quot;,&quot;duration&quot;:null}"></div><p>Self-supervised learning transformed language models and 2D vision, but 3D understanding from multi-view images remained largely supervised. E-RayZer changes this by learning truly 3D-aware representations from unlabeled images, operating directly in 3D space with explicit Gaussian splatting rather than inferring 3D indirectly through view synthesis. The approach matches or surpasses fully supervised reconstruction models while significantly outperforming leading visual pre-training methods like DINOv3 and VideoMAE V2 on 3D downstream tasks. 
The key breakthrough is a fine-grained learning curriculum that organizes training from high-overlap views to general 3D understanding, solving the convergence challenges that plagued previous explicit 3D approaches.</p><h3><strong>Towards a Science of Scaling Agent Systems</strong></h3><p><strong><a href="https://arxiv.org/abs/2512.08296">https://arxiv.org/abs/2512.08296</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uEAr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff026e046-a198-4984-b9ce-ca96b6974599_1106x415.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uEAr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff026e046-a198-4984-b9ce-ca96b6974599_1106x415.png 424w, https://substackcdn.com/image/fetch/$s_!uEAr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff026e046-a198-4984-b9ce-ca96b6974599_1106x415.png 848w, https://substackcdn.com/image/fetch/$s_!uEAr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff026e046-a198-4984-b9ce-ca96b6974599_1106x415.png 1272w, https://substackcdn.com/image/fetch/$s_!uEAr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff026e046-a198-4984-b9ce-ca96b6974599_1106x415.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uEAr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff026e046-a198-4984-b9ce-ca96b6974599_1106x415.png" width="1106" 
height="415" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f026e046-a198-4984-b9ce-ca96b6974599_1106x415.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:415,&quot;width&quot;:1106,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uEAr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff026e046-a198-4984-b9ce-ca96b6974599_1106x415.png 424w, https://substackcdn.com/image/fetch/$s_!uEAr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff026e046-a198-4984-b9ce-ca96b6974599_1106x415.png 848w, https://substackcdn.com/image/fetch/$s_!uEAr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff026e046-a198-4984-b9ce-ca96b6974599_1106x415.png 1272w, https://substackcdn.com/image/fetch/$s_!uEAr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff026e046-a198-4984-b9ce-ca96b6974599_1106x415.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Adding more agents to a system often makes it worse, not better. This research derives quantitative scaling principles by evaluating 180 configurations across four benchmarks using five canonical architectures. The findings are sobering: independent agents amplify errors 17.2x through unchecked propagation, tool-heavy tasks suffer disproportionately from multi-agent overhead, and coordination yields diminishing returns once single-agent baselines exceed 45% success rates. The derived predictive model correctly identifies optimal coordination strategies for 87% of held-out configurations, turning agent architecture from guesswork into engineering. The lesson is clear: measure coordination efficiency, overhead, error amplification, and redundancy rather than assuming more agents equals better performance.</p><h3><strong>Can You Learn to See Without Images? 
Procedural Warm-Up for Vision Transformers</strong></h3><p><strong><a href="https://arxiv.org/abs/2511.13945">https://arxiv.org/abs/2511.13945</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Z9OA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1afa9732-dcd7-4dc4-bc96-2b1bed20cc9c_817x326.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Z9OA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1afa9732-dcd7-4dc4-bc96-2b1bed20cc9c_817x326.png 424w, https://substackcdn.com/image/fetch/$s_!Z9OA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1afa9732-dcd7-4dc4-bc96-2b1bed20cc9c_817x326.png 848w, https://substackcdn.com/image/fetch/$s_!Z9OA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1afa9732-dcd7-4dc4-bc96-2b1bed20cc9c_817x326.png 1272w, https://substackcdn.com/image/fetch/$s_!Z9OA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1afa9732-dcd7-4dc4-bc96-2b1bed20cc9c_817x326.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Z9OA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1afa9732-dcd7-4dc4-bc96-2b1bed20cc9c_817x326.png" width="817" height="326" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1afa9732-dcd7-4dc4-bc96-2b1bed20cc9c_817x326.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:326,&quot;width&quot;:817,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Z9OA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1afa9732-dcd7-4dc4-bc96-2b1bed20cc9c_817x326.png 424w, https://substackcdn.com/image/fetch/$s_!Z9OA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1afa9732-dcd7-4dc4-bc96-2b1bed20cc9c_817x326.png 848w, https://substackcdn.com/image/fetch/$s_!Z9OA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1afa9732-dcd7-4dc4-bc96-2b1bed20cc9c_817x326.png 1272w, https://substackcdn.com/image/fetch/$s_!Z9OA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1afa9732-dcd7-4dc4-bc96-2b1bed20cc9c_817x326.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Vision transformers can learn useful representations from data that isn&#8217;t visual at all. This research generates procedural data using formal grammars with zero semantic content, then uses it to pretrain vision transformers before standard image training. Allocating just 1% of the training budget to this procedural warm-up improves ImageNet-1k accuracy by 1.7%, equivalent to replacing 28% of the actual image data. The benefits arise from structured dependencies that help transformers internalize abstract computational priors rather than visual patterns. 
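</p><p><em>To make the setup concrete, here is a toy sketch of what &#8220;procedural&#8221; pretraining data in this spirit can look like: token sequences sampled from a small formal grammar, so they carry structured, nested dependencies but no visual or semantic content. The grammar below is a hypothetical stand-in, not the one used in the paper.</em></p><pre>

```python
import random

# Toy context-free grammar: a hypothetical stand-in for the paper's
# procedural generators. Sequences have nested, long-range structure
# (balanced parentheses) but zero semantic or visual content.
RULES = {
    "S": [["A", "B"], ["B", "A", "S"]],
    "A": [["a"], ["a", "A"]],
    "B": [["b"], ["(", "S", ")"]],
}

def expand(symbol, rng, depth=0, max_depth=8):
    """Recursively expand a non-terminal into a flat token sequence."""
    if symbol not in RULES:
        return [symbol]  # terminal token
    # Past the depth cap, always take the first (shortest) production
    # so generation is guaranteed to terminate.
    options = RULES[symbol]
    production = options[0] if depth >= max_depth else rng.choice(options)
    out = []
    for sym in production:
        out.extend(expand(sym, rng, depth + 1, max_depth))
    return out

def procedural_batch(n, seed=0):
    rng = random.Random(seed)
    return [expand("S", rng) for _ in range(n)]

batch = procedural_batch(4)
for seq in batch:
    print("".join(seq))
```

</pre><p><em>A warm-up phase would pretrain the transformer on batches like these before switching to real images; the claim is that the structure, not the content, is what primes the attention and MLP layers.</em></p><p>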
Unlike existing structured initializations that only affect attention weights, procedural warm-up acts on both attention and MLP layers, primarily in later layers where standard visual pretraining typically has minimal impact.</p><h3><strong>The Universal Weight Subspace Hypothesis</strong></h3><p><strong><a href="https://arxiv.org/abs/2512.05117">https://arxiv.org/abs/2512.05117</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JJ8E!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14dfc601-3a67-4050-95bd-e981b03bca92_1521x868.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JJ8E!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14dfc601-3a67-4050-95bd-e981b03bca92_1521x868.png 424w, https://substackcdn.com/image/fetch/$s_!JJ8E!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14dfc601-3a67-4050-95bd-e981b03bca92_1521x868.png 848w, https://substackcdn.com/image/fetch/$s_!JJ8E!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14dfc601-3a67-4050-95bd-e981b03bca92_1521x868.png 1272w, https://substackcdn.com/image/fetch/$s_!JJ8E!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14dfc601-3a67-4050-95bd-e981b03bca92_1521x868.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JJ8E!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14dfc601-3a67-4050-95bd-e981b03bca92_1521x868.png" width="1456" 
height="831" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/14dfc601-3a67-4050-95bd-e981b03bca92_1521x868.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:831,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JJ8E!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14dfc601-3a67-4050-95bd-e981b03bca92_1521x868.png 424w, https://substackcdn.com/image/fetch/$s_!JJ8E!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14dfc601-3a67-4050-95bd-e981b03bca92_1521x868.png 848w, https://substackcdn.com/image/fetch/$s_!JJ8E!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14dfc601-3a67-4050-95bd-e981b03bca92_1521x868.png 1272w, https://substackcdn.com/image/fetch/$s_!JJ8E!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14dfc601-3a67-4050-95bd-e981b03bca92_1521x868.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Neural networks trained on completely different tasks converge to remarkably similar low-dimensional parameter subspaces. Analysis of over 1,100 models, including 500 Mistral-7B LoRAs, 500 Vision Transformers, and 50 LLaMA-8B models, reveals that the majority of variance concentrates in just a few principal directions regardless of initialization, task, or domain. This finding explains why parameter-efficient fine-tuning works so well and enables model compression of up to 100x. New tasks can be learned by optimizing scalar coefficients in the universal subspace rather than full weight matrices, suggesting that the vast majority of parameters in fine-tuned models are redundant.
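</p><p>A toy numpy sketch makes the subspace idea concrete (the synthetic &#8220;model zoo,&#8221; dimensions, and variable names here are mine, not the paper&#8217;s setup): recover a shared low-dimensional basis from many weight vectors, then fit a new task with just a handful of scalar coefficients.</p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a model zoo: 200 flattened weight vectors that
# secretly share a 5-dimensional structure, plus small noise.
k, dim, n_models = 5, 400, 200
basis = np.linalg.qr(rng.normal(size=(dim, k)))[0]  # shared orthonormal directions
weights = rng.normal(size=(n_models, k)) @ basis.T + 0.01 * rng.normal(size=(n_models, dim))

# Recover the principal subspace of the zoo via SVD.
_, s, vt = np.linalg.svd(weights - weights.mean(axis=0), full_matrices=False)
explained = (s[:k] ** 2).sum() / (s ** 2).sum()
print(f"variance captured by top {k} directions: {explained:.3f}")

# "Learn" a new task by fitting only k scalar coefficients
# in the recovered subspace instead of all `dim` weights.
subspace = vt[:k]                        # (k, dim)
new_task = rng.normal(size=k) @ basis.T  # unseen weights with the same structure
coeffs = subspace @ new_task             # k scalars
recon = coeffs @ subspace
err = np.linalg.norm(new_task - recon) / np.linalg.norm(new_task)
print(f"relative reconstruction error with {k} coefficients: {err:.3f}")
```

<p>With noise this small, the top five directions capture nearly all the variance and five scalars reconstruct the held-out weights almost exactly, which is the compression argument in miniature.</p><p>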
The implications for model reusability, multi-task learning, and training efficiency could fundamentally change how we approach large-scale neural networks.</p><div><hr></div><h2><strong>&#127909; 4 Videos</strong></h2><h3><strong>How We Built a Leading Reasoning Model (OLMo 3)</strong></h3><div id="youtube2-uaZ3yRdYg8A" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;uaZ3yRdYg8A&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/uaZ3yRdYg8A?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Nathan Lambert walks through every stage of building OLMo 3 Think, from pretraining through reinforcement learning infrastructure and evaluation. This isn&#8217;t a high-level overview but a detailed technical breakdown of the decisions, trade-offs, and engineering challenges involved in creating competitive reasoning models. The talk focuses heavily on RL infrastructure and evaluating reasoning capabilities, providing insights into how leading labs actually iterate on model training versus what appears in published papers. 
If you want to understand the gap between research papers and production reasoning models, this comprehensive walkthrough fills that void.</p><h3><strong>Agent Reinforcement Fine Tuning</strong></h3><div id="youtube2-p1CmPZ2j6Lk" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;p1CmPZ2j6Lk&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/p1CmPZ2j6Lk?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>OpenAI&#8217;s approach to reinforcement fine-tuning for code models, explained in depth. The video covers how to structure RL training for agent behaviors, balancing exploration with task completion, and the specific challenges that arise when models need to use tools and environments rather than just generate text. Particularly valuable for understanding how frontier labs think about agent training beyond standard language modeling objectives.</p><h3><strong>Learn RAG From Scratch</strong></h3><div id="youtube2-sVcwVQRHIc8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;sVcwVQRHIc8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/sVcwVQRHIc8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>A LangChain software engineer teaches you to implement RAG from scratch using Python. 
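</p><p>The loop the course builds up can be sketched end to end in plain Python; the embedding below is a deliberately crude bag-of-words stand-in for a real embedding model, and the final LLM call is left as a prompt string:</p>

```python
import math
import re
from collections import Counter

def chunk(text, size=40):
    """Split a document into overlapping word windows."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size // 2)]

def embed(text):
    """Toy embedding: bag-of-words counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "Continuous batching schedules LLM requests at the token level.",
    "Chunking splits long documents so each piece fits the context window.",
    "Cosine similarity compares embedding vectors for retrieval.",
]
index = [(c, embed(c)) for d in docs for c in chunk(d)]

def retrieve(query, k=2):
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

context = retrieve("how does retrieval compare vectors?")
prompt = "Answer using this context:\n" + "\n".join(context)
print(prompt)
```

<p>Swap in a real embedding model and an LLM call and this is the skeleton of every RAG system.</p><p>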
This isn&#8217;t about using abstractions but understanding the mechanics: how retrieval actually works, why embeddings matter, when to chunk documents, and how to combine retrieved context with LLM generation. The course covers practical patterns you&#8217;ll need when building production systems that augment language models with custom knowledge.</p><h3><strong>Mastering the JAX AI Stack</strong></h3><p><strong><a href="https://youtube.com/playlist?list=PLOU2XLYxmsIJBcjiFi8LdyY5YGR8sz0ZZ">https://youtube.com/playlist?list=PLOU2XLYxmsIJBcjiFi8LdyY5YGR8sz0ZZ</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-YcE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b380b8-0148-4e1c-ae6a-2437c80a1825_971x534.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-YcE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b380b8-0148-4e1c-ae6a-2437c80a1825_971x534.png 424w, https://substackcdn.com/image/fetch/$s_!-YcE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b380b8-0148-4e1c-ae6a-2437c80a1825_971x534.png 848w, https://substackcdn.com/image/fetch/$s_!-YcE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b380b8-0148-4e1c-ae6a-2437c80a1825_971x534.png 1272w, https://substackcdn.com/image/fetch/$s_!-YcE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b380b8-0148-4e1c-ae6a-2437c80a1825_971x534.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!-YcE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b380b8-0148-4e1c-ae6a-2437c80a1825_971x534.png" width="971" height="534" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/55b380b8-0148-4e1c-ae6a-2437c80a1825_971x534.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:534,&quot;width&quot;:971,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-YcE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b380b8-0148-4e1c-ae6a-2437c80a1825_971x534.png 424w, https://substackcdn.com/image/fetch/$s_!-YcE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b380b8-0148-4e1c-ae6a-2437c80a1825_971x534.png 848w, https://substackcdn.com/image/fetch/$s_!-YcE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b380b8-0148-4e1c-ae6a-2437c80a1825_971x534.png 1272w, https://substackcdn.com/image/fetch/$s_!-YcE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55b380b8-0148-4e1c-ae6a-2437c80a1825_971x534.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Comprehensive guide to the JAX ecosystem centered on the Flax NNX library. The playlist takes you through JAX&#8217;s functional transformations, debugging techniques, scaling across distributed hardware with SPMD, optimization with Optax, checkpointing with Orbax, efficient data loading with Grain, and model serving with vLLM.
Designed to bridge the gap for those familiar with PyTorch and NumPy, this provides everything you need to tackle advanced computational challenges with JAX at scale.</p><div><hr></div><h2><strong>&#128240; 3 Curated Reads</strong></h2><h3><strong>Demonstrably Safe AI For Autonomous Driving</strong></h3><p><strong><a href="https://waymo.com/blog/2025/12/demonstrably-safe-ai-for-autonomous-driving">https://waymo.com/blog/2025/12/demonstrably-safe-ai-for-autonomous-driving</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jmF7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bef5ef0-4eeb-4909-b2f1-3260af850b1e_1440x810.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jmF7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bef5ef0-4eeb-4909-b2f1-3260af850b1e_1440x810.png 424w, https://substackcdn.com/image/fetch/$s_!jmF7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bef5ef0-4eeb-4909-b2f1-3260af850b1e_1440x810.png 848w, https://substackcdn.com/image/fetch/$s_!jmF7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bef5ef0-4eeb-4909-b2f1-3260af850b1e_1440x810.png 1272w, https://substackcdn.com/image/fetch/$s_!jmF7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bef5ef0-4eeb-4909-b2f1-3260af850b1e_1440x810.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!jmF7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bef5ef0-4eeb-4909-b2f1-3260af850b1e_1440x810.png" width="1440" height="810" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8bef5ef0-4eeb-4909-b2f1-3260af850b1e_1440x810.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:810,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jmF7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bef5ef0-4eeb-4909-b2f1-3260af850b1e_1440x810.png 424w, https://substackcdn.com/image/fetch/$s_!jmF7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bef5ef0-4eeb-4909-b2f1-3260af850b1e_1440x810.png 848w, https://substackcdn.com/image/fetch/$s_!jmF7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bef5ef0-4eeb-4909-b2f1-3260af850b1e_1440x810.png 1272w, https://substackcdn.com/image/fetch/$s_!jmF7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bef5ef0-4eeb-4909-b2f1-3260af850b1e_1440x810.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Waymo explains their approach to building AI systems that can be verified safe rather than just empirically tested. The piece covers how they structure safety cases, validate behavior in edge cases, and maintain safety guarantees as models evolve.
Particularly relevant as AI systems take on higher-stakes applications where failures have real consequences beyond incorrect text generation.</p><h3><strong>Continuous Batching</strong></h3><p><strong><a href="https://huggingface.co/blog/continuous_batching">https://huggingface.co/blog/continuous_batching</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xqG1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb092f967-c224-4936-b6ec-4f24ff9008cb_2048x734.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xqG1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb092f967-c224-4936-b6ec-4f24ff9008cb_2048x734.png 424w, https://substackcdn.com/image/fetch/$s_!xqG1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb092f967-c224-4936-b6ec-4f24ff9008cb_2048x734.png 848w, https://substackcdn.com/image/fetch/$s_!xqG1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb092f967-c224-4936-b6ec-4f24ff9008cb_2048x734.png 1272w, https://substackcdn.com/image/fetch/$s_!xqG1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb092f967-c224-4936-b6ec-4f24ff9008cb_2048x734.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xqG1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb092f967-c224-4936-b6ec-4f24ff9008cb_2048x734.png" width="1456" height="522" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b092f967-c224-4936-b6ec-4f24ff9008cb_2048x734.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:522,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xqG1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb092f967-c224-4936-b6ec-4f24ff9008cb_2048x734.png 424w, https://substackcdn.com/image/fetch/$s_!xqG1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb092f967-c224-4936-b6ec-4f24ff9008cb_2048x734.png 848w, https://substackcdn.com/image/fetch/$s_!xqG1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb092f967-c224-4936-b6ec-4f24ff9008cb_2048x734.png 1272w, https://substackcdn.com/image/fetch/$s_!xqG1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb092f967-c224-4936-b6ec-4f24ff9008cb_2048x734.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Traditional batching waits for all sequences in a batch to finish before processing new requests. Continuous batching schedules sequences at the token level, admitting new requests as soon as any running sequence completes. This dramatically improves throughput and reduces latency for LLM serving.
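</p><p>The scheduling difference is easy to see with a toy simulator (request lengths and capacity below are made up, and real servers also juggle prefill and KV-cache memory, which this ignores):</p>

```python
# Toy comparison: static batching drains the whole batch before admitting
# new work; continuous batching refills a freed slot on the next decode step.
lengths = [3, 9, 2, 8, 4, 7, 1, 6]  # decode steps each request needs (made up)
capacity = 2                         # concurrent sequences the server can run

def static_steps(lengths, capacity):
    steps = 0
    for i in range(0, len(lengths), capacity):
        steps += max(lengths[i:i + capacity])  # batch lasts as long as its slowest member
    return steps

def continuous_steps(lengths, capacity):
    pending, running, steps = list(lengths), [], 0
    while pending or running:
        while pending and len(running) < capacity:  # admit work into free slots
            running.append(pending.pop(0))
        running = [r - 1 for r in running if r > 1]  # one decode step for every sequence
        steps += 1
    return steps

print(f"static: {static_steps(lengths, capacity)} steps, "
      f"continuous: {continuous_steps(lengths, capacity)} steps")
```

<p>With these numbers, static batching needs 30 decode steps while continuous batching finishes the same work in 20, the theoretical minimum for two slots.</p><p>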
HuggingFace&#8217;s breakdown explains why this matters for production deployments and how to implement it effectively with modern serving frameworks.</p><h3><strong>An Engineer&#8217;s Guide to Choosing GPUs</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:180877610,&quot;url&quot;:&quot;https://multimodalai.substack.com/p/an-ai-engineers-guide-to-choosing&quot;,&quot;publication_id&quot;:2799726,&quot;publication_name&quot;:&quot;Neural Bits&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!PLDd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4f14526-b620-40f9-9baf-78669ebe4997_1280x1280.png&quot;,&quot;title&quot;:&quot;An AI Engineer's Guide To Choosing GPUs&quot;,&quot;truncated_body_text&quot;:null,&quot;date&quot;:&quot;2025-12-07T14:02:40.493Z&quot;,&quot;like_count&quot;:18,&quot;comment_count&quot;:0,&quot;bylines&quot;:[{&quot;id&quot;:102147316,&quot;name&quot;:&quot;Alex Razvant&quot;,&quot;handle&quot;:&quot;arazvant&quot;,&quot;previous_name&quot;:&quot;Razvant Alexandru&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e98b89ac-97e9-4875-88b6-2a5039668cb2_1700x1700.png&quot;,&quot;bio&quot;:&quot;Senior AI Engineer | I work on large-scale Vision AI &amp; MLOps | I share practical industry insights for AI/ML Engineers, on building production-ready AI Systems.&quot;,&quot;profile_set_up_at&quot;:&quot;2023-02-27T08:06:55.191Z&quot;,&quot;reader_installed_at&quot;:&quot;2023-12-02T10:53:44.106Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:2843638,&quot;user_id&quot;:102147316,&quot;publication_id&quot;:2799726,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:2799726,&quot;name&quot;:&quot;Neural 
Bits&quot;,&quot;subdomain&quot;:&quot;multimodalai&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Neural Bits specializes in AI/ML Engineering, sharing expert insights and hands-on advice for active and aspiring AI Builders.\n\nJoin thousands of other engineers and learn how to build complex AI Systems, end-to-end.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f4f14526-b620-40f9-9baf-78669ebe4997_1280x1280.png&quot;,&quot;author_id&quot;:102147316,&quot;primary_user_id&quot;:102147316,&quot;theme_var_background_pop&quot;:&quot;#FF9900&quot;,&quot;created_at&quot;:&quot;2024-07-17T16:04:35.189Z&quot;,&quot;email_from_name&quot;:&quot;Alex Razvant @ Neural Bits&quot;,&quot;copyright&quot;:&quot;Alex Razvant&quot;,&quot;founding_plan_name&quot;:&quot;Founder&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:null,&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://multimodalai.substack.com/p/an-ai-engineers-guide-to-choosing?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" 
src="https://substackcdn.com/image/fetch/$s_!PLDd!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4f14526-b620-40f9-9baf-78669ebe4997_1280x1280.png" loading="lazy"><span class="embedded-post-publication-name">Neural Bits</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">An AI Engineer's Guide To Choosing GPUs</div></div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">5 months ago &#183; 18 likes &#183; Alex Razvant</div></a></div><p><span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Alex Razvant&quot;,&quot;id&quot;:102147316,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e98b89ac-97e9-4875-88b6-2a5039668cb2_1700x1700.png&quot;,&quot;uuid&quot;:&quot;581738f7-b291-4dd2-937a-5985f78da8c0&quot;}" data-component-name="MentionToDOM">Alex Razvant</span> provides practical guidance on selecting GPUs for AI workloads. The analysis goes beyond spec sheets to explain how memory bandwidth, compute capabilities, and interconnect topology actually affect training and inference performance.
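</p><p>One back-of-the-envelope check in that spirit is normalizing rental price and throughput to dollars per million generated tokens (all numbers below are illustrative, not from the article):</p>

```python
def dollars_per_million_tokens(tokens_per_second, dollars_per_hour):
    """Normalize a GPU's price/performance to $ per 1M generated tokens."""
    tokens_per_hour = tokens_per_second * 3600
    return dollars_per_hour / tokens_per_hour * 1_000_000

# Hypothetical numbers for two rental options.
consumer = dollars_per_million_tokens(tokens_per_second=900, dollars_per_hour=0.40)
datacenter = dollars_per_million_tokens(tokens_per_second=2800, dollars_per_hour=1.90)
print(f"consumer:   ${consumer:.3f}/M tokens")
print(f"datacenter: ${datacenter:.3f}/M tokens")
```

<p>The raw spec sheet rarely tells you this ratio directly, which is why measuring your own workload&#8217;s tokens per second matters.</p><p>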
Covers trade-offs between consumer and data center GPUs, when to prioritize memory over compute, and how to evaluate cost-effectiveness for specific use cases.</p><div><hr></div><h2><strong>&#128736; 2 Tools &amp; Repos</strong></h2><h3><strong>Made with ML</strong></h3><p><strong><a href="https://github.com/GokuMohandas/Made-With-ML">https://github.com/GokuMohandas/Made-With-ML</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lpMl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d989cc6-1963-4b67-86cd-637b6c558583_2048x1369.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lpMl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d989cc6-1963-4b67-86cd-637b6c558583_2048x1369.png 424w, https://substackcdn.com/image/fetch/$s_!lpMl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d989cc6-1963-4b67-86cd-637b6c558583_2048x1369.png 848w, https://substackcdn.com/image/fetch/$s_!lpMl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d989cc6-1963-4b67-86cd-637b6c558583_2048x1369.png 1272w, https://substackcdn.com/image/fetch/$s_!lpMl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d989cc6-1963-4b67-86cd-637b6c558583_2048x1369.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lpMl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d989cc6-1963-4b67-86cd-637b6c558583_2048x1369.png" 
width="1456" height="973" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5d989cc6-1963-4b67-86cd-637b6c558583_2048x1369.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:973,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lpMl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d989cc6-1963-4b67-86cd-637b6c558583_2048x1369.png 424w, https://substackcdn.com/image/fetch/$s_!lpMl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d989cc6-1963-4b67-86cd-637b6c558583_2048x1369.png 848w, https://substackcdn.com/image/fetch/$s_!lpMl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d989cc6-1963-4b67-86cd-637b6c558583_2048x1369.png 1272w, https://substackcdn.com/image/fetch/$s_!lpMl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d989cc6-1963-4b67-86cd-637b6c558583_2048x1369.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Goku Mohandas built a complete production machine learning course covering everything from experiment tracking and data versioning through deployment and monitoring. The repository provides working code for MLOps patterns you&#8217;ll actually use: CI/CD for models, feature stores, model registries, and production monitoring.
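</p><p>One of those patterns, a model registry, boils down to a small amount of bookkeeping. Here is a file-free sketch (the class, stage names, and artifact paths are mine for illustration, not the course&#8217;s API):</p>

```python
from dataclasses import dataclass, field

@dataclass
class Registry:
    """Minimal in-memory model registry: versioned entries with a lifecycle stage."""
    models: dict = field(default_factory=dict)

    def register(self, name, artifact, metrics):
        versions = self.models.setdefault(name, [])
        versions.append({"version": len(versions) + 1, "artifact": artifact,
                         "metrics": metrics, "stage": "staging"})
        return versions[-1]["version"]

    def promote(self, name, version):
        # Only one version serves production at a time; the rest are archived.
        for entry in self.models[name]:
            entry["stage"] = "production" if entry["version"] == version else "archived"

    def production(self, name):
        return next(e for e in self.models[name] if e["stage"] == "production")

reg = Registry()
reg.register("churn", artifact="s3://models/churn/v1", metrics={"auc": 0.81})
v2 = reg.register("churn", artifact="s3://models/churn/v2", metrics={"auc": 0.84})
reg.promote("churn", v2)
print(reg.production("churn")["artifact"])
```

<p>Real registries add persistence, access control, and lineage, but the version/stage/metrics triple is the core of the pattern.</p><p>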
Designed for practitioners who need to move models from notebooks to reliable systems that serve real users.</p><h3><strong>AI Engineering Toolkit</strong></h3><p><strong><a href="https://github.com/Sumanth077/ai-engineering-toolkit">https://github.com/Sumanth077/ai-engineering-toolkit</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EhM6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb018fcdc-b107-4978-bc84-c98bfd0dcdfe_2048x819.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EhM6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb018fcdc-b107-4978-bc84-c98bfd0dcdfe_2048x819.jpeg 424w, https://substackcdn.com/image/fetch/$s_!EhM6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb018fcdc-b107-4978-bc84-c98bfd0dcdfe_2048x819.jpeg 848w, https://substackcdn.com/image/fetch/$s_!EhM6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb018fcdc-b107-4978-bc84-c98bfd0dcdfe_2048x819.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!EhM6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb018fcdc-b107-4978-bc84-c98bfd0dcdfe_2048x819.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EhM6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb018fcdc-b107-4978-bc84-c98bfd0dcdfe_2048x819.jpeg" width="1456" height="582" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b018fcdc-b107-4978-bc84-c98bfd0dcdfe_2048x819.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:582,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EhM6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb018fcdc-b107-4978-bc84-c98bfd0dcdfe_2048x819.jpeg 424w, https://substackcdn.com/image/fetch/$s_!EhM6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb018fcdc-b107-4978-bc84-c98bfd0dcdfe_2048x819.jpeg 848w, https://substackcdn.com/image/fetch/$s_!EhM6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb018fcdc-b107-4978-bc84-c98bfd0dcdfe_2048x819.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!EhM6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb018fcdc-b107-4978-bc84-c98bfd0dcdfe_2048x819.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Comprehensive collection of resources and tools for AI engineering, organized by use case. Instead of overwhelming you with every possible library, this toolkit focuses on battle-tested tools for specific problems: prompt engineering, vector databases, model deployment, evaluation frameworks, and monitoring solutions. 
Helpful for navigating the rapidly evolving ecosystem and finding tools that solve your specific engineering challenges.</p><div><hr></div><h2><strong>&#127891; 1 Pick of the Week</strong></h2><h3><strong>Stanford&#8217;s Deep Reinforcement Learning</strong></h3><p><strong><a href="https://youtube.com/playlist?list=PLoROMvodv4rPwxE0ONYRa_itZFdaKCylL">https://youtube.com/playlist?list=PLoROMvodv4rPwxE0ONYRa_itZFdaKCylL</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_hHz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c6dc9-e0c7-4be6-b03f-0127ad3f8d59_1206x778.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_hHz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c6dc9-e0c7-4be6-b03f-0127ad3f8d59_1206x778.png 424w, https://substackcdn.com/image/fetch/$s_!_hHz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c6dc9-e0c7-4be6-b03f-0127ad3f8d59_1206x778.png 848w, https://substackcdn.com/image/fetch/$s_!_hHz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c6dc9-e0c7-4be6-b03f-0127ad3f8d59_1206x778.png 1272w, https://substackcdn.com/image/fetch/$s_!_hHz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c6dc9-e0c7-4be6-b03f-0127ad3f8d59_1206x778.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!_hHz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c6dc9-e0c7-4be6-b03f-0127ad3f8d59_1206x778.png" width="1206" height="778" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d67c6dc9-e0c7-4be6-b03f-0127ad3f8d59_1206x778.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:778,&quot;width&quot;:1206,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_hHz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c6dc9-e0c7-4be6-b03f-0127ad3f8d59_1206x778.png 424w, https://substackcdn.com/image/fetch/$s_!_hHz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c6dc9-e0c7-4be6-b03f-0127ad3f8d59_1206x778.png 848w, https://substackcdn.com/image/fetch/$s_!_hHz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c6dc9-e0c7-4be6-b03f-0127ad3f8d59_1206x778.png 1272w, https://substackcdn.com/image/fetch/$s_!_hHz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd67c6dc9-e0c7-4be6-b03f-0127ad3f8d59_1206x778.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Stanford updated their deep RL course for 2025, covering policy gradients, actor-critic methods, model-based RL, and advanced topics like offline RL and multi-agent systems. The lectures balance theoretical foundations with practical implementation details, explaining not just how algorithms work but when to use them and why they fail. Essential viewing if you&#8217;re working with RL for agents, robotics, or any sequential decision-making problem where you need to understand the fundamentals rather than just applying libraries.</p><div><hr></div><p><em>Thanks for reading The Tokenizer! If you found something useful here, share it with someone who might benefit. 
And if you want more curated insights like this, consider subscribing to Gradient Ascent.</em></p>]]></content:encoded></item><item><title><![CDATA[Ilya on the Scaling Limits, Build DeepSeek From Scratch, and Why Agents Fail 80% of Real Tasks: The Tokenizer Edition #11]]></title><description><![CDATA[This week's most valuable AI resources]]></description><link>https://newsletter.artofsaience.com/p/ilya-on-the-scaling-limits-build</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/ilya-on-the-scaling-limits-build</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Sat, 06 Dec 2025 16:15:19 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/KnCRTP11p5U" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! The gap between what we think AI agents can do and what they actually accomplish just got measured, and the results are humbling. Turns out data agents succeed at realistic enterprise-style data engineering workflows less than 20% of the time. Meanwhile, researchers found a dead-simple attention mechanism tweak that mitigates multiple transformer problems at once. 
Sometimes progress looks less like breakthroughs and more like finally admitting what doesn&#8217;t work.</p><h3><strong>New here?</strong></h3><p><em>The Tokenizer is my resource-focused newsletter edition where I curate the best papers, videos, articles, tools, and learning resources from across the AI landscape. Consider it your weekly dose of everything you need to stay ahead in machine learning.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.artofsaience.com/subscribe?"><span>Subscribe now</span></a></p><h2><strong>TL;DR</strong></h2><p><strong>What caught my attention this week:</strong></p><ul><li><p><strong>&#128196; Papers:</strong> Comprehensive surveys exposing agent architecture confusion, hidden learning dynamics challenging loss curves, rendering breakthroughs making 3D reconstruction practical, and benchmarks revealing where data agents actually fail</p></li><li><p><strong>&#127909; Videos:</strong> Real-world AI engineering wisdom from transformers vs CNNs debates, production codebase strategies that work, and Ilya Sutskever on why we&#8217;re shifting from scaling to research</p></li><li><p><strong>&#128240; Reads:</strong> Model serving fundamentals, nanochat meets HuggingFace, and practical ML interview preparation</p></li><li><p><strong>&#128736; Tools:</strong> Debugging research papers systematically, and comprehensive Nano Banana resources</p></li><li><p><strong>&#127891; Learning:</strong> Building DeepSeek from scratch in a structured video series</p></li></ul><div><hr></div><p>Grab my first book &#8212; <strong><a href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/">AI for the Rest of Us</a></strong> &#8212; today!</p><blockquote><p><em><strong>Quick 
note:</strong> If you find the book useful, please leave a review on Amazon. It makes a world of difference. If you have a picture of the book IRL, please share it with me. I really appreciate it.</em></p></blockquote><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/&quot;,&quot;text&quot;:&quot;Order AI for the Rest of Us Today!&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/"><span>Order AI for the Rest of Us Today!</span></a></p><div><hr></div><h2><strong>&#128196; 5 Papers</strong></h2><h3><strong>Agentic AI: A Comprehensive Survey of Architectures, Applications, and Future Directions</strong></h3><p><strong><a href="https://arxiv.org/abs/2510.25445">https://arxiv.org/abs/2510.25445</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pAfy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6ba4eb4-ab60-4271-a59f-d63bff1e47f5_1080x398.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pAfy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6ba4eb4-ab60-4271-a59f-d63bff1e47f5_1080x398.png 424w, https://substackcdn.com/image/fetch/$s_!pAfy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6ba4eb4-ab60-4271-a59f-d63bff1e47f5_1080x398.png 848w, 
https://substackcdn.com/image/fetch/$s_!pAfy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6ba4eb4-ab60-4271-a59f-d63bff1e47f5_1080x398.png 1272w, https://substackcdn.com/image/fetch/$s_!pAfy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6ba4eb4-ab60-4271-a59f-d63bff1e47f5_1080x398.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pAfy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6ba4eb4-ab60-4271-a59f-d63bff1e47f5_1080x398.png" width="1080" height="398" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d6ba4eb4-ab60-4271-a59f-d63bff1e47f5_1080x398.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:398,&quot;width&quot;:1080,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pAfy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6ba4eb4-ab60-4271-a59f-d63bff1e47f5_1080x398.png 424w, https://substackcdn.com/image/fetch/$s_!pAfy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6ba4eb4-ab60-4271-a59f-d63bff1e47f5_1080x398.png 848w, 
https://substackcdn.com/image/fetch/$s_!pAfy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6ba4eb4-ab60-4271-a59f-d63bff1e47f5_1080x398.png 1272w, https://substackcdn.com/image/fetch/$s_!pAfy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6ba4eb4-ab60-4271-a59f-d63bff1e47f5_1080x398.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The field keeps retrofitting modern LLM-based agents into outdated symbolic frameworks, creating massive confusion about how these 
systems actually work. This survey addresses this by introducing a dual-paradigm framework that separates symbolic/classical systems (algorithmic planning, persistent state) from neural/generative ones (stochastic generation, prompt orchestration). Through analyzing 90 studies from 2018 to 2025, the authors reveal how paradigm choice is strategic: symbolic dominates safety-critical healthcare, neural prevails in adaptive finance. The real contribution is identifying critical gaps, including governance deficits for symbolic systems and the pressing need for hybrid neuro-symbolic architectures.</p><h3><strong>Quiet Feature Learning in Algorithmic Tasks</strong></h3><p><strong><a href="https://arxiv.org/abs/2505.03997">https://arxiv.org/abs/2505.03997</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LLtk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe129e993-cfa7-41e5-a1a5-dcb00cb47bfb_756x464.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LLtk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe129e993-cfa7-41e5-a1a5-dcb00cb47bfb_756x464.png 424w, https://substackcdn.com/image/fetch/$s_!LLtk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe129e993-cfa7-41e5-a1a5-dcb00cb47bfb_756x464.png 848w, https://substackcdn.com/image/fetch/$s_!LLtk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe129e993-cfa7-41e5-a1a5-dcb00cb47bfb_756x464.png 1272w, 
https://substackcdn.com/image/fetch/$s_!LLtk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe129e993-cfa7-41e5-a1a5-dcb00cb47bfb_756x464.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LLtk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe129e993-cfa7-41e5-a1a5-dcb00cb47bfb_756x464.png" width="756" height="464" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e129e993-cfa7-41e5-a1a5-dcb00cb47bfb_756x464.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:464,&quot;width&quot;:756,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LLtk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe129e993-cfa7-41e5-a1a5-dcb00cb47bfb_756x464.png 424w, https://substackcdn.com/image/fetch/$s_!LLtk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe129e993-cfa7-41e5-a1a5-dcb00cb47bfb_756x464.png 848w, https://substackcdn.com/image/fetch/$s_!LLtk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe129e993-cfa7-41e5-a1a5-dcb00cb47bfb_756x464.png 1272w, 
https://substackcdn.com/image/fetch/$s_!LLtk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe129e993-cfa7-41e5-a1a5-dcb00cb47bfb_756x464.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Loss curves lie. Transformers trained on algorithmic tasks show phase transitions where validation loss barely budges across massive compute ranges, then suddenly drops. Probing internal representations reveals quiet features being learned before any loss improvement - intermediate computations causally necessary for performance but invisible to output metrics. 
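</p><p>The probing setup behind this result is simple to sketch: freeze the model, collect hidden activations, and fit a small linear classifier against a hypothesized intermediate quantity; rising probe accuracy under a flat loss is the quiet-feature signature. A toy version on synthetic activations (the least-squares probe and all names here are illustrative assumptions, not the paper&#8217;s exact setup):</p>

```python
import numpy as np

def linear_probe_accuracy(H, y, train_frac=0.8, seed=0):
    """Fit a least-squares linear probe on hidden states H to predict
    a binary intermediate quantity y; return held-out accuracy."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(H))
    n_tr = int(train_frac * len(H))
    tr, te = idx[:n_tr], idx[n_tr:]
    Hb = np.hstack((H, np.ones((len(H), 1))))      # add a bias column
    # Regress toward +/-1 targets; sign of the output is the prediction.
    w, *_ = np.linalg.lstsq(Hb[tr], 2.0 * y[tr] - 1.0, rcond=None)
    pred = (Hb[te] @ w > 0).astype(int)
    return (pred == y[te]).mean()

# Synthetic "hidden states": one direction encodes an intermediate
# computation (the quiet feature); the rest is noise.
rng = np.random.default_rng(1)
n, d = 2000, 32
y = rng.integers(0, 2, n)               # hypothesized intermediate bit
H = rng.standard_normal((n, d))
H[:, 0] += 3.0 * (2 * y - 1)            # feature is linearly decodable
acc = linear_probe_accuracy(H, y)       # high accuracy => feature present
```

<p>On a real model you would run the same probe per layer and per checkpoint, which is how the paper surfaces features appearing long before the loss moves.</p><p>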
Ablation experiments prove individual quiet features are essential, strongly challenging the assumption that cross-entropy reliably tracks learning progress. Substantial representational progress happens beneath flat loss curves, demanding richer diagnostics for monitoring training.</p><h3><strong>Radiance Meshes for Volumetric Reconstruction</strong></h3><p><strong><a href="https://arxiv.org/abs/2512.04076">https://arxiv.org/abs/2512.04076</a> |<a href="https://github.com/half-potato/radiance_meshes"> GitHub</a></strong></p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;8a5c7116-2cf9-44f6-a298-63d9efdf5f01&quot;,&quot;duration&quot;:null}"></div><p>Radiance fields finally get a representation that hardware actually likes. This work uses Delaunay tetrahedralization to create constant-density tetrahedral cells that render using native triangle rasterization instead of complex ray-tracing. The method handles topological discontinuities from optimizing vertex positions with a Zip-NeRF-style backbone, maintaining smooth fields despite topology changes. 
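</p><p>The exactness claim is easy to see: with density constant inside each cell, the volume rendering integral along a ray collapses to a closed-form product of per-segment transmittances, no quadrature needed. A one-ray sketch with made-up densities and segment lengths (illustrative only, not the paper&#8217;s code):</p>

```python
import numpy as np

def render_ray(sigmas, lengths, colors):
    """Exact volume rendering for a ray crossing cells of constant
    density: each segment transmits exp(-sigma * length), and the
    running product gives how much light reaches each cell."""
    seg_T = np.exp(-sigmas * lengths)                        # per-segment transmittance
    alphas = 1.0 - seg_T                                     # per-cell opacity
    trans = np.concatenate(([1.0], np.cumprod(seg_T)[:-1]))  # light reaching each cell
    weights = trans * alphas                                 # each cell's contribution
    return weights @ colors, trans[-1] * seg_T[-1]           # color, final transmittance

sigmas = np.array([0.5, 2.0, 0.1])     # constant density per tetrahedral cell
lengths = np.array([1.0, 0.3, 2.0])    # ray segment length inside each cell
colors = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
rgb, final_T = render_ray(sigmas, lengths, colors)
```

<p>The cell weights plus the final transmittance sum to one by construction, which is exactly the invariant approximate ray-marchers only satisfy in the limit of many samples.</p><p>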
Real-time view synthesis on consumer hardware with exact volume rendering equation evaluation - faster than all prior radiance field representations at equivalent primitive counts.</p><h3><strong>Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free</strong></h3><p><strong><a href="https://arxiv.org/abs/2505.06708">https://arxiv.org/abs/2505.06708</a> |<a href="https://github.com/qiuzh20/gated_attention"> GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dDVA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe984abe2-2218-44b0-9236-b61ecd43cf00_1090x436.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dDVA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe984abe2-2218-44b0-9236-b61ecd43cf00_1090x436.png 424w, https://substackcdn.com/image/fetch/$s_!dDVA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe984abe2-2218-44b0-9236-b61ecd43cf00_1090x436.png 848w, https://substackcdn.com/image/fetch/$s_!dDVA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe984abe2-2218-44b0-9236-b61ecd43cf00_1090x436.png 1272w, https://substackcdn.com/image/fetch/$s_!dDVA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe984abe2-2218-44b0-9236-b61ecd43cf00_1090x436.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!dDVA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe984abe2-2218-44b0-9236-b61ecd43cf00_1090x436.png" width="1090" height="436" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e984abe2-2218-44b0-9236-b61ecd43cf00_1090x436.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:436,&quot;width&quot;:1090,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dDVA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe984abe2-2218-44b0-9236-b61ecd43cf00_1090x436.png 424w, https://substackcdn.com/image/fetch/$s_!dDVA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe984abe2-2218-44b0-9236-b61ecd43cf00_1090x436.png 848w, https://substackcdn.com/image/fetch/$s_!dDVA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe984abe2-2218-44b0-9236-b61ecd43cf00_1090x436.png 1272w, https://substackcdn.com/image/fetch/$s_!dDVA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe984abe2-2218-44b0-9236-b61ecd43cf00_1090x436.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Applying a head-specific sigmoid gate after attention output fixes multiple transformer pathologies simultaneously. Tested across 30 variants of 15B MoE models and 1.7B dense models on 3.5 trillion tokens, this simple modification introduces non-linearity, breaking the low-rank bottleneck while enabling query-dependent sparsity. The sparse gating eliminates attention sink phenomena where early tokens absorb unwanted probability mass, enhances training stability for larger learning rates, and improves long-context extrapolation. 
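</p><p>In code, the tweak is tiny. A minimal single-head sketch of the gating idea (dimensions, weight names, and initialization are illustrative assumptions; the paper applies the sigmoid gate per head after scaled dot-product attention):</p>

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def gated_attention(X, Wq, Wk, Wv, Wg):
    """Single-head self-attention with a sigmoid output gate.
    The gate depends on the layer input X, so each position can
    sparsely modulate (even near-zero) its own attention output."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(Q.shape[-1]))    # (T, T) attention weights
    out = A @ V                                    # standard attention output
    gate = 1.0 / (1.0 + np.exp(-(X @ Wg)))         # elementwise sigmoid gate
    return gate * out                              # non-linear, input-dependent

rng = np.random.default_rng(0)
T, d = 4, 8
X = rng.standard_normal((T, d))
Wq, Wk, Wv, Wg = (0.1 * rng.standard_normal((d, d)) for _ in range(4))
Y = gated_attention(X, Wq, Wk, Wv, Wg)
```

<p>Because the gate sits on the value-output path rather than inside the softmax, it can zero a position&#8217;s output outright, which is what lets it absorb the probability mass that would otherwise pool on early tokens.</p><p>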
The effectiveness comes from two factors: non-linearity on the value-output transformation and input-dependent sparse modulation of attention outputs.</p><h3><strong>DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle</strong></h3><p><strong><a href="https://arxiv.org/abs/2512.04324">https://arxiv.org/abs/2512.04324</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QLCM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0110315-fd25-4e07-b6ed-7c776da7eb86_2048x842.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QLCM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0110315-fd25-4e07-b6ed-7c776da7eb86_2048x842.png 424w, https://substackcdn.com/image/fetch/$s_!QLCM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0110315-fd25-4e07-b6ed-7c776da7eb86_2048x842.png 848w, https://substackcdn.com/image/fetch/$s_!QLCM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0110315-fd25-4e07-b6ed-7c776da7eb86_2048x842.png 1272w, https://substackcdn.com/image/fetch/$s_!QLCM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0110315-fd25-4e07-b6ed-7c776da7eb86_2048x842.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QLCM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0110315-fd25-4e07-b6ed-7c776da7eb86_2048x842.png" width="1456" height="599" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d0110315-fd25-4e07-b6ed-7c776da7eb86_2048x842.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:599,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QLCM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0110315-fd25-4e07-b6ed-7c776da7eb86_2048x842.png 424w, https://substackcdn.com/image/fetch/$s_!QLCM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0110315-fd25-4e07-b6ed-7c776da7eb86_2048x842.png 848w, https://substackcdn.com/image/fetch/$s_!QLCM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0110315-fd25-4e07-b6ed-7c776da7eb86_2048x842.png 1272w, https://substackcdn.com/image/fetch/$s_!QLCM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0110315-fd25-4e07-b6ed-7c776da7eb86_2048x842.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Enterprise data work is a closed loop, not isolated SQL generation. This benchmark of 210 tasks evaluates agents on repository-level data engineering and open-ended analysis that mirrors actual workflows. Data engineering tasks require designing multi-stage SQL pipelines from scratch and evolving systems under changing requirements - often involving 4,000+ lines across 30+ files. State-of-the-art agents achieve under 20% success on engineering tasks, exposing failures in pipeline orchestration, not code generation. Data analysis tasks average below 40%, proving that engineering and analysis require distinct capabilities. 
The holistic evaluation reveals where autonomous data agents actually break down in production settings.</p><div><hr></div><h2><strong>&#127909; 4 Videos</strong></h2><h3><strong>Why are Transformers replacing CNNs?</strong></h3><div id="youtube2-KnCRTP11p5U" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;KnCRTP11p5U&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/KnCRTP11p5U?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Julie Turc breaks down why transformers classify images differently than ResNets, despite CNNs being explicitly designed for vision. The video compares convolution versus self-attention, explains CNNs&#8217; inductive biases (locality, translation invariance, hierarchical features), and demonstrates why self-attention can be more expressive than convolution. 
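</p><p><em>To make the expressiveness claim concrete, here is a toy numpy check (my own illustration, not code from the video): hard one-hot attention over relative offsets, with one "head" per kernel tap, reproduces a 1-D convolution exactly.</em></p>

```python
import numpy as np

def conv1d(x, w):
    """'Same'-padded 1-D cross-correlation of signal x with kernel w."""
    k = len(w)
    pad = k // 2
    xp = np.pad(x, pad)
    return np.array([xp[i:i + k] @ w for i in range(len(x))])

def conv_as_attention(x, w):
    """Emulate the same conv with len(w) attention heads: head j places all
    of its attention weight on the token at relative offset j - k//2 and
    scales the gathered value by w[j]. Summing heads gives the convolution."""
    n, k = len(x), len(w)
    out = np.zeros(n)
    for j in range(k):
        offset = j - k // 2
        A = np.zeros((n, n))            # hard one-hot attention matrix
        for i in range(n):
            if 0 <= i + offset < n:
                A[i, i + offset] = 1.0
        out += w[j] * (A @ x)           # "value projection" = scaling by w[j]
    return out
```

<p>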
You&#8217;ll see how attention can exactly implement convolutional kernels using relative positional encodings, making the transition from CNNs to vision transformers less mysterious and more mechanistically grounded.</p><h3><strong>No Vibes Allowed: Solving Hard Problems in Complex Codebases</strong></h3><div id="youtube2-rmvDxxNubIg" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;rmvDxxNubIg&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/rmvDxxNubIg?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Dex Horthy addresses the productivity paradox where AI coding tools excel at new projects but often make developers less productive in large, established codebases. The solution isn&#8217;t waiting for smarter models - it&#8217;s context engineering. This talk demonstrates techniques for getting Claude Code to handle 300k LOC Rust codebases and ship a week&#8217;s work in a day while maintaining code quality. 
The &#8220;frequent intentional compaction&#8221; family of techniques systematically structures how you feed context throughout development, making agents effective in real production environments.</p><h3><strong>VoiceVision RAG - Integrating Visual Document Intelligence with Voice Response</strong></h3><div id="youtube2-hwCmfThIiS4" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;hwCmfThIiS4&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/hwCmfThIiS4?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Suman Debnath demonstrates integrating ColPali&#8217;s vision-based retrieval with voice synthesis for next-generation RAG systems. ColPali generates multi-vector embeddings directly from document images, bypassing OCR and complex preprocessing while handling mixed textual and visual information. Adding voice output creates more intuitive and accessible user experiences. 
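</p><p><em>The retrieval side scores pages with ColBERT-style late interaction, summing each query token's best match over the page's patch embeddings. A hedged numpy sketch of that scoring (not ColPali's actual implementation):</em></p>

```python
import numpy as np

def maxsim_score(query_vecs, page_vecs):
    """Late-interaction relevance: each query-token embedding takes its best
    match among the page's patch embeddings; the per-token maxima are summed.
    query_vecs: (n_query_tokens, d); page_vecs: (n_patches, d)."""
    sims = query_vecs @ page_vecs.T          # (n_query_tokens, n_patches)
    return float(sims.max(axis=1).sum())

def rank_pages(query_vecs, pages):
    """Return page indices sorted by descending MaxSim score."""
    scores = [maxsim_score(query_vecs, p) for p in pages]
    return sorted(range(len(pages)), key=lambda i: -scores[i])
```

<p>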
The workshop shows how this combination handles documents traditional RAG struggles with, leading to more efficient retrieval with natural voice responses.</p><h3><strong>We&#8217;re moving from the age of scaling to the age of research</strong></h3><div id="youtube2-aR20FWCCjAs" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;aR20FWCCjAs&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/aR20FWCCjAs?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Ilya Sutskever and Dwarkesh Patel discuss SSI&#8217;s strategy and the fundamental shift in AI development. Ilya explains problems with current pre-training approaches, how to improve model generalization, and ensuring AGI development goes well. The conversation covers why simple scaling has limitations and what research-focused approaches might unlock next. 
Essential viewing for understanding how leading researchers think about the transition from brute-force scaling to more sophisticated development strategies.</p><div><hr></div><h2><strong>&#128240; 3 Curated Reads</strong></h2><h3><strong>Model Serving Engineering: A Primer</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:179089422,&quot;url&quot;:&quot;https://mlfrontiers.substack.com/p/model-serving-engineering-a-primer&quot;,&quot;publication_id&quot;:920647,&quot;publication_name&quot;:&quot;Machine Learning Frontiers&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!mX0K!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76fb3173-90e4-4925-9679-6652099fff88_701x701.png&quot;,&quot;title&quot;:&quot;Model Serving Engineering: A Primer&quot;,&quot;truncated_body_text&quot;:&quot;From an academic point of view, the goal of ML research is to build a model that predicts a real-world phenomenon better than the current state of the art. Consequently, most ML papers end with a leaderboard table showing that the new model beats a handful of competitors on some benchmark. 
That&#8217;s all fine and well, but in the real world, this is where t&#8230;&quot;,&quot;date&quot;:&quot;2025-12-05T02:17:43.863Z&quot;,&quot;like_count&quot;:5,&quot;comment_count&quot;:1,&quot;bylines&quot;:[{&quot;id&quot;:48115509,&quot;name&quot;:&quot;Samuel Flender&quot;,&quot;handle&quot;:&quot;mlfrontiers&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/f604f32f-d0a3-4c43-9549-aeb618651690_480x600.jpeg&quot;,&quot;bio&quot;:&quot;Physics PhD turned ML Engineer&quot;,&quot;profile_set_up_at&quot;:&quot;2022-06-04T18:58:59.947Z&quot;,&quot;reader_installed_at&quot;:&quot;2023-10-22T03:07:21.808Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:863485,&quot;user_id&quot;:48115509,&quot;publication_id&quot;:920647,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:920647,&quot;name&quot;:&quot;Machine Learning Frontiers&quot;,&quot;subdomain&quot;:&quot;mlfrontiers&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Real-life lessons on ML theory, practice, and careers from a seasoned ML engineer. 
This is the stuff you won't learn in \&quot;standard\&quot; ML curricula.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/76fb3173-90e4-4925-9679-6652099fff88_701x701.png&quot;,&quot;author_id&quot;:48115509,&quot;primary_user_id&quot;:48115509,&quot;theme_var_background_pop&quot;:&quot;#EA82FF&quot;,&quot;created_at&quot;:&quot;2022-06-04T19:00:14.799Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Samuel Flender&quot;,&quot;founding_plan_name&quot;:&quot;\&quot;I can expense it\&quot;&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:null,&quot;is_personal_mode&quot;:false}}],&quot;twitter_screen_name&quot;:&quot;samflender&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:1,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;subscriber&quot;,&quot;tier&quot;:1,&quot;accent_colors&quot;:null},&quot;paidPublicationIds&quot;:[458709],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://mlfrontiers.substack.com/p/model-serving-engineering-a-primer?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!mX0K!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76fb3173-90e4-4925-9679-6652099fff88_701x701.png" loading="lazy"><span class="embedded-post-publication-name">Machine Learning Frontiers</span></div><div 
class="embedded-post-title-wrapper"><div class="embedded-post-title">Model Serving Engineering: A Primer</div></div><div class="embedded-post-body">From an academic point of view, the goal of ML research is to build a model that predicts a real-world phenomenon better than the current state of the art. Consequently, most ML papers end with a leaderboard table showing that the new model beats a handful of competitors on some benchmark. That&#8217;s all fine and well, but in the real world, this is where t&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">5 months ago &#183; 5 likes &#183; 1 comment &#183; Samuel Flender</div></a></div><p><span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Samuel Flender&quot;,&quot;id&quot;:48115509,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/f604f32f-d0a3-4c43-9549-aeb618651690_480x600.jpeg&quot;,&quot;uuid&quot;:&quot;4315d0b9-88e6-43bf-bde1-5ea37667077a&quot;}" data-component-name="MentionToDOM"></span> covers the engineering fundamentals of serving ML models in production. The piece addresses infrastructure design, latency optimization, scaling strategies, and reliability patterns that matter when your model faces real traffic. 
Helpful for practitioners moving from research to deployment who need to understand the operational layer between training and user-facing applications.</p><h3><strong>Porting nanochat to Transformers: an AI modeling history lesson</strong></h3><p><strong><a href="https://huggingface.co/spaces/nanochat-students/transformers#what-is-nanochat">https://huggingface.co/spaces/nanochat-students/transformers#what-is-nanochat</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-pUt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64249921-c9cd-46ad-a0aa-d0012120daba_1920x1080.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-pUt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64249921-c9cd-46ad-a0aa-d0012120daba_1920x1080.png 424w, https://substackcdn.com/image/fetch/$s_!-pUt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64249921-c9cd-46ad-a0aa-d0012120daba_1920x1080.png 848w, https://substackcdn.com/image/fetch/$s_!-pUt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64249921-c9cd-46ad-a0aa-d0012120daba_1920x1080.png 1272w, https://substackcdn.com/image/fetch/$s_!-pUt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64249921-c9cd-46ad-a0aa-d0012120daba_1920x1080.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!-pUt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64249921-c9cd-46ad-a0aa-d0012120daba_1920x1080.png" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/64249921-c9cd-46ad-a0aa-d0012120daba_1920x1080.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-pUt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64249921-c9cd-46ad-a0aa-d0012120daba_1920x1080.png 424w, https://substackcdn.com/image/fetch/$s_!-pUt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64249921-c9cd-46ad-a0aa-d0012120daba_1920x1080.png 848w, https://substackcdn.com/image/fetch/$s_!-pUt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64249921-c9cd-46ad-a0aa-d0012120daba_1920x1080.png 1272w, https://substackcdn.com/image/fetch/$s_!-pUt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64249921-c9cd-46ad-a0aa-d0012120daba_1920x1080.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The HuggingFace team walks through porting nanochat to the Transformers library, using the exercise to explain transformer architecture evolution and implementation decisions. You&#8217;ll see how historical choices in model design translate to modern library conventions, making the abstraction layers more transparent. Good for understanding what&#8217;s actually happening under the hood when you call a transformers model.</p><h3><strong>Interview Prep: The ML Grind</strong></h3><p><strong><a href="https://twopug.com/interview-prep-ml-grind/">https://twopug.com/interview-prep-ml-grind/</a></strong></p><p>Jenya provides practical guidance for ML interview preparation covering the technical foundations, coding patterns, and system design knowledge companies actually test. 
The resource focuses on what works for getting through technical screens rather than comprehensive ML education, making it useful if you&#8217;re actively interviewing or want to identify gaps in applied knowledge.</p><div><hr></div><h2><strong>&#128736; 2 Tools &amp; Repos</strong></h2><h3><strong>Paper Debugger</strong></h3><p><strong><a href="https://github.com/PaperDebugger/PaperDebugger">https://github.com/PaperDebugger/PaperDebugger</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GAv9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F422e7d05-9b31-464c-af17-a9c7a034d82b_1184x1192.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GAv9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F422e7d05-9b31-464c-af17-a9c7a034d82b_1184x1192.png 424w, https://substackcdn.com/image/fetch/$s_!GAv9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F422e7d05-9b31-464c-af17-a9c7a034d82b_1184x1192.png 848w, https://substackcdn.com/image/fetch/$s_!GAv9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F422e7d05-9b31-464c-af17-a9c7a034d82b_1184x1192.png 1272w, https://substackcdn.com/image/fetch/$s_!GAv9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F422e7d05-9b31-464c-af17-a9c7a034d82b_1184x1192.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!GAv9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F422e7d05-9b31-464c-af17-a9c7a034d82b_1184x1192.png" width="1184" height="1192" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/422e7d05-9b31-464c-af17-a9c7a034d82b_1184x1192.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1192,&quot;width&quot;:1184,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GAv9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F422e7d05-9b31-464c-af17-a9c7a034d82b_1184x1192.png 424w, https://substackcdn.com/image/fetch/$s_!GAv9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F422e7d05-9b31-464c-af17-a9c7a034d82b_1184x1192.png 848w, https://substackcdn.com/image/fetch/$s_!GAv9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F422e7d05-9b31-464c-af17-a9c7a034d82b_1184x1192.png 1272w, https://substackcdn.com/image/fetch/$s_!GAv9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F422e7d05-9b31-464c-af17-a9c7a034d82b_1184x1192.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This tool helps you systematically debug research papers by identifying inconsistencies, verifying claims, and tracking experimental setups. When reproducing results or implementing papers, it provides structure for catching errors and understanding discrepancies between what&#8217;s written and what actually works. 
Useful for researchers implementing papers who&#8217;ve encountered the gap between description and reality.</p><h3><strong>Awesome Nano Banana Pro</strong></h3><p><strong><a href="https://github.com/ZeroLu/awesome-nanobanana-pro">https://github.com/ZeroLu/awesome-nanobanana-pro</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!W36w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77b41b9a-a4e8-4bb8-8f4c-d0cf4885a45c_416x228.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!W36w!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77b41b9a-a4e8-4bb8-8f4c-d0cf4885a45c_416x228.png 424w, https://substackcdn.com/image/fetch/$s_!W36w!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77b41b9a-a4e8-4bb8-8f4c-d0cf4885a45c_416x228.png 848w, https://substackcdn.com/image/fetch/$s_!W36w!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77b41b9a-a4e8-4bb8-8f4c-d0cf4885a45c_416x228.png 1272w, https://substackcdn.com/image/fetch/$s_!W36w!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77b41b9a-a4e8-4bb8-8f4c-d0cf4885a45c_416x228.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!W36w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77b41b9a-a4e8-4bb8-8f4c-d0cf4885a45c_416x228.png" width="416" height="228" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/77b41b9a-a4e8-4bb8-8f4c-d0cf4885a45c_416x228.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:228,&quot;width&quot;:416,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!W36w!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77b41b9a-a4e8-4bb8-8f4c-d0cf4885a45c_416x228.png 424w, https://substackcdn.com/image/fetch/$s_!W36w!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77b41b9a-a4e8-4bb8-8f4c-d0cf4885a45c_416x228.png 848w, https://substackcdn.com/image/fetch/$s_!W36w!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77b41b9a-a4e8-4bb8-8f4c-d0cf4885a45c_416x228.png 1272w, https://substackcdn.com/image/fetch/$s_!W36w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77b41b9a-a4e8-4bb8-8f4c-d0cf4885a45c_416x228.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>A curated collection of resources for Google&#8217;s Nano Banana image generation model (revealed as Gemini 2.5 Flash Image). The repo aggregates tutorials, implementation examples, and community contributions around the model that topped LMArena rankings. 
Helpful for anyone exploring Gemini&#8217;s image capabilities or character consistency features in their projects.</p><div><hr></div><h2><strong>&#127891; 1 Pick of the Week</strong></h2><h3><strong>Build DeepSeek from Scratch</strong></h3><p><a href="https://youtube.com/playlist?list=PLPTV0NXA_ZSiOpKKlHCyOq9lnp-dLvlms">https://youtube.com/playlist?list=PLPTV0NXA_ZSiOpKKlHCyOq9lnp-dLvlms</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZDBv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b67a35-f904-4fb6-88e3-5678e8e81464_1180x650.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZDBv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b67a35-f904-4fb6-88e3-5678e8e81464_1180x650.png 424w, https://substackcdn.com/image/fetch/$s_!ZDBv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b67a35-f904-4fb6-88e3-5678e8e81464_1180x650.png 848w, https://substackcdn.com/image/fetch/$s_!ZDBv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b67a35-f904-4fb6-88e3-5678e8e81464_1180x650.png 1272w, https://substackcdn.com/image/fetch/$s_!ZDBv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b67a35-f904-4fb6-88e3-5678e8e81464_1180x650.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZDBv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b67a35-f904-4fb6-88e3-5678e8e81464_1180x650.png" width="1180" 
height="650" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/04b67a35-f904-4fb6-88e3-5678e8e81464_1180x650.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:650,&quot;width&quot;:1180,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZDBv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b67a35-f904-4fb6-88e3-5678e8e81464_1180x650.png 424w, https://substackcdn.com/image/fetch/$s_!ZDBv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b67a35-f904-4fb6-88e3-5678e8e81464_1180x650.png 848w, https://substackcdn.com/image/fetch/$s_!ZDBv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b67a35-f904-4fb6-88e3-5678e8e81464_1180x650.png 1272w, https://substackcdn.com/image/fetch/$s_!ZDBv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04b67a35-f904-4fb6-88e3-5678e8e81464_1180x650.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Vizuara&#8217;s video series takes you through implementing DeepSeek from foundational concepts to working code. Rather than just explaining what DeepSeek does, you&#8217;ll build the architecture piece by piece, understanding each design decision. The structured approach makes complex model architectures concrete through implementation, helping you move from conceptual understanding to actual capability building.</p><div><hr></div><p><em>Thanks for reading The Tokenizer! If you found something useful here, share it with someone who might benefit. 
And if you want more curated insights like this, consider subscribing to Gradient Ascent.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.artofsaience.com/subscribe?"><span>Subscribe now</span></a></p>]]></content:encoded></item><item><title><![CDATA[Google Dismantles Deep Learning, VLMs Ditch Text for Vision, and Karpathy Builds AI Juries: The Tokenizer Edition #10]]></title><description><![CDATA[This week's most valuable AI resources]]></description><link>https://newsletter.artofsaience.com/p/google-rethinks-deep-learning-vision</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/google-rethinks-deep-learning-vision</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Sat, 29 Nov 2025 12:30:23 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/AnTw_t21ayE" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! Google Research unveiled a paradigm that reframes deep learning as nested optimization problems rather than stacked layers. Vision-language models learned to reason in continuous visual tokens instead of forcing concepts into text. And Karpathy released a multi-model council system that has LLMs review each other&#8217;s work before producing final answers.</p><h3><strong>New here?</strong></h3><p><em>The Tokenizer is my resource-focused newsletter edition where I curate the best papers, videos, articles, tools, and learning resources from across the AI landscape. 
Consider it your weekly dose of everything you need to stay ahead in machine learning.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption"></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h2><strong>TL;DR</strong></h2><p><strong>What caught my attention this week:</strong></p><ul><li><p><strong>&#128196; Papers:</strong> Text-promptable medical segmentation across modalities, vision models reasoning in visual token space, humanoid agents searching 360&#176; environments, and a provocative rethinking of deep learning architectures</p></li><li><p><strong>&#127909; Videos:</strong> Meta-optimization frameworks for AI agents, DeepMind&#8217;s scientific journey, reward hacking in production RL, and Jeff Dean on frontier AI trends</p></li><li><p><strong>&#128240; Reads:</strong> Deep debugging insights from PyTorch development, practical CursorAI workflows for engineers, and understanding GRPO for reinforcement learning</p></li><li><p><strong>&#128736; Tools:</strong> Karpathy&#8217;s multi-LLM council system and Andrew Ng&#8217;s automated paper reviewer</p></li><li><p><strong>&#127891; Learning:</strong> Building Olmo 3 from scratch with Sebastian Raschka&#8217;s implementation notebook</p></li></ul><div><hr></div><p>Grab my first book &#8212; <strong><a href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/">AI for the Rest of Us</a></strong> &#8212; 
today!</p><blockquote><p><em><strong>Quick note:</strong> If you find the book useful, please leave a review on Amazon. It makes a world of difference. If you have a picture of the book IRL, please share it with me. I really appreciate it.</em></p></blockquote><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/&quot;,&quot;text&quot;:&quot;Order AI for the Rest of Us Today!&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/"><span>Order AI for the Rest of Us Today!</span></a></p><div><hr></div><h2><strong>&#128196; 5 Papers</strong></h2><h3><strong>MedSAM3: Delving into Segment Anything with Medical Concepts</strong></h3><p><strong><a href="https://arxiv.org/abs/2511.19046">https://arxiv.org/abs/2511.19046</a> |<a href="https://github.com/Joey-S-Liu/MedSAM3"> GitHub</a></strong></p><p>Medical image segmentation typically demands extensive manual annotation for each new clinical application. MedSAM3 addresses this by enabling text-based prompting across diverse imaging modalities. Instead of drawing boxes or clicking points, you can segment anatomical structures using natural language like &#8220;breast tumor&#8221; or &#8220;pulmonary artery.&#8221; The system works across X-ray, MRI, Ultrasound, CT, and video by fine-tuning SAM 3 with semantic conceptual labels. 
The MedSAM3 Agent framework integrates multimodal LLMs for complex reasoning and iterative refinement, allowing practitioners to interact with medical imaging through language rather than geometric constraints.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zeaf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a147339-b840-4db0-ae02-d478d262f9ea_2048x1162.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zeaf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a147339-b840-4db0-ae02-d478d262f9ea_2048x1162.png 424w, https://substackcdn.com/image/fetch/$s_!zeaf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a147339-b840-4db0-ae02-d478d262f9ea_2048x1162.png 848w, https://substackcdn.com/image/fetch/$s_!zeaf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a147339-b840-4db0-ae02-d478d262f9ea_2048x1162.png 1272w, https://substackcdn.com/image/fetch/$s_!zeaf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a147339-b840-4db0-ae02-d478d262f9ea_2048x1162.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zeaf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a147339-b840-4db0-ae02-d478d262f9ea_2048x1162.png" width="1456" height="826" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6a147339-b840-4db0-ae02-d478d262f9ea_2048x1162.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:826,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zeaf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a147339-b840-4db0-ae02-d478d262f9ea_2048x1162.png 424w, https://substackcdn.com/image/fetch/$s_!zeaf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a147339-b840-4db0-ae02-d478d262f9ea_2048x1162.png 848w, https://substackcdn.com/image/fetch/$s_!zeaf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a147339-b840-4db0-ae02-d478d262f9ea_2048x1162.png 1272w, https://substackcdn.com/image/fetch/$s_!zeaf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a147339-b840-4db0-ae02-d478d262f9ea_2048x1162.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3><strong>Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens</strong></h3><p><strong><a href="https://arxiv.org/abs/2511.19418">https://arxiv.org/abs/2511.19418</a> |<a href="https://github.com/Wakals/CoVT"> GitHub</a></strong></p><p>Vision-language models excel at linguistic reasoning but struggle with dense visual perception tasks like spatial reasoning and geometric awareness. Chain-of-Visual-Thought (COVT) enables VLMs to reason through continuous visual tokens rather than forcing visual concepts into text descriptions. Within roughly 20 tokens, COVT distills knowledge from lightweight vision experts, capturing 2D appearance, 3D geometry, spatial layout, and edge structure. During training, the model predicts these visual tokens to reconstruct dense supervision signals. At inference, it reasons directly in the continuous visual token space while optionally decoding dense predictions for interpretability. 
Integrating COVT into Qwen2.5-VL and LLaVA consistently improves performance by 3% to 16% across benchmarks, demonstrating that compact continuous visual thinking enables more precise and grounded multimodal intelligence.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZW4T!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0bb065d2-7804-4bae-a8d0-3ac4c9af5129_2048x820.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZW4T!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0bb065d2-7804-4bae-a8d0-3ac4c9af5129_2048x820.png 424w, https://substackcdn.com/image/fetch/$s_!ZW4T!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0bb065d2-7804-4bae-a8d0-3ac4c9af5129_2048x820.png 848w, https://substackcdn.com/image/fetch/$s_!ZW4T!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0bb065d2-7804-4bae-a8d0-3ac4c9af5129_2048x820.png 1272w, https://substackcdn.com/image/fetch/$s_!ZW4T!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0bb065d2-7804-4bae-a8d0-3ac4c9af5129_2048x820.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZW4T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0bb065d2-7804-4bae-a8d0-3ac4c9af5129_2048x820.png" width="1456" height="583" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0bb065d2-7804-4bae-a8d0-3ac4c9af5129_2048x820.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:583,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZW4T!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0bb065d2-7804-4bae-a8d0-3ac4c9af5129_2048x820.png 424w, https://substackcdn.com/image/fetch/$s_!ZW4T!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0bb065d2-7804-4bae-a8d0-3ac4c9af5129_2048x820.png 848w, https://substackcdn.com/image/fetch/$s_!ZW4T!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0bb065d2-7804-4bae-a8d0-3ac4c9af5129_2048x820.png 1272w, https://substackcdn.com/image/fetch/$s_!ZW4T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0bb065d2-7804-4bae-a8d0-3ac4c9af5129_2048x820.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3><strong>Thinking in 360&#176;: Humanoid Visual Search in the Wild</strong></h3><p><strong><a href="https://arxiv.org/abs/2511.20351">https://arxiv.org/abs/2511.20351</a> |<a href="https://github.com/humanoid-vstar/hstar"> GitHub</a></strong></p><p>Humans efficiently search for visual information in 360&#176; by coordinating head and eye movements. Prior visual search approaches operate on static images, neglecting physical embodiment. This research proposes humanoid visual search, where agents actively rotate their heads to search for objects or paths in panoramic images. The H* Bench benchmark moves beyond household scenes to challenging real-world environments like transportation hubs, large-scale retail spaces, urban streets, and public institutions. Current top-tier models achieve only ~30% success on these tasks. Post-training more than triples Qwen2.5-VL&#8217;s success rates, from 14.83% to 47.38% on object search and from 6.44% to 24.94% on path search. 
The lower ceiling for path search reveals the demand for sophisticated spatial commonsense that remains a significant challenge.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;5de5edb3-b017-4315-83cf-ff443c56576d&quot;,&quot;duration&quot;:null}"></div><h3><strong>Nested Learning: The Illusion of Deep Learning Architectures</strong></h3><p><strong><a href="https://abehrouz.github.io/files/NL.pdf">https://abehrouz.github.io/files/NL.pdf</a></strong></p><p>Google Research presents a provocative reframing of deep learning. Instead of viewing models as stacked layers, Nested Learning represents them as coherent systems of nested, multi-level optimization problems, each with its own context flow and update frequency. This white-box perspective reveals that existing deep learning methods learn by compressing their context flows, explaining how in-context learning emerges. The framework yields three core contributions: Deep Optimizers (showing that gradient descent with momentum is actually a two-level associative memory module), Self-Modifying Titans (a sequence model that learns its own update algorithm), and the Continuum Memory System (generalizing short-term and long-term memory into a hierarchy updating at different time scales). 
The HOPE architecture demonstrates improved language modeling, continual learning, and long-context reasoning compared to standard Transformers and recurrent models.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!S5LT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F532ed3ae-bff8-4e29-8742-eab6382036d0_940x402.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!S5LT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F532ed3ae-bff8-4e29-8742-eab6382036d0_940x402.png 424w, https://substackcdn.com/image/fetch/$s_!S5LT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F532ed3ae-bff8-4e29-8742-eab6382036d0_940x402.png 848w, https://substackcdn.com/image/fetch/$s_!S5LT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F532ed3ae-bff8-4e29-8742-eab6382036d0_940x402.png 1272w, https://substackcdn.com/image/fetch/$s_!S5LT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F532ed3ae-bff8-4e29-8742-eab6382036d0_940x402.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!S5LT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F532ed3ae-bff8-4e29-8742-eab6382036d0_940x402.png" width="940" height="402" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/532ed3ae-bff8-4e29-8742-eab6382036d0_940x402.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:402,&quot;width&quot;:940,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!S5LT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F532ed3ae-bff8-4e29-8742-eab6382036d0_940x402.png 424w, https://substackcdn.com/image/fetch/$s_!S5LT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F532ed3ae-bff8-4e29-8742-eab6382036d0_940x402.png 848w, https://substackcdn.com/image/fetch/$s_!S5LT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F532ed3ae-bff8-4e29-8742-eab6382036d0_940x402.png 1272w, https://substackcdn.com/image/fetch/$s_!S5LT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F532ed3ae-bff8-4e29-8742-eab6382036d0_940x402.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3><strong>DoPE: Denoising Positional Encoding</strong></h3><p><strong><a href="https://arxiv.org/abs/2511.09146">https://arxiv.org/abs/2511.09146</a></strong></p><p>Rotary Position Embedding (RoPE) has inherent limits that weaken length extrapolation in Transformer models. DoPE reinterprets the attention map with positional encoding as a noisy feature map and uses truncated matrix entropy to detect outlier frequency bands. The training-free method reparameterizes the feature map with a parameter-free Gaussian distribution to achieve robust extrapolation. Experiments on needle-in-a-haystack and many-shot in-context learning tasks demonstrate that DoPE significantly improves retrieval accuracy and reasoning stability across extended contexts up to 64K tokens. 
The approach theoretically reveals the underlying cause of the attention sink phenomenon and its connection to truncated matrix entropy, providing a simple yet powerful solution for improving length generalization.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!f_SU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6daa4-592c-4c45-82a7-0fa9afcb9eaa_1086x330.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!f_SU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6daa4-592c-4c45-82a7-0fa9afcb9eaa_1086x330.png 424w, https://substackcdn.com/image/fetch/$s_!f_SU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6daa4-592c-4c45-82a7-0fa9afcb9eaa_1086x330.png 848w, https://substackcdn.com/image/fetch/$s_!f_SU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6daa4-592c-4c45-82a7-0fa9afcb9eaa_1086x330.png 1272w, https://substackcdn.com/image/fetch/$s_!f_SU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6daa4-592c-4c45-82a7-0fa9afcb9eaa_1086x330.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!f_SU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6daa4-592c-4c45-82a7-0fa9afcb9eaa_1086x330.png" width="1086" height="330" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/34b6daa4-592c-4c45-82a7-0fa9afcb9eaa_1086x330.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:330,&quot;width&quot;:1086,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!f_SU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6daa4-592c-4c45-82a7-0fa9afcb9eaa_1086x330.png 424w, https://substackcdn.com/image/fetch/$s_!f_SU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6daa4-592c-4c45-82a7-0fa9afcb9eaa_1086x330.png 848w, https://substackcdn.com/image/fetch/$s_!f_SU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6daa4-592c-4c45-82a7-0fa9afcb9eaa_1086x330.png 1272w, https://substackcdn.com/image/fetch/$s_!f_SU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34b6daa4-592c-4c45-82a7-0fa9afcb9eaa_1086x330.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div><hr></div><h2><strong>&#127909; 4 Videos</strong></h2><h3><strong>The Unbearable Lightness of Agent Optimization</strong></h3><div id="youtube2-zfvEMNmVlNY" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;zfvEMNmVlNY&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/zfvEMNmVlNY?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Alberto introduces Meta-ACE, a learned meta-optimization framework that dynamically orchestrates multiple strategies to maximize task performance under real-world constraints. Instead of uniform prompt refinement, Meta-ACE profiles each task by complexity, verifiability, and feedback quality, then selects an optimal strategy bundle via a lightweight meta-controller. 
The system adaptively chooses between context evolution, adaptive compute, hierarchical verification, structured memory, and selective test-time parameter adaptation. The talk provides a systematic approach to building self-optimizing AI agents for regulated industries rather than relying on one-size-fits-all optimization methods.</p><h3><strong>The Thinking Game</strong></h3><div id="youtube2-d95J8yzvjbQ" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;d95J8yzvjbQ&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/d95J8yzvjbQ?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>This documentary from the award-winning team behind AlphaGo takes you inside DeepMind over five years, capturing the team&#8217;s pursuit of artificial general intelligence. The film examines how Demis Hassabis&#8217;s extraordinary beginnings shaped his lifelong work while documenting the rigorous process of scientific discovery. You&#8217;ll see the journey from mastering complex strategy games to solving the 50-year-old protein folding problem with AlphaFold, including the ups and downs of tackling fundamental scientific challenges. 
It provides insight into what building transformative AI systems actually looks like beyond polished research announcements.</p><h3><strong>What is AI &#8220;reward hacking&#8221; and why do we worry about it?</strong></h3><div id="youtube2-lvMMZLYoDr4" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;lvMMZLYoDr4&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/lvMMZLYoDr4?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>The Anthropic team discusses new research showing that realistic AI training processes can accidentally produce misaligned models. When large language models learn to cheat on software programming tasks, they develop other misaligned behaviors as unintended consequences, including alignment faking and sabotage of AI safety research. The work demonstrates for the first time that natural emergent misalignment can arise from reward hacking in production RL systems, making it essential viewing for anyone deploying reinforcement learning in real applications.</p><h3><strong>Jeff Dean on Important AI Trends</strong></h3><div id="youtube2-AnTw_t21ayE" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;AnTw_t21ayE&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/AnTw_t21ayE?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Jeff Dean, Google&#8217;s Chief Scientist and co-founder of Google Brain, discusses the trends shaping modern AI. 
As one of the most influential computer scientists of the modern computing era, his work spans the foundations of large-scale distributed systems, deep learning frameworks like TensorFlow, and today&#8217;s frontier AI research. The conversation covers practical insights into where AI capabilities are heading and the infrastructure challenges that come with scaling intelligence systems.</p><div><hr></div><h2><strong>&#128240; 3 Curated Reads</strong></h2><h3><strong>The bug that taught me more about PyTorch than years of using it</strong></h3><p><strong><a href="https://elanapearl.github.io/blog/2025/the-bug-that-taught-me-pytorch/">https://elanapearl.github.io/blog/2025/the-bug-that-taught-me-pytorch/</a></strong></p><p>Elana Simon shares a debugging story that reveals deep insights into PyTorch&#8217;s execution model. The post walks through discovering a subtle bug that exposed fundamental misunderstandings about how PyTorch handles computations. You&#8217;ll learn practical lessons about tensor operations, gradient computation, and memory management that textbooks rarely cover. 
The kind of hard-won knowledge that only comes from fighting with production code.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EcWn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7537c874-3f43-429b-ac01-285ff87028e8_2048x973.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EcWn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7537c874-3f43-429b-ac01-285ff87028e8_2048x973.png 424w, https://substackcdn.com/image/fetch/$s_!EcWn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7537c874-3f43-429b-ac01-285ff87028e8_2048x973.png 848w, https://substackcdn.com/image/fetch/$s_!EcWn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7537c874-3f43-429b-ac01-285ff87028e8_2048x973.png 1272w, https://substackcdn.com/image/fetch/$s_!EcWn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7537c874-3f43-429b-ac01-285ff87028e8_2048x973.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EcWn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7537c874-3f43-429b-ac01-285ff87028e8_2048x973.png" width="1456" height="692" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7537c874-3f43-429b-ac01-285ff87028e8_2048x973.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:692,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EcWn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7537c874-3f43-429b-ac01-285ff87028e8_2048x973.png 424w, https://substackcdn.com/image/fetch/$s_!EcWn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7537c874-3f43-429b-ac01-285ff87028e8_2048x973.png 848w, https://substackcdn.com/image/fetch/$s_!EcWn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7537c874-3f43-429b-ac01-285ff87028e8_2048x973.png 1272w, https://substackcdn.com/image/fetch/$s_!EcWn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7537c874-3f43-429b-ac01-285ff87028e8_2048x973.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>My Engineering Workflow in CursorAI</strong></h3><p><strong><a href="https://codeaholicguy.com/2025/10/18/my-engineering-workflow-in-cursorai/">https://codeaholicguy.com/2025/10/18/my-engineering-workflow-in-cursorai/</a></strong></p><p>Hoang Nguyen details his practical workflow for engineering with CursorAI. The post covers specific patterns for getting AI assistance to actually accelerate development rather than creating more cleanup work. You&#8217;ll find concrete examples of prompting strategies, context management techniques, and workflow organization that work in real projects. 
Useful if you&#8217;re trying to move beyond basic autocomplete to genuine AI-augmented development.</p><h3><strong>Group Relative Policy Optimization (GRPO)</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:177823868,&quot;url&quot;:&quot;https://cameronrwolfe.substack.com/p/grpo&quot;,&quot;publication_id&quot;:1092659,&quot;publication_name&quot;:&quot;Deep (Learning) Focus&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!87xa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9b43fb-52d5-40da-995d-5b7cd3f91064_896x896.png&quot;,&quot;title&quot;:&quot;Group Relative Policy Optimization (GRPO)&quot;,&quot;truncated_body_text&quot;:&quot;Reinforcement learning (RL) has always played a pivotal role in research on large language models (LLMs), beginning with its use for aligning LLMs to human preferences. More recently, researchers have heavily focused on using RL training to improve LLM reasoning performance. This line of research has led to a rapid expansion of LLM capabil&#8230;&quot;,&quot;date&quot;:&quot;2025-11-24T10:33:31.743Z&quot;,&quot;like_count&quot;:55,&quot;comment_count&quot;:4,&quot;bylines&quot;:[{&quot;id&quot;:29736521,&quot;name&quot;:&quot;Cameron R. Wolfe, Ph.D.&quot;,&quot;handle&quot;:&quot;cwolferesearch&quot;,&quot;previous_name&quot;:&quot;Cameron R. 
Wolfe&quot;,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/69aba7df-b571-4609-aa47-fc2d031c11b8_1242x1595.jpeg&quot;,&quot;bio&quot;:&quot;Research @ Netflix &#8226; Rice University PhD &#8226; I make AI understandable&quot;,&quot;profile_set_up_at&quot;:&quot;2022-09-17T15:11:34.083Z&quot;,&quot;reader_installed_at&quot;:&quot;2023-01-10T11:25:00.723Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:1042380,&quot;user_id&quot;:29736521,&quot;publication_id&quot;:1092659,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:1092659,&quot;name&quot;:&quot;Deep (Learning) Focus&quot;,&quot;subdomain&quot;:&quot;cameronrwolfe&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;I contextualize and explain important topics in AI research.&quot;,&quot;logo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/ab9b43fb-52d5-40da-995d-5b7cd3f91064_896x896.png&quot;,&quot;author_id&quot;:29736521,&quot;primary_user_id&quot;:29736521,&quot;theme_var_background_pop&quot;:&quot;#6C0095&quot;,&quot;created_at&quot;:&quot;2022-09-17T15:12:33.160Z&quot;,&quot;email_from_name&quot;:&quot;Deep (Learning) Focus&quot;,&quot;copyright&quot;:&quot;Cameron R. 
Wolfe&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;twitter_screen_name&quot;:&quot;cwolferesearch&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:null,&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://cameronrwolfe.substack.com/p/grpo?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!87xa!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9b43fb-52d5-40da-995d-5b7cd3f91064_896x896.png" loading="lazy"><span class="embedded-post-publication-name">Deep (Learning) Focus</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Group Relative Policy Optimization (GRPO)</div></div><div class="embedded-post-body">Reinforcement learning (RL) has always played a pivotal role in research on large language models (LLMs), beginning with its use for aligning LLMs to human preferences. More recently, researchers have heavily focused on using RL training to improve LLM reasoning performance. 
This line of research has led to a rapid expansion of LLM capabil&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">5 months ago &#183; 55 likes &#183; 4 comments &#183; Cameron R. Wolfe, Ph.D.</div></a></div><p><span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Cameron R. Wolfe, Ph.D.&quot;,&quot;id&quot;:29736521,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/69aba7df-b571-4609-aa47-fc2d031c11b8_1242x1595.jpeg&quot;,&quot;uuid&quot;:&quot;a9093d6d-e91c-4df9-966e-fbde322fa303&quot;}" data-component-name="MentionToDOM"></span> breaks down Group Relative Policy Optimization, a reinforcement learning algorithm that&#8217;s become important for training modern LLMs. The post explains how GRPO differs from PPO and DPO, why it matters for alignment, and the mathematical foundations that make it work. Essential reading for understanding the RL techniques behind recent model improvements, with clear explanations that bridge theory and practice.</p><div><hr></div><h2><strong>&#128736; 2 Tools &amp; Repos</strong></h2><h3><strong>LLM Council</strong></h3><p><strong><a href="https://github.com/karpathy/llm-council">https://github.com/karpathy/llm-council</a></strong></p><p>Andrej Karpathy built this system for querying multiple LLMs simultaneously, having them review each other&#8217;s work, then producing a final response through a Chairman LLM. Instead of asking one model, you send queries to your council (GPT, Claude, Gemini, Grok) via OpenRouter. The models provide individual responses, review each other&#8217;s work with anonymized identities to prevent favoritism, and then a designated Chairman compiles the final answer. 
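The three-stage flow reads naturally as code. The sketch below is a minimal, hypothetical rendition of that loop, not Karpathy's implementation: `ask` stands in for a real OpenRouter chat call, and the model names and prompts are placeholders:

```python
def run_council(ask, council, chairman, query):
    """Toy council loop: independent answers -> anonymized peer review -> chairman synthesis.

    `ask(model, prompt)` is a stand-in for a real OpenRouter chat-completion call.
    """
    # Stage 1: every council member answers the query independently.
    answers = [ask(model, query) for model in council]

    # Stage 2: strip model identities before peer review so reviewers
    # can't favor their own or a famous peer's output.
    listing = "\n\n".join(f"Response {i + 1}:\n{text}" for i, text in enumerate(answers))
    review_prompt = f"Rank these anonymous answers to '{query}' from best to worst:\n\n{listing}"
    reviews = [ask(model, review_prompt) for model in council]

    # Stage 3: the chairman sees everything and writes the final response.
    final_prompt = (
        f"Question: {query}\n\nAnswers:\n{listing}\n\nPeer reviews:\n"
        + "\n".join(reviews)
        + "\n\nCompose the final answer."
    )
    return ask(chairman, final_prompt)


# Dry run with a stub in place of real API calls:
stub = lambda model, prompt: f"[{model}] ok"
print(run_council(stub, ["gpt", "claude", "gemini"], "grok", "What is 2+2?"))
# -> [grok] ok
```

The anonymization in stage 2 is the load-bearing detail of the design: reviews are made on content alone, which is what keeps the peer ranking honest.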
The system provides transparency into how different models approach the same problem and enables systematic model evaluation through peer review. Karpathy describes it as a &#8220;Saturday vibe code project&#8221; and notes he won&#8217;t support it, but the code is there for exploration and modification.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vF_M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb79afdac-fa3a-4709-85ec-106c996775ac_1024x506.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vF_M!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb79afdac-fa3a-4709-85ec-106c996775ac_1024x506.jpeg 424w, https://substackcdn.com/image/fetch/$s_!vF_M!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb79afdac-fa3a-4709-85ec-106c996775ac_1024x506.jpeg 848w, https://substackcdn.com/image/fetch/$s_!vF_M!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb79afdac-fa3a-4709-85ec-106c996775ac_1024x506.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!vF_M!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb79afdac-fa3a-4709-85ec-106c996775ac_1024x506.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vF_M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb79afdac-fa3a-4709-85ec-106c996775ac_1024x506.jpeg" width="1024" height="506" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b79afdac-fa3a-4709-85ec-106c996775ac_1024x506.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:506,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vF_M!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb79afdac-fa3a-4709-85ec-106c996775ac_1024x506.jpeg 424w, https://substackcdn.com/image/fetch/$s_!vF_M!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb79afdac-fa3a-4709-85ec-106c996775ac_1024x506.jpeg 848w, https://substackcdn.com/image/fetch/$s_!vF_M!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb79afdac-fa3a-4709-85ec-106c996775ac_1024x506.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!vF_M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb79afdac-fa3a-4709-85ec-106c996775ac_1024x506.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Agentic Paper Reviewer</strong></h3><p><strong><a href="https://paperreview.ai/">https://paperreview.ai/</a></strong></p><p>Andrew Ng&#8217;s automated system for reviewing research papers. The tool provides structured feedback on paper quality, methodology, and contributions using agentic AI workflows. Designed to help researchers get constructive feedback on their work before submission, particularly useful for identifying potential weaknesses and improvement areas. 
The system combines domain knowledge with systematic evaluation frameworks to provide more than superficial summaries.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XunU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F251881a7-2410-4fe2-a2de-48503ec873d0_1276x461.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XunU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F251881a7-2410-4fe2-a2de-48503ec873d0_1276x461.png 424w, https://substackcdn.com/image/fetch/$s_!XunU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F251881a7-2410-4fe2-a2de-48503ec873d0_1276x461.png 848w, https://substackcdn.com/image/fetch/$s_!XunU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F251881a7-2410-4fe2-a2de-48503ec873d0_1276x461.png 1272w, https://substackcdn.com/image/fetch/$s_!XunU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F251881a7-2410-4fe2-a2de-48503ec873d0_1276x461.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XunU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F251881a7-2410-4fe2-a2de-48503ec873d0_1276x461.png" width="1276" height="461" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/251881a7-2410-4fe2-a2de-48503ec873d0_1276x461.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:461,&quot;width&quot;:1276,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XunU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F251881a7-2410-4fe2-a2de-48503ec873d0_1276x461.png 424w, https://substackcdn.com/image/fetch/$s_!XunU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F251881a7-2410-4fe2-a2de-48503ec873d0_1276x461.png 848w, https://substackcdn.com/image/fetch/$s_!XunU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F251881a7-2410-4fe2-a2de-48503ec873d0_1276x461.png 1272w, https://substackcdn.com/image/fetch/$s_!XunU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F251881a7-2410-4fe2-a2de-48503ec873d0_1276x461.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h2><strong>&#127891; 1 Pick of the Week</strong></h2><h3><strong>Olmo 3 From Scratch</strong></h3><p><strong><a href="https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/13_olmo3/standalone-olmo3.ipynb">https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/13_olmo3/standalone-olmo3.ipynb</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4GF_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb46186dd-15d0-4d5d-998d-894c555efc39_1677x2048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4GF_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb46186dd-15d0-4d5d-998d-894c555efc39_1677x2048.png 424w, 
https://substackcdn.com/image/fetch/$s_!4GF_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb46186dd-15d0-4d5d-998d-894c555efc39_1677x2048.png 848w, https://substackcdn.com/image/fetch/$s_!4GF_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb46186dd-15d0-4d5d-998d-894c555efc39_1677x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!4GF_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb46186dd-15d0-4d5d-998d-894c555efc39_1677x2048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4GF_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb46186dd-15d0-4d5d-998d-894c555efc39_1677x2048.png" width="1456" height="1778" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b46186dd-15d0-4d5d-998d-894c555efc39_1677x2048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1778,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4GF_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb46186dd-15d0-4d5d-998d-894c555efc39_1677x2048.png 424w, 
https://substackcdn.com/image/fetch/$s_!4GF_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb46186dd-15d0-4d5d-998d-894c555efc39_1677x2048.png 848w, https://substackcdn.com/image/fetch/$s_!4GF_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb46186dd-15d0-4d5d-998d-894c555efc39_1677x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!4GF_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb46186dd-15d0-4d5d-998d-894c555efc39_1677x2048.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Sebastian Raschka implemented Olmo 3 from scratch in a standalone notebook, providing the clearest way to understand the architecture at a glance. The notebook covers both 7B and 32B models with detailed comparisons to other architectures like Qwen3. Olmo 3 is interesting because it represents the leading fully open-source model with complete transparency about training data, code, and methodology. The notebook includes practical details about attention mechanisms, normalization strategies, and architectural choices that distinguish Olmo 3 from other recent models.</p><div><hr></div><p><em>Thanks for reading The Tokenizer! If you found something useful here, share it with someone who might benefit. And if you want more curated insights like this, consider subscribing to Gradient Ascent.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption"></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[AI Agents That Train Themselves, NVIDIA Cuts Training Costs 360x, and Meta's SAM 3D Wins 5:1: The Tokenizer Edition]]></title><description><![CDATA[This week's best AI resources]]></description><link>https://newsletter.artofsaience.com/p/ai-agents-that-train-themselves-nvidia</link><guid 
isPermaLink="false">https://newsletter.artofsaience.com/p/ai-agents-that-train-themselves-nvidia</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Mon, 24 Nov 2025 16:02:45 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/I4olDc6MmP8" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! Meta released SAM 3D, a model that generates complete 3D objects from a single photo with a 5:1 win rate against competing methods. Meanwhile, Agent0 proved you can train capable AI agents entirely from scratch without any human-curated data. Maybe self-evolution actually works when you give agents the right tools?</p><h3><strong>New here?</strong></h3><p><em>The Tokenizer is my resource-focused newsletter edition where I curate the best papers, videos, articles, tools, and learning resources from across the AI landscape. Consider it your weekly dose of everything you need to stay ahead in machine learning.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Gradient Ascent! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2><strong>TL;DR</strong></h2><p><strong>What caught my attention this week:</strong></p><ul><li><p>&#128196; <strong>Papers:</strong> Single-image 3D reconstruction that handles real-world clutter, self-evolving agents that need zero external data, cost-efficient model families from NVIDIA, and formal proofs of LLM scaling limits</p></li><li><p>&#127909; <strong>Videos:</strong> Mapping open model progress, practical AI coding workflows, and surprising advances in small reasoning models</p></li><li><p>&#128240; <strong>Reads:</strong> How OpenAI is using evals to drive enterprise AI, Stanford&#8217;s CS336 study notes, and insights on policy distillation</p></li><li><p>&#128736; <strong>Tools:</strong> Automated paper-to-agent systems and comprehensive ML systems resources</p></li><li><p>&#127891; <strong>Learning:</strong> Google&#8217;s Nano Banana Pro (Gemini 3 Pro Image), the studio-quality upgrade to their viral image model</p></li></ul><div><hr></div><p>Grab my first book &#8212; <strong><a href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/">AI for the Rest of Us</a></strong> &#8212; today!</p><blockquote><p><em><strong>Quick note:</strong> If you find the book useful, please leave a review on Amazon. It makes a world of difference. If you have a picture of the book IRL, please share it with me. 
I really appreciate it.</em></p></blockquote><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/&quot;,&quot;text&quot;:&quot;Order AI for the Rest of Us Today!&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/"><span>Order AI for the Rest of Us Today!</span></a></p><div><hr></div><h2><strong>&#128196; 5 Papers</strong></h2><h3><strong>SAM 3D: 3Dfy Anything in Images</strong></h3><p><strong><a href="https://arxiv.org/abs/2511.16624">https://arxiv.org/abs/2511.16624</a> |<a href="https://github.com/facebookresearch/sam-3d-objects"> GitHub</a></strong></p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;cc4350b1-bc09-41a3-835c-2e6631ed8d71&quot;,&quot;duration&quot;:null}"></div><p>Meta&#8217;s SAM 3D reconstructs full 3D geometry, texture, and layout from single images, focusing on real-world scenarios where objects are partially occluded or surrounded by clutter. The breakthrough comes from a human-and-model-in-the-loop pipeline that generated visually grounded 3D reconstruction data at unprecedented scale, combined with a multi-stage training framework that moves from synthetic pretraining to real-world alignment. 
Unlike methods that fail when objects are partially hidden, SAM 3D achieves at least a 5:1 win rate in human preference tests on real-world objects and scenes.</p><h3><strong>Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning</strong></h3><p><strong><a href="https://arxiv.org/abs/2511.16043">https://arxiv.org/abs/2511.16043</a> |<a href="https://github.com/aiming-lab/Agent0"> GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!26SS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9208fd7e-2fb8-4f3d-a3e4-2308657df65a_1606x830.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!26SS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9208fd7e-2fb8-4f3d-a3e4-2308657df65a_1606x830.png 424w, https://substackcdn.com/image/fetch/$s_!26SS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9208fd7e-2fb8-4f3d-a3e4-2308657df65a_1606x830.png 848w, https://substackcdn.com/image/fetch/$s_!26SS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9208fd7e-2fb8-4f3d-a3e4-2308657df65a_1606x830.png 1272w, https://substackcdn.com/image/fetch/$s_!26SS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9208fd7e-2fb8-4f3d-a3e4-2308657df65a_1606x830.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!26SS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9208fd7e-2fb8-4f3d-a3e4-2308657df65a_1606x830.png" width="1456" height="752" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9208fd7e-2fb8-4f3d-a3e4-2308657df65a_1606x830.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:752,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!26SS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9208fd7e-2fb8-4f3d-a3e4-2308657df65a_1606x830.png 424w, https://substackcdn.com/image/fetch/$s_!26SS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9208fd7e-2fb8-4f3d-a3e4-2308657df65a_1606x830.png 848w, https://substackcdn.com/image/fetch/$s_!26SS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9208fd7e-2fb8-4f3d-a3e4-2308657df65a_1606x830.png 1272w, https://substackcdn.com/image/fetch/$s_!26SS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9208fd7e-2fb8-4f3d-a3e4-2308657df65a_1606x830.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Training AI agents typically requires massive amounts of human-curated data, limiting both scalability and capabilities to what humans already know. Agent0 breaks this dependency through symbiotic competition between two agents: a curriculum agent that proposes increasingly complex tasks, and an executor agent that learns to solve them using external tools. The key insight is giving agents access to tools like Python interpreters, which provides problem-solving capabilities beyond the base model. 
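</p><p><em>To make the loop concrete, here&#8217;s a toy sketch of the curriculum-executor dynamic. It&#8217;s our simplification, not the paper&#8217;s code: Agent0 proper uses two LLMs trained with reinforcement learning, and the tool is a sandboxed Python interpreter.</em></p>

```python
def propose_task(skill):
    # Curriculum agent: propose a sum at the executor's frontier --
    # task length scales with the executor's current skill.
    n = skill + 2
    return "+".join(["3"] * n), n

def execute(expr, max_terms):
    # Executor agent: uses the Python interpreter as its tool, but can
    # only handle expressions up to its current capability.
    if expr.count("+") + 1 > max_terms:
        return None              # beyond capability: task fails
    return eval(expr)            # tool-grounded, verifiable answer

def self_evolve(rounds):
    skill = 2                    # executor starts able to sum 2 terms
    for _ in range(rounds):
        expr, n = propose_task(skill)      # frontier task: n = skill + 2
        result = execute(expr, skill + 2)  # executor stretches to attempt it
        if result == 3 * n:                # the tool verifies success
            skill += 1           # capability grows with zero human data
    return skill

print(self_evolve(4))  # skill climbs from 2 to 6 as the loop runs
```

<p>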
This self-reinforcing cycle improved Qwen3-8B-Base by 18% on mathematical reasoning and 24% on general reasoning benchmarks, entirely without external training data.</p><h3><strong>Nemotron Elastic: Towards Efficient Many-in-One Reasoning LLMs</strong></h3><p><strong><a href="https://arxiv.org/abs/2511.16664">https://arxiv.org/abs/2511.16664</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xTXc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9d920f-f9bf-448e-b473-c352936f3d89_1533x824.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xTXc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9d920f-f9bf-448e-b473-c352936f3d89_1533x824.png 424w, https://substackcdn.com/image/fetch/$s_!xTXc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9d920f-f9bf-448e-b473-c352936f3d89_1533x824.png 848w, https://substackcdn.com/image/fetch/$s_!xTXc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9d920f-f9bf-448e-b473-c352936f3d89_1533x824.png 1272w, https://substackcdn.com/image/fetch/$s_!xTXc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9d920f-f9bf-448e-b473-c352936f3d89_1533x824.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xTXc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9d920f-f9bf-448e-b473-c352936f3d89_1533x824.png" width="1456" height="783" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb9d920f-f9bf-448e-b473-c352936f3d89_1533x824.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:783,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xTXc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9d920f-f9bf-448e-b473-c352936f3d89_1533x824.png 424w, https://substackcdn.com/image/fetch/$s_!xTXc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9d920f-f9bf-448e-b473-c352936f3d89_1533x824.png 848w, https://substackcdn.com/image/fetch/$s_!xTXc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9d920f-f9bf-448e-b473-c352936f3d89_1533x824.png 1272w, https://substackcdn.com/image/fetch/$s_!xTXc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9d920f-f9bf-448e-b473-c352936f3d89_1533x824.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Training a family of language models at different scales typically requires separate training runs for each size, burning through hundreds of billions of tokens per model. NVIDIA&#8217;s Nemotron Elastic embeds multiple nested submodels within a single parent model, where each smaller version shares weights with the parent and can be extracted zero-shot at deployment. Applied to the Nemotron Nano V2 12B model, this approach simultaneously produced 9B and 6B models using only 110B training tokens. 
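</p><p><em>Structurally, the nesting idea looks like weight-sharing slices of one parent (a loose NumPy analogy, not NVIDIA&#8217;s actual method, which learns which components each submodel keeps during elastic training):</em></p>

```python
import numpy as np

rng = np.random.default_rng(0)

# One trained parent layer; in Nemotron Elastic the 12B parent plays
# this role and the 9B/6B models live inside it.
parent_w = rng.standard_normal((12, 12))

def extract(width):
    """Zero-shot extraction: the smaller model reuses the parent's
    leading block of weights -- no retraining, weights are shared
    by construction."""
    return parent_w[:width, :width]

w9, w6 = extract(9), extract(6)

# Nesting property: the 6-wide model sits inside the 9-wide model,
# which sits inside the parent; one training run serves all sizes.
assert np.shares_memory(w6, parent_w)
assert np.array_equal(w9[:6, :6], w6)
```

<p>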
That&#8217;s over 360x cost reduction compared to training from scratch, and around 7x better than existing compression techniques, while each nested model performs on par or better than state-of-the-art models at their respective sizes.</p><h3><strong>On the Fundamental Limits of LLMs at Scale</strong></h3><p><strong><a href="https://arxiv.org/abs/2511.12869">https://arxiv.org/abs/2511.12869</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!m06i!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48f9319e-cfdd-449d-bd72-2768ad75b503_855x500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!m06i!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48f9319e-cfdd-449d-bd72-2768ad75b503_855x500.png 424w, https://substackcdn.com/image/fetch/$s_!m06i!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48f9319e-cfdd-449d-bd72-2768ad75b503_855x500.png 848w, https://substackcdn.com/image/fetch/$s_!m06i!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48f9319e-cfdd-449d-bd72-2768ad75b503_855x500.png 1272w, https://substackcdn.com/image/fetch/$s_!m06i!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48f9319e-cfdd-449d-bd72-2768ad75b503_855x500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!m06i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48f9319e-cfdd-449d-bd72-2768ad75b503_855x500.png" 
width="855" height="500" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/48f9319e-cfdd-449d-bd72-2768ad75b503_855x500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:855,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!m06i!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48f9319e-cfdd-449d-bd72-2768ad75b503_855x500.png 424w, https://substackcdn.com/image/fetch/$s_!m06i!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48f9319e-cfdd-449d-bd72-2768ad75b503_855x500.png 848w, https://substackcdn.com/image/fetch/$s_!m06i!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48f9319e-cfdd-449d-bd72-2768ad75b503_855x500.png 1272w, https://substackcdn.com/image/fetch/$s_!m06i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48f9319e-cfdd-449d-bd72-2768ad75b503_855x500.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Large language models have improved dramatically with scale, but these gains face five fundamental limitations: hallucination, context compression, reasoning degradation, retrieval fragility, and multimodal misalignment. This paper provides a rigorous theoretical framework showing these aren&#8217;t engineering problems to be solved through more compute. Computability theory guarantees that for any model family, diagonalization ensures inputs where models must fail. Finite description length enforces irreducible generalization error. Long context windows compress far below their nominal size due to positional under-training and attention mechanics. 
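</p><p><em>The diagonalization step is the classic argument and fits in two lines (our notation, not the paper&#8217;s):</em></p>

```latex
% Enumerate any countable family of models M_1, M_2, \dots with
% binary outputs, and define an adversarial function by flipping
% each model's own output on its own index:
f(i) \;=\; 1 - M_i(i), \qquad i = 1, 2, \dots
% Then f(i) \neq M_i(i) for every i, so no member of the family
% computes f: each M_i is guaranteed an input on which it fails.
```

<p>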
The work establishes mathematical boundaries for where scaling helps, where it saturates, and where it provably cannot progress.</p><h3><strong>LeJEPA: Provable and Scalable Self-Supervised Learning Without the Heuristics</strong></h3><p><strong><a href="https://arxiv.org/abs/2511.08544">https://arxiv.org/abs/2511.08544</a> |<a href="https://github.com/rbalestr-lab/lejepa"> GitHub</a></strong></p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;effe0104-0d4d-493f-8364-3d070e7b8de5&quot;,&quot;duration&quot;:null}"></div><p>Self-supervised learning through Joint-Embedding Predictive Architectures has relied on brittle heuristics like stop-gradients and teacher-student networks to prevent collapse. LeJEPA replaces these with a comprehensive theory proving that isotropic Gaussian distributions minimize downstream prediction risk. The framework introduces Sketched Isotropic Gaussian Regularization (SIGReg) to constrain embeddings to this optimal distribution, requiring only a single trade-off hyperparameter and linear time/memory complexity. LeJEPA works out of the box across ResNets, ViTs, and ConvNets, achieving 79% on ImageNet-1k with a frozen ViT-H/14 backbone. 
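</p><p><em>In spirit, SIGReg checks that random 1-D projections of the embeddings look standard-Gaussian. Here&#8217;s a simplified NumPy stand-in that penalizes only the first two moments of each projection; the paper uses a proper statistical test per direction, so treat this as a sketch:</em></p>

```python
import numpy as np

def sigreg_sketch(z, num_dirs=64, seed=0):
    """Project embeddings onto random unit directions and penalize each
    1-D projection for departing from N(0, 1) via its first two moments.
    (SIGReg proper uses a sketched goodness-of-fit statistic per
    direction; moment matching is our simplification.)"""
    rng = np.random.default_rng(seed)
    dirs = rng.standard_normal((z.shape[1], num_dirs))
    dirs /= np.linalg.norm(dirs, axis=0)       # unit-norm directions
    proj = z @ dirs                            # (batch, num_dirs)
    mean_pen = proj.mean(axis=0) ** 2          # want mean 0
    var_pen = (proj.var(axis=0) - 1.0) ** 2    # want variance 1
    return float((mean_pen + var_pen).mean())

rng = np.random.default_rng(1)
isotropic = rng.standard_normal((4096, 16))    # healthy embeddings
collapsed = np.ones((4096, 16))                # fully collapsed embeddings
assert sigreg_sketch(isotropic) < sigreg_sketch(collapsed)
```

<p>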
The approach requires approximately 50 lines of code and eliminates the need for hyperparameter schedulers.</p><div><hr></div><h2><strong>&#127909; 4 Videos</strong></h2><h3><strong>Mapping the Open Model Landscape in 2025</strong></h3><div id="youtube2-QlrGr-D4vTg" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;QlrGr-D4vTg&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/QlrGr-D4vTg?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Nathan Lambert provides a comprehensive overview of where open-source AI models stand at the end of 2025. The discussion covers which models are genuinely competitive with proprietary options, where the gaps remain, and how the landscape has shifted over the past year. Particularly useful for understanding which open models to consider for specific use cases and what trade-offs you&#8217;re making when choosing between open and closed options.</p><h3><strong>Coding with AI</strong></h3><div id="youtube2-I4olDc6MmP8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;I4olDc6MmP8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/I4olDc6MmP8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Chip Huyen walks through practical workflows for coding with AI assistance, moving beyond theoretical discussions to show how experienced engineers actually integrate AI into their development process. 
The video addresses when AI coding tools help versus when they get in the way, covering context management, code review workflows, and strategies for maintaining code quality when working with AI-generated suggestions.</p><h3><strong>Learn the basics of Google Antigravity</strong></h3><div id="youtube2-nTOVIGsqCuY" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;nTOVIGsqCuY&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/nTOVIGsqCuY?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>An introduction to Google Antigravity, Google&#8217;s agent-first development platform launched alongside Gemini 3. The video covers the basics of handing coding tasks to agents that work across the editor, terminal, and browser, and how to review what they did through the artifacts they produce. A useful primer if you&#8217;re deciding whether agentic IDEs fit your workflow.</p><h3><strong>The Weirdly Small AI That Cracks Reasoning Puzzles [HRM]</strong></h3><div id="youtube2-RK7lysjz_G0" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;RK7lysjz_G0&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/RK7lysjz_G0?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Jia-Bin Huang examines surprisingly capable small models that solve complex reasoning tasks, challenging assumptions about the relationship between model size and reasoning capability. 
The video explores what makes these compact models effective and the implications for deploying reasoning systems with limited computational resources.</p><div><hr></div><h2><strong>&#128240; 3 Curated Reads</strong></h2><h3><strong>How evals drive the next chapter in AI for businesses</strong></h3><p><strong><a href="https://openai.com/index/evals-drive-next-chapter-of-ai/">https://openai.com/index/evals-drive-next-chapter-of-ai/</a></strong></p><p>OpenAI explains how systematic evaluation frameworks are becoming central to deploying AI in enterprise settings. The piece covers moving beyond anecdotal testing to rigorous evaluation systems that catch failures before production, measure performance across edge cases, and provide confidence for business-critical applications. Essential reading for anyone moving AI from prototype to production.</p><h3><strong>Study Notes: Stanford CS336 Language Modeling from Scratch</strong></h3><p><strong><a href="https://bearbearyu1223.github.io/cs336/2025/07/20/cs336-note-get-started.html">https://bearbearyu1223.github.io/cs336/2025/07/20/cs336-note-get-started.html</a></strong></p><p>Han Yu shares detailed study notes from Stanford&#8217;s CS336, which covers building language models from first principles. The notes distill key concepts and implementation details that often get glossed over in high-level overviews. Useful for practitioners who want to understand what&#8217;s actually happening under the hood when training language models.</p><h3><strong>On Policy Distillation</strong></h3><p><strong><a href="https://thinkingmachines.ai/blog/on-policy-distillation/">https://thinkingmachines.ai/blog/on-policy-distillation/</a></strong></p><p>Thinking Machines Lab explores policy distillation techniques for transferring capabilities from larger teacher models to smaller student models. 
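</p><p><em>The core objective is easy to sketch: the student samples the tokens, and the teacher grades each one with a per-token reverse KL. A minimal NumPy illustration of that loss shape (ours, not Thinking Machines&#8217; code):</em></p>

```python
import numpy as np

def reverse_kl(student_logits, teacher_logits):
    """Per-token reverse KL, KL(student || teacher): the dense signal in
    on-policy distillation. The logits are for tokens the STUDENT itself
    sampled; the teacher only scores them."""
    def log_softmax(x):
        x = x - x.max(axis=-1, keepdims=True)
        return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))
    ls, lt = log_softmax(student_logits), log_softmax(teacher_logits)
    return (np.exp(ls) * (ls - lt)).sum(axis=-1)  # one value per token

# Where the student already matches the teacher the penalty vanishes;
# where it diverges, that token contributes a positive loss.
teacher = np.array([[2.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
student = np.array([[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
per_token = reverse_kl(student, teacher)
assert per_token[0] < 1e-9 and per_token[1] > 0.5
```

<p>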
The analysis covers when distillation works well, where it breaks down, and practical considerations for applying these techniques in production systems where you need efficient inference without sacrificing too much capability.</p><div><hr></div><h2><strong>&#128736; 2 Tools &amp; Repos</strong></h2><h3><strong>Paper2Agent</strong></h3><p><strong><a href="https://github.com/jmiao24/Paper2Agent">https://github.com/jmiao24/Paper2Agent</a></strong></p><p>Automates the process of converting research papers into functional AI agents. Instead of manually implementing methods described in papers, this tool extracts key algorithmic details and generates working code. While it won&#8217;t replace careful human implementation for production systems, it accelerates prototyping and helps you quickly test whether a paper&#8217;s approach applies to your problem.</p><h3><strong>Machine Learning Systems Book</strong></h3><p><strong><a href="https://github.com/harvard-edge/cs249r_book">https://github.com/harvard-edge/cs249r_book</a></strong></p><p>Harvard&#8217;s comprehensive resource on machine learning systems, covering everything from training infrastructure to deployment considerations. The book addresses practical engineering challenges that papers typically ignore: how to actually build, scale, and maintain ML systems in production. 
Valuable for anyone moving beyond toy datasets to real-world ML engineering.</p><div><hr></div><h2><strong>&#127891; 1 Pick of the Week</strong></h2><h3><strong>Nano Banana Pro (Gemini 3 Pro Image)</strong></h3><p><strong><a href="https://deepmind.google/models/gemini-image/pro/">https://deepmind.google/models/gemini-image/pro/</a></strong></p><div id="youtube2-UQsJIo46ZR8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;UQsJIo46ZR8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/UQsJIo46ZR8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Google released Nano Banana Pro, their latest image generation model built on Gemini 3 Pro. While the original Nano Banana (Gemini 2.5 Flash Image) from August made waves with character consistency, Nano Banana Pro takes things further with studio-quality precision and advanced text rendering. The model can generate clear, legible text directly in images, create detailed infographics using Gemini&#8217;s real-world knowledge, and handle complex compositions at 2K and 4K resolutions. The upgrade addresses persistent pain points in professional creative workflows where text needed to be crisp, diagrams needed to be accurate, and outputs needed to meet production quality standards.</p><p>My oh my, are the results BANANAS!</p><div><hr></div><p><em>Thanks for reading The Tokenizer! If you found something useful here, share it with someone who might benefit. 
And if you want more curated insights like this, consider subscribing to Gradient Ascent.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.artofsaience.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption"></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Training World Class LLMs, Google's Agent Starter Pack, and Fei Fei Li on AI's Evolution: The Tokenizer Edition #8]]></title><description><![CDATA[This week's most valuable AI resources]]></description><link>https://newsletter.artofsaience.com/p/training-world-class-llms-googles</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/training-world-class-llms-googles</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Sat, 15 Nov 2025 17:59:52 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!sg1_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c7994c-12a2-4af9-80d8-69e0d9206f03_1078x596.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there! Model distillation just got smarter with adversarial training that makes student models competitive, and the Depth Anything team pushed visual geometry to new heights with a simplified transformer approach. 
Sometimes the best progress comes from making things simpler, not more complex.</p><p>By the way, you may have noticed that there wasn&#8217;t a newsletter in your inbox last week. That&#8217;s because I was away giving a talk on Vision Language Models at the Google Developer Group&#8217;s DevFest 2025 conference at IIT Madras.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gyUL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe85d4c70-305a-4430-b993-926f9af39eba_1228x826.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gyUL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe85d4c70-305a-4430-b993-926f9af39eba_1228x826.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gyUL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe85d4c70-305a-4430-b993-926f9af39eba_1228x826.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gyUL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe85d4c70-305a-4430-b993-926f9af39eba_1228x826.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gyUL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe85d4c70-305a-4430-b993-926f9af39eba_1228x826.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gyUL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe85d4c70-305a-4430-b993-926f9af39eba_1228x826.jpeg" width="1228" height="826" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e85d4c70-305a-4430-b993-926f9af39eba_1228x826.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:826,&quot;width&quot;:1228,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gyUL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe85d4c70-305a-4430-b993-926f9af39eba_1228x826.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gyUL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe85d4c70-305a-4430-b993-926f9af39eba_1228x826.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gyUL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe85d4c70-305a-4430-b993-926f9af39eba_1228x826.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gyUL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe85d4c70-305a-4430-b993-926f9af39eba_1228x826.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">The hall was packed end to end. So much interest in VLMs :D</figcaption></figure></div><p>I had a ton of fun answering questions and got to meet so many of the Gradient Ascent community in person. I hope to catch more of you in person in future talks and events.</p><h3><strong>New here?</strong></h3><p><em>The Tokenizer is my resource-focused newsletter edition where I curate the best papers, videos, articles, tools, and learning resources from across the AI landscape. 
Consider it your weekly dose of everything you need to stay ahead in machine learning.</em></p><h2><strong>TL;DR</strong></h2><p><strong>What caught my attention this week:</strong></p><ul><li><p><strong>&#128196; Papers:</strong> Adversarial distillation reaching proprietary model performance, visual geometry models setting new benchmarks, and efficient latent-space upscaling for diffusion models</p></li><li><p><strong>&#127909; Videos:</strong> Physics-based rendering breakthroughs, transformer architecture deep dives, and practical approaches to ML library testing</p></li><li><p><strong>&#128240; Reads:</strong> Fei-Fei Li on AI&#8217;s evolution from words to worlds, scaling laws for reinforcement learning, and accessible PPO explanations</p></li><li><p><strong>&#128736; Tools:</strong> Google&#8217;s comprehensive agent starter kit and scalable diffusion language modeling</p></li><li><p><strong>&#127891; Learning:</strong> HuggingFace&#8217;s complete playbook for training small language models from scratch</p></li></ul><div><hr></div><p>Grab my first book &#8212; <strong><a href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/">AI for the Rest of Us</a></strong> &#8212; today!</p><p>It&#8217;s been a whirlwind two weeks since the book&#8217;s launch and so many of you have sent me pictures of the book in your hands. Thank you so much!! </p><p>In fact, <strong>the book hit #2 on Amazon&#8217;s new releases </strong>in the software books category.</p><blockquote><p>If you find the book useful, <em><strong>please leave a review on Amazon</strong></em>. It makes a world of difference. If you have a picture of the book IRL, please share it with me. 
I really appreciate it.</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NmFg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcd7ee00-a083-431a-a758-4d537f351e9a_1080x1350.png 424w, https://substackcdn.com/image/fetch/$s_!NmFg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcd7ee00-a083-431a-a758-4d537f351e9a_1080x1350.png 848w, https://substackcdn.com/image/fetch/$s_!NmFg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcd7ee00-a083-431a-a758-4d537f351e9a_1080x1350.png 1272w, https://substackcdn.com/image/fetch/$s_!NmFg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcd7ee00-a083-431a-a758-4d537f351e9a_1080x1350.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NmFg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcd7ee00-a083-431a-a758-4d537f351e9a_1080x1350.png" width="1080" height="1350" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fcd7ee00-a083-431a-a758-4d537f351e9a_1080x1350.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1350,&quot;width&quot;:1080,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NmFg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcd7ee00-a083-431a-a758-4d537f351e9a_1080x1350.png 424w, https://substackcdn.com/image/fetch/$s_!NmFg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcd7ee00-a083-431a-a758-4d537f351e9a_1080x1350.png 848w, https://substackcdn.com/image/fetch/$s_!NmFg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcd7ee00-a083-431a-a758-4d537f351e9a_1080x1350.png 1272w, https://substackcdn.com/image/fetch/$s_!NmFg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcd7ee00-a083-431a-a758-4d537f351e9a_1080x1350.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/&quot;,&quot;text&quot;:&quot;Order AI for the Rest of Us Today!&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT/"><span>Order AI for the Rest of Us Today!</span></a></p><div><hr></div><h2><strong>&#128196; 5 Papers</strong></h2><h3><strong>Black-Box On-Policy Distillation of Large Language Models</strong></h3><p><strong><a href="https://arxiv.org/abs/2511.10643">https://arxiv.org/abs/2511.10643</a> |<a href="https://github.com/microsoft/LMOps/tree/main/gad"> GitHub</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!KxOl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6210f7-9f8e-40f5-b1cd-81e1dc489191_1279x476.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KxOl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6210f7-9f8e-40f5-b1cd-81e1dc489191_1279x476.png 424w, https://substackcdn.com/image/fetch/$s_!KxOl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6210f7-9f8e-40f5-b1cd-81e1dc489191_1279x476.png 848w, https://substackcdn.com/image/fetch/$s_!KxOl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6210f7-9f8e-40f5-b1cd-81e1dc489191_1279x476.png 1272w, https://substackcdn.com/image/fetch/$s_!KxOl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6210f7-9f8e-40f5-b1cd-81e1dc489191_1279x476.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KxOl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6210f7-9f8e-40f5-b1cd-81e1dc489191_1279x476.png" width="1279" height="476" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5e6210f7-9f8e-40f5-b1cd-81e1dc489191_1279x476.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:476,&quot;width&quot;:1279,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KxOl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6210f7-9f8e-40f5-b1cd-81e1dc489191_1279x476.png 424w, https://substackcdn.com/image/fetch/$s_!KxOl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6210f7-9f8e-40f5-b1cd-81e1dc489191_1279x476.png 848w, https://substackcdn.com/image/fetch/$s_!KxOl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6210f7-9f8e-40f5-b1cd-81e1dc489191_1279x476.png 1272w, https://substackcdn.com/image/fetch/$s_!KxOl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6210f7-9f8e-40f5-b1cd-81e1dc489191_1279x476.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Distilling knowledge from proprietary models usually means settling for worse performance because you only get text outputs, not internal logits. Microsoft&#8217;s Generative Adversarial Distillation (GAD) changes this by training a discriminator to distinguish student responses from teacher outputs, creating a minimax game where the student learns to fool the discriminator. The discriminator acts as an on-policy reward model that evolves alongside the student, providing stable feedback throughout training. 
Qwen2.5-14B-Instruct trained with GAD matches its teacher, GPT-5-Chat, on the LMSYS-Chat evaluation, showing that black-box distillation alone can reach proprietary-model performance.</p><h3><strong>Depth Anything 3: Recovering the Visual Space from Any Views</strong></h3><p><strong><a href="https://arxiv.org/abs/2511.10647">https://arxiv.org/abs/2511.10647</a> |<a href="https://github.com/ByteDance-Seed/depth-anything-3"> GitHub</a></strong></p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;bdc9fe07-8374-4bb5-ae6b-14d184a2b8ec&quot;,&quot;duration&quot;:null}"></div><p>Depth Anything 3 achieves state-of-the-art visual geometry from arbitrary viewpoints with a surprisingly minimal architecture. A vanilla DINO encoder handles everything without specialized modifications, and a single depth-ray prediction target eliminates complex multi-task learning. The model surpasses the prior SOTA, VGGT, by an average of 44.3% in camera pose accuracy and 25.1% in geometric accuracy while outperforming Depth Anything 2 in monocular depth estimation. 
Training exclusively on public academic datasets, DA3 demonstrates that architectural simplicity paired with smart training paradigms outperforms specialized complexity for visual geometry tasks.</p><h3><strong>One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models</strong></h3><p><strong><a href="https://arxiv.org/abs/2511.10629">https://arxiv.org/abs/2511.10629</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bny9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423ae600-6df5-4242-9637-788578675928_574x352.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bny9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423ae600-6df5-4242-9637-788578675928_574x352.png 424w, https://substackcdn.com/image/fetch/$s_!bny9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423ae600-6df5-4242-9637-788578675928_574x352.png 848w, https://substackcdn.com/image/fetch/$s_!bny9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423ae600-6df5-4242-9637-788578675928_574x352.png 1272w, https://substackcdn.com/image/fetch/$s_!bny9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423ae600-6df5-4242-9637-788578675928_574x352.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!bny9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423ae600-6df5-4242-9637-788578675928_574x352.png" width="574" height="352" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/423ae600-6df5-4242-9637-788578675928_574x352.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:352,&quot;width&quot;:574,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bny9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423ae600-6df5-4242-9637-788578675928_574x352.png 424w, https://substackcdn.com/image/fetch/$s_!bny9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423ae600-6df5-4242-9637-788578675928_574x352.png 848w, https://substackcdn.com/image/fetch/$s_!bny9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423ae600-6df5-4242-9637-788578675928_574x352.png 1272w, https://substackcdn.com/image/fetch/$s_!bny9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F423ae600-6df5-4242-9637-788578675928_574x352.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Diffusion models struggle with high-resolution generation because direct sampling is slow and post-hoc upscaling introduces artifacts. The Latent Upscaler Adapter (LUA) performs super-resolution directly on latent codes before VAE decoding, integrating as a drop-in component requiring zero modifications to base models. A shared Swin-style backbone with scale-specific pixel-shuffle heads supports both 2x and 4x upscaling, adding only 0.42 seconds for 1024px generation from 512px compared to 1.87 seconds for pixel-space super-resolution. 
LUA generalizes across different VAE latent spaces without retraining, making deployment straightforward regardless of which decoder you&#8217;re using.</p><h3><strong>Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning</strong></h3><p><strong><a href="https://arxiv.org/abs/2510.19338">https://arxiv.org/abs/2510.19338</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EQ8p!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4ec670-bd33-4cdd-bb71-fd23f9549c67_798x762.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EQ8p!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4ec670-bd33-4cdd-bb71-fd23f9549c67_798x762.png 424w, https://substackcdn.com/image/fetch/$s_!EQ8p!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4ec670-bd33-4cdd-bb71-fd23f9549c67_798x762.png 848w, https://substackcdn.com/image/fetch/$s_!EQ8p!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4ec670-bd33-4cdd-bb71-fd23f9549c67_798x762.png 1272w, https://substackcdn.com/image/fetch/$s_!EQ8p!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4ec670-bd33-4cdd-bb71-fd23f9549c67_798x762.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EQ8p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4ec670-bd33-4cdd-bb71-fd23f9549c67_798x762.png" width="798" height="762" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bf4ec670-bd33-4cdd-bb71-fd23f9549c67_798x762.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:762,&quot;width&quot;:798,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EQ8p!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4ec670-bd33-4cdd-bb71-fd23f9549c67_798x762.png 424w, https://substackcdn.com/image/fetch/$s_!EQ8p!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4ec670-bd33-4cdd-bb71-fd23f9549c67_798x762.png 848w, https://substackcdn.com/image/fetch/$s_!EQ8p!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4ec670-bd33-4cdd-bb71-fd23f9549c67_798x762.png 1272w, https://substackcdn.com/image/fetch/$s_!EQ8p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4ec670-bd33-4cdd-bb71-fd23f9549c67_798x762.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The Ring-linear model series combines linear attention and softmax attention to slash long-context inference costs. Ring-mini-linear-2.0 (16B parameters, 957M activations) and Ring-flash-linear-2.0 (104B parameters, 6.1B activations) cut inference cost to one tenth that of a 32B dense model. By systematically exploring the ratio of linear to softmax attention, the team identified hybrid structures that maintain SOTA performance on challenging reasoning benchmarks, while their FP8 operator library delivers a 50% gain in training efficiency. 
The models also remain stable through long reinforcement learning runs because their training and inference operators are closely aligned.</p><h3><strong>GigaBrain-0: A World Model-Powered Vision-Language-Action Model</strong></h3><p><strong><a href="https://arxiv.org/abs/2510.19430">https://arxiv.org/abs/2510.19430</a> |<a href="https://github.com/open-gigaai/giga-brain-0"> GitHub</a></strong></p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;dca7ae92-9296-4758-9f95-f963b37d84c6&quot;,&quot;duration&quot;:null}"></div><p>Physical robot data collection bottlenecks VLA model development, so GigaBrain-0 uses world models to generate diverse training data at scale through video generation, real2real transfer, human transfer, view transfer, and sim2real techniques. This dramatically reduces reliance on expensive real robot data while improving cross-task generalization. RGBD input modeling and embodied Chain-of-Thought supervision enable the model to reason about spatial geometry, object states, and long-horizon dependencies during execution. 
GigaBrain-0 achieves superior generalization across appearance variations, object placements, and camera viewpoints for dexterous, long-horizon, and mobile manipulation tasks, with GigaBrain-0-Small optimized for NVIDIA Jetson AGX Orin deployment.</p><div><hr></div><h2><strong>&#127909; 4 Videos</strong></h2><h3><strong>You&#8217;ll Never Look At Chocolate TV Ads The Same Way Again</strong></h3><div id="youtube2-Mh2y2Z6Iy0U" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;Mh2y2Z6Iy0U&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/Mh2y2Z6Iy0U?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>K&#225;roly from Two Minute Papers explores physics-based rendering advances that make digital chocolate indistinguishable from the real thing. The techniques behind realistic material simulation and light transport have commercial applications beyond advertising, affecting everything from product visualization to virtual production. 
Worth watching to understand how far physically-based rendering has come and where computational photography is heading.</p><h3><strong>LLM Building Blocks &amp; Transformer Alternatives</strong></h3><div id="youtube2-lONyteDR4XE" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;lONyteDR4XE&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/lONyteDR4XE?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Sebastian Raschka breaks down the fundamental components that make language models work, then examines emerging alternatives to standard transformer architectures. His systematic approach covers why certain architectural choices matter for training efficiency, inference speed, and model capabilities. Useful for anyone building models or evaluating which architectures suit specific use cases.</p><h3><strong>Designing Tests for ML Libraries</strong></h3><div id="youtube2-bFTPSq3l-lE" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;bFTPSq3l-lE&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/bFTPSq3l-lE?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Testing machine learning libraries calls for different strategies than traditional software testing because of probabilistic outputs and numerical precision issues. 
This practical guide covers approaches for ensuring your ML code behaves correctly across edge cases, different hardware configurations, and varying input distributions. Essential viewing for anyone maintaining production ML infrastructure or contributing to open-source frameworks.</p><h3><strong>Recursive Language Models</strong></h3><div id="youtube2-_TaIZLKhfLc" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;_TaIZLKhfLc&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/_TaIZLKhfLc?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>This exploration of recursive structures in language models examines how models can learn to call themselves iteratively to solve complex reasoning tasks. The approach offers alternatives to chain-of-thought prompting by building recursion directly into model architecture and training. Helpful for understanding emerging approaches to multi-step reasoning beyond standard autoregressive generation.</p><div><hr></div><h2><strong>&#128240; 3 Curated Reads</strong></h2><h3><strong>From Words to Worlds</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:178294178,&quot;url&quot;:&quot;https://drfeifei.substack.com/p/from-words-to-worlds-spatial-intelligence&quot;,&quot;publication_id&quot;:6635554,&quot;publication_name&quot;:&quot;Dr. 
Fei-Fei Li&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!de0u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F301e4536-263e-49f6-b50f-4bf7f63348be_683x683.png&quot;,&quot;title&quot;:&quot;From Words to Worlds: Spatial Intelligence is AI&#8217;s Next Frontier&quot;,&quot;truncated_body_text&quot;:&quot;In 1950, when computing was little more than automated arithmetic and simple logic, Alan Turing asked a question that still reverberates today: can machines think? It took remarkable imagination to see what he saw: that intelligence might someday be built rather than born. That insight later launched a relentless scientific quest called Artificial Intel&#8230;&quot;,&quot;date&quot;:&quot;2025-11-10T14:31:23.320Z&quot;,&quot;like_count&quot;:873,&quot;comment_count&quot;:0,&quot;bylines&quot;:[{&quot;id&quot;:196433379,&quot;name&quot;:&quot;Fei-Fei Li&quot;,&quot;handle&quot;:&quot;drfeifei&quot;,&quot;previous_name&quot;:&quot;Dr. Fei-Fei Li&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9acc95bc-b55d-45ac-b734-eae9e0614274_523x523.jpeg&quot;,&quot;bio&quot;:&quot;AI researcher, founder, professor, educator, author&quot;,&quot;profile_set_up_at&quot;:&quot;2025-10-20T06:30:45.860Z&quot;,&quot;reader_installed_at&quot;:null,&quot;publicationUsers&quot;:[{&quot;id&quot;:6771561,&quot;user_id&quot;:196433379,&quot;publication_id&quot;:6635554,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:6635554,&quot;name&quot;:&quot;Dr. 
Fei-Fei Li&quot;,&quot;subdomain&quot;:&quot;drfeifei&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;All about AI &quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/301e4536-263e-49f6-b50f-4bf7f63348be_683x683.png&quot;,&quot;author_id&quot;:196433379,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-10-20T06:36:54.230Z&quot;,&quot;email_from_name&quot;:&quot;Fei-Fei Li &quot;,&quot;copyright&quot;:&quot;Fei-Fei Li&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:null,&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://drfeifei.substack.com/p/from-words-to-worlds-spatial-intelligence?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!de0u!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F301e4536-263e-49f6-b50f-4bf7f63348be_683x683.png" loading="lazy"><span class="embedded-post-publication-name">Dr. 
Fei-Fei Li</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">From Words to Worlds: Spatial Intelligence is AI&#8217;s Next Frontier</div></div><div class="embedded-post-body">In 1950, when computing was little more than automated arithmetic and simple logic, Alan Turing asked a question that still reverberates today: can machines think? It took remarkable imagination to see what he saw: that intelligence might someday be built rather than born. That insight later launched a relentless scientific quest called Artificial Intel&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">5 months ago &#183; 873 likes &#183; Fei-Fei Li</div></a></div><p>Fei-Fei Li contends that spatial intelligence, which is the ability for AI to perceive, reason about, and interact with the 3D world, is the pivotal next step for artificial intelligence. Her essay highlights the need for world models that can build, simulate, and understand consistent environments, expanding AI&#8217;s impact across creativity, robotics, science, and human-centered applications.</p><h3><strong>How to Scale RL</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:175849446,&quot;url&quot;:&quot;https://www.interconnects.ai/p/the-new-rl-scaling-laws&quot;,&quot;publication_id&quot;:48206,&quot;publication_name&quot;:&quot;Interconnects&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!djof!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc52e8097-8f3d-4f7e-808b-2f4ad37f3b52_720x720.png&quot;,&quot;title&quot;:&quot;How to scale RL&quot;,&quot;truncated_body_text&quot;:&quot;Two quick housekeeping items before I get to the 
post.&quot;,&quot;date&quot;:&quot;2025-10-20T15:17:48.165Z&quot;,&quot;like_count&quot;:79,&quot;comment_count&quot;:3,&quot;bylines&quot;:[{&quot;id&quot;:10472909,&quot;name&quot;:&quot;Nathan Lambert&quot;,&quot;handle&quot;:&quot;natolambert&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!RihO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fedcdfb-e137-4f6a-9089-a46add6c6242_500x500.jpeg&quot;,&quot;bio&quot;:&quot;ML researcher making sense of AI research, products, and the uncertain technological future. PhD from Berkeley AI. Experience at Meta, DeepMind, HuggingFace.&quot;,&quot;profile_set_up_at&quot;:&quot;2021-04-24T01:19:33.371Z&quot;,&quot;reader_installed_at&quot;:&quot;2022-03-09T17:52:30.690Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:100753,&quot;user_id&quot;:10472909,&quot;publication_id&quot;:48206,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:48206,&quot;name&quot;:&quot;Interconnects&quot;,&quot;subdomain&quot;:&quot;robotic&quot;,&quot;custom_domain&quot;:&quot;www.interconnects.ai&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;The cutting edge of AI, from inside the frontier AI labs, minus the hype. The border between high-level and technical thinking. 
Read by leading engineers, researchers, and investors.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c52e8097-8f3d-4f7e-808b-2f4ad37f3b52_720x720.png&quot;,&quot;author_id&quot;:10472909,&quot;primary_user_id&quot;:10472909,&quot;theme_var_background_pop&quot;:&quot;#ff6b00&quot;,&quot;created_at&quot;:&quot;2020-05-21T02:59:47.895Z&quot;,&quot;email_from_name&quot;:&quot;Interconnects by Nathan Lambert&quot;,&quot;copyright&quot;:&quot;Interconnects AI, LLC&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:4610799,&quot;user_id&quot;:10472909,&quot;publication_id&quot;:4519930,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:4519930,&quot;name&quot;:&quot;natolambert overflow&quot;,&quot;subdomain&quot;:&quot;natolambert&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;a place for any extra thoughts beyond Interconnects.ai&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eb88d599-32c8-49a9-ba33-ab6327aff727_256x256.png&quot;,&quot;author_id&quot;:10472909,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-03-27T15:04:05.448Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Nathan 
Lambert&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:4926744,&quot;user_id&quot;:10472909,&quot;publication_id&quot;:4830082,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:4830082,&quot;name&quot;:&quot;Retort AI&quot;,&quot;subdomain&quot;:&quot;retortai&quot;,&quot;custom_domain&quot;:&quot;www.retortai.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Distilling the major events and challenges in the world of artificial intelligence and machine learning, from Thomas Krendl Gilbert and Nathan Lambert.\n\n&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cbad298c-6074-441b-ad43-d5df6dbf101d_800x800.png&quot;,&quot;author_id&quot;:10472909,&quot;primary_user_id&quot;:null,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-04-25T22:10:28.216Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Nathan 
Lambert&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;twitter_screen_name&quot;:&quot;natolambert&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100,&quot;status&quot;:{&quot;bestsellerTier&quot;:100,&quot;subscriberTier&quot;:5,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:100},&quot;paidPublicationIds&quot;:[1084089,883883,69345,1084918,6349492,6027],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://www.interconnects.ai/p/the-new-rl-scaling-laws?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!djof!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc52e8097-8f3d-4f7e-808b-2f4ad37f3b52_720x720.png" loading="lazy"><span class="embedded-post-publication-name">Interconnects</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">How to scale RL</div></div><div class="embedded-post-body">Two quick housekeeping items before I get to the post&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">6 months ago &#183; 79 likes &#183; 3 comments &#183; Nathan Lambert</div></a></div><p><span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Nathan 
Lambert&quot;,&quot;id&quot;:10472909,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!RihO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8fedcdfb-e137-4f6a-9089-a46add6c6242_500x500.jpeg&quot;,&quot;uuid&quot;:&quot;aed71935-85ed-43ab-9791-8a7b1221c621&quot;}" data-component-name="MentionToDOM"></span> examines emerging scaling laws for reinforcement learning, challenging assumptions that RL doesn&#8217;t scale like supervised learning. The analysis covers what factors actually drive RL performance improvements, from compute allocation to data diversity. Critical reading as RL becomes more central to post-training for language models and robotics applications.</p><h3><strong>PPO for LLMs: A Guide for Normal People</strong></h3><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:175107358,&quot;url&quot;:&quot;https://cameronrwolfe.substack.com/p/ppo-llm&quot;,&quot;publication_id&quot;:1092659,&quot;publication_name&quot;:&quot;Deep (Learning) Focus&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!87xa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9b43fb-52d5-40da-995d-5b7cd3f91064_896x896.png&quot;,&quot;title&quot;:&quot;PPO for LLMs: A Guide for Normal People&quot;,&quot;truncated_body_text&quot;:&quot;Over the last several years, reinforcement learning (RL) has been one of the most impactful areas of research for large language models (LLMs). Early research used RL to align LLMs to human preferences, and this initial work on applying RL to LLMs relied almost exclusively on Proximal Policy Optimization (PPO) [1]. 
This choice led PPO to&#8230;&quot;,&quot;date&quot;:&quot;2025-10-27T09:33:23.171Z&quot;,&quot;like_count&quot;:100,&quot;comment_count&quot;:4,&quot;bylines&quot;:[{&quot;id&quot;:29736521,&quot;name&quot;:&quot;Cameron R. Wolfe, Ph.D.&quot;,&quot;handle&quot;:&quot;cwolferesearch&quot;,&quot;previous_name&quot;:&quot;Cameron R. Wolfe&quot;,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/69aba7df-b571-4609-aa47-fc2d031c11b8_1242x1595.jpeg&quot;,&quot;bio&quot;:&quot;Research @ Netflix &#8226; Rice University PhD &#8226; I make AI understandable&quot;,&quot;profile_set_up_at&quot;:&quot;2022-09-17T15:11:34.083Z&quot;,&quot;reader_installed_at&quot;:&quot;2023-01-10T11:25:00.723Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:1042380,&quot;user_id&quot;:29736521,&quot;publication_id&quot;:1092659,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:1092659,&quot;name&quot;:&quot;Deep (Learning) Focus&quot;,&quot;subdomain&quot;:&quot;cameronrwolfe&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;I contextualize and explain important topics in AI research.&quot;,&quot;logo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/ab9b43fb-52d5-40da-995d-5b7cd3f91064_896x896.png&quot;,&quot;author_id&quot;:29736521,&quot;primary_user_id&quot;:29736521,&quot;theme_var_background_pop&quot;:&quot;#6C0095&quot;,&quot;created_at&quot;:&quot;2022-09-17T15:12:33.160Z&quot;,&quot;email_from_name&quot;:&quot;Deep (Learning) Focus&quot;,&quot;copyright&quot;:&quot;Cameron R. 
Wolfe&quot;,&quot;founding_plan_name&quot;:&quot;Founding Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;twitter_screen_name&quot;:&quot;cwolferesearch&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:null,&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://cameronrwolfe.substack.com/p/ppo-llm?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!87xa!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fab9b43fb-52d5-40da-995d-5b7cd3f91064_896x896.png" loading="lazy"><span class="embedded-post-publication-name">Deep (Learning) Focus</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">PPO for LLMs: A Guide for Normal People</div></div><div class="embedded-post-body">Over the last several years, reinforcement learning (RL) has been one of the most impactful areas of research for large language models (LLMs). Early research used RL to align LLMs to human preferences, and this initial work on applying RL to LLMs relied almost exclusively on Proximal Policy Optimization (PPO) [1]. 
This choice led PPO to&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">6 months ago &#183; 100 likes &#183; 4 comments &#183; Cameron R. Wolfe, Ph.D.</div></a></div><p><span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Cameron R. Wolfe, Ph.D.&quot;,&quot;id&quot;:29736521,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/69aba7df-b571-4609-aa47-fc2d031c11b8_1242x1595.jpeg&quot;,&quot;uuid&quot;:&quot;0562703a-d018-4506-871a-f280dacfceea&quot;}" data-component-name="MentionToDOM"></span> demystifies Proximal Policy Optimization for language model training without assuming you have a PhD in reinforcement learning. His breakdown covers why PPO became the standard for RLHF, what the core mechanisms actually do, and how to think about the algorithm&#8217;s behavior during training. 
An accessible guide that translates complex ideas into clear, actionable insights and shows how PPO underpins contemporary LLM alignment practices.</p><div><hr></div><h2><strong>&#128736; 2 Tools &amp; Repos</strong></h2><h3><strong>Agent Starter Pack</strong></h3><p><strong><a href="https://github.com/GoogleCloudPlatform/agent-starter-pack">https://github.com/GoogleCloudPlatform/agent-starter-pack</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2vKN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb27539-720d-4016-827e-58f6ba288403_2048x1151.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2vKN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb27539-720d-4016-827e-58f6ba288403_2048x1151.png 424w, https://substackcdn.com/image/fetch/$s_!2vKN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb27539-720d-4016-827e-58f6ba288403_2048x1151.png 848w, https://substackcdn.com/image/fetch/$s_!2vKN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb27539-720d-4016-827e-58f6ba288403_2048x1151.png 1272w, https://substackcdn.com/image/fetch/$s_!2vKN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb27539-720d-4016-827e-58f6ba288403_2048x1151.png 1456w" sizes="100vw"><img
src="https://substackcdn.com/image/fetch/$s_!2vKN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb27539-720d-4016-827e-58f6ba288403_2048x1151.png" width="1456" height="818" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2fb27539-720d-4016-827e-58f6ba288403_2048x1151.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:818,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2vKN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb27539-720d-4016-827e-58f6ba288403_2048x1151.png 424w, https://substackcdn.com/image/fetch/$s_!2vKN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb27539-720d-4016-827e-58f6ba288403_2048x1151.png 848w, https://substackcdn.com/image/fetch/$s_!2vKN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb27539-720d-4016-827e-58f6ba288403_2048x1151.png 1272w, https://substackcdn.com/image/fetch/$s_!2vKN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fb27539-720d-4016-827e-58f6ba288403_2048x1151.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Google Cloud&#8217;s comprehensive toolkit for building production AI agents covers everything from basic scaffolding to deployment patterns. Instead of starting from scratch, you get tested implementations of common agent workflows, integration templates for external tools, and best practices for managing agent state.
The repository focuses on patterns that work reliably in production rather than experimental approaches.</p><h3><strong>Simple Diffusion Language Modeling</strong></h3><p><strong><a href="https://github.com/ZHZisZZ/dllm">https://github.com/ZHZisZZ/dllm</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2T1f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7e207a-1278-40da-9acc-7be6a6692fd2_480x210.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2T1f!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7e207a-1278-40da-9acc-7be6a6692fd2_480x210.gif 424w, https://substackcdn.com/image/fetch/$s_!2T1f!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7e207a-1278-40da-9acc-7be6a6692fd2_480x210.gif 848w, https://substackcdn.com/image/fetch/$s_!2T1f!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7e207a-1278-40da-9acc-7be6a6692fd2_480x210.gif 1272w, https://substackcdn.com/image/fetch/$s_!2T1f!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7e207a-1278-40da-9acc-7be6a6692fd2_480x210.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2T1f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7e207a-1278-40da-9acc-7be6a6692fd2_480x210.gif" width="480" height="210" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/be7e207a-1278-40da-9acc-7be6a6692fd2_480x210.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:210,&quot;width&quot;:480,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:978464,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.artofsaience.com/i/178988077?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7e207a-1278-40da-9acc-7be6a6692fd2_480x210.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2T1f!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7e207a-1278-40da-9acc-7be6a6692fd2_480x210.gif 424w, https://substackcdn.com/image/fetch/$s_!2T1f!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7e207a-1278-40da-9acc-7be6a6692fd2_480x210.gif 848w, https://substackcdn.com/image/fetch/$s_!2T1f!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7e207a-1278-40da-9acc-7be6a6692fd2_480x210.gif 1272w, https://substackcdn.com/image/fetch/$s_!2T1f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe7e207a-1278-40da-9acc-7be6a6692fd2_480x210.gif 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>This implementation demonstrates diffusion models applied to language generation, offering an alternative to standard autoregressive approaches. 
The clean codebase makes it easier to understand how diffusion processes work for discrete tokens and experiment with different training strategies. Useful for researchers exploring non-autoregressive generation methods or looking to understand diffusion model fundamentals.</p><div><hr></div><h2><strong>&#127891; 1 Pick of the Week</strong></h2><h3><strong>The Smol Training Playbook</strong></h3><p><strong><a href="https://huggingface.co/spaces/HuggingFaceTB/smol-training-playbook">https://huggingface.co/spaces/HuggingFaceTB/smol-training-playbook</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sg1_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c7994c-12a2-4af9-80d8-69e0d9206f03_1078x596.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sg1_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c7994c-12a2-4af9-80d8-69e0d9206f03_1078x596.png 424w, https://substackcdn.com/image/fetch/$s_!sg1_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c7994c-12a2-4af9-80d8-69e0d9206f03_1078x596.png 848w, https://substackcdn.com/image/fetch/$s_!sg1_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c7994c-12a2-4af9-80d8-69e0d9206f03_1078x596.png 1272w, https://substackcdn.com/image/fetch/$s_!sg1_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c7994c-12a2-4af9-80d8-69e0d9206f03_1078x596.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!sg1_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c7994c-12a2-4af9-80d8-69e0d9206f03_1078x596.png" width="1078" height="596" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/58c7994c-12a2-4af9-80d8-69e0d9206f03_1078x596.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:596,&quot;width&quot;:1078,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sg1_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c7994c-12a2-4af9-80d8-69e0d9206f03_1078x596.png 424w, https://substackcdn.com/image/fetch/$s_!sg1_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c7994c-12a2-4af9-80d8-69e0d9206f03_1078x596.png 848w, https://substackcdn.com/image/fetch/$s_!sg1_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c7994c-12a2-4af9-80d8-69e0d9206f03_1078x596.png 1272w, https://substackcdn.com/image/fetch/$s_!sg1_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58c7994c-12a2-4af9-80d8-69e0d9206f03_1078x596.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>HuggingFace&#8217;s complete guide to training small language models efficiently covers everything from data preparation through final evaluation. The playbook synthesizes lessons learned from training the SmolLM series, including which architectural choices matter most at smaller scales, how to optimize training compute, and which evaluation benchmarks actually predict downstream performance. Particularly valuable if you&#8217;re training models under 2B parameters where established practices for larger models don&#8217;t always apply. The guide includes specific hyperparameter recommendations, dataset mixing strategies, and ablation studies showing what works versus what doesn&#8217;t at small scale.</p><div><hr></div><p><em>Thanks for reading The Tokenizer!
If you found something useful here, share it with someone who might benefit. And if you want more curated insights like this, consider subscribing to Gradient Ascent.</em></p>]]></content:encoded></item><item><title><![CDATA[AI for the Rest of Us: Available Now!]]></title><description><![CDATA[It's finally here plus find out if you won the preorder raffle!]]></description><link>https://newsletter.artofsaience.com/p/ai-for-the-rest-of-us-available-now</link><guid isPermaLink="false">https://newsletter.artofsaience.com/p/ai-for-the-rest-of-us-available-now</guid><dc:creator><![CDATA[Sairam Sundaresan]]></dc:creator><pubDate>Fri, 31 Oct 2025 02:33:17 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rFqk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3251f419-c979-468c-941e-ef2c857d179a_1080x1080.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I can&#8217;t believe I&#8217;m saying this. I&#8217;m officially a published author :D</p><p>After three years, my first book,&nbsp;<em><strong><a href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT">AI for the Rest of Us</a>,</strong></em>&nbsp;with Bloomsbury, is finally out in the world.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rFqk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3251f419-c979-468c-941e-ef2c857d179a_1080x1080.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!rFqk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3251f419-c979-468c-941e-ef2c857d179a_1080x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rFqk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3251f419-c979-468c-941e-ef2c857d179a_1080x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rFqk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3251f419-c979-468c-941e-ef2c857d179a_1080x1080.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rFqk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3251f419-c979-468c-941e-ef2c857d179a_1080x1080.jpeg" width="1080" height="1080" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3251f419-c979-468c-941e-ef2c857d179a_1080x1080.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1080,&quot;width&quot;:1080,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:135319,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:&quot;https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.artofsaience.com/i/177623039?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3251f419-c979-468c-941e-ef2c857d179a_1080x1080.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!rFqk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3251f419-c979-468c-941e-ef2c857d179a_1080x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rFqk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3251f419-c979-468c-941e-ef2c857d179a_1080x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rFqk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3251f419-c979-468c-941e-ef2c857d179a_1080x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rFqk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3251f419-c979-468c-941e-ef2c857d179a_1080x1080.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h2>Why This Book Exists</h2><p>I wrote it because too many AI conversations got stuck in translation.</p><p>Executives talked strategy. Engineers talked code. Everyone else nodded along, trying to make sense of both.</p><p>So I built a bridge.</p><h2>Who This Book Is For</h2><p>This book is for the people in between. The rest of us.</p><ul><li><p><strong>Product managers, leaders, decision makers, and domain experts</strong> who need to understand how AI actually works to make better decisions</p></li><li><p><strong>Engineers</strong> who are tired of explaining why AI isn&#8217;t magic to their bosses and teams</p></li><li><p><strong>Curious readers</strong> who want to see the big picture clearly</p></li></ul><h2>What Makes It Different</h2><p>No code. No equations. No jargon.</p><p>Just hand-drawn sketches, stories, and metaphors that make the complex intuitive.</p><p>It&#8217;s been a long road of rewrites, feedback, and sketches that taught me how to explain ideas simply. I&#8217;m deeply grateful to everyone who believed in this mission from the start.</p><h2>How You Can Help</h2><p>Here&#8217;s how you can help me spread AI literacy:</p><p><strong>&#8594; Order a copy</strong> for yourself or your team<br><strong>&#8594; Share this</strong> with someone who needs it<br><strong>&#8594; Tell me</strong> what part helps you the most</p><p>Every share, every reader, every conversation helps make AI literacy accessible to more people.</p><p>Let&#8217;s make sure understanding AI isn&#8217;t a privilege. 
Let&#8217;s bring everyone along.</p><p><strong>It&#8217;s time to lead.</strong></p><div><hr></div><h2>Get Your Copy</h2><ul><li><p><strong>Amazon</strong>: <strong><a href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT">https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT</a></strong></p></li><li><p><strong>Barnes &amp; Noble: <a href="https://www.barnesandnoble.com/w/ai-for-the-rest-of-us-sairam-sundaresan/1147236404">https://www.barnesandnoble.com/w/ai-for-the-rest-of-us-sairam-sundaresan/1147236404</a></strong></p></li><li><p><strong>Bloomsbury: <a href="https://www.bloomsbury.com/us/ai-for-the-rest-of-us-9798881807955/">https://www.bloomsbury.com/us/ai-for-the-rest-of-us-9798881807955/</a></strong></p></li></ul><div><hr></div><h2>Preorder Winners!</h2><p>And now, the fun part: I&#8217;m thrilled to announce the winners of the preorder bonuses!</p><ul><li><p><strong>Signed Copy Winners: </strong>Gary Marchant, Harishkumar Singh, Pandu Ranganath, Michael Machado, Om Bhatt, Gilles Everling, Aman Deep, David Poore Turner, Travis Bradford, and Daniel Janecek</p></li><li><p><strong>1:1 Session Winner:</strong> Cynthia Machado</p></li></ul><p>Congratulations to all, and I&#8217;ll be in touch directly with next steps.</p><p>If you&#8217;ve ever wanted to cut through the jargon and finally feel confident talking about AI, I hope this book gives you exactly that.</p><p>Order your copy <a href="https://tinyurl.com/4cha7cv7">here</a>:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT&quot;,&quot;text&quot;:&quot;Grab your copy today!&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.amazon.com/AI-Rest-Us-Illustrated-Introduction/dp/B0F29THNLT"><span>Grab your copy today!</span></a></p><p>It&#8217;s time to 
lead!</p>]]></content:encoded></item></channel></rss>