<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Tech with Darin]]></title><description><![CDATA[Weekly AI and cloud signal from a 30-year practitioner. No hype. No filler. What happened, what it means, and what to do about it.]]></description><link>https://www.techwithdarin.com</link><image><url>https://substackcdn.com/image/fetch/$s_!4B2V!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff69ac3d5-f261-4e0f-9c92-ca59f685d498_1024x1024.png</url><title>Tech with Darin</title><link>https://www.techwithdarin.com</link></image><generator>Substack</generator><lastBuildDate>Mon, 06 Apr 2026 19:57:14 GMT</lastBuildDate><atom:link href="https://www.techwithdarin.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Darin Deters]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[darindeters@gmail.com]]></webMaster><itunes:owner><itunes:email><![CDATA[darindeters@gmail.com]]></itunes:email><itunes:name><![CDATA[Darin Deters]]></itunes:name></itunes:owner><itunes:author><![CDATA[Darin Deters]]></itunes:author><googleplay:owner><![CDATA[darindeters@gmail.com]]></googleplay:owner><googleplay:email><![CDATA[darindeters@gmail.com]]></googleplay:email><googleplay:author><![CDATA[Darin Deters]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[The AI Stack War Is Now Internal]]></title><description><![CDATA[Anthropic bought a biotech startup. Microsoft launched three models. OpenAI lost two executives. 
One pattern explains all of it.]]></description><link>https://www.techwithdarin.com/p/the-ai-stack-war-is-now-internal</link><guid isPermaLink="false">https://www.techwithdarin.com/p/the-ai-stack-war-is-now-internal</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sat, 04 Apr 2026 14:24:19 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/4fcb430e-7967-444e-af7d-50f4a50c1fc4_1476x1052.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>The Bottom Line (No Jargon Edition)</h2><ul><li><p><strong>AI labs are buying the tools they used to rent.</strong> Anthropic paid $400 million for a biotech AI startup. Microsoft shipped three in-house models to chip away at its OpenAI dependency. The era of "we'll just use someone else's AI" is ending for the big players.</p></li><li><p><strong>AI vendors are now a supply chain risk.</strong> Mercor, a recruiting platform backed by Meta, got hit through LiteLLM, a third-party AI integration layer. This is not an isolated incident. Every AI tool you plug into your stack is a new attack surface.</p></li><li><p><strong>OpenAI is under real operational stress.</strong> The CEO of applications went on medical leave. The CMO is stepping back to fight cancer. The COO shifted to a special projects role. Three major leadership moves in one week at a company serving nearly one billion users.</p></li><li><p><strong>Microsoft is spending $10 billion in Japan.</strong> That's infrastructure spending through 2029, with SoftBank and Sakura Internet as partners. The goal: train one million engineers and developers by 2030 while expanding compute capacity in-region.</p></li><li><p><strong>Google's Gemma 4 is now Apache 2.0 licensed.</strong> That's a meaningful shift. It means enterprise teams can use it without the licensing headaches that came with previous Gemma releases. 
Open weights, agentic support, and lower-power device compatibility in one drop.</p></li><li><p><strong>The moat is no longer the model.</strong> It's the vertical stack around the model. Who owns the data pipeline, the tooling, the workflow integration, the compliance layer. The labs figured this out. Now enterprise teams need to figure out what it means for vendor decisions.</p></li></ul><div><hr></div><h2>The Take That Started the Week</h2><p>The Mercor breach didn't make most front pages. A hiring platform gets hit through a dependency in its AI stack. Happens all the time. Move on.</p><p>Except it doesn't happen all the time. Not like this. Mercor was using LiteLLM, a popular open-source library that lets you route calls across multiple AI model providers from a single interface. It's a reasonable engineering choice. Lots of teams use it. And that's exactly the problem.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>When you add an AI layer to your product, you're not adding one vendor. You're adding the vendor's dependencies, the dependencies' dependencies, and every integration point along the way. In traditional software supply chain terms, this is a known pattern. 
We watched it blow up with Log4Shell in 2021. We watched it again with the XZ Utils backdoor in 2024. The AI tooling ecosystem is running through the same learning curve, just faster, with more surface area, and with less institutional memory because most of the teams building on top of these tools are doing it for the first time.</p><p>The practical takeaway is not "don't use AI vendors." That's not a real answer in 2026. The takeaway is that AI vendor risk now belongs on your threat model the same way third-party software dependencies do. If you haven't mapped which AI tools your products or internal systems call out to, and what permissions those tools hold, that audit is overdue.</p><div><hr></div><h2>Cloud Roundup</h2><h3>AWS</h3><p>A quieter week for AWS on the product announcement front. No flagship drops. The bigger story for AWS practitioners is contextual: Microsoft's $10B Japan commitment and its MAI model trio represent a direct challenge in the enterprise cloud space where AWS has historically owned the conversation. AWS's regional infrastructure strategy, particularly in Asia-Pacific, is going to face harder questions as Microsoft builds out compute with local partners like SoftBank and Sakura Internet. Worth watching what AWS counters with in Q2.</p><h3>Azure</h3><p>Microsoft had the biggest cloud week of the year so far. The three MAI models (MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2) are now broadly available for commercial use on Microsoft Foundry and the new MAI Playground. The numbers are concrete: MAI-Transcribe-1 runs 2.5x faster than Azure Fast, supports 25 languages, and starts at $0.36/hour. MAI-Voice-1 handles audio generation and custom voice creation at $22 per million characters. MAI-Image-2 does images and video at $5 per million input tokens.</p><p>This is not just a product launch. Microsoft is systematically reducing its dependency on OpenAI models. 
The relationship is still active, but Redmond is clearly building the capability to operate independently. That's a strategic hedge that makes the $10B Japan announcement even more coherent. If you're building on Azure, your model routing options just got wider.</p><h3>GCP</h3><p>Google shipped Gemma 4 this week. The Apache 2.0 license switch is the headline that matters most to enterprise teams. Previous Gemma releases had custom terms that created legal ambiguity in commercial deployments. Apache 2.0 removes that friction. The model itself is built on the same foundation as Gemini 3, with improved reasoning, native function calling, structured output support, and agentic workflow management baked in. It runs on low-power devices, which positions it well for edge deployments and on-prem use cases where data sovereignty matters.</p><div><hr></div><h2>AI Model Roundup</h2><h3>OpenAI</h3><p>No new model drops this week, which is notable given the leadership turbulence. Fidji Simo, CEO of applications, announced medical leave for a worsening neuroimmune condition. CMO Kate Rouch is stepping back for cancer recovery. COO Brad Lightcap is shifting to a special projects role reporting directly to Sam Altman. Greg Brockman will oversee product in Simo's absence. Chief Strategy Officer Jason Kwon, CFO Sarah Friar, and CRO Denise Dresser are splitting business and operations oversight. Former Meta CMO Gary Briggs is stepping in as interim CMO. That is a lot of change to manage at a company that is approaching one billion users while running an active IPO process.</p><h3>Anthropic</h3><p>The Coefficient Bio acquisition landed this week. $400 million in stock for a sub-10-person stealth startup. Coefficient Bio's platform lets AI draft drug R&amp;D plans, manage clinical regulatory strategies, and identify drug candidates. Anthropic is folding that into biopharma R&amp;D workflows using its foundation models as the backbone. 
This is vertical integration in the truest sense: owning the domain application layer, not just the model underneath it. The same week, Anthropic cut off Claude Pro and Max subscribers from using their subscriptions to power third-party AI agents, citing compute and engineering resource management. You now need the API or pay-as-you-go billing to run third-party agent workflows on Claude.</p><h3>Google AI</h3><p>Gemma 4 is the AI model story from Google this week. Ten-trillion-parameter count at the top of the family (Claude Mythos 5 at the same scale is Anthropic's comparable), Apache 2.0 licensing, agentic support, and a design philosophy built around both cloud and local deployment. Google has been playing a long game in open-weights models, and Gemma 4 feels like the first version that's genuinely ready for serious enterprise workloads without legal headaches.</p><div><hr></div><h2>The Pattern I'm Watching</h2><p>Thirty years in tech, and I've watched this cycle play out more than once. In the early 2000s, enterprise software companies started acquiring the consulting firms and implementation partners that lived on top of their platforms. SAP bought its way into services. Oracle absorbed its own ecosystem. The reasoning was always the same: the money isn't in the license, it's in the workflow. Own the workflow, and the license renews itself.</p><p>What's happening now across OpenAI, Anthropic, Microsoft, and Google is structurally identical, just compressed and running at AI speed. Anthropic buying Coefficient Bio is not primarily a talent acquisition or a technology bet. It's a workflow acquisition. Drug discovery workflows, clinical regulatory workflows, candidate identification workflows. Once those are native to Claude, the switching cost for a pharma company is no longer "which model do I use?" It's "do I want to rebuild three years of operational integration?" 
That's a very different question, with a very different answer.</p><p>The thing that's different this time is the speed of the lock-in. In the SAP era, implementation cycles ran 18 to 36 months. That was your window to reconsider vendor choices. In an AI-native workflow, a team can go from evaluation to deeply embedded in 90 days. The vertical integration moat builds faster than enterprise procurement can respond. I've seen this catch companies flat-footed before. I'm watching it happen again.</p><p>The question worth sitting with this weekend: which of the AI tools your team runs today would be genuinely painful to replace? Not inconvenient. Genuinely painful. That list is probably longer than you think.</p><div><hr></div><h2>Sign-Off</h2><p>The AI stack war moved inside the labs this week. Acquisitions, model launches, leadership shifts, and a supply chain breach all pointing in the same direction: the competition isn't just between labs anymore. It's between ecosystems, workflows, and whoever gets to own the layer your team can't easily replace.</p><p>Hit reply and tell me. I read every response. Darin</p><div><hr></div><p><em>Weekly AI and cloud breakdowns from someone who's been in the game since the early days of the internet. No ads. No filler. The signal.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Anthropic had the worst week and still won]]></title><description><![CDATA[Tech with Darin - Weekly Round up 03/28/26]]></description><link>https://www.techwithdarin.com/p/anthropic-had-the-worst-week-and</link><guid isPermaLink="false">https://www.techwithdarin.com/p/anthropic-had-the-worst-week-and</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sat, 28 Mar 2026 12:46:22 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/3f285058-8f9d-494c-9a29-423ad32bffc1_1420x840.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>The "Explain It Like I'm Paying the Bill" Version</h2><ul><li><p>Anthropic confirmed its most powerful model yet, codenamed Mythos, after internal documents leaked online this week. The company turned a security headache into a product announcement.</p></li><li><p>A federal judge blocked the Pentagon from labeling Anthropic a national security risk, ruling the government's action was First Amendment retaliation. That ruling is a bigger enterprise sales asset than any product launch.</p></li><li><p>OpenAI shut down Sora, its video generation app, just months after launch. The compute that powered it will be redirected to coding and reasoning. That tells you exactly where OpenAI thinks the money is.</p></li><li><p>Microsoft is on track for its worst stock quarter since 2008, with investors questioning whether its AI infrastructure bets will pay off. 
Capital expenditures nearly doubled year-over-year to $29.9 billion in Q2 FY2026.</p></li><li><p>Global cloud infrastructure spending hit $110.9 billion in Q4 2025, up 29% year-over-year. The buildout is not slowing down, but the pressure to show returns is intensifying.</p></li><li><p>Wikipedia voted to ban AI-generated content from its encyclopedia, with two narrow exceptions: AI-assisted translations and minor copy edits. The open web's most trusted source drew a clear line.</p></li><li><p>Meta told employees this week that 65% of engineers must be using AI coding tools by the end of H1 2026. That is a mandate, not a suggestion.</p></li></ul><div><hr></div><h2>The Take That Started the Week</h2><p>Anthropic had a month that would have broken most companies. Fourteen product launches. Five service outages. Internal documents leaked publicly. A presidential administration put the company on a supply chain risk list. By any normal measure, this should have been a brand implosion.</p><p>It wasn't. By the end of the week, a federal judge had ruled that the Pentagon's action was unconstitutional retaliation for protected speech. The leaked documents, far from revealing damaging secrets, confirmed what enterprise buyers were hoping to hear: Anthropic is building something genuinely powerful and has a roadmap ambitious enough to be worth protecting. The company confirmed the existence of Mythos, its most capable model yet, after the leak forced its hand.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>There is a lesson buried in this for anyone watching the AI vendor market. Anthropic's value to enterprise buyers is not just model performance. It is the signal that the company will fight to stay independent. A government blacklist, when successfully challenged in court, becomes proof of vendor backbone. Every Fortune 500 legal and procurement team watching that ruling saw the same thing: Anthropic does not fold under political pressure. In a market where vendor lock-in is a real risk and regulatory environments are shifting fast, that posture has dollar value.</p><p>The chaos did real damage. Five outages in a single month is not a footnote; it is a reliability problem that engineers running production workloads have to explain to their managers. Anthropic has to fix that. But the narrative of the week was resilience, not collapse. That matters more in the enterprise than it should.</p><div><hr></div><h2>Cloud Roundup</h2><h3>AWS</h3><p>No headline AWS announcements this week, but the $110.9 billion Q4 2025 global cloud infrastructure number puts the hyperscaler buildout in stark relief. AWS holds roughly 30% of that market. Enterprise AI demand shifting from experimentation to production deployment is the engine behind the numbers, and AWS is expanding infrastructure capacity accordingly. For practitioners: the capacity is there, but so is the cost pressure from finance teams who want to see ROI timelines.</p><h3>Azure</h3><p>Microsoft's stock tells a story the earnings call won't fully capture. 
The company is on track for its worst stock quarter since 2008, and the culprit is not weak fundamentals. Azure grew 39% in Q2 FY2026. The problem is investor math: capital expenditures nearly doubled year-over-year to $29.9 billion. At that spend level, the market wants a returns timeline, and Nadella does not have a clean one yet. The Perplexity cloud deal and the OpenAI partnership are bets, not revenues. The next 90 days of earnings guidance will be closely watched.</p><h3>GCP</h3><p>Google Cloud is the quiet beneficiary of the Microsoft uncertainty and the Anthropic noise. GCP does not have the OpenAI partnership baggage or the Anthropic political drama. For enterprise procurement teams looking for a neutral lane, GCP is the option that does not come with a news cycle attached. That is a positioning advantage worth watching.</p><div><hr></div><h2>AI Model Roundup</h2><h3>OpenAI</h3><p>OpenAI shut down Sora this week. The iOS app, the API, and the Sora.com experience are all going offline, though the company has not published a final shutdown date. The stated reason is compute reallocation. Sora consumed significant GPU capacity that can generate more revenue in coding, reasoning, and text generation. OpenAI raised $110 billion in fresh funding just weeks ago at a $730 billion valuation. Shutting down a consumer product that topped the App Store charts is a clear signal about where the company thinks durable revenue lives. Video generation research will continue internally for robotics training and simulation, which is a sensible narrowing.</p><h3>Anthropic</h3><p>The Claude Mythos confirmation is the week's biggest model news, even if the delivery was unplanned. Beyond the leak, the federal court ruling is the story that enterprise buyers will remember. Anthropic can now say, with a federal court order behind it, that it defended its right to operate independently against a sitting administration. 
That is a data point that belongs in every enterprise procurement brief about AI vendor risk.</p><h3>Google AI</h3><p>No major standalone Google AI model announcements this week. Google's position sits beneath the surface of the bigger stories: its infrastructure fuels a significant portion of the AI workloads generating the cloud spending numbers, and Google DeepMind continues its research publishing cadence. The absence of a headline event this week is not a problem for Google. It is a week where the competitors made the news.</p><div><hr></div><h2>The Pattern I'm Watching</h2><p>I have been watching vendor consolidation cycles for 30 years, and they almost always follow the same sequence. First, a wave of new entrants floods the market with capabilities. Then a stress event, a real one, not a press release, separates the companies that can operate under pressure from the ones that only perform under ideal conditions. Then enterprise buyers sort themselves into camps based on who passed the stress test.</p><p>We are in that second phase right now. Anthropic's March was a stress test. Five outages, a government blacklist, a data leak. The company came through it with a court ruling affirming its independence and a product roadmap that looks stronger for having been forced into daylight. OpenAI's Sora shutdown is a different kind of stress test: not a crisis, but a discipline test. Can a company valued at $730 billion make the call to kill a popular consumer product because it is not the highest-value use of its compute? Apparently yes. That kind of resource discipline is what separates labs that scale from labs that sprawl.</p><p>The third phase, enterprise sorting, is underway. Meta's 65% AI coding tools mandate is not a technology story. It is an organizational commitment story. Meta is betting its engineering velocity on AI-assisted development, and the mandate forces adoption rather than waiting for organic enthusiasm. 
Wikipedia banning AI-generated content is the counterweight: the institutions that care most about accuracy and trust are drawing lines. These two moves will coexist for years. The question worth sitting with: in the organizations you work in or advise, which camp are they moving toward: mandate or moratorium?</p><div><hr></div><p><em>Weekly AI and cloud breakdowns from someone who's been in the game since the early days of the internet. No ads. No filler. The signal.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Tech with Darin - Weekly Rollup March 21, 2026]]></title><description><![CDATA[Jensen Huang said $1 trillion. 
Here's the constraint nobody's talking about.]]></description><link>https://www.techwithdarin.com/p/tech-with-darin-weekly-rollup-march</link><guid isPermaLink="false">https://www.techwithdarin.com/p/tech-with-darin-weekly-rollup-march</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sat, 21 Mar 2026 13:31:29 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/b960cfe8-64f5-4e9d-979c-7c766fa09e3a_2752x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>The Bottom Line (No Jargon Edition)</h2><ul><li><p>Google quietly reversed its 2018 "no military contracts" position and is now openly telling employees it is "leaning more" into Pentagon work. That shift is reshaping who gets hired and what they build.</p></li><li><p>A popular open-source security scanner called Trivy was backdoored this week. The attackers injected credential-stealing malware into the tool teams use to find vulnerabilities. That is a significant escalation. The weapon is now the defense tool itself.</p></li><li><p>Nvidia CEO Jensen Huang projected $1 trillion in AI chip orders through 2027 at the company's annual developer conference. One manufacturer reported Nvidia is already supplying 20% fewer chips than the market needs.</p></li><li><p>Andy Jassy told AWS employees he now expects AI to push AWS annual revenue to $600 billion by 2036. That is double what he projected a year ago.</p></li><li><p>OpenAI plans to nearly double its workforce to 8,000 employees by year-end and is preparing for an IPO. The company also told staff ChatGPT needs to become a genuine productivity tool, not a demo.</p></li></ul><div><hr></div><h2>The Take That Started the Week</h2><p>Jensen Huang did not come to GTC 2026 to be modest. He stood on stage and projected $1 trillion in Blackwell and Vera Rubin chip orders through 2027. For context, he projected $500 billion through 2026 a year ago. He doubled the number in twelve months. 
That is not a forecast. That is a statement of market position.</p><p>The agentic AI thesis is the engine behind it. When AI shifts from a chatbot you open to a system that spawns agents, executes tasks, and calls APIs without you watching, the compute demand does not grow linearly. It compounds. Every AI agent running inference at scale needs hardware. Huang knows exactly what that arithmetic looks like.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>Here is the part that matters for practitioners. MSI reported this week that Nvidia is delivering approximately 20% fewer GPUs than the market currently demands. A $1 trillion projection with a 20% supply gap is not a growth story. It is a rationing story. Teams that secure hardware capacity now are not optimizing costs. They are securing the ability to build at all. The labs that have locked in chip supply agreements are operating in a different competitive environment than the ones queuing for spot capacity.</p><p>The Nvidia-Anthropic hiring debate that ran through the week was the visible surface of this constraint. The real pressure is not talent. It is compute. When hardware is scarce, every decision downstream of that scarcity gets distorted. 
Hiring shortcuts, evaluation compromises, build-versus-buy trade-offs. Watch this dynamic. It shapes the next 18 months more than any model release.</p><div><hr></div><h2>Cloud Roundup</h2><h3>AWS</h3><p>Andy Jassy's internal all-hands comment landed in Reuters this week. He told employees AI could push AWS annual revenue to $600 billion by 2036, doubling his prior estimate of $300 billion. That revision says more about how Jassy reads the agentic AI transition than any earnings call. When inference demand compounds and every enterprise workload starts attaching AI agents, the cloud bill goes up. AWS is the infrastructure that bill gets charged to.</p><p>AWS also announced its 2026 Pioneers cohort: 12 European AI startups working across healthcare diagnostics, climate modeling, and conflict prediction. Alongside that, AWS committed $1 billion in cloud credits for startups developing generative AI solutions. The startup credit play is a long-game customer acquisition strategy. Seed the ecosystem now, collect the revenue when those companies scale.</p><h3>Azure</h3><p>Microsoft's Copilot AI leadership reshuffled this week, freeing Mustafa Suleyman to focus on building new models. The structural read: Microsoft is separating the "ship Copilot features into Office" work from the "build the next generation of models" work. Those two tracks have very different timelines and success metrics. Watch whether that separation produces sharper output or slower coordination.</p><p>OpenAI's partnership with AWS to supply AI models to the U.S. military and government also surfaced this week. Microsoft's exclusive relationship with OpenAI on commercial Azure workloads coexists, somewhat uncomfortably, with OpenAI doing its own government deals. The boundaries of that partnership are getting tested.</p><h3>GCP</h3><p>Google published its "Personal Intelligence" rollout this week, bringing personalized Gemini responses into Chrome and AI Mode for free users. 
The feature pulls context from a user's Google ecosystem data to generate more relevant answers. Google is also testing a Gemini Mac app to put it in the same desktop shortcut slot as ChatGPT and Claude.</p><p>The bigger story is the Pentagon move. The New York Times reported this week that Google is quietly rebuilding its Defense Department relationship after walking away in 2018 following employee protests over Project Maven. Google is now telling staff that working with democratically elected governments is part of its obligations. That is a complete philosophical reversal, and it is happening fast.</p><div><hr></div><h2>AI Model Roundup</h2><h3>OpenAI</h3><p>OpenAI expanded GPT-5.4 access with faster Mini and Nano model variants this week. The model ladder strategy is now clear: large frontier models for complex tasks, small fast models for high-volume inference. That architecture matches how agents actually get deployed. The flagship model reasons. The mini model executes at scale.</p><p>The workforce news is the structural signal. Plans to nearly double headcount to 8,000 by year-end, combined with IPO preparation, means OpenAI is no longer running like a research lab. It is running like a company with quarterly pressure and investor commitments. The internal directive to make ChatGPT a "productivity tool" reflects that. Research culture and revenue culture pull in different directions.</p><h3>Anthropic</h3><p>Anthropic stayed quieter on releases this week, but it sat in the center of the labor market debate. The Nvidia-Anthropic hiring story surfaced questions about evaluation standards under capacity pressure. When compute scarcity forces build timelines to compress, the teams doing the building get squeezed. 
Anthropic's position in that dynamic is interesting: it is one of the best-resourced frontier labs and still feels the pressure of the hardware constraint.</p><h3>Google AI</h3><p>Gemini's Personal Intelligence rollout is Google's answer to the ambient AI question. ChatGPT has the brand. Claude has the trust signal with technical users. Google has the data ecosystem. Personal Intelligence is the move that plays to Google's actual advantage: knowing more about you than any other platform on earth. The Mac app push is table stakes. The data play is the moat.</p><p>Google also rolled out Gemini integration into Workspace this week, enabling the model to generate first drafts in Docs, build spreadsheets, and design presentations from simple prompts. The office suite integration race is now fully engaged. Microsoft Copilot has been shipping this for 18 months. Google is closing the gap.</p><div><hr></div><h2>The Pattern I'm Watching</h2><p>Here is what this week looked like when you zoom out: three unrelated stories all pointed at the same underlying shift. Google reverses course on defense contracts. A security scanner becomes an attack vector. Nvidia projects a trillion-dollar chip market while supply runs 20% short. On the surface, those stories live in different domains: policy, security, and hardware. The through-line is that infrastructure decisions are now strategic in ways they were not two years ago.</p><p>I watched this pattern play out before. In the mid-1990s, network infrastructure went from a back-office cost center to a competitive weapon. The companies that treated bandwidth, routing architecture, and physical co-location as strategic priorities pulled ahead. The ones that treated those as commodity procurement problems lost the decade. The current moment rhymes. 
The teams treating GPU allocation as a strategic question, treating their security tooling supply chain as a threat surface, and watching how their cloud providers are positioning on government contracts will have different options than the teams that are not.</p><p>The Trivy compromise is the one I keep coming back to. The thing that got backdoored was the tool designed to find backdoors. That is not a security failure with a patch. That is a structural challenge with no clean fix. Your security posture is only as reliable as the integrity of the tools you use to measure it. Thirty years in, that feels like the most underpriced risk in the current environment.</p><p>What is the infrastructure decision your team is treating as a commodity problem that you should be treating as a strategic one?</p><div><hr></div><h2>Sign-Off</h2><p>Infrastructure used to be a decision you made and then mostly forgot about. That era ended this week, if it had not already. The questions your team is answering about where your compute lives, who built your security toolchain, and which government contracts your cloud providers are chasing are not IT decisions anymore. They are business decisions with 5-year consequences.</p><p>Hit reply and tell me. I read every response.</p><p>– Darin</p><div><hr></div><p><em>Weekly AI and cloud breakdowns from someone who's been in the game since the early days of the internet. No ads. No filler. The signal.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication.
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[When the Model Is No Longer the Product]]></title><description><![CDATA[OpenAI's Astral acquisition is one move on a larger board. The 30-year pattern behind developer platform consolidation, and what it means for your stack.]]></description><link>https://www.techwithdarin.com/p/when-the-model-is-no-longer-the-product</link><guid isPermaLink="false">https://www.techwithdarin.com/p/when-the-model-is-no-longer-the-product</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Fri, 20 Mar 2026 21:33:31 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0aa36b9d-2703-47e2-ab75-3197b073d9cb_1080x1350.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>OpenAI bought Astral this week. Astral makes ruff and uv, two Python developer tools that the community adopted not because anyone marketed them, but because they are genuinely excellent. Ruff replaced an entire category of Python linters. uv replaced pip in enough real environments that "just use uv" became the default answer in engineering Slack threads. Astral's founder Charlie Marsh confirmed the deal would bring those tools into OpenAI's Codex platform, with a commitment to keep the open-source tools alive post-acquisition.</p><p>The headline will get filed under "AI company buys developer tool startup." 
That framing misses the point entirely.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2>1. The Headline Everyone Is Talking About</h2><p>OpenAI is not buying a Python linter. OpenAI is buying the trust that lives inside every Python developer's <code>.venv</code> folder.</p><p>That trust took years to build. Charlie Marsh and the Astral team earned it by shipping tools that were faster, more opinionated, and less annoying than the incumbents. The developer community rewarded that by making ruff and uv near-defaults in serious Python projects. OpenAI wrote a check for that installed base, that credibility, and the engineers behind it.</p><p>But the Astral deal is one piece of a pattern that became hard to ignore this week.</p><p>The same 72-hour window brought us the WSJ and Verge reporting that OpenAI is planning a desktop superapp. ChatGPT, Codex, and the Atlas browser unified into a single product under Fidji Simo, OpenAI's Chief of Applications. Greg Brockman is involved. The internal memo framing, per Simo's post on X: "when new bets start to work, like we're seeing now with Codex, it's very important to double down." 
Reuters reported separately that OpenAI is building a GitHub alternative. And the Windsurf acquisition, the AI-native IDE that was in talks with OpenAI for months, reportedly stalled because Microsoft wanted access to Windsurf's IP to protect GitHub Copilot.</p><p>Read those stories individually and they're interesting. Read them together and you see the architecture of a developer platform.</p><div><hr></div><h2>2. What Happened Last Time</h2><p>In 1995, Microsoft was already the dominant PC operating system vendor. But Bill Gates made a decision that locked in the next decade of developer loyalty: Microsoft would own the entire developer experience, not just the OS.</p><p>Visual Basic had been around since 1991. Visual C++ was shipping. In 1997, Microsoft shipped Visual Studio 97, bundling everything into a single IDE. They bought Fox Software for FoxPro. They acquired Vermeer Technologies for FrontPage. They partnered with and then slowly absorbed the tooling that developers used every day. The strategy was never about the individual tool. It was about making the cost of switching away from Windows feel prohibitive because every tool you depended on, every debugger, every deployment utility, every source control client, was stitched into the Microsoft fabric.</p><p>By the time the DOJ antitrust case was in full swing, the developer lock-in was already complete. Developers didn't stay on Windows because they loved Windows. They stayed because their entire workflow was Windows. The OS was the platform. The platform was the tools. The tools were the moat.</p><p>Here is the number that puts that era in context: by 1997, Visual Studio had an estimated 7 million licensed users. Microsoft's developer tools division was generating over $1 billion in annual revenue. That was before the internet reshaped everything.</p><p>The second parallel worth naming is IBM's acquisition of the Rational Software portfolio in 2003.
IBM bought Rational for $2.1 billion, not because it needed another software product line, but because Rational owned the workflows. Rose, RequisitePro, ClearCase: if your team ran on Rational tools, you ran on IBM tools, and IBM's consulting and services arm was right there to help you do more of that. The tools created the relationship. The relationship created the revenue.</p><p>Both stories have the same structure. The platform company identifies where developers spend their time. It acquires the trust that already lives there. It connects that trust to a broader surface area. Then it waits.</p><div><hr></div><h2>3. What Is Different This Time</h2><p>Three things are different, and they matter.</p><p><strong>The speed is different.</strong> Microsoft's developer platform consolidation took roughly a decade to reach critical mass. OpenAI's version is happening in months. Astral's tools have been widely adopted for less than three years. Claude Code went from zero to being the benchmark that Codex chases in about eighteen months. The cycle that used to take a decade now takes a product cycle or two.</p><p><strong>The moat target is different.</strong> In the 1990s, Microsoft was locking in the workflow around writing code. OpenAI is locking in the workflow around writing code with AI. That is a larger surface area. It includes the model, the IDE, the linter, the dependency manager, the code reviewer, the browser, and soon the code repository. Every layer that a developer touches in a day is now a layer that OpenAI is trying to own or influence. If they succeed, switching away from OpenAI won't mean "install a different IDE." It will mean "rebuild your entire cognitive workflow."</p><p><strong>The incumbent threat is more direct.</strong> When Microsoft built its developer platform, the competition was fragmented. Borland, Watcom, Metrowerks: nobody had a coherent counter-strategy. The competition OpenAI faces in 2026 is much sharper.
Anthropic launched a code review tool this month and committed $100 million to a Claude Partner Network. Claude Code's ARR is part of Anthropic's $2.5 billion+ total business. Anthropic is not fragmented. It is making the same bet on developer trust from a different angle. The $15-25 per pull request code review pricing is a stake in the ground that says: Claude belongs in your deployment pipeline, not just your editor.</p><p>Meanwhile Microsoft, which owns GitHub and has powered GitHub Copilot with OpenAI models since 2021, is watching its own partner turn into its most direct competitor. The Windsurf situation crystallized that tension. OpenAI wanted the IDE. Microsoft wanted the IP. The stalemate is a window into how much the relationship has frayed.</p><p><strong>The regulatory and open-source dimension is also different.</strong> Microsoft could acquire developer tools in the 1990s without much scrutiny and without community blowback. Charlie Marsh's promise to keep ruff and uv open-source after the Astral deal closes is a direct response to that difference. The developer community in 2026 has cultural antibodies to platform capture. They remember what happened to tools that got acquired and then quietly deprioritized. OpenAI is making a calculated bet that it can absorb Astral's trust without triggering the immune response. That bet is not guaranteed to pay off.</p><div><hr></div><h2>4. The Practitioner Playbook</h2><p>This is the section I'd want if I were an engineering leader right now. Not the analysis. The decisions.</p><p><strong>If you run a cloud platform or infrastructure product:</strong></p><p>Your moat just thinned. OpenAI building a GitHub alternative and a superapp that wraps the browser is a direct bid for the compute-adjacent developer relationship that AWS, Azure, and GCP have spent a decade cultivating. The developer who used to start from the cloud console now starts from Codex. 
If OpenAI succeeds in making Codex the place where code is written, reviewed, deployed, and iterated on, the cloud provider becomes a commodity below that layer. This is not a threat that materializes overnight. But the architectural direction is clear. Start thinking now about what you own that OpenAI cannot easily replicate. Execution, compliance, data locality, enterprise relationships. Those are the levers. Positioning on raw compute won't hold forever.</p><p><strong>If you build developer tooling or a security product that sits in the dev workflow:</strong></p><p>You are being targeted. Not personally, but structurally. The integrated platform play always squeezes independent tool vendors. In the Microsoft era, Borland built great C++ tools. Microsoft made Visual C++ good enough and bundled it free. Borland didn't lose because Microsoft's product was better. It lost because developers didn't need to pay separately for something that shipped with the platform. Watch how OpenAI treats third-party integrations with Codex over the next 18 months. If they start replicating capabilities that partners built, that's the tell. Start building your differentiation around workflow depth and enterprise integration, not around the surface-level feature set that an AI platform can ship in a sprint.</p><p><strong>If you are a technology leader evaluating toolchain strategy:</strong></p><p>Do not consolidate onto any single AI platform right now. The competition between OpenAI and Anthropic is real and it is running hot. That competition is good for you today. It means aggressive pricing, generous rate limits, and both platforms making concessions to win enterprise deals. Claude Code users in early 2026 were reporting over $1,000 of effective usage against $200/month plans. That is OpenAI and Anthropic buying market share. Lock in those rates where you can, but architect your workflow so you can move. 
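</p><p><em>A minimal sketch of that portability advice, in Python. Everything here is hypothetical: the adapter names and the <code>complete()</code> routing function are illustrations, not any vendor's real SDK. The point is the shape: one internal request type, one adapter registry, one seam.</em></p>

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class ChatRequest:
    """One internal request shape, whatever the vendor."""
    system: str
    user: str
    max_tokens: int = 512

# Placeholder adapters. A real adapter would call the vendor SDK
# and normalize its response to a plain string.
def _openai_adapter(req: ChatRequest) -> str:
    raise NotImplementedError("wire the OpenAI SDK in here")

def _anthropic_adapter(req: ChatRequest) -> str:
    raise NotImplementedError("wire the Anthropic SDK in here")

ADAPTERS: Dict[str, Callable[[ChatRequest], str]] = {
    "openai": _openai_adapter,
    "anthropic": _anthropic_adapter,
}

def complete(provider: str, req: ChatRequest) -> str:
    # The only seam the rest of the codebase sees. Switching vendors
    # is one new adapter plus a config change, not a rewrite.
    return ADAPTERS[provider](req)
```

<p>Call sites import <code>complete()</code> and nothing vendor-specific, which is what keeps the renegotiation leverage on your side of the table.</p><p>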
The team that builds deep platform dependency today will be the team renegotiating from a weak position in 24 months.</p><p><strong>If you are thinking about the Microsoft relationship specifically:</strong></p><p>The OpenAI-Microsoft partnership is under structural strain. Microsoft now derives roughly 45% of its remaining performance obligation from OpenAI. That is a dependency, not a partnership. Microsoft shipping Copilot Cowork on Anthropic's technology is a hedge. OpenAI building a GitHub alternative is a hedge in the other direction. Both companies are covering their exits while keeping the partnership alive because neither can afford to blow it up yet. If you are a Microsoft enterprise customer, start asking your rep how Microsoft's AI strategy works if the OpenAI relationship changes materially. The answer to that question will tell you a lot about where Microsoft's actual differentiation sits.</p><div><hr></div><h2>The Pattern Underneath All of This</h2><p>Here is the 30-year read.</p><p>Every major technology cycle ends the same way. The enabling layer, whether that's the operating system, the cloud, or the AI model, gets commoditized or consolidated, and the winner is the company that owns the workflow on top of it. Microsoft owned the workflow in the PC era. AWS owned the workflow in the early cloud era. Salesforce owned the workflow in the CRM era.</p><p>OpenAI is trying to own the workflow in the AI-native development era. Acquiring Astral is one move on that board. The superapp is another. The GitHub alternative is a third. Taken together, they form a coherent theory of where the value accretes.</p><p>The model is not the product. The model is the engine. The product is the complete surface area where a developer does their work. OpenAI figured that out faster than most people expected.</p><p>The companies that lose in this cycle will be the ones that kept betting on the model being the moat. 
The companies that win will be the ones that got to the workflow first.</p><p>Astral was the workflow. Now it's OpenAI's.</p><div><hr></div><p><em>Hit reply and tell me what you're seeing in your own stack. I read every response.</em></p><p><em>– Darin</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The Week AI Got a Geopolitical Problem]]></title><description><![CDATA[Pentagon bans, private capital responds, Nvidia bets $26B on models, and datacenters become military targets.
The signal from week 11 of 2026.]]></description><link>https://www.techwithdarin.com/p/the-week-ai-got-a-geopolitical-problem</link><guid isPermaLink="false">https://www.techwithdarin.com/p/the-week-ai-got-a-geopolitical-problem</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sat, 14 Mar 2026 18:46:55 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/2bcf687c-0a06-43de-b901-9e0e287ffdaf_1080x1350.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>The Bottom Line (No Jargon Edition)</h2><ul><li><p><strong>The Pentagon banned Anthropic, then 185,000 people downloaded Claude in 24 hours.</strong> A government agency said a company&#8217;s AI product was off-limits for national security reasons. Consumer reaction: record downloads. Government procurement and consumer behavior are now moving in opposite directions. That gap has real consequences for enterprise AI strategy.</p></li></ul><ul><li><p><strong>Private capital moved in immediately.</strong> Blackstone, the firm that manages over a trillion dollars in assets, entered talks with Anthropic to deploy Claude across its portfolio the same week the Pentagon ban landed. When governments shut the door on a tech company, the money doesn&#8217;t wait. It finds the next door.</p></li></ul><ul><li><p><strong>Nvidia made the biggest strategic bet of the year.</strong> The company that sells chips to every AI lab in the world is now spending $26 billion to build AI models itself, competing directly with the same labs it powers. It also pre-announced NemoClaw, an open-source agent platform. If you thought Nvidia was just a hardware company, that framing is now out of date.</p></li></ul><ul><li><p><strong>Iran struck AI datacenters in the Gulf.</strong> Hyperscale cloud infrastructure became a military target this week in a way that is no longer theoretical.
The cloud resilience architecture most teams have built assumes power outages, hardware failure, and software bugs. Geopolitical conflict is a different threat model. Most DR plans don&#8217;t account for it.</p></li></ul><ul><li><p><strong>LinkedIn rebuilt its feed algorithm using large language models.</strong> The platform that distributes most practitioner-written content quietly upgraded the system that decides what gets read. If your content strategy relied on keyword frequency or fast engagement signals, the game just changed. Writing from real experience is not just the honest approach anymore; it&#8217;s now the algorithmic one.</p></li></ul><ul><li><p><strong>Atlassian cut 1,600 people, mostly R&amp;D engineers, to fund AI tooling. The stock barely moved.</strong> The market has already decided how it values engineering headcount relative to AI tooling spend. This is not an Atlassian story. It&#8217;s a pattern running across Salesforce, Microsoft, Google, and every major enterprise software company. It matters if you work in software R&amp;D.</p></li></ul><ul><li><p><strong>Google picked up the Pentagon&#8217;s AI contract after Anthropic walked away.</strong> Three million government employees now have access to Gemini. Anthropic refused the contract over safety guardrails. Google accepted it. The divergence in how these two companies handle government AI is no longer subtle.</p></li></ul><p><strong>The connecting thread:</strong> Government AI policy is moving on its own timeline, and it is increasingly out of sync with both consumer adoption and private capital deployment. The companies navigating that misalignment best are the ones that will define enterprise AI in 2026 and beyond.</p><div><hr></div><h2>The Take That Started the Week</h2><p>I have been watching the relationship between governments and technology companies for three decades.
There is a pattern, and it has played out the same way every time.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>Phase one: The government has a problem. The tech company has a solution. The government gets excited. Phase two: The government puts conditions on the relationship that the tech company finds unacceptable. The tech company pushes back. Phase three: One side blinks, or they both walk away. Phase four: The technology becomes so mission-critical that the dispute ends on whatever terms the market dictates.</p><p>Anthropic&#8217;s Pentagon situation is in phase two right now. The Pentagon wanted Anthropic to modify Claude&#8217;s safety guardrails to support classified military work. Anthropic said no. The Pentagon said fine, we&#8217;ll use Gemini. One hundred eighty-five thousand people downloaded Claude in the 24 hours that followed.</p><p>That last number is the one that doesn&#8217;t fit the pattern. In every previous cycle (telecommunications, encryption, cloud access), consumer adoption followed the enterprise lead. Governments made decisions and the market followed. This time, consumers did the opposite of what the government signaled.
They ran toward the product the government said to avoid.</p><p>I think this is because AI is the first technology that people use personally before they encounter it professionally. Most practitioners first used Claude or ChatGPT on their own phone, on their own time, for their own curiosity. The trust relationship was already built before any enterprise procurement decision happened. When the Pentagon said stop using it, people who already trusted it didn&#8217;t stop.</p><p>What that means for practitioners is this: the AI vendor you&#8217;re evaluating for enterprise deployment is almost certainly one your team is already using personally. The evaluation process is not objective. It never was. But now the personal adoption path is running so far ahead of the enterprise procurement path that the gap has become a governance problem. Which model is actually running inside the tools you buy? Which vendor&#8217;s API is your vendor calling? The supply chain question for AI is genuinely harder than most teams have answered.</p><p>Anthropic&#8217;s move, refusing the Pentagon contract and picking up the Blackstone deal, is the most interesting strategic positioning I have seen from an AI company this year. They are explicitly betting that private capital markets, not government procurement, are the right distribution channel for a safety-forward AI company. Whether that bet pays off depends on how much enterprise buyers actually care about safety positioning versus capability. Based on the last two years, capability wins every head-to-head. But Anthropic is betting 2026 is different. I am not sure I would take that bet, but I understand why they made it.</p><div><hr></div><h2>Cloud Roundup</h2><h3>AWS</h3><p>The AWS story this week is largely the shadow of the Amazon-OpenAI $50 billion investment announced last month, still reverberating through enterprise procurement conversations.
AWS is positioning Bedrock as the infrastructure layer for running models regardless of which lab wins the model race. The Graviton strategy applied to AI: don&#8217;t win on the model, win on the runtime. That is a defensible moat if the model market commoditizes. The question is whether any single model stays dominant long enough to matter before the next one arrives.</p><p>Secondary: AWS Trainium continues to pick up adoption from customers who want to avoid GPU dependency. No major announcements this week, but the competitive pressure on Nvidia from Amazon&#8217;s custom silicon is a slow-moving story that practitioners in large AWS shops should be watching.</p><h3>Azure</h3><p>Microsoft&#8217;s positioning this week is quietly strong. They have Copilot embedded in Microsoft 365 across the enterprise. They have an Anthropic licensing deal that gives them access to Claude inside Azure. They have the OpenAI relationship. Two model relationships, one distribution channel, one invoice. For enterprise IT leaders who need to justify a single platform decision, Microsoft&#8217;s AI story is the easiest one to tell to a procurement committee.</p><p>The M365 E7 tier at $99 per user per month is the vehicle for that story. If your organization is already paying for M365, the incremental AI cost is increasingly hard to resist, even if the per-seat economics feel steep. Watch for Microsoft to push harder on that conversion in Q2.</p><h3>GCP</h3><p>Google had a significant week. The Wiz acquisition closed: the $32 billion deal that gives Google cross-cloud security visibility across AWS, Azure, and GCP workloads. The strategic intent is clear: own the security layer that enterprise customers need regardless of which cloud provider they&#8217;re on. AWS has GuardDuty. Microsoft has Defender. Google now has Wiz. Security and cloud infrastructure are merging.</p><p>The Pentagon Gemini deployment is the other major development.
Three million government users is not a rounding error. It establishes Google as the enterprise and government AI vendor of record in a way that was not clear before Anthropic walked away. Whether Google can hold that position against a potential Microsoft challenge is the question for Q2.</p><div><hr></div><h2>AI Model Roundup</h2><h3>OpenAI</h3><p>The classified military deal and the Caitlin Kalinowski resignation are the dominant OpenAI stories this week. The hardware lead walking out is a talent signal, not just a policy signal. When a senior technical executive leaves over a values question, the internal debate was real and it was not settled cleanly. ChatGPT uninstalls up 295% is a consumer signal. Those two numbers together, talent exit plus consumer rejection, are worth tracking over the next 90 days.</p><p>The Promptfoo acquisition also closed. OpenAI now owns the tool that 125,000 developers use to red-team AI systems, including OpenAI&#8217;s own models. The conflict of interest critique is valid. The strategic logic is also valid. Both things are true.</p><h3>Anthropic</h3><p>Two things happened simultaneously this week that would have seemed contradictory 12 months ago: Anthropic lost a major government contract and began talks with Blackstone for what could be a much larger private capital deployment.</p><p>Claude Code is at $2.5 billion in annualized revenue. The zero-commission Claude Marketplace is live with six enterprise partners. The Pentagon ban created more consumer downloads than any marketing campaign Anthropic has run. By most measures, the company is in a stronger market position today than it was before the Pentagon dispute started. That is an unusual thing to say about a company that just walked away from a government contract, but the numbers support it.</p><h3>Google AI</h3><p>Gemini is now deployed across Google Workspace for enterprise users. Three million Pentagon employees.
The head-to-head with Microsoft Copilot is no longer theoretical; it is active in millions of enterprise and government seats simultaneously. Google&#8217;s AI distribution story is better than its AI model story, which is exactly the right position to be in if you believe model commoditization is inevitable.</p><div><hr></div><h2>The Pattern I&#8217;m Watching</h2><p>There is a word for what is happening across Anthropic, OpenAI, Microsoft, Google, and now Nvidia this week: vertical integration. Every major player is trying to own more of the stack simultaneously.</p><p>Nvidia goes from chips to models to the agent platform. OpenAI goes from models to security testing tools to classified military deployments. Anthropic goes from models to a marketplace to private capital joint ventures. Google goes from models to workspace distribution to government deployments to cloud security. Microsoft goes from cloud to workspace to model licensing to government contracting.</p><p>I have seen this pattern before. In the late 1990s, every major enterprise software company tried to own the database, the middleware, the application layer, and the consulting services simultaneously. Most of them failed. The ones that survived did so by dominating one layer so thoroughly that the others became defensible territory, not by winning everywhere at once.</p><p>The AI version of this is playing out faster than any previous cycle. The question I keep asking is: which company actually has a monopoly on one layer? Nvidia has the closest thing, GPU training dominance, and they are now voluntarily exiting that monopoly position to compete on models and platforms. That is either the most confident strategic move in tech history or a sign that they know their training moat is shakier than it looks.</p><p>Thirty years in, I have learned that companies that try to own the full stack at the same time usually end up owning none of it well.
The counter-examples are memorable precisely because they are rare. Is any of these companies Apple? I genuinely do not know yet. But that is the question that will define the next five years.</p><div><hr></div><p><em>Weekly AI and cloud breakdowns from someone who&#8217;s been in the game since the early days of the internet. No ads. No filler. The signal.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The 30-Year Pattern: Why I Held a Certification Exam Guide Against My Own System]]></title><description><![CDATA[Every certification I&#8217;ve earned has taught me something.]]></description><link>https://www.techwithdarin.com/p/the-30-year-pattern-why-i-held-a</link><guid isPermaLink="false">https://www.techwithdarin.com/p/the-30-year-pattern-why-i-held-a</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Fri, 13 Mar 2026 13:12:11 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7jGC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc876f3ff-f147-49d8-9c85-a7e80a2f2b21_928x928.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Every certification 
I&#8217;ve earned has taught me something. But not always the thing on the exam.</p>
      <p>
          <a href="https://www.techwithdarin.com/p/the-30-year-pattern-why-i-held-a">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The Week AI Stopped Competing and Started Converging]]></title><description><![CDATA[AWS invested $50B to host OpenAI. GPT-5.4 rated Claude higher than itself. Claude found 22 Firefox bugs. The infrastructure layer won this week.]]></description><link>https://www.techwithdarin.com/p/the-week-ai-stopped-competing-and</link><guid isPermaLink="false">https://www.techwithdarin.com/p/the-week-ai-stopped-competing-and</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sat, 07 Mar 2026 14:02:39 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/cea54567-bc2d-48b6-9908-629d177cb9a6_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1>AI + Cloud &#8212; Week of March 3, 2026</h1><div><hr></div><h2>The Bottom Line (No Jargon Edition)</h2><ul><li><p><strong>AWS is spending $50 billion to run OpenAI&#8217;s software on its computers.</strong> That&#8217;s one of the biggest tech infrastructure deals ever. Think of it like a massive factory being built &#8212; not to make the product, but to house the machines that make the product. AWS wants to be the building where AI lives.</p></li><li><p><strong>OpenAI released a smarter version of its AI assistant (GPT-5.4).</strong> When tested against Anthropic&#8217;s Claude on a real business project, GPT-5.4 honestly admitted Claude did a better job at the first draft. That kind of self-awareness in AI is new and surprisingly useful &#8212; it means the tools are getting better at knowing their own limits.</p></li><li><p><strong>An AI assistant found 22 security holes in Firefox&#8217;s code in two weeks.</strong> The encouraging part: it could barely exploit any of them. Modern security protections stopped it almost every time. 
Translation: AI is about to make software much safer by finding problems faster, while existing defenses still work.</p></li><li><p><strong>Three companies launched AI &#8220;coworker&#8221; products in the same week.</strong> Anthropic, AWS, and Google all moved their AI from answering questions to doing actual work &#8212; scheduling tasks, writing code, managing files. The shift from &#8220;chatbot&#8221; to &#8220;autonomous assistant&#8221; happened faster than anyone expected.</p></li><li><p><strong>Google released a cheaper, faster AI model (Gemini 3.1 Flash-Lite).</strong> At a fraction of the cost of premium models, this gives smaller teams and startups access to capable AI without enterprise budgets. The price of intelligence keeps falling.</p></li></ul><p><strong>The connecting thread:</strong> This was the week AI stopped competing on who&#8217;s smartest and started competing on who&#8217;s most useful. The infrastructure, not the intelligence, is becoming the battleground.</p><div><hr></div><h2>The Take That Started the Week</h2><p>This week I published a piece about something I watched happen in real time: <strong>three companies &#8212; Anthropic, AWS, and Google &#8212; all made the same move within days of each other.</strong> They shifted AI from chatbot to coworker.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>Anthropic launched scheduled tasks in Cowork. AWS shipped Bedrock agents with stateful runtime environments. Google expanded Gemini&#8217;s workspace integrations. All of them moving in the same direction: AI that does work, not just answers questions.</p><p>I&#8217;ve been building and operating systems for over 30 years. The pattern here is identical to what happened with DevOps, then containers, then serverless. The raw capability commoditizes fast. What differentiates teams is the harness &#8212; the constraints, feedback loops, and observability layers that turn raw capability into reliable output.</p><p><strong>The teams already winning with AI agents aren&#8217;t the ones with the best model.</strong> They&#8217;re the ones who built the best control systems around the model. Guardrails that prevent hallucinations from reaching production. Feedback loops that improve output quality over time. Monitoring that catches drift before it becomes a problem.</p><p>This is the control-layer-as-moat thesis I keep coming back to. The model is the engine. The harness is the car. Nobody buys an engine without a car.</p><p>I&#8217;ve watched this exact fork happen with virtualization, containers, cloud, and now AI. Depth wins every time. 
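</p><p>In code, the harness idea is smaller than it sounds. Here is a minimal sketch of the loop: generate, check against a guardrail, feed the failure back, retry. Everything here (the function names, the retry budget, the toy stand-ins) is illustrative, not any vendor&#8217;s API:</p>

```python
MAX_RETRIES = 3  # illustrative budget, not a recommended value

def harness(generate, validate, prompt):
    """Wrap a raw model call (generate) with a guardrail (validate).

    Both callables are placeholders for whatever model client and
    domain-specific checks a real team would plug in.
    """
    for _ in range(MAX_RETRIES):
        output = generate(prompt)
        ok, feedback = validate(output)
        if ok:
            return output  # passed the guardrail; safe to ship downstream
        # Feedback loop: fold the failure back into the next attempt.
        prompt = prompt + "\nFix: " + feedback
    raise RuntimeError("guardrail never passed; escalate to a human")

# Toy stand-ins so the sketch runs without any model API.
def fake_model(prompt):
    return "result v2" if "Fix:" in prompt else "result v1"

def fake_check(output):
    return ("v2" in output, "needs v2")

print(harness(fake_model, fake_check, "draft the SOW"))  # prints: result v2
```

<p>The point is not the dozen lines; it is that the retry budget, the validators, and the escalation path are where teams actually differentiate.</p><p>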
The timeline just keeps compressing.</p><div><hr></div><h2>Cloud Roundup: March 2026</h2><p><strong>AWS</strong> had its biggest infrastructure week in recent memory &#8212; and it wasn&#8217;t a re:Invent.</p><p>The headline: <strong>a $50 billion, multi-year deal to host OpenAI on AWS infrastructure.</strong> Initial commitment is $15 billion. The practical impact for practitioners is already landing &#8212; Amazon Bedrock now has a Stateful Runtime Environment and an OpenAI-compatible Projects API, bringing better context management, access control, and cost tracking.</p><p>This is the Graviton playbook applied to AI. Own the substrate. Make the workloads sticky. AWS isn&#8217;t building frontier models &#8212; they&#8217;re building the platform where everyone else&#8217;s models run. Same strategy, different decade.</p><p>Also worth flagging: MediaConvert&#8217;s Probe API hit GA (rapid metadata analysis without full processing &#8212; useful for video pipelines), and AppConfig&#8217;s New Relic integration now enables automated feature flag rollbacks in seconds instead of minutes. Both are the kind of quiet operational upgrades that add up.</p><div><hr></div><p><strong>Azure</strong> had a quiet first week of March. No major GA releases or pricing changes hit the practitioner radar. Sometimes the most useful thing to report is that nothing broke and nothing changed &#8212; stability has value too.</p><div><hr></div><p><strong>GCP</strong> was similarly quiet this week on the infrastructure side. The bigger Google news was on the AI model side (see below).</p><div><hr></div><h2>AI Model Roundup: March 2026</h2><p><strong>OpenAI</strong> shipped GPT-5.4 Instant on March 4 &#8212; better conversational flow, improved web search integration, and notably fewer refusals. 
The model is more direct, which matters for production workflows where over-cautious responses slow down real work.</p><p>But the story I&#8217;m most interested in isn&#8217;t the benchmark improvement. I tested GPT-5.4 against Claude Opus 4.6 on an actual client proposal &#8212; not a toy task, a real business deliverable built from rough meeting notes. Then I asked GPT-5.4 to score both outputs honestly.</p><p><strong>GPT-5.4&#8217;s self-assessment:</strong> Claude&#8217;s first draft: 8.5/10. Its own: 7/10. But as a foundation for the final SOW? GPT-5.4 rated itself 8.5 to Claude&#8217;s 8.</p><p>GPT-5.4&#8217;s own words: &#8220;Your draft is the better first draft. It reads like something a human would actually circulate.&#8221; But it also noted: &#8220;My version was weaker as a first draft, but stronger as a don&#8217;t-miss-anything scaffold.&#8221;</p><p><strong>That calibration is new.</strong> Earlier GPT versions would have rated themselves higher. The willingness to honestly assess relative strengths is a more important capability improvement than any benchmark delta. It means you can actually trust the model&#8217;s self-evaluation when deciding which tool to use for which stage of the work.</p><div><hr></div><p><strong>Anthropic</strong> had a week that demonstrated range.</p><p>On the product side: <strong>Cowork launched scheduled tasks</strong> &#8212; browser-based AI that runs on a recurring schedule without human intervention. I&#8217;ve been using it to automate my entire daily content pipeline: four stages from 5 AM research to 7 PM engagement. The coupling of scheduled automation with browser context is genuinely new.</p><p>On the security side: <strong>Anthropic partnered with Mozilla to test Claude against Firefox&#8217;s codebase.</strong> In two weeks, Claude found 22 vulnerabilities (14 high-severity) &#8212; nearly one-fifth of all high-severity Firefox bugs fixed in 2025. 
The first one was found in 20 minutes.</p><p>But here&#8217;s the nuance that matters: despite finding 22 bugs, Claude could only exploit 2 of them &#8212; and only in test environments without browser sandboxing. Anthropic spent $4K in API credits on exploitation attempts. The defender&#8217;s advantage is real: AI finds vulnerabilities much faster than it can exploit them. Defense-in-depth works. That&#8217;s the most important finding in this research.</p><div><hr></div><p><strong>Google AI</strong> released Gemini 3.1 Flash-Lite on March 4 &#8212; a cost-optimized model at $0.25/M input tokens and $1.50/M output tokens. This is Google&#8217;s play for the high-volume, cost-sensitive workloads that can&#8217;t justify premium model pricing.</p><p>The pricing strategy is clear: make the entry point so cheap that teams default to Google for their bulk inference needs, then upsell to Pro for the complex tasks. It&#8217;s the classic cloud pricing playbook &#8212; free tier hooks, volume tier retains.</p><div><hr></div><h2>The Pattern I&#8217;m Watching</h2><p>Three signals from this week all point the same direction:</p><ol><li><p>AWS invested $50 billion not to build AI models, but to host them.</p></li><li><p>GPT-5.4 honestly scored Claude higher than itself on a first-draft task.</p></li><li><p>Claude found vulnerabilities in Firefox&#8217;s codebase faster than any human team could &#8212; but couldn&#8217;t exploit them.</p></li></ol><p><strong>The model layer is commoditizing.</strong> When GPT rates Claude higher on some tasks and Claude rates GPT higher on others, the question &#8220;which model is best?&#8221; loses meaning. The answer is always &#8220;it depends on the task.&#8221;</p><p><strong>The infrastructure layer is concentrating value.</strong> AWS hosting OpenAI is the same signal as AWS hosting Anthropic. 
The platform that runs everything wins regardless of which model wins.</p><p><strong>The security and operations layers are becoming the differentiator.</strong> Claude finding Firefox bugs in 20 minutes but failing to exploit them is a preview of what AI-accelerated security looks like. The teams with the best patching velocity, the best observability, and the best control planes will outperform the teams with the best models.</p><p>Same pattern. Different decade. The infrastructure always wins &#8212; it just takes a cycle for everyone to notice.</p><p>What&#8217;s your current strategy &#8212; are you picking models, or building platforms?</p><p>Hit reply and tell me. I read every response.</p><p>&#8212; Darin</p><div><hr></div><p><em>Weekly AI and cloud breakdowns from someone who&#8217;s been in the game since the early days of the internet. No ads. No filler. Just the signal.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[This Week in AI + Cloud: Your Experience Is the Advantage, Not the Liability]]></title><description><![CDATA[AI distillation attacks, broken benchmarks, cloud updates from AWS/Azure/GCP, and why your years of experience are the one thing AI can't replace. Weekly signal]]></description><link>https://www.techwithdarin.com/p/this-week-in-ai-cloud-your-experience</link><guid isPermaLink="false">https://www.techwithdarin.com/p/this-week-in-ai-cloud-your-experience</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sat, 28 Feb 2026 03:48:54 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/3c50ba4a-cc10-4d6a-abe8-afaf2fa7de5e_1200x630.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>The Bottom Line (No Jargon Edition)</h2><p>If you only read one section, read this. Here&#8217;s what happened this week in plain English:</p><ul><li><p><strong>Your experience is your superpower, not your weakness.</strong> AI tools can write code and generate content fast, but they don&#8217;t know what &#8220;good&#8221; looks like in your field. If you&#8217;ve been doing your job for 10, 20, or 30 years, you have exactly the judgment AI lacks. Don&#8217;t be afraid of it &#8212; learn to use it. You&#8217;ll be more valuable, not less.</p></li><li><p><strong>Three Chinese companies got caught copying one of the biggest AI models.</strong> They ran millions of fake conversations with Anthropic&#8217;s Claude to steal its intelligence. 
Nobody broke in &#8212; they just used the product at massive scale through fake accounts. It&#8217;s like photocopying an entire library one page at a time. The takeaway: protecting AI isn&#8217;t just about locks and firewalls anymore.</p></li><li><p><strong>A popular AI scorecard turned out to be broken.</strong> The test that companies used to prove their AI could write code? Over half the test cases were flawed, and the AI models had basically memorized the answers. So those impressive scores you keep seeing? Take them with a grain of salt. Ask: was this a private test, or could the AI have studied the answer key?</p></li><li><p><strong>The big three clouds (AWS, Azure, Google) all made it easier to build with AI this week.</strong> AWS lets AI agents take real actions now (not just chat). Azure added Anthropic&#8217;s best model to its data platform. Google kept simplifying its tools so you spend less time configuring and more time building.</p></li><li><p><strong>OpenAI is no longer exclusive to Microsoft.</strong> Their agent-building platform is coming to Amazon&#8217;s cloud too. That means you&#8217;ll have more choices about where to run AI tools &#8212; and that&#8217;s good for everyone.</p></li></ul><p><strong>The thread connecting all of it:</strong> In a world flooded with AI-generated everything, the real value is in knowing what&#8217;s actually good, what&#8217;s actually true, and what actually works. That&#8217;s human judgment. And it&#8217;s not going anywhere.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2>The Take That Started the Week</h2><p>This week I wrote something that hit a nerve: <strong>the comfort zone has a cost. You just don&#8217;t see the invoice until it&#8217;s too late.</strong></p><p>I talk to engineers every week who are genuinely afraid AI is going to replace them. And I get it &#8212; the headlines are engineered to scare you. But after watching 30 years of tech shifts play out, I can tell you: fear is the wrong operating system for what&#8217;s actually happening.</p><p>Virtualization was supposed to replace sysadmins. Cloud was going to eliminate infrastructure teams. DevOps would make ops engineers obsolete. I was there for all three. None of it played out the way the fearmongers predicted. What actually happened: the roles changed, the people who adapted early thrived, and the ones who froze got left behind. AI is the same pattern. Faster timeline, same playbook.</p><p><strong>Here&#8217;s the part I really want to land with this audience:</strong></p><p>If you have 10, 20, 30 years in your field, you don&#8217;t have a disadvantage. You have a massive advantage that most people haven&#8217;t recognized yet. AI generates, but it doesn&#8217;t judge. It produces output, but it doesn&#8217;t know what good looks like in your domain. You do. That judgment &#8212; knowing which output is right, which approach fits, which edge cases will bite you in production &#8212; that&#8217;s the part AI can&#8217;t replicate.
And it&#8217;s the part that only comes from years of doing the work.</p><p>A junior engineer with AI is fast but unfiltered. A senior engineer with AI is a force multiplier. Your experience isn&#8217;t the thing being replaced &#8212; it&#8217;s the thing that makes AI actually useful. You just have to harness it.</p><p>I laid out three paths I&#8217;m watching emerge: the Orchestrator (manage agents, define outcomes), the Systems Builder (build the infrastructure agents run on), and the Domain Translator (combine deep industry expertise with AI tools to build things nobody else can). None of them require you to be an AI researcher. All of them require you to start.</p><div><hr></div><h2>Also This Week: Two Stories That Should Change How You Evaluate AI</h2><p><strong>Anthropic caught three Chinese AI labs distilling Claude at industrial scale.</strong></p><p>DeepSeek ran 150,000+ exchanges targeting reasoning. Moonshot AI hit 3.4 million+ targeting tool use, coding, and computer vision. MiniMax &#8212; the largest &#8212; ran 13 million+ exchanges focused on agentic coding. Total: 16 million+ exchanges across 24,000 fraudulent accounts running through commercial proxy services.</p><p>Nobody hacked anything. They used the API exactly as designed, at massive scale, through fake identities. The attack surface was the product itself.</p><p>Anthropic&#8217;s response included behavioral fingerprinting classifiers, strengthened verification, and countermeasures at the product, API, and model levels. But the bigger takeaway isn&#8217;t about Anthropic&#8217;s defenses. It&#8217;s that the AI moat isn&#8217;t the model &#8212; it&#8217;s the control system around it. Export controls on chips don&#8217;t work when knowledge flows out through the API. This pattern will play out across every major lab.</p><p><strong>OpenAI published why they stopped using SWE-bench Verified.</strong></p><p>They audited 27.6% of the dataset. 
Of those, 59.4% had flawed test cases that rejected correct code. 35.5% enforced implementation details never mentioned in the task. Worse: every frontier model they tested could reproduce the original human-written solutions verbatim. The models had memorized the answers. Scores climbed from 74.9% to 80.9% in six months. The capability didn&#8217;t improve &#8212; the benchmark got gamed.</p><p>Classic Goodhart&#8217;s Law applied to AI. When a measure becomes a target, it stops being useful.</p><p>OpenAI now recommends SWE-bench Pro and built their own private benchmark called GDPVal. The shift to private evaluation is the real signal. If someone shows you a benchmark score from a public dataset, the first question should be: is the test private? If not, you might be comparing memorization.</p><div><hr></div><h2>Cloud Roundup: Late February 2026</h2><p><strong>AWS</strong> had a quieter week by recent standards, but one update matters.</p><p><strong>Amazon Bedrock now supports server-side tool execution via AgentCore</strong> &#8212; secure actions like web search and database updates, executed server-side within the Bedrock environment. If you&#8217;re building AI agents on AWS, this is the piece that lets agents actually do things without you managing the tool execution infrastructure yourself. Also: EKS Node Monitoring Agent went open source (community contributions welcome), and Deadline Cloud added task chunking for better rendering throughput.</p><div><hr></div><p><strong>Azure</strong> landed a notable model addition.</p><p><strong>Claude Opus 4.6 is now on Azure Databricks</strong> (as of Feb 26). Serverless Workspaces for Databricks hit GA. WAF Default Ruleset 2.2 is now the standard for Application Gateway &#8212; update your configs. Also flagged: the DHE cipher suite retirement hits Azure Front Door and CDN on April 1. 
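</p><p>A quick local sanity check before that date: ask your own TLS stack which DHE suites it still offers. This is a hedged sketch using Python&#8217;s standard ssl module; it only inspects the client side and says nothing about Front Door&#8217;s configuration itself:</p>

```python
import ssl

# List the DHE/EDH cipher suites the local OpenSSL build still offers.
# If clients in your fleet pin any of these, test them against a staging
# endpoint before the Front Door / CDN retirement lands.
ctx = ssl.create_default_context()
dhe_suites = sorted({c["name"] for c in ctx.get_ciphers()
                     if c["name"].startswith(("DHE", "EDH"))})
print(dhe_suites)  # an empty list means nothing to migrate on this client
```

<p>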
Start planning now if you&#8217;re affected.</p><p>The Databricks play is interesting &#8212; Azure is positioning itself as the neutral platform where you can access any model through the analytics stack, not just through the core AI services.</p><div><hr></div><p><strong>GCP</strong> focused on operational improvements.</p><p>AlloyDB now integrates with Database Center for prioritized health monitoring &#8212; one-click navigation to recommended fixes. Composer deployments generate Airflow v3-compatible DAGs, which means the Airflow v2 end-of-life migration just got a clear path. API Hub got a specification boost preview that improves documentation quality automatically.</p><p>Google&#8217;s pattern continues: reduce friction, improve defaults, make the platform disappear so teams focus on building.</p><div><hr></div><h2>AI Model Roundup: Late February 2026</h2><p><strong>OpenAI</strong> made a strategic move that&#8217;s bigger than any model release: <strong>Frontier is coming to AWS.</strong></p><p>OpenAI&#8217;s no-code agent platform &#8212; build, deploy, and manage AI agents &#8212; will be hosted on Amazon&#8217;s infrastructure alongside Azure. This is the first real crack in the Microsoft-OpenAI exclusivity narrative. Microsoft still retains exclusive IP rights, but the compute layer is diversifying. For practitioners, this means your cloud choice may stop being an AI provider choice. That&#8217;s a good thing.</p><p>Also: $285B valuation after a $1B Thrive Capital investment, and multi-year alliances with BCG, McKinsey, Accenture, and Capgemini for enterprise adoption. OpenAI is building the consulting channel. The enterprise sales motion is accelerating.</p><div><hr></div><p><strong>Anthropic</strong> released RSP 3.0 on February 24 &#8212; updated safety protocols addressing misalignment risks. Government deployments were confirmed in classified environments, with restrictions on firms linked to foreign adversaries. 
And of course, the distillation attacks disclosure dominated the conversation (covered above).</p><p>The pattern I&#8217;m seeing from Anthropic this month: security and trust as competitive differentiators. While other labs race on capabilities, Anthropic is racing on the control layer. That&#8217;s consistent with their positioning from day one &#8212; and the distillation disclosure is evidence that the threats they&#8217;ve been planning for are now real.</p><div><hr></div><p><strong>Google AI</strong> shipped Gemini 3.1 Flash image generation with real-time web knowledge and consistent character appearance. Android task automation expanded to multi-step actions through Uber, Lyft, and DoorDash. Lyria 3 now generates 30-second music tracks from text prompts.</p><p>The consumer play is aggressive. Google is embedding Gemini into every surface &#8212; phone, browser, workspace, photos. The practitioner signal: if you&#8217;re building on Google&#8217;s stack, the AI primitives are showing up everywhere. The question isn&#8217;t &#8220;should we use AI&#8221; &#8212; it&#8217;s &#8220;which layer do we integrate with.&#8221;</p><div><hr></div><h2>The Pattern I&#8217;m Watching</h2><p>Three themes collided this week, and they&#8217;re all connected.</p><p><strong>Theme 1: The benchmark trust crisis.</strong> SWE-bench Verified just became the poster child for Goodhart&#8217;s Law in AI. Public benchmarks are getting gamed. Labs are shifting to private evaluation. The implication: we&#8217;re entering a period where you can&#8217;t compare AI tools by published scores alone. Hands-on evaluation is the only evaluation that counts.</p><p><strong>Theme 2: The model security arms race.</strong> Anthropic&#8217;s distillation disclosure proves that AI capabilities are now a target &#8212; not just the infrastructure that runs them. The moat isn&#8217;t the model. It&#8217;s the detection, verification, and control systems around it. Every lab will face this. 
The ones who invested in security early will have the advantage.</p><p><strong>Theme 3: Experience as competitive advantage.</strong> In a world where AI handles the generation and junior-level execution, the premium shifts to judgment. Knowing what good looks like. Knowing which edge cases matter. Knowing when the AI&#8217;s output looks right but isn&#8217;t. That&#8217;s 10, 20, 30 years of pattern recognition &#8212; and it&#8217;s exactly what AI can&#8217;t replicate.</p><p>These three themes are the same theme. In a world flooded with generated output &#8212; code, benchmarks, model capabilities, content &#8212; the value moves to judgment, verification, and trust. The people and organizations that can separate signal from noise will win.</p><p>That&#8217;s been true for every tech cycle I&#8217;ve watched. It&#8217;s just never been this obvious.</p><p>What&#8217;s your take &#8212; are you seeing the same pattern in your world?</p><p>Hit reply and tell me. I read every response.</p><p>&#8212; Darin</p><div><hr></div><p><em>Weekly AI and cloud breakdowns from someone who&#8217;s been in the game since the early days of the internet. No ads. No filler. Just the signal.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[This Week in AI + Cloud: The Developer Career Fork, AWS/Azure/GCP, and a Benchmark That Changes Everything]]></title><description><![CDATA[AI + Cloud &#8212; Week of February 22, 2026]]></description><link>https://www.techwithdarin.com/p/this-week-in-ai-cloud-the-developer</link><guid isPermaLink="false">https://www.techwithdarin.com/p/this-week-in-ai-cloud-the-developer</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sun, 22 Feb 2026 11:28:33 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0af99562-0a3d-4946-8628-2dec5f6f9dff_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><strong>AI + Cloud &#8212; Week of February 22, 2026</strong></h2><div><hr></div><h2><strong>The Take That Started the Week</strong></h2><p>This week I published a piece I&#8217;ve been sitting on for a while: <strong>AI has forked the developer career into three tracks.</strong></p><p>Not killed it. Forked it.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h3><strong>Track 1 &#8212; The Orchestrator</strong></h3><p>Writes specs and manages agents. Nobody on the team is touching code &#8212; the agents write, test, and ship. The humans write the specs and review the results.</p><p>The unit of work is no longer the instruction. It&#8217;s the token &#8212; a unit of purchased intelligence.</p><h3><strong>Track 2 &#8212; The Systems Builder</strong></h3><p>Builds what the orchestrators use: agent frameworks, eval pipelines, routing layers that send the right task to the right model at the right cost.</p><p>This is where 30 years of infrastructure experience pays off. High bar. High ceiling.</p><h3><strong>Track 3 &#8212; The Domain Translator</strong></h3><p>The one nobody&#8217;s talking about.</p><p>Technical fluency + deep domain expertise = build tools instead of just using them. The dental practice specialist. The construction scheduling expert. The insurance compliance analyst who can now ship software.</p><p>These people are becoming developers &#8212; without CS degrees or bootcamps.</p><p><strong>The person most exposed right now:</strong> the competent coder in the middle. Solid code. No deep systems expertise. No deep domain expertise. Generic code production value is going to zero.</p><p>I&#8217;ve seen this exact pattern with DevOps, cloud, and containers. 
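</p><p>Track 2&#8217;s routing layer sounds abstract, so here is the smallest possible sketch of the idea: pick the cheapest model that clears the task&#8217;s capability bar. The model names and prices are invented placeholders, not anyone&#8217;s real price sheet:</p>

```python
# Hypothetical catalog: name -> (cost per 1M tokens, capability tier).
CATALOG = {
    "small-fast": (0.25, 1),
    "mid-general": (3.00, 2),
    "big-reasoning": (15.00, 3),
}

def route(required_tier):
    """Return the cheapest catalog model that meets the required tier."""
    eligible = [(cost, name) for name, (cost, tier) in CATALOG.items()
                if tier >= required_tier]
    if not eligible:
        raise ValueError("no model meets this tier")
    return min(eligible)[1]  # min by cost first, then by name

print(route(1))  # prints: small-fast (the cheapest model clears the bar)
print(route(3))  # prints: big-reasoning (only the top tier qualifies)
```

<p>Real routers add token estimates, latency budgets, and fallbacks, but the shape stays this simple.</p><p>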
Depth wins every time.</p><p>The difference now is the timeline &#8212; this fork is happening in months, not years.</p><div><hr></div><h2><strong>Cloud Roundup: February 2026</strong></h2><h3><strong>AWS</strong></h3><p><strong>AWS</strong> had one of its stronger February drops in recent memory. Two highlights worth your attention:</p><ul><li><p><strong>Claude Opus 4.6 is now in Amazon Bedrock.</strong> The most powerful model currently available is now natively inside AWS. If you&#8217;re building AI apps on AWS and still stitching together third-party APIs, your architecture just got simpler.</p></li><li><p><strong>EC2 G7e instances with NVIDIA Blackwell GPUs.</strong> Up to 2.3x inference performance over the previous generation. LLM and multimodal workloads just got significantly cheaper to run at scale.</p></li></ul><p>Also worth flagging:</p><ul><li><p>DynamoDB now supports cross-account global tables (big for multi-tenant architectures)</p></li><li><p>ECS gets native canary and linear deployments via NLB</p></li><li><p>Aurora DSQL dropped SDK connectors for Go, Python, and Node.js with IAM auth auto-handled</p></li><li><p>Network Firewall dropped data processing charges for TLS Advanced Inspection</p></li></ul><p>Security upgrade that also cuts the bill &#8212; those rarely come together.</p><div><hr></div><h3><strong>Azure</strong></h3><p><strong>Azure</strong> had a quieter but practical month:</p><ul><li><p>New AMD Turin + Intel Xeon 6 VM families (Dasv7, Easv7, Fasv7) are now GA &#8212; better price-performance across general purpose, memory-optimized, and compute-optimized workloads.</p></li><li><p>AKS gets LocalDNS (lower latency inside clusters) and auto encryption-at-host &#8212; two fewer things to configure manually.</p></li><li><p>Azure Functions adds .NET 10 runtime support.</p></li><li><p><strong>Claude Sonnet 4.6 is now on Azure AI.</strong> Both major clouds now have Anthropic&#8217;s latest model available.
The hyperscaler AI integration race is real &#8212; and the developer wins either way.</p></li></ul><div><hr></div><h3><strong>GCP</strong></h3><p><strong>GCP</strong> had one genuinely remarkable update: <strong>Gemini 3.1 Pro with a 1 million token context window.</strong></p><p>One million tokens means entire codebases, full legal document sets, complete video transcripts &#8212; all processed in a single inference call.</p><p>That&#8217;s not incremental. It changes what&#8217;s architecturally possible.</p><p>Also landed:</p><ul><li><p>GKE now auto-selects between Persistent Disk and Hyperdisk based on hardware compatibility (no more manual pairing or complex scheduling rules)</p></li><li><p>Cloud SQL adds brute-force attack detection baked in by default</p></li><li><p>OpenAPI v3 support for API Gateway is now GA</p></li><li><p>AlloyDB integrates with Database Center for one-click health remediation</p></li></ul><p>Google&#8217;s pattern this month: reduce the operational burden everywhere so teams can focus on what they&#8217;re actually building.</p><div><hr></div><h2><strong>AI Model Roundup: February 2026</strong></h2><h3><strong>OpenAI</strong></h3><p><strong>OpenAI</strong> shipped GPT-5.2 and retired GPT-4o, GPT-4.1, and o4-mini from ChatGPT in the same month.</p><p>That pattern &#8212; accelerate and consolidate simultaneously &#8212; is something I&#8217;ve been watching play out every quarter now.</p><p><strong>Practical implication:</strong> if your team has workflows, prompts, or evals tuned to any of those retired models, February is a good time to audit what you&#8217;re actually calling. 
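One way to start that audit, sketched below: a quick scan of a repo for retired model names. This is a hypothetical helper, not an OpenAI tool; the model list and file extensions are assumptions, so adjust them for your own stack.

```python
import re
from pathlib import Path

# Models retired from ChatGPT this month (per the roundup above).
RETIRED_MODELS = ["gpt-4o", "gpt-4.1", "o4-mini"]
# File types likely to hold prompts, configs, or eval definitions (an assumption).
SCAN_SUFFIXES = {".py", ".ts", ".json", ".yaml", ".yml", ".md"}

def find_retired_refs(root: str = ".") -> list:
    """Return (file, model) pairs for every retired-model mention under root."""
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix not in SCAN_SUFFIXES:
            continue
        text = path.read_text(errors="ignore")
        for model in RETIRED_MODELS:
            # Case-insensitive match; re.escape keeps the "." in gpt-4.1 literal.
            if re.search(re.escape(model), text, re.IGNORECASE):
                hits.append((str(path), model))
    return hits
```

Point it at the repo that holds your prompts and configs; every hit is a workflow worth re-testing before the surface underneath it moves.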
The API versions aren&#8217;t changing yet, but the ChatGPT surface is moving on.</p><p>Also shipped:</p><ul><li><p>Lockdown Mode for enterprise security (data exfiltration protections, better admin oversight)</p></li><li><p>File attachments bumped to 20 per message</p></li><li><p>Code Blocks with a proper IDE experience inside ChatGPT</p></li></ul><div><hr></div><h3><strong>Anthropic</strong></h3><p><strong>Anthropic</strong> shipped two major models in 12 days: Claude Opus 4.6 on February 5, and Claude Sonnet 4.6 on February 17.</p><p>That release velocity is a signal about where the company is operating right now.</p><p>The updates I&#8217;m watching most closely:</p><ul><li><p><strong>Claude Code is now included in every Team plan.</strong> Previously an add-on. The barrier to AI-assisted coding just disappeared for a lot of teams.</p></li><li><p><strong>HIPAA-ready Claude for Enterprise.</strong> Healthcare AI just got a credible, enterprise-grade option.</p></li><li><p><strong>Apple Xcode 26.3 integrates the Claude Agent SDK.</strong> The agentic coding wave is hitting every major IDE.</p></li><li><p><strong>Permanently ad-free &#8212; official.</strong> Anthropic made it explicit: no ads, ever. Their reasoning: advertising incentives fundamentally conflict with building a genuinely helpful assistant.</p></li></ul><p>Business model shapes product behavior. That positioning choice matters more than it sounds in a market where every free tier is hunting for monetization.</p><div><hr></div><h3><strong>Google AI</strong></h3><p><strong>Google AI</strong> had one number dominate the conversation: <strong>77.1% on ARC-AGI-2.</strong></p><p>That&#8217;s more than double the reasoning performance of Gemini 3 Pro.</p><p>ARC-AGI-2 is one of the harder benchmarks for measuring general reasoning &#8212; not just pattern matching. 
Hitting 77% would have been unimaginable two years ago.</p><p>Also:</p><ul><li><p>Gemini 3 Flash + Pro moved from preview to GA in AI Studio</p></li><li><p>Gemini 3.1 Pro is free to use during the preview period &#8212; classic developer adoption strategy</p></li><li><p>Workspace AI now has an Expanded Access add-on, with Gemini usage metrics available in the Admin console</p></li></ul><p>If you&#8217;re trying to build a business case for AI investment at your org, that admin visibility feature is worth a closer look.</p><p>The question isn&#8217;t whether AI is being used &#8212; it&#8217;s whether you can measure it.</p><div><hr></div><h2><strong>The Pattern I&#8217;m Watching</strong></h2><p>One thing stands out across all six of these companies this month:</p><p>Every cloud is racing to be the platform where you run your AI.</p><p>Every AI lab is racing to make their model available on every cloud.</p><p>AWS has Claude. Azure has Claude. GCP has Gemini. All of them will have everything within a year.</p><p>The winner of this race will probably be whoever makes the integration seamless enough that teams stop thinking about it as a separate decision.</p><p>Right now, it&#8217;s still a decision. That window is closing.</p><p><strong>Which platform are you building on &#8212; and has that choice gotten harder or easier in the last six months?</strong></p><p>Hit reply or post a comment and tell me. I read every response.</p><p>&#8212; Darin</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Lockdown Mode, Billable Agents, and the Cost of Autonomy]]></title><description><![CDATA[Between February 10&#8211;15, three signals landed that matter if you&#8217;re building or buying agents in production:]]></description><link>https://www.techwithdarin.com/p/lockdown-mode-billable-agents-and</link><guid isPermaLink="false">https://www.techwithdarin.com/p/lockdown-mode-billable-agents-and</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Fri, 20 Feb 2026 05:11:30 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/1581e23f-a808-4e3d-8f7e-96602fa73b49_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Between February 10&#8211;15, three signals landed that matter if you&#8217;re building or buying agents in production:</p><ul><li><p>OpenAI introduced <strong>Lockdown Mode</strong> and <strong>Elevated Risk labels</strong> inside ChatGPT.</p></li><li><p>Google Cloud pushed <strong>Vertex AI Agent Engine</strong> capabilities into GA &#8212; including billable code execution, sessions, and memory.</p></li><li><p>Amazon Web Services expanded model choice in Bedrock with <strong>Claude Sonnet 4.6</strong> from Anthropic.</p></li></ul><p>This isn&#8217;t about model benchmarks.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div 
class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>It&#8217;s about <strong>control, cost, and production posture</strong>.</p><div><hr></div><h2><strong>1. OpenAI: Agents Now Ship with Guardrails</strong></h2><p>With Lockdown Mode, browsing and network-exposed tools can be restricted to prevent prompt injection and tool abuse. Elevated Risk labels surface contextual warnings before certain capabilities are used.</p><p>Translation for enterprise:</p><ul><li><p>Agent autonomy is no longer &#8220;just trust the prompt.&#8221;</p></li><li><p>Risk posture becomes visible and configurable.</p></li><li><p>Admins can constrain behavior without killing capability entirely.</p></li></ul><p>This is a shift from &#8220;powerful model&#8221; to <strong>managed execution environment</strong>.</p><p>If you&#8217;re running document ingestion, compliance extraction, financial analysis, or anything with external tool calls &#8212; this matters.</p><div><hr></div><h2><strong>2. 
Google: Agent Runtime Is Now Infrastructure</strong></h2><p>Google Cloud moved Code Execution, Sessions, and Memory Bank in Vertex AI Agent Engine to GA.</p><p>And importantly:</p><p>They&#8217;re billable.</p><p>That means:</p><ul><li><p>Session state persistence costs money.</p></li><li><p>Sandbox code execution costs money.</p></li><li><p>Memory storage costs money.</p></li><li><p>Agent loops now show up in your cloud bill.</p></li></ul><p>For teams used to &#8220;model token cost&#8221; as the primary driver &#8212; this is the next wave of FinOps for AI.</p><p>You&#8217;re not just paying for tokens.</p><p>You&#8217;re paying for <strong>runtime behavior</strong>.</p><div><hr></div><h2><strong>3. AWS: Model Optionality Is the Strategy</strong></h2><p>Amazon Web Services added Claude Sonnet 4.6 to Bedrock.</p><p>This continues AWS&#8217;s strategy:</p><ul><li><p>Provide multiple frontier models.</p></li><li><p>Let customers benchmark inside their VPC.</p></li><li><p>Keep control plane + data residency consistent.</p></li></ul><p>For enterprise buyers, this matters more than leaderboard scores.</p><p>Optionality + governance + isolation = leverage.</p><div><hr></div><h2><strong>Here&#8217;s the throughline</strong></h2><p>This isn&#8217;t just feature shipping. 
It&#8217;s a shift in posture.</p><p><strong>Safety</strong> used to live in policy docs and internal guidelines.</p><p>Now it&#8217;s enforced at runtime with configurable controls.</p><p><strong>Cost</strong> used to mean tokens.</p><p>Now it means tokens <strong>plus</strong> compute, memory, and session persistence.</p><p><strong>Autonomy</strong> used to be prompt-level intelligence.</p><p>Now it&#8217;s managed execution inside a governed environment.</p><p><strong>Procurement</strong> used to be model comparison.</p><p>Now it&#8217;s platform + runtime evaluation.</p><div><hr></div><p>The industry conversation has quietly moved from:</p><blockquote><p>&#8220;Which model writes better code?&#8221;</p></blockquote><p>To:</p><blockquote><p>&#8220;Which environment can safely and predictably finish work?&#8221;</p></blockquote><p>That&#8217;s a very different buying decision.</p><div><hr></div><h1><strong>What I&#8217;d Do This Week</strong></h1><p>If you&#8217;re serious about agents in production, don&#8217;t debate Twitter takes.</p><p>Run a structured benchmark.</p><p>Example:</p><p><strong>Pipeline</strong></p><ul><li><p>Document ingestion</p></li><li><p>Structured extraction</p></li><li><p>JSON schema enforcement</p></li><li><p>Compliance tagging</p></li></ul><p>Deploy on:</p><ul><li><p>Bedrock with Claude Sonnet 4.6</p></li><li><p>Vertex AI Agent Engine with sessions enabled</p></li><li><p>OpenAI with Lockdown Mode toggled on/off</p></li></ul><p>Track:</p><ol><li><p>Task success rate</p></li><li><p>Median end-to-end latency</p></li><li><p>Cost per task (including runtime)</p></li><li><p>Failure mode type (hallucination vs tool misuse vs timeout)</p></li></ol><p>Because in 2026, the decision isn&#8217;t just model quality.</p><p>It&#8217;s:</p><ul><li><p>Can I constrain it?</p></li><li><p>Can I observe it?</p></li><li><p>Can I predict the bill?</p></li></ul><div><hr></div><h1><strong>The Bigger Pattern</strong></h1><p>We&#8217;ve moved from:</p><p>Models 
&#8594; Agents &#8594; Agent Platforms &#8594; <strong>Agent Governance</strong></p><p>And the moat isn&#8217;t raw capability anymore.</p><p>It&#8217;s control surfaces.</p><p>Safety flags.</p><p>Session billing.</p><p>Model optionality.</p><p>Isolation boundaries.</p><p>The vendors are telling you something very clearly:</p><p>Agents are not a toy layer.</p><p>They&#8217;re infrastructure now.</p><p>If you treat them like a chatbot feature, your architecture will lag.</p><p>If you treat them like a distributed system with risk and cost controls, you&#8217;ll win.</p>]]></content:encoded></item><item><title><![CDATA[The Entry-Level Job is Gone. 
Here&#8217;s the new one (2026)]]></title><description><![CDATA[If you&#8217;re graduating into tech right now &#8212; or you&#8217;re early-career and feeling weirdly behind &#8212; you&#8217;re not imagining it.]]></description><link>https://www.techwithdarin.com/p/the-entry-level-job-is-gone-heres</link><guid isPermaLink="false">https://www.techwithdarin.com/p/the-entry-level-job-is-gone-heres</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sun, 15 Feb 2026 02:32:04 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!4B2V!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff69ac3d5-f261-4e0f-9c92-ca59f685d498_1024x1024.png" length="0" type="image/png"/><content:encoded><![CDATA[<p>If you&#8217;re graduating into tech right now &#8212; or you&#8217;re early-career and feeling weirdly behind &#8212; you&#8217;re not imagining it.</p><p>The entry-level bar moved.</p><p>Not because you got worse.</p><p>Because a lot of what used to be junior work is now handled by AI + a senior engineer reviewing it.</p><p>So the market is doing what markets always do: it stops paying for what&#8217;s abundant.</p><p>Code is abundant now.</p><p>Which means the thing that gets you hired in 2026 isn&#8217;t &#8220;I can code.&#8221;</p><p>It&#8217;s:</p><p>Can you be trusted to ship?</p><p>That&#8217;s the new job.</p><p>And the fastest way to win is to stop training for the old one.</p><p><strong>The new competition: &#8220;AI + a senior&#8221; (not other grads)</strong></p><p>Here&#8217;s 
the mental model that changes everything:</p><p>You&#8217;re not being compared to a senior engineer.</p><p>You&#8217;re being compared to:</p><ul><li><p>an AI that can generate 80% of a feature in minutes</p></li><li><p>plus a senior who knows what &#8220;good&#8221; looks like</p></li></ul><p>So &#8220;I can type code fast&#8221; isn&#8217;t a differentiator anymore.</p><p>Your differentiator is your ability to turn AI output into something:</p><ul><li><p>correct</p></li><li><p>secure</p></li><li><p>reliable</p></li><li><p>deployable</p></li><li><p>explainable</p></li></ul><p>That&#8217;s not &#8220;senior-only.&#8221; That&#8217;s baseline.</p><p><strong>1) Stop trying to be a faster coder. Become a better driver.</strong></p><p>In 2026, code is the easy part.</p><p>The job is driving.</p><p>Driving means you can:</p><ul><li><p>turn vague asks into clear requirements</p></li><li><p>steer AI toward the right shape of solution</p></li><li><p>catch bad outputs before they ship</p></li><li><p>prove it works (tests + observability)</p></li><li><p>make tradeoffs (cost / security / performance)</p></li></ul><p><strong>The cheat code: the code review mindset</strong></p><p>If you&#8217;re early-career, you can level up faster by practicing reviewing instead of only writing.</p><p></p><p>Use this loop:</p><ol><li><p>Use AI to generate the module</p></li><li><p>Review it like it&#8217;s going into production</p></li><li><p>Fix what&#8217;s wrong</p></li><li><p>Write down what you found and why it mattered</p></li></ol><p>What &#8220;wrong&#8221; looks like:</p><ul><li><p>insecure defaults (wide permissions, open CORS, missing auth)</p></li><li><p>missing input validation</p></li><li><p>edge cases (timeouts, retries, empty data, partial failures)</p></li><li><p>dependency landmines</p></li><li><p>performance traps</p></li><li><p>&#8220;works locally&#8221; but breaks in deployment</p></li></ul><p>If you can consistently do this, you stop reading as &#8220;junior who needs 
babysitting&#8221; and start reading as &#8220;junior who reduces risk.&#8221;</p><p>That&#8217;s who gets hired.</p><p></p><p><strong>2) The 2026 Golden Trio: learn what companies urgently need</strong></p><p>You don&#8217;t need to learn everything. You need leverage.</p><p></p><p>Here are three domains that keep showing up because they map to real business pain:</p><p><strong>Cloud-native &#8220;ship it&#8221; skills</strong></p><p>Not theory. Not just certs.</p><p>Real skills:</p><ul><li><p>deploy an API or app</p></li><li><p>set up logs/metrics</p></li><li><p>understand basic scaling</p></li><li><p>make authentication not sketchy</p></li><li><p>have a cost opinion (even a simple one)</p></li></ul><p></p><p></p><p>If you can deploy something cleanly, you&#8217;re instantly above average.</p><p><strong>AI integration: RAG + tool use</strong></p><p>You don&#8217;t need to train models to be valuable.</p><p>You need to connect them to real systems:</p><ul><li><p>documents</p></li><li><p>databases</p></li><li><p>internal tools</p></li><li><p>APIs</p></li></ul><p>In production, &#8220;useful AI&#8221; usually means retrieval + tools + guardrails.</p><p><strong>DevSecOps lite: don&#8217;t ship footguns</strong></p><p>AI increases code volume. More code means more vulnerabilities.</p><p>If you can demonstrate:</p><ul><li><p>least privilege thinking</p></li><li><p>secrets handled properly</p></li><li><p>dependency hygiene</p></li><li><p>basic secure-by-default design</p></li></ul><p>&#8230;you become low-risk.</p><p>Low-risk gets offers.</p><p></p><p><strong>3) Portfolios in 2026: tutorials are invisible</strong></p><p>Hiring managers are speed-scanning.</p><p>If your portfolio is:</p><ul><li><p>todo app</p></li><li><p>weather app</p></li><li><p>calculator</p></li></ul><p>&#8230;it reads like &#8220;followed a guide.&#8221;</p><p>Instead, build what I call an Anti-Tutorial Portfolio:</p><p><strong>One real project. One real user. 
Real constraints.</strong></p><p>Pick:</p><ul><li><p>a friend&#8217;s side hustle</p></li><li><p>a local nonprofit</p></li><li><p>your own annoying workflow</p></li><li><p>a small business problem</p></li></ul><p>Then ship something small but real.</p><p>Usage forces real engineering:</p><ul><li><p>requirements change</p></li><li><p>edge cases appear</p></li><li><p>reliability matters</p></li><li><p>you iterate</p></li></ul><p><strong>Add a Decision Log (this makes you stand out)</strong></p><p>In your README, include a short &#8220;Decision Log&#8221; like:</p><ul><li><p>why this architecture</p></li><li><p>why these services/libraries</p></li><li><p>what tradeoffs you made</p></li><li><p>how you handled auth</p></li><li><p>how you thought about cost</p></li><li><p>what you&#8217;d change at 10&#215; scale</p></li></ul><p>This signals: builder mindset, not student mindset.</p><p><strong>Documentation is back</strong></p><p>In a world of AI-generated spaghetti, clean docs are a superpower:</p><p></p><ul><li><p>clear README</p></li><li><p>API spec</p></li><li><p>diagram</p></li><li><p>runbook</p></li><li><p>how to run locally + in cloud</p></li></ul><p></p><p><strong>4) Soft skills just became career skills</strong></p><p>When routine tasks get automated, human value rises.</p><p></p><p>The people who win can:</p><ul><li><p>explain constraints to non-technical stakeholders</p></li><li><p>break vague asks into shippable chunks</p></li><li><p>ask good questions</p></li><li><p>learn fast and adapt without being handheld</p></li></ul><p>This is engineering, not &#8220;being extroverted.&#8221;</p><p></p><p><strong>The 30-day plan that produces proof (not vibes)</strong></p><p></p><p><strong>Week 1: Ship something deployed</strong></p><p>Build a small API/app. Deploy it. Add logs. Write a clean README.</p><p>Signal: I can ship.</p><p><strong>Week 2: Add AI that touches real data</strong></p><p>Add RAG over real docs/data. Show evaluation. 
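A minimal sketch of what that evaluation step can look like. The keyword retriever below is a stand-in for whatever retrieval you actually wire up (vector search, hybrid, etc.), and all names are illustrative, not from any library.

```python
def hit_rate(eval_set, retrieve, k=3):
    """Fraction of (question, expected_doc_id) pairs where the expected
    document appears in the top-k retrieved results."""
    hits = 0
    for question, expected in eval_set:
        top_ids = [doc_id for doc_id, _score in retrieve(question)[:k]]
        hits += expected in top_ids
    return hits / len(eval_set)

def make_keyword_retriever(corpus):
    """Toy retriever: score docs by word overlap with the question.
    corpus maps doc_id -> text."""
    def retrieve(question):
        words = set(question.lower().split())
        scored = [(doc_id, len(words & set(text.lower().split())))
                  for doc_id, text in corpus.items()]
        return sorted(scored, key=lambda pair: -pair[1])
    return retrieve
```

Even 20 hand-labeled question-to-document pairs gives you a hit rate you can report in the README and re-run as the corpus changes.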
Document failure modes.</p><p>Signal: I can make AI useful.</p><p><strong>Week 3: Add an agent workflow (with guardrails)</strong></p><p>Let AI call tools/APIs. Add validation and tests.</p><p>Signal: I can orchestrate safely.</p><p><strong>Week 4: Make it hiring-ready</strong></p><p>Security cleanup, architecture diagram, 2-minute demo video, tighten docs.</p><p>Signal: I can communicate like a pro.</p><p></p><p><strong>Capstone project ideas (pick one)</strong></p><p><strong>Cloud / Platform track: Ops Copilot for a Service</strong></p><ul><li><p>Deploy a small service (API + datastore)</p></li><li><p>Add observability (logs/metrics/alerts)</p></li><li><p>AI feature: summarize incidents + suggest runbook steps</p></li><li><p>Bonus: cost report + &#8220;why this design&#8221; decision log</p></li></ul><p><strong>Backend track: Support Triage Assistant</strong></p><ul><li><p>Ingest tickets/emails/forms &#8594; categorize &#8594; route</p></li><li><p>AI feature: summarize + propose response drafts</p></li><li><p>Guardrails: sensitive data handling + approval workflow</p></li><li><p>Bonus: evaluation set (20 sample tickets) + accuracy reporting</p></li></ul><p><strong>Full-stack track: Real User Dashboard</strong></p><ul><li><p>Build a dashboard for a real user (inventory, bookings, donations, etc.)</p></li><li><p>AI feature: Ask your data (RAG over records + definitions)</p></li><li><p>Bonus: role-based access + audit log</p></li></ul><p><strong>Security track: AI Code Safety Gate</strong></p><ul><li><p>Pipeline that scans PRs for risky patterns</p></li><li><p>AI feature: explain findings + recommend fixes</p></li><li><p>Bonus: policy-as-code + least-privilege reference templates</p></li></ul><p><strong>FinOps / Cost track: Cloud Cost Guardrails</strong></p><ul><li><p>Pull billing/cost signals (even mocked)</p></li><li><p>AI feature: explain spikes + recommend actions</p></li><li><p>Bonus: expected monthly cost model + alerts</p></li></ul><p></p><p><strong>The line 
that changes everything</strong></p><p>In 2026, your job isn&#8217;t to prove you can code.</p><p>Your job is to prove you can take AI output and turn it into something:</p><p>safe</p><p>reliable</p><p>real</p><p>That&#8217;s the person teams want on day one.</p>]]></content:encoded></item><item><title><![CDATA[Harness Engineering: The Moat Isn’t Code Anymore. It’s Control.]]></title><description><![CDATA[TL;DR: AI made code cheap. The new moat is the harness: constraints, feedback loops, repo legibility, and drift control. If the agent can&#8217;t observe it, measure it, or reproduce it &#8212; it can&#8217;t reliabl]]></description><link>https://www.techwithdarin.com/p/harness-engineering-the-moat-isnt</link><guid isPermaLink="false">https://www.techwithdarin.com/p/harness-engineering-the-moat-isnt</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Thu, 12 Feb 2026 02:24:19 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/a209a28e-6d43-4e8d-9faa-9301eb06003d_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3><strong>Humans steer. Agents execute.</strong></h3><p>OpenAI just documented an experiment that quietly rewrites what &#8220;software engineering&#8221; means in 2026:</p><p>They shipped an internal beta product with <strong>0 lines of manually-written code</strong> &#8212; product logic, tests, CI, docs, observability, internal tooling &#8212; all written by Codex, merged through a normal PR workflow.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>They estimate it took <strong>~1/10th the time</strong> it would&#8217;ve taken to write by hand.</p><p>They started from an empty repo (<strong>first commit: late August 2025</strong>) and ended up with <strong>~1M lines of code</strong> and <strong>~1,500 PRs</strong>, initially driven by <strong>three engineers</strong> &#8212; later <strong>seven</strong> &#8212; with throughput rising as they scaled.</p><p>If your reaction is &#8220;cool, but that&#8217;s OpenAI,&#8221; you&#8217;re looking at the wrong thing.</p><p>The story isn&#8217;t &#8220;AI wrote a lot of code.&#8221;</p><p>The story is what they had to build <strong>around the AI</strong> so that cheap code didn&#8217;t turn into expensive chaos.</p><p>That&#8217;s the shift:</p><p><strong>Software engineering is becoming harness engineering</strong> &#8212; designing environments, specifying intent, and building feedback loops that let agents do reliable work.</p><p>And the moat isn&#8217;t clever code.</p><p>It&#8217;s <strong>control</strong>.</p><div><hr></div><h2><strong>The new bottleneck (it&#8217;s not typing)</strong></h2><p>Once code generation is abundant, your limiting factor isn&#8217;t output.</p><p>It&#8217;s:</p><ul><li><p><strong>environment design</strong></p></li><li><p><strong>constraints</strong></p></li><li><p><strong>feedback loops</strong></p></li><li><p><strong>legibility</strong></p></li><li><p><strong>garbage collection for drift</strong></p></li></ul><p>OpenAI put it bluntly: early progress was slower not because Codex couldn&#8217;t code, but because the 
environment was underspecified &#8212; the agent lacked tools, abstractions, and internal structure. So the engineers&#8217; &#8220;job&#8221; became enabling the agent.</p><p><strong>Your value isn&#8217;t writing code. Your value is making code safe to generate.</strong></p><div><hr></div><h2><strong>Lesson 1: If the agent can&#8217;t observe it, it doesn&#8217;t exist</strong></h2><p>Agents don&#8217;t magically &#8220;understand&#8221; your system.</p><p>They <em>inspect</em> it.</p><p>So OpenAI made the app legible to the agent:</p><ul><li><p>bootable per git worktree so Codex could run an isolated instance per change</p></li><li><p>wired <strong>Chrome DevTools Protocol</strong> into the agent runtime, with skills for DOM snapshots, screenshots, and navigation &#8212; enabling the agent to reproduce UI bugs and validate fixes by actually driving the app</p></li><li><p>gave the agent a local, ephemeral observability stack per worktree; Codex could query logs via <strong>LogQL</strong> and metrics via <strong>PromQL</strong></p></li></ul><p>That&#8217;s how prompts like &#8220;startup under 800ms&#8221; stop being vibes and become testable acceptance criteria.</p><p><strong>If the agent can&#8217;t measure it, it can&#8217;t improve it.</strong></p><p><strong>If it can&#8217;t reproduce it, it can&#8217;t fix it.</strong></p><div><hr></div><h2><strong>Lesson 2: Stop writing one giant &#8220;AI manual.&#8221; Build a repo knowledge system.</strong></h2><p>They tried the classic play: one big AGENTS.md.</p><p>It failed in exactly the ways you&#8217;d expect:</p><ul><li><p>context is scarce, so a giant instruction file crowds out the task and the code</p></li><li><p>when everything is &#8220;important,&#8221; nothing is</p></li><li><p>it rots</p></li><li><p>it&#8217;s hard to mechanically verify freshness, coverage, ownership, or links</p></li></ul><p>So they flipped it:</p><p>AGENTS.md became a <strong>short map</strong> (~100 lines) and the repository&#8217;s 
actual knowledge base moved into a structured, versioned docs/ directory treated as the <strong>system of record</strong>.</p><p>That&#8217;s the underrated unlock.</p><p>Most teams are trying to prompt their way into agent productivity.</p><p>OpenAI treated repo knowledge like infrastructure.</p><p><strong>Docs aren&#8217;t documentation anymore. They&#8217;re runtime dependencies for agents.</strong></p><div><hr></div><h2><strong>Lesson 3: You don&#8217;t manage agents by lecturing them. You manage them by constraining the world.</strong></h2><p>Agents don&#8217;t just ship features.</p><p>They replicate patterns &#8212; at scale.</p><p>So if your architecture is squishy, an agent will amplify the squish into a full-on &#8220;smell event.&#8221;</p><p>OpenAI&#8217;s response: enforce invariants, not implementations.</p><p>They built a rigid model:</p><ul><li><p>domains divided into layers</p></li><li><p>dependency direction validated</p></li><li><p>only a limited set of permissible edges allowed</p></li><li><p>enforced mechanically via custom linters and structural tests</p></li></ul><p>This line is the whole philosophy:</p><p><strong>Don&#8217;t tell the agent to have good taste. 
Make bad taste impossible.</strong></p><p>And here&#8217;s the part most teams miss: they pushed &#8220;taste&#8221; into systems &#8212; review comments and bugs get captured as docs updates or promoted into code rules when docs aren&#8217;t enough.</p><div><hr></div><h2><strong>Lesson 4: Throughput breaks your merge philosophy</strong></h2><p>This is where agent-first engineering starts to feel alien.</p><p>As Codex throughput increased, OpenAI found many &#8220;best practices&#8221; became counterproductive.</p><p>They operate with:</p><ul><li><p>minimal blocking merge gates</p></li><li><p>short-lived PRs</p></li><li><p>flakes often handled with follow-up runs instead of blocking progress indefinitely</p></li></ul><p>Because in a world where agent throughput far exceeds human attention:</p><p><strong>corrections are cheap, and waiting is expensive.</strong></p><p>OpenAI even notes this would be irresponsible in a low-throughput environment &#8212; but under agent abundance, it can be the right trade.</p><p>The point isn&#8217;t &#8220;copy their process.&#8221;</p><p>The point is: <strong>if your workflow assumes scarcity, it collapses under abundance.</strong></p><div><hr></div><h2><strong>Lesson 5: AI drift is a memory leak &#8212; schedule the garbage collector</strong></h2><p>Full agent autonomy introduces a new class of problem: replication.</p><p>Codex will copy whatever patterns exist in the repo &#8212; including uneven ones &#8212; and that leads to drift.</p><p>OpenAI&#8217;s early approach was brutally relatable:</p><p>They spent <strong>every Friday (20% of the week)</strong> cleaning up &#8220;AI slop.&#8221;</p><p>It didn&#8217;t scale.</p><p>So they built garbage collection:</p><ul><li><p>encoded &#8220;golden principles&#8221; as mechanical rules in-repo</p></li><li><p>ran recurring background tasks that scan for deviations</p></li><li><p>opened targeted refactor PRs (many reviewable in under a minute and auto-mergeable)</p></li></ul><p>Their 
framing is perfect:</p><p>Technical debt is a high-interest loan. Pay it continuously or it compounds.</p><p><strong>Drift is the new technical debt. GC is the new hygiene.</strong></p><div><hr></div><h1><strong>What I&#8217;d do about it (Monday-morning playbook)</strong></h1><p>If you run engineering, platform, SRE, cloud, or security &#8212; here&#8217;s the practical version. You&#8217;re building <strong>control surfaces</strong>.</p><h2><strong>1) Build an &#8220;agent map&#8221;</strong></h2><ul><li><p>keep AGENTS.md short (TOC, not encyclopedia)</p></li><li><p>put the real truth into versioned in-repo docs (architecture, runbooks, standards)</p></li><li><p>add CI checks for broken links + doc freshness (make drift loud)</p></li></ul><p><strong>Goal:</strong> make the repo navigable for a machine.</p><h2><strong>2) Make your system machine-debuggable</strong></h2><ul><li><p>one-command boot per branch/worktree</p></li><li><p>deterministic dev environments</p></li><li><p>agent-accessible logs/metrics/traces (even if local + ephemeral)</p></li></ul><p><strong>Goal:</strong> turn &#8220;feels broken&#8221; into &#8220;fails a measurable invariant.&#8221;</p><h2><strong>3) Encode constraints, not vibes</strong></h2><ul><li><p>structural tests for dependency direction</p></li><li><p>linters that enforce invariants (style, layering, boundaries)</p></li><li><p>lint errors that <em>teach the fix</em> (because the agent reads them)</p></li></ul><p><strong>Goal:</strong> make correctness the path of least resistance.</p><h2><strong>4) Create an autonomy ladder (so humans spend time on judgment)</strong></h2><p>OpenAI lists what autonomy looks like when the harness is real: reproduce a bug, record evidence (even videos), implement a fix, validate by driving the app, open a PR, respond to feedback, remediate build failures, and escalate only when judgment is needed &#8212; then merge.</p><p>Start simple. Earn trust. 
Climb the ladder.</p><h2><strong>5) Treat cleanup like production ops</strong></h2><ul><li><p>define &#8220;golden principles&#8221; (mechanical, enforceable)</p></li><li><p>run scans on a cadence</p></li><li><p>ship small refactors continuously</p></li></ul><p><strong>Goal:</strong> pay entropy continuously so it never compounds.</p><div><hr></div><h2><strong>The punchline</strong></h2><p>OpenAI&#8217;s claim isn&#8217;t &#8220;AI replaces engineers.&#8221;</p><p>It&#8217;s more interesting &#8212; and more uncomfortable:</p><p>As code gets cheaper, engineering becomes the discipline of keeping code coherent.</p><p>Harness engineering is the new platform layer.</p><p>And the winners won&#8217;t be the teams with the best prompts.</p><p>They&#8217;ll be the teams who build the best scaffolding.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The Cloud Plumbing, Security, and Business Model Stories Behind the Agent Stack]]></title><description><![CDATA[Anthropic shipped a bigger brain. AWS shipped identity resilience and runtime controls. Microsoft shipped AI-native security. 
And the business model debate got loud at exactly the wrong time.]]></description><link>https://www.techwithdarin.com/p/the-cloud-plumbing-security-and-business</link><guid isPermaLink="false">https://www.techwithdarin.com/p/the-cloud-plumbing-security-and-business</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Wed, 11 Feb 2026 15:48:52 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/6438ad4e-fec2-434a-9f6b-48496eeeeb6a_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Last week I broke down the agent platform layer &#8212; OpenAI Frontier and Atlas, the Codex App Server protocol, Prism, Claude&#8217;s propose-verify-approve pattern, ServiceNow distribution, and Chrome going agentic with Gemini 3. (<a href="https://www.techwithdarin.com/p/agents-became-infrastructure-frontier">Read that post here if you missed it.</a>)</p><p>The throughline: we&#8217;re moving from models to agents to agent platforms, and the moat is context, runtime, and distribution.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>This week I want to go one layer down.</p><p>Because platforms don&#8217;t float. 
They sit on top of models, cloud infrastructure, security tooling, and business models &#8212; and all four of those shifted between February 2&#8211;11, 2026, in ways that matter for anyone building or buying agents in production.</p><p>Here&#8217;s what landed, why it matters, and what I&#8217;d actually do about it.</p><div><hr></div><h2>Claude Opus 4.6: what 1M tokens actually changes (and what it doesn&#8217;t)</h2><p>Anthropic shipped Claude Opus 4.6, its strongest &#8220;agentic&#8221; model yet, with a 1M token context window now available in beta. The headline number gets attention, but the real story is what this does &#8212; and doesn&#8217;t &#8212; change for how you design retrieval and reasoning in agent workloads.</p><h3>Where long context is a genuine unlock</h3><p>If your workload is &#8220;one large, bounded corpus,&#8221; you can now try loading the whole thing into context and reasoning over it directly. No chunking. No retrieval pipeline. No re-ranking. Just the model and the material.</p><p>This matters for specific use cases:</p><p><strong>Codebase reasoning.</strong> An entire repo in context means the model can trace dependencies, understand architectural decisions, and generate changes that are consistent with patterns it can actually see &#8212; not patterns it inferred from retrieval snippets.</p><p><strong>Incident investigation.</strong> Feeding a complete timeline &#8212; logs, alerts, runbook excerpts, Slack threads, post-mortems &#8212; into one context window lets the model correlate signals that would be fragmented across RAG chunks.</p><p><strong>Contract and regulatory analysis.</strong> Cross-referencing terms, definitions, obligations, and exceptions across a full document set without worrying about whether your retrieval pipeline surfaced the right clause.</p><h3>Where long context doesn&#8217;t replace RAG</h3><p>If your workload is &#8220;infinite enterprise sprawl&#8221; &#8212; thousands of documents across dozens of 
systems with different owners, permissions, freshness requirements, and classification levels &#8212; a 1M token window doesn&#8217;t solve your problem. You still need retrieval. You still need permissions. You still need a semantic layer that knows what&#8217;s current and what&#8217;s stale.</p><p>Context windows don&#8217;t solve governance. They don&#8217;t solve freshness. They don&#8217;t solve multi-tenancy. Don&#8217;t let the headline number distract from the architectural work that still needs to happen.</p><h3>The cross-cloud angle matters more than people think</h3><p>Opus 4.6 isn&#8217;t locked to one vendor. It&#8217;s available in Amazon Bedrock and Microsoft Foundry (Azure). If you&#8217;re evaluating agent-capable models and you need cross-cloud availability &#8212; because your infrastructure spans AWS and Azure, or because procurement won&#8217;t sign off on a single-vendor dependency &#8212; this simplifies the conversation. You can run the same model wherever your workloads already live.</p><h3>The practical experiment</h3><p>Take a real workload &#8212; a repo your team works in, a set of docs your team references daily &#8212; and run it through Opus 4.6 in &#8220;long-context-first&#8221; mode. Then compare the quality, latency, and cost against your existing RAG pipeline for the same queries. Let the data tell you where the tradeoff lands for your specific use case, instead of guessing.</p><div><hr></div><h2>AWS: the runtime and identity layers that make agents survivable</h2><p>While the platform-layer headlines went to Frontier and Atlas, AWS quietly shipped the kind of infrastructure changes that determine whether agents actually work in production. Two runtime updates and two foundational infrastructure releases.</p><h3>Bedrock server-side tool use</h3><p>Bedrock added server-side tool use and extended prompt caching in the Responses API. 
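</p><p>For contrast, here is roughly the client-side orchestration loop that server-side execution moves behind AWS&#8217;s perimeter. This is a deliberately simplified sketch with a stubbed model call; the function and tool names are illustrative, not Bedrock&#8217;s actual API:</p>

```python
# Minimal sketch of client-side tool orchestration. Your code sits in the
# loop, so security boundaries, audit trails, and cost controls are all
# your problem. call_model() is a stub standing in for a real model API.

def call_model(messages):
    # Stub "model": asks for a tool once, then answers from the tool result.
    if any(m["role"] == "tool" for m in messages):
        return {"type": "answer", "text": "3 open tickets"}
    return {"type": "tool_call", "tool": "count_open_tickets", "args": {}}

# The client owns tool execution: nothing stops a bad tool call here
# except code you wrote yourself.
TOOLS = {"count_open_tickets": lambda **kwargs: 3}

def run_agent(prompt):
    messages = [{"role": "user", "content": prompt}]
    while True:
        reply = call_model(messages)
        if reply["type"] == "answer":
            return reply["text"]
        # Parse the tool request, execute it, send the result back.
        result = TOOLS[reply["tool"]](**reply["args"])
        messages.append({"role": "tool", "content": str(result)})

answer = run_agent("How many tickets are open?")
```

<p>Every step in that loop is something you have to secure and log yourself; moving execution server-side hands those steps to the platform.</p><p>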
This is the &#8220;make it controllable&#8221; layer.</p><p>Previously, most agent tool-use implementations were client-side orchestration &#8212; your code called the model, parsed the tool request, executed it, and sent the result back. That works in a demo. It breaks in production when you need consistent security boundaries, audit trails, and cost controls.</p><p>Server-side tool use means the execution happens inside AWS&#8217;s security perimeter &#8212; IAM policies, VPC boundaries, CloudTrail logging &#8212; with the guardrails you&#8217;d expect. Extended prompt caching means repeated context (system prompts, shared documents, conversation history) doesn&#8217;t get re-processed on every call, which directly impacts cost and latency for multi-turn agent workflows.</p><p>If you&#8217;re building agents on Bedrock, this is the shift from &#8220;model capability&#8221; to &#8220;operational capability.&#8221; It&#8217;s what makes tool use shippable.</p><h3>IAM Identity Center multi-Region replication</h3><p>This one doesn&#8217;t sound exciting. It is quietly one of the most important AWS releases this quarter.</p><p>Identity is a Tier-0 dependency. Everything downstream &#8212; console access, CLI sessions, service roles, federated access, SSO &#8212; depends on IAM Identity Center being available. Until now, it was single-Region. If that Region had an issue, your identity plane was impaired.</p><p>AWS now lets you replicate IAM Identity Center from a primary Region to additional Regions, including identities, permission sets, and account assignments.</p><p>Why this matters beyond availability: <strong>data residency.</strong> Some regulatory frameworks require that identity and access data reside in specific geographies. 
Multi-Region replication gives you the ability to place identity data where your compliance requirements demand &#8212; without building a parallel identity system.</p><p>If you&#8217;ve ever sat in a BCDR review where someone said &#8220;nothing works if identity is down,&#8221; this is that conversation getting resolved.</p><p><strong>What to do now:</strong> Start planning your replication topology. Identify your KMS key strategy for the secondary Regions. Define failover access patterns. Test before you need it. This isn&#8217;t a &#8220;set and forget&#8221; feature &#8212; it requires deliberate design around Region selection, replication lag tolerance, and operational runbooks.</p><h3>Security group &#8220;Related resources&#8221; tab</h3><p>Small console enhancement, outsized impact for anyone managing a large AWS estate. The EC2/VPC console now shows which resources &#8212; ENIs, instances, load balancers, Lambda functions &#8212; are associated with a given security group.</p><p>Before this, deleting or modifying a security group was a &#8220;hope nothing breaks&#8221; exercise unless you had custom tooling to map dependencies. Now you can see the blast radius before you make the change.</p><p><strong>Integrate this into your change management workflow.</strong> Especially before deletions &#8212; require a check of the related resources tab as part of your change request documentation. It&#8217;s a small step that prevents expensive mistakes.</p><div><hr></div><h2>Microsoft: AI-native security is becoming a first-class agent concern</h2><p>Microsoft shipped two security-related updates this cycle that are worth reading together.</p><p><strong>AI-powered incident prioritization</strong> is now in public preview in Defender. It uses machine learning to help SOC analysts cut through alert noise and focus on the incidents most likely to be real and impactful. 
If your SOC is drowning in false positives &#8212; and statistically, it probably is &#8212; this is worth evaluating against your current triage metrics: mean time to acknowledge, false positive rate, analyst fatigue.</p><p><strong>Expanded Defender coverage for Foundry-hosted agents</strong> means Microsoft is extending its security tooling to cover agent workloads specifically. This is Microsoft positioning security as a first-class concern for agent deployments, not something you bolt on after the fact.</p><p>The timing is deliberate. As agent platforms ship (Frontier, Foundry, Bedrock), the attack surface expands. Agents that can execute code, query databases, and take actions on behalf of users are a fundamentally different security problem than a chatbot answering questions. Microsoft is building the security layer to match.</p><div><hr></div><h2>Anthropic&#8217;s 0-day research: a signal worth taking seriously</h2><p>Separately from the Opus 4.6 release, Anthropic is explicitly researching the risk of LLM-discovered 0-days &#8212; previously unknown vulnerabilities found by advanced models &#8212; and publishing findings about it.</p><p>This is a model builder acknowledging that &#8220;agentic capability&#8221; and &#8220;security capability&#8221; are two sides of the same coin.</p><p>Here&#8217;s my honest take: I&#8217;m not sure most organizations are ready for the speed at which capable agents can become unintentional security researchers. An agent tasked with &#8220;find a way to make this API call work&#8221; could, in theory, discover and exploit a vulnerability in the process. The security posture around agent workloads needs to assume mistakes will happen and build containment accordingly &#8212; not just for malicious actors, but for well-intentioned automation that wanders into dangerous territory.</p><p>This is why the governance layer I talked about last week matters so much. 
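</p><p>To make that concrete, here is a toy sketch of scoped, revocable, audited tool access for an agent. All names are hypothetical; this illustrates the pattern, not any vendor&#8217;s API:</p>

```python
# Toy model of agent governance: an explicit allow-list, an audit trail,
# and revocation that does not disturb any other agent's access.
# Illustrative only; class and tool names are hypothetical.
from datetime import datetime, timezone

class AgentGrant:
    def __init__(self, agent_id, allowed_tools):
        self.agent_id = agent_id
        self.allowed_tools = set(allowed_tools)  # scoped permissions
        self.audit_log = []                      # audit trail
        self.revoked = False

    def call(self, tool, run):
        allowed = not self.revoked and tool in self.allowed_tools
        # Every attempt is logged, allowed or not.
        self.audit_log.append(
            (datetime.now(timezone.utc).isoformat(), tool, allowed))
        if not allowed:
            raise PermissionError(f"{self.agent_id} may not call {tool}")
        return run()

    def revoke(self):
        # Kills this grant only; other agents' workflows keep working.
        self.revoked = True

grant = AgentGrant("billing-summarizer", {"read_invoices"})
total = grant.call("read_invoices", lambda: 42)  # allowed, and audited
grant.revoke()
denied = False
try:
    grant.call("read_invoices", lambda: 42)      # refused after revocation
except PermissionError:
    denied = True
```

<p>The containment lives outside the model: even a badly prompted agent cannot call a tool the grant never included.</p><p>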
Agent identity, scoped permissions, audit trails, and the ability to revoke access without breaking other workflows &#8212; these aren&#8217;t nice-to-haves. They&#8217;re the difference between an agent that helps and an agent that becomes a liability.</p><div><hr></div><h2>Google GEAR: investing in the developer enablement layer</h2><p>Google Cloud launched GEAR &#8212; a structured skills path for building and deploying agents using Google&#8217;s Agent Development Kit (ADK). It includes labs, credits, and a progression path, housed inside the Google Developer Program.</p><p>While OpenAI and Anthropic are leading on the platform and model layers, Google is investing in developer enablement &#8212; making it easier to get started building agents on its stack. Different strategy, complementary signal. The market is moving fast enough that developer adoption velocity matters as much as raw platform capability.</p><p>If you have teams evaluating Google&#8217;s agent tooling, GEAR is worth pointing them toward as a structured on-ramp.</p><div><hr></div><h2>Business model whiplash: ads vs. subscriptions (and why enterprise should pay attention)</h2><p>Two announcements landed almost back-to-back, and the contrast is stark.</p><p><strong>OpenAI started testing ads in ChatGPT</strong> for logged-in adult users on the Free and Go tiers in the US. The Plus, Pro, Business, Enterprise, and Education tiers are not affected. The ads are described as &#8220;relevant ads within conversations&#8221; &#8212; the implementation details and data-handling specifics are still emerging.</p><p><strong>Anthropic used Super Bowl visibility to position Claude as explicitly ad-free</strong> &#8212; framing the absence of advertising as a trust and alignment feature, not just a business model choice.</p><p>For enterprise buyers, this isn&#8217;t about moral judgment. 
It&#8217;s about trust boundaries and the questions your security, compliance, and procurement teams are going to ask:</p><p><strong>Data handling.</strong> What user data flows into the ad-targeting pipeline? Even if your org is on an enterprise tier, does the existence of an ad-supported tier change how the underlying model is trained or tuned?</p><p><strong>Response integrity.</strong> How do you prove that responses in the paid tiers are completely uninfluenced by commercial relationships in the ad-supported tiers?</p><p><strong>Vendor risk narrative.</strong> When your CISO asks &#8220;is our AI vendor also an advertising company?&#8221; &#8212; what&#8217;s your answer, and does it change your risk posture?</p><p>My read: this is going to matter most in regulated industries &#8212; healthcare, financial services, government &#8212; where the <em>perception</em> of data mixing or commercial influence can be as damaging as the reality. Expect this to become a procurement checklist item within the next two quarters.</p><div><hr></div><h2>Databricks: follow the money (agents will eat your data platform bill)</h2><p>Databricks raising ~$5B at a ~$134B valuation, with AI products crossing ~$1.4B in annualized revenue, is a signal worth reading carefully.</p><p>The thesis is straightforward: agents don&#8217;t just &#8220;think.&#8221; They query, join, filter, write, summarize, re-query, materialize, and do it again &#8212; often in loops. Every agentic workflow that touches structured data is a workload on your data platform. Every multi-step reasoning chain that needs fresh data is a set of warehouse queries. Every agent that &#8220;monitors&#8221; something is a recurring compute job.</p><p>Databricks is calling itself an &#8220;AI beneficiary&#8221; because it&#8217;s sitting on the metered layer where agent work becomes billable compute. 
That&#8217;s not speculation &#8212; it&#8217;s already showing up in their revenue numbers.</p><p><strong>The FinOps implication is real.</strong> If your organization is deploying agents that interact with data platforms &#8212; Databricks, Snowflake, BigQuery, Redshift &#8212; you need usage budgets and alerts in place <em>before</em> the agents are in production. &#8220;Helpful automation&#8221; has a way of becoming a surprise bill when nobody set a ceiling on how many queries an agent could run per hour.</p><p>Set the budgets. Set the alerts. Have the conversation with your data platform team about what &#8220;agent-driven usage&#8221; looks like in their billing model. Do it this week, not after the first invoice lands.</p><div><hr></div><h2>What I&#8217;d do this week &#8212; a practical 10-step plan</h2><p>Pick the items that match where you are:</p><p><strong>1. Run a &#8220;long-context first&#8221; experiment with Opus 4.6.</strong> Take a real repo or document set your team actually uses. Load it into a 1M token context. Run the same queries you&#8217;d run against your RAG pipeline. Compare quality, latency, and cost. Let the data decide.</p><p><strong>2. Evaluate Bedrock server-side tool use for your agent workloads.</strong> If you&#8217;re building agents on AWS, test server-side tool execution against your current client-side orchestration. Measure the difference in security posture, auditability, and operational complexity.</p><p><strong>3. Plan IAM Identity Center multi-Region replication.</strong> Identify target Regions. Plan your KMS key strategy. Define failover access patterns. Test before you need it.</p><p><strong>4. Use the security group dependency view in your change workflow.</strong> Make &#8220;check related resources&#8221; a standard step before security group modifications or deletions. Small habit, big risk reduction.</p><p><strong>5. 
SOC teams: evaluate Defender&#8217;s AI prioritization preview.</strong> Run it in parallel with your current triage process. Measure against MTTA, false positive rate, and analyst workload. See if it actually reduces noise or just moves it around.</p><p><strong>6. Review your agent security posture against the 0-day research signal.</strong> Ask: if an agent tasked with a legitimate workflow accidentally discovered a vulnerability, would your containment model catch it? If the answer is &#8220;I don&#8217;t know,&#8221; that&#8217;s the priority.</p><p><strong>7. Set agent-aware FinOps budgets.</strong> Assume agent-driven warehouse and API usage is coming. Set budgets, alerts, and per-agent usage caps before the workloads are live.</p><p><strong>8. Point your teams at Google GEAR if they&#8217;re evaluating Google&#8217;s agent stack.</strong> Structured learning paths beat ad-hoc exploration for teams that need to ramp quickly.</p><p><strong>9. Start your &#8220;Trust FAQ&#8221; document.</strong> Ads, data handling, model choices, logging, retention, response integrity. Get ahead of the questions your security and compliance teams are going to ask &#8212; because they <em>are</em> going to ask.</p><p><strong>10. Re-read last week&#8217;s post and map your agent boundary diagram.</strong> Data sources &#8594; tools &#8594; actions &#8594; approval points &#8594; rollback mechanisms. If you can&#8217;t draw it on a whiteboard, you&#8217;re not ready to ship it. (<a href="https://www.techwithdarin.com/p/agents-became-infrastructure-frontier">Here&#8217;s the post.</a>)</p><div><hr></div><p><em>Last week, the agent stack became a product. This week, the foundation underneath it got stronger &#8212; better models, more resilient identity, controlled runtimes, AI-native security, and a business model debate that enterprise can&#8217;t afford to ignore. 
The organizations that move deliberately on both layers &#8212; platform and foundation &#8212; are the ones that will actually ship agents that last.</em></p><p><em>See you next week.</em></p><p>&#8212; Darin</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Agents became infrastructure: Frontier + Atlas, Prism, Codex Harness, Claude, Gemini 3]]></title><description><![CDATA[For a while, AI updates felt like magic tricks.]]></description><link>https://www.techwithdarin.com/p/agents-became-infrastructure-frontier</link><guid isPermaLink="false">https://www.techwithdarin.com/p/agents-became-infrastructure-frontier</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Thu, 05 Feb 2026 16:10:07 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/4b97b8f3-42eb-4cee-9232-51b18ad42bc5_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>For a while, AI updates felt like magic tricks.</p><p>A new model. A new benchmark. 
A new demo.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>This week felt different.</p><p>This week was about <strong>shipping the boring parts</strong> that make agents real:</p><ul><li><p>Shared context</p></li><li><p>Runtime protocols</p></li><li><p>Approvals and guardrails</p></li><li><p>Distribution inside the tools people already use</p></li></ul><p>In other words, the agent stack is hardening into infrastructure.</p><p>You can see it across the big releases from OpenAI, Anthropic, and Google over the last couple of weeks:</p><ul><li><p>OpenAI launched <strong>Frontier</strong> (Feb 5, 2026), a platform to build, deploy, and manage AI coworkers across the enterprise.</p></li><li><p>OpenAI&#8217;s <strong>Atlas</strong> (Oct 21, 2025) is already the &#8220;work happens here&#8221; browser surface that Frontier can plug into.</p></li><li><p>OpenAI opened up the <strong>Codex harness</strong> details, centered on a bidirectional App Server protocol.</p></li><li><p>OpenAI shipped <strong>Prism</strong> (Jan 27, 2026), a free AI-native workspace for scientific writing and collaboration powered by GPT-5.2.</p></li><li><p>Anthropic showed Claude planning a Mars rover route with the &#8220;propose &#8594; 
verify &#8594; approve&#8221; shape that enterprises desperately need.</p></li><li><p>Anthropic + ServiceNow is pushing Claude into enterprise workflows at massive scale.</p></li><li><p>Google&#8217;s January AI recap makes the strategy loud: <strong>personal intelligence</strong> plus <strong>Chrome auto browse</strong> powered by Gemini 3.</p></li></ul><p>There&#8217;s a throughline here:</p><p>We are moving from <strong>models</strong> to <strong>agents</strong> to <strong>agent platforms</strong>.</p><p>And the moat is not &#8220;who is smartest.&#8221;</p><p>The moat is:</p><ol><li><p><strong>Context</strong></p></li><li><p><strong>Runtime</strong></p></li><li><p><strong>Distribution</strong></p></li></ol><p>Let&#8217;s break down what shipped, and why it matters for builders, IT leaders, and anyone selling into enterprise.</p><div><hr></div><h2><strong>The big shift: intelligence is cheap, context is expensive</strong></h2><p>Most teams are not blocked by &#8220;can the model do it.&#8221;</p><p>They are blocked by:</p><ul><li><p>Data scattered across systems</p></li><li><p>Permissions that do not map cleanly to &#8220;an agent&#8221;</p></li><li><p>Integrations that become one-off projects</p></li><li><p>No quality loop, so pilots never become dependable systems</p></li></ul><p>Frontier names this directly: the thing slowing enterprises down is not model intelligence, it&#8217;s <strong>how agents are built and run</strong> inside real organizations.</p><p>So the battleground moved.</p><p>Not &#8220;better answers.&#8221;</p><p>More &#8220;reliable work in production.&#8221;</p><div><hr></div><h2><strong>OpenAI Frontier: the control plane for AI coworkers</strong></h2><p>Frontier is OpenAI&#8217;s answer to the enterprise reality: multi-cloud, messy systems, governance everywhere, and agents that need to operate inside that mess without breaking things.</p><p>Frontier&#8217;s core idea is simple:</p><p>AI coworkers need the same fundamentals humans need at a 
company:</p><ul><li><p>Onboarding and institutional knowledge</p></li><li><p>Access to the right systems</p></li><li><p>Clear boundaries and permissions</p></li><li><p>Learning via feedback so performance improves over time</p></li></ul><h3><strong>1) The &#8220;semantic layer&#8221; framing is the tell</strong></h3><p>Frontier connects siloed warehouses, CRMs, ticketing tools, and internal apps to create shared business context, explicitly calling this a <strong>semantic layer for the enterprise</strong> that all AI coworkers can reference.</p><p>That is the game.</p><p>The agent that understands your internal language and where truth lives will beat the agent that doesn&#8217;t, even if the second agent has a slightly stronger model.</p><h3><strong>2) Open standards, no forced replatform</strong></h3><p>Frontier says it works with the systems you already have, across multiple clouds, using open standards, with no requirement to abandon existing agents or apps.</p><p>This is a direct shot at the &#8220;rip and replace&#8221; fear that kills AI adoption inside large enterprises.</p><h3><strong>3) Execution environment, not just chat</strong></h3><p>Frontier is built around agents completing complex tasks &#8220;like working with files, running code, and using tools&#8221; inside a &#8220;dependable&#8221; execution environment, and building memories as they operate.</p><p>This is not a prompt library.</p><p>This is an operating layer.</p><h3><strong>4) Evaluation and optimization built in</strong></h3><p>Frontier emphasizes built-in evaluation and optimization so good behaviors improve on real work over time.</p><p>That&#8217;s the difference between a clever demo and a system you can trust.</p><h3><strong>5) The human layer: Forward Deployed Engineers</strong></h3><p>Frontier also comes with OpenAI FDEs embedded with customer teams to help get agents into production and feed deployment learnings back into research.</p><p>That&#8217;s OpenAI saying: enterprise is 
not only software. It is execution.</p><div><hr></div><h2><strong>Atlas: the distribution surface Frontier has been pointing at</strong></h2><p>In my earlier framing, I treated Atlas like a generic workflow mention.</p><p>That was wrong.</p><p>Atlas is a real product, and it matters here.</p><p>OpenAI introduced <strong>ChatGPT Atlas on October 21, 2025</strong> as &#8220;the browser with ChatGPT built in.&#8221;</p><p>Here&#8217;s why Atlas is strategically important to the Frontier story:</p><h3><strong>1) Atlas is &#8220;ChatGPT comes with you&#8221;</strong></h3><p>Atlas lets ChatGPT work &#8220;anywhere across the web&#8221; inside the window you&#8217;re already using, without copy/paste or leaving the page.</p><p>That is distribution.</p><p>Not &#8220;come to my AI app.&#8221;</p><p>More &#8220;AI meets you where work already happens.&#8221;</p><h3><strong>2) Memory becomes ambient context</strong></h3><p>Atlas ships with ChatGPT memory built in, and adds &#8220;browser memories&#8221; that can remember context from sites you visit. Those browser memories are optional and user-controlled (view, archive, delete).</p><p>This is the missing bridge between:</p><ul><li><p>web activity</p></li><li><p>and agent usefulness</p></li></ul><h3><strong>3) Agent mode becomes native to browsing</strong></h3><p>Atlas includes agent mode designed to act while you browse, and OpenAI says it&#8217;s faster and more useful when working with browsing context. Agent mode in Atlas launched in preview for Plus, Pro, and Business users.</p><p>This is not theory. 
It is a product surface built for agentic work.</p><h3><strong>4) Controls and safety constraints are explicit</strong></h3><p>Atlas includes visibility toggles per site and restrictions like &#8220;cannot run code in the browser, download files, or install extensions,&#8221; and it pauses on sensitive sites.</p><p>Again, boring parts.</p><p>Also the parts enterprises demand.</p><h3><strong>5) Frontier explicitly plugs into Atlas</strong></h3><p>Frontier says AI coworkers can be accessed through any interface, including &#8220;workflows with Atlas.&#8221;</p><p>So think of it like this:</p><ul><li><p><strong>Frontier</strong>: enterprise control plane (context + permissions + eval + execution)</p></li><li><p><strong>Atlas</strong>: high-distribution client surface where a lot of work actually happens</p></li></ul><p>That pairing is not accidental.</p><div><hr></div><h2><strong>Codex harness: protocols over prompts</strong></h2><p>The Codex harness post is one of the most important &#8220;builder&#8221; updates in this whole set.</p><p>Because it is OpenAI showing the architecture that makes agents portable across surfaces.</p><p>The key piece is the <strong>Codex App Server</strong>.</p><p>OpenAI describes the App Server as both:</p><ul><li><p>the <strong>JSON-RPC protocol</strong> between client and server</p></li><li><p>and a <strong>long-lived process</strong> that hosts Codex core threads</p></li></ul><p>And the design choices are exactly what enterprises need:</p><h3><strong>A single request can produce many event updates</strong></h3><p>One client request can result in many event updates, transformed into stable UI-ready notifications so you can build rich interfaces.</p><h3><strong>Fully bidirectional, approval-native</strong></h3><p>The protocol is fully bidirectional. 
The server can initiate requests when the agent needs input, &#8220;like an approval,&#8221; and pause until the client responds.</p><p>That pattern is the difference between:</p><ul><li><p>&#8220;agent takes actions&#8221;</p></li><li><p>and &#8220;agent takes actions safely&#8221;</p></li></ul><h3><strong>A portable transport layer</strong></h3><p>OpenAI says the transport is JSON-RPC over stdio (JSONL), making it straightforward to build bindings in many languages.</p><p>If you build internal developer platforms, this should ping your radar.</p><p>The future agent ecosystem will not be &#8220;one UI.&#8221;</p><p>It will be:</p><ul><li><p>IDE</p></li><li><p>terminal</p></li><li><p>browser</p></li><li><p>desktop app</p></li><li><p>internal portals</p></li></ul><p>Protocols win.</p><div><hr></div><h2><strong>Prism: the document becomes the workspace</strong></h2><p>Prism is framed as science, but the pattern is bigger.</p><p>OpenAI introduced <strong>Prism on January 27, 2026</strong> as a free AI-native workspace for scientists to write and collaborate on research, powered by GPT-5.2, with unlimited projects and collaborators, available to anyone with a ChatGPT personal account.</p><p>Prism is:</p><ul><li><p>cloud-based</p></li><li><p>LaTeX-native</p></li><li><p>built for real-time collaboration without local installs</p></li></ul><p>The important thing is not &#8220;LaTeX.&#8221;</p><p>It is this:</p><p>AI is moving from a side chat into the place where the work actually lives.</p><p>Expect the same pattern to eat enterprise docs:</p><ul><li><p>architecture docs</p></li><li><p>incident reviews</p></li><li><p>security exception narratives</p></li><li><p>audits</p></li><li><p>proposals</p></li><li><p>runbooks</p></li></ul><p>Once the workspace is AI-native, &#8220;asking&#8221; becomes &#8220;editing in place.&#8221;</p><p>That is a workflow shift, not a feature.</p><div><hr></div><h2><strong>Anthropic: the enterprise-safe shape is propose &#8594; verify &#8594; 
approve</strong></h2><h3><strong>Claude on Mars is a process demo, not a space demo</strong></h3><p>Anthropic describes Claude using vision to plan a Mars rover &#8220;breadcrumb trail.&#8221; The waypoints were then run through a simulation with &#8220;over 500,000 variables,&#8221; engineers reviewed the plan, only minor changes were needed, and the route held up. Engineers estimate this approach can cut route planning time in half.</p><p>Steal the shape:</p><ul><li><p>AI proposes</p></li><li><p>systems validate</p></li><li><p>humans approve</p></li><li><p>execution happens</p></li></ul><p>That is exactly how we should deploy agents into:</p><ul><li><p>infrastructure changes</p></li><li><p>incident response</p></li><li><p>compliance workflows</p></li><li><p>ticket automation</p></li></ul><h3><strong>ServiceNow + Claude is distribution inside enterprise muscle memory</strong></h3><p>ServiceNow is targeting a 50% reduction in time-to-implement for customers using Claude, and early testing showed up to 95% reduction in seller prep time via a Claude-powered coaching tool.</p><p>They also state Claude is the default model for Build Agent and a preferred model across the ServiceNow AI Platform.</p><p>If you run IT, ServiceNow is not just where tickets live.</p><p>It is becoming where agents live.</p><div><hr></div><h2><strong>Google: the browser becomes an agent</strong></h2><p>Google&#8217;s January recap highlights:</p><ul><li><p>Gemini app connecting to Google apps for personalized help (opt-in, beta)</p></li><li><p>Chrome features built on Gemini 3 including &#8220;auto browse&#8221; for multi-step chores like booking travel or scheduling appointments</p></li><li><p>Gemini 3 as the default model for AI Overviews globally</p></li></ul><p>This matters because it signals where the next &#8220;default agent surface&#8221; sits for most humans:</p><p>The browser.</p><p>And now we have two major &#8220;browser is agentic&#8221; moves at once:</p><ul><li><p>OpenAI Atlas as a ChatGPT-native 
browser</p></li><li><p>Google Chrome adding Gemini 3 auto-browse behavior</p></li></ul><p>Different approaches.</p><p>Same destination.</p><div><hr></div><h2><strong>What to do this week if you lead IT, security, or a platform team</strong></h2><p>Here&#8217;s the practical playbook I&#8217;d run right now.</p><h3><strong>1) Pick two workflows with obvious ROI</strong></h3><p>Do not start with &#8220;company-wide AI.&#8221;</p><p>Start with two that pay back fast:</p><ul><li><p>Incident triage and root cause acceleration</p></li><li><p>Change impact analysis with approvals</p></li></ul><p>Frontier&#8217;s own example describes collapsing root-cause identification from hours to minutes by pulling together logs, docs, workflows, and code.</p><h3><strong>2) Build a context map, not an agent</strong></h3><p>List your truth sources:</p><ul><li><p>CMDB</p></li><li><p>runbooks</p></li><li><p>incident history</p></li><li><p>logs, metrics, traces</p></li><li><p>change calendars</p></li><li><p>identity and permissions</p></li><li><p>repos and pipelines</p></li></ul><p>Then decide what the agent can read, what it can write, and where it must ask.</p><h3><strong>3) Standardize the &#8220;approval handshake&#8221;</strong></h3><p>Codex App Server bakes this pattern in: the agent requests approval and pauses until the client responds.</p><p>Make that your enterprise standard:</p><ul><li><p>read-only by default</p></li><li><p>propose diffs</p></li><li><p>require approvals for write actions</p></li><li><p>log every decision</p></li></ul><h3><strong>4) Choose your distribution surfaces early</strong></h3><p>Where should the agent live?</p><ul><li><p>ITSM (ServiceNow)</p></li><li><p>browser (Atlas, Chrome)</p></li><li><p>IDE and CLI (Codex harness)</p></li><li><p>doc workspaces (Prism-style)</p></li></ul><p>If you force a new UI, you lose.</p><h3><strong>5) Measure quality on real work</strong></h3><p>If you cannot measure outcomes, you cannot scale trust.</p><p>Frontier&#8217;s 
evaluation and optimization emphasis is the right direction.</p><div><hr></div><h2><strong>My take</strong></h2><p>This is the week agents stopped being a feature and started becoming infrastructure.</p><ul><li><p><strong>Frontier</strong> is the enterprise control plane.</p></li><li><p><strong>Atlas</strong> is a high-distribution workflow surface where context and action can meet.</p></li><li><p><strong>Codex App Server</strong> is the runtime protocol blueprint: event streams, portability, approvals.</p></li><li><p><strong>Prism</strong> shows the doc-workflow future: AI inside the workspace, not beside it.</p></li><li><p><strong>Claude on Mars</strong> proves the safe enterprise shape: propose, verify, approve.</p></li><li><p><strong>ServiceNow + Claude</strong> and <strong>Chrome + Gemini 3</strong> show the distribution war is already underway.</p></li></ul><p>The question is not whether AI will change how work gets done.</p><p>The question is whether you are building:</p><ul><li><p>shared context</p></li><li><p>safe runtimes</p></li><li><p>high-trust approvals</p></li><li><p>and distribution inside real workflows</p></li></ul><p>Because that is where the advantage will live.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Vibe coding works best when the user is you]]></title><description><![CDATA[Thailand changed my life. Vibe coding changed how I build. I created my own Thai-learning accelerator to practice daily with less friction.]]></description><link>https://www.techwithdarin.com/p/vibe-coding-works-best-when-the-user</link><guid isPermaLink="false">https://www.techwithdarin.com/p/vibe-coding-works-best-when-the-user</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sun, 01 Feb 2026 18:47:05 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/92097b4f-40b9-4717-9208-a0ca2b7beed9_1536x1024.png" length="0" type="image/png"/><content:encoded><![CDATA[<p>I fell in love with Thailand a little over two years ago.</p><p>It started on an adventure sail across the Andaman Sea with 13 strangers from around the world. No shared background. No shared routine. Just long days on the water, stories at night, and that rare feeling that your life just got wider.</p><p>A few months later, I went back and met an unexpected love. Her name is Arisara.</p><p>Now we&#8217;re working toward building our future together in rural northeast Thailand.</p><p>And that&#8217;s where &#8220;learning Thai&#8221; stopped being a nice-to-have.</p><p>It became personal.</p><p>Her English is improving. My Thai has been harder. Not because I don&#8217;t care. Because consistency is brutal when the practice loop is clunky.</p><p>So I built my own accelerator.</p><p>Not as a startup.</p><p>As a tool I needed.</p><h2><strong>The best reason to vibe code</strong></h2><p>Vibe coding is the fastest way I&#8217;ve ever seen to go from idea to working software.</p><p>But the real unlock is not the AI.</p><p>It&#8217;s the feedback loop.</p><p>The best vibe-coding projects start when:</p><ul><li><p>you have a real problem</p></li><li><p>you feel the friction daily</p></li><li><p>you can tell, instantly, what&#8217;s working and what&#8217;s not</p></li></ul><p>When you build for yourself, you don&#8217;t need motivation hacks. You don&#8217;t need fake deadlines. You just want the problem gone.</p><p>That&#8217;s why personal tools are the perfect playground for vibe coding.</p><h2><strong>What I built</strong></h2><p>I built Sawasdee Speak:</p><p>https://sawasdeespeak.com/</p><p>The goal is simple: make Thai practice easier than procrastination.</p><p>Short sessions. Real repetition. Focus on phrases I actually use. 
Minimal friction to start.</p><p>Not &#8220;the best Thai app.&#8221;</p><p>The best Thai app for my life.</p><h2><strong>How I built it</strong></h2><p>I used OpenAI (ChatGPT) to turn the messy idea into a one-page spec and a tight MVP loop.</p><p>Then I shipped it using a stack optimized for speed and deployment, not ideology:</p><ul><li><p>Concept development and requirements in ChatGPT</p></li><li><p>Built on Google AI Studio</p></li><li><p>Running on Google Cloud Run</p></li><li><p>Domain hosted on Amazon Web Services</p></li></ul><p>This mix matters.</p><p>Vibe coding rewards momentum. The stack is whatever keeps you moving.</p><h2><strong>What vibe coding looked like in practice</strong></h2><p>This is the workflow that worked for me, and it&#8217;s repeatable.</p><h3><strong>1) Describe outcomes, not implementation</strong></h3><p>I did not start with &#8220;build a language learning app.&#8221;</p><p>I started with:</p><ul><li><p>I want a daily Thai practice loop</p></li><li><p>I want short reps</p></li><li><p>I want the phrases I care about</p></li><li><p>I want low friction: open &#8594; practice &#8594; done</p></li></ul><h3><strong>2) Get a one-page MVP spec</strong></h3><p>I asked for:</p><ul><li><p>the smallest possible feature set</p></li><li><p>the screens and states</p></li><li><p>what data needs to exist</p></li><li><p>what is explicitly out of scope</p></li></ul><p>That &#8220;out of scope&#8221; list is the guardrail that keeps vibe coding from turning into a bloated mess.</p><h3><strong>3) Build the thinnest version that works</strong></h3><p>The temptation is to accept every shiny idea.</p><p>I resisted.</p><p>If it didn&#8217;t improve the practice loop, it didn&#8217;t ship.</p><h3><strong>4) Deploy early so the feedback is real</strong></h3><p>Local builds lie.</p><p>Production tells the truth.</p><p>Once it was live, the app stopped being a project and became a habit. 
That&#8217;s the entire point.</p><h2><strong>Why building for yourself is the cheat code</strong></h2><p>When you are the user, you automatically have:</p><ul><li><p>real requirements (you feel what&#8217;s missing)</p></li><li><p>instant QA (you notice what&#8217;s annoying)</p></li><li><p>relentless prioritization (only what matters survives)</p></li></ul><p>Most projects fail because the feedback loop is imaginary.</p><p>Personal tools don&#8217;t have that problem.</p><h2><strong>Guardrails so it doesn&#8217;t become future pain</strong></h2><p>Personal apps have a funny habit of becoming real.</p><p>A few rules I follow so &#8220;quick&#8221; doesn&#8217;t turn into &#8220;fragile&#8221;:</p><ul><li><p>keep secrets out of the frontend</p></li><li><p>add basic logging early</p></li><li><p>assume inputs are hostile if the app is public</p></li><li><p>rate limit anything exposed to the internet</p></li><li><p>write down what you will not build (and honor it)</p></li></ul><p>Speed is great. 
Maintainable speed is better.</p><h2><strong>If you want to try vibe coding this week</strong></h2><p>Don&#8217;t start with &#8220;a SaaS.&#8221;</p><p>Start with one recurring irritation in your life or work.</p><p>Do a single-sitting sprint:</p><ol><li><p>Write the problem in 3 bullets</p></li><li><p>Define the loop (open &#8594; do the thing &#8594; done)</p></li><li><p>Ask an LLM for a one-page MVP spec</p></li><li><p>Build only that</p></li><li><p>Deploy it somewhere you will actually use</p></li></ol><p>That&#8217;s it.</p><p>You are not proving you can build a company.</p><p>You are proving you can ship something useful.</p><h2><strong>What I&#8217;m improving next</strong></h2><p>Now that I&#8217;m using Sawasdee Speak, the next steps are obvious because I can feel the friction:</p><ul><li><p>faster &#8220;start session&#8221; flow</p></li><li><p>smarter repetition (missed items return more often)</p></li><li><p>pronunciation support</p></li><li><p>lightweight streak or reminder loop</p></li></ul><p>Not because a roadmap said so.</p><p>Because I want the practice loop to be automatic.</p><h2><strong>Closing</strong></h2><p>I didn&#8217;t build Sawasdee Speak to create another project.</p><p>I built it because I&#8217;m building a life in Thailand, and language is part of that.</p><p>Vibe coding is powerful, but the real advantage is purpose.</p><p>Build something you need.</p><p>Use it tomorrow.</p><p>Improve it next week.</p><p>That&#8217;s the cheat code.</p>]]></content:encoded></item><item><title><![CDATA[Claude is weirdly better at Excel than Copilot]]></title><description><![CDATA[I keep running into a reality that feels both funny and kind of sad: Claude is currently better at creating and editing Excel workbooks than Microsoft Copilot is.]]></description><link>https://www.techwithdarin.com/p/claude-is-weirdly-better-at-excel</link><guid isPermaLink="false">https://www.techwithdarin.com/p/claude-is-weirdly-better-at-excel</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Thu, 22 Jan 2026 00:20:11 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/d8e5c9ec-b72d-40a5-bdc4-e3209e85937c_1200x630.png" length="0" type="image/png"/><content:encoded><![CDATA[<p>I keep running into a reality that feels both funny and kind of sad:</p><p><strong>Claude is currently better at creating and editing Excel workbooks than Microsoft Copilot is.</strong></p><p>And yeah&#8230; you&#8217;d think the company that <em>owns</em> Office would have this locked down.</p><h3><strong>What Claude does (that feels like cheating)</strong></h3><p>Claude can straight-up <strong>generate an actual .xlsx</strong>&#8212;with real sheets, formulas, formatting, tables, the whole thing&#8212;and hand it back to you as a file you can open in Excel. It can also take an existing workbook and <strong>apply edits</strong> and return the updated file.</p><p>That changes the game because &#8220;making a spreadsheet&#8221; isn&#8217;t just about answering questions. 
It&#8217;s about producing the artifact:</p><ul><li><p>New workbook from scratch</p></li><li><p>New tabs + consistent structure</p></li><li><p>Formulas that actually work</p></li><li><p>Clean formatting</p></li><li><p>Repeatable templates</p></li></ul><p>Claude&#8217;s file-creation workflow is designed for exactly that.</p><h3><strong>What Copilot in Excel does (and why it feels constrained)</strong></h3><p>Copilot in Excel absolutely has useful capabilities&#8212;helping create/understand formulas, summarize data, and analyze what&#8217;s in front of you.</p><p>But the experience often hits &#8220;product rails&#8221; fast:</p><ul><li><p><strong>Your data needs to be structured</strong> as an <strong>Excel table</strong> or a very specific &#8220;supported range&#8221; format (unique headers, no merged cells, no blank headers, etc.).</p></li><li><p>It generally expects your file to be in <strong>OneDrive/SharePoint with AutoSave on</strong>.</p></li><li><p>Microsoft is actively shifting how this works: they note <strong>&#8220;App Skills&#8221; in Excel are being removed by late February 2026</strong>, pointing people toward &#8220;Agent Mode,&#8221; &#8220;Copilot Chat,&#8221; or &#8220;Analyst.&#8221; That kind of transition usually means uneven experiences while the plane is being rebuilt mid-flight.</p></li></ul><p>And then there&#8217;s the new <strong>=COPILOT() function</strong> concept (cool idea): AI inside the grid as a formula. 
But Microsoft&#8217;s own docs emphasize constraints that matter in real spreadsheet work:</p><ul><li><p>It only sees the <strong>prompt + the ranges you pass in</strong> (not the whole workbook, not other files, not enterprise info, not the internet).</p></li><li><p>It&#8217;s <strong>non-deterministic</strong> (can change results on recalculation), and they explicitly warn against using AI outputs for <strong>financial reporting / legal / other high-stakes scenarios</strong>.</p></li><li><p>There are <strong>usage limits</strong> (100 calls / 10 minutes, 300 / hour in the current rollout).</p></li></ul><p>None of that is &#8220;bad.&#8221; It&#8217;s enterprise reality: governance, compliance, repeatability, safe defaults.</p><p>But it explains why Copilot can feel like it&#8217;s helping you <em>inside a narrow lane</em>, while Claude feels like it&#8217;s willing to rebuild the whole workbook with you.</p><h3><strong>The funniest part: Claude is literally in Excel</strong></h3><p>This is the part that makes me laugh every time:</p><p>There&#8217;s a &#8220;<strong>Claude by Anthropic in Excel</strong>&#8221; integration on Microsoft&#8217;s own marketplace&#8212;positioned as a tool that can analyze, edit, and create workbooks, including multi-tab workbooks, with change tracking and explanations.</p><p>So the story isn&#8217;t &#8220;Microsoft can&#8217;t do AI.&#8221;</p><p>It&#8217;s more like: <strong>the best spreadsheet experience right now is coming from the outside</strong>, while Microsoft is still tightening the bolts on the inside.</p><h3><strong>Why this keeps happening</strong></h3><p>Spreadsheets are messy. 
Real ones are:</p><ul><li><p>multi-tab models</p></li><li><p>weird headers</p></li><li><p>legacy formatting</p></li><li><p>half-table / half-dashboard Frankenbooks</p></li><li><p>brittle formulas with tribal-knowledge assumptions</p></li></ul><p>Copilot&#8217;s sweet spot is clean, structured data you can safely summarize, filter, chart, or extend.</p><p>Claude&#8217;s sweet spot (right now) is: &#8220;Give me the goal, I&#8217;ll produce the whole deliverable.&#8221;</p><h3><strong>The workflow I recommend (right now)</strong></h3><p>If you live in Excel, the best setup I&#8217;ve found is a split-brain approach:</p><p><strong>Use Claude for:</strong></p><ul><li><p>Building a workbook from scratch (tabs, structure, formatting, formulas)</p></li><li><p>Refactoring a messy file into a clean model (standardized tables, consistent naming)</p></li><li><p>Producing reusable templates (forecast model, KPI dashboard shell, variance tracker)</p></li><li><p>Big &#8220;edit this entire spreadsheet&#8221; asks (rename columns across tabs, re-map assumptions)</p></li></ul><p><strong>Use Copilot for:</strong></p><ul><li><p>Quick in-Excel analysis once your data is already in a compliant shape (tables/supported ranges)</p></li><li><p>Low-stakes summarization/categorization on a defined range, especially with =COPILOT() (with the constraints in mind)</p></li></ul><h3><strong>Prompt pack (steal these)</strong></h3><p><strong>Claude prompts</strong></p><ol><li><p>&#8220;Create an Excel workbook for monthly P&amp;L variance: Actual vs Budget vs Prior Year, with a Summary tab and 12 monthly tabs. Use tables, named ranges, and variance % formulas.&#8221;</p></li><li><p>&#8220;Here&#8217;s my workbook. 
Normalize it: convert ranges to tables where possible, remove merged cells, standardize headers, and return an updated .xlsx.&#8221;</p></li><li><p>&#8220;Build a KPI dashboard tab with slicers-ready tables and charts for Revenue, Gross Margin, CAC, Churn, and NRR.&#8221;</p></li><li><p>&#8220;Add a Scenario tab with Base/Best/Worst assumptions and propagate them through the model.&#8221;</p></li><li><p>&#8220;Explain the calculation chain for these cells and then rewrite the model so the assumptions are centralized.&#8221;</p></li></ol><p><strong>Copilot in Excel prompts</strong></p><ol><li><p>&#8220;Highlight anomalies: values 2+ std dev from the mean.&#8221;</p></li><li><p>&#8220;Create a new column formula that buckets these rows into 5 categories based on these keywords.&#8221;</p></li><li><p>&#8220;Summarize the main drivers of change month-over-month.&#8221;</p></li><li><p>&#8220;Create a PivotTable that groups by Region and Product and shows Revenue and Margin.&#8221;</p></li><li><p>=COPILOT("Classify customer feedback into themes", A2:A200)</p></li></ol><h3><strong>My take</strong></h3><p>This isn&#8217;t really a &#8220;model&#8221; story. It&#8217;s a &#8220;product&#8221; story.</p><p>Microsoft is optimizing for:</p><ul><li><p>tenant controls</p></li><li><p>predictable behavior</p></li><li><p>data boundaries</p></li><li><p>compliance posture</p></li></ul><p>Claude is optimizing for:</p><ul><li><p>generating the artifact</p></li><li><p>doing multi-step edits end-to-end</p></li><li><p>moving fast</p></li></ul><p><strong>And the punchline is:</strong> the &#8220;best Excel AI&#8221; experience right now might be the one that&#8217;s least afraid to just&#8230; <em>touch the spreadsheet</em>.</p><p>I&#8217;m not sure this gap lasts&#8212;Microsoft moves fast when it decides something matters. But today, if you care about spreadsheets, the surprising move is simple:</p><p><strong>Don&#8217;t pick one assistant. 
Pick the right assistant for the shape of the job.</strong></p>]]></content:encoded></item><item><title><![CDATA[New Year, New Updates: Agents for Work, Walls for Health]]></title><description><![CDATA[Claude goes after productivity. 
OpenAI goes after trust.]]></description><link>https://www.techwithdarin.com/p/new-year-new-updates-agents-for-work</link><guid isPermaLink="false">https://www.techwithdarin.com/p/new-year-new-updates-agents-for-work</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Fri, 16 Jan 2026 00:32:52 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/2c2f2475-37db-4509-ae66-b5fe9e8bd60e_1536x1024.png" length="0" type="image/png"/><content:encoded><![CDATA[<p>We&#8217;ve spent the last two years arguing about which model is &#8220;best.&#8221;</p><p>This year&#8217;s fight is different.</p><p>It&#8217;s not just models. 
It&#8217;s product design.</p><p>The question is who can ship AI that people actually use every day without it turning into a privacy nightmare or a workflow mess.</p><p>This week&#8217;s clearest signal came from two launches aimed at totally different sectors:</p><ul><li><p>Claude Cowork: an agent-style workspace that&#8217;s clearly gunning for business ops and day-to-day execution</p></li><li><p>ChatGPT Health: a dedicated, walled-off health experience that feels like OpenAI taking &#8220;sensitive context&#8221; seriously</p></li></ul><p>Different sectors. Same direction. Both are interesting for one reason:</p><p>AI is splitting into specialized experiences.</p><div><hr></div><h2><strong>Claude Cowork: the missing bridge between &#8220;chat&#8221; and &#8220;work&#8221;</strong></h2><p>Cowork is the first Claude release that feels like it&#8217;s trying to become a true daily driver, not just a smart conversation partner.</p><h3><strong>First impressions from Darin</strong></h3><p><strong>The UI needs to be optimized for the average non-technical user</strong></p><p>Cowork is clearly aiming beyond developers, but the UI still feels like it was designed by and for people who live in tools all day.</p><p>A bunch of things I&#8217;ve already personalized in ChatGPT feel like the default posture in Cowork. That&#8217;s not a complaint. It&#8217;s actually interesting. But it also means Cowork may feel &#8220;pre-tuned&#8221; in a way that can be confusing for someone new to this.</p><p><strong>It&#8217;s surprisingly refreshing for business tasks</strong></p><p>Cowork bridges the gap between the conversational feel of ChatGPT and the more builder-style vibe of Codex in the OpenAI ecosystem.</p><p>For business work, that middle ground matters.</p><p>Most business tasks aren&#8217;t clean coding problems. 
They&#8217;re messy documents, questionable spreadsheets, exports from five systems, and someone asking for a summary &#8220;by EOD&#8221; with zero context.</p><p>Cowork feels like it was built for that kind of chaos.</p><p><strong>This preview feels long overdue</strong></p><p>It&#8217;s overdue in the best way.</p><p>Claude needed a bigger product story than &#8220;chat with a really good model.&#8221; Cowork finally feels like the start of that shift.</p><p>And honestly, I suspect we&#8217;ll see an OpenAI response on this front. It&#8217;s been a while since we&#8217;ve had a meaningful &#8220;work mode&#8221; evolution, and it&#8217;s long overdue in my opinion.</p><div><hr></div><h2><strong>ChatGPT Health: WebMD instincts, but finally integrated</strong></h2><p>This one hits a different nerve because it points at the most personal category of data people have.</p><h3><strong>First impressions from Darin</strong></h3><p><strong>It reminds me of the early days of the internet</strong></p><p>ChatGPT Health brought back a memory I didn&#8217;t expect: the era when the internet became the place you went first for health questions.</p><p>The WebMD instinct.</p><p>The difference now is the experience is dramatically more detailed and interactive. 
It feels less like reading static pages and more like having a structured way to think through what&#8217;s happening.</p><p><strong>The record integration is the real leap</strong></p><p>What&#8217;s genuinely interesting is how seamless it is to pull in your health records and then do your own brainstorming.</p><p>You can ask questions you forgot to ask your doctor.</p><p>You can work through patterns over time.</p><p>You can translate medical language into plain English.</p><p>You can walk into your next appointment more prepared.</p><p>That&#8217;s not just &#8220;AI answering health questions.&#8221;</p><p>That starts to look like a personal health platform.</p><div><hr></div><h2><strong>The real story: AI is becoming two things at once</strong></h2><p>Put these two launches next to each other and the direction gets loud.</p><h3><strong>Agents that act (Cowork)</strong></h3><p>AI that can move work forward, not just talk about it.</p><h3><strong>Walls that protect (Health)</strong></h3><p>AI that can handle sensitive context without leaking it into everything else.</p><p>If the last couple years were about bigger models, 2026 feels like it&#8217;s shaping up to be about better product surfaces.</p><div><hr></div><h2><strong>Practical take: how I&#8217;d use each without getting burned</strong></h2><h3><strong>Cowork: start like you&#8217;re onboarding a new hire</strong></h3><p>I&#8217;d treat Cowork like someone new joining the team.</p><p>Give it a contained scope first.</p><p>Make sure the outputs are predictable.</p><p>Build trust before you expand access.</p><p>Cowork has the potential to be a daily driver, but the UX will need to evolve if it wants to win the average non-technical user without friction.</p><h3><strong>ChatGPT Health: use it as a prep tool, not a doctor</strong></h3><p>This is where I&#8217;d keep it grounded.</p><p>Best uses:</p><ul><li><p>summarizing records into understandable language</p></li><li><p>building a question list for 
appointments</p></li><li><p>turning vague symptoms into a clear timeline</p></li><li><p>helping you advocate for yourself</p></li></ul><p>Worst uses:</p><ul><li><p>self-diagnosing serious issues</p></li><li><p>making medication decisions</p></li><li><p>replacing real clinical care</p></li></ul><p>The biggest risk isn&#8217;t the tool. It&#8217;s over-trust.</p><div><hr></div><h2><strong>Non-tech reader summary</strong></h2><ul><li><p>Claude Cowork is AI stepping into your work life and trying to actually finish tasks, not just answer prompts.</p></li><li><p>ChatGPT Health is AI stepping into a sensitive category and trying to do it with more structure and separation.</p></li><li><p>The next AI winners won&#8217;t just be the ones that sound smartest. They&#8217;ll be the ones that ship AI you can use and trust in the real world.</p></li></ul><div><hr></div><h2><strong>Closing take</strong></h2><p>Cowork makes me think: AI is becoming an employee.</p><p>Health makes me think: AI is becoming a vault.</p><p>Agents for work. Walls for health.</p><p>New year, new updates, and the product era is officially here.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[My 2025 “Year with ChatGPT”: Confessions of a Power User (and Why You Should Try This)]]></title><description><![CDATA[Every December we all do the same thing:]]></description><link>https://www.techwithdarin.com/p/my-2025-year-with-chatgpt-confessions</link><guid isPermaLink="false">https://www.techwithdarin.com/p/my-2025-year-with-chatgpt-confessions</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Tue, 23 Dec 2025 07:32:30 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Cdzc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49d434e6-4229-43f6-8582-ec84641b8b78_1320x2868.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Every December we all do the same thing:</p><ul><li><p>We pretend we&#8217;re going to &#8220;slow down next year.&#8221;</p></li><li><p>We make one brave resolution.</p></li><li><p>And then the internet drops a year-in-review that makes us feel <em>seen</em>.</p></li></ul><p>This year, ChatGPT joined the party with <strong>Your Year with ChatGPT</strong> &#8212; an optional recap that looks back on how you used it in 2025, including high-level themes and some ridiculous stats (in the best way).</p><p>And friends&#8230; my recap called me out.</p><h2><strong>My 2025 Stats (aka: Proof I Have a Type)</strong></h2><p>Here&#8217;s what my year looked like:</p><ul><li><p><strong>13.61K messages sent</strong></p></li><li><p><strong>569 total chats</strong></p></li><li><p><strong>193 images generated</strong></p></li><li><p><strong>17.19K em-dashes exchanged</strong></p></li><li><p><strong>Top 1% of messages sent</strong></p></li><li><p><strong>First 0.5% of users</strong></p></li><li><p><strong>Chattiest day: Aug 25</strong></p></li></ul><p>If you&#8217;re reading that like, &#8220;Darin&#8230; are you okay?&#8221;</p><p>Yes.</p><p>And also: I clearly use ChatGPT the way some people use coffee. And yes, I drink a lot of coffee as well.
</p><p>(screenshot for reference)</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!Cdzc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49d434e6-4229-43f6-8582-ec84641b8b78_1320x2868.png" width="1320" height="2868" alt="My 2025 Year with ChatGPT recap screenshot" loading="lazy"></figure></div><h2><strong>What This Actually Says About How I Work</strong></h2><p>This isn&#8217;t &#8220;wow, look at my stats.&#8221;</p><p>It&#8217;s more like a mirror for how I think.</p><h3><strong>1) I don&#8217;t use AI as a search engine &#8212; I use it as a workbench</strong></h3><p><strong>569 chats</strong> tells you everything.</p><p>That&#8217;s not &#8220;what&#8217;s the capital of France&#8221; usage.</p><p>That&#8217;s:</p><ul><li><p>spin up a new thread for a new problem</p></li><li><p>break big ideas into small steps</p></li><li><p>iterate fast</p></li><li><p>keep moving</p></li></ul><h3><strong>2) I iterate out loud</strong></h3><p><strong>13.61K messages</strong> and <strong>Top 1%</strong> = I&#8217;m not here for one-and-done prompts.</p><p>I&#8217;m here for:</p><ul><li><p>&#8220;Try
again.&#8221;</p></li><li><p>&#8220;Better.&#8221;</p></li><li><p>&#8220;Make it tighter.&#8221;</p></li><li><p>&#8220;Now make it funnier.&#8221;</p></li><li><p>&#8220;Now make it clearer for non-tech readers.&#8221;</p></li><li><p>&#8220;Now&#8230; cite your sources.&#8221;</p></li></ul><p>This is how good work gets made. Not by inspiration &#8212; by reps.</p><h3><strong>3) My punctuation has a personality</strong></h3><p><strong>17.19K em-dashes exchanged</strong> is objectively hilarious.</p><p>It means my conversations are full of:</p><ul><li><p>asides</p></li><li><p>nuance</p></li><li><p>mid-flight pivots</p></li><li><p>&#8220;wait, but what about&#8230;&#8221;</p></li></ul><p>So yes, the recap basically said:</p><p><strong>&#8220;This guy builds in public, in real time, with chaos energy.&#8221;</strong></p><p>Accurate.</p><h3><strong>4) I&#8217;m using image generation like a creative scratchpad</strong></h3><p><strong>193 images generated</strong> isn&#8217;t &#8220;I&#8217;m an artist now.&#8221;</p><p>It&#8217;s:</p><ul><li><p>thumbnails</p></li><li><p>visual concepts</p></li><li><p>quick mockups</p></li><li><p>&#8220;show me 5 options&#8221;</p></li><li><p>&#8220;okay now make it more modern&#8221;</p></li><li><p>&#8220;okay now make it less cursed&#8221;</p></li></ul><p>If you create content (or presentations, or training, or internal docs), this is a cheat code.</p><h2><strong>The Bigger Point: This Isn&#8217;t a &#8220;Wrap.&#8221; It&#8217;s a Feedback Loop.</strong></h2><p>The most underrated part of this feature isn&#8217;t the numbers.</p><p>It&#8217;s the idea that you can step back and ask:</p><ul><li><p>What did I actually spend time thinking about this year?</p></li><li><p>What kept coming up?</p></li><li><p>What did I learn?</p></li><li><p>What did I build?</p></li><li><p>Where did I get stuck?</p></li><li><p>What patterns show up in how I work?</p></li></ul><p>That&#8217;s not a gimmick. 
That&#8217;s a <strong>reflection tool</strong>.</p><p>And reflection tools are how you compound.</p><h2><strong>Why You Should Use &#8220;Your Year with ChatGPT&#8221; (Even If Your Stats Are Tiny)</strong></h2><p>Because it&#8217;s not a flex.</p><p>It&#8217;s a <strong>baseline</strong>.</p><p>If you&#8217;re early on, it helps you notice:</p><ul><li><p>where AI is already helping you</p></li><li><p>where it&#8217;s not</p></li><li><p>what kinds of prompts actually create value for you</p></li></ul><p>And if you&#8217;re a heavy user, it helps you tune your workflow:</p><ul><li><p>fewer dead-end chats</p></li><li><p>better prompt patterns</p></li><li><p>more repeatable &#8220;systems&#8221; instead of one-off convos</p></li></ul><h2><strong>A Challenge for 2026</strong></h2><p>If you haven&#8217;t used ChatGPT much, don&#8217;t start by asking it random trivia.</p><p>Start by giving it <em>your actual life</em> (the non-sensitive parts):</p><ul><li><p>That email you need to write</p></li><li><p>The plan you&#8217;re trying to shape</p></li><li><p>The decision you&#8217;re stuck on</p></li><li><p>The thing you want to learn but keep postponing</p></li><li><p>The content idea you&#8217;ve started 7 times</p></li></ul><p>Then iterate.</p><p>Don&#8217;t treat it like a vending machine.</p><p>Treat it like a collaborator.</p><h2><strong>The One Setting That Matters</strong></h2><p>Per OpenAI&#8217;s release notes, to see the full experience you need:</p><ul><li><p><strong>Memory ON</strong></p></li><li><p><strong>Reference Chat History ON</strong></p></li><li><p>A minimum activity threshold</p></li></ul><p>It&#8217;s optional, personalized, and rolling out gradually.</p><p>That&#8217;s it.</p><h2><strong>Your Turn</strong></h2><p>If you got your recap&#8230;</p><p><strong>What did it reveal about you?</strong></p><p>Were you a &#8220;one chat a month&#8221; person &#8212; or did you accidentally build a second brain?</p><p>Drop your funniest stat (or biggest 
surprise) in the comments.</p>]]></content:encoded></item><item><title><![CDATA[Cloud Year in Review 2025: Agentic Ops, Multicloud Plumbing, and the Stuff That Actually Survived Production]]></title><description><![CDATA[2025 cloud wasn&#8217;t about migration.
It was about running: agentic ops, multicloud seams, FinOps automation, security in pipelines&#8212;and the myth of &#8220;zero legacy.&#8221;]]></description><link>https://www.techwithdarin.com/p/cloud-year-in-review-2025-agentic</link><guid isPermaLink="false">https://www.techwithdarin.com/p/cloud-year-in-review-2025-agentic</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sat, 20 Dec 2025 15:24:49 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/3af7b23d-d56d-478a-b079-7d8de238d954_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If 2025 had a single theme across AWS, Azure, and Google Cloud, it wasn&#8217;t &#8220;who has the best model.&#8221;</p><p>It was: <strong>who is turning cloud into an operating system for real-world automation</strong> &#8212; where agents, workflows, data, security, and cost controls all have to work together <em>without</em> humans babysitting every edge case.</p><p><strong>My Take:</strong> Think of the last few years like moving into a new skyscraper. We spent a ton of time just getting the furniture inside (migration).
<strong>2025 is the year teams finally hooked up the smart building sensors, automated the climate control, and streamlined billing for every floor so the building actually runs itself.</strong></p><div><hr></div><h2><strong>The broader 2025 shift nobody should ignore: AI sprinkled into everything (useful&#8230; sometimes)</strong></h2><p>Across all three clouds, we saw the same pattern: <strong>AI features getting embedded into an assortment of services</strong>&#8212;from identity and policy surfaces to ops workflows and developer tooling. Some of it is genuinely helpful. Some of it is&#8230; a checkbox.</p><p><strong>My Take (consulting lens):</strong> This changes <em>our</em> job. We need to stop selling &#8220;AI transformation&#8221; like it automatically equals &#8220;zero legacy,&#8221; because that mindset is how orgs get hurt.</p><p>Legacy doesn&#8217;t just mean mainframes.</p><p>Legacy can be:</p><ul><li><p>the <strong>tech debt</strong> you inherited after cutting a decade of tribal knowledge.</p></li><li><p>the &#8220;new mess&#8221; that shows up when an outsourced team ships fast but never read the documentation.</p></li><li><p>the organizational stress of believing you can replace your <strong>tenured</strong> staff with AI (or just saying you are), or with lower-cost labor.</p></li><li><p>the damage from a &#8220;big reorg&#8221; that breaks things that didn&#8217;t need breaking.</p></li></ul><p>In 2025, cloud got more capable&#8212;but the real work is still in the trenches: <strong>stabilize, standardize, automate, govern.</strong></p><div><hr></div><h2><strong>1) The shift from &#8220;agent toys&#8221; to agentic operations</strong></h2><p>In 2025, the focus moved from &#8220;model flexing&#8221; to building <strong>agentic operations</strong>: systems that are <strong>observable, governable, and affordable</strong> to run continuously.</p><p><strong>My Take:</strong> This stopped being a model contest and became an SRE contest. 
Traceability, failure modes, blast radius, and cost control are the real differentiators now.</p><div><hr></div><h2><strong>2) Multicloud as plumbing, not strategy</strong></h2><p>Enterprise multicloud isn&#8217;t a theoretical choice anymore. It&#8217;s operational reality. And in 2025, providers started reducing the friction in the seams.</p><p><strong>My Take:</strong> The biggest multicloud failures rarely happen in the compute layer. They happen in the seams: identity, networking, policy, and data movement. Making the seams less painful is the real platform move.</p><div><hr></div><h2><strong>3) Operational modernization over simple migration</strong></h2><p>For a lot of teams, &#8220;lift-and-shift&#8221; is done &#8212; and now the real work begins: cleaning up the aftermath.</p><p>That means moving away from one-off snowflakes and toward <strong>repeatable patterns, policy-driven management, and standardized guardrails</strong> at scale.</p><p><strong>My Take:</strong> This is the part nobody brags about in keynotes, but it&#8217;s where cloud becomes a true operating model. Quotas, fleet controls, guardrails, and delivery standards matter more than &#8220;one more feature.&#8221;</p><div><hr></div><h2><strong>4) The productization of FinOps</strong></h2><p>This is where I want to be blunt: <strong>FinOps only becomes real when it stops being a report and becomes an operating loop.</strong></p><p>The &#8220;lame&#8221; version of FinOps is tagging workshops and a monthly PowerPoint that everyone politely nods at.</p><p>The 2025 version looks more like:</p><ul><li><p><strong>A unified optimization command center, not scattered recommendations.</strong></p><p>AWS continuing to mature the <em>Cost Optimization Hub</em> pattern is a big deal: one place to triage optimization opportunities, prioritize by impact/risk/effort, and track progress over time. 
I&#8217;m not sure any single feature name is the point here&#8212;the point is the move from &#8220;recommendation sprawl&#8221; to &#8220;operational queue.&#8221;</p></li><li><p><strong>Automation events: the bridge from &#8220;advice&#8221; to &#8220;action.&#8221;</strong></p><p><em>Compute Optimizer automation events</em> are a great example of where FinOps gets simpler: a centralized record of what automation changed, what the estimated savings are, and the ability to roll back when the business impact isn&#8217;t what you expected.</p></li><li><p><strong>Stakeholder-ready billing views (Finance stops calling it a black box).</strong></p><p>Better dashboards and exports matter because they&#8217;re how you reduce friction with Finance and product owners. When costs are visible, attributable, and reviewable without heroics, cloud starts behaving like a product that compounds instead of an expense that spikes.</p></li></ul><p><strong>My Take:</strong> The moment optimization becomes measurable <em>and</em> reversible, it stops being &#8220;recommendations&#8221; and starts being &#8220;repeatable operations.&#8221;</p><div><hr></div><h2><strong>5) Security shifting left into pipelines</strong></h2><p>Cloud maturity in 2025 wasn&#8217;t about buying more security tools. It was about making security and compliance part of the delivery motion:</p><ul><li><p>templates</p></li><li><p>policy-as-code</p></li><li><p>automated scanning in CI/CD</p></li><li><p>standardized environment patterns</p></li></ul><p><strong>My Take:</strong> Guardrails don&#8217;t scale when they live in meetings.
They scale when they live in pipelines.</p><div><hr></div><h2><strong>One more hot take: the recognition gap is getting weird</strong></h2><p>There&#8217;s a trend I can&#8217;t unsee: organizations love celebrating awards and culture on social media&#8212;but do a lackluster job of showing meaningful recognition to the individuals who made those awards and culture possible.</p><p>If we want mature cloud organizations, we need mature operating models&#8212;and that includes how we retain the people who actually keep the lights on, fix the snowflakes, and turn chaos into repeatable systems.</p><div><hr></div><h2><strong>My Take: 2026 predictions</strong></h2><ol><li><p><strong>Agent governance becomes as normal as IAM</strong></p><p>&#8220;Can we do it?&#8221; becomes &#8220;can we audit it?&#8221; Expect eval gates, tool-call tracing, and policy-as-code to become baseline requirements.</p></li><li><p><strong>FinOps shifts from dashboards to automation loops</strong></p><p>The winning programs won&#8217;t just report waste&#8212;they&#8217;ll automatically reduce it (with guardrails + rollback). 
Automation events are the template for where this is going.</p></li><li><p><strong>Multicloud connectivity becomes a default assumption</strong></p><p>More orgs will treat cross-cloud data movement and private connectivity as first-class, not special projects.</p></li><li><p><strong>&#8220;Legacy&#8221; gets redefined as organizational tech debt</strong></p><p>The hardest modernization work will be rebuilding operational clarity after churn: documentation drift, broken ownership, and systems nobody fully understands.</p></li><li><p><strong>Culture marketing gets audited by retention reality</strong></p><p>The most &#8220;high-performing cloud orgs&#8221; will be the ones that reward the operators and builders&#8212;not just the announcements.</p></li></ol><div><hr></div><h2><strong>Non-tech reader translation</strong></h2><p>Earlier cloud years were about <em>getting into the building.</em></p><p><strong>2025 was about making the building run itself</strong>: sensors (observability), automation (agentic ops), plumbing (multicloud connectivity), billing per floor (chargeback-ready FinOps), and fire codes enforced by default (security in pipelines).</p><p>That&#8217;s what makes automation safe, scalable, and sustainable.</p>]]></content:encoded></item><item><title><![CDATA[2025 AI Year in Review: The Hype Correction, the Agent Era, and the Tools That Actually Moved the Needle]]></title><description><![CDATA[If I had to sum up 2025 in one line: we stopped arguing about whether AI is &#8220;real,&#8221; and started arguing about whether it&#8217;s reliable.]]></description><link>https://www.techwithdarin.com/p/2025-ai-year-in-review-the-hype-correction</link><guid isPermaLink="false">https://www.techwithdarin.com/p/2025-ai-year-in-review-the-hype-correction</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Thu, 18 Dec 2025 14:51:54 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/323fd751-e115-4410-835b-8e14b682a863_2848x1504.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If I had to sum up 2025 in one line: <strong>we stopped arguing about whether AI is &#8220;real,&#8221; and started arguing about whether it&#8217;s reliable.</strong></p><p>This was the year AI graduated from &#8220;look what it can do&#8221; to &#8220;cool&#8230; can it do it <em>again</em>, on Tuesday, with logs, guardrails, budgets, and a business owner who expects it not to break?&#8221;</p><p>And yes&#8212;there was absolutely a vibe shift. Call it the hype correction. Call it maturity. Call it the moment we collectively realized that shipping AI into real workflows is less like magic and more like engineering.</p><div><hr></div><h2><strong>For non-tech readers: what mattered in 2025 (and why the speed is the story)</strong></h2><p>If you don&#8217;t live in the tech world, here&#8217;s the simple version:</p><p><strong>AI didn&#8217;t just get &#8220;better&#8221; this year &#8212; it got embedded into everyday tools fast.</strong></p><p>That matters because when a technology starts showing up inside the apps people already use (docs, email, browsers, coding tools, phones), it stops being optional and starts becoming part of how work gets done.</p><p>What to pay attention to:</p><ul><li><p><strong>The pace is accelerating.</strong> Big changes used to take years to become mainstream. In 2025, major updates landed monthly (sometimes weekly). The result: companies and workers had to adapt <em>while the road was being built</em>.</p></li><li><p><strong>The winners weren&#8217;t the people with the fanciest AI.</strong> The winners were the people who learned how to use AI safely and consistently inside their process.</p></li><li><p><strong>AI is becoming a &#8220;new layer&#8221; of work.</strong> Think of it like the jump from flip phones to smartphones &#8212; not one feature, but a shift in <em>how you do things</em>.
The &#8220;assistant&#8221; is becoming a normal part of writing, research, planning, and building.</p></li><li><p><strong>This isn&#8217;t about replacing everyone.</strong> The practical impact in 2025 was mostly about <em>speed and leverage</em>: some people and teams started moving 2&#8211;5x faster on specific tasks (drafting, research, testing, summarizing, prototyping). That creates pressure: expectations rise, timelines shrink, and &#8220;good enough&#8221; becomes the baseline.</p></li><li><p><strong>Trust is the new battleground.</strong> If AI is going to be inside important work, it has to be predictable. That&#8217;s why so much of 2025 shifted from hype to reliability, safety, and accountability.</p></li></ul><p>If you take only one thing from this recap, take this:</p><p><strong>2025 was the year the velocity of change became impossible to ignore.</strong></p><div><hr></div><h2><strong>What actually changed in 2025 (the big themes)</strong></h2><h3><strong>1) From chatbots to agents (and from prompts to systems)</strong></h3><p>The narrative wasn&#8217;t &#8220;prompt better.&#8221; It was:</p><ul><li><p>tool use</p></li><li><p>multi-step planning</p></li><li><p>memory/context management</p></li><li><p>governance, safety, and auditing</p></li><li><p>and integration into where work already happens (IDE, CLI, browser, docs)</p></li></ul><p>In other words: <strong>systems &gt; prompts.</strong></p><h3><strong>2) &#8220;Control knobs&#8221; became the difference between toys and tools</strong></h3><p>The winners weren&#8217;t the loudest model launches. 
The winners were the platforms giving builders real controls:</p><ul><li><p>reasoning depth knobs</p></li><li><p>token/latency tradeoffs</p></li><li><p>multimodal cost controls</p></li><li><p>tool orchestration</p></li><li><p>and the ability to inspect what happened (or at least reconstruct it)</p></li></ul><p>That&#8217;s what made the agent story feel more real this year.</p><h3><strong>3) The cost conversation got serious</strong></h3><p>2025 was also the year we stopped pretending cost didn&#8217;t matter.</p><p>If you&#8217;re building anything beyond a demo, you&#8217;re budgeting:</p><ul><li><p>tokens</p></li><li><p>time</p></li><li><p>evals</p></li><li><p>human review</p></li><li><p>and real operational support</p></li></ul><div><hr></div><h2><strong>OpenAI DevDay: personal take from being in the room</strong></h2><p>DevDay this year felt less like &#8220;look at this model&#8221; and more like &#8220;here&#8217;s the platform posture.&#8221;</p><p>What stuck with me: OpenAI is clearly building toward a world where <strong>ChatGPT isn&#8217;t just a product&#8212;it&#8217;s a surface</strong> where apps, agents, and workflows live.</p><p>That matters because distribution is half the game. 
You can have the best internal agent in the world, but if your users can&#8217;t access it easily&#8212;or don&#8217;t trust it&#8212;it won&#8217;t stick.</p><div><hr></div><h2><strong>OpenAI Codex: the sleeper hit of my year</strong></h2><p>I&#8217;ve got to call this out explicitly because it surprised me in the best way.</p><p><strong>Codex ended up being one of the most practically useful AI tools I used in 2025</strong>, especially for:</p><ul><li><p>testing help (unit tests, edge cases, scaffolding test harnesses)</p></li><li><p>quick &#8220;vibe coding&#8221; prototypes to explore ideas fast</p></li><li><p>refactors where I wanted momentum without losing structure</p></li><li><p>converting messy snippets into something repeatable</p></li></ul><p>The part I didn&#8217;t expect: how often it helped me move from <strong>idea &#8594; working shape</strong> without the usual friction of context switching and staring at blank files.</p><p>I still don&#8217;t treat it like an autopilot. 
But as a co-pilot for:</p><ul><li><p>getting a project started,</p></li><li><p>keeping momentum through the ugly middle,</p></li><li><p>and accelerating the &#8220;boring but necessary&#8221; work (tests especially)&#8230;</p></li></ul><p>&#8230;it was a genuinely pleasant surprise.</p><p>My rule of thumb by the end of the year:</p><p><strong>Use Codex to go faster, then use your brain to go correct.</strong></p><div><hr></div><h2><strong>Google&#8217;s 2025 developer launches: a year of &#8220;agent-first&#8221; everywhere</strong></h2><p>Google&#8217;s end-of-year recap nails the pattern: the big theme wasn&#8217;t &#8220;one AI thing.&#8221; It was <strong>AI woven through the developer experience</strong>.</p><p>Here&#8217;s how it reads in plain English:</p><h3><strong>Gemini 3 + API enhancements: deeper reasoning and better agent building blocks</strong></h3><p>The story here is &#8220;reasoning + agent tooling + cost control.&#8221; Thinking levels, thought signatures, media controls, hosted tools&#8212;you can feel the platform leaning hard into <em>builders assembling systems</em>, not just calling a model.</p><h3><strong>Antigravity + Nano Banana Pro: agentic dev + image workflows that feel product-grade</strong></h3><p>Agent-first development surfaces and image generation/editing tuned for practical design and UI work signal something important: Google isn&#8217;t just shipping models&#8212;they&#8217;re shipping <strong>workflows</strong>.</p><h3><strong>Universal AI assistant + Project Astra &#8594; Gemini Live</strong></h3><p>When you blend multimodal understanding (video, calls, live context) into Gemini Live, you get closer to the assistant people imagine in their head&#8212;not just what we&#8217;ve been calling an assistant.</p><h3><strong>Jules: asynchronous coding agent + CLI tooling</strong></h3><p>This is a real signal: &#8220;async agent&#8221; becomes a normal expectation. Not everything needs to be interactive.
Some work should be delegated, reviewed later, and merged when it&#8217;s correct.</p><h3><strong>Gemini in Firebase Studio and Android Studio Agent Mode</strong></h3><p>This is where it gets serious: agents inside the IDE become part of normal building. Ask &#8594; plan &#8594; execute &#8594; review is becoming the default loop.</p><h3><strong>Android XR: immersive + Gemini as the helpfulness layer</strong></h3><p>XR is still early, but the direction is clear: AI becomes the interaction layer that makes new form factors usable.</p><div><hr></div><h2><strong>Anthropic: strong year for coding + agent reliability</strong></h2><p>Anthropic&#8217;s positioning this year stayed consistent:</p><ul><li><p>stronger coding capability</p></li><li><p>agent workflows and &#8220;computer use&#8221;</p></li><li><p>and a heavy emphasis on responsible deployment and evaluation</p></li></ul><p>Whether you prefer Claude, GPT, or Gemini for a given workflow, it&#8217;s a good thing for all of us that at least one major player is relentlessly focused on &#8220;can this be used safely and predictably&#8221; instead of &#8220;can this win a benchmark.&#8221;</p><div><hr></div><h2><strong>NotebookLM: still a game changer (and I&#8217;ve been saying that for over a year)</strong></h2><p>I&#8217;ve been talking about <strong>NotebookLM</strong> as a game changer for me for over a year now, and I still feel that way.</p><p>It&#8217;s not flashy in the way a new model drop is flashy&#8212;but it&#8217;s the kind of tool that quietly changes how you work if you live in documents, research, and messy source material.</p><p>NotebookLM became my &#8220;second brain&#8221; for:</p><ul><li><p>synthesizing long sources</p></li><li><p>extracting themes across multiple docs</p></li><li><p>building structured notes I can actually reuse</p></li><li><p>turning research into drafts faster (without losing traceability)</p></li></ul><p>I don&#8217;t think enough people talk about
this category: <strong>knowledge workflow tools</strong>. Models are great. But the tools that help you <em>think with your sources</em> are what make AI usable day-to-day.</p><p>NotebookLM has stayed in my rotation because it consistently helps me do the thing I actually need:</p><p><strong>turn information into decisions and outputs.</strong></p><div><hr></div><h2><strong>The 2025 hype correction: my take</strong></h2><p>The &#8220;hype correction&#8221; narrative resonated because it matches what many of us experienced:</p><ul><li><p>LLMs are powerful, but not a universal solvent</p></li><li><p>AI doesn&#8217;t fix messy processes&#8212;it amplifies them</p></li><li><p>production systems require evals, guardrails, and operational thinking</p></li><li><p>and &#8220;agent&#8221; is not a magic word&#8230; it&#8217;s a responsibility</p></li></ul><p>2025 didn&#8217;t kill the AI story. It made the story practical.</p><div><hr></div><h2><strong>What I&#8217;m taking into 2026</strong></h2><p>This is the playbook I&#8217;m carrying forward:</p><ol><li><p>Pick workflows, not demos. Solve one repeatable pain with measurable impact.</p></li><li><p>Treat agents like junior operators. Give them tools, constraints, logging, and escalation paths.</p></li><li><p>Invest in evaluation early. If you can&#8217;t measure quality, you can&#8217;t improve quality.</p></li><li><p>Budget cost and latency from day one. Otherwise you&#8217;re building a surprise bill, not a product.</p></li><li><p>Stay tool-agnostic, outcome-obsessed. 
The best stack is the one that ships and gets adopted.</p></li></ol><div><hr></div><h2><strong>Closing</strong></h2><p>2025 was the year AI stopped being a novelty and became a discipline.</p><p>DevDay made it obvious where OpenAI is headed: platform + distribution + agents.</p><p>Google made it obvious where Google is headed: agents embedded everywhere devs live.</p><p>Anthropic made it clear there&#8217;s still room to compete on reliability and responsibility.</p><p>And NotebookLM reminded me (again) that the most valuable AI tools are often the ones that help you <strong>think</strong>, not just generate.</p><p>Next up: I&#8217;ll do a separate &#8220;2025 Cloud Year in Review&#8221; because cloud didn&#8217;t slow down either&#8230; it just got more intertwined with AI than ever.</p>]]></content:encoded></item></channel></rss>