<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Tech with Darin]]></title><description><![CDATA[Weekly AI and cloud signal from a 30-year practitioner. No hype. No filler. What happened, what it means, and what to do about it.]]></description><link>https://www.techwithdarin.com</link><image><url>https://substackcdn.com/image/fetch/$s_!4B2V!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff69ac3d5-f261-4e0f-9c92-ca59f685d498_1024x1024.png</url><title>Tech with Darin</title><link>https://www.techwithdarin.com</link></image><generator>Substack</generator><lastBuildDate>Mon, 06 Apr 2026 19:57:14 GMT</lastBuildDate><atom:link href="https://www.techwithdarin.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Darin Deters]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[darindeters@gmail.com]]></webMaster><itunes:owner><itunes:email><![CDATA[darindeters@gmail.com]]></itunes:email><itunes:name><![CDATA[Darin Deters]]></itunes:name></itunes:owner><itunes:author><![CDATA[Darin Deters]]></itunes:author><googleplay:owner><![CDATA[darindeters@gmail.com]]></googleplay:owner><googleplay:email><![CDATA[darindeters@gmail.com]]></googleplay:email><googleplay:author><![CDATA[Darin Deters]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[The AI Stack War Is Now Internal]]></title><description><![CDATA[Anthropic bought a biotech startup. Microsoft launched three models. OpenAI lost two executives. 
One pattern explains all of it.]]></description><link>https://www.techwithdarin.com/p/the-ai-stack-war-is-now-internal</link><guid isPermaLink="false">https://www.techwithdarin.com/p/the-ai-stack-war-is-now-internal</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sat, 04 Apr 2026 14:24:19 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/4fcb430e-7967-444e-af7d-50f4a50c1fc4_1476x1052.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>The Bottom Line (No Jargon Edition)</h2><ul><li><p><strong>AI labs are buying the tools they used to rent.</strong> Anthropic paid $400 million for a biotech AI startup. Microsoft shipped three in-house models to chip away at its OpenAI dependency. The era of "we'll just use someone else's AI" is ending for the big players.</p></li><li><p><strong>AI vendors are now a supply chain risk.</strong> Mercor, a recruiting platform backed by Meta, got hit through LiteLLM, a third-party AI integration layer. This is not an isolated incident. Every AI tool you plug into your stack is a new attack surface.</p></li><li><p><strong>OpenAI is under real operational stress.</strong> The CEO of applications went on medical leave. The CMO is stepping back to fight cancer. The COO shifted to a special projects role. Three major leadership moves in one week at a company serving nearly one billion users.</p></li><li><p><strong>Microsoft is spending $10 billion in Japan.</strong> That's infrastructure spending through 2029, with SoftBank and Sakura Internet as partners. The goal: train one million engineers and developers by 2030 while expanding compute capacity in-region.</p></li><li><p><strong>Google's Gemma 4 is now Apache 2.0 licensed.</strong> That's a meaningful shift. It means enterprise teams can use it without the licensing headaches that came with previous Gemma releases. 
Open weights, agentic support, and lower-power device compatibility in one drop.</p></li><li><p><strong>The moat is no longer the model.</strong> It's the vertical stack around the model. Who owns the data pipeline, the tooling, the workflow integration, the compliance layer. The labs figured this out. Now enterprise teams need to figure out what it means for vendor decisions.</p></li></ul><div><hr></div><h2>The Take That Started the Week</h2><p>The Mercor breach didn't make most front pages. A hiring platform gets hit through a dependency in its AI stack. Happens all the time. Move on.</p><p>Except it doesn't happen all the time. Not like this. Mercor was using LiteLLM, a popular open-source library that lets you route calls across multiple AI model providers from a single interface. It's a reasonable engineering choice. Lots of teams use it. And that's exactly the problem.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>When you add an AI layer to your product, you're not adding one vendor. You're adding the vendor's dependencies, the dependencies' dependencies, and every integration point along the way. In traditional software supply chain terms, this is a known pattern. 
We watched it blow up with Log4Shell in 2021. We watched it again with the XZ Utils backdoor in 2024. The AI tooling ecosystem is running through the same learning curve, just faster, with more surface area, and with less institutional memory because most of the teams building on top of these tools are doing it for the first time.</p><p>The practical takeaway is not "don't use AI vendors." That's not a real answer in 2026. The takeaway is that AI vendor risk now belongs on your threat model the same way third-party software dependencies do. If you haven't mapped which AI tools your products or internal systems call out to, and what permissions those tools hold, that audit is overdue.</p><div><hr></div><h2>Cloud Roundup</h2><h3>AWS</h3><p>A quieter week for AWS on the product announcement front. No flagship drops. The bigger story for AWS practitioners is contextual: Microsoft's $10B Japan commitment and its MAI model trio represent a direct challenge in the enterprise cloud space where AWS has historically owned the conversation. AWS's regional infrastructure strategy, particularly in Asia-Pacific, is going to face harder questions as Microsoft builds out compute with local partners like SoftBank and Sakura Internet. Worth watching what AWS counters with in Q2.</p><h3>Azure</h3><p>Microsoft had the biggest cloud week of the year so far. The three MAI models (MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2) are now broadly available for commercial use on Microsoft Foundry and the new MAI Playground. The numbers are concrete: MAI-Transcribe-1 runs 2.5x faster than Azure Fast, supports 25 languages, and starts at $0.36/hour. MAI-Voice-1 handles audio generation and custom voice creation at $22 per million characters. MAI-Image-2 does images and video at $5 per million input tokens.</p><p>This is not just a product launch. Microsoft is systematically reducing its dependency on OpenAI models. 
The relationship is still active, but Redmond is clearly building the capability to operate independently. That's a strategic hedge that makes the $10B Japan announcement even more coherent. If you're building on Azure, your model routing options just got wider.</p><h3>GCP</h3><p>Google shipped Gemma 4 this week. The Apache 2.0 license switch is the headline that matters most to enterprise teams. Previous Gemma releases had custom terms that created legal ambiguity in commercial deployments. Apache 2.0 removes that friction. The model itself is built on the same foundation as Gemini 3, with improved reasoning, native function calling, structured output support, and agentic workflow management baked in. It runs on low-power devices, which positions it well for edge deployments and on-prem use cases where data sovereignty matters.</p><div><hr></div><h2>AI Model Roundup</h2><h3>OpenAI</h3><p>No new model drops this week, which is notable given the leadership turbulence. Fidji Simo, CEO of applications, announced medical leave for a worsening neuroimmune condition. CMO Kate Rouch is stepping back for cancer recovery. COO Brad Lightcap is shifting to a special projects role reporting directly to Sam Altman. Greg Brockman will oversee product in Simo's absence. Chief Strategy Officer Jason Kwon, CFO Sarah Friar, and CRO Denise Dresser are splitting business and operations oversight. Former Meta CMO Gary Briggs is stepping in as interim CMO. That is a lot of change to manage at a company that is approaching one billion users while running an active IPO process.</p><h3>Anthropic</h3><p>The Coefficient Bio acquisition landed this week. $400 million in stock for a sub-10-person stealth startup. Coefficient Bio's platform lets AI draft drug R&amp;D plans, manage clinical regulatory strategies, and identify drug candidates. Anthropic is folding that into biopharma R&amp;D workflows using its foundation models as the backbone. 
This is vertical integration in the truest sense: owning the domain application layer, not just the model underneath it. The same week, Anthropic cut off Claude Pro and Max subscribers from using their subscriptions to power third-party AI agents, citing compute and engineering resource management. You now need the API or pay-as-you-go billing to run third-party agent workflows on Claude.</p><h3>Google AI</h3><p>Gemma 4 is the AI model story from Google this week. Ten-trillion-parameter count at the top of the family (Claude Mythos 5 at the same scale is Anthropic's comparable), Apache 2.0 licensing, agentic support, and a design philosophy built around both cloud and local deployment. Google has been playing a long game in open-weights models, and Gemma 4 feels like the first version that's genuinely ready for serious enterprise workloads without legal headaches.</p><div><hr></div><h2>The Pattern I'm Watching</h2><p>Thirty years in tech, and I've watched this cycle play out more than once. In the early 2000s, enterprise software companies started acquiring the consulting firms and implementation partners that lived on top of their platforms. SAP bought its way into services. Oracle absorbed its own ecosystem. The reasoning was always the same: the money isn't in the license, it's in the workflow. Own the workflow, and the license renews itself.</p><p>What's happening now across OpenAI, Anthropic, Microsoft, and Google is structurally identical, just compressed and running at AI speed. Anthropic buying Coefficient Bio is not primarily a talent acquisition or a technology bet. It's a workflow acquisition. Drug discovery workflows, clinical regulatory workflows, candidate identification workflows. Once those are native to Claude, the switching cost for a pharma company is no longer "which model do I use?" It's "do I want to rebuild three years of operational integration?" 
That's a very different question, with a very different answer.</p><p>The thing that's different this time is the speed of the lock-in. In the SAP era, implementation cycles ran 18 to 36 months. That was your window to reconsider vendor choices. In an AI-native workflow, a team can go from evaluation to deeply embedded in 90 days. The vertical integration moat builds faster than enterprise procurement can respond. I've seen this catch companies flat-footed before. I'm watching it happen again.</p><p>The question worth sitting with this weekend: which of the AI tools your team runs today would be genuinely painful to replace? Not inconvenient. Genuinely painful. That list is probably longer than you think.</p><div><hr></div><h2>Sign-Off</h2><p>The AI stack war moved inside the labs this week. Acquisitions, model launches, leadership shifts, and a supply chain breach all pointing in the same direction: the competition isn't just between labs anymore. It's between ecosystems, workflows, and whoever gets to own the layer your team can't easily replace.</p><p>Hit reply and tell me. I read every response. Darin</p><div><hr></div><p><em>Weekly AI and cloud breakdowns from someone who's been in the game since the early days of the internet. No ads. No filler. The signal.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Anthropic had the worst week and still won]]></title><description><![CDATA[Tech with Darin - Weekly Round up 03/28/26]]></description><link>https://www.techwithdarin.com/p/anthropic-had-the-worst-week-and</link><guid isPermaLink="false">https://www.techwithdarin.com/p/anthropic-had-the-worst-week-and</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sat, 28 Mar 2026 12:46:22 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/3f285058-8f9d-494c-9a29-423ad32bffc1_1420x840.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>The "Explain It Like I'm Paying the Bill" Version</h2><ul><li><p>Anthropic confirmed its most powerful model yet, codenamed Mythos, after internal documents leaked online this week. The company turned a security headache into a product announcement.</p></li><li><p>A federal judge blocked the Pentagon from labeling Anthropic a national security risk, ruling the government's action was First Amendment retaliation. That ruling is a bigger enterprise sales asset than any product launch.</p></li><li><p>OpenAI shut down Sora, its video generation app, just months after launch. The compute that powered it will be redirected to coding and reasoning. That tells you exactly where OpenAI thinks the money is.</p></li><li><p>Microsoft is on track for its worst stock quarter since 2008, with investors questioning whether its AI infrastructure bets will pay off. 
Capital expenditures nearly doubled year-over-year to $29.9 billion in Q2 FY2026.</p></li><li><p>Global cloud infrastructure spending hit $110.9 billion in Q4 2025, up 29% year-over-year. The buildout is not slowing down, but the pressure to show returns is intensifying.</p></li><li><p>Wikipedia voted to ban AI-generated content from its encyclopedia, with two narrow exceptions: AI-assisted translations and minor copy edits. The open web's most trusted source drew a clear line.</p></li><li><p>Meta told employees this week that 65% of engineers must be using AI coding tools by the end of H1 2026. That is a mandate, not a suggestion.</p></li></ul><div><hr></div><h2>The Take That Started the Week</h2><p>Anthropic had a month that would have broken most companies. Fourteen product launches. Five service outages. Internal documents leaked publicly. A presidential administration put the company on a supply chain risk list. By any normal measure, this should have been a brand implosion.</p><p>It wasn't. By the end of the week, a federal judge had ruled that the Pentagon's action was unconstitutional retaliation for protected speech. The leaked documents, far from revealing damaging secrets, confirmed what enterprise buyers were hoping to hear: Anthropic is building something genuinely powerful and has a roadmap ambitious enough to be worth protecting. The company confirmed the existence of Mythos, its most capable model yet, after the leak forced its hand.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>There is a lesson buried in this for anyone watching the AI vendor market. Anthropic's value to enterprise buyers is not just model performance. It is the signal that the company will fight to stay independent. A government blacklist, when successfully challenged in court, becomes proof of vendor backbone. Every Fortune 500 legal and procurement team watching that ruling saw the same thing: Anthropic does not fold under political pressure. In a market where vendor lock-in is a real risk and regulatory environments are shifting fast, that posture has dollar value.</p><p>The chaos did real damage. Five outages in a single month is not a footnote; it is a reliability problem that engineers running production workloads have to explain to their managers. Anthropic has to fix that. But the narrative of the week was resilience, not collapse. That matters more in the enterprise than it should.</p><div><hr></div><h2>Cloud Roundup</h2><h3>AWS</h3><p>No headline AWS announcements this week, but the $110.9 billion Q4 2025 global cloud infrastructure number puts the hyperscaler buildout in stark relief. AWS holds roughly 30% of that market. Enterprise AI demand shifting from experimentation to production deployment is the engine behind the numbers, and AWS is expanding infrastructure capacity accordingly. For practitioners: the capacity is there, but so is the cost pressure from finance teams who want to see ROI timelines.</p><h3>Azure</h3><p>Microsoft's stock tells a story the earnings call won't fully capture. 
The company is on track for its worst stock quarter since 2008, and the culprit is not weak fundamentals. Azure grew 39% in Q2 FY2026. The problem is investor math: capital expenditures nearly doubled year-over-year to $29.9 billion. At that spend level, the market wants a returns timeline, and Nadella does not have a clean one yet. The Perplexity cloud deal and the OpenAI partnership are bets, not revenues. The next 90 days of earnings guidance will be closely watched.</p><h3>GCP</h3><p>Google Cloud is the quiet beneficiary of the Microsoft uncertainty and the Anthropic noise. GCP does not have the OpenAI partnership baggage or the Anthropic political drama. For enterprise procurement teams looking for a neutral lane, GCP is the option that does not come with a news cycle attached. That is a positioning advantage worth watching.</p><div><hr></div><h2>AI Model Roundup</h2><h3>OpenAI</h3><p>OpenAI shut down Sora this week. The iOS app, the API, and the Sora.com experience are all going offline, though the company has not published a final shutdown date. The stated reason is compute reallocation. Sora consumed significant GPU capacity that can generate more revenue in coding, reasoning, and text generation. OpenAI raised $110 billion in fresh funding just weeks ago at a $730 billion valuation. Shutting down a consumer product that topped the App Store charts is a clear signal about where the company thinks durable revenue lives. Video generation research will continue internally for robotics training and simulation, which is a sensible narrowing.</p><h3>Anthropic</h3><p>The Claude Mythos confirmation is the week's biggest model news, even if the delivery was unplanned. Beyond the leak, the federal court ruling is the story that enterprise buyers will remember. Anthropic can now say, with a federal court order behind it, that it defended its right to operate independently against a sitting administration. 
That is a data point that belongs in every enterprise procurement brief about AI vendor risk.</p><h3>Google AI</h3><p>No major standalone Google AI model announcements this week. Google's position sits beneath the surface of the bigger stories: its infrastructure fuels a significant portion of the AI workloads generating the cloud spending numbers, and Google DeepMind continues its research publishing cadence. The absence of a headline event this week is not a problem for Google. It is a week where the competitors made the news.</p><div><hr></div><h2>The Pattern I'm Watching</h2><p>I have been watching vendor consolidation cycles for 30 years, and they almost always follow the same sequence. First, a wave of new entrants floods the market with capabilities. Then a stress event, a real one, not a press release, separates the companies that can operate under pressure from the ones that only perform under ideal conditions. Then enterprise buyers sort themselves into camps based on who passed the stress test.</p><p>We are in that second phase right now. Anthropic's March was a stress test. Five outages, a government blacklist, a data leak. The company came through it with a court ruling affirming its independence and a product roadmap that looks stronger for having been forced into daylight. OpenAI's Sora shutdown is a different kind of stress test: not a crisis, but a discipline test. Can a company valued at $730 billion make the call to kill a popular consumer product because it is not the highest-value use of its compute? Apparently yes. That kind of resource discipline is what separates labs that scale from labs that sprawl.</p><p>The third phase, enterprise sorting, is underway. Meta's 65% AI coding tools mandate is not a technology story. It is an organizational commitment story. Meta is betting its engineering velocity on AI-assisted development, and the mandate forces adoption rather than waiting for organic enthusiasm. 
Wikipedia banning AI-generated content is the counterweight: the institutions that care most about accuracy and trust are drawing lines. These two moves will coexist for years. The question worth sitting with: in the organizations you work in or advise, which camp are they moving toward: mandate or moratorium?</p><div><hr></div><p><em>Weekly AI and cloud breakdowns from someone who's been in the game since the early days of the internet. No ads. No filler. The signal.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Tech with Darin - Weekly Rollup March 21, 2026]]></title><description><![CDATA[Jensen Huang said $1 trillion. 
Here's the constraint nobody's talking about.]]></description><link>https://www.techwithdarin.com/p/tech-with-darin-weekly-rollup-march</link><guid isPermaLink="false">https://www.techwithdarin.com/p/tech-with-darin-weekly-rollup-march</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sat, 21 Mar 2026 13:31:29 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/b960cfe8-64f5-4e9d-979c-7c766fa09e3a_2752x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>The Bottom Line (No Jargon Edition)</h2><ul><li><p>Google quietly reversed its 2018 "no military contracts" position and is now openly telling employees it is "leaning more" into Pentagon work. That shift is reshaping who gets hired and what they build.</p></li><li><p>A popular open-source security scanner called Trivy was backdoored this week. The attackers injected credential-stealing malware into the tool teams use to find vulnerabilities. That is a significant escalation. The weapon is now the defense tool itself.</p></li><li><p>Nvidia CEO Jensen Huang projected $1 trillion in AI chip orders through 2027 at the company's annual developer conference. One manufacturer reported Nvidia is already supplying 20% fewer chips than the market needs.</p></li><li><p>Andy Jassy told AWS employees he now expects AI to push AWS annual revenue to $600 billion by 2036. That is double what he projected a year ago.</p></li><li><p>OpenAI plans to nearly double its workforce to 8,000 employees by year-end and is preparing for an IPO. The company also told staff ChatGPT needs to become a genuine productivity tool, not a demo.</p></li></ul><div><hr></div><h2>The Take That Started the Week</h2><p>Jensen Huang did not come to GTC 2026 to be modest. He stood on stage and projected $1 trillion in Blackwell and Vera Rubin chip orders through 2027. For context, he projected $500 billion through 2026 a year ago. He doubled the number in twelve months. 
That is not a forecast. That is a statement of market position.</p><p>The agentic AI thesis is the engine behind it. When AI shifts from a chatbot you open to a system that spawns agents, executes tasks, and calls APIs without you watching, the compute demand does not grow linearly. It compounds. Every AI agent running inference at scale needs hardware. Huang knows exactly what that arithmetic looks like.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>Here is the part that matters for practitioners. MSI reported this week that Nvidia is delivering approximately 20% fewer GPUs than the market currently demands. A $1 trillion projection with a 20% supply gap is not a growth story. It is a rationing story. Teams that secure hardware capacity now are not optimizing costs. They are securing the ability to build at all. The labs that have locked in chip supply agreements are operating in a different competitive environment than the ones queuing for spot capacity.</p><p>The Nvidia-Anthropic hiring debate that ran through the week was the visible surface of this constraint. The real pressure is not talent. It is compute. When hardware is scarce, every decision downstream of that scarcity gets distorted. 
Hiring shortcuts, evaluation compromises, build-versus-buy trade-offs. Watch this dynamic. It shapes the next 18 months more than any model release.</p><div><hr></div><h2>Cloud Roundup</h2><h3>AWS</h3><p>Andy Jassy's internal all-hands comment landed in Reuters this week. He told employees AI could push AWS annual revenue to $600 billion by 2036, doubling his prior estimate of $300 billion. That revision says more about how Jassy reads the agentic AI transition than any earnings call. When inference demand compounds and every enterprise workload starts attaching AI agents, the cloud bill goes up. AWS is the infrastructure that bill gets charged to.</p><p>AWS also announced its 2026 Pioneers cohort: 12 European AI startups working across healthcare diagnostics, climate modeling, and conflict prediction. Alongside that, AWS committed $1 billion in cloud credits for startups developing generative AI solutions. The startup credit play is a long-game customer acquisition strategy. Seed the ecosystem now, collect the revenue when those companies scale.</p><h3>Azure</h3><p>Microsoft's Copilot AI leadership reshuffled this week, freeing Mustafa Suleyman to focus on building new models. The structural read: Microsoft is separating the "ship Copilot features into Office" work from the "build the next generation of models" work. Those two tracks have very different timelines and success metrics. Watch whether that separation produces sharper output or slower coordination.</p><p>OpenAI's partnership with AWS to supply AI models to the U.S. military and government also surfaced this week. Microsoft's exclusive relationship with OpenAI on commercial Azure workloads coexists, somewhat uncomfortably, with OpenAI doing its own government deals. The boundaries of that partnership are getting tested.</p><h3>GCP</h3><p>Google published its "Personal Intelligence" rollout this week, bringing personalized Gemini responses into Chrome and AI Mode for free users. 
The feature pulls context from a user's Google ecosystem data to generate more relevant answers. Google is also testing a Gemini Mac app to put it in the same desktop shortcut slot as ChatGPT and Claude.</p><p>The bigger story is the Pentagon move. The New York Times reported this week that Google is quietly rebuilding its Defense Department relationship after walking away in 2018 following employee protests over Project Maven. Google is now telling staff that working with democratically elected governments is part of its obligations. That is a complete philosophical reversal, and it is happening fast.</p><div><hr></div><h2>AI Model Roundup</h2><h3>OpenAI</h3><p>OpenAI expanded GPT-5.4 access with faster Mini and Nano model variants this week. The model ladder strategy is now clear: large frontier models for complex tasks, small fast models for high-volume inference. That architecture matches how agents actually get deployed. The flagship model reasons. The mini model executes at scale.</p><p>The workforce news is the structural signal. Plans to nearly double headcount to 8,000 by year-end, combined with IPO preparation, means OpenAI is no longer running like a research lab. It is running like a company with quarterly pressure and investor commitments. The internal directive to make ChatGPT a "productivity tool" reflects that. Research culture and revenue culture pull in different directions.</p><h3>Anthropic</h3><p>Anthropic stayed quieter on releases this week, but it sat in the center of the labor market debate. The Nvidia-Anthropic hiring story surfaced questions about evaluation standards under capacity pressure. When compute scarcity forces build timelines to compress, the teams doing the building get squeezed. 
Anthropic's position in that dynamic is interesting: it is one of the best-resourced frontier labs and still feels the pressure of the hardware constraint.</p><h3>Google AI</h3><p>Gemini's Personal Intelligence rollout is Google's answer to the ambient AI question. ChatGPT has the brand. Claude has the trust signal with technical users. Google has the data ecosystem. Personal Intelligence is the move that plays to Google's actual advantage: knowing more about you than any other platform on earth. The Mac app push is table stakes. The data play is the moat.</p><p>Google also rolled out Gemini integration into Workspace this week, enabling the model to generate first drafts in Docs, build spreadsheets, and design presentations from simple prompts. The office suite integration race is now fully engaged. Microsoft Copilot has been shipping this for 18 months. Google is closing the gap.</p><div><hr></div><h2>The Pattern I'm Watching</h2><p>Here is what this week looked like when you zoom out: three unrelated stories all pointed at the same underlying shift. Google reverses course on defense contracts. A security scanner becomes an attack vector. Nvidia projects a trillion-dollar chip market while supply runs 20% short. On the surface, those stories live in different domains: policy, security, and hardware. The through-line is that infrastructure decisions are now strategic in ways they were not two years ago.</p><p>I watched this pattern play out before. In the mid-1990s, network infrastructure went from a back-office cost center to a competitive weapon. The companies that treated bandwidth, routing architecture, and physical co-location as strategic priorities pulled ahead. The ones that treated those as commodity procurement problems lost the decade. The current moment rhymes. 
The teams treating GPU allocation as a strategic question, treating their security tooling supply chain as a threat surface, and watching how their cloud providers are positioning on government contracts will have different options than the teams that are not.</p><p>The Trivy compromise is the one I keep coming back to. The thing that got backdoored was the tool designed to find backdoors. That is not a security failure with a patch. That is a structural challenge with no clean fix. Your security posture is only as reliable as the integrity of the tools you use to measure it. Thirty years in, that feels like the most underpriced risk in the current environment.</p><p>What is the infrastructure decision your team is treating as a commodity problem that you should be treating as a strategic one?</p><div><hr></div><h2>Sign-Off</h2><p>Infrastructure used to be a decision you made and then mostly forgot about. That era ended this week, if it had not already. The questions your team is answering about where your compute lives, who built your security toolchain, and which government contracts your cloud providers are chasing are not IT decisions anymore. They are business decisions with 5-year consequences.</p><p>Hit reply and tell me. I read every response.</p><p>– Darin</p><div><hr></div><p><em>Weekly AI and cloud breakdowns from someone who's been in the game since the early days of the internet. No ads. No filler. The signal.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication.
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[When the Model Is No Longer the Product]]></title><description><![CDATA[OpenAI's Astral acquisition is one move on a larger board. The 30-year pattern behind developer platform consolidation, and what it means for your stack.]]></description><link>https://www.techwithdarin.com/p/when-the-model-is-no-longer-the-product</link><guid isPermaLink="false">https://www.techwithdarin.com/p/when-the-model-is-no-longer-the-product</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Fri, 20 Mar 2026 21:33:31 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0aa36b9d-2703-47e2-ab75-3197b073d9cb_1080x1350.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>OpenAI bought Astral this week. Astral makes ruff and uv, two Python developer tools that the community adopted not because anyone marketed them, but because they are genuinely excellent. Ruff replaced an entire category of Python linters. uv replaced pip in enough real environments that "just use uv" became the default answer in engineering Slack threads. Astral's founder Charlie Marsh confirmed the deal would bring those tools into OpenAI's Codex platform, with a commitment to keep the open-source tools alive post-acquisition.</p><p>The headline will get filed under "AI company buys developer tool startup." 
That framing misses the point entirely.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2>1. The Headline Everyone Is Talking About</h2><p>OpenAI is not buying a Python linter. OpenAI is buying the trust that lives inside every Python developer's <code>.venv</code> folder.</p><p>That trust took years to build. Charlie Marsh and the Astral team earned it by shipping tools that were faster, more opinionated, and less annoying than the incumbents. The developer community rewarded that by making ruff and uv near-defaults in serious Python projects. OpenAI wrote a check for that installed base, that credibility, and the engineers behind it.</p><p>But the Astral deal is one piece of a pattern that became hard to ignore this week.</p><p>The same 72-hour window brought us the WSJ and Verge reporting that OpenAI is planning a desktop superapp. ChatGPT, Codex, and the Atlas browser unified into a single product under Fidji Simo, OpenAI's Chief of Applications. Greg Brockman is involved. The internal memo framing, per Simo's post on X: "when new bets start to work, like we're seeing now with Codex, it's very important to double down." 
Reuters reported separately that OpenAI is building a GitHub alternative. And the Windsurf acquisition, the AI-native IDE that was in talks with OpenAI for months, reportedly stalled because Microsoft wanted access to Windsurf's IP to protect GitHub Copilot.</p><p>Read those stories individually and they're interesting. Read them together and you see the architecture of a developer platform.</p><div><hr></div><h2>2. What Happened Last Time</h2><p>In 1995, Microsoft was already the dominant PC operating system vendor. But Bill Gates made a decision that locked in the next decade of developer loyalty: Microsoft would own the entire developer experience, not just the OS.</p><p>Visual Basic had been around since 1991. Visual C++ was shipping. In 1997, Microsoft shipped Visual Studio 97, bundling everything into a single IDE. They bought Fox Software for FoxPro. They acquired Vermeer Technologies for FrontPage. They partnered with and then slowly absorbed the tooling that developers used every day. The strategy was never about the individual tool. It was about making the cost of switching away from Windows feel prohibitive because every tool you depended on, every debugger, every deployment utility, every source control client, was stitched into the Microsoft fabric.</p><p>By the time the DOJ antitrust case was in full swing, the developer lock-in was already complete. Developers didn't stay on Windows because they loved Windows. They stayed because their entire workflow was Windows. The OS was the platform. The platform was the tools. The tools were the moat.</p><p>Here is the number that puts that era in context: by 1997, Visual Studio had an estimated 7 million licensed users. Microsoft's developer tools division was generating over $1 billion in annual revenue. That was before the internet reshaped everything.</p><p>The second parallel worth naming is IBM's acquisition of the Rational Software portfolio in 2003.
IBM bought Rational for $2.1 billion, not because it needed another software product line, but because Rational owned the workflows. Rose, RequisitePro, ClearCase: if your team ran on Rational tools, you ran on IBM tools, and IBM's consulting and services arm was right there to help you do more of that. The tools created the relationship. The relationship created the revenue.</p><p>Both stories have the same structure. The platform company identifies where developers spend their time. It acquires the trust that already lives there. It connects that trust to a broader surface area. Then it waits.</p><div><hr></div><h2>3. What Is Different This Time</h2><p>Three things are different, and they matter.</p><p><strong>The speed is different.</strong> Microsoft's developer platform consolidation took roughly a decade to reach critical mass. OpenAI's version is happening in months. Astral's tools have been widely adopted for less than three years. Claude Code went from zero to being the benchmark that Codex chases in about eighteen months. The cycle that used to take a decade now takes a product cycle or two.</p><p><strong>The moat target is different.</strong> In the 1990s, Microsoft was locking in the workflow around writing code. OpenAI is locking in the workflow around writing code with AI. That is a larger surface area. It includes the model, the IDE, the linter, the dependency manager, the code reviewer, the browser, and soon the code repository. Every layer that a developer touches in a day is now a layer that OpenAI is trying to own or influence. If they succeed, switching away from OpenAI won't mean "install a different IDE." It will mean "rebuild your entire cognitive workflow."</p><p><strong>The incumbent threat is more direct.</strong> When Microsoft built its developer platform, the competition was fragmented. Borland, Watcom, Metrowerks: nobody had a coherent counter-strategy. The competition OpenAI faces in 2026 is much sharper.
Anthropic launched a code review tool this month and committed $100 million to a Claude Partner Network. Claude Code's ARR is part of Anthropic's $2.5 billion+ total business. Anthropic is not fragmented. It is making the same bet on developer trust from a different angle. The $15-25 per pull request code review pricing is a stake in the ground that says: Claude belongs in your deployment pipeline, not just your editor.</p><p>Meanwhile Microsoft, which owns GitHub and has powered GitHub Copilot with OpenAI models since 2021, is watching its own partner turn into its most direct competitor. The Windsurf situation crystallized that tension. OpenAI wanted the IDE. Microsoft wanted the IP. The stalemate is a window into how much the relationship has frayed.</p><p><strong>The regulatory and open-source dimension is also different.</strong> Microsoft could acquire developer tools in the 1990s without much scrutiny and without community blowback. Charlie Marsh's promise to keep ruff and uv open-source after the Astral deal closes is a direct response to that difference. The developer community in 2026 has cultural antibodies to platform capture. They remember what happened to tools that got acquired and then quietly deprioritized. OpenAI is making a calculated bet that it can absorb Astral's trust without triggering the immune response. That bet is not guaranteed to pay off.</p><div><hr></div><h2>4. The Practitioner Playbook</h2><p>This is the section I'd want if I were an engineering leader right now. Not the analysis. The decisions.</p><p><strong>If you run a cloud platform or infrastructure product:</strong></p><p>Your moat just thinned. OpenAI building a GitHub alternative and a superapp that wraps the browser is a direct bid for the compute-adjacent developer relationship that AWS, Azure, and GCP have spent a decade cultivating. The developer who used to start from the cloud console now starts from Codex. 
If OpenAI succeeds in making Codex the place where code is written, reviewed, deployed, and iterated on, the cloud provider becomes a commodity below that layer. This is not a threat that materializes overnight. But the architectural direction is clear. Start thinking now about what you own that OpenAI cannot easily replicate. Execution, compliance, data locality, enterprise relationships. Those are the levers. Positioning on raw compute won't hold forever.</p><p><strong>If you build developer tooling or a security product that sits in the dev workflow:</strong></p><p>You are being targeted. Not personally, but structurally. The integrated platform play always squeezes independent tool vendors. In the Microsoft era, Borland built great C++ tools. Microsoft made Visual C++ good enough and bundled it free. Borland didn't lose because Microsoft's product was better. It lost because developers didn't need to pay separately for something that shipped with the platform. Watch how OpenAI treats third-party integrations with Codex over the next 18 months. If they start replicating capabilities that partners built, that's the tell. Start building your differentiation around workflow depth and enterprise integration, not around the surface-level feature set that an AI platform can ship in a sprint.</p><p><strong>If you are a technology leader evaluating toolchain strategy:</strong></p><p>Do not consolidate onto any single AI platform right now. The competition between OpenAI and Anthropic is real and it is running hot. That competition is good for you today. It means aggressive pricing, generous rate limits, and both platforms making concessions to win enterprise deals. Claude Code users in early 2026 were reporting over $1,000 of effective usage against $200/month plans. That is OpenAI and Anthropic buying market share. Lock in those rates where you can, but architect your workflow so you can move. 
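</p><p><em>A minimal sketch of that portability advice, in Python. Everything here is hypothetical: the adapter names and the <code>complete()</code> routing function are illustrations, not any vendor's real SDK. The point is the shape: one internal request type, one adapter registry, one seam.</em></p>

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class ChatRequest:
    """One internal request shape, whatever the vendor."""
    system: str
    user: str
    max_tokens: int = 512

# Placeholder adapters. A real adapter would call the vendor SDK
# and normalize its response to a plain string.
def _openai_adapter(req: ChatRequest) -> str:
    raise NotImplementedError("wire the OpenAI SDK in here")

def _anthropic_adapter(req: ChatRequest) -> str:
    raise NotImplementedError("wire the Anthropic SDK in here")

ADAPTERS: Dict[str, Callable[[ChatRequest], str]] = {
    "openai": _openai_adapter,
    "anthropic": _anthropic_adapter,
}

def complete(provider: str, req: ChatRequest) -> str:
    # The only seam the rest of the codebase sees. Switching vendors
    # is one new adapter plus a config change, not a rewrite.
    return ADAPTERS[provider](req)
```

<p>Call sites import <code>complete()</code> and nothing vendor-specific, which is what keeps the renegotiation leverage on your side of the table.</p><p>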
The team that builds deep platform dependency today will be the team renegotiating from a weak position in 24 months.</p><p><strong>If you are thinking about the Microsoft relationship specifically:</strong></p><p>The OpenAI-Microsoft partnership is under structural strain. Microsoft now derives roughly 45% of its remaining performance obligation from OpenAI. That is a dependency, not a partnership. Microsoft shipping Copilot Cowork on Anthropic's technology is a hedge. OpenAI building a GitHub alternative is a hedge in the other direction. Both companies are covering their exits while keeping the partnership alive because neither can afford to blow it up yet. If you are a Microsoft enterprise customer, start asking your rep how Microsoft's AI strategy works if the OpenAI relationship changes materially. The answer to that question will tell you a lot about where Microsoft's actual differentiation sits.</p><div><hr></div><h2>The Pattern Underneath All of This</h2><p>Here is the 30-year read.</p><p>Every major technology cycle ends the same way. The enabling layer, whether that's the operating system, the cloud, or the AI model, gets commoditized or consolidated, and the winner is the company that owns the workflow on top of it. Microsoft owned the workflow in the PC era. AWS owned the workflow in the early cloud era. Salesforce owned the workflow in the CRM era.</p><p>OpenAI is trying to own the workflow in the AI-native development era. Acquiring Astral is one move on that board. The superapp is another. The GitHub alternative is a third. Taken together, they form a coherent theory of where the value accretes.</p><p>The model is not the product. The model is the engine. The product is the complete surface area where a developer does their work. OpenAI figured that out faster than most people expected.</p><p>The companies that lose in this cycle will be the ones that kept betting on the model being the moat. 
The companies that win will be the ones that got to the workflow first.</p><p>Astral was the workflow. Now it's OpenAI's.</p><div><hr></div><p><em>Hit reply and tell me what you're seeing in your own stack. I read every response.</em></p><p><em>– Darin</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The Week AI Got a Geopolitical Problem]]></title><description><![CDATA[Pentagon bans, private capital responds, Nvidia bets $26B on models, and datacenters become military targets.
The signal from week 11 of 2026.]]></description><link>https://www.techwithdarin.com/p/the-week-ai-got-a-geopolitical-problem</link><guid isPermaLink="false">https://www.techwithdarin.com/p/the-week-ai-got-a-geopolitical-problem</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sat, 14 Mar 2026 18:46:55 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/2bcf687c-0a06-43de-b901-9e0e287ffdaf_1080x1350.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>The Bottom Line (No Jargon Edition)</h2><ul><li><p><strong>The Pentagon banned Anthropic, then 185,000 people downloaded Claude in 24 hours.</strong> A government agency said a company&#8217;s AI product was off-limits for national security reasons. Consumer reaction: record downloads. Government procurement and consumer behavior are now moving in opposite directions. That gap has real consequences for enterprise AI strategy.</p></li></ul><ul><li><p><strong>Private capital moved in immediately.</strong> Blackstone, the firm that manages over a trillion dollars in assets, entered talks with Anthropic to deploy Claude across its portfolio the same week the Pentagon ban landed. When governments shut the door on a tech company, the money doesn&#8217;t wait. It finds the next door.</p></li></ul><ul><li><p><strong>Nvidia made the biggest strategic bet of the year.</strong> The company that sells chips to every AI lab in the world is now spending $26 billion to build AI models itself, competing directly with the same labs it powers. It also pre-announced NemoClaw, an open-source agent platform. If you thought Nvidia was just a hardware company, that framing is now out of date.</p></li></ul><ul><li><p><strong>Iran struck AI datacenters in the Gulf.</strong> Hyperscale cloud infrastructure became a military target this week in a way that is no longer theoretical.
The cloud resilience architecture most teams have built assumes power outages, hardware failure, and software bugs. Geopolitical conflict is a different threat model. Most DR plans don&#8217;t account for it.</p></li></ul><ul><li><p><strong>LinkedIn rebuilt its feed algorithm using large language models.</strong> The platform that distributes most practitioner-written content quietly upgraded the system that decides what gets read. If your content strategy relied on keyword frequency or fast engagement signals, the game just changed. Writing from real experience is not just the honest approach anymore; it&#8217;s now the algorithmic one.</p></li></ul><ul><li><p><strong>Atlassian cut 1,600 people, mostly R&amp;D engineers, to fund AI tooling. The stock barely moved.</strong> The market has already decided how it values engineering headcount relative to AI tooling spend. This is not an Atlassian story. It&#8217;s a pattern running across Salesforce, Microsoft, Google, and every major enterprise software company. It matters if you work in software R&amp;D.</p></li></ul><ul><li><p><strong>Google picked up the Pentagon&#8217;s AI contract after Anthropic walked away.</strong> Three million government employees now have access to Gemini. Anthropic refused the contract over safety guardrails. Google accepted it. The divergence in how these two companies handle government AI is no longer subtle.</p></li></ul><p><strong>The connecting thread:</strong> Government AI policy is moving on its own timeline, and it is increasingly out of sync with both consumer adoption and private capital deployment. The companies navigating that misalignment best are the ones that will define enterprise AI in 2026 and beyond.</p><div><hr></div><h2>The Take That Started the Week</h2><p>I have been watching the relationship between governments and technology companies for three decades.
There is a pattern, and it has played out the same way every time.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>Phase one: The government has a problem. The tech company has a solution. The government gets excited. Phase two: The government puts conditions on the relationship that the tech company finds unacceptable. The tech company pushes back. Phase three: One side blinks, or they both walk away. Phase four: The technology becomes so mission-critical that the dispute ends on whatever terms the market dictates.</p><p>Anthropic&#8217;s Pentagon situation is in phase two right now. The Pentagon wanted Anthropic to modify Claude&#8217;s safety guardrails to support classified military work. Anthropic said no. The Pentagon said fine, we&#8217;ll use Gemini. One hundred eighty-five thousand people downloaded Claude in the 24 hours that followed.</p><p>That last number is the one that doesn&#8217;t fit the pattern. In every previous cycle (telecommunications, encryption, cloud access), consumer adoption followed the enterprise lead. Governments made decisions and the market followed. This time, consumers did the opposite of what the government signaled.
They ran toward the product the government said to avoid.</p><p>I think this is because AI is the first technology that people use personally before they encounter it professionally. Most practitioners first used Claude or ChatGPT on their own phone, on their own time, for their own curiosity. The trust relationship was already built before any enterprise procurement decision happened. When the Pentagon said stop using it, people who already trusted it didn&#8217;t stop.</p><p>What that means for practitioners is this: the AI vendor you&#8217;re evaluating for enterprise deployment is almost certainly one your team is already using personally. The evaluation process is not objective. It never was. But now the personal adoption path is running so far ahead of the enterprise procurement path that the gap has become a governance problem. Which model is actually running inside the tools you buy? Which vendor&#8217;s API is your vendor calling? The supply chain question for AI is genuinely harder than most teams have answered.</p><p>Anthropic&#8217;s move, refusing the Pentagon contract and picking up the Blackstone deal, is the most interesting strategic positioning I have seen from an AI company this year. They are explicitly betting that private capital markets, not government procurement, are the right distribution channel for a safety-forward AI company. Whether that bet pays off depends on how much enterprise buyers actually care about safety positioning versus capability. Based on the last two years, capability wins every head-to-head. But Anthropic is betting 2026 is different. I am not sure I would take that bet, but I understand why they made it.</p><div><hr></div><h2>Cloud Roundup</h2><h3>AWS</h3><p>The AWS story this week is largely the shadow of the Amazon-OpenAI $50 billion investment announced last month, still reverberating through enterprise procurement conversations.
AWS is positioning Bedrock as the infrastructure layer for running models regardless of which lab wins the model race. The Graviton strategy applied to AI: don&#8217;t win on the model, win on the runtime. That is a defensible moat if the model market commoditizes. The question is whether any single model stays dominant long enough to matter before the next one arrives.</p><p>Secondary: AWS Trainium continues to pick up adoption from customers who want to avoid GPU dependency. No major announcements this week, but the competitive pressure on Nvidia from Amazon&#8217;s custom silicon is a slow-moving story that practitioners in large AWS shops should be watching.</p><h3>Azure</h3><p>Microsoft&#8217;s positioning this week is quietly strong. They have Copilot embedded in Microsoft 365 across the enterprise. They have an Anthropic licensing deal that gives them access to Claude inside Azure. They have the OpenAI relationship. Two model relationships, one distribution channel, one invoice. For enterprise IT leaders who need to justify a single platform decision, Microsoft&#8217;s AI story is the easiest one to tell to a procurement committee.</p><p>The M365 E7 tier at $99 per user per month is the vehicle for that story. If your organization is already paying for M365, the incremental AI cost is increasingly hard to resist, even if the per-seat economics feel steep. Watch for Microsoft to push harder on that conversion in Q2.</p><h3>GCP</h3><p>Google had a significant week. The Wiz acquisition closed: the $32 billion deal that gives Google cross-cloud security visibility across AWS, Azure, and GCP workloads. The strategic intent is clear: own the security layer that enterprise customers need regardless of which cloud provider they&#8217;re on. AWS has GuardDuty. Microsoft has Defender. Google now has Wiz. Security and cloud infrastructure are merging.</p><p>The Pentagon Gemini deployment is the other major development.
Three million government users is not a rounding error. It establishes Google as the enterprise and government AI vendor of record in a way that was not clear before Anthropic walked away. Whether Google can hold that position against a potential Microsoft challenge is the question for Q2.</p><div><hr></div><h2>AI Model Roundup</h2><h3>OpenAI</h3><p>The classified military deal and the Caitlin Kalinowski resignation are the dominant OpenAI stories this week. The hardware lead walking out is a talent signal, not just a policy signal. When a senior technical executive leaves over a values question, the internal debate was real and it was not settled cleanly. ChatGPT uninstalls up 295% is a consumer signal. Those two numbers together, talent exit plus consumer rejection, are worth tracking over the next 90 days.</p><p>The Promptfoo acquisition also closed. OpenAI now owns the tool that 125,000 developers use to red-team AI systems, including OpenAI&#8217;s own models. The conflict of interest critique is valid. The strategic logic is also valid. Both things are true.</p><h3>Anthropic</h3><p>Two things happened simultaneously this week that would have seemed contradictory 12 months ago: Anthropic lost a major government contract and began talks with Blackstone for what could be a much larger private capital deployment.</p><p>Claude Code is at $2.5 billion in annualized revenue. The zero-commission Claude Marketplace is live with six enterprise partners. The Pentagon ban created more consumer downloads than any marketing campaign Anthropic has run. By most measures, the company is in a stronger market position today than it was before the Pentagon dispute started. That is an unusual thing to say about a company that just walked away from a government contract, but the numbers support it.</p><h3>Google AI</h3><p>Gemini is now deployed across Google Workspace for enterprise users. Three million Pentagon employees.
The head-to-head with Microsoft Copilot is no longer theoretical; it is active in millions of enterprise and government seats simultaneously. Google&#8217;s AI distribution story is better than its AI model story, which is exactly the right position to be in if you believe model commoditization is inevitable.</p><div><hr></div><h2>The Pattern I&#8217;m Watching</h2><p>There is a word for what is happening across Anthropic, OpenAI, Microsoft, Google, and now Nvidia this week: vertical integration. Every major player is trying to own more of the stack simultaneously.</p><p>Nvidia goes from chips to models to the agent platform. OpenAI goes from models to security testing tools to classified military deployments. Anthropic goes from models to a marketplace to private capital joint ventures. Google goes from models to workspace distribution to government deployments to cloud security. Microsoft goes from cloud to workspace to model licensing to government contracting.</p><p>I have seen this pattern before. In the late 1990s, every major enterprise software company tried to own the database, the middleware, the application layer, and the consulting services simultaneously. Most of them failed. The ones that survived did so by dominating one layer so thoroughly that the others became defensible territory, not by winning everywhere at once.</p><p>The AI version of this is playing out faster than any previous cycle. The question I keep asking is: which company actually has a monopoly on one layer? Nvidia has the closest thing, GPU training dominance, and they are now voluntarily exiting that monopoly position to compete on models and platforms. That is either the most confident strategic move in tech history or a sign that they know their training moat is shakier than it looks.</p><p>Thirty years in, I have learned that companies that try to own the full stack at the same time usually end up owning none of it well.
The counter-examples are memorable precisely because they are rare. Is any of these companies Apple? I genuinely do not know yet. But that is the question that will define the next five years.</p><div><hr></div><p><em>Weekly AI and cloud breakdowns from someone who&#8217;s been in the game since the early days of the internet. No ads. No filler. The signal.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The 30-Year Pattern: Why I Held a Certification Exam Guide Against My Own System]]></title><description><![CDATA[Every certification I&#8217;ve earned has taught me something.]]></description><link>https://www.techwithdarin.com/p/the-30-year-pattern-why-i-held-a</link><guid isPermaLink="false">https://www.techwithdarin.com/p/the-30-year-pattern-why-i-held-a</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Fri, 13 Mar 2026 13:12:11 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7jGC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc876f3ff-f147-49d8-9c85-a7e80a2f2b21_928x928.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Every certification 
I&#8217;ve earned has taught me something. But not always the thing on the exam.</p>
      <p>
          <a href="https://www.techwithdarin.com/p/the-30-year-pattern-why-i-held-a">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The Week AI Stopped Competing and Started Converging]]></title><description><![CDATA[AWS invested $50B to host OpenAI. GPT-5.4 rated Claude higher than itself. Claude found 22 Firefox bugs. The infrastructure layer won this week.]]></description><link>https://www.techwithdarin.com/p/the-week-ai-stopped-competing-and</link><guid isPermaLink="false">https://www.techwithdarin.com/p/the-week-ai-stopped-competing-and</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sat, 07 Mar 2026 14:02:39 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/cea54567-bc2d-48b6-9908-629d177cb9a6_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1>AI + Cloud &#8212; Week of March 3, 2026</h1><div><hr></div><h2>The Bottom Line (No Jargon Edition)</h2><ul><li><p><strong>AWS is spending $50 billion to run OpenAI&#8217;s software on its computers.</strong> That&#8217;s one of the biggest tech infrastructure deals ever. Think of it like a massive factory being built &#8212; not to make the product, but to house the machines that make the product. AWS wants to be the building where AI lives.</p></li><li><p><strong>OpenAI released a smarter version of its AI assistant (GPT-5.4).</strong> When tested against Anthropic&#8217;s Claude on a real business project, GPT-5.4 honestly admitted Claude did a better job at the first draft. That kind of self-awareness in AI is new and surprisingly useful &#8212; it means the tools are getting better at knowing their own limits.</p></li><li><p><strong>An AI assistant found 22 security holes in Firefox&#8217;s code in two weeks.</strong> The encouraging part: it could barely exploit any of them. Modern security protections stopped it almost every time. 
Translation: AI is about to make software much safer by finding problems faster, while existing defenses still work.</p></li><li><p><strong>Three companies launched AI &#8220;coworker&#8221; products in the same week.</strong> Anthropic, AWS, and Google all moved their AI from answering questions to doing actual work &#8212; scheduling tasks, writing code, managing files. The shift from &#8220;chatbot&#8221; to &#8220;autonomous assistant&#8221; happened faster than anyone expected.</p></li><li><p><strong>Google released a cheaper, faster AI model (Gemini 3.1 Flash-Lite).</strong> At a fraction of the cost of premium models, this gives smaller teams and startups access to capable AI without enterprise budgets. The price of intelligence keeps falling.</p></li></ul><p><strong>The connecting thread:</strong> This was the week AI stopped competing on who&#8217;s smartest and started competing on who&#8217;s most useful. The infrastructure, not the intelligence, is becoming the battleground.</p><div><hr></div><h2>The Take That Started the Week</h2><p>This week I published a piece about something I watched happen in real time: <strong>three companies &#8212; Anthropic, AWS, and Google &#8212; all made the same move within days of each other.</strong> They shifted AI from chatbot to coworker.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>Anthropic launched scheduled tasks in Cowork. AWS shipped Bedrock agents with stateful runtime environments. Google expanded Gemini&#8217;s workspace integrations. All of them moving in the same direction: AI that does work, not just answers questions.</p><p>I&#8217;ve been building and operating systems for over 30 years. The pattern here is identical to what happened with DevOps, then containers, then serverless. The raw capability commoditizes fast. What differentiates teams is the harness &#8212; the constraints, feedback loops, and observability layers that turn raw capability into reliable output.</p><p><strong>The teams already winning with AI agents aren&#8217;t the ones with the best model.</strong> They&#8217;re the ones who built the best control systems around the model. Guardrails that prevent hallucinations from reaching production. Feedback loops that improve output quality over time. Monitoring that catches drift before it becomes a problem.</p><p>This is the control-layer-as-moat thesis I keep coming back to. The model is the engine. The harness is the car. Nobody buys an engine without a car.</p><p>I&#8217;ve watched this exact fork happen with virtualization, containers, cloud, and now AI. Depth wins every time. 
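</p><p>In code, the harness idea is smaller than it sounds. Here is a minimal sketch of the loop: generate, check against a guardrail, feed the failure back, retry. Everything here (the function names, the retry budget, the toy stand-ins) is illustrative, not any vendor&#8217;s API:</p>

```python
MAX_RETRIES = 3  # illustrative budget, not a recommended value

def harness(generate, validate, prompt):
    """Wrap a raw model call (generate) with a guardrail (validate).

    Both callables are placeholders for whatever model client and
    domain-specific checks a real team would plug in.
    """
    for _ in range(MAX_RETRIES):
        output = generate(prompt)
        ok, feedback = validate(output)
        if ok:
            return output  # passed the guardrail; safe to ship downstream
        # Feedback loop: fold the failure back into the next attempt.
        prompt = prompt + "\nFix: " + feedback
    raise RuntimeError("guardrail never passed; escalate to a human")

# Toy stand-ins so the sketch runs without any model API.
def fake_model(prompt):
    return "result v2" if "Fix:" in prompt else "result v1"

def fake_check(output):
    return ("v2" in output, "needs v2")

print(harness(fake_model, fake_check, "draft the SOW"))  # prints: result v2
```

<p>The point is not the dozen lines; it is that the retry budget, the validators, and the escalation path are where teams actually differentiate.</p><p>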
The timeline just keeps compressing.</p><div><hr></div><h2>Cloud Roundup: March 2026</h2><p><strong>AWS</strong> had its biggest infrastructure week in recent memory &#8212; and it wasn&#8217;t a re:Invent.</p><p>The headline: <strong>a $50 billion, multi-year deal to host OpenAI on AWS infrastructure.</strong> Initial commitment is $15 billion. The practical impact for practitioners is already landing &#8212; Amazon Bedrock now has a Stateful Runtime Environment and an OpenAI-compatible Projects API, bringing better context management, access control, and cost tracking.</p><p>This is the Graviton playbook applied to AI. Own the substrate. Make the workloads sticky. AWS isn&#8217;t building frontier models &#8212; they&#8217;re building the platform where everyone else&#8217;s models run. Same strategy, different decade.</p><p>Also worth flagging: MediaConvert&#8217;s Probe API hit GA (rapid metadata analysis without full processing &#8212; useful for video pipelines), and AppConfig&#8217;s New Relic integration now enables automated feature flag rollbacks in seconds instead of minutes. Both are the kind of quiet operational upgrades that add up.</p><div><hr></div><p><strong>Azure</strong> had a quiet first week of March. No major GA releases or pricing changes hit the practitioner radar. Sometimes the most useful thing to report is that nothing broke and nothing changed &#8212; stability has value too.</p><div><hr></div><p><strong>GCP</strong> was similarly quiet this week on the infrastructure side. The bigger Google news was on the AI model side (see below).</p><div><hr></div><h2>AI Model Roundup: March 2026</h2><p><strong>OpenAI</strong> shipped GPT-5.4 Instant on March 4 &#8212; better conversational flow, improved web search integration, and notably fewer refusals. 
The model is more direct, which matters for production workflows where over-cautious responses slow down real work.</p><p>But the story I&#8217;m most interested in isn&#8217;t the benchmark improvement. I tested GPT-5.4 against Claude Opus 4.6 on an actual client proposal &#8212; not a toy task, a real business deliverable built from rough meeting notes. Then I asked GPT-5.4 to score both outputs honestly.</p><p><strong>GPT-5.4&#8217;s self-assessment:</strong> Claude&#8217;s first draft: 8.5/10. Its own: 7/10. But as a foundation for the final SOW? GPT-5.4 rated itself 8.5 to Claude&#8217;s 8.</p><p>GPT-5.4&#8217;s own words: &#8220;Your draft is the better first draft. It reads like something a human would actually circulate.&#8221; But it also noted: &#8220;My version was weaker as a first draft, but stronger as a don&#8217;t-miss-anything scaffold.&#8221;</p><p><strong>That calibration is new.</strong> Earlier GPT versions would have rated themselves higher. The willingness to honestly assess relative strengths is a more important capability improvement than any benchmark delta. It means you can actually trust the model&#8217;s self-evaluation when deciding which tool to use for which stage of the work.</p><div><hr></div><p><strong>Anthropic</strong> had a week that demonstrated range.</p><p>On the product side: <strong>Cowork launched scheduled tasks</strong> &#8212; browser-based AI that runs on a recurring schedule without human intervention. I&#8217;ve been using it to automate my entire daily content pipeline: four stages from 5 AM research to 7 PM engagement. The coupling of scheduled automation with browser context is genuinely new.</p><p>On the security side: <strong>Anthropic partnered with Mozilla to test Claude against Firefox&#8217;s codebase.</strong> In two weeks, Claude found 22 vulnerabilities (14 high-severity) &#8212; nearly one-fifth of all high-severity Firefox bugs fixed in 2025. 
The first one was found in 20 minutes.</p><p>But here&#8217;s the nuance that matters: despite finding 22 bugs, Claude could only exploit 2 of them &#8212; and only in test environments without browser sandboxing. Anthropic spent $4K in API credits on exploitation attempts. The defender&#8217;s advantage is real: AI finds vulnerabilities much faster than it can exploit them. Defense-in-depth works. That&#8217;s the most important finding in this research.</p><div><hr></div><p><strong>Google AI</strong> released Gemini 3.1 Flash-Lite on March 4 &#8212; a cost-optimized model at $0.25/M input tokens and $1.50/M output tokens. This is Google&#8217;s play for the high-volume, cost-sensitive workloads that can&#8217;t justify premium model pricing.</p><p>The pricing strategy is clear: make the entry point so cheap that teams default to Google for their bulk inference needs, then upsell to Pro for the complex tasks. It&#8217;s the classic cloud pricing playbook &#8212; free tier hooks, volume tier retains.</p><div><hr></div><h2>The Pattern I&#8217;m Watching</h2><p>Three signals from this week all point the same direction:</p><ol><li><p>AWS invested $50 billion not to build AI models, but to host them.</p></li><li><p>GPT-5.4 honestly scored Claude higher than itself on a first-draft task.</p></li><li><p>Claude found vulnerabilities in Firefox&#8217;s codebase faster than any human team could &#8212; but couldn&#8217;t exploit them.</p></li></ol><p><strong>The model layer is commoditizing.</strong> When GPT rates Claude higher on some tasks and Claude rates GPT higher on others, the question &#8220;which model is best?&#8221; loses meaning. The answer is always &#8220;it depends on the task.&#8221;</p><p><strong>The infrastructure layer is concentrating value.</strong> AWS hosting OpenAI is the same signal as AWS hosting Anthropic. 
The platform that runs everything wins regardless of which model wins.</p><p><strong>The security and operations layers are becoming the differentiator.</strong> Claude finding Firefox bugs in 20 minutes but failing to exploit them is a preview of what AI-accelerated security looks like. The teams with the best patching velocity, the best observability, and the best control planes will outperform the teams with the best models.</p><p>Same pattern. Different decade. The infrastructure always wins &#8212; it just takes a cycle for everyone to notice.</p><p>What&#8217;s your current strategy &#8212; are you picking models, or building platforms?</p><p>Hit reply and tell me. I read every response.</p><p>&#8212; Darin</p><div><hr></div><p><em>Weekly AI and cloud breakdowns from someone who&#8217;s been in the game since the early days of the internet. No ads. No filler. Just the signal.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[This Week in AI + Cloud: Your Experience Is the Advantage, Not the Liability]]></title><description><![CDATA[AI distillation attacks, broken benchmarks, cloud updates from AWS/Azure/GCP, and why your years of experience are the one thing AI can't replace. Weekly signal]]></description><link>https://www.techwithdarin.com/p/this-week-in-ai-cloud-your-experience</link><guid isPermaLink="false">https://www.techwithdarin.com/p/this-week-in-ai-cloud-your-experience</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sat, 28 Feb 2026 03:48:54 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/3c50ba4a-cc10-4d6a-abe8-afaf2fa7de5e_1200x630.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>The Bottom Line (No Jargon Edition)</h2><p>If you only read one section, read this. Here&#8217;s what happened this week in plain English:</p><ul><li><p><strong>Your experience is your superpower, not your weakness.</strong> AI tools can write code and generate content fast, but they don&#8217;t know what &#8220;good&#8221; looks like in your field. If you&#8217;ve been doing your job for 10, 20, or 30 years, you have exactly the judgment AI lacks. Don&#8217;t be afraid of it &#8212; learn to use it. You&#8217;ll be more valuable, not less.</p></li><li><p><strong>Three Chinese companies got caught copying one of the biggest AI models.</strong> They ran millions of fake conversations with Anthropic&#8217;s Claude to steal its intelligence. 
Nobody broke in &#8212; they just used the product at massive scale through fake accounts. It&#8217;s like photocopying an entire library one page at a time. The takeaway: protecting AI isn&#8217;t just about locks and firewalls anymore.</p></li><li><p><strong>A popular AI scorecard turned out to be broken.</strong> The test that companies used to prove their AI could write code? Over half the test cases were flawed, and the AI models had basically memorized the answers. So those impressive scores you keep seeing? Take them with a grain of salt. Ask: was this a private test, or could the AI have studied the answer key?</p></li><li><p><strong>The big three clouds (AWS, Azure, Google) all made it easier to build with AI this week.</strong> AWS lets AI agents take real actions now (not just chat). Azure added Anthropic&#8217;s best model to its data platform. Google kept simplifying its tools so you spend less time configuring and more time building.</p></li><li><p><strong>OpenAI is no longer exclusive to Microsoft.</strong> Their agent-building platform is coming to Amazon&#8217;s cloud too. That means you&#8217;ll have more choices about where to run AI tools &#8212; and that&#8217;s good for everyone.</p></li></ul><p><strong>The thread connecting all of it:</strong> In a world flooded with AI-generated everything, the real value is in knowing what&#8217;s actually good, what&#8217;s actually true, and what actually works. That&#8217;s human judgment. And it&#8217;s not going anywhere.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h2>The Take That Started the Week</h2><p>This week I wrote something that hit a nerve: <strong>the comfort zone has a cost. You just don&#8217;t see the invoice until it&#8217;s too late.</strong></p><p>I talk to engineers every week who are genuinely afraid AI is going to replace them. And I get it &#8212; the headlines are engineered to scare you. But after watching 30 years of tech shifts play out, I can tell you: fear is the wrong operating system for what&#8217;s actually happening.</p><p>Virtualization was supposed to replace sysadmins. Cloud was going to eliminate infrastructure teams. DevOps would make ops engineers obsolete. I was there for all three. None of it played out the way the fearmongers predicted. What actually happened: the roles changed, the people who adapted early thrived, and the ones who froze got left behind. AI is the same pattern. Faster timeline, same playbook.</p><p><strong>Here&#8217;s the part I really want to land with this audience:</strong></p><p>If you have 10, 20, 30 years in your field, you don&#8217;t have a disadvantage. You have a massive advantage that most people haven&#8217;t recognized yet. AI generates, but it doesn&#8217;t judge. It produces output, but it doesn&#8217;t know what good looks like in your domain. You do. That judgment &#8212; knowing which output is right, which approach fits, which edge cases will bite you in production &#8212; that&#8217;s the part AI can&#8217;t replicate.
And it&#8217;s the part that only comes from years of doing the work.</p><p>A junior engineer with AI is fast but unfiltered. A senior engineer with AI is a force multiplier. Your experience isn&#8217;t the thing being replaced &#8212; it&#8217;s the thing that makes AI actually useful. You just have to harness it.</p><p>I laid out three paths I&#8217;m watching emerge: the Orchestrator (manage agents, define outcomes), the Systems Builder (build the infrastructure agents run on), and the Domain Translator (combine deep industry expertise with AI tools to build things nobody else can). None of them require you to be an AI researcher. All of them require you to start.</p><div><hr></div><h2>Also This Week: Two Stories That Should Change How You Evaluate AI</h2><p><strong>Anthropic caught three Chinese AI labs distilling Claude at industrial scale.</strong></p><p>DeepSeek ran 150,000+ exchanges targeting reasoning. Moonshot AI hit 3.4 million+ targeting tool use, coding, and computer vision. MiniMax &#8212; the largest &#8212; ran 13 million+ exchanges focused on agentic coding. Total: 16 million+ exchanges across 24,000 fraudulent accounts running through commercial proxy services.</p><p>Nobody hacked anything. They used the API exactly as designed, at massive scale, through fake identities. The attack surface was the product itself.</p><p>Anthropic&#8217;s response included behavioral fingerprinting classifiers, strengthened verification, and countermeasures at the product, API, and model levels. But the bigger takeaway isn&#8217;t about Anthropic&#8217;s defenses. It&#8217;s that the AI moat isn&#8217;t the model &#8212; it&#8217;s the control system around it. Export controls on chips don&#8217;t work when knowledge flows out through the API. This pattern will play out across every major lab.</p><p><strong>OpenAI published why they stopped using SWE-bench Verified.</strong></p><p>They audited 27.6% of the dataset. 
Of those, 59.4% had flawed test cases that rejected correct code. 35.5% enforced implementation details never mentioned in the task. Worse: every frontier model they tested could reproduce the original human-written solutions verbatim. The models had memorized the answers. Scores climbed from 74.9% to 80.9% in six months. The capability didn&#8217;t improve &#8212; the benchmark got gamed.</p><p>Classic Goodhart&#8217;s Law applied to AI. When a measure becomes a target, it stops being useful.</p><p>OpenAI now recommends SWE-bench Pro and built their own private benchmark called GDPVal. The shift to private evaluation is the real signal. If someone shows you a benchmark score from a public dataset, the first question should be: is the test private? If not, you might be comparing memorization.</p><div><hr></div><h2>Cloud Roundup: Late February 2026</h2><p><strong>AWS</strong> had a quieter week by recent standards, but one update matters.</p><p><strong>Amazon Bedrock now supports server-side tool execution via AgentCore</strong> &#8212; secure actions like web search and database updates, executed server-side within the Bedrock environment. If you&#8217;re building AI agents on AWS, this is the piece that lets agents actually do things without you managing the tool execution infrastructure yourself. Also: EKS Node Monitoring Agent went open source (community contributions welcome), and Deadline Cloud added task chunking for better rendering throughput.</p><div><hr></div><p><strong>Azure</strong> landed a notable model addition.</p><p><strong>Claude Opus 4.6 is now on Azure Databricks</strong> (as of Feb 26). Serverless Workspaces for Databricks hit GA. WAF Default Ruleset 2.2 is now the standard for Application Gateway &#8212; update your configs. Also flagged: the DHE cipher suite retirement hits Azure Front Door and CDN on April 1. 
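</p><p>A quick local sanity check before that date: ask your own TLS stack which DHE suites it still offers. This is a hedged sketch using Python&#8217;s standard ssl module; it only inspects the client side and says nothing about Front Door&#8217;s configuration itself:</p>

```python
import ssl

# List the DHE/EDH cipher suites the local OpenSSL build still offers.
# If clients in your fleet pin any of these, test them against a staging
# endpoint before the Front Door / CDN retirement lands.
ctx = ssl.create_default_context()
dhe_suites = sorted({c["name"] for c in ctx.get_ciphers()
                     if c["name"].startswith(("DHE", "EDH"))})
print(dhe_suites)  # an empty list means nothing to migrate on this client
```

<p>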
Start planning now if you&#8217;re affected.</p><p>The Databricks play is interesting &#8212; Azure is positioning itself as the neutral platform where you can access any model through the analytics stack, not just through the core AI services.</p><div><hr></div><p><strong>GCP</strong> focused on operational improvements.</p><p>AlloyDB now integrates with Database Center for prioritized health monitoring &#8212; one-click navigation to recommended fixes. Composer deployments generate Airflow v3-compatible DAGs, which means the Airflow v2 end-of-life migration just got a clear path. API Hub got a specification boost preview that improves documentation quality automatically.</p><p>Google&#8217;s pattern continues: reduce friction, improve defaults, make the platform disappear so teams focus on building.</p><div><hr></div><h2>AI Model Roundup: Late February 2026</h2><p><strong>OpenAI</strong> made a strategic move that&#8217;s bigger than any model release: <strong>Frontier is coming to AWS.</strong></p><p>OpenAI&#8217;s no-code agent platform &#8212; build, deploy, and manage AI agents &#8212; will be hosted on Amazon&#8217;s infrastructure alongside Azure. This is the first real crack in the Microsoft-OpenAI exclusivity narrative. Microsoft still retains exclusive IP rights, but the compute layer is diversifying. For practitioners, this means your cloud choice may stop being an AI provider choice. That&#8217;s a good thing.</p><p>Also: $285B valuation after a $1B Thrive Capital investment, and multi-year alliances with BCG, McKinsey, Accenture, and Capgemini for enterprise adoption. OpenAI is building the consulting channel. The enterprise sales motion is accelerating.</p><div><hr></div><p><strong>Anthropic</strong> released RSP 3.0 on February 24 &#8212; updated safety protocols addressing misalignment risks. Government deployments were confirmed in classified environments, with restrictions on firms linked to foreign adversaries. 
And of course, the distillation attacks disclosure dominated the conversation (covered above).</p><p>The pattern I&#8217;m seeing from Anthropic this month: security and trust as competitive differentiators. While other labs race on capabilities, Anthropic is racing on the control layer. That&#8217;s consistent with their positioning from day one &#8212; and the distillation disclosure is evidence that the threats they&#8217;ve been planning for are now real.</p><div><hr></div><p><strong>Google AI</strong> shipped Gemini 3.1 Flash image generation with real-time web knowledge and consistent character appearance. Android task automation expanded to multi-step actions through Uber, Lyft, and DoorDash. Lyria 3 now generates 30-second music tracks from text prompts.</p><p>The consumer play is aggressive. Google is embedding Gemini into every surface &#8212; phone, browser, workspace, photos. The practitioner signal: if you&#8217;re building on Google&#8217;s stack, the AI primitives are showing up everywhere. The question isn&#8217;t &#8220;should we use AI&#8221; &#8212; it&#8217;s &#8220;which layer do we integrate with.&#8221;</p><div><hr></div><h2>The Pattern I&#8217;m Watching</h2><p>Three themes collided this week, and they&#8217;re all connected.</p><p><strong>Theme 1: The benchmark trust crisis.</strong> SWE-bench Verified just became the poster child for Goodhart&#8217;s Law in AI. Public benchmarks are getting gamed. Labs are shifting to private evaluation. The implication: we&#8217;re entering a period where you can&#8217;t compare AI tools by published scores alone. Hands-on evaluation is the only evaluation that counts.</p><p><strong>Theme 2: The model security arms race.</strong> Anthropic&#8217;s distillation disclosure proves that AI capabilities are now a target &#8212; not just the infrastructure that runs them. The moat isn&#8217;t the model. It&#8217;s the detection, verification, and control systems around it. Every lab will face this. 
The ones who invested in security early will have the advantage.</p><p><strong>Theme 3: Experience as competitive advantage.</strong> In a world where AI handles the generation and junior-level execution, the premium shifts to judgment. Knowing what good looks like. Knowing which edge cases matter. Knowing when the AI&#8217;s output looks right but isn&#8217;t. That&#8217;s 10, 20, 30 years of pattern recognition &#8212; and it&#8217;s exactly what AI can&#8217;t replicate.</p><p>These three themes are the same theme. In a world flooded with generated output &#8212; code, benchmarks, model capabilities, content &#8212; the value moves to judgment, verification, and trust. The people and organizations that can separate signal from noise will win.</p><p>That&#8217;s been true for every tech cycle I&#8217;ve watched. It&#8217;s just never been this obvious.</p><p>What&#8217;s your take &#8212; are you seeing the same pattern in your world?</p><p>Hit reply and tell me. I read every response.</p><p>&#8212; Darin</p><div><hr></div><p><em>Weekly AI and cloud breakdowns from someone who&#8217;s been in the game since the early days of the internet. No ads. No filler. Just the signal.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[This Week in AI + Cloud: The Developer Career Fork, AWS/Azure/GCP, and a Benchmark That Changes Everything]]></title><description><![CDATA[AI + Cloud &#8212; Week of February 22, 2026]]></description><link>https://www.techwithdarin.com/p/this-week-in-ai-cloud-the-developer</link><guid isPermaLink="false">https://www.techwithdarin.com/p/this-week-in-ai-cloud-the-developer</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sun, 22 Feb 2026 11:28:33 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0af99562-0a3d-4946-8628-2dec5f6f9dff_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><strong>AI + Cloud &#8212; Week of February 22, 2026</strong></h2><div><hr></div><h2><strong>The Take That Started the Week</strong></h2><p>This week I published a piece I&#8217;ve been sitting on for a while: <strong>AI has forked the developer career into three tracks.</strong></p><p>Not killed it. Forked it.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h3><strong>Track 1 &#8212; The Orchestrator</strong></h3><p>Writes specs and manages agents. Nobody on the team is touching code &#8212; the agents write, test, and ship. The humans write the specs and review the results.</p><p>The unit of work is no longer the instruction. It&#8217;s the token &#8212; a unit of purchased intelligence.</p><h3><strong>Track 2 &#8212; The Systems Builder</strong></h3><p>Builds what the orchestrators use: agent frameworks, eval pipelines, routing layers that send the right task to the right model at the right cost.</p><p>This is where 30 years of infrastructure experience pays off. High bar. High ceiling.</p><h3><strong>Track 3 &#8212; The Domain Translator</strong></h3><p>The one nobody&#8217;s talking about.</p><p>Technical fluency + deep domain expertise = build tools instead of just using them. The dental practice specialist. The construction scheduling expert. The insurance compliance analyst who can now ship software.</p><p>These people are becoming developers &#8212; without CS degrees or bootcamps.</p><p><strong>The person most exposed right now:</strong> the competent coder in the middle. Solid code. No deep systems expertise. No deep domain expertise. Generic code production value is going to zero.</p><p>I&#8217;ve seen this exact pattern with DevOps, cloud, and containers. 
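</p><p>Track 2&#8217;s routing layer sounds abstract, so here is the smallest possible sketch of the idea: pick the cheapest model that clears the task&#8217;s capability bar. The model names and prices are invented placeholders, not anyone&#8217;s real price sheet:</p>

```python
# Hypothetical catalog: name -> (cost per 1M tokens, capability tier).
CATALOG = {
    "small-fast": (0.25, 1),
    "mid-general": (3.00, 2),
    "big-reasoning": (15.00, 3),
}

def route(required_tier):
    """Return the cheapest catalog model that meets the required tier."""
    eligible = [(cost, name) for name, (cost, tier) in CATALOG.items()
                if tier >= required_tier]
    if not eligible:
        raise ValueError("no model meets this tier")
    return min(eligible)[1]  # min by cost first, then by name

print(route(1))  # prints: small-fast (the cheapest model clears the bar)
print(route(3))  # prints: big-reasoning (only the top tier qualifies)
```

<p>Real routers add token estimates, latency budgets, and fallbacks, but the shape stays this simple.</p><p>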
Depth wins every time.</p><p>The difference now is the timeline &#8212; this fork is happening in months, not years.</p><div><hr></div><h2><strong>Cloud Roundup: February 2026</strong></h2><h3><strong>AWS</strong></h3><p><strong>AWS</strong> had one of its stronger February drops in recent memory. Two highlights worth your attention:</p><ul><li><p><strong>Claude Opus 4.6 is now in Amazon Bedrock.</strong> The most powerful model currently available is now natively inside AWS. If you&#8217;re building AI apps on AWS and still stitching together third-party APIs, your architecture just got simpler.</p></li><li><p><strong>EC2 G7e instances with NVIDIA Blackwell GPUs.</strong> Up to 2.3x inference performance over the previous generation. LLM and multimodal workloads just got significantly cheaper to run at scale.</p></li></ul><p>Also worth flagging:</p><ul><li><p>DynamoDB now supports cross-account global tables (big for multi-tenant architectures)</p></li><li><p>ECS gets native canary and linear deployments via NLB</p></li><li><p>Aurora DSQL dropped SDK connectors for Go, Python, and Node.js with IAM auth auto-handled</p></li><li><p>Network Firewall dropped data processing charges for TLS Advanced Inspection</p></li></ul><p>Security upgrade that also cuts the bill &#8212; those rarely come together.</p><div><hr></div><h3><strong>Azure</strong></h3><p><strong>Azure</strong> had a quieter but practical month:</p><ul><li><p>New AMD Turin + Intel Xeon 6 VM families (Dasv7, Easv7, Fasv7) are now GA &#8212; better price-performance across general purpose, memory-optimized, and compute-optimized workloads.</p></li><li><p>AKS gets LocalDNS (lower latency inside clusters) and auto encryption-at-host &#8212; two fewer things to configure manually.</p></li><li><p>Azure Functions adds .NET 10 runtime support.</p></li><li><p><strong>Claude Sonnet 4.6 is now on Azure AI.</strong> Both major clouds now have Anthropic&#8217;s latest model available.
The hyperscaler AI integration race is real &#8212; and the developer wins either way.</p></li></ul><div><hr></div><h3><strong>GCP</strong></h3><p><strong>GCP</strong> had one genuinely remarkable update: <strong>Gemini 3.1 Pro with a 1 million token context window.</strong></p><p>One million tokens means entire codebases, full legal document sets, complete video transcripts &#8212; all processed in a single inference call.</p><p>That&#8217;s not incremental. It changes what&#8217;s architecturally possible.</p><p>Also landed:</p><ul><li><p>GKE now auto-selects between Persistent Disk and Hyperdisk based on hardware compatibility (no more manual pairing or complex scheduling rules)</p></li><li><p>Cloud SQL adds brute-force attack detection baked in by default</p></li><li><p>OpenAPI v3 support for API Gateway is now GA</p></li><li><p>AlloyDB integrates with Database Center for one-click health remediation</p></li></ul><p>Google&#8217;s pattern this month: reduce the operational burden everywhere so teams can focus on what they&#8217;re actually building.</p><div><hr></div><h2><strong>AI Model Roundup: February 2026</strong></h2><h3><strong>OpenAI</strong></h3><p><strong>OpenAI</strong> shipped GPT-5.2 and retired GPT-4o, GPT-4.1, and o4-mini from ChatGPT in the same month.</p><p>That pattern &#8212; accelerate and consolidate simultaneously &#8212; is something I&#8217;ve been watching play out every quarter now.</p><p><strong>Practical implication:</strong> if your team has workflows, prompts, or evals tuned to any of those retired models, February is a good time to audit what you&#8217;re actually calling. 
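One way to start that audit, sketched below: a quick scan of a repo for retired model names. This is a hypothetical helper, not an OpenAI tool; the model list and file extensions are assumptions, so adjust them for your own stack.

```python
import re
from pathlib import Path

# Models retired from ChatGPT this month (per the roundup above).
RETIRED_MODELS = ["gpt-4o", "gpt-4.1", "o4-mini"]
# File types likely to hold prompts, configs, or eval definitions (an assumption).
SCAN_SUFFIXES = {".py", ".ts", ".json", ".yaml", ".yml", ".md"}

def find_retired_refs(root: str = ".") -> list:
    """Return (file, model) pairs for every retired-model mention under root."""
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix not in SCAN_SUFFIXES:
            continue
        text = path.read_text(errors="ignore")
        for model in RETIRED_MODELS:
            # Case-insensitive match; re.escape keeps the "." in gpt-4.1 literal.
            if re.search(re.escape(model), text, re.IGNORECASE):
                hits.append((str(path), model))
    return hits
```

Point it at the repo that holds your prompts and configs; every hit is a workflow worth re-testing before the surface underneath it moves.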
The API versions aren&#8217;t changing yet, but the ChatGPT surface is moving on.</p><p>Also shipped:</p><ul><li><p>Lockdown Mode for enterprise security (data exfiltration protections, better admin oversight)</p></li><li><p>File attachments bumped to 20 per message</p></li><li><p>Code Blocks with a proper IDE experience inside ChatGPT</p></li></ul><div><hr></div><h3><strong>Anthropic</strong></h3><p><strong>Anthropic</strong> shipped two major models in 12 days: Claude Opus 4.6 on February 5, and Claude Sonnet 4.6 on February 17.</p><p>That release velocity is a signal about where the company is operating right now.</p><p>The updates I&#8217;m watching most closely:</p><ul><li><p><strong>Claude Code is now included in every Team plan.</strong> Previously an add-on. The barrier to AI-assisted coding just disappeared for a lot of teams.</p></li><li><p><strong>HIPAA-ready Claude for Enterprise.</strong> Healthcare AI just got a credible, enterprise-grade option.</p></li><li><p><strong>Apple Xcode 26.3 integrates the Claude Agent SDK.</strong> The agentic coding wave is hitting every major IDE.</p></li><li><p><strong>Permanently ad-free &#8212; official.</strong> Anthropic made it explicit: no ads, ever. Their reasoning: advertising incentives fundamentally conflict with building a genuinely helpful assistant.</p></li></ul><p>Business model shapes product behavior. That positioning choice matters more than it sounds in a market where every free tier is hunting for monetization.</p><div><hr></div><h3><strong>Google AI</strong></h3><p><strong>Google AI</strong> had one number dominate the conversation: <strong>77.1% on ARC-AGI-2.</strong></p><p>That&#8217;s more than double the reasoning performance of Gemini 3 Pro.</p><p>ARC-AGI-2 is one of the harder benchmarks for measuring general reasoning &#8212; not just pattern matching. 
Hitting 77% would have been unimaginable two years ago.</p><p>Also:</p><ul><li><p>Gemini 3 Flash + Pro moved from preview to GA in AI Studio</p></li><li><p>Gemini 3.1 Pro is free to use during the preview period &#8212; classic developer adoption strategy</p></li><li><p>Workspace AI now has an Expanded Access add-on, with Gemini usage metrics available in the Admin console</p></li></ul><p>If you&#8217;re trying to build a business case for AI investment at your org, that admin visibility feature is worth a closer look.</p><p>The question isn&#8217;t whether AI is being used &#8212; it&#8217;s whether you can measure it.</p><div><hr></div><h2><strong>The Pattern I&#8217;m Watching</strong></h2><p>One thing stands out across all six of these companies this month:</p><p>Every cloud is racing to be the platform where you run your AI.</p><p>Every AI lab is racing to make their model available on every cloud.</p><p>AWS has Claude. Azure has Claude. GCP has Gemini. All of them will have everything within a year.</p><p>The winner of this race will probably be whoever makes the integration seamless enough that teams stop thinking about it as a separate decision.</p><p>Right now, it&#8217;s still a decision. That window is closing.</p><p><strong>Which platform are you building on &#8212; and has that choice gotten harder or easier in the last six months?</strong></p><p>Hit reply or post a comment and tell me. I read every response.</p><p>&#8212; Darin</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Lockdown Mode, Billable Agents, and the Cost of Autonomy]]></title><description><![CDATA[Between February 10&#8211;15, three signals landed that matter if you&#8217;re building or buying agents in production:]]></description><link>https://www.techwithdarin.com/p/lockdown-mode-billable-agents-and</link><guid isPermaLink="false">https://www.techwithdarin.com/p/lockdown-mode-billable-agents-and</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Fri, 20 Feb 2026 05:11:30 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/1581e23f-a808-4e3d-8f7e-96602fa73b49_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Between February 10&#8211;15, three signals landed that matter if you&#8217;re building or buying agents in production:</p><ul><li><p>OpenAI introduced <strong>Lockdown Mode</strong> and <strong>Elevated Risk labels</strong> inside ChatGPT.</p></li><li><p>Google Cloud pushed <strong>Vertex AI Agent Engine</strong> capabilities into GA &#8212; including billable code execution, sessions, and memory.</p></li><li><p>Amazon Web Services expanded model choice in Bedrock with <strong>Claude Sonnet 4.6</strong> from Anthropic.</p></li></ul><p>This isn&#8217;t about model benchmarks.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div 
class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>It&#8217;s about <strong>control, cost, and production posture</strong>.</p><div><hr></div><h2><strong>1. OpenAI: Agents Now Ship with Guardrails</strong></h2><p>With Lockdown Mode, browsing and network-exposed tools can be restricted to prevent prompt injection and tool abuse. Elevated Risk labels surface contextual warnings before certain capabilities are used.</p><p>Translation for enterprise:</p><ul><li><p>Agent autonomy is no longer &#8220;just trust the prompt.&#8221;</p></li><li><p>Risk posture becomes visible and configurable.</p></li><li><p>Admins can constrain behavior without killing capability entirely.</p></li></ul><p>This is a shift from &#8220;powerful model&#8221; to <strong>managed execution environment</strong>.</p><p>If you&#8217;re running document ingestion, compliance extraction, financial analysis, or anything with external tool calls &#8212; this matters.</p><div><hr></div><h2><strong>2. 
Google: Agent Runtime Is Now Infrastructure</strong></h2><p>Google Cloud moved Code Execution, Sessions, and Memory Bank in Vertex AI Agent Engine to GA.</p><p>And importantly:</p><p>They&#8217;re billable.</p><p>That means:</p><ul><li><p>Session state persistence costs money.</p></li><li><p>Sandbox code execution costs money.</p></li><li><p>Memory storage costs money.</p></li><li><p>Agent loops now show up in your cloud bill.</p></li></ul><p>For teams used to &#8220;model token cost&#8221; as the primary driver &#8212; this is the next wave of FinOps for AI.</p><p>You&#8217;re not just paying for tokens.</p><p>You&#8217;re paying for <strong>runtime behavior</strong>.</p><div><hr></div><h2><strong>3. AWS: Model Optionality Is the Strategy</strong></h2><p>Amazon Web Services added Claude Sonnet 4.6 to Bedrock.</p><p>This continues AWS&#8217;s strategy:</p><ul><li><p>Provide multiple frontier models.</p></li><li><p>Let customers benchmark inside their VPC.</p></li><li><p>Keep control plane + data residency consistent.</p></li></ul><p>For enterprise buyers, this matters more than leaderboard scores.</p><p>Optionality + governance + isolation = leverage.</p><div><hr></div><h2><strong>Here&#8217;s the throughline</strong></h2><p>This isn&#8217;t just feature shipping. 
It&#8217;s a shift in posture.</p><p><strong>Safety</strong> used to live in policy docs and internal guidelines.</p><p>Now it&#8217;s enforced at runtime with configurable controls.</p><p><strong>Cost</strong> used to mean tokens.</p><p>Now it means tokens <strong>plus</strong> compute, memory, and session persistence.</p><p><strong>Autonomy</strong> used to be prompt-level intelligence.</p><p>Now it&#8217;s managed execution inside a governed environment.</p><p><strong>Procurement</strong> used to be model comparison.</p><p>Now it&#8217;s platform + runtime evaluation.</p><div><hr></div><p>The industry conversation has quietly moved from:</p><blockquote><p>&#8220;Which model writes better code?&#8221;</p></blockquote><p>To:</p><blockquote><p>&#8220;Which environment can safely and predictably finish work?&#8221;</p></blockquote><p>That&#8217;s a very different buying decision.</p><div><hr></div><h1><strong>What I&#8217;d Do This Week</strong></h1><p>If you&#8217;re serious about agents in production, don&#8217;t debate Twitter takes.</p><p>Run a structured benchmark.</p><p>Example:</p><p><strong>Pipeline</strong></p><ul><li><p>Document ingestion</p></li><li><p>Structured extraction</p></li><li><p>JSON schema enforcement</p></li><li><p>Compliance tagging</p></li></ul><p>Deploy on:</p><ul><li><p>Bedrock with Claude Sonnet 4.6</p></li><li><p>Vertex AI Agent Engine with sessions enabled</p></li><li><p>OpenAI with Lockdown Mode toggled on/off</p></li></ul><p>Track:</p><ol><li><p>Task success rate</p></li><li><p>Median end-to-end latency</p></li><li><p>Cost per task (including runtime)</p></li><li><p>Failure mode type (hallucination vs tool misuse vs timeout)</p></li></ol><p>Because in 2026, the decision isn&#8217;t just model quality.</p><p>It&#8217;s:</p><ul><li><p>Can I constrain it?</p></li><li><p>Can I observe it?</p></li><li><p>Can I predict the bill?</p></li></ul><div><hr></div><h1><strong>The Bigger Pattern</strong></h1><p>We&#8217;ve moved from:</p><p>Models 
&#8594; Agents &#8594; Agent Platforms &#8594; <strong>Agent Governance</strong></p><p>And the moat isn&#8217;t raw capability anymore.</p><p>It&#8217;s control surfaces.</p><p>Safety flags.</p><p>Session billing.</p><p>Model optionality.</p><p>Isolation boundaries.</p><p>The vendors are telling you something very clearly:</p><p>Agents are not a toy layer.</p><p>They&#8217;re infrastructure now.</p><p>If you treat them like a chatbot feature, your architecture will lag.</p><p>If you treat them like a distributed system with risk and cost controls, you&#8217;ll win.</p>]]></content:encoded></item><item><title><![CDATA[The Entry-Level Job is Gone. 
Here&#8217;s the new one (2026)]]></title><description><![CDATA[If you&#8217;re graduating into tech right now &#8212; or you&#8217;re early-career and feeling weirdly behind &#8212; you&#8217;re not imagining it.]]></description><link>https://www.techwithdarin.com/p/the-entry-level-job-is-gone-heres</link><guid isPermaLink="false">https://www.techwithdarin.com/p/the-entry-level-job-is-gone-heres</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sun, 15 Feb 2026 02:32:04 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!4B2V!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff69ac3d5-f261-4e0f-9c92-ca59f685d498_1024x1024.png" length="0" type="image/png"/><content:encoded><![CDATA[<p>If you&#8217;re graduating into tech right now &#8212; or you&#8217;re early-career and feeling weirdly behind &#8212; you&#8217;re not imagining it.</p><p>The entry-level bar moved.</p><p>Not because you got worse.</p><p>Because a lot of what used to be junior work is now handled by AI + a senior engineer reviewing it.</p><p>So the market is doing what markets always do: it stops paying for what&#8217;s abundant.</p><p>Code is abundant now.</p><p>Which means the thing that gets you hired in 2026 isn&#8217;t &#8220;I can code.&#8221;</p><p>It&#8217;s:</p><p>Can you be trusted to ship?</p><p>That&#8217;s the new job.</p><p>And the fastest way to win is to stop training for the old one.</p><p><strong>The new competition: &#8220;AI + a senior&#8221; (not other grads)</strong></p><p>Here&#8217;s 
the mental model that changes everything:</p><p>You&#8217;re not being compared to a senior engineer.</p><p>You&#8217;re being compared to:</p><ul><li><p>an AI that can generate 80% of a feature in minutes</p></li><li><p>plus a senior who knows what &#8220;good&#8221; looks like</p></li></ul><p>So &#8220;I can type code fast&#8221; isn&#8217;t a differentiator anymore.</p><p>Your differentiator is your ability to turn AI output into something:</p><ul><li><p>correct</p></li><li><p>secure</p></li><li><p>reliable</p></li><li><p>deployable</p></li><li><p>explainable</p></li></ul><p>That&#8217;s not &#8220;senior-only.&#8221; That&#8217;s baseline.</p><p><strong>1) Stop trying to be a faster coder. Become a better driver.</strong></p><p>In 2026, code is the easy part.</p><p>The job is driving.</p><p>Driving means you can:</p><ul><li><p>turn vague asks into clear requirements</p></li><li><p>steer AI toward the right shape of solution</p></li><li><p>catch bad outputs before they ship</p></li><li><p>prove it works (tests + observability)</p></li><li><p>make tradeoffs (cost / security / performance)</p></li></ul><p><strong>The cheat code: the code review mindset</strong></p><p>If you&#8217;re early-career, you can level up faster by practicing reviewing instead of only writing.</p><p></p><p>Use this loop:</p><ol><li><p>Use AI to generate the module</p></li><li><p>Review it like it&#8217;s going into production</p></li><li><p>Fix what&#8217;s wrong</p></li><li><p>Write down what you found and why it mattered</p></li></ol><p>What &#8220;wrong&#8221; looks like:</p><ul><li><p>insecure defaults (wide permissions, open CORS, missing auth)</p></li><li><p>missing input validation</p></li><li><p>edge cases (timeouts, retries, empty data, partial failures)</p></li><li><p>dependency landmines</p></li><li><p>performance traps</p></li><li><p>&#8220;works locally&#8221; but breaks in deployment</p></li></ul><p>If you can consistently do this, you stop reading as &#8220;junior who needs 
babysitting&#8221; and start reading as &#8220;junior who reduces risk.&#8221;</p><p>That&#8217;s who gets hired.</p><p></p><p><strong>2) The 2026 Golden Trio: learn what companies urgently need</strong></p><p>You don&#8217;t need to learn everything. You need leverage.</p><p></p><p>Here are three domains that keep showing up because they map to real business pain:</p><p><strong>Cloud-native &#8220;ship it&#8221; skills</strong></p><p>Not theory. Not just certs.</p><p>Real skills:</p><ul><li><p>deploy an API or app</p></li><li><p>set up logs/metrics</p></li><li><p>understand basic scaling</p></li><li><p>make authentication not sketchy</p></li><li><p>have a cost opinion (even a simple one)</p></li></ul><p></p><p></p><p>If you can deploy something cleanly, you&#8217;re instantly above average.</p><p><strong>AI integration: RAG + tool use</strong></p><p>You don&#8217;t need to train models to be valuable.</p><p>You need to connect them to real systems:</p><ul><li><p>documents</p></li><li><p>databases</p></li><li><p>internal tools</p></li><li><p>APIs</p></li></ul><p>In production, &#8220;useful AI&#8221; usually means retrieval + tools + guardrails.</p><p><strong>DevSecOps lite: don&#8217;t ship footguns</strong></p><p>AI increases code volume. More code means more vulnerabilities.</p><p>If you can demonstrate:</p><ul><li><p>least privilege thinking</p></li><li><p>secrets handled properly</p></li><li><p>dependency hygiene</p></li><li><p>basic secure-by-default design</p></li></ul><p>&#8230;you become low-risk.</p><p>Low-risk gets offers.</p><p></p><p><strong>3) Portfolios in 2026: tutorials are invisible</strong></p><p>Hiring managers are speed-scanning.</p><p>If your portfolio is:</p><ul><li><p>todo app</p></li><li><p>weather app</p></li><li><p>calculator</p></li></ul><p>&#8230;it reads like &#8220;followed a guide.&#8221;</p><p>Instead, build what I call an Anti-Tutorial Portfolio:</p><p><strong>One real project. One real user. 
Real constraints.</strong></p><p>Pick:</p><ul><li><p>a friend&#8217;s side hustle</p></li><li><p>a local nonprofit</p></li><li><p>your own annoying workflow</p></li><li><p>a small business problem</p></li></ul><p>Then ship something small but real.</p><p>Usage forces real engineering:</p><ul><li><p>requirements change</p></li><li><p>edge cases appear</p></li><li><p>reliability matters</p></li><li><p>you iterate</p></li></ul><p><strong>Add a Decision Log (this makes you stand out)</strong></p><p>In your README, include a short &#8220;Decision Log&#8221; like:</p><ul><li><p>why this architecture</p></li><li><p>why these services/libraries</p></li><li><p>what tradeoffs you made</p></li><li><p>how you handled auth</p></li><li><p>how you thought about cost</p></li><li><p>what you&#8217;d change at 10&#215; scale</p></li></ul><p>This signals: builder mindset, not student mindset.</p><p><strong>Documentation is back</strong></p><p>In a world of AI-generated spaghetti, clean docs are a superpower:</p><p></p><ul><li><p>clear README</p></li><li><p>API spec</p></li><li><p>diagram</p></li><li><p>runbook</p></li><li><p>how to run locally + in cloud</p></li></ul><p></p><p><strong>4) Soft skills just became career skills</strong></p><p>When routine tasks get automated, human value rises.</p><p></p><p>The people who win can:</p><ul><li><p>explain constraints to non-technical stakeholders</p></li><li><p>break vague asks into shippable chunks</p></li><li><p>ask good questions</p></li><li><p>learn fast and adapt without being handheld</p></li></ul><p>This is engineering, not &#8220;being extroverted.&#8221;</p><p></p><p><strong>The 30-day plan that produces proof (not vibes)</strong></p><p></p><p><strong>Week 1: Ship something deployed</strong></p><p>Build a small API/app. Deploy it. Add logs. Write a clean README.</p><p>Signal: I can ship.</p><p><strong>Week 2: Add AI that touches real data</strong></p><p>Add RAG over real docs/data. Show evaluation. 
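A minimal sketch of what that evaluation step can look like. The keyword retriever below is a stand-in for whatever retrieval you actually wire up (vector search, hybrid, etc.), and all names are illustrative, not from any library.

```python
def hit_rate(eval_set, retrieve, k=3):
    """Fraction of (question, expected_doc_id) pairs where the expected
    document appears in the top-k retrieved results."""
    hits = 0
    for question, expected in eval_set:
        top_ids = [doc_id for doc_id, _score in retrieve(question)[:k]]
        hits += expected in top_ids
    return hits / len(eval_set)

def make_keyword_retriever(corpus):
    """Toy retriever: score docs by word overlap with the question.
    corpus maps doc_id -> text."""
    def retrieve(question):
        words = set(question.lower().split())
        scored = [(doc_id, len(words & set(text.lower().split())))
                  for doc_id, text in corpus.items()]
        return sorted(scored, key=lambda pair: -pair[1])
    return retrieve
```

Even 20 hand-labeled question-to-document pairs gives you a hit rate you can report in the README and re-run as the corpus changes.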
Document failure modes.</p><p>Signal: I can make AI useful.</p><p><strong>Week 3: Add an agent workflow (with guardrails)</strong></p><p>Let AI call tools/APIs. Add validation and tests.</p><p>Signal: I can orchestrate safely.</p><p><strong>Week 4: Make it hiring-ready</strong></p><p>Security cleanup, architecture diagram, 2-minute demo video, tighten docs.</p><p>Signal: I can communicate like a pro.</p><p></p><p><strong>Capstone project ideas (pick one)</strong></p><p><strong>Cloud / Platform track: Ops Copilot for a Service</strong></p><ul><li><p>Deploy a small service (API + datastore)</p></li><li><p>Add observability (logs/metrics/alerts)</p></li><li><p>AI feature: summarize incidents + suggest runbook steps</p></li><li><p>Bonus: cost report + &#8220;why this design&#8221; decision log</p></li></ul><p><strong>Backend track: Support Triage Assistant</strong></p><ul><li><p>Ingest tickets/emails/forms &#8594; categorize &#8594; route</p></li><li><p>AI feature: summarize + propose response drafts</p></li><li><p>Guardrails: sensitive data handling + approval workflow</p></li><li><p>Bonus: evaluation set (20 sample tickets) + accuracy reporting</p></li></ul><p><strong>Full-stack track: Real User Dashboard</strong></p><ul><li><p>Build a dashboard for a real user (inventory, bookings, donations, etc.)</p></li><li><p>AI feature: Ask your data (RAG over records + definitions)</p></li><li><p>Bonus: role-based access + audit log</p></li></ul><p><strong>Security track: AI Code Safety Gate</strong></p><ul><li><p>Pipeline that scans PRs for risky patterns</p></li><li><p>AI feature: explain findings + recommend fixes</p></li><li><p>Bonus: policy-as-code + least-privilege reference templates</p></li></ul><p><strong>FinOps / Cost track: Cloud Cost Guardrails</strong></p><ul><li><p>Pull billing/cost signals (even mocked)</p></li><li><p>AI feature: explain spikes + recommend actions</p></li><li><p>Bonus: expected monthly cost model + alerts</p></li></ul><p></p><p><strong>The line 
that changes everything</strong></p><p>In 2026, your job isn&#8217;t to prove you can code.</p><p>Your job is to prove you can take AI output and turn it into something:</p><p>safe</p><p>reliable</p><p>real</p><p>That&#8217;s the person teams want on day one.</p>]]></content:encoded></item><item><title><![CDATA[Harness Engineering: The Moat Isn’t Code Anymore. It’s Control.]]></title><description><![CDATA[TL;DR: AI made code cheap. The new moat is the harness: constraints, feedback loops, repo legibility, and drift control. If the agent can&#8217;t observe it, measure it, or reproduce it &#8212; it can&#8217;t reliabl]]></description><link>https://www.techwithdarin.com/p/harness-engineering-the-moat-isnt</link><guid isPermaLink="false">https://www.techwithdarin.com/p/harness-engineering-the-moat-isnt</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Thu, 12 Feb 2026 02:24:19 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/a209a28e-6d43-4e8d-9faa-9301eb06003d_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3><strong>Humans steer. Agents execute.</strong></h3><p>OpenAI just documented an experiment that quietly rewrites what &#8220;software engineering&#8221; means in 2026:</p><p>They shipped an internal beta product with <strong>0 lines of manually-written code</strong> &#8212; product logic, tests, CI, docs, observability, internal tooling &#8212; all written by Codex, merged through a normal PR workflow.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>They estimate it took <strong>~1/10th the time</strong> it would&#8217;ve taken to write by hand.</p><p>They started from an empty repo (<strong>first commit: late August 2025</strong>) and ended up with <strong>~1M lines of code</strong> and <strong>~1,500 PRs</strong>, initially driven by <strong>three engineers</strong> &#8212; later <strong>seven</strong> &#8212; with throughput rising as they scaled.</p><p>If your reaction is &#8220;cool, but that&#8217;s OpenAI,&#8221; you&#8217;re looking at the wrong thing.</p><p>The story isn&#8217;t &#8220;AI wrote a lot of code.&#8221;</p><p>The story is what they had to build <strong>around the AI</strong> so that cheap code didn&#8217;t turn into expensive chaos.</p><p>That&#8217;s the shift:</p><p><strong>Software engineering is becoming harness engineering</strong> &#8212; designing environments, specifying intent, and building feedback loops that let agents do reliable work.</p><p>And the moat isn&#8217;t clever code.</p><p>It&#8217;s <strong>control</strong>.</p><div><hr></div><h2><strong>The new bottleneck (it&#8217;s not typing)</strong></h2><p>Once code generation is abundant, your limiting factor isn&#8217;t output.</p><p>It&#8217;s:</p><ul><li><p><strong>environment design</strong></p></li><li><p><strong>constraints</strong></p></li><li><p><strong>feedback loops</strong></p></li><li><p><strong>legibility</strong></p></li><li><p><strong>garbage collection for drift</strong></p></li></ul><p>OpenAI put it bluntly: early progress was slower not because Codex couldn&#8217;t code, but because the 
environment was underspecified &#8212; the agent lacked tools, abstractions, and internal structure. So the engineers&#8217; &#8220;job&#8221; became enabling the agent.</p><p><strong>Your value isn&#8217;t writing code. Your value is making code safe to generate.</strong></p><div><hr></div><h2><strong>Lesson 1: If the agent can&#8217;t observe it, it doesn&#8217;t exist</strong></h2><p>Agents don&#8217;t magically &#8220;understand&#8221; your system.</p><p>They <em>inspect</em> it.</p><p>So OpenAI made the app legible to the agent:</p><ul><li><p>bootable per git worktree so Codex could run an isolated instance per change</p></li><li><p>wired <strong>Chrome DevTools Protocol</strong> into the agent runtime, with skills for DOM snapshots, screenshots, and navigation &#8212; enabling the agent to reproduce UI bugs and validate fixes by actually driving the app</p></li><li><p>gave the agent a local, ephemeral observability stack per worktree; Codex could query logs via <strong>LogQL</strong> and metrics via <strong>PromQL</strong></p></li></ul><p>That&#8217;s how prompts like &#8220;startup under 800ms&#8221; stop being vibes and become testable acceptance criteria.</p><p><strong>If the agent can&#8217;t measure it, it can&#8217;t improve it.</strong></p><p><strong>If it can&#8217;t reproduce it, it can&#8217;t fix it.</strong></p><div><hr></div><h2><strong>Lesson 2: Stop writing one giant &#8220;AI manual.&#8221; Build a repo knowledge system.</strong></h2><p>They tried the classic play: one big AGENTS.md.</p><p>It failed in exactly the ways you&#8217;d expect:</p><ul><li><p>context is scarce, so a giant instruction file crowds out the task and the code</p></li><li><p>when everything is &#8220;important,&#8221; nothing is</p></li><li><p>it rots</p></li><li><p>it&#8217;s hard to mechanically verify freshness, coverage, ownership, or links</p></li></ul><p>So they flipped it:</p><p>AGENTS.md became a <strong>short map</strong> (~100 lines) and the repository&#8217;s 
actual knowledge base moved into a structured, versioned docs/ directory treated as the <strong>system of record</strong>.</p><p>That&#8217;s the underrated unlock.</p><p>Most teams are trying to prompt their way into agent productivity.</p><p>OpenAI treated repo knowledge like infrastructure.</p><p><strong>Docs aren&#8217;t documentation anymore. They&#8217;re runtime dependencies for agents.</strong></p><div><hr></div><h2><strong>Lesson 3: You don&#8217;t manage agents by lecturing them. You manage them by constraining the world.</strong></h2><p>Agents don&#8217;t just ship features.</p><p>They replicate patterns &#8212; at scale.</p><p>So if your architecture is squishy, an agent will amplify the squish into a full-on &#8220;smell event.&#8221;</p><p>OpenAI&#8217;s response: enforce invariants, not implementations.</p><p>They built a rigid model:</p><ul><li><p>domains divided into layers</p></li><li><p>dependency direction validated</p></li><li><p>only a limited set of permissible edges allowed</p></li><li><p>enforced mechanically via custom linters and structural tests</p></li></ul><p>This line is the whole philosophy:</p><p><strong>Don&#8217;t tell the agent to have good taste. 
Make bad taste impossible.</strong></p><p>And here&#8217;s the part most teams miss: they pushed &#8220;taste&#8221; into systems &#8212; review comments and bugs get captured as docs updates or promoted into code rules when docs aren&#8217;t enough.</p><div><hr></div><h2><strong>Lesson 4: Throughput breaks your merge philosophy</strong></h2><p>This is where agent-first engineering starts to feel alien.</p><p>As Codex throughput increased, OpenAI found many &#8220;best practices&#8221; became counterproductive.</p><p>They operate with:</p><ul><li><p>minimal blocking merge gates</p></li><li><p>short-lived PRs</p></li><li><p>flakes often handled with follow-up runs instead of blocking progress indefinitely</p></li></ul><p>Because in a world where agent throughput far exceeds human attention:</p><p><strong>corrections are cheap, and waiting is expensive.</strong></p><p>OpenAI even notes this would be irresponsible in a low-throughput environment &#8212; but under agent abundance, it can be the right trade.</p><p>The point isn&#8217;t &#8220;copy their process.&#8221;</p><p>The point is: <strong>if your workflow assumes scarcity, it collapses under abundance.</strong></p><div><hr></div><h2><strong>Lesson 5: AI drift is a memory leak &#8212; schedule the garbage collector</strong></h2><p>Full agent autonomy introduces a new class of problem: replication.</p><p>Codex will copy whatever patterns exist in the repo &#8212; including uneven ones &#8212; and that leads to drift.</p><p>OpenAI&#8217;s early approach was brutally relatable:</p><p>They spent <strong>every Friday (20% of the week)</strong> cleaning up &#8220;AI slop.&#8221;</p><p>It didn&#8217;t scale.</p><p>So they built garbage collection:</p><ul><li><p>encoded &#8220;golden principles&#8221; as mechanical rules in-repo</p></li><li><p>ran recurring background tasks that scan for deviations</p></li><li><p>opened targeted refactor PRs (many reviewable in under a minute and auto-mergeable)</p></li></ul><p>Their 
framing is perfect:</p><p>Technical debt is a high-interest loan. Pay it continuously or it compounds.</p><p><strong>Drift is the new technical debt. GC is the new hygiene.</strong></p><div><hr></div><h1><strong>What I&#8217;d do about it (Monday-morning playbook)</strong></h1><p>If you run engineering, platform, SRE, cloud, or security &#8212; here&#8217;s the practical version. You&#8217;re building <strong>control surfaces</strong>.</p><h2><strong>1) Build an &#8220;agent map&#8221;</strong></h2><ul><li><p>keep AGENTS.md short (TOC, not encyclopedia)</p></li><li><p>put the real truth into versioned in-repo docs (architecture, runbooks, standards)</p></li><li><p>add CI checks for broken links + doc freshness (make drift loud)</p></li></ul><p><strong>Goal:</strong> make the repo navigable for a machine.</p><h2><strong>2) Make your system machine-debuggable</strong></h2><ul><li><p>one-command boot per branch/worktree</p></li><li><p>deterministic dev environments</p></li><li><p>agent-accessible logs/metrics/traces (even if local + ephemeral)</p></li></ul><p><strong>Goal:</strong> turn &#8220;feels broken&#8221; into &#8220;fails a measurable invariant.&#8221;</p><h2><strong>3) Encode constraints, not vibes</strong></h2><ul><li><p>structural tests for dependency direction</p></li><li><p>linters that enforce invariants (style, layering, boundaries)</p></li><li><p>lint errors that <em>teach the fix</em> (because the agent reads them)</p></li></ul><p><strong>Goal:</strong> make correctness the path of least resistance.</p><h2><strong>4) Create an autonomy ladder (so humans spend time on judgment)</strong></h2><p>OpenAI lists what autonomy looks like when the harness is real: reproduce a bug, record evidence (even videos), implement a fix, validate by driving the app, open a PR, respond to feedback, remediate build failures, and escalate only when judgment is needed &#8212; then merge.</p><p>Start simple. Earn trust. 
Climb the ladder.</p><h2><strong>5) Treat cleanup like production ops</strong></h2><ul><li><p>define &#8220;golden principles&#8221; (mechanical, enforceable)</p></li><li><p>run scans on a cadence</p></li><li><p>ship small refactors continuously</p></li></ul><p><strong>Goal:</strong> pay entropy continuously so it never compounds.</p><div><hr></div><h2><strong>The punchline</strong></h2><p>OpenAI&#8217;s claim isn&#8217;t &#8220;AI replaces engineers.&#8221;</p><p>It&#8217;s more interesting &#8212; and more uncomfortable:</p><p>As code gets cheaper, engineering becomes the discipline of keeping code coherent.</p><p>Harness engineering is the new platform layer.</p><p>And the winners won&#8217;t be the teams with the best prompts.</p><p>They&#8217;ll be the teams who build the best scaffolding.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[The Cloud Plumbing, Security, and Business Model Stories Behind the Agent Stack]]></title><description><![CDATA[Anthropic shipped a bigger brain. AWS shipped identity resilience and runtime controls. Microsoft shipped AI-native security. 
And the business model debate got loud at exactly the wrong time.]]></description><link>https://www.techwithdarin.com/p/the-cloud-plumbing-security-and-business</link><guid isPermaLink="false">https://www.techwithdarin.com/p/the-cloud-plumbing-security-and-business</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Wed, 11 Feb 2026 15:48:52 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/6438ad4e-fec2-434a-9f6b-48496eeeeb6a_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Last week I broke down the agent platform layer &#8212; OpenAI Frontier and Atlas, the Codex App Server protocol, Prism, Claude&#8217;s propose-verify-approve pattern, ServiceNow distribution, and Chrome going agentic with Gemini 3. (<a href="https://www.techwithdarin.com/p/agents-became-infrastructure-frontier">Read that post here if you missed it.</a>)</p><p>The throughline: we&#8217;re moving from models to agents to agent platforms, and the moat is context, runtime, and distribution.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>This week I want to go one layer down.</p><p>Because platforms don&#8217;t float. 
They sit on top of models, cloud infrastructure, security tooling, and business models &#8212; and all four of those shifted between February 2&#8211;11, 2026, in ways that matter for anyone building or buying agents in production.</p><p>Here&#8217;s what landed, why it matters, and what I&#8217;d actually do about it.</p><div><hr></div><h2>Claude Opus 4.6: what 1M tokens actually changes (and what it doesn&#8217;t)</h2><p>Anthropic shipped Claude Opus 4.6, its strongest &#8220;agentic&#8221; model yet, with a 1M token context window now available in beta. The headline number gets attention, but the real story is what this does &#8212; and doesn&#8217;t &#8212; change for how you design retrieval and reasoning in agent workloads.</p><h3>Where long context is a genuine unlock</h3><p>If your workload is &#8220;one large, bounded corpus,&#8221; you can now try loading the whole thing into context and reasoning over it directly. No chunking. No retrieval pipeline. No re-ranking. Just the model and the material.</p><p>This matters for specific use cases:</p><p><strong>Codebase reasoning.</strong> An entire repo in context means the model can trace dependencies, understand architectural decisions, and generate changes that are consistent with patterns it can actually see &#8212; not patterns it inferred from retrieval snippets.</p><p><strong>Incident investigation.</strong> Feeding a complete timeline &#8212; logs, alerts, runbook excerpts, Slack threads, post-mortems &#8212; into one context window lets the model correlate signals that would be fragmented across RAG chunks.</p><p><strong>Contract and regulatory analysis.</strong> Cross-referencing terms, definitions, obligations, and exceptions across a full document set without worrying about whether your retrieval pipeline surfaced the right clause.</p><h3>Where long context doesn&#8217;t replace RAG</h3><p>If your workload is &#8220;infinite enterprise sprawl&#8221; &#8212; thousands of documents across dozens of 
systems with different owners, permissions, freshness requirements, and classification levels &#8212; a 1M token window doesn&#8217;t solve your problem. You still need retrieval. You still need permissions. You still need a semantic layer that knows what&#8217;s current and what&#8217;s stale.</p><p>Context windows don&#8217;t solve governance. They don&#8217;t solve freshness. They don&#8217;t solve multi-tenancy. Don&#8217;t let the headline number distract from the architectural work that still needs to happen.</p><h3>The cross-cloud angle matters more than people think</h3><p>Opus 4.6 isn&#8217;t locked to one vendor. It&#8217;s available in Amazon Bedrock and Microsoft Foundry (Azure). If you&#8217;re evaluating agent-capable models and you need cross-cloud availability &#8212; because your infrastructure spans AWS and Azure, or because procurement won&#8217;t sign off on a single-vendor dependency &#8212; this simplifies the conversation. You can run the same model wherever your workloads already live.</p><h3>The practical experiment</h3><p>Take a real workload &#8212; a repo your team works in, a set of docs your team references daily &#8212; and run it through Opus 4.6 in &#8220;long-context-first&#8221; mode. Then compare the quality, latency, and cost against your existing RAG pipeline for the same queries. Let the data tell you where the tradeoff lands for your specific use case, instead of guessing.</p><div><hr></div><h2>AWS: the runtime and identity layers that make agents survivable</h2><p>While the platform-layer headlines went to Frontier and Atlas, AWS quietly shipped the kind of infrastructure changes that determine whether agents actually work in production. Two runtime updates and two foundational infrastructure releases.</p><h3>Bedrock server-side tool use</h3><p>Bedrock added server-side tool use and extended prompt caching in the Responses API. 
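</p><p>For contrast, here is roughly the client-side orchestration loop that server-side execution moves behind AWS&#8217;s perimeter. This is a deliberately simplified sketch with a stubbed model call; the function and tool names are illustrative, not Bedrock&#8217;s actual API:</p>

```python
# Minimal sketch of client-side tool orchestration. Your code sits in the
# loop, so security boundaries, audit trails, and cost controls are all
# your problem. call_model() is a stub standing in for a real model API.

def call_model(messages):
    # Stub "model": asks for a tool once, then answers from the tool result.
    if any(m["role"] == "tool" for m in messages):
        return {"type": "answer", "text": "3 open tickets"}
    return {"type": "tool_call", "tool": "count_open_tickets", "args": {}}

# The client owns tool execution: nothing stops a bad tool call here
# except code you wrote yourself.
TOOLS = {"count_open_tickets": lambda **kwargs: 3}

def run_agent(prompt):
    messages = [{"role": "user", "content": prompt}]
    while True:
        reply = call_model(messages)
        if reply["type"] == "answer":
            return reply["text"]
        # Parse the tool request, execute it, send the result back.
        result = TOOLS[reply["tool"]](**reply["args"])
        messages.append({"role": "tool", "content": str(result)})

answer = run_agent("How many tickets are open?")
```

<p>Every step in that loop is something you have to secure and log yourself; moving execution server-side hands those steps to the platform.</p><p>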
This is the &#8220;make it controllable&#8221; layer.</p><p>Previously, most agent tool-use implementations were client-side orchestration &#8212; your code called the model, parsed the tool request, executed it, and sent the result back. That works in a demo. It breaks in production when you need consistent security boundaries, audit trails, and cost controls.</p><p>Server-side tool use means the execution happens inside AWS&#8217;s security perimeter &#8212; IAM policies, VPC boundaries, CloudTrail logging &#8212; with the guardrails you&#8217;d expect. Extended prompt caching means repeated context (system prompts, shared documents, conversation history) doesn&#8217;t get re-processed on every call, which directly impacts cost and latency for multi-turn agent workflows.</p><p>If you&#8217;re building agents on Bedrock, this is the shift from &#8220;model capability&#8221; to &#8220;operational capability.&#8221; It&#8217;s what makes tool use shippable.</p><h3>IAM Identity Center multi-Region replication</h3><p>This one doesn&#8217;t sound exciting. It is quietly one of the most important AWS releases this quarter.</p><p>Identity is a Tier-0 dependency. Everything downstream &#8212; console access, CLI sessions, service roles, federated access, SSO &#8212; depends on IAM Identity Center being available. Until now, it was single-Region. If that Region had an issue, your identity plane was impaired.</p><p>AWS now lets you replicate IAM Identity Center from a primary Region to additional Regions, including identities, permission sets, and account assignments.</p><p>Why this matters beyond availability: <strong>data residency.</strong> Some regulatory frameworks require that identity and access data reside in specific geographies. 
Multi-Region replication gives you the ability to place identity data where your compliance requirements demand &#8212; without building a parallel identity system.</p><p>If you&#8217;ve ever sat in a BCDR review where someone said &#8220;nothing works if identity is down,&#8221; this is that conversation getting resolved.</p><p><strong>What to do now:</strong> Start planning your replication topology. Identify your KMS key strategy for the secondary Regions. Define failover access patterns. Test before you need it. This isn&#8217;t a &#8220;set and forget&#8221; feature &#8212; it requires deliberate design around Region selection, replication lag tolerance, and operational runbooks.</p><h3>Security group &#8220;Related resources&#8221; tab</h3><p>Small console enhancement, outsized impact for anyone managing a large AWS estate. The EC2/VPC console now shows which resources &#8212; ENIs, instances, load balancers, Lambda functions &#8212; are associated with a given security group.</p><p>Before this, deleting or modifying a security group was a &#8220;hope nothing breaks&#8221; exercise unless you had custom tooling to map dependencies. Now you can see the blast radius before you make the change.</p><p><strong>Integrate this into your change management workflow.</strong> Especially before deletions &#8212; require a check of the related resources tab as part of your change request documentation. It&#8217;s a small step that prevents expensive mistakes.</p><div><hr></div><h2>Microsoft: AI-native security is becoming a first-class agent concern</h2><p>Microsoft shipped two security-related updates this cycle that are worth reading together.</p><p><strong>AI-powered incident prioritization</strong> is now in public preview in Defender. It uses machine learning to help SOC analysts cut through alert noise and focus on the incidents most likely to be real and impactful. 
If your SOC is drowning in false positives &#8212; and statistically, it probably is &#8212; this is worth evaluating against your current triage metrics: mean time to acknowledge, false positive rate, analyst fatigue.</p><p><strong>Expanded Defender coverage for Foundry-hosted agents</strong> means Microsoft is extending its security tooling to cover agent workloads specifically. This is Microsoft positioning security as a first-class concern for agent deployments, not something you bolt on after the fact.</p><p>The timing is deliberate. As agent platforms ship (Frontier, Foundry, Bedrock), the attack surface expands. Agents that can execute code, query databases, and take actions on behalf of users are a fundamentally different security problem than a chatbot answering questions. Microsoft is building the security layer to match.</p><div><hr></div><h2>Anthropic&#8217;s 0-day research: a signal worth taking seriously</h2><p>Separately from the Opus 4.6 release, Anthropic is explicitly researching the risk of LLM-discovered 0-days &#8212; previously unknown vulnerabilities found by advanced models &#8212; and publishing findings about it.</p><p>This is a model builder acknowledging that &#8220;agentic capability&#8221; and &#8220;security capability&#8221; are two sides of the same coin.</p><p>Here&#8217;s my honest take: I&#8217;m not sure most organizations are ready for the speed at which capable agents can become unintentional security researchers. An agent tasked with &#8220;find a way to make this API call work&#8221; could, in theory, discover and exploit a vulnerability in the process. The security posture around agent workloads needs to assume mistakes will happen and build containment accordingly &#8212; not just for malicious actors, but for well-intentioned automation that wanders into dangerous territory.</p><p>This is why the governance layer I talked about last week matters so much. 
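</p><p>To make that concrete, here is a toy sketch of scoped, revocable, audited tool access for an agent. All names are hypothetical; this illustrates the pattern, not any vendor&#8217;s API:</p>

```python
# Toy model of agent governance: an explicit allow-list, an audit trail,
# and revocation that does not disturb any other agent's access.
# Illustrative only; class and tool names are hypothetical.
from datetime import datetime, timezone

class AgentGrant:
    def __init__(self, agent_id, allowed_tools):
        self.agent_id = agent_id
        self.allowed_tools = set(allowed_tools)  # scoped permissions
        self.audit_log = []                      # audit trail
        self.revoked = False

    def call(self, tool, run):
        allowed = not self.revoked and tool in self.allowed_tools
        # Every attempt is logged, allowed or not.
        self.audit_log.append(
            (datetime.now(timezone.utc).isoformat(), tool, allowed))
        if not allowed:
            raise PermissionError(f"{self.agent_id} may not call {tool}")
        return run()

    def revoke(self):
        # Kills this grant only; other agents' workflows keep working.
        self.revoked = True

grant = AgentGrant("billing-summarizer", {"read_invoices"})
total = grant.call("read_invoices", lambda: 42)  # allowed, and audited
grant.revoke()
denied = False
try:
    grant.call("read_invoices", lambda: 42)      # refused after revocation
except PermissionError:
    denied = True
```

<p>The containment lives outside the model: even a badly prompted agent cannot call a tool the grant never included.</p><p>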
Agent identity, scoped permissions, audit trails, and the ability to revoke access without breaking other workflows &#8212; these aren&#8217;t nice-to-haves. They&#8217;re the difference between an agent that helps and an agent that becomes a liability.</p><div><hr></div><h2>Google GEAR: investing in the developer enablement layer</h2><p>Google Cloud launched GEAR &#8212; a structured skills path for building and deploying agents using Google&#8217;s Agent Development Kit (ADK). It includes labs, credits, and a progression path, housed inside the Google Developer Program.</p><p>While OpenAI and Anthropic are leading on the platform and model layers, Google is investing in developer enablement &#8212; making it easier to get started building agents on its stack. Different strategy, complementary signal. The market is moving fast enough that developer adoption velocity matters as much as raw platform capability.</p><p>If you have teams evaluating Google&#8217;s agent tooling, GEAR is worth pointing them toward as a structured on-ramp.</p><div><hr></div><h2>Business model whiplash: ads vs. subscriptions (and why enterprise should pay attention)</h2><p>Two announcements landed almost back-to-back, and the contrast is stark.</p><p><strong>OpenAI started testing ads in ChatGPT</strong> for logged-in adult users on the Free and Go tiers in the US. The Plus, Pro, Business, Enterprise, and Education tiers are not affected. The ads are described as &#8220;relevant ads within conversations&#8221; &#8212; the implementation details and data-handling specifics are still emerging.</p><p><strong>Anthropic used Super Bowl visibility to position Claude as explicitly ad-free</strong> &#8212; framing the absence of advertising as a trust and alignment feature, not just a business model choice.</p><p>For enterprise buyers, this isn&#8217;t about moral judgment. 
It&#8217;s about trust boundaries and the questions your security, compliance, and procurement teams are going to ask:</p><p><strong>Data handling.</strong> What user data flows into the ad-targeting pipeline? Even if your org is on an enterprise tier, does the existence of an ad-supported tier change how the underlying model is trained or tuned?</p><p><strong>Response integrity.</strong> How do you prove that responses in the paid tiers are completely uninfluenced by commercial relationships in the ad-supported tiers?</p><p><strong>Vendor risk narrative.</strong> When your CISO asks &#8220;is our AI vendor also an advertising company?&#8221; &#8212; what&#8217;s your answer, and does it change your risk posture?</p><p>My read: this is going to matter most in regulated industries &#8212; healthcare, financial services, government &#8212; where the <em>perception</em> of data mixing or commercial influence can be as damaging as the reality. Expect this to become a procurement checklist item within the next two quarters.</p><div><hr></div><h2>Databricks: follow the money (agents will eat your data platform bill)</h2><p>Databricks raising ~$5B at a ~$134B valuation, with AI products crossing ~$1.4B in annualized revenue, is a signal worth reading carefully.</p><p>The thesis is straightforward: agents don&#8217;t just &#8220;think.&#8221; They query, join, filter, write, summarize, re-query, materialize, and do it again &#8212; often in loops. Every agentic workflow that touches structured data is a workload on your data platform. Every multi-step reasoning chain that needs fresh data is a set of warehouse queries. Every agent that &#8220;monitors&#8221; something is a recurring compute job.</p><p>Databricks is calling itself an &#8220;AI beneficiary&#8221; because it&#8217;s sitting on the metered layer where agent work becomes billable compute. 
That&#8217;s not speculation &#8212; it&#8217;s already showing up in their revenue numbers.</p><p><strong>The FinOps implication is real.</strong> If your organization is deploying agents that interact with data platforms &#8212; Databricks, Snowflake, BigQuery, Redshift &#8212; you need usage budgets and alerts in place <em>before</em> the agents are in production. &#8220;Helpful automation&#8221; has a way of becoming a surprise bill when nobody set a ceiling on how many queries an agent could run per hour.</p><p>Set the budgets. Set the alerts. Have the conversation with your data platform team about what &#8220;agent-driven usage&#8221; looks like in their billing model. Do it this week, not after the first invoice lands.</p><div><hr></div><h2>What I&#8217;d do this week &#8212; a practical 10-step plan</h2><p>Pick the items that match where you are:</p><p><strong>1. Run a &#8220;long-context first&#8221; experiment with Opus 4.6.</strong> Take a real repo or document set your team actually uses. Load it into a 1M token context. Run the same queries you&#8217;d run against your RAG pipeline. Compare quality, latency, and cost. Let the data decide.</p><p><strong>2. Evaluate Bedrock server-side tool use for your agent workloads.</strong> If you&#8217;re building agents on AWS, test server-side tool execution against your current client-side orchestration. Measure the difference in security posture, auditability, and operational complexity.</p><p><strong>3. Plan IAM Identity Center multi-Region replication.</strong> Identify target Regions. Plan your KMS key strategy. Define failover access patterns. Test before you need it.</p><p><strong>4. Use the security group dependency view in your change workflow.</strong> Make &#8220;check related resources&#8221; a standard step before security group modifications or deletions. Small habit, big risk reduction.</p><p><strong>5. 
SOC teams: evaluate Defender&#8217;s AI prioritization preview.</strong> Run it in parallel with your current triage process. Measure against MTTA, false positive rate, and analyst workload. See if it actually reduces noise or just moves it around.</p><p><strong>6. Review your agent security posture against the 0-day research signal.</strong> Ask: if an agent tasked with a legitimate workflow accidentally discovered a vulnerability, would your containment model catch it? If the answer is &#8220;I don&#8217;t know,&#8221; that&#8217;s the priority.</p><p><strong>7. Set agent-aware FinOps budgets.</strong> Assume agent-driven warehouse and API usage is coming. Set budgets, alerts, and per-agent usage caps before the workloads are live.</p><p><strong>8. Point your teams at Google GEAR if they&#8217;re evaluating Google&#8217;s agent stack.</strong> Structured learning paths beat ad-hoc exploration for teams that need to ramp quickly.</p><p><strong>9. Start your &#8220;Trust FAQ&#8221; document.</strong> Ads, data handling, model choices, logging, retention, response integrity. Get ahead of the questions your security and compliance teams are going to ask &#8212; because they <em>are</em> going to ask.</p><p><strong>10. Re-read last week&#8217;s post and map your agent boundary diagram.</strong> Data sources &#8594; tools &#8594; actions &#8594; approval points &#8594; rollback mechanisms. If you can&#8217;t draw it on a whiteboard, you&#8217;re not ready to ship it. (<a href="https://www.techwithdarin.com/p/agents-became-infrastructure-frontier">Here&#8217;s the post.</a>)</p><div><hr></div><p><em>Last week, the agent stack became a product. This week, the foundation underneath it got stronger &#8212; better models, more resilient identity, controlled runtimes, AI-native security, and a business model debate that enterprise can&#8217;t afford to ignore. 
The organizations that move deliberately on both layers &#8212; platform and foundation &#8212; are the ones that will actually ship agents that last.</em></p><p><em>See you next week.</em></p><p>&#8212; Darin</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Agents became infrastructure: Frontier + Atlas, Prism, Codex Harness, Claude, Gemini 3]]></title><description><![CDATA[For a while, AI updates felt like magic tricks.]]></description><link>https://www.techwithdarin.com/p/agents-became-infrastructure-frontier</link><guid isPermaLink="false">https://www.techwithdarin.com/p/agents-became-infrastructure-frontier</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Thu, 05 Feb 2026 16:10:07 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/4b97b8f3-42eb-4cee-9232-51b18ad42bc5_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>For a while, AI updates felt like magic tricks.</p><p>A new model. A new benchmark. 
A new demo.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>This week felt different.</p><p>This week was about <strong>shipping the boring parts</strong> that make agents real:</p><ul><li><p>Shared context</p></li><li><p>Runtime protocols</p></li><li><p>Approvals and guardrails</p></li><li><p>Distribution inside the tools people already use</p></li></ul><p>In other words, the agent stack is hardening into infrastructure.</p><p>You can see it across the big releases from OpenAI, Anthropic, and Google over the last couple of weeks:</p><ul><li><p>OpenAI launched <strong>Frontier</strong> (Feb 5, 2026), a platform to build, deploy, and manage AI coworkers across the enterprise.</p></li><li><p>OpenAI&#8217;s <strong>Atlas</strong> (Oct 21, 2025) is already the &#8220;work happens here&#8221; browser surface that Frontier can plug into.</p></li><li><p>OpenAI opened up the <strong>Codex harness</strong> details, centered on a bidirectional App Server protocol.</p></li><li><p>OpenAI shipped <strong>Prism</strong> (Jan 27, 2026), a free AI-native workspace for scientific writing and collaboration powered by GPT-5.2.</p></li><li><p>Anthropic showed Claude planning a Mars rover route with the &#8220;propose &#8594; 
verify &#8594; approve&#8221; shape that enterprises desperately need.</p></li><li><p>Anthropic + ServiceNow is pushing Claude into enterprise workflows at massive scale.</p></li><li><p>Google&#8217;s January AI recap makes the strategy loud: <strong>personal intelligence</strong> plus <strong>Chrome auto browse</strong> powered by Gemini 3.</p></li></ul><p>There&#8217;s a throughline here:</p><p>We are moving from <strong>models</strong> to <strong>agents</strong> to <strong>agent platforms</strong>.</p><p>And the moat is not &#8220;who is smartest.&#8221;</p><p>The moat is:</p><ol><li><p><strong>Context</strong></p></li><li><p><strong>Runtime</strong></p></li><li><p><strong>Distribution</strong></p></li></ol><p>Let&#8217;s break down what shipped, and why it matters for builders, IT leaders, and anyone selling into enterprise.</p><div><hr></div><h2><strong>The big shift: intelligence is cheap, context is expensive</strong></h2><p>Most teams are not blocked by &#8220;can the model do it.&#8221;</p><p>They are blocked by:</p><ul><li><p>Data scattered across systems</p></li><li><p>Permissions that do not map cleanly to &#8220;an agent&#8221;</p></li><li><p>Integrations that become one-off projects</p></li><li><p>No quality loop, so pilots never become dependable systems</p></li></ul><p>Frontier names this directly: the thing slowing enterprises down is not model intelligence, it&#8217;s <strong>how agents are built and run</strong> inside real organizations.</p><p>So the battleground moved.</p><p>Not &#8220;better answers.&#8221;</p><p>More &#8220;reliable work in production.&#8221;</p><div><hr></div><h2><strong>OpenAI Frontier: the control plane for AI coworkers</strong></h2><p>Frontier is OpenAI&#8217;s answer to the enterprise reality: multi-cloud, messy systems, governance everywhere, and agents that need to operate inside that mess without breaking things.</p><p>Frontier&#8217;s core idea is simple:</p><p>AI coworkers need the same fundamentals humans need at a 
company:</p><ul><li><p>Onboarding and institutional knowledge</p></li><li><p>Access to the right systems</p></li><li><p>Clear boundaries and permissions</p></li><li><p>Learning via feedback so performance improves over time</p></li></ul><h3><strong>1) The &#8220;semantic layer&#8221; framing is the tell</strong></h3><p>Frontier connects siloed warehouses, CRMs, ticketing tools, and internal apps to create shared business context, explicitly calling this a <strong>semantic layer for the enterprise</strong> that all AI coworkers can reference.</p><p>That is the game.</p><p>The agent that understands your internal language and where truth lives will beat the agent that doesn&#8217;t, even if the second agent has a slightly stronger model.</p><h3><strong>2) Open standards, no forced replatform</strong></h3><p>Frontier says it works with the systems you already have, across multiple clouds, using open standards, with no requirement to abandon existing agents or apps.</p><p>This is a direct shot at the &#8220;rip and replace&#8221; fear that kills AI adoption inside large enterprises.</p><h3><strong>3) Execution environment, not just chat</strong></h3><p>Frontier is built around agents completing complex tasks &#8220;like working with files, running code, and using tools&#8221; inside a &#8220;dependable&#8221; execution environment, and building memories as they operate.</p><p>This is not a prompt library.</p><p>This is an operating layer.</p><h3><strong>4) Evaluation and optimization built in</strong></h3><p>Frontier emphasizes built-in evaluation and optimization so good behaviors improve on real work over time.</p><p>That&#8217;s the difference between a clever demo and a system you can trust.</p><h3><strong>5) The human layer: Forward Deployed Engineers</strong></h3><p>Frontier also comes with OpenAI FDEs embedded with customer teams to help get agents into production and feed deployment learnings back into research.</p><p>That&#8217;s OpenAI saying: enterprise is 
not only software. It is execution.</p><div><hr></div><h2><strong>Atlas: the distribution surface Frontier has been pointing at</strong></h2><p>In my earlier framing, I treated Atlas like a generic workflow mention.</p><p>That was wrong.</p><p>Atlas is a real product, and it matters here.</p><p>OpenAI introduced <strong>ChatGPT Atlas on October 21, 2025</strong> as &#8220;the browser with ChatGPT built in.&#8221;</p><p>Here&#8217;s why Atlas is strategically important to the Frontier story:</p><h3><strong>1) Atlas is &#8220;ChatGPT comes with you&#8221;</strong></h3><p>Atlas lets ChatGPT work &#8220;anywhere across the web&#8221; inside the window you&#8217;re already using, without copy/paste or leaving the page.</p><p>That is distribution.</p><p>Not &#8220;come to my AI app.&#8221;</p><p>More &#8220;AI meets you where work already happens.&#8221;</p><h3><strong>2) Memory becomes ambient context</strong></h3><p>Atlas ships with ChatGPT memory built in, and adds &#8220;browser memories&#8221; that can remember context from sites you visit. Those browser memories are optional and user-controlled (view, archive, delete).</p><p>This is the missing bridge between:</p><ul><li><p>web activity</p></li><li><p>and agent usefulness</p></li></ul><h3><strong>3) Agent mode becomes native to browsing</strong></h3><p>Atlas includes agent mode designed to act while you browse, and OpenAI says it&#8217;s faster and more useful when working with browsing context. Agent mode in Atlas launched in preview for Plus, Pro, and Business users.</p><p>This is not theory. 
It is a product surface built for agentic work.</p><h3><strong>4) Controls and safety constraints are explicit</strong></h3><p>Atlas includes visibility toggles per site and restrictions like &#8220;cannot run code in the browser, download files, or install extensions,&#8221; and it pauses on sensitive sites.</p><p>Again, boring parts.</p><p>Also the parts enterprises demand.</p><h3><strong>5) Frontier explicitly plugs into Atlas</strong></h3><p>Frontier says AI coworkers can be accessed through any interface, including &#8220;workflows with Atlas.&#8221;</p><p>So think of it like this:</p><ul><li><p><strong>Frontier</strong>: enterprise control plane (context + permissions + eval + execution)</p></li><li><p><strong>Atlas</strong>: high-distribution client surface where a lot of work actually happens</p></li></ul><p>That pairing is not accidental.</p><div><hr></div><h2><strong>Codex harness: protocols over prompts</strong></h2><p>The Codex harness post is one of the most important &#8220;builder&#8221; updates in this whole set.</p><p>Because it is OpenAI showing the architecture that makes agents portable across surfaces.</p><p>The key piece is the <strong>Codex App Server</strong>.</p><p>OpenAI describes the App Server as both:</p><ul><li><p>the <strong>JSON-RPC protocol</strong> between client and server</p></li><li><p>and a <strong>long-lived process</strong> that hosts Codex core threads</p></li></ul><p>And the design choices are exactly what enterprises need:</p><h3><strong>A single request can produce many event updates</strong></h3><p>One client request can result in many event updates, transformed into stable UI-ready notifications so you can build rich interfaces.</p><h3><strong>Fully bidirectional, approval-native</strong></h3><p>The protocol is fully bidirectional. 
The server can initiate requests when the agent needs input, &#8220;like an approval,&#8221; and pause until the client responds.</p><p>That pattern is the difference between:</p><ul><li><p>&#8220;agent takes actions&#8221;</p></li><li><p>and &#8220;agent takes actions safely&#8221;</p></li></ul><h3><strong>A portable transport layer</strong></h3><p>OpenAI says the transport is JSON-RPC over stdio (JSONL), making it straightforward to build bindings in many languages.</p><p>If you build internal developer platforms, this should ping your radar.</p><p>The future agent ecosystem will not be &#8220;one UI.&#8221;</p><p>It will be:</p><ul><li><p>IDE</p></li><li><p>terminal</p></li><li><p>browser</p></li><li><p>desktop app</p></li><li><p>internal portals</p></li></ul><p>Protocols win.</p><div><hr></div><h2><strong>Prism: the document becomes the workspace</strong></h2><p>Prism is framed as science, but the pattern is bigger.</p><p>OpenAI introduced <strong>Prism on January 27, 2026</strong> as a free AI-native workspace for scientists to write and collaborate on research, powered by GPT-5.2, with unlimited projects and collaborators, available to anyone with a ChatGPT personal account.</p><p>Prism is:</p><ul><li><p>cloud-based</p></li><li><p>LaTeX-native</p></li><li><p>built for real-time collaboration without local installs</p></li></ul><p>The important thing is not &#8220;LaTeX.&#8221;</p><p>It is this:</p><p>AI is moving from a side chat into the place where the work actually lives.</p><p>Expect the same pattern to eat enterprise docs:</p><ul><li><p>architecture docs</p></li><li><p>incident reviews</p></li><li><p>security exception narratives</p></li><li><p>audits</p></li><li><p>proposals</p></li><li><p>runbooks</p></li></ul><p>Once the workspace is AI-native, &#8220;asking&#8221; becomes &#8220;editing in place.&#8221;</p><p>That is a workflow shift, not a feature.</p><div><hr></div><h2><strong>Anthropic: the enterprise-safe shape is propose &#8594; verify &#8594; 
approve</strong></h2><h3><strong>Claude on Mars is a process demo, not a space demo</strong></h3><p>Anthropic describes Claude using vision to plan a Mars rover &#8220;breadcrumb trail.&#8221; The waypoints were then run through a simulation with &#8220;over 500,000 variables,&#8221; engineers reviewed the plan, only minor changes were needed, and the route held up. Engineers estimate this approach can cut route planning time in half.</p><p>Steal the shape:</p><ul><li><p>AI proposes</p></li><li><p>systems validate</p></li><li><p>humans approve</p></li><li><p>execution happens</p></li></ul><p>That is exactly how we should deploy agents into:</p><ul><li><p>infrastructure changes</p></li><li><p>incident response</p></li><li><p>compliance workflows</p></li><li><p>ticket automation</p></li></ul><h3><strong>ServiceNow + Claude is distribution inside enterprise muscle memory</strong></h3><p>ServiceNow is targeting a 50% reduction in time-to-implement for customers using Claude, and early testing showed up to 95% reduction in seller prep time via a Claude-powered coaching tool.</p><p>They also state Claude is the default model for Build Agent and a preferred model across the ServiceNow AI Platform.</p><p>If you run IT, ServiceNow is not just where tickets live.</p><p>It is becoming where agents live.</p><div><hr></div><h2><strong>Google: the browser becomes an agent</strong></h2><p>Google&#8217;s January recap highlights:</p><ul><li><p>Gemini app connecting to Google apps for personalized help (opt-in, beta)</p></li><li><p>Chrome features built on Gemini 3 including &#8220;auto browse&#8221; for multi-step chores like booking travel or scheduling appointments</p></li><li><p>Gemini 3 as the default model for AI Overviews globally</p></li></ul><p>This matters because it signals where the next &#8220;default agent surface&#8221; sits for most humans:</p><p>The browser.</p><p>And now we have two major &#8220;browser is agentic&#8221; moves at once:</p><ul><li><p>OpenAI Atlas as a ChatGPT-native 
browser</p></li><li><p>Google Chrome adding Gemini 3 auto-browse behavior</p></li></ul><p>Different approaches.</p><p>Same destination.</p><div><hr></div><h2><strong>What to do this week if you lead IT, security, or a platform team</strong></h2><p>Here&#8217;s the practical playbook I&#8217;d run right now.</p><h3><strong>1) Pick two workflows with obvious ROI</strong></h3><p>Do not start with &#8220;company-wide AI.&#8221;</p><p>Start with two that pay back fast:</p><ul><li><p>Incident triage and root cause acceleration</p></li><li><p>Change impact analysis with approvals</p></li></ul><p>Frontier&#8217;s own example describes collapsing root-cause identification from hours to minutes by pulling together logs, docs, workflows, and code.</p><h3><strong>2) Build a context map, not an agent</strong></h3><p>List your truth sources:</p><ul><li><p>CMDB</p></li><li><p>runbooks</p></li><li><p>incident history</p></li><li><p>logs, metrics, traces</p></li><li><p>change calendars</p></li><li><p>identity and permissions</p></li><li><p>repos and pipelines</p></li></ul><p>Then decide what the agent can read, what it can write, and where it must ask.</p><h3><strong>3) Standardize the &#8220;approval handshake&#8221;</strong></h3><p>Codex App Server bakes this pattern in: the agent requests approval and pauses until the client responds.</p><p>Make that your enterprise standard:</p><ul><li><p>read-only by default</p></li><li><p>propose diffs</p></li><li><p>require approvals for write actions</p></li><li><p>log every decision</p></li></ul><h3><strong>4) Choose your distribution surfaces early</strong></h3><p>Where should the agent live?</p><ul><li><p>ITSM (ServiceNow)</p></li><li><p>browser (Atlas, Chrome)</p></li><li><p>IDE and CLI (Codex harness)</p></li><li><p>doc workspaces (Prism-style)</p></li></ul><p>If you force a new UI, you lose.</p><h3><strong>5) Measure quality on real work</strong></h3><p>If you cannot measure outcomes, you cannot scale trust.</p><p>Frontier&#8217;s 
evaluation and optimization emphasis is the right direction.</p><div><hr></div><h2><strong>My take</strong></h2><p>This is the week agents stopped being a feature and started becoming infrastructure.</p><ul><li><p><strong>Frontier</strong> is the enterprise control plane.</p></li><li><p><strong>Atlas</strong> is a high-distribution workflow surface where context and action can meet.</p></li><li><p><strong>Codex App Server</strong> is the runtime protocol blueprint: event streams, portability, approvals.</p></li><li><p><strong>Prism</strong> shows the doc-workflow future: AI inside the workspace, not beside it.</p></li><li><p><strong>Claude on Mars</strong> proves the safe enterprise shape: propose, verify, approve.</p></li><li><p><strong>ServiceNow + Claude</strong> and <strong>Chrome + Gemini 3</strong> show the distribution war is already underway.</p></li></ul><p>The question is not whether AI will change how work gets done.</p><p>The question is whether you are building:</p><ul><li><p>shared context</p></li><li><p>safe runtimes</p></li><li><p>high-trust approvals</p></li><li><p>and distribution inside real workflows</p></li></ul><p>Because that is where the advantage will live.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Vibe coding works best when the user is you]]></title><description><![CDATA[Thailand changed my life. Vibe coding changed how I build. I created my own Thai-learning accelerator to practice daily with less friction.]]></description><link>https://www.techwithdarin.com/p/vibe-coding-works-best-when-the-user</link><guid isPermaLink="false">https://www.techwithdarin.com/p/vibe-coding-works-best-when-the-user</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sun, 01 Feb 2026 18:47:05 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/92097b4f-40b9-4717-9208-a0ca2b7beed9_1536x1024.png" length="0" type="image/png"/><content:encoded><![CDATA[<p>I fell in love with Thailand a little over two years ago.</p><p>It started on an adventure sail across the Andaman Sea with 13 strangers from around the world. No shared background. No shared routine. Just long days on the water, stories at night, and that rare feeling that your life just got wider.</p><p>A few months later, I went back and met an unexpected love. Her name is Arisara.</p><p>Now we&#8217;re working toward building our future together in rural northeast Thailand.</p><p>And that&#8217;s where &#8220;learning Thai&#8221; stopped being a nice-to-have.</p><p>It became personal.</p><p>Her English is improving. My Thai has been harder. Not because I don&#8217;t care. Because consistency is brutal when the practice loop is clunky.</p><p>So I built my own accelerator.</p><p>Not as a startup.</p><p>As a tool I needed.</p><h2><strong>The best reason to vibe code</strong></h2><p>Vibe coding is the fastest way I&#8217;ve ever seen to go from idea to working software.</p><p>But the real unlock is not the AI.</p><p>It&#8217;s the feedback loop.</p><p>The best vibe-coding projects start when:</p><ul><li><p>you have a real problem</p></li><li><p>you feel the friction daily</p></li><li><p>you can tell, instantly, what&#8217;s working and what&#8217;s not</p></li></ul><p>When you build for yourself, you don&#8217;t need motivation hacks. You don&#8217;t need fake deadlines. You just want the problem gone.</p><p>That&#8217;s why personal tools are the perfect playground for vibe coding.</p><h2><strong>What I built</strong></h2><p>I built Sawasdee Speak:</p><p>https://sawasdeespeak.com/</p><p>The goal is simple: make Thai practice easier than procrastination.</p><p>Short sessions. Real repetition. Focus on phrases I actually use. 
Minimal friction to start.</p><p>Not &#8220;the best Thai app.&#8221;</p><p>The best Thai app for my life.</p><h2><strong>How I built it</strong></h2><p>I used OpenAI (ChatGPT) to turn the messy idea into a one-page spec and a tight MVP loop.</p><p>Then I shipped it using a stack optimized for speed and deployment, not ideology:</p><ul><li><p>Concept development and requirements in ChatGPT</p></li><li><p>Built on Google AI Studio</p></li><li><p>Running on Google Cloud Run</p></li><li><p>Domain hosted on Amazon Web Services</p></li></ul><p>This mix matters.</p><p>Vibe coding rewards momentum. The stack is whatever keeps you moving.</p><h2><strong>What vibe coding looked like in practice</strong></h2><p>This is the workflow that worked for me, and it&#8217;s repeatable.</p><h3><strong>1) Describe outcomes, not implementation</strong></h3><p>I did not start with &#8220;build a language learning app.&#8221;</p><p>I started with:</p><ul><li><p>I want a daily Thai practice loop</p></li><li><p>I want short reps</p></li><li><p>I want the phrases I care about</p></li><li><p>I want low friction: open &#8594; practice &#8594; done</p></li></ul><h3><strong>2) Get a one-page MVP spec</strong></h3><p>I asked for:</p><ul><li><p>the smallest possible feature set</p></li><li><p>the screens and states</p></li><li><p>what data needs to exist</p></li><li><p>what is explicitly out of scope</p></li></ul><p>That &#8220;out of scope&#8221; list is the guardrail that keeps vibe coding from turning into a bloated mess.</p><h3><strong>3) Build the thinnest version that works</strong></h3><p>The temptation is to accept every shiny idea.</p><p>I resisted.</p><p>If it didn&#8217;t improve the practice loop, it didn&#8217;t ship.</p><h3><strong>4) Deploy early so the feedback is real</strong></h3><p>Local builds lie.</p><p>Production tells the truth.</p><p>Once it was live, the app stopped being a project and became a habit. 
That&#8217;s the entire point.</p><h2><strong>Why building for yourself is the cheat code</strong></h2><p>When you are the user, you automatically have:</p><ul><li><p>real requirements (you feel what&#8217;s missing)</p></li><li><p>instant QA (you notice what&#8217;s annoying)</p></li><li><p>relentless prioritization (only what matters survives)</p></li></ul><p>Most projects fail because the feedback loop is imaginary.</p><p>Personal tools don&#8217;t have that problem.</p><h2><strong>Guardrails so it doesn&#8217;t become future pain</strong></h2><p>Personal apps have a funny habit of becoming real.</p><p>A few rules I follow so &#8220;quick&#8221; doesn&#8217;t turn into &#8220;fragile&#8221;:</p><ul><li><p>keep secrets out of the frontend</p></li><li><p>add basic logging early</p></li><li><p>assume inputs are hostile if the app is public</p></li><li><p>rate limit anything exposed to the internet</p></li><li><p>write down what you will not build (and honor it)</p></li></ul><p>Speed is great. 
Maintainable speed is better.</p><h2><strong>If you want to try vibe coding this week</strong></h2><p>Don&#8217;t start with &#8220;a SaaS.&#8221;</p><p>Start with one recurring irritation in your life or work.</p><p>Do a single-sitting sprint:</p><ol><li><p>Write the problem in 3 bullets</p></li><li><p>Define the loop (open &#8594; do the thing &#8594; done)</p></li><li><p>Ask an LLM for a one-page MVP spec</p></li><li><p>Build only that</p></li><li><p>Deploy it somewhere you will actually use</p></li></ol><p>That&#8217;s it.</p><p>You are not proving you can build a company.</p><p>You are proving you can ship something useful.</p><h2><strong>What I&#8217;m improving next</strong></h2><p>Now that I&#8217;m using Sawasdee Speak, the next steps are obvious because I can feel the friction:</p><ul><li><p>faster &#8220;start session&#8221; flow</p></li><li><p>smarter repetition (missed items return more often)</p></li><li><p>pronunciation support</p></li><li><p>lightweight streak or reminder loop</p></li></ul><p>Not because a roadmap said so.</p><p>Because I want the practice loop to be automatic.</p><h2><strong>Closing</strong></h2><p>I didn&#8217;t build Sawasdee Speak to create another project.</p><p>I built it because I&#8217;m building a life in Thailand, and language is part of that.</p><p>Vibe coding is powerful, but the real advantage is purpose.</p><p>Build something you need.</p><p>Use it tomorrow.</p><p>Improve it next week.</p><p>That&#8217;s the cheat code.</p>]]></content:encoded></item><item><title><![CDATA[Claude is weirdly better at Excel than Copilot]]></title><description><![CDATA[I keep running into a reality that feels both funny and kind of sad: Claude is currently better at creating and editing Excel workbooks than Microsoft Copilot is.]]></description><link>https://www.techwithdarin.com/p/claude-is-weirdly-better-at-excel</link><guid isPermaLink="false">https://www.techwithdarin.com/p/claude-is-weirdly-better-at-excel</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Thu, 22 Jan 2026 00:20:11 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/d8e5c9ec-b72d-40a5-bdc4-e3209e85937c_1200x630.png" length="0" type="image/png"/><content:encoded><![CDATA[<p>I keep running into a reality that feels both funny and kind of sad:</p><p><strong>Claude is currently better at creating and editing Excel workbooks than Microsoft Copilot is.</strong></p><p>And yeah&#8230; you&#8217;d think the company that <em>owns</em> Office would have this locked down.</p><h3><strong>What Claude does (that feels like cheating)</strong></h3><p>Claude can straight-up <strong>generate an actual .xlsx</strong>&#8212;with real sheets, formulas, formatting, tables, the whole thing&#8212;and hand it back to you as a file you can open in Excel. It can also take an existing workbook and <strong>apply edits</strong> and return the updated file.</p><p>That changes the game because &#8220;making a spreadsheet&#8221; isn&#8217;t just about answering questions. 
It&#8217;s about producing the artifact:</p><ul><li><p>New workbook from scratch</p></li><li><p>New tabs + consistent structure</p></li><li><p>Formulas that actually work</p></li><li><p>Clean formatting</p></li><li><p>Repeatable templates</p></li></ul><p>Claude&#8217;s file-creation workflow is designed for exactly that.</p><h3><strong>What Copilot in Excel does (and why it feels constrained)</strong></h3><p>Copilot in Excel absolutely has useful capabilities&#8212;helping create/understand formulas, summarize data, and analyze what&#8217;s in front of you.</p><p>But the experience often hits &#8220;product rails&#8221; fast:</p><ul><li><p><strong>Your data needs to be structured</strong> as an <strong>Excel table</strong> or a very specific &#8220;supported range&#8221; format (unique headers, no merged cells, no blank headers, etc.).</p></li><li><p>It generally expects your file to be in <strong>OneDrive/SharePoint with AutoSave on</strong>.</p></li><li><p>Microsoft is actively shifting how this works: they note <strong>&#8220;App Skills&#8221; in Excel are being removed by late February 2026</strong>, pointing people toward &#8220;Agent Mode,&#8221; &#8220;Copilot Chat,&#8221; or &#8220;Analyst.&#8221; That kind of transition usually means uneven experiences while the plane is being rebuilt mid-flight.</p></li></ul><p>And then there&#8217;s the new <strong>=COPILOT() function</strong> concept (cool idea): AI inside the grid as a formula. 
But Microsoft&#8217;s own docs emphasize constraints that matter in real spreadsheet work:</p><ul><li><p>It only sees the <strong>prompt + the ranges you pass in</strong> (not the whole workbook, not other files, not enterprise info, not the internet).</p></li><li><p>It&#8217;s <strong>non-deterministic</strong> (can change results on recalculation), and they explicitly warn against using AI outputs for <strong>financial reporting / legal / other high-stakes scenarios</strong>.</p></li><li><p>There are <strong>usage limits</strong> (100 calls / 10 minutes, 300 / hour in the current rollout).</p></li></ul><p>None of that is &#8220;bad.&#8221; It&#8217;s enterprise reality: governance, compliance, repeatability, safe defaults.</p><p>But it explains why Copilot can feel like it&#8217;s helping you <em>inside a narrow lane</em>, while Claude feels like it&#8217;s willing to rebuild the whole workbook with you.</p><h3><strong>The funniest part: Claude is literally in Excel</strong></h3><p>This is the part that makes me laugh every time:</p><p>There&#8217;s a &#8220;<strong>Claude by Anthropic in Excel</strong>&#8221; integration on Microsoft&#8217;s own marketplace&#8212;positioned as a tool that can analyze, edit, and create workbooks, including multi-tab workbooks, with change tracking and explanations.</p><p>So the story isn&#8217;t &#8220;Microsoft can&#8217;t do AI.&#8221;</p><p>It&#8217;s more like: <strong>the best spreadsheet experience right now is coming from the outside</strong>, while Microsoft is still tightening the bolts on the inside.</p><h3><strong>Why this keeps happening</strong></h3><p>Spreadsheets are messy. 
Real ones are:</p><ul><li><p>multi-tab models</p></li><li><p>weird headers</p></li><li><p>legacy formatting</p></li><li><p>half-table / half-dashboard Frankenbooks</p></li><li><p>brittle formulas with tribal-knowledge assumptions</p></li></ul><p>Copilot&#8217;s sweet spot is clean, structured data you can safely summarize, filter, chart, or extend.</p><p>Claude&#8217;s sweet spot (right now) is: &#8220;Give me the goal, I&#8217;ll produce the whole deliverable.&#8221;</p><h3><strong>The workflow I recommend (right now)</strong></h3><p>If you live in Excel, the best setup I&#8217;ve found is a split-brain approach:</p><p><strong>Use Claude for:</strong></p><ul><li><p>Building a workbook from scratch (tabs, structure, formatting, formulas)</p></li><li><p>Refactoring a messy file into a clean model (standardized tables, consistent naming)</p></li><li><p>Producing reusable templates (forecast model, KPI dashboard shell, variance tracker)</p></li><li><p>Big &#8220;edit this entire spreadsheet&#8221; asks (rename columns across tabs, re-map assumptions)</p></li></ul><p><strong>Use Copilot for:</strong></p><ul><li><p>Quick in-Excel analysis once your data is already in a compliant shape (tables/supported ranges)</p></li><li><p>Low-stakes summarization/categorization on a defined range, especially with =COPILOT() (with the constraints in mind)</p></li></ul><h3><strong>Prompt pack (steal these)</strong></h3><p><strong>Claude prompts</strong></p><ol><li><p>&#8220;Create an Excel workbook for monthly P&amp;L variance: Actual vs Budget vs Prior Year, with a Summary tab and 12 monthly tabs. Use tables, named ranges, and variance % formulas.&#8221;</p></li><li><p>&#8220;Here&#8217;s my workbook. 
Normalize it: convert ranges to tables where possible, remove merged cells, standardize headers, and return an updated .xlsx.&#8221;</p></li><li><p>&#8220;Build a KPI dashboard tab with slicers-ready tables and charts for Revenue, Gross Margin, CAC, Churn, and NRR.&#8221;</p></li><li><p>&#8220;Add a Scenario tab with Base/Best/Worst assumptions and propagate them through the model.&#8221;</p></li><li><p>&#8220;Explain the calculation chain for these cells and then rewrite the model so the assumptions are centralized.&#8221;</p></li></ol><p><strong>Copilot in Excel prompts</strong></p><ol><li><p>&#8220;Highlight anomalies: values 2+ std dev from the mean.&#8221;</p></li><li><p>&#8220;Create a new column formula that buckets these rows into 5 categories based on these keywords.&#8221;</p></li><li><p>&#8220;Summarize the main drivers of change month-over-month.&#8221;</p></li><li><p>&#8220;Create a PivotTable that groups by Region and Product and shows Revenue and Margin.&#8221;</p></li><li><p>=COPILOT("Classify customer feedback into themes", A2:A200)</p></li></ol><h3><strong>My take</strong></h3><p>This isn&#8217;t really a &#8220;model&#8221; story. It&#8217;s a &#8220;product&#8221; story.</p><p>Microsoft is optimizing for:</p><ul><li><p>tenant controls</p></li><li><p>predictable behavior</p></li><li><p>data boundaries</p></li><li><p>compliance posture</p></li></ul><p>Claude is optimizing for:</p><ul><li><p>generating the artifact</p></li><li><p>doing multi-step edits end-to-end</p></li><li><p>moving fast</p></li></ul><p><strong>And the punchline is:</strong> the &#8220;best Excel AI&#8221; experience right now might be the one that&#8217;s least afraid to just&#8230; <em>touch the spreadsheet</em>.</p><p>I&#8217;m not sure this gap lasts&#8212;Microsoft moves fast when it decides something matters. But today, if you care about spreadsheets, the surprising move is simple:</p><p><strong>Don&#8217;t pick one assistant. 
Pick the right assistant for the shape of the job.</strong></p>]]></content:encoded></item><item><title><![CDATA[New Year, New Updates: Agents for Work, Walls for Health]]></title><description><![CDATA[Claude goes after productivity. 
OpenAI goes after trust.]]></description><link>https://www.techwithdarin.com/p/new-year-new-updates-agents-for-work</link><guid isPermaLink="false">https://www.techwithdarin.com/p/new-year-new-updates-agents-for-work</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Fri, 16 Jan 2026 00:32:52 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/2c2f2475-37db-4509-ae66-b5fe9e8bd60e_1536x1024.png" length="0" type="image/png"/><content:encoded><![CDATA[<p>We&#8217;ve spent the last two years arguing about which model is &#8220;best.&#8221;</p><p>This year&#8217;s fight is different.</p><p>It&#8217;s not just models. 
It&#8217;s product design.</p><p>The question is who can ship AI that people actually use every day without it turning into a privacy nightmare or a workflow mess.</p><p>This week&#8217;s clearest signal came from two launches aimed at totally different sectors:</p><ul><li><p>Claude Cowork: an agent-style workspace that&#8217;s clearly gunning for business ops and day-to-day execution</p></li><li><p>ChatGPT Health: a dedicated, walled-off health experience that feels like OpenAI taking &#8220;sensitive context&#8221; seriously</p></li></ul><p>Different sectors. Same direction. Both are interesting for one reason:</p><p>AI is splitting into specialized experiences.</p><div><hr></div><h2><strong>Claude Cowork: the missing bridge between &#8220;chat&#8221; and &#8220;work&#8221;</strong></h2><p>Cowork is the first Claude release that feels like it&#8217;s trying to become a true daily driver, not just a smart conversation partner.</p><h3><strong>First impressions from Darin</strong></h3><p><strong>The UI needs to be optimized for the average non-technical user</strong></p><p>Cowork is clearly aiming beyond developers, but the UI still feels like it was designed by and for people who live in tools all day.</p><p>A bunch of things I&#8217;ve already personalized in ChatGPT feel like the default posture in Cowork. That&#8217;s not a complaint. It&#8217;s actually interesting. But it also means Cowork may feel &#8220;pre-tuned&#8221; in a way that can be confusing for someone new to this.</p><p><strong>It&#8217;s surprisingly refreshing for business tasks</strong></p><p>Cowork bridges the gap between the conversational feel of ChatGPT and the more builder-style vibe of Codex in the OpenAI ecosystem.</p><p>For business work, that middle ground matters.</p><p>Most business tasks aren&#8217;t clean coding problems. 
They&#8217;re messy documents, questionable spreadsheets, exports from five systems, and someone asking for a summary &#8220;by EOD&#8221; with zero context.</p><p>Cowork feels like it was built for that kind of chaos.</p><p><strong>This preview feels long overdue</strong></p><p>It&#8217;s overdue in the best way.</p><p>Claude needed a bigger product story than &#8220;chat with a really good model.&#8221; Cowork finally feels like the start of that shift.</p><p>And honestly, I suspect we&#8217;ll see an OpenAI response on this front. It&#8217;s been a while since we&#8217;ve had a meaningful &#8220;work mode&#8221; evolution, and it&#8217;s long overdue in my opinion.</p><div><hr></div><h2><strong>ChatGPT Health: WebMD instincts, but finally integrated</strong></h2><p>This one hits a different nerve because it points at the most personal category of data people have.</p><h3><strong>First impressions from Darin</strong></h3><p><strong>It reminds me of the early days of the internet</strong></p><p>ChatGPT Health brought back a memory I didn&#8217;t expect: the era when the internet became the place you went first for health questions.</p><p>The WebMD instinct.</p><p>The difference now is the experience is dramatically more detailed and interactive. 
It feels less like reading static pages and more like having a structured way to think through what&#8217;s happening.</p><p><strong>The record integration is the real leap</strong></p><p>What&#8217;s genuinely interesting is how seamless it is to pull in your health records and then do your own brainstorming.</p><p>You can ask questions you forgot to ask your doctor.</p><p>You can work through patterns over time.</p><p>You can translate medical language into plain English.</p><p>You can walk into your next appointment more prepared.</p><p>That&#8217;s not just &#8220;AI answering health questions.&#8221;</p><p>That starts to look like a personal health platform.</p><div><hr></div><h2><strong>The real story: AI is becoming two things at once</strong></h2><p>Put these two launches next to each other and the direction gets loud.</p><h3><strong>Agents that act (Cowork)</strong></h3><p>AI that can move work forward, not just talk about it.</p><h3><strong>Walls that protect (Health)</strong></h3><p>AI that can handle sensitive context without leaking it into everything else.</p><p>If the last couple years were about bigger models, 2026 feels like it&#8217;s shaping up to be about better product surfaces.</p><div><hr></div><h2><strong>Practical take: how I&#8217;d use each without getting burned</strong></h2><h3><strong>Cowork: start like you&#8217;re onboarding a new hire</strong></h3><p>I&#8217;d treat Cowork like someone new joining the team.</p><p>Give it a contained scope first.</p><p>Make sure the outputs are predictable.</p><p>Build trust before you expand access.</p><p>Cowork has the potential to be a daily driver, but the UX will need to evolve if it wants to win the average non-technical user without friction.</p><h3><strong>ChatGPT Health: use it as a prep tool, not a doctor</strong></h3><p>This is where I&#8217;d keep it grounded.</p><p>Best uses:</p><ul><li><p>summarizing records into understandable language</p></li><li><p>building a question list for 
appointments</p></li><li><p>turning vague symptoms into a clear timeline</p></li><li><p>helping you advocate for yourself</p></li></ul><p>Worst uses:</p><ul><li><p>self-diagnosing serious issues</p></li><li><p>making medication decisions</p></li><li><p>replacing real clinical care</p></li></ul><p>The biggest risk isn&#8217;t the tool. It&#8217;s over-trust.</p><div><hr></div><h2><strong>Non-tech reader summary</strong></h2><ul><li><p>Claude Cowork is AI stepping into your work life and trying to actually finish tasks, not just answer prompts.</p></li><li><p>ChatGPT Health is AI stepping into a sensitive category and trying to do it with more structure and separation.</p></li><li><p>The next AI winners won&#8217;t just be the ones that sound smartest. They&#8217;ll be the ones that ship AI you can use and trust in the real world.</p></li></ul><div><hr></div><h2><strong>Closing take</strong></h2><p>Cowork makes me think: AI is becoming an employee.</p><p>Health makes me think: AI is becoming a vault.</p><p>Agents for work. Walls for health.</p><p>New year, new updates, and the product era is officially here.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.techwithdarin.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Tech with Darin is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[My 2025 “Year with ChatGPT”: Confessions of a Power User (and Why You Should Try This)]]></title><description><![CDATA[Every December we all do the same thing:]]></description><link>https://www.techwithdarin.com/p/my-2025-year-with-chatgpt-confessions</link><guid isPermaLink="false">https://www.techwithdarin.com/p/my-2025-year-with-chatgpt-confessions</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Tue, 23 Dec 2025 07:32:30 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Cdzc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49d434e6-4229-43f6-8582-ec84641b8b78_1320x2868.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Every December we all do the same thing:</p><ul><li><p>We pretend we&#8217;re going to &#8220;slow down next year.&#8221;</p></li><li><p>We make one brave resolution.</p></li><li><p>And then the internet drops a year-in-review that makes us feel <em>seen</em>.</p></li></ul><p>This year, ChatGPT joined the party with <strong>Your Year with ChatGPT</strong> &#8212; an optional recap that looks back on how you used it in 2025, including high-level themes and some ridiculous stats (in the best way).</p><p>And friends&#8230; my recap called me out.</p><h2><strong>My 2025 Stats (aka: Proof I Have a Type)</strong></h2><p>Here&#8217;s what my year looked like:</p><ul><li><p><strong>13.61K messages sent</strong></p></li><li><p><strong>569 total chats</strong></p></li><li><p><strong>193 images generated</strong></p></li><li><p><strong>17.19K em-dashes exchanged</strong></p></li><li><p><strong>Top 1% of messages sent</strong></p></li><li><p><strong>First 0.5% of users</strong></p></li><li><p><strong>Chattiest day: Aug 25</strong></p></li></ul><p>If you&#8217;re reading that like, &#8220;Darin&#8230; are you okay?&#8221;</p><p>Yes.</p><p>And also: I clearly use ChatGPT the way some people use coffee. And yes, I drink a lot of coffee as well.
</p><p>(screenshot for reference)</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!Cdzc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49d434e6-4229-43f6-8582-ec84641b8b78_1320x2868.png" width="1320" height="2868" alt="My 2025 Year with ChatGPT recap screenshot" loading="lazy"></figure></div><h2><strong>What This Actually Says About How I Work</strong></h2><p>This isn&#8217;t &#8220;wow, look at my stats.&#8221;</p><p>It&#8217;s more like a mirror for how I think.</p><h3><strong>1) I don&#8217;t use AI as a search engine &#8212; I use it as a workbench</strong></h3><p><strong>569 chats</strong> tells you everything.</p><p>That&#8217;s not &#8220;what&#8217;s the capital of France&#8221; usage.</p><p>That&#8217;s:</p><ul><li><p>spin up a new thread for a new problem</p></li><li><p>break big ideas into small steps</p></li><li><p>iterate fast</p></li><li><p>keep moving</p></li></ul><h3><strong>2) I iterate out loud</strong></h3><p><strong>13.61K messages</strong> and <strong>Top 1%</strong> = I&#8217;m not here for one-and-done prompts.</p><p>I&#8217;m here for:</p><ul><li><p>&#8220;Try
again.&#8221;</p></li><li><p>&#8220;Better.&#8221;</p></li><li><p>&#8220;Make it tighter.&#8221;</p></li><li><p>&#8220;Now make it funnier.&#8221;</p></li><li><p>&#8220;Now make it clearer for non-tech readers.&#8221;</p></li><li><p>&#8220;Now&#8230; cite your sources.&#8221;</p></li></ul><p>This is how good work gets made. Not by inspiration &#8212; by reps.</p><h3><strong>3) My punctuation has a personality</strong></h3><p><strong>17.19K em-dashes exchanged</strong> is objectively hilarious.</p><p>It means my conversations are full of:</p><ul><li><p>asides</p></li><li><p>nuance</p></li><li><p>mid-flight pivots</p></li><li><p>&#8220;wait, but what about&#8230;&#8221;</p></li></ul><p>So yes, the recap basically said:</p><p><strong>&#8220;This guy builds in public, in real time, with chaos energy.&#8221;</strong></p><p>Accurate.</p><h3><strong>4) I&#8217;m using image generation like a creative scratchpad</strong></h3><p><strong>193 images generated</strong> isn&#8217;t &#8220;I&#8217;m an artist now.&#8221;</p><p>It&#8217;s:</p><ul><li><p>thumbnails</p></li><li><p>visual concepts</p></li><li><p>quick mockups</p></li><li><p>&#8220;show me 5 options&#8221;</p></li><li><p>&#8220;okay now make it more modern&#8221;</p></li><li><p>&#8220;okay now make it less cursed&#8221;</p></li></ul><p>If you create content (or presentations, or training, or internal docs), this is a cheat code.</p><h2><strong>The Bigger Point: This Isn&#8217;t a &#8220;Wrap.&#8221; It&#8217;s a Feedback Loop.</strong></h2><p>The most underrated part of this feature isn&#8217;t the numbers.</p><p>It&#8217;s the idea that you can step back and ask:</p><ul><li><p>What did I actually spend time thinking about this year?</p></li><li><p>What kept coming up?</p></li><li><p>What did I learn?</p></li><li><p>What did I build?</p></li><li><p>Where did I get stuck?</p></li><li><p>What patterns show up in how I work?</p></li></ul><p>That&#8217;s not a gimmick. 
That&#8217;s a <strong>reflection tool</strong>.</p><p>And reflection tools are how you compound.</p><h2><strong>Why You Should Use &#8220;Your Year with ChatGPT&#8221; (Even If Your Stats Are Tiny)</strong></h2><p>Because it&#8217;s not a flex.</p><p>It&#8217;s a <strong>baseline</strong>.</p><p>If you&#8217;re early on, it helps you notice:</p><ul><li><p>where AI is already helping you</p></li><li><p>where it&#8217;s not</p></li><li><p>what kinds of prompts actually create value for you</p></li></ul><p>And if you&#8217;re a heavy user, it helps you tune your workflow:</p><ul><li><p>fewer dead-end chats</p></li><li><p>better prompt patterns</p></li><li><p>more repeatable &#8220;systems&#8221; instead of one-off convos</p></li></ul><h2><strong>A Challenge for 2026</strong></h2><p>If you haven&#8217;t used ChatGPT much, don&#8217;t start by asking it random trivia.</p><p>Start by giving it <em>your actual life</em> (the non-sensitive parts):</p><ul><li><p>That email you need to write</p></li><li><p>The plan you&#8217;re trying to shape</p></li><li><p>The decision you&#8217;re stuck on</p></li><li><p>The thing you want to learn but keep postponing</p></li><li><p>The content idea you&#8217;ve started 7 times</p></li></ul><p>Then iterate.</p><p>Don&#8217;t treat it like a vending machine.</p><p>Treat it like a collaborator.</p><h2><strong>The One Setting That Matters</strong></h2><p>Per OpenAI&#8217;s release notes, to see the full experience you need:</p><ul><li><p><strong>Memory ON</strong></p></li><li><p><strong>Reference Chat History ON</strong></p></li><li><p>A minimum activity threshold</p></li></ul><p>It&#8217;s optional, personalized, and rolling out gradually.</p><p>That&#8217;s it.</p><h2><strong>Your Turn</strong></h2><p>If you got your recap&#8230;</p><p><strong>What did it reveal about you?</strong></p><p>Were you a &#8220;one chat a month&#8221; person &#8212; or did you accidentally build a second brain?</p><p>Drop your funniest stat (or biggest 
surprise) in the comments.</p>]]></content:encoded></item><item><title><![CDATA[Cloud Year in Review 2025: Agentic Ops, Multicloud Plumbing, and the Stuff That Actually Survived Production]]></title><description><![CDATA[2025 cloud wasn&#8217;t about migration.
It was about running: agentic ops, multicloud seams, FinOps automation, security in pipelines&#8212;and the myth of &#8220;zero legacy.&#8221;]]></description><link>https://www.techwithdarin.com/p/cloud-year-in-review-2025-agentic</link><guid isPermaLink="false">https://www.techwithdarin.com/p/cloud-year-in-review-2025-agentic</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Sat, 20 Dec 2025 15:24:49 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/3af7b23d-d56d-478a-b079-7d8de238d954_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If 2025 had a single theme across AWS, Azure, and Google Cloud, it wasn&#8217;t &#8220;who has the best model.&#8221;</p><p>It was: <strong>who is turning cloud into an operating system for real-world automation</strong> &#8212; where agents, workflows, data, security, and cost controls all have to work together <em>without</em> humans babysitting every edge case.</p><p><strong>My Take:</strong> Think of the last few years like moving into a new skyscraper. We spent a ton of time just getting the furniture inside (migration).
<strong>2025 is the year teams finally hooked up the smart building sensors, automated the climate control, and streamlined billing for every floor so the building actually runs itself.</strong></p><div><hr></div><h2><strong>The broader 2025 shift nobody should ignore: AI sprinkled into everything (useful&#8230; sometimes)</strong></h2><p>Across all three clouds, we saw the same pattern: <strong>AI features getting embedded into an assortment of services</strong>&#8212;from identity and policy surfaces to ops workflows and developer tooling. Some of it is genuinely helpful. Some of it is&#8230; a checkbox.</p><p><strong>My Take (consulting lens):</strong> This changes <em>our</em> job. We need to stop selling &#8220;AI transformation&#8221; like it automatically equals &#8220;zero legacy,&#8221; because that mindset is how orgs get hurt.</p><p>Legacy doesn&#8217;t just mean mainframes.</p><p>Legacy can be:</p><ul><li><p>the <strong>tech debt</strong> you inherited after cutting a decade of tribal knowledge.</p></li><li><p>the &#8220;new mess&#8221; that shows up when an outsourced team ships fast but never read the documentation.</p></li><li><p>the organizational stress of believing you can replace your <strong>tenured</strong> staff with AI (or just saying you are), or with lower-cost labor.</p></li><li><p>the damage from a &#8220;big reorg&#8221; that breaks things that didn&#8217;t need breaking.</p></li></ul><p>In 2025, cloud got more capable&#8212;but the real work is still in the trenches: <strong>stabilize, standardize, automate, govern.</strong></p><div><hr></div><h2><strong>1) The shift from &#8220;agent toys&#8221; to agentic operations</strong></h2><p>In 2025, the focus moved from &#8220;model flexing&#8221; to building <strong>agentic operations</strong>: systems that are <strong>observable, governable, and affordable</strong> to run continuously.</p><p><strong>My Take:</strong> This stopped being a model contest and became an SRE contest. 
Traceability, failure modes, blast radius, and cost control are the real differentiators now.</p><div><hr></div><h2><strong>2) Multicloud as plumbing, not strategy</strong></h2><p>Enterprise multicloud isn&#8217;t a theoretical choice anymore. It&#8217;s operational reality. And in 2025, providers started reducing the friction in the seams.</p><p><strong>My Take:</strong> The biggest multicloud failures rarely happen in the compute layer. They happen in the seams: identity, networking, policy, and data movement. Making the seams less painful is the real platform move.</p><div><hr></div><h2><strong>3) Operational modernization over simple migration</strong></h2><p>For a lot of teams, &#8220;lift-and-shift&#8221; is done &#8212; and now the real work begins: cleaning up the aftermath.</p><p>That means moving away from one-off snowflakes and toward <strong>repeatable patterns, policy-driven management, and standardized guardrails</strong> at scale.</p><p><strong>My Take:</strong> This is the part nobody brags about in keynotes, but it&#8217;s where cloud becomes a true operating model. Quotas, fleet controls, guardrails, and delivery standards matter more than &#8220;one more feature.&#8221;</p><div><hr></div><h2><strong>4) The productization of FinOps</strong></h2><p>This is where I want to be blunt: <strong>FinOps only becomes real when it stops being a report and becomes an operating loop.</strong></p><p>The &#8220;lame&#8221; version of FinOps is tagging workshops and a monthly PowerPoint that everyone politely nods at.</p><p>The 2025 version looks more like:</p><ul><li><p><strong>A unified optimization command center, not scattered recommendations.</strong></p><p>AWS continuing to mature the <em>Cost Optimization Hub</em> pattern is a big deal: one place to triage optimization opportunities, prioritize by impact/risk/effort, and track progress over time. 
I&#8217;m not sure any single feature name is the point here&#8212;the point is the move from &#8220;recommendation sprawl&#8221; to &#8220;operational queue.&#8221;</p></li><li><p><strong>Automation events: the bridge from &#8220;advice&#8221; to &#8220;action.&#8221;</strong></p><p><em>Compute Optimizer automation events</em> are a great example of where FinOps gets simpler: a centralized record of what automation changed, what the estimated savings are, and the ability to roll back when the business impact isn&#8217;t what you expected.</p></li><li><p><strong>Stakeholder-ready billing views (Finance stops calling it a black box).</strong></p><p>Better dashboards and exports matter because they&#8217;re how you reduce friction with Finance and product owners. When costs are visible, attributable, and reviewable without heroics, cloud starts behaving like a product that compounds instead of an expense that spikes.</p></li></ul><p><strong>My Take:</strong> The moment optimization becomes measurable <em>and</em> reversible, it stops being &#8220;recommendations&#8221; and starts being &#8220;repeatable operations.&#8221;</p><div><hr></div><h2><strong>5) Security shifting left into pipelines</strong></h2><p>Cloud maturity in 2025 wasn&#8217;t about buying more security tools. It was about making security and compliance part of the delivery motion:</p><ul><li><p>templates</p></li><li><p>policy-as-code</p></li><li><p>automated scanning in CI/CD</p></li><li><p>standardized environment patterns</p></li></ul><p><strong>My Take:</strong> Guardrails don&#8217;t scale when they live in meetings.
They scale when they live in pipelines.</p><div><hr></div><h2><strong>One more hot take: the recognition gap is getting weird</strong></h2><p>There&#8217;s a trend I can&#8217;t unsee: organizations love celebrating awards and culture on social media&#8212;but do a lackluster job of showing meaningful recognition to the individuals who made those awards and culture possible.</p><p>If we want mature cloud organizations, we need mature operating models&#8212;and that includes how we retain the people who actually keep the lights on, fix the snowflakes, and turn chaos into repeatable systems.</p><div><hr></div><h2><strong>My Take: 2026 predictions</strong></h2><ol><li><p><strong>Agent governance becomes as normal as IAM</strong></p><p>&#8220;Can we do it?&#8221; becomes &#8220;can we audit it?&#8221; Expect eval gates, tool-call tracing, and policy-as-code to become baseline requirements.</p></li><li><p><strong>FinOps shifts from dashboards to automation loops</strong></p><p>The winning programs won&#8217;t just report waste&#8212;they&#8217;ll automatically reduce it (with guardrails + rollback). 
Automation events are the template for where this is going.</p></li><li><p><strong>Multicloud connectivity becomes a default assumption</strong></p><p>More orgs will treat cross-cloud data movement and private connectivity as first-class, not special projects.</p></li><li><p><strong>&#8220;Legacy&#8221; gets redefined as organizational tech debt</strong></p><p>The hardest modernization work will be rebuilding operational clarity after churn: documentation drift, broken ownership, and systems nobody fully understands.</p></li><li><p><strong>Culture marketing gets audited by retention reality</strong></p><p>The most &#8220;high-performing cloud orgs&#8221; will be the ones that reward the operators and builders&#8212;not just the announcements.</p></li></ol><div><hr></div><h2><strong>Non-tech reader translation</strong></h2><p>Earlier cloud years were about <em>getting into the building.</em></p><p><strong>2025 was about making the building run itself</strong>: sensors (observability), automation (agentic ops), plumbing (multicloud connectivity), billing per floor (chargeback-ready FinOps), and fire codes enforced by default (security in pipelines).</p><p>That&#8217;s what makes automation safe, scalable, and sustainable.</p>]]></content:encoded></item><item><title><![CDATA[2025 AI Year in Review: The Hype Correction, the Agent Era, and the Tools That Actually Moved the Needle]]></title><description><![CDATA[If I had to sum up 2025 in one line: we stopped arguing about whether AI is &#8220;real,&#8221; and started arguing about whether it&#8217;s reliable.]]></description><link>https://www.techwithdarin.com/p/2025-ai-year-in-review-the-hype-correction</link><guid isPermaLink="false">https://www.techwithdarin.com/p/2025-ai-year-in-review-the-hype-correction</guid><dc:creator><![CDATA[Darin Deters]]></dc:creator><pubDate>Thu, 18 Dec 2025 14:51:54 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/323fd751-e115-4410-835b-8e14b682a863_2848x1504.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If I had to sum up 2025 in one line: <strong>we stopped arguing about whether AI is &#8220;real,&#8221; and started arguing about whether it&#8217;s reliable.</strong></p><p>This was the year AI graduated from &#8220;look what it can do&#8221; to &#8220;cool&#8230; can it do it <em>again</em>, on Tuesday, with logs, guardrails, budgets, and a business owner who expects it not to break?&#8221;</p><p>And yes&#8212;there was absolutely a vibe shift. Call it the hype correction. Call it maturity. Call it the moment we collectively realized that shipping AI into real workflows is less like magic and more like engineering.</p><div><hr></div><h2><strong>For non-tech readers: what mattered in 2025 (and why the speed is the story)</strong></h2><p>If you don&#8217;t live in the tech world, here&#8217;s the simple version:</p><p><strong>AI didn&#8217;t just get &#8220;better&#8221; this year &#8212; it got embedded into everyday tools fast.</strong></p><p>That matters because when a technology starts showing up inside the apps people already use (docs, email, browsers, coding tools, phones), it stops being optional and starts becoming part of how work gets done.</p><p>What to pay attention to:</p><ul><li><p><strong>The pace is accelerating.</strong> Big changes used to take years to become mainstream. In 2025, major updates landed monthly (sometimes weekly). The result: companies and workers had to adapt <em>while the road was being built</em>.</p></li><li><p><strong>The winners weren&#8217;t the people with the fanciest AI.</strong> The winners were the people who learned how to use AI safely and consistently inside their process.</p></li><li><p><strong>AI is becoming a &#8220;new layer&#8221; of work.</strong> Think of it like the jump from flip phones to smartphones &#8212; not one feature, but a shift in <em>how you do things</em>.
The &#8220;assistant&#8221; is becoming a normal part of writing, research, planning, and building.</p></li><li><p><strong>This isn&#8217;t about replacing everyone.</strong> The practical impact in 2025 was mostly about <em>speed and leverage</em>: some people and teams started moving 2&#8211;5x faster on specific tasks (drafting, research, testing, summarizing, prototyping). That creates pressure: expectations rise, timelines shrink, and &#8220;good enough&#8221; becomes the baseline.</p></li><li><p><strong>Trust is the new battleground.</strong> If AI is going to be inside important work, it has to be predictable. That&#8217;s why so much of 2025 shifted from hype to reliability, safety, and accountability.</p></li></ul><p>If you take only one thing from this recap, take this:</p><p><strong>2025 was the year the velocity of change became impossible to ignore.</strong></p><div><hr></div><h2><strong>What actually changed in 2025 (the big themes)</strong></h2><h3><strong>1) From chatbots to agents (and from prompts to systems)</strong></h3><p>The narrative wasn&#8217;t &#8220;prompt better.&#8221; It was:</p><ul><li><p>tool use</p></li><li><p>multi-step planning</p></li><li><p>memory/context management</p></li><li><p>governance, safety, and auditing</p></li><li><p>and integration into where work already happens (IDE, CLI, browser, docs)</p></li></ul><p>In other words: <strong>systems &gt; prompts.</strong></p><h3><strong>2) &#8220;Control knobs&#8221; became the difference between toys and tools</strong></h3><p>The winners weren&#8217;t the loudest model launches. 
The winners were the platforms giving builders real controls:</p><ul><li><p>reasoning depth knobs</p></li><li><p>token/latency tradeoffs</p></li><li><p>multimodal cost controls</p></li><li><p>tool orchestration</p></li><li><p>and the ability to inspect what happened (or at least reconstruct it)</p></li></ul><p>That&#8217;s what made the agent story feel more real this year.</p><h3><strong>3) The cost conversation got serious</strong></h3><p>2025 was also the year we stopped pretending cost didn&#8217;t matter.</p><p>If you&#8217;re building anything beyond a demo, you&#8217;re budgeting:</p><ul><li><p>tokens</p></li><li><p>time</p></li><li><p>evals</p></li><li><p>human review</p></li><li><p>and real operational support</p></li></ul><div><hr></div><h2><strong>OpenAI DevDay: personal take from being in the room</strong></h2><p>DevDay this year felt less like &#8220;look at this model&#8221; and more like &#8220;here&#8217;s the platform posture.&#8221;</p><p>What stuck with me: OpenAI is clearly building toward a world where <strong>ChatGPT isn&#8217;t just a product&#8212;it&#8217;s a surface</strong> where apps, agents, and workflows live.</p><p>That matters because distribution is half the game. 
You can have the best internal agent in the world, but if your users can&#8217;t access it easily&#8212;or don&#8217;t trust it&#8212;it won&#8217;t stick.</p><div><hr></div><h2><strong>OpenAI Codex: the sleeper hit of my year</strong></h2><p>I&#8217;ve got to call this out explicitly because it surprised me in the best way.</p><p><strong>Codex ended up being one of the most practically useful AI tools I used in 2025</strong>, especially for:</p><ul><li><p>testing help (unit tests, edge cases, scaffolding test harnesses)</p></li><li><p>quick &#8220;vibe coding&#8221; prototypes to explore ideas fast</p></li><li><p>refactors where I wanted momentum without losing structure</p></li><li><p>converting messy snippets into something repeatable</p></li></ul><p>The part I didn&#8217;t expect: how often it helped me move from <strong>idea &#8594; working shape</strong> without the usual friction of context switching and staring at blank files.</p><p>I still don&#8217;t treat it like an autopilot. 
But as a co-pilot for:</p><ul><li><p>getting a project started,</p></li><li><p>keeping momentum through the ugly middle,</p></li><li><p>and accelerating the &#8220;boring but necessary&#8221; work (tests especially)&#8230;</p></li></ul><p>&#8230;it was a genuinely pleasant surprise.</p><p>My rule of thumb by the end of the year:</p><p><strong>Use Codex to go faster, then use your brain to go correct.</strong></p><div><hr></div><h2><strong>Google&#8217;s 2025 developer launches: a year of &#8220;agent-first&#8221; everywhere</strong></h2><p>Google&#8217;s end-of-year recap nails the pattern: the big theme wasn&#8217;t &#8220;one AI thing.&#8221; It was <strong>AI woven through the developer experience</strong>.</p><p>Here&#8217;s how it reads in plain English:</p><h3><strong>Gemini 3 + API enhancements: deeper reasoning and better agent building blocks</strong></h3><p>The story here is &#8220;reasoning + agent tooling + cost control.&#8221; Thinking levels, thought signatures, media controls, hosted tools&#8212;you can feel the platform leaning hard into <em>builders assembling systems</em>, not just calling a model.</p><h3><strong>Antigravity + Nano Banana Pro: agentic dev + image workflows that feel product-grade</strong></h3><p>Agent-first development surfaces and image generation/editing tuned for practical design and UI work signal something important: Google isn&#8217;t just shipping models&#8212;they&#8217;re shipping <strong>workflows</strong>.</p><h3><strong>Universal AI assistant + Project Astra &#8594; Gemini Live</strong></h3><p>When you blend multimodal understanding (video, calls, live context) into Gemini Live, you get closer to the assistant people imagine in their head&#8212;not just what we&#8217;ve been calling an assistant.</p><h3><strong>Jules: asynchronous coding agent + CLI tooling</strong></h3><p>This is a real signal: &#8220;async agent&#8221; becomes a normal expectation. Not everything needs to be interactive.
Some work should be delegated, reviewed later, and merged when it&#8217;s correct.</p><h3><strong>Gemini in Firebase Studio and Android Studio Agent Mode</strong></h3><p>This is where it gets serious: agents inside the IDE become part of normal building. Ask &#8594; plan &#8594; execute &#8594; review is becoming the default loop.</p><h3><strong>Android XR: immersive + Gemini as the helpfulness layer</strong></h3><p>XR is still early, but the direction is clear: AI becomes the interaction layer that makes new form factors usable.</p><div><hr></div><h2><strong>Anthropic: strong year for coding + agent reliability</strong></h2><p>Anthropic&#8217;s positioning this year stayed consistent:</p><ul><li><p>stronger coding capability</p></li><li><p>agent workflows and &#8220;computer use&#8221;</p></li><li><p>and a heavy emphasis on responsible deployment and evaluation</p></li></ul><p>Whether you prefer Claude, GPT, or Gemini for a given workflow, it&#8217;s a good thing for all of us that at least one major player is relentlessly focused on &#8220;can this be used safely and predictably&#8221; instead of &#8220;can this win a benchmark.&#8221;</p><div><hr></div><h2><strong>NotebookLM: still a game changer (and I&#8217;ve been saying that for over a year)</strong></h2><p>I&#8217;ve been talking about <strong>NotebookLM</strong> as a game changer for me for over a year now, and I still feel that way.</p><p>It&#8217;s not flashy in the way a new model drop is flashy&#8212;but it&#8217;s the kind of tool that quietly changes how you work if you live in documents, research, and messy source material.</p><p>NotebookLM became my &#8220;second brain&#8221; for:</p><ul><li><p>synthesizing long sources</p></li><li><p>extracting themes across multiple docs</p></li><li><p>building structured notes I can actually reuse</p></li><li><p>turning research into drafts faster (without losing traceability)</p></li></ul><p>I don&#8217;t think enough people talk about
this category: <strong>knowledge workflow tools</strong>. Models are great. But the tools that help you <em>think with your sources</em> are what make AI usable day-to-day.</p><p>NotebookLM has stayed in my rotation because it consistently helps me do the thing I actually need:</p><p><strong>turn information into decisions and outputs.</strong></p><div><hr></div><h2><strong>The 2025 hype correction: my take</strong></h2><p>The &#8220;hype correction&#8221; narrative resonated because it matches what many of us experienced:</p><ul><li><p>LLMs are powerful, but not a universal solvent</p></li><li><p>AI doesn&#8217;t fix messy processes&#8212;it amplifies them</p></li><li><p>production systems require evals, guardrails, and operational thinking</p></li><li><p>and &#8220;agent&#8221; is not a magic word&#8230; it&#8217;s a responsibility</p></li></ul><p>2025 didn&#8217;t kill the AI story. It made the story practical.</p><div><hr></div><h2><strong>What I&#8217;m taking into 2026</strong></h2><p>This is the playbook I&#8217;m carrying forward:</p><ol><li><p>Pick workflows, not demos. Solve one repeatable pain with measurable impact.</p></li><li><p>Treat agents like junior operators. Give them tools, constraints, logging, and escalation paths.</p></li><li><p>Invest in evaluation early. If you can&#8217;t measure quality, you can&#8217;t improve quality.</p></li><li><p>Budget cost and latency from day one. Otherwise you&#8217;re building a surprise bill, not a product.</p></li><li><p>Stay tool-agnostic, outcome-obsessed. 
The best stack is the one that ships and gets adopted.</p></li></ol><div><hr></div><h2><strong>Closing</strong></h2><p>2025 was the year AI stopped being a novelty and became a discipline.</p><p>DevDay made it obvious where OpenAI is headed: platform + distribution + agents.</p><p>Google made it obvious where Google is headed: agents embedded everywhere devs live.</p><p>Anthropic made it clear there&#8217;s still room to compete on reliability and responsibility.</p><p>And NotebookLM reminded me (again) that the most valuable AI tools are often the ones that help you <strong>think</strong>, not just generate.</p><p>Next up: I&#8217;ll do a separate &#8220;2025 Cloud Year in Review&#8221; because cloud didn&#8217;t slow down either&#8230; it just got more intertwined with AI than ever.</p>]]></content:encoded></item></channel></rss>