Agents became infrastructure: Frontier + Atlas, Prism, Codex Harness, Claude, Gemini 3
For a while, AI updates felt like magic tricks.
A new model. A new benchmark. A new demo.
This week felt different.
This week was about shipping the boring parts that make agents real:
Shared context
Runtime protocols
Approvals and guardrails
Distribution inside the tools people already use
In other words, the agent stack is hardening into infrastructure.
You can see it across the recent big releases from OpenAI, Anthropic, and Google:
OpenAI launched Frontier (Feb 5, 2026), a platform to build, deploy, and manage AI coworkers across the enterprise.
OpenAI’s Atlas (Oct 21, 2025) is already the “work happens here” browser surface that Frontier can plug into.
OpenAI opened up the Codex harness details, centered on a bidirectional App Server protocol.
OpenAI shipped Prism (Jan 27, 2026), a free AI-native workspace for scientific writing and collaboration powered by GPT-5.2.
Anthropic showed Claude planning a Mars rover route with the “propose → verify → approve” shape that enterprises desperately need.
Anthropic + ServiceNow is pushing Claude into enterprise workflows at massive scale.
Google’s January AI recap makes the strategy loud: personal intelligence plus Chrome auto browse powered by Gemini 3.
There’s a throughline here:
We are moving from models to agents to agent platforms.
And the moat is not “who is smartest.”
The moat is:
Context
Runtime
Distribution
Let’s break down what shipped, and why it matters for builders, IT leaders, and anyone selling into enterprise.
The big shift: intelligence is cheap, context is expensive
Most teams are not blocked by “can the model do it.”
They are blocked by:
Data scattered across systems
Permissions that do not map cleanly to “an agent”
Integrations that become one-off projects
No quality loop, so pilots never become dependable systems
Frontier names this directly: the thing slowing enterprises down is not model intelligence, it’s how agents are built and run inside real organizations.
So the battleground moved.
Not “better answers.”
More “reliable work in production.”
OpenAI Frontier: the control plane for AI coworkers
Frontier is OpenAI’s answer to the enterprise reality: multi-cloud, messy systems, governance everywhere, and agents that need to operate inside that mess without breaking things.
Frontier’s core idea is simple:
AI coworkers need the same fundamentals humans need at a company:
Onboarding and institutional knowledge
Access to the right systems
Clear boundaries and permissions
Learning via feedback so performance improves over time
1) The “semantic layer” framing is the tell
Frontier connects siloed warehouses, CRMs, ticketing tools, and internal apps to create shared business context, explicitly calling this a semantic layer for the enterprise that all AI coworkers can reference.
That is the game.
The agent that understands your internal language and where truth lives will beat the agent that doesn’t, even if the second agent has a slightly stronger model.
2) Open standards, no forced replatform
Frontier says it works with the systems you already have, across multiple clouds, using open standards, with no requirement to abandon existing agents or apps.
This is a direct shot at the “rip and replace” fear that kills AI adoption inside large enterprises.
3) Execution environment, not just chat
Frontier is built around agents completing complex tasks “like working with files, running code, and using tools” inside a “dependable” execution environment, and building memories as they operate.
This is not a prompt library.
This is an operating layer.
4) Evaluation and optimization built in
Frontier emphasizes built-in evaluation and optimization so good behaviors improve on real work over time.
That’s the difference between a clever demo and a system you can trust.
5) The human layer: Forward Deployed Engineers
Frontier also comes with OpenAI FDEs embedded with customer teams to help get agents into production and feed deployment learnings back into research.
That’s OpenAI saying: enterprise is not only software. It is execution.
Atlas: the distribution surface Frontier has been pointing at
In my earlier framing, I treated Atlas as a generic workflow mention.
That was wrong.
Atlas is a real product, and it matters here.
OpenAI introduced ChatGPT Atlas on October 21, 2025 as “the browser with ChatGPT built in.”
Here’s why Atlas is strategically important to the Frontier story:
1) Atlas is “ChatGPT comes with you”
Atlas lets ChatGPT work “anywhere across the web” inside the window you’re already using, without copy/paste or leaving the page.
That is distribution.
Not “come to my AI app.”
More “AI meets you where work already happens.”
2) Memory becomes ambient context
Atlas ships with ChatGPT memory built in, and adds “browser memories” that can remember context from sites you visit. Those browser memories are optional and user-controlled (view, archive, delete).
This is the missing bridge between:
web activity
and agent usefulness
3) Agent mode becomes native to browsing
Atlas includes agent mode designed to act while you browse, and OpenAI says it’s faster and more useful when working with browsing context. Agent mode in Atlas launched in preview for Plus, Pro, and Business users.
This is not theory. It is a product surface built for agentic work.
4) Controls and safety constraints are explicit
Atlas includes visibility toggles per site and restrictions like “cannot run code in the browser, download files, or install extensions,” and it pauses on sensitive sites.
Again, boring parts.
Also the parts enterprises demand.
5) Frontier explicitly plugs into Atlas
Frontier says AI coworkers can be accessed through any interface, including “workflows with Atlas.”
So think of it like this:
Frontier: enterprise control plane (context + permissions + eval + execution)
Atlas: high-distribution client surface where a lot of work actually happens
That pairing is not accidental.
Codex harness: protocols over prompts
The Codex harness post is one of the most important “builder” updates in this whole set.
Because it is OpenAI showing the architecture that makes agents portable across surfaces.
The key piece is the Codex App Server.
OpenAI describes the App Server as both:
the JSON-RPC protocol between client and server
and a long-lived process that hosts Codex core threads
And the design choices are exactly what enterprises need:
A single request can produce many event updates
One client request can result in many event updates, transformed into stable UI-ready notifications so you can build rich interfaces.
Fully bidirectional, approval-native
The protocol is fully bidirectional. The server can initiate requests when the agent needs input, “like an approval,” and pause until the client responds.
That pattern is the difference between:
“agent takes actions”
and “agent takes actions safely”
A portable transport layer
OpenAI says the transport is JSON-RPC over stdio (JSONL), making it straightforward to build bindings in many languages.
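To make that concrete, here is a minimal, self-contained sketch of the pattern in Python. The “server” is a fake inlined for the demo, and the method names (thread/run, approval/request) are invented, not the real Codex schema. What it shows is the three design choices at once: one request fanning out into many events, a server-initiated approval that pauses the run, and JSON-RPC as newline-delimited JSON over stdio.

```python
import json
import subprocess
import sys

# A fake "app server" inlined so the demo is self-contained. It emits
# two event notifications, asks for one approval, then returns a result.
# Method names (thread/run, approval/request) are invented, not Codex's.
FAKE_SERVER = r"""
import json, sys
def send(m): print(json.dumps(m), flush=True)
req = json.loads(sys.stdin.readline())            # the client's request
send({"jsonrpc": "2.0", "method": "event/progress", "params": {"step": 1}})
send({"jsonrpc": "2.0", "method": "event/progress", "params": {"step": 2}})
send({"jsonrpc": "2.0", "id": 99, "method": "approval/request",
      "params": {"action": "write config"}})      # server-initiated request
reply = json.loads(sys.stdin.readline())          # ...pause until answered
send({"jsonrpc": "2.0", "id": req["id"],
      "result": {"status": "done", "approval": reply["result"]}})
"""

proc = subprocess.Popen([sys.executable, "-c", FAKE_SERVER],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                        text=True)

def send(msg: dict) -> None:
    # JSONL transport: exactly one JSON object per line.
    proc.stdin.write(json.dumps(msg) + "\n")
    proc.stdin.flush()

# One client request...
send({"jsonrpc": "2.0", "id": 1, "method": "thread/run",
      "params": {"prompt": "refactor the billing module"}})

# ...many messages back.
for line in proc.stdout:
    msg = json.loads(line)
    if "id" not in msg:            # notification: stream of event updates
        print("event:", msg["params"])
    elif "method" in msg:          # server asks, client must answer to resume
        send({"jsonrpc": "2.0", "id": msg["id"], "result": "approve"})
    else:                          # final response to our original id=1
        print("done:", msg["result"])
        break
```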
If you build internal developer platforms, this should ping your radar.
The future agent ecosystem will not be “one UI.”
It will be:
IDE
terminal
browser
desktop app
internal portals
Protocols win.
Prism: the document becomes the workspace
Prism is framed as science, but the pattern is bigger.
OpenAI introduced Prism on January 27, 2026 as a free AI-native workspace for scientists to write and collaborate on research, powered by GPT-5.2. It supports unlimited projects and collaborators and is available to anyone with a ChatGPT personal account.
Prism is:
cloud-based
LaTeX-native
built for real-time collaboration without local installs
The important thing is not “LaTeX.”
It is this:
AI is moving from a side chat into the place where the work actually lives.
Expect the same pattern to eat enterprise docs:
architecture docs
incident reviews
security exception narratives
audits
proposals
runbooks
Once the workspace is AI-native, “asking” becomes “editing in place.”
That is a workflow shift, not a feature.
Anthropic: the enterprise-safe shape is propose → verify → approve
Claude on Mars is a process demo, not a space demo
Anthropic describes Claude using vision to plan a Mars rover “breadcrumb trail.” The waypoints were then run through a simulation with “over 500,000 variables,” engineers reviewed the results, only minor changes were needed, and the route held up. Engineers estimate this approach can cut route planning time in half.
Steal the shape:
AI proposes
systems validate
humans approve
execution happens
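Here is a tiny, runnable sketch of that loop in Python. The planner, validator, and approval step are trivial placeholders (in real life: a model call, a simulator or policy engine, and a review queue), but the gating order is the point.

```python
from dataclasses import dataclass

# Placeholder propose -> verify -> approve -> execute loop.
# Swap each stage for your real planner, simulator, and review queue.

@dataclass
class Proposal:
    actions: list[str]

def propose(task: str) -> Proposal:
    # 1) AI proposes: stands in for a model call returning a plan.
    return Proposal(actions=[f"draft change for: {task}"])

def verify(p: Proposal) -> list[str]:
    # 2) Systems validate: simulations, dry runs, policy checks.
    #    Returns failures; an empty list means the plan held up.
    return [] if p.actions else ["empty plan"]

def approve(p: Proposal) -> bool:
    # 3) Humans approve: route to a review queue; here, a prompt.
    return input(f"approve {p.actions}? [y/N] ").strip().lower() == "y"

def run(task: str) -> str:
    p = propose(task)
    if failures := verify(p):
        return f"rejected by validation: {failures}"
    if not approve(p):
        return "rejected by reviewer"
    # 4) Execution happens only after both gates pass.
    return f"executed: {p.actions}"

print(run("reroute traffic during the incident"))
```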
That is exactly how we should deploy agents into:
infrastructure changes
incident response
compliance workflows
ticket automation
ServiceNow + Claude is distribution inside enterprise muscle memory
ServiceNow is targeting a 50% reduction in time-to-implement for customers using Claude, and early testing showed up to a 95% reduction in seller prep time via a Claude-powered coaching tool.
They also state Claude is the default model for Build Agent and a preferred model across the ServiceNow AI Platform.
If you run IT, ServiceNow is not just where tickets live.
It is becoming where agents live.
Google: the browser becomes an agent
Google’s January recap highlights:
Gemini app connecting to Google apps for personalized help (opt-in, beta)
Chrome features built on Gemini 3 including “auto browse” for multi-step chores like booking travel or scheduling appointments
Gemini 3 as the default model for AI Overviews globally
This matters because it signals where the next “default agent surface” sits for most humans:
The browser.
And now we have two major “browser is agentic” moves at once:
OpenAI Atlas as a ChatGPT-native browser
Google Chrome adding Gemini 3 auto-browse behavior
Different approaches.
Same destination.
What to do this week if you lead IT, security, or a platform team
Here’s the practical playbook I’d run right now.
1) Pick two workflows with obvious ROI
Do not start with “company-wide AI.”
Start with two that pay back fast:
Incident triage and root cause acceleration
Change impact analysis with approvals
Frontier’s own example describes collapsing root-cause identification from hours to minutes by pulling together logs, docs, workflows, and code.
2) Build a context map, not an agent
List your truth sources:
CMDB
runbooks
incident history
logs, metrics, traces
change calendars
identity and permissions
repos and pipelines
Then decide what the agent can read, what it can write, and where it must ask.
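One way to make those decisions durable: a context map as code, a small reviewable artifact where every source has an explicit answer. The source names and policy below are hypothetical; adapt them to your estate.

```python
# A context map as data: each truth source gets an explicit decision,
# so agent access is a reviewable artifact, not tribal knowledge.
# Source names are hypothetical; edit to match your estate.

READ_ONLY  = {"cmdb", "incident_history", "logs_metrics_traces"}
ASK_FIRST  = {"runbooks", "change_calendar", "repos_and_pipelines"}
OFF_LIMITS = {"identity_and_permissions"}   # never exposed to the agent

def access(source: str, action: str) -> str:
    """Return 'allow', 'ask', or 'deny' for a proposed agent action."""
    if source in OFF_LIMITS:
        return "deny"
    if action == "read" and source in READ_ONLY | ASK_FIRST:
        return "allow"
    if action == "write" and source in ASK_FIRST:
        return "ask"                        # writes here need approval
    return "deny"                           # unknown source or action

assert access("cmdb", "read") == "allow"
assert access("cmdb", "write") == "deny"
assert access("runbooks", "write") == "ask"
```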
3) Standardize the “approval handshake”
Codex App Server bakes this pattern in: the agent requests approval and pauses until the client responds.
Make that your enterprise standard:
read-only by default
propose diffs
require approvals for write actions
log every decision
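Here is a minimal sketch of that standard as a single enforcement point in Python. The gate function, action fields, and log format are illustrative assumptions, not any vendor's API. The invariant: reads pass, writes ship a diff and pause for a human, and every decision lands in an audit log.

```python
import json
import time

# One enforcement point: reads pass, writes pause for a human,
# and every decision is logged. Names and fields are illustrative.

AUDIT_LOG = "agent_decisions.jsonl"

def record(entry: dict) -> None:
    # Append-only log: who asked for what, and what was decided.
    entry["ts"] = time.time()
    with open(AUDIT_LOG, "a") as f:
        f.write(json.dumps(entry) + "\n")

def gate(action: dict, approver) -> bool:
    """action: {"kind": "read"|"write", "target": ..., "diff": ...}"""
    if action["kind"] == "read":
        record({"action": action, "decision": "auto-allow"})
        return True
    # Write actions ship a proposed diff and pause for a human,
    # mirroring the App Server's request-and-wait approval shape.
    ok = approver(action["target"], action.get("diff", ""))
    record({"action": action, "decision": "approved" if ok else "denied"})
    return ok

# Usage: wire the approver to your review UI; here, a terminal prompt.
gate({"kind": "write", "target": "prod config", "diff": "+retries: 3"},
     approver=lambda target, diff: input(f"{target}\n{diff}\nok? [y/N] ") == "y")
```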
4) Choose your distribution surfaces early
Where should the agent live?
ITSM (ServiceNow)
browser (Atlas, Chrome)
IDE and CLI (Codex harness)
doc workspaces (Prism-style)
If you force a new UI, you lose.
5) Measure quality on real work
If you cannot measure outcomes, you cannot scale trust.
Frontier’s evaluation and optimization emphasis is the right direction.
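A sketch of what “measure on real work” can mean in practice: replay already-resolved cases the agent has not seen, score its proposals against what actually fixed them, and track that number over time. The case fields and crude scoring rule below are illustrative assumptions, not any vendor's eval framework.

```python
# Score agent proposals against real, already-resolved cases.
# Case fields and the overlap rule are illustrative assumptions.

CASES = [
    {"ticket": "db latency spike", "resolved_by": "rollback deploy 1842"},
    {"ticket": "expired tls cert", "resolved_by": "rotate cert and reload lb"},
]

def agent_propose(ticket: str) -> str:
    # Placeholder for the agent under test.
    return "rollback deploy 1842" if "latency" in ticket else "restart service"

def score(proposal: str, truth: str) -> float:
    # Crude token overlap; swap in a rubric or judge model in practice.
    p, t = set(proposal.split()), set(truth.split())
    return len(p & t) / len(t)

results = [score(agent_propose(c["ticket"]), c["resolved_by"]) for c in CASES]
print(f"mean score on real work: {sum(results) / len(results):.2f}")
```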
My take
This is the week agents stopped being a feature and started becoming infrastructure.
Frontier is the enterprise control plane.
Atlas is a high-distribution workflow surface where context and action can meet.
Codex App Server is the runtime protocol blueprint: event streams, portability, approvals.
Prism shows the doc-workflow future: AI inside the workspace, not beside it.
Claude on Mars proves the safe enterprise shape: propose, verify, approve.
ServiceNow + Claude and Chrome + Gemini 3 show the distribution war is already underway.
The question is not whether AI will change how work gets done.
The question is whether you are building:
shared context
safe runtimes
high-trust approvals
and distribution inside real workflows
Because that is where the advantage will live.

