🔥 Fire It In There Friday - 12/12/25
Code Red Was Real, and the Reality Is Messier Than the Headlines
There's been no shortage of hype around OpenAI's Code Red and the rapid escalation between OpenAI and Google over the past year.
Now that GPT-5.2 and Gemini 3 are both public and in active use, we're finally past speculation.
But that doesnāt mean the picture is clean or settled.
Gemini 3: Past the Launch, Into Real Usage
With about a month of real-world exposure behind Gemini 3, the conversation shifts from launch speculation to practical impact.
And the impact is clear.
Gemini 3 isn't just a model upgrade; it's Google treating AI as an operating layer across the surfaces they already dominate:
Search
Workspace
Android
Chrome
Cloud tooling
Its biggest strength isn't one benchmark win.
It's that intelligence shows up by default, deeply grounded in real-time information and multimodal context.
This is distribution power.
Google doesn't need Gemini 3 to be the smartest model in isolation.
They need it to be everywhere people already work.
GPT-5.2: Strong on Paper, Rougher in Practice (So Far)
On paper, GPT-5.2 looks solid:
Strong science and math benchmarks
Deeper reasoning capabilities
Clear focus on enterprise-grade reliability
Continued investment in agents and tool-driven workflows
But here's the honest, first-hand take from someone who uses these tools every single day.
Even though GPT-5.2 was just released and I still have a lot more testing to do, my initial reaction is this:
GPT-5.2 feels a bit rough around the edges.
That surprised me.
My experience so far:
GPT-5 felt like hopscotch: powerful, but you had to navigate carefully
GPT-5.1 smoothed out many of those edges and felt more consistent
GPT-5.2, in some responses, feels closer to that original hopscotch again
Not broken.
Not bad.
Just less polished in certain interactions.
Some responses require more steering.
Some flows feel slightly less predictable.
That doesn't invalidate the benchmark gains, but it does reinforce something important:
Charts never tell the full story. Usage does.
Why This Isnāt Alarming (Yet)
This kind of roughness is common when platforms push deeper capability under pressure.
Code Red forced OpenAI to:
Expand reasoning depth
Push harder into complex domains like science and math
Support more advanced, multi-step use cases
Those moves often introduce short-term friction.
Meanwhile, Gemini 3 feels smoother in many day-to-day interactions, not necessarily because it's "smarter," but because it's embedded into workflows Google has refined for decades.
Polish comes from repetition at scale.
OpenAI is still tuning a general-purpose reasoning engine.
Google is refining a tightly controlled ecosystem.
What I'm Watching Next
I'm not calling winners.
I need more time with:
Long-running workflows
Agent orchestration
Multi-document reasoning
Daily professional use over weeks, not days
More testing is required, and I'll keep everyone posted as patterns emerge.
Early impressions are useful, but only if we treat them as signals, not conclusions.
The Bigger Picture Still Holds
Despite the rough edges, the broader takeaway hasn't changed:
GPT-5.2 reflects OpenAI doubling down on reasoning depth
Gemini 3 reflects Google doubling down on integration gravity
This race is no longer about raw intelligence alone
It's about where intelligence lives and how smoothly it shows up
That's a much harder problem than winning a benchmark chart.
Closing Thought
AI is officially past the demo phase.
Now we're in the fit-and-finish phase, where small rough edges matter more than big claims.
GPT-5.2 is promising, but not perfect.
Gemini 3 is polished, but opinionated.
And the most honest answer right now?
We're still learning how these systems behave in the real world.
Fire it in there.

