Tim Solley, CTO, Cloud and Platform Engineering

AI Coding Tools Require Governance

AI coding tools are not a fad. They are a genuine step change in how software gets built, and every engineering organization is feeling the pressure to adopt them. That pressure is real and the opportunity is real. But the way a company adopts these tools matters enormously, and right now the industry has a prominent case study in what happens when adoption outpaces governance.

In November 2025, Amazon issued an internal directive signed by two senior vice presidents: 80% of engineers must use the company’s in-house AI coding tool, Kiro, on a weekly basis by end of year. The ambition behind that mandate was understandable. Amazon wanted to capture the productivity gains that AI-assisted development clearly offers. The urgency was real. The competitive pressure was real.

The safeguards were not.

What Happened

Over the four months that followed, Amazon experienced a series of escalating production incidents. Internal documents later described them as having a “high blast radius,” and flagged “Gen-AI assisted changes” as a contributing factor.

December 2025 — AWS Cost Explorer Outage

An AI agent operating with operator-level permissions decided to "delete and recreate" a live production environment. The result was a 13-hour outage of AWS Cost Explorer in mainland China. The AI did exactly what it was capable of doing. Nobody had constrained what it was allowed to do.

March 2, 2026 — Delivery Estimate Errors

Amazon Q contributed to incorrect delivery times across Amazon's marketplaces. The cascade of errors produced 120,000 lost orders and 1.6 million website errors. The change looked fine in development. It was not fine in production.

March 5, 2026 — Amazon.com Goes Down

A faulty software deployment following AI-assisted changes caused a six-hour outage of Amazon.com. Checkout, pricing, and account functionality were all affected. Orders dropped 99% across North American marketplaces. Final count: 6.3 million lost orders.

The Pattern Was Named Internally

Amazon's own internal analysis identified this as a pattern, not a coincidence. "Four major incidents in three months" was how it was characterized. An emergency engineering meeting was convened on March 10, 2026, led by SVP Dave Treadwell, to address what had become a systemic problem.

Amazon's public response was carefully managed. A spokesperson said the company had "not seen compelling evidence that incidents are more common with AI tools" and attributed the failures to "user error" and misconfigured access controls. But internally, those incident reports told a different story, one that explicitly connected AI-assisted changes to the disruptions. A commenter familiar with the situation put it plainly: Amazon "would rather have the world believe their engineers are incompetent than admit their artificial intelligence made a mistake."

Ship the capability. Mandate adoption. Discover the failure mode in production. Add the guardrail.

That sequence is not a criticism of Amazon specifically. It describes a pattern that plays out across the industry whenever speed of adoption outpaces the development of governance.

The Anatomy of What Went Wrong

The Amazon incidents were not the result of bad engineering culture or uniquely incompetent engineers. Amazon employs some of the most skilled engineers in the world. What happened was a predictable set of organizational failures that emerge when AI tools are rolled out without a corresponding governance framework.

Unbounded Agentic Permissions

AI agents were given operator-level access to production systems. This is the equivalent of handing a new contractor the keys to the server room on their first day. Agentic tools need carefully scoped permissions, with human approval gates before any action that touches production.

Adoption Metrics Over Safety Metrics

The mandate was measured by how many engineers used the tool, not by whether the tool was producing better outcomes. When you optimize for adoption numbers, you get adoption and nothing more. When you optimize for quality, you get entirely different incentives.

No Mandatory Human Review Gates

AI-assisted changes were flowing into production without the kind of human review processes that should be standard. When something goes wrong with AI-generated code, the blast radius can be large because the code often touches systems the engineer did not directly reason through.

Tool Lock-In Prevented Better Choices

Roughly 1,500 Amazon engineers signed an internal petition requesting access to external tools like Claude Code, arguing they outperformed Kiro for multi-language refactoring. VP-level approval was required to use alternatives. Mandating a single tool and blocking alternatives is a monoculture risk, not just an ergonomics inconvenience.

Junior Engineers Without Appropriate Oversight

A junior engineer with an AI coding tool can produce a lot of code very quickly. That is not inherently dangerous. It becomes dangerous when that code enters production without oversight calibrated to the engineer's experience level. Speed without judgment is how you get large mistakes, fast.

Reactive Safeguards Instead of Proactive Governance

The safeguards Amazon implemented after the March 2026 incidents (two-person code reviews, senior sign-offs, director-level audits, a 90-day safety reset across 335 critical systems) are reasonable and sensible practices. The problem is that they were assembled under duress, after millions of dollars in lost orders, rather than designed into the adoption process from the beginning. Governance built reactively costs far more than governance built proactively.

The Reset That Should Have Been the Starting Point

The emergency engineering meeting on March 10, 2026 produced a 90-day safety reset across 335 critical systems.

The new requirements included: two-person code reviews for AI-assisted changes, senior engineer sign-off before junior and mid-level engineers could push AI-generated code to production, director-level audits for production deployments, stricter automated checks, and a formal documentation and approval process for any AI-assisted changes to critical paths.

These are not radical ideas. They are basic software engineering discipline applied to a new class of tooling. The irony is not lost on anyone paying attention. Amazon rushed the adoption and then had to slow everything down to install the practices that responsible adoption would have included from day one.

The question was never whether to adopt AI coding tools. It was always how.

There is an important distinction between the capability and the discipline required to use it well. Amazon had the capability. What the mandate lacked was a parallel investment in the organizational structures, review processes, and permission models that let teams use that capability without introducing catastrophic risk. The lesson here is not "be careful with AI tools." The lesson is "govern AI tools from the start, before you have 6.3 million reasons to wish you had."

This Is Not an Amazon Problem

It would be comfortable to read this story as a tale about one very large company moving too fast. It is not. VergeOps is seeing the same dynamics play out across industries, at companies of every size.

The pattern is consistent. Leadership hears about productivity gains from AI coding tools, or feels competitive pressure to keep up, or both. A mandate goes out, or strong encouragement that functions as a mandate. Engineers start using the tools. Nobody defines what responsible use looks like. Nobody creates review processes calibrated to the new risk profile. Nobody constrains what agentic tools are allowed to do. And then, eventually, something goes wrong and the organization responds reactively with governance structures it should have built proactively.

The organizations that are doing this well share a common characteristic: they treated AI tool adoption as an architectural decision, not just a procurement decision. They asked not only “what tools will we use” but “how will we use them safely, and how do we know when we’re using them well.”

What Responsible Adoption Looks Like

There is no single right framework for every organization. But the practices below represent the foundation of responsible AI coding tool governance. Every team that has adopted these tools at scale without major incidents has some version of all of them in place.

Define Policy Before You Mandate Usage

Before any adoption requirement goes out, document what acceptable AI-assisted development looks like at your organization. What can AI tools do autonomously? What requires human review? What is never delegated to an AI agent? This policy should exist in writing before the first engineer opens the tool.
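One way to make such a policy concrete is to express it as data that tooling can enforce, not just a document engineers are expected to remember. The sketch below is purely illustrative: the category names and task strings are hypothetical, not drawn from any real policy or tool.

```python
# Hypothetical written policy expressed as machine-checkable data.
# The categories and task names here are illustrative examples only.
AI_USAGE_POLICY = {
    "autonomous": ["draft code", "write tests", "suggest refactors"],
    "human_review_required": ["merge to main", "schema migrations"],
    "never_delegated": ["production deployments", "credential management", "data deletion"],
}

def allowed_autonomously(task: str) -> bool:
    """A task may run without review only if the policy explicitly permits it."""
    if task in AI_USAGE_POLICY["never_delegated"]:
        return False
    return task in AI_USAGE_POLICY["autonomous"]
```

Note the default: a task not listed anywhere is not allowed autonomously. Writing the policy down forces that default to be a deliberate choice rather than an accident.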

Establish Review Gates by Experience Level

AI-assisted code from a junior engineer carries different risk than the same output from a principal engineer with deep knowledge of your system. Review requirements should reflect that. This is not about distrust; it is about calibrating oversight to the combination of tool capability and engineer judgment.
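A tiered review policy can be sketched in a few lines. The tiers, reviewer counts, and function names below are assumptions for illustration, not a prescription; the point is that the gates vary with both the engineer's level and whether the change is AI-assisted and production-bound.

```python
# Hypothetical review policy: tier names and reviewer counts are illustrative.
REVIEW_POLICY = {
    "junior":    {"reviewers": 2, "senior_signoff": True},
    "mid":       {"reviewers": 2, "senior_signoff": True},
    "senior":    {"reviewers": 1, "senior_signoff": False},
    "principal": {"reviewers": 1, "senior_signoff": False},
}

def required_gates(level: str, ai_assisted: bool, touches_production: bool) -> dict:
    """Return review gates for a change; unknown levels get the strictest tier."""
    policy = dict(REVIEW_POLICY.get(level, REVIEW_POLICY["junior"]))
    if not (ai_assisted and touches_production):
        # Baseline review rules apply; the extra sign-off is for AI + production.
        policy["senior_signoff"] = False
    return policy
```

Defaulting unknown levels to the strictest tier is the same conservative posture as the policy sketch above: oversight relaxes only when you have positive evidence it should.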

Constrain Agentic Permissions Explicitly

AI agents should operate with the minimum permissions necessary to complete the task. Production access requires human approval. Agents that can delete, recreate, or modify live environments should have that capability gated behind explicit human confirmation at every step. This is not optional; it is the baseline.

Measure Quality Outcomes, Not Just Adoption

Usage rate is not a success metric. Defect rates, incident frequency, time to detect and recover, and code review findings are the signals that tell you whether AI-assisted development is working well or creating hidden risk. If you are only measuring adoption, you will hit your number and miss the point entirely.
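The shift is small in code but large in practice: compute quality signals alongside the adoption number so they share a dashboard. The record shape and function name below are assumed for illustration.

```python
# Hypothetical dashboard sketch: each change record carries an
# 'ai_assisted' flag and a 'caused_incident' flag (illustrative schema).
def quality_dashboard(changes: list[dict]) -> dict:
    """Report adoption rate next to incident rates for AI vs. manual changes."""
    ai = [c for c in changes if c["ai_assisted"]]
    manual = [c for c in changes if not c["ai_assisted"]]

    def incident_rate(group: list[dict]) -> float:
        return sum(c["caused_incident"] for c in group) / len(group) if group else 0.0

    return {
        "adoption_rate": len(ai) / len(changes) if changes else 0.0,
        "ai_incident_rate": incident_rate(ai),
        "manual_incident_rate": incident_rate(manual),
    }
```

A gap between the two incident rates is exactly the "compelling evidence" an adoption-only metric can never surface.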

Build a Feedback Culture, Not a Mandate Culture

Engineers using these tools every day see things that leadership does not. If an AI tool is producing unreliable output for a particular class of tasks, your engineers know. Create channels for that information to reach decision-makers, and take it seriously. A mandate that suppresses honest feedback from the engineers closest to the work is a governance failure waiting to happen.

Invest in AI Literacy, Not Just AI Tooling

The productivity gains from AI coding tools are real, but they scale with the engineer's ability to use the tool well. Prompt quality, context management, and knowing when to trust the output versus when to verify it independently are skills that require development. Providing the tool without the training is like buying everyone a table saw and skipping the safety orientation. The results will eventually be predictable.

How VergeOps Can Help


VergeOps is working with engineering organizations right now on exactly these challenges. We see the full landscape: teams that have adopted AI coding tools successfully and are compounding their gains, and teams that are discovering the hard way that fast adoption without governance creates new categories of risk. The difference between those two groups is almost never the tools they chose. It is whether they built the practices to use those tools well.

Our engagements in this space typically span three areas. First, an AI readiness assessment that evaluates your current tooling, review processes, permission models, and team literacy to identify where the gaps are before they become incidents. Second, governance framework development that produces a concrete policy your engineering organization can actually operate within. Not a theoretical framework, but a practical one calibrated to your systems, your team structure, and your risk tolerance. Third, ongoing advisory as your AI tooling strategy evolves, because this landscape changes quickly and having a senior architecture perspective in your corner matters as new capabilities and new risks continue to emerge.

The companies that navigate this transition well will build lasting competitive advantages. The ones that mandate adoption without governance will spend the next year cleaning up after their own incidents. You have the option Amazon did not take. Reach out to talk through what responsible AI adoption looks like for your organization before you need a 90-day reset to figure it out.