When writing the code is the easy part

June 12, 2026

We gave an AI read access to our production systems. Here's why, and how we made sure it can't change a thing.

Engineering Manager
[Illustration: a stylized eye looking through a field of multicolored horizontal lines representing code, with the Envoy logo in the top left corner.]

“Why did system A not process this event from queue B?”

It’s Tuesday. I am tasked with investigating a monitor that indicates a potential issue. After verifying that I do indeed work at my company by completing MFA (several times), I find our log aggregator isn’t showing the specific logs that would explain why this transaction started but did not complete. Later, I realize I was searching the wrong log facet. My LaCroix is warm.

Making a feature work one time is coding. Making it work all the time is engineering. And as any engineer would, we set out to automate the manual work of ensuring quality. So, we took a leap.

We gave an AI read access to our production systems.

At a security-conscious company, that sentence should make you a little nervous. Here's why we did it, and how we made sure it can't change a thing.

Writing code is the part of the job that AI already does well, and we lean on it all day. But writing code was never the part that kept us up at night. The hard part comes after you ship: confirming the thing actually works, noticing when it quietly stops, and chasing down why. That's where the quality of a product gets decided, and almost none of it had been automated.

It matters more for us than it might somewhere else. Envoy started as the iPad you sign in on at the front desk. We still do that, but the company has grown into a security and compliance platform, and the data we hold reflects it: access logs, identity, audit trails, a record of who went where and when. When that's what you're storing, a feature that's only "mostly working" stops being cosmetic. A gap in it is a security gap. So "is this doing what it's supposed to, in production, right now" is a question we ask constantly, and there's more riding on the answer than there used to be.

The part that isn't writing code

Answering that question by hand is tedious. Checking whether a feature works end to end means visiting four or five tools, each with its own login and its own way of showing you half the picture. Debugging an incident is the same scavenger hunt, except now something is actually broken and the clock is running. You spend more time clicking between dashboards than thinking about the actual problem.

This is the sort of thing a model is genuinely good at. It can keep a dozen systems in its head at once, follow a single event through all of them, and point at where things went wrong. The catch was always access. You don't hand an AI the keys to production casually, and at a security company you don't do it at all until you have a very good answer to "what happens if it breaks something."

What we built

Our answer was a data harness. It's a read-only window into production that the AI can look through: you ask a question in plain English, it goes and finds the answer, and it tells you what it found. It can see everything and change nothing.

The difference is mostly speed. Checking whether a feature works in production used to take half an hour of poking around; now it's closer to ten seconds. Working out why something broke used to mean an hour of dragging in other teams, and these days it's usually a minute or two. A customer bug that would have eaten a day or two of cross-system detective work tends to get a verified fix the same morning. And the daily "is anything on fire" check stopped being a row of browser tabs and turned into one question we type once.

The change nobody wants to make

One of the use cases we didn't see coming is the scary stuff. Every team has a change that's been parked on the backlog for a year because nobody's quite sure what it will break, and the longer it sits, the worse that gets. Once an AI can read what's really in production, not just the code but the actual state, the actual shapes and volumes and the edge cases nobody wrote down, it can walk through where a change would go wrong before you run it on anything real. That doesn't so much make the work faster as make it possible. Some of those jobs were stuck for one reason only: we couldn't see well enough to feel safe starting. Now we can create checkpoints, do a dry run, and inspect every single iteration of the data to answer one question: should we trust this migration?
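
To give a feel for what a dry run can look like, here is a minimal sketch in Python. It is an illustration, not our actual tooling: `dry_run`, `DryRunReport`, and the `split_full_name` migration are hypothetical names, and the rows are assumed to have already been fetched through a read-only path. The idea is to apply the migration to copies of the data and report what would change or fail, without writing anything back.

```python
from dataclasses import dataclass, field


@dataclass
class DryRunReport:
    changed: list = field(default_factory=list)  # (before, after) pairs
    unchanged: int = 0
    failed: list = field(default_factory=list)   # (row, error text) pairs


def dry_run(rows, migrate):
    """Apply `migrate` to a copy of each row and record the outcome.

    Nothing is written anywhere: the caller's rows are only read.
    (The copies are shallow, so this sketch assumes flat rows.)
    """
    report = DryRunReport()
    for row in rows:
        before = dict(row)
        try:
            after = migrate(dict(row))
        except Exception as exc:
            # A row the migration cannot handle is exactly the
            # undocumented edge case we want to see before the real run.
            report.failed.append((before, repr(exc)))
            continue
        if after == before:
            report.unchanged += 1
        else:
            report.changed.append((before, after))
    return report


def split_full_name(row):
    """Hypothetical migration: split `full_name` into two columns."""
    first, last = row.pop("full_name").split(" ", 1)
    row["first_name"], row["last_name"] = first, last
    return row
```

Calling `dry_run(rows, split_full_name)` on a sample of rows returns a report whose `failed` list surfaces records such as single-word names, which is the kind of surprise that otherwise only shows up halfway through a real migration.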

How we keep it from touching anything

The hard part was never making this powerful. It was making it safe, and at our compliance bar, "pretty safe" doesn't count. So we didn't write a polite instruction asking the model not to change anything. We made changing things impossible. Read-only users, replica-only access, strict database authorization. We built a custom MCP that enforces guardrails and security measures as a deterministic pipeline. Access is read-only at every layer we could put a layer on, and the commands that would let it write or delete just aren't part of the tool, so there's nothing there for it to call. If one of those guardrails ever failed, the others would still hold.
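
To make the layering concrete, here is a minimal sketch in Python. It is not our production code, and every name in it (`TOOLS`, `call_tool`, `check_statement`, `GuardrailError`) is hypothetical. SQLite's `PRAGMA query_only` stands in for a read-only database user on a replica. The sketch shows three independent layers: a tool registry with no write tools in it, a deterministic statement check, and a connection that refuses writes on its own.

```python
import sqlite3


class GuardrailError(Exception):
    """Raised when a request is rejected before it reaches the database."""


def check_statement(sql: str) -> str:
    """Deterministic check: allow exactly one SELECT statement.

    Deliberately conservative: a semicolon anywhere in the body,
    even inside a string literal, is rejected.
    """
    statement = sql.strip().rstrip(";").strip()
    if not statement:
        raise GuardrailError("empty statement")
    if ";" in statement:
        raise GuardrailError("multiple statements are not allowed")
    if statement.split()[0].lower() != "select":
        raise GuardrailError("only SELECT statements are allowed")
    return statement


def make_read_only(conn: sqlite3.Connection) -> sqlite3.Connection:
    """Database layer: the connection itself refuses writes."""
    conn.execute("PRAGMA query_only = ON")
    return conn


def run_query(conn: sqlite3.Connection, sql: str) -> list:
    """The only tool exposed to the model in this sketch."""
    return conn.execute(check_statement(sql)).fetchall()


# Tool layer: there is no write or delete tool to call.
TOOLS = {"run_query": run_query}


def call_tool(name: str, conn: sqlite3.Connection, sql: str) -> list:
    if name not in TOOLS:
        raise GuardrailError(f"unknown tool: {name}")
    return TOOLS[name](conn, sql)
```

The layers do not depend on each other. If the statement check had a bug and let a `DELETE` through, the read-only connection would still raise an error, and a request for a tool that does not exist never reaches either of them.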

We also keep customer data away from the model. Anything sensitive is replaced with a stable stand-in before any AI ever sees it, consistently enough that the model can still tell two records belong to the same person without ever learning who that person is. No customer data ever leaves the MCP tool. Everything fails closed, too. If a safety step can't run for some reason, the request just dies rather than falling back to handing over the real data.
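
One common way to get stable stand-ins is a keyed hash, sketched below in Python. This is an assumption made for illustration, not a description of our implementation: the field list, the `person_` prefix, and the function names are all hypothetical. The same input and key always produce the same token, so records can still be correlated, and there is no code path that returns the raw value when redaction cannot run.

```python
import hashlib
import hmac

# Hypothetical list of fields treated as sensitive in this sketch.
SENSITIVE_FIELDS = {"name", "email"}


class RedactionError(Exception):
    """Raised when redaction cannot run; the request is rejected."""


def pseudonymize(value: str, key: bytes) -> str:
    """Map a sensitive value to a stable token using HMAC-SHA256."""
    if not key:
        # Fail closed: without a key we refuse rather than pass data through.
        raise RedactionError("no pseudonymization key available")
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"person_{digest[:12]}"


def redact_record(record: dict, key: bytes) -> dict:
    """Return a copy of the record with sensitive fields replaced."""
    return {
        field: pseudonymize(str(value), key) if field in SENSITIVE_FIELDS else value
        for field, value in record.items()
    }


def prepare_for_model(records: list, key: bytes) -> list:
    """Redact every record, or raise. Raw data is never a fallback."""
    if not key:
        raise RedactionError("no pseudonymization key; refusing to return data")
    return [redact_record(record, key) for record in records]
```

Two records with the same email come back with the same token, so a question like "did this person badge in twice?" is still answerable, while the token alone does not reveal the address to anyone who lacks the key.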

What actually changed

None of this made our AI write more of our code. It gave the AI a safe pair of eyes on the place the code actually lives. The check that used to cost an afternoon costs a sentence. The incident that used to mean five dashboards and a guess is now a question with an answer attached. And since you can run those same questions on a schedule, sometimes the answer comes back before a customer has noticed anything is wrong. That last one is the version we care about most.

For a company whose systems now help decide who gets through a door, that isn't a nice-to-have. It's how we keep the bar where it needs to be while everything around it gets more serious. Our engineers write features faster, more safely, and with more confidence than ever before. It’s night and day. For anyone building software today, we highly recommend investing in visibility engineering.

AUTHOR BIO

Dave Mun is an Engineering Manager at Envoy, where he leads the Physical Identity and Access Management platform team. With a background spanning real-time communications, workplace technology, and physical security systems, Dave has spent nearly seven years helping build the infrastructure that powers secure, connected workplaces. He specializes in distributed systems, access control, AI-enabled workflows, and scaling engineering teams to solve complex, real-world problems.