I’ve shipped code I didn’t fully understand. Generated it, tested it, deployed it. Couldn’t explain how it worked.

You probably have too.

Tasks that once took days now take hours. But production systems fail in unexpected ways, and when they do, you’d better understand what you’re debugging.

We’re generating code faster than we can comprehend it.


This Isn’t New

Every generation of engineers has hit a wall where complexity exceeded their ability to manage it. The 60s had their “software crisis.” Dijkstra complained that gigantic computers created gigantic problems.

The cycle repeats. C in the 70s. Personal computers in the 80s. OOP in the 90s (thanks Java for those inheritance hierarchies from hell). Agile in the 2000s. Cloud in the 2010s.

Now we have AI. Copilot, Cursor, Claude. We generate code as fast as we can describe it.

Same pattern, different scale. The ceiling on complexity is now effectively infinite.


Easy ≠ Simple

Tools don’t make us 10x faster. The hard part was never writing code. It’s knowing what to write.

We keep optimising for mechanics anyway.

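We use “simple” and “easy” interchangeably. They’re not the same. Simple describes structure: one concern, not tangled up with others. Easy describes proximity: whatever is closest to hand.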

Every time we choose easy, we’re borrowing against future understanding.

That trade-off used to work. The debt accumulated slowly enough to refactor when needed. AI broke that balance. It’s the ultimate easy button, so frictionless we don’t even consider the simple path anymore.

Why think about architecture when code appears instantly?


How It Goes Wrong

I faced this building RetireCal in React 19. Hadn’t touched React in six years. Leaned heavily on Claude.

Started simple. “Set up the calculation engine.” Clean file. Then RRSP logic. Another file. TFSA calculations. CPP projections.

By turn 20, I wasn’t having a conversation. I was managing context so complex I couldn’t remember my own constraints.

Dead code from abandoned approaches. Tests that passed because I tweaked assertions to match whatever the code was doing. Fragments of three different strategies because I kept saying “wait, actually…”
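Here is a minimal sketch of that assertion-tweaking failure mode, written in Go for brevity rather than in the project's actual React code. The function `futureValue`, the rate, and the amounts are hypothetical and not taken from RetireCal. The first check pins whatever the code currently returns; the second derives the expected value independently of the implementation.

```go
package main

import (
	"fmt"
	"math"
)

// futureValue is a hypothetical calculator function with a deliberate bug:
// it applies simple interest where annual compounding was intended.
func futureValue(principal, rate float64, years int) float64 {
	return principal * (1 + rate*float64(years))
}

// expectedByHand compounds year by year and shares no code with futureValue.
func expectedByHand(principal, rate float64, years int) float64 {
	balance := principal
	for i := 0; i < years; i++ {
		balance += balance * rate
	}
	return balance
}

// closeEnough compares two amounts to within one cent.
func closeEnough(a, b float64) bool {
	return math.Abs(a-b) < 0.01
}

func main() {
	got := futureValue(1000, 0.05, 2)

	// Tweaked assertion: the expected value was copied from the code's own output.
	pinned := 1100.0
	fmt.Println("pinned assertion passes:", closeEnough(got, pinned))

	// Independent assertion: the expected value comes from a separate derivation.
	want := expectedByHand(1000, 0.05, 2)
	fmt.Println("independent assertion passes:", closeEnough(got, want))
}
```

The pinned check passes and the independent one fails, which is the point: a test only protects you when its expected value comes from somewhere other than the code under test.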

Each “make this work” morphed the code to satisfy my latest request. No resistance to bad decisions, just whatever I asked for.

Complexity compounded until I couldn’t reason about my own calculator.


AI Can’t See the Seams

When an agent analyses your codebase, every line becomes a pattern to preserve. That incident-specific patch? Pattern. That Stack Overflow copy-paste from 2019? Also a pattern.

Technical debt doesn’t register as debt. It’s just more code.

AI makes no distinction. Every pattern gets preserved.

There are two kinds of complexity:

Essential: the actual problem. A checkout flow needs to charge the right amount. An auth system needs to verify identity.

Accidental: everything we added along the way. The retry logic from that network outage. The feature flags from a migration we finished two years ago. The caching layer with twelve invalidation rules.
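To make the distinction concrete, here is a small Go sketch. Everything in it is hypothetical: the flag `useNewPricing`, both function names, and the pricing rule are assumptions for illustration only. The legacy version carries a finished migration's flag and a fallback path; the second version keeps only the essential rule.

```go
package main

import "fmt"

// useNewPricing is a hypothetical flag left over from a finished migration.
// It is always true in this sketch, so the fallback path below never runs.
var useNewPricing = true

// totalCentsLegacy is how the function might look after years of patches.
func totalCentsLegacy(priceCents, qty int) int {
	if useNewPricing {
		return priceCents * qty
	}
	// Accidental: old path kept "just in case".
	total := 0
	for i := 0; i < qty; i++ {
		total += priceCents
	}
	return total
}

// totalCents keeps only the essential rule: charge price times quantity.
func totalCents(priceCents, qty int) int {
	return priceCents * qty
}

func main() {
	fmt.Println(totalCentsLegacy(1999, 3) == totalCents(1999, 3))
}
```

Both functions return the same totals. A tool that treats every line as a pattern has no way to know the flag and the loop are leftovers; a person who knows the migration finished does.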

I faced this on a data pipeline at a previous job. Go service that had grown over three years. Legacy logic everywhere: validation rules buried in transformation code, schema assumptions hardcoded across consumers, reconciliation spanning four services that each had their own idea of “processed.”

AI couldn’t untangle it. It would start refactoring, hit a dependency, spiral. Or worse, preserve old logic while implementing new patterns. Frankenstein code that technically worked but nobody could maintain.


The Fix: Research, Plan, Implement

We can tell essential from accidental complexity, but only when we slow down.

Three phases:

Research

Feed everything upfront. Architecture diagrams, docs, Slack threads. Use the agent to map components and dependencies.

Probe it. What about caching? How does this handle failures? Correct wrong analysis. Provide missing context.

Output: a single research document. What exists. What connects. What your change affects.
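The "what your change affects" part of that document can be sketched as a walk over a dependency map. This is a minimal Go illustration under assumed names: the components in `dependsOn` and the helper `affectedBy` are hypothetical, and a real research document would hold far more than a graph.

```go
package main

import (
	"fmt"
	"sort"
)

// dependsOn maps each component to the components it consumes.
// All names are hypothetical.
var dependsOn = map[string][]string{
	"ingest":    {},
	"validate":  {"ingest"},
	"transform": {"validate"},
	"reconcile": {"transform"},
	"report":    {"reconcile", "transform"},
}

// affectedBy returns every component that depends, directly or
// transitively, on the changed one, sorted by name.
func affectedBy(changed string, deps map[string][]string) []string {
	// Invert the edges: for each component, who consumes it?
	consumers := make(map[string][]string)
	for component, upstream := range deps {
		for _, u := range upstream {
			consumers[u] = append(consumers[u], component)
		}
	}

	// Breadth-first walk over the consumers of the changed component.
	seen := make(map[string]bool)
	queue := []string{changed}
	for len(queue) > 0 {
		current := queue[0]
		queue = queue[1:]
		for _, c := range consumers[current] {
			if !seen[c] {
				seen[c] = true
				queue = append(queue, c)
			}
		}
	}

	affected := make([]string, 0, len(seen))
	for name := range seen {
		affected = append(affected, name)
	}
	sort.Strings(affected)
	return affected
}

func main() {
	fmt.Println(affectedBy("validate", dependsOn))
}
```

With this map, changing `validate` reaches `reconcile`, `report`, and `transform`. The graph is only as good as the edges in it, which is why the human checkpoint matters.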

Human checkpoint here is critical. Validate against reality. Catch errors now, prevent disasters later.

Plan

Create a detailed implementation plan. Code structure, function signatures, type definitions, data flow.

Make it paint-by-numbers precise. Hand it to your most junior engineer. If they copy line by line, it should work.

This is where you make architectural decisions. Service boundaries. Clean separation. Preventing coupling.

After that pipeline refactor, I applied the same approach to a new Go service. Full interface definitions, error handling strategy, data flow: all documented before generating code. The implementation practically wrote itself.
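As a rough idea of what such a plan can pin down before any generation happens, here is a minimal Go sketch. It is not the actual service: `Record`, `Validator`, `Transformer`, `Process`, and `ErrInvalid` are hypothetical names chosen for illustration. It fixes the interfaces, one error-handling rule (wrap with the record ID, keep the sentinel), and the order of the data flow.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Record is the unit of data moving through the pipeline (hypothetical).
type Record struct {
	ID      string
	Payload string
}

// ErrInvalid is the single sentinel error validation may return.
var ErrInvalid = errors.New("invalid record")

// Validator and Transformer are the boundaries fixed by the plan.
type Validator interface {
	Validate(r Record) error
}

type Transformer interface {
	Transform(r Record) (Record, error)
}

// Process fixes the data flow: validate first, then transform.
// Every error is wrapped with the record ID so callers can still match on it.
func Process(v Validator, t Transformer, r Record) (Record, error) {
	if err := v.Validate(r); err != nil {
		return Record{}, fmt.Errorf("record %q: %w", r.ID, err)
	}
	out, err := t.Transform(r)
	if err != nil {
		return Record{}, fmt.Errorf("record %q: %w", r.ID, err)
	}
	return out, nil
}

// nonEmpty rejects records with an empty payload.
type nonEmpty struct{}

func (nonEmpty) Validate(r Record) error {
	if r.Payload == "" {
		return ErrInvalid
	}
	return nil
}

// upper upper-cases the payload.
type upper struct{}

func (upper) Transform(r Record) (Record, error) {
	r.Payload = strings.ToUpper(r.Payload)
	return r, nil
}

// Compile-time checks that the implementations conform to the plan.
var (
	_ Validator   = nonEmpty{}
	_ Transformer = upper{}
)

func main() {
	out, err := Process(nonEmpty{}, upper{}, Record{ID: "a", Payload: "hello"})
	fmt.Println(out.Payload, err)

	_, err = Process(nonEmpty{}, upper{}, Record{ID: "b"})
	fmt.Println(err, errors.Is(err, ErrInvalid))
}
```

The compile-time assertions at the bottom are a cheap conformance check: if generated code drifts from the planned interfaces, the build fails before any review starts.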

Implement

Now this phase is simple. That’s the point.

Clear spec = clean context. No 50-message evolutionary spirals. Three focused outputs, each validated before proceeding.

The payoff: background agents can do the work. You’ve done the thinking. Start implementation, work on something else, come back to review. Review is fast because you’re verifying conformance, not deciphering inventions.


Sometimes You Do It By Hand First

Sometimes you can’t even start research until you’ve done part of it manually.

When we refactored that pipeline, we had to migrate one data source by hand. No AI. Just reading Go code, tracing message flows, making changes to see what broke.

Painful. Took a week for what should have been a day.

But it revealed things we couldn’t have known otherwise. Which downstream jobs assumed specific ordering. Which services cached intermediate state. Which retry logic was actually load-bearing versus leftover from a 3am incident response.

Then we fed that PR into research as the seed. The AI could finally see what clean migration looked like.

The three-phase approach isn’t magic. It worked because we earned the understanding first.


The Bottom Line

AI changes how we write code. It doesn’t change why software fails.

Every generation has faced its own software crisis. The one in the 60s created the discipline of software engineering. Ours is infinite code generation.

The solution isn’t another tool. It’s remembering what we’ve always known: software is a human endeavour. The hard part was never typing code. It’s knowing what to type.

The developers who thrive won’t generate the most code. They’ll understand what they’re building. They’ll see the seams. They’ll recognise when they’re solving the wrong problem.

The question isn’t whether we’ll use AI. That ship has sailed.

The question is whether we’ll still understand our own systems when AI writes most of our code.


Here is a video worth watching: “No Vibes Allowed: Solving Hard Problems in Complex Codebases” by Dex Horthy, HumanLayer.