Testing and Debugging — The Two Skills Senior Engineers Are Actually Paid For

Writing code is the easy part. Debugging at 2am with no logs and an angry CEO is the job. Here's what I learned about both.

Writing code is the easy part. I know this is heretical to say in 2026 — when "writing code" is the metric people obsess over, when LLM productivity claims are denominated in lines of code, when bootcamps grade you on coding challenges — but it's been true the whole time and it's even more true now. Writing the happy path of a feature is a skill any reasonable engineer has within a year of starting. The two skills that separate senior engineers from junior ones, by an enormous margin, are testing and debugging. These are the things you actually get paid for. These are the things AI tools help with least.

Let me describe both, because the conversation about both has been corrupted by enough industry cliche that the actual practice gets lost.

Testing as a discipline.

The point of testing is not to "make sure the code works." The point of testing is to give yourself the ability to change the code with confidence. The test suite is a freezing of behavior — when you change the code, the suite tells you whether you preserved the behavior you cared about. Without tests, every change is a roll of the dice. With tests, you can refactor aggressively, optimize confidently, and ship without panic.

This framing changes what you test and how. You don't test every line of code. You test the behavior that matters. The line of code that calculates a refund matters; it gets tested heavily. The line of code that decides which logging format to use doesn't matter; it doesn't get tested.

The mistake junior engineers make is testing too uniformly. They write a test for every method, including the trivial ones. The coverage is high. The tests catch nothing meaningful because the trivial methods didn't have bugs to begin with. The senior engineer's test suite is smaller, asymmetric, and weighted toward the parts of the codebase where being wrong is catastrophic.

What does "where being wrong is catastrophic" look like?

Money math. Anything involving currency, conversion, rounding, totals. Bugs here are public and embarrassing.
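To make the money case concrete, here is a minimal sketch of the kind of test I mean. The function split_refund and its rules (whole cents, leftover cents handed to the first shares) are hypothetical, invented for illustration; the point is the shape of the test, which checks the invariant "no cent is created or lost" rather than one happy-path value.

```python
from decimal import Decimal

CENT = Decimal("0.01")


def split_refund(total: Decimal, parts: int) -> list[Decimal]:
    """Split `total` into `parts` shares that sum exactly to `total`.

    Hypothetical rule for this sketch: work in whole cents and give any
    leftover cents to the first shares, one cent each.
    """
    if parts <= 0:
        raise ValueError("parts must be positive")
    cents = total / CENT
    if total < 0 or cents != cents.to_integral_value():
        raise ValueError("total must be a non-negative whole number of cents")
    base, leftover = divmod(int(cents), parts)
    return [(base + (1 if i < leftover else 0)) * CENT for i in range(parts)]


def test_shares_always_sum_to_total():
    # The invariant that matters: splitting never creates or loses a cent,
    # and no share differs from another by more than one cent.
    for total in ("0.00", "0.01", "0.05", "10.00", "99.99"):
        for parts in (1, 2, 3, 7):
            shares = split_refund(Decimal(total), parts)
            assert len(shares) == parts
            assert sum(shares) == Decimal(total)
            assert max(shares) - min(shares) <= CENT


def test_float_arithmetic_is_the_wrong_tool():
    # Binary floats cannot represent most decimal fractions exactly.
    assert 0.1 + 0.2 != 0.3
    assert Decimal("0.1") + Decimal("0.2") == Decimal("0.3")
```

The second test is there as documentation: it records why the code uses Decimal, so nobody "simplifies" it back to floats later.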

Authorization logic. Anything where the answer to "is this user allowed to do this thing" is computed. Bugs here are security incidents.

State transitions. Order goes from pending to paid to shipped to delivered. Each transition has rules. Bugs here become support tickets.
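A sketch of how I would test this, assuming the simplest possible rule set: an order can only move to the next state in the sequence above. The names ALLOWED, InvalidTransition, and advance are made up for the example, and a real system will have more states (cancellations, refunds) and more rules. The useful idea is that the test enumerates every pair of states, so the illegal transitions are checked as carefully as the legal ones.

```python
# Assumed rule for this sketch: each state may only move to the next one.
ALLOWED = {
    "pending": {"paid"},
    "paid": {"shipped"},
    "shipped": {"delivered"},
    "delivered": set(),
}


class InvalidTransition(Exception):
    """Raised when an order is asked to make a move the rules forbid."""


def advance(current: str, target: str) -> str:
    """Return the new state, or raise if the transition is not allowed."""
    if target not in ALLOWED.get(current, set()):
        raise InvalidTransition(f"{current} -> {target} is not allowed")
    return target


def test_every_pair_of_states():
    # The expected behavior is written out by hand, not derived from ALLOWED,
    # so a typo in the table cannot silently agree with the test.
    legal = {("pending", "paid"), ("paid", "shipped"), ("shipped", "delivered")}
    states = ["pending", "paid", "shipped", "delivered"]
    for current in states:
        for target in states:
            if (current, target) in legal:
                assert advance(current, target) == target
            else:
                try:
                    advance(current, target)
                except InvalidTransition:
                    continue
                raise AssertionError(f"{current} -> {target} should be rejected")
```

Four states give sixteen pairs, three legal and thirteen illegal. The thirteen are where the support tickets come from.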

Data transformations across boundaries. Inputs come in shape A, get processed, leave in shape B. The transformation has edge cases. Bugs here are usually invisible until customer data is wrong.

Anything that's "obviously correct." Code that the engineer didn't think to test because the intent was clear from the function name. This is where most production bugs live, because nobody verified the intent matched the implementation.

The discipline isn't "high coverage." The discipline is "every place where being wrong is expensive has tests that would catch the wrongness." Coverage numbers don't measure that. Stop using them as the goal.

Debugging as a discipline.

Brian Kernighan and P. J. Plauger wrote in The Elements of Programming Style (second edition, 1978): "Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?" The line is nearly fifty years old and the math hasn't changed.

Debugging is the skill of taking a symptom and reasoning backwards through a system to the cause. It is a science with a method. Engineers who are good at it are not "naturally talented"; they have learned a process. Engineers who are bad at it are missing the process, not the talent.

The process, simplified:

Reproduce. Get the bug to happen in front of you. If you can't reproduce, you can't debug. Reproducing might involve specific data, specific timing, specific user, specific browser. The reproduction is half the work. If you find yourself trying to fix something you couldn't reproduce, stop — you're not debugging, you're guessing.

Bisect. Find the part of the system where the bug is happening. If the symptom is "user sees wrong number on dashboard," is the bug in the data, the query, the API, the frontend, or the rendering? Each is a hypothesis. Each has a test you can run to confirm or deny. Eliminate possibilities methodically, not by jumping to conclusions.

The bisect should be informed by what's most likely. "It's never the compiler" is a useful heuristic — bugs in code you control are vastly more common than bugs in the language, the framework, the database, or the OS. Junior engineers spend hours blaming the framework when the problem is in their own three lines.

Form a hypothesis. Once you've localized, you should have a specific guess about what's wrong. "The query returns the wrong rows when the user has more than one role." "The cache is stale because we forgot to invalidate it on the write path." Write the hypothesis down. The act of writing forces it to be specific.

Test the hypothesis. Run the experiment. Either confirm or refute. Don't try to fix it before you've confirmed. Many junior engineers leap straight from "I think it's X" to "I'll fix X" and discover hours later that X was never the problem and now they've added a fix that also breaks something else.

Fix the cause, not the symptom. This is the part that separates senior from junior. The junior engineer fixes the symptom — adds a null check where the null should never have happened. The senior engineer figures out why the null happened and fixes that upstream. The null check is sometimes fine as belt-and-suspenders, but it's not the fix; the fix is wherever the unexpected null originated.

Write a regression test. The bug existed because no test caught it. Add the test that would have caught it. Now if it recurs, you'll know. This step is often skipped because the fix feels done; this is wrong. The fix without the test is half done.
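Here is a small sketch of the last two steps together, under an invented scenario: rows with a missing email used to flow into the system as None and crash code far downstream. The functions load_user and email_domain are hypothetical. The fix lives at the boundary where the bad value entered, and the regression test pins that behavior down.

```python
def load_user(row: dict) -> dict:
    """Boundary where raw rows enter the system.

    Hypothetical bug: rows with no email used to pass through as None and
    blow up much later. The fix rejects them here, at the source.
    """
    email = row.get("email")
    if not email:
        raise ValueError(f"user {row.get('id')!r} has no email")
    return {"id": row["id"], "email": email.strip().lower()}


def email_domain(user: dict) -> str:
    """Downstream code. It needs no defensive None check once the boundary is fixed."""
    return user["email"].rpartition("@")[2]


def test_regression_missing_email_is_rejected_at_the_boundary():
    # Reproduces the original failure input and asserts the new behavior.
    for bad_row in ({"id": 7, "email": None}, {"id": 8}, {"id": 9, "email": ""}):
        try:
            load_user(bad_row)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for {bad_row!r}")


def test_valid_rows_still_load():
    user = load_user({"id": 1, "email": "  Ada@Example.com "})
    assert user == {"id": 1, "email": "ada@example.com"}
    assert email_domain(user) == "example.com"
```

The symptom-level fix would have been a None check inside email_domain. That silences the crash and leaves every other consumer of the same bad record exposed.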

Postmortem if it was production. Why did this happen? Why didn't we catch it earlier? What systemic change would prevent the next one? Blameless postmortems are a real discipline; many teams claim to do them and don't. The signal for a real postmortem is that something systemic gets changed (a process, a test, a monitor, a guardrail). The signal for a fake one is that someone gets blamed and an action item gets logged but never closed.

Production debugging is its own skill.

Local debugging is easy. You have the code, the data, the breakpoint debugger, the time to think. Production debugging is hard. You may not be able to attach a debugger. You may not be able to reproduce. You may have to debug with logs, traces, and metrics — instrumentation that was added before the bug happened. The investment in observability before the incident is what determines how fast you can debug during the incident.

This is the part most teams under-invest in. They write code, they ship it, they don't add structured logging at the seams. When the incident comes, they're squinting at unstructured printf logs trying to figure out what happened. The fix is to invest in observability as a discipline, not as an afterthought: structured logs at every meaningful boundary, traces that span services, metrics on the business behavior (not just CPU and memory), error tracking that captures stack traces in production with the right context.
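For a sense of what "structured logs at a boundary" means in practice, here is a minimal sketch using only Python's standard logging and json modules. The JsonFormatter class, the "context" key, and the field names (order_id, amount_cents) are my own conventions for the example, not a standard; most teams will reach for an existing library, but the idea is the same: one machine-parseable record per event, with the business identifiers attached.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "event": record.getMessage(),
        }
        # Convention assumed for this sketch: callers pass structured fields
        # via extra={"context": {...}}, which logging attaches to the record.
        payload.update(getattr(record, "context", {}))
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload, default=str)


def build_logger(name: str, stream) -> logging.Logger:
    """Create a logger that writes JSON lines to `stream`."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


# Usage: log a business event at the seam where the refund is issued.
buffer = io.StringIO()
log = build_logger("checkout", buffer)
log.info("refund_issued", extra={"context": {"order_id": "o-123", "amount_cents": 1999}})
```

During an incident, "show me every event for order o-123" becomes a filter on a field. With printf-style logs it is a regex and a prayer.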

The teams that have great observability debug production issues in minutes. The teams without it spend hours, and sometimes never find the root cause. Compound this across a year and the difference in engineering output is huge.

A few specific debugging techniques worth naming.

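The rubber duck. Explain the problem out loud, step by step, to a rubber duck on your desk or to anyone who will listen; the listener doesn't need to be an engineer, or even alive. You will often solve the bug partway through the explanation. The mechanism by which this works is fascinating and unimportant; the empirical fact is it works. Use it.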

Read the actual error. Read the whole stack trace, including the chained causes. Read it twice. The number of engineers who skip past the actual error message and start guessing is high, and it's almost always faster to read the message carefully than to guess.
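"Chained causes" deserves a concrete picture. In Python, an exception raised with "raise ... from ..." keeps a link to the error that triggered it, and the root cause is usually at the far end of that chain, not in the top-level message. The sketch below uses made-up functions (parse_port, load_config) and a small helper, causal_chain, that walks the standard __cause__ and __context__ attributes.

```python
def parse_port(raw: str) -> int:
    """Low-level step that can fail with a very specific message."""
    return int(raw)


def load_config(settings: dict) -> int:
    """Higher-level step that wraps the failure in a vaguer one."""
    try:
        return parse_port(settings["port"])
    except ValueError as exc:
        raise RuntimeError("could not load config") from exc


def causal_chain(exc: BaseException) -> list[str]:
    """List every error from the outermost one down to the root cause."""
    chain = []
    current = exc
    while current is not None:
        chain.append(f"{type(current).__name__}: {current}")
        if current.__cause__ is not None:
            current = current.__cause__          # explicit "raise ... from ..."
        elif not current.__suppress_context__:
            current = current.__context__        # implicit chaining
        else:
            current = None                       # "raise ... from None"
    return chain


try:
    load_config({"port": "80a"})
except RuntimeError as error:
    chain = causal_chain(error)
```

Here the first entry in chain is the vague "could not load config" and the last is the ValueError naming the exact bad input. An engineer who stops reading at the first line starts guessing about config files; one who reads to the end fixes a typo.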

Print debugging is real debugging. Some engineers look down on print/console.log debugging as primitive compared to a step debugger. This is snobbery. Print debugging is sometimes the right tool, particularly in distributed systems, async code, and production-only bugs. The right tool depends on the problem, not on whose toolkit it's in.

The bisect against git history. When a bug appeared "recently," git bisect is a power tool. Mark a known-good commit and a known-bad commit, and git will binary-search the commits in between, checking out one at a time for you to test, until it lands on the exact change that introduced the bug. Because the search halves the range at every step, a thousand candidate commits take about ten checks. This finds bugs that hours of code review wouldn't. Many engineers have never used it. Learn it.
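The real workflow is a handful of commands: git bisect start, then git bisect bad and git bisect good to mark the endpoints, and optionally git bisect run with a script so git can drive the whole search itself. To show why it is so fast, here is a sketch of the underlying binary search. This is an illustration of the idea, not git's implementation; the function first_bad and the commit names are hypothetical, and it assumes a linear history where the bug, once introduced, stays.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Find the first bad commit by binary search.

    `commits` is ordered oldest to newest. commits[0] must be known good
    and commits[-1] known bad; neither endpoint is re-tested.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid      # bug is already present: look earlier
        else:
            good = mid     # bug not present yet: look later
    return commits[bad]


# Usage: 1000 hypothetical commits, bug introduced at c613.
history = [f"c{i}" for i in range(1000)]
checked = []


def shows_bug(commit: str) -> bool:
    checked.append(commit)
    return int(commit[1:]) >= 613


culprit = first_bad(history, shows_bug)
```

In this run the search tests at most ten commits out of a thousand. The expensive part of bisecting is never the search; it is having a reliable way to answer "is the bug present at this commit," which is one more argument for writing the reproduction down as a script.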

Differential debugging. If something works in environment A and not environment B, find the differences. Same code, different config? Different OS? Different data? Different traffic pattern? The diff between A and B is your candidate bug list.
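The mechanical half of this is worth automating. A rough sketch, assuming you can dump each environment's settings into a flat dictionary: the helper diff_settings and every key and value below are invented for illustration.

```python
MISSING = "<missing>"


def diff_settings(a: dict, b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for every key that differs.

    Keys present on only one side are reported with a MISSING marker.
    """
    keys = sorted(a.keys() | b.keys())
    return {
        key: (a.get(key, MISSING), b.get(key, MISSING))
        for key in keys
        if a.get(key, MISSING) != b.get(key, MISSING)
    }


# Hypothetical dumps from the environment that works and the one that doesn't.
staging = {"python": "3.12", "db_pool_size": 5, "tz": "UTC", "feature_x": True}
production = {"python": "3.12", "db_pool_size": 50, "tz": "America/New_York"}

candidates = diff_settings(staging, production)
```

With these inputs, candidates holds three entries: db_pool_size, feature_x, and tz. That is the whole suspect list, and each one is a hypothesis you can test by changing a single setting at a time.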

Slow down to go fast. When you're three hours into a debugging session and getting nowhere, the urge is to keep grinding. The right move is often the opposite — stop, walk away for ten minutes, come back fresh. The cognitive cost of being stuck in a wrong hypothesis is enormous. Stepping away resets the search.

Two cultural notes.

Engineers who can debug well are wildly more valuable than engineers who can't. This is not visible on a resume. It barely shows up in coding interviews. It only emerges over time. The way to identify these engineers is to watch how they react when something breaks in production. The good ones get focused and methodical. The bad ones panic or guess. Pay attention to the difference; promote and protect the good ones.

The same is true of testing. The engineers who care about testing are different from the engineers who don't. They tend to ship more reliable software, mentor better, and build more durable codebases. They're also often the ones who don't get the most credit, because their value shows up as the absence of incidents — which is invisible. Make a point of crediting them.

Hunt and Thomas's The Pragmatic Programmer is the best single book on engineering as a discipline that I know of; the sections on debugging and on assertive programming are required reading for anyone leveling up out of mid-career.

Writing code is the easy part, and it gets easier every quarter. The cost of producing a working first draft has fallen by an order of magnitude in the AI-tooling era, and it will keep falling. What hasn't fallen is the cost of knowing whether the code is right. Testing and debugging are the disciplines that close that gap. The AI tools are going to keep getting better at writing. They're not going to get better as fast at noticing what's wrong with what they wrote, because that requires a model of the surrounding system that the tool usually doesn't have. That's where senior engineering careers live in 2026 and beyond. Be the engineer who can take the AI's first draft and find the bugs in it before they reach production.