Velocity vs Mastery in Software Engineering
March 20, 2026
AI coding tools have produced the most significant productivity shift in software engineering in a generation (and maybe in history). Agents have become tremendously capable of producing high quality code; as a result, engineers are less often authoring code directly, and more often generating it with the help of Cursor, Claude Code, or Codex.
But amid this productivity boom lies a risk that some talk about but few engage with seriously: a generation of engineers who can orchestrate code but cannot write it.
Engineering leadership in the near term (maybe not the long term, in the event that AGI arrives sooner rather than later) will have to face a decision: to what extent should they rely on AI agents to write their code? This is particularly salient when it comes to training junior engineers. How should managers direct their less experienced engineers to write code? Should junior engineers be chiefly orchestrators? Should they write all code by hand? Or something in between?
I think the answer lies somewhere in the middle, and getting to that answer requires managers to examine what they are indexing for.
What is AI good for?
In order to understand how best to leverage AI, a helpful starting point is to examine what it is good for: what does it achieve? In my experience, AI coding tools chiefly provide three benefits: 1) greater velocity at tasks you would formerly have written by hand; 2) the ability to take on tasks in unfamiliar domains or languages that you couldn't handle otherwise; and 3) faster prototyping and exploration (particularly relevant in my field of creative development).
Missing from the above list is any mention of learning or new skill acquisition. It may be the case that peripheral exposure to AI-generated code does help a programmer learn more about unfamiliar topics—but this is a second-order effect and very limited in its impact.
More realistically, anyone who "vibe codes" or directs AI agents to write code tends to act simply as a reviewer of that code. If the reviewer understands the code (as is the case when they are familiar with the domain), the code is good to merge; if the reviewer does not understand the code (for example, in an adjacent but not-yet-mastered subdiscipline), they simply accept all changes (perhaps after having another agent review them).
And so if AI coding rarely produces skill-building or knowledge acquisition, what role do juniors play in this new world? The tension becomes evident when you consider the two options at face value: encourage the engineer to use AI tooling to solve problems that would otherwise be beyond their capacity, which produces greater velocity, or discourage the use of AI, which instead produces greater mastery.
Focusing too narrowly on either of these goals will backfire, for individuals and teams alike. A team that focuses purely on velocity may move faster, but it will produce a brittle codebase it does not understand, which in turn makes that team nothing more than a glorified "Accept All Changes" button-presser. And those who focus only on mastery will benefit from a deeper understanding of the code they write, but at a significant cost to their speed.
Surely then, the solution lies in the middle. All engineering teams, and especially engineering leadership looking to steer junior engineers, should consider the balance of velocity and mastery. And one way they can do so is by considering the depth of the task at hand.
Deep vs Shallow Tasks
When I say "depth," I am essentially referring to the level of authorship and understanding an engineer should have when writing (or reviewing AI-generated) code. In my experience, tasks can broadly be categorized as either deep or shallow. Deep tasks are those that fall squarely within an engineer's core domain, where genuine understanding matters and knowing why the code works is as important as the code itself. These are the tasks where meaningful engagement with an agent (asking questions, comparing tradeoffs, rejecting bad patterns) builds the kind of expertise that makes someone a better engineer over time.
You may be wondering whether skill-building has any value in the first place. That question, pessimistic as it is, is worth considering. My thoughts: at least in the near term, there is a serious advantage to knowing how to prompt well, to steer agents mid-conversation, and to review and reject bad patterns in codebases. I have noticed significant differences between codebases built with a senior in the loop and those hacked together by a pure vibe coder. Eventually, as models continue to grow in capability, this gap may close and agents may gain the ability to work autonomously and steer themselves; but at that point the whole industry is cooked, so we might as well engage in this thought exercise.
On the other side of the spectrum are shallow tasks: those that AI can carry out tremendously well, and that serve no real purpose beyond the one-off task itself. As an example, I recently migrated ~50 articles from a client's Substack (a blogging platform) to Sanity (our content management system). Rather than migrating 50 articles by hand, I enlisted Codex to carry out the migration for me. What followed was a four-hour chat thread (consuming 40% of my weekly usage 😄), 1,500 lines of Python (a language I rarely use), and a successful content migration. Would it have been productive for me to review the Python code in this case? Surely not. This is a shallow task because AI can achieve a verifiable end goal (the content has been migrated and looks correct in the Sanity ecosystem), and understanding how it was done serves essentially zero purpose.
You may argue that understanding how this task was done could lend itself to making future migrations easier. I would say providing 1) the migration scripts and 2) the agent-generated summary of what was done to a new agent would be much more productive than having studied the scripts or summary myself.
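For a sense of what such throwaway code looks like, here is a heavily simplified sketch of one step an agent might generate for a migration like this: converting one exported post into a Sanity-style document. The document schema and field names here are hypothetical (Sanity does store rich text as Portable Text, but this sketch models only plain paragraphs), and a real migration would also fetch and parse the Substack export.

```python
# A minimal sketch of a one-off migration step, assuming each exported post
# has already been reduced to a title, plain-text paragraphs, and a date.
# The "article" schema below is hypothetical, not a real Sanity schema.
import re
from datetime import datetime, timezone

def slugify(title: str) -> str:
    """Lowercase the title and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def to_sanity_doc(post: dict) -> dict:
    """Convert one exported blog post into a Sanity-style document dict."""
    return {
        "_type": "article",
        "title": post["title"],
        "slug": {"_type": "slug", "current": slugify(post["title"])},
        "publishedAt": datetime.fromisoformat(post["date"])
            .astimezone(timezone.utc)
            .isoformat(),
        # Portable Text is an array of blocks, each holding spans of text;
        # here every paragraph becomes one plain "normal" block.
        "body": [
            {
                "_type": "block",
                "style": "normal",
                "children": [{"_type": "span", "text": p, "marks": []}],
            }
            for p in post["paragraphs"]
        ],
    }

doc = to_sanity_doc({
    "title": "Velocity vs Mastery",
    "paragraphs": ["First paragraph.", "Second paragraph."],
    "date": "2026-03-20T09:00:00+00:00",
})
```

The point is not this particular script; it is that code like this is disposable. Once the migrated content is verified in the destination system, the script has done its job, and studying it line by line would teach you little you'll ever reuse.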
And so all tasks fall somewhere on a spectrum from deep to shallow. Where a given task sits on that spectrum can usually be determined by answering a simple question: if AI didn't exist, would I be expected to know how to do this on my own? If the answer is yes, it's a deep task; if instead you would delegate to someone else, hire a contractor, or simply suffer, it's a shallow task.
So the challenge for engineering leadership is to give their teams a clear dichotomy: which tasks are deep and should be reviewed closely, and which tasks are shallow and can be completed without review.
Of course, viewing tasks as shallow or deep also requires a manager to revise expectations around velocity. Recall the earlier tradeoff between velocity and mastery: shallow tasks should produce velocity, while deep tasks should produce mastery. As a result, managers need to adjust expectations about their junior engineers' speed based on the depth of their work. What a glorified "Accept All Changes" button-presser could finish in one hour should take two, assuming the engineer is meaningfully engaging with the agent while the code is written (asking questions, explaining architectural decisions, comparing tradeoffs, and so on).
In this new age of agentic programming, there is a temptation for engineering leadership to expect the world of their teams: the world is moving fast, and AI coding is moving faster, so our team must move at unprecedented speeds. Although speed is a worthy goal, and these models are remarkably capable, recall the tradeoff between velocity and mastery. Would you want your team to move as fast as possible if it meant they did not understand, at the most fundamental level, what their code was doing? Probably not.
There's no playbook for perfectly balancing deep and shallow tasks. In practice, you'll likely only know you've gotten the balance wrong in hindsight: either your engineers have moved fast but can't explain what they've built, or they've developed real mastery but been outpaced by everyone else along the way. The goal is never to fully arrive at either extreme, and we can avoid both by intentionally calibrating when we optimize for mastery and when we optimize for speed.