Pros Don't Vibe, They Control: What Developers Reveal About AI Agent Use


The loudest voices on Twitter will tell you vibe coding is the future. Just describe what you want, let the agent rip, and ship it. Don’t read the diffs. Forget the code exists. Trust the vibes.

A research team at UC San Diego studied 112 professional developers (3-25 years of experience) and found the exact opposite. Their most salient finding: professional software developers do not vibe code. They carefully control the agents through planning and supervision.

This isn’t a survey of hobbyists building weekend projects. The study combined 13 field observations (45-minute recorded sessions of developers doing real work) with a qualitative survey of 99 experienced developers. The participants used Claude Code, Cursor, GitHub Copilot, and Windsurf on production software, side projects, and R&D work. The median experience was 10 years.

Here’s what the data actually shows about how pros use AI agents.

The Control Gap: What Pros Do Differently

Every single observed developer, all 13 of them, controlled the software design and implementation when working with agents. Not most. All.

11 out of 13 created new features during the study. Every one of them controlled the design of those features, either creating the plan entirely themselves or reviewing agent-generated plans against their engineering expertise. Even when working on unfamiliar tasks, they didn’t hand over the reins.

The strategies broke down into three levels of implementation control:

| Control Level | Behavior | Participants |
|---|---|---|
| Monitored execution | Specified requirements, let agent implement, closely watched outputs | P1, P4, P5 |
| Reviewed every change | Let agents generate code, carefully reviewed all diffs | P2, P3, P6, P7, P8, P9, P10, P11, P13 |
| Used agent as reference only | Coded manually, used agent for explanation and architecture diagrams | P12 |

Even the “monitored execution” group wasn’t vibing. They rejected unnecessary dependencies the agent tried to install. They manually traced through misbehaving code with debuggers. They were skeptical and controlling despite not reading every line.

Why this matters: The study explicitly addressed why pros refuse to vibe. Four reasons emerged:

  1. Software engineering principles are hard-won habits, not optional decorations
  2. Production code affects real users and involves stakeholders who define requirements
  3. For familiar codebases, developers have better context than agents
  4. When agent solutions go wrong on unfamiliar tasks, fixing them is frustrating and slow

As one survey respondent (S64) put it: “AI agents become problematic once you’re not making them adhere to engineering principles that have been established for decades.”

The Anatomy of a Pro Prompt

One of the most interesting parts of the study is how the researchers dissected the actual prompts developers used. They identified 10 types of context that pros embed in their prompts:

| Context Type | How Many Participants Used It |
|---|---|
| Technical terms | 12/13 |
| Reference to input files | 10/13 |
| Specific library or API | 10/13 |
| Domain objects | 8/13 |
| UI or design terms | 7/13 |
| Interaction descriptions | 7/13 |
| New feature or requirements | 7/13 |
| Reference to output file | 6/13 |
| Purpose of feature | 7/13 |
| Reference to step in plan | 5/13 |

Here’s an actual prompt from the study that demonstrates what “clear context and explicit instructions” looks like in practice:

Figure: An example prompt from a study participant, showing seven types of context embedded in a single prompt for implementing a full-stack feature.

This prompt from participant P3 packs seven of those 10 context types into a single request, including data types, domain objects, specific UI interactions, and output files. It ends with the critical constraint: “Please do just step 1 now.”

That last line is the key. Participants with plans of 70+ steps still only let the agent execute 2.1 steps on average per prompt. They chunked the work, verified after each chunk, then continued.
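The chunk-then-verify loop can be sketched in a few lines of Python. This is only an illustration of the habit described above, not tooling from the study: `send_to_agent` and `verify` are hypothetical callbacks standing in for whatever agent and whatever review or test step you use, and the default chunk size of 2 is an assumption loosely based on the 2.1-step average.

```python
from typing import Callable, Iterator, List


def chunk_plan(steps: List[str], chunk_size: int = 2) -> Iterator[List[str]]:
    """Yield consecutive slices of the plan, chunk_size steps at a time."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(steps), chunk_size):
        yield steps[start:start + chunk_size]


def run_plan(
    steps: List[str],
    send_to_agent: Callable[[str], None],   # hypothetical: hands a prompt to your agent
    verify: Callable[[List[str]], bool],    # hypothetical: your review, tests, or manual check
    chunk_size: int = 2,
) -> int:
    """Send the plan in small chunks and stop at the first failed check.

    Returns the number of steps that were executed and verified.
    """
    done = 0
    for chunk in chunk_plan(steps, chunk_size):
        numbered = "\n".join(f"{done + i + 1}. {s}" for i, s in enumerate(chunk))
        send_to_agent(f"Please do just these steps now:\n{numbered}")
        if not verify(chunk):
            break  # the human takes over here: review, fix, or re-prompt
        done += len(chunk)
    return done
```

The point of the design is that the agent never sees more of the plan than the developer is ready to check, and a failed check halts everything that follows.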

Why this works: Prompting agents is not about writing English prose. It’s about translating software engineering specifications into agent instructions. As one respondent said, they approach it by “applying the lessons of software engineering to narrative.”
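One way to make the “mini-specification” idea concrete is to treat a prompt as structured data that gets rendered into text. The sketch below is an assumption-laden illustration, not the study's method: the `PromptSpec` class, its field names, and the example file names are all hypothetical, and the fields cover only some of the context types from the table above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptSpec:
    """Hypothetical container for some of the context types pros put in prompts."""
    requirement: str
    purpose: str = ""
    input_files: List[str] = field(default_factory=list)
    output_files: List[str] = field(default_factory=list)
    libraries: List[str] = field(default_factory=list)
    domain_objects: List[str] = field(default_factory=list)
    plan_step: str = ""

    def render(self) -> str:
        """Build the prompt text, skipping any context that was left empty."""
        parts = [f"Requirement: {self.requirement}"]
        if self.purpose:
            parts.append(f"Purpose: {self.purpose}")
        labelled_lists = [
            ("Read these files", self.input_files),
            ("Write changes to", self.output_files),
            ("Use these libraries/APIs", self.libraries),
            ("Domain objects involved", self.domain_objects),
        ]
        for label, items in labelled_lists:
            if items:
                parts.append(f"{label}: {', '.join(items)}")
        if self.plan_step:
            # Mirrors the scoping constraint quoted from P3's prompt.
            parts.append(f"Please do just {self.plan_step} now.")
        return "\n".join(parts)

    def context_count(self) -> int:
        """Number of context fields that are actually filled in."""
        values = [
            self.requirement, self.purpose, self.input_files, self.output_files,
            self.libraries, self.domain_objects, self.plan_step,
        ]
        return sum(1 for v in values if v)


# Illustrative values only; these files and objects are made up.
spec = PromptSpec(
    requirement="Add a status filter to the orders table",
    purpose="Support staff need to find unpaid orders quickly",
    input_files=["api/orders.py"],
    output_files=["web/OrderTable.tsx"],
    libraries=["React"],
    domain_objects=["Order", "OrderStatus"],
    plan_step="step 1",
)
print(spec.render())
```

A structure like this makes a thin prompt visible before it is sent: if `context_count()` comes back as 1, the request is closer to a vibe than a specification.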

What Agents Are Actually Good At (and Bad At)

The survey asked developers to reflect on task suitability. The results create a clear picture of where agents deliver and where they fall apart. Out of 59 fine-grained task types mentioned in at least 5 survey responses:

Suitable (strong consensus):

| Task | Suitable | Unsuitable |
|---|---|---|
| Accelerating productivity | 35 | 2 |
| Small/straightforward tasks | 33 | 1 |
| Following well-defined plans | 28 | 2 |
| Generating new code | 27 | 2 |
| Tedious/repetitive tasks | 26 | 0 |
| Scaffolding or boilerplate | 25 | 0 |
| Writing tests | 19 | 2 |
| Writing/updating documentation | 20 | 0 |
| General refactoring | 18 | 3 |
| Prototyping or small projects | 12 | 0 |

Unsuitable (strong consensus):

| Task | Suitable | Unsuitable |
|---|---|---|
| One-shotting code without verification | 5 | 23 |
| Integrating with existing/legacy code | 3 | 17 |
| Complex tasks | 3 | 16 |
| Business logic/domain knowledge | 2 | 15 |
| Replacing human expertise | 0 | 12 |
| Writing performant code | 3 | 9 |
| High-stakes or privacy-sensitive | 0 | 8 |
| Big tasks | 1 | 7 |
| Vague/open-ended tasks | 0 | 7 |

The controversial middle ground: High-level planning (13 suitable vs. 23 unsuitable) and general debugging (12 vs. 8) split opinions. Some developers used agents as brainstorming partners for architecture. Others wouldn’t touch it: “I never trust LLMs for systematic issues” (S68).

The pattern is clear: as task complexity increases, agent suitability drops. Agents are accelerators for well-scoped work, not autonomous engineers.
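To see how lopsided or split these counts are, it helps to turn each pair into a share of “suitable” mentions. The sketch below uses a handful of rows from the numbers above. The 0.8 and 0.2 cutoffs are an assumption made for illustration; they are not thresholds from the study.

```python
from typing import Dict, Tuple


def suitable_share(suitable: int, unsuitable: int) -> float:
    """Fraction of mentions that rated the task as suitable for agents."""
    total = suitable + unsuitable
    if total == 0:
        raise ValueError("task has no mentions")
    return suitable / total


def classify(suitable: int, unsuitable: int,
             high: float = 0.8, low: float = 0.2) -> str:
    """Label a task by consensus. The cutoffs are illustrative assumptions."""
    share = suitable_share(suitable, unsuitable)
    if share >= high:
        return "suitable"
    if share <= low:
        return "unsuitable"
    return "contested"


# (suitable, unsuitable) counts copied from the survey figures in this article.
counts: Dict[str, Tuple[int, int]] = {
    "Tedious/repetitive tasks": (26, 0),
    "Writing tests": (19, 2),
    "General debugging": (12, 8),
    "High-level planning": (13, 23),
    "Complex tasks": (3, 16),
    "Replacing human expertise": (0, 12),
}

for task, (s, u) in counts.items():
    print(f"{task}: {suitable_share(s, u):.0%} suitable -> {classify(s, u)}")
```

With these cutoffs, general debugging (60% suitable) and high-level planning (about 36% suitable) land in the contested band, while the rows from the two consensus tables fall cleanly on either side.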

The Quality Attributes That Don’t Budge

When the researchers asked what developers cared about most when using agents, 67 out of 99 survey respondents mentioned software quality attributes, outnumbering the 37 who mentioned productivity.

Figure: Software quality attributes that experienced developers prioritize when working with AI agents.

Correctness and readability dominated. Developers didn’t relax their standards just because an agent wrote the code. If anything, they raised them. Five observation participants felt more willing to test code systematically while using agents. One participant (P6) reinforced test-driven development by having agents generate test cases for every change, citing higher test coverage than before because “it’s part of the workflow now.”

The quality bar doesn’t lower for agent-generated code. It gets applied harder.
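A habit like P6's, tests for every change, can be backed by a small automated check. The following is a minimal sketch under stated assumptions, not a description of P6's setup: it assumes Python sources, assumes the test for `foo.py` is named `test_foo.py`, and expects the list of changed paths to come from elsewhere (for example, from your version control tooling).

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def untested_changes(changed_paths: Iterable[str]) -> List[str]:
    """Return changed source files that have no matching changed test file.

    Assumed naming convention: the test for foo.py is called test_foo.py.
    """
    paths = [PurePosixPath(p) for p in changed_paths]
    changed_tests = {p.name for p in paths if p.name.startswith("test_")}
    missing = []
    for p in paths:
        is_source = p.suffix == ".py" and not p.name.startswith("test_")
        if is_source and f"test_{p.name}" not in changed_tests:
            missing.append(str(p))
    return missing


# Hypothetical change set: tax.py was edited without touching its tests.
change_set = ["src/cart.py", "tests/test_cart.py", "src/tax.py", "README.md"]
print(untested_changes(change_set))
```

A check like this does not prove the tests are good. It only flags agent-written changes that arrived with no tests at all, which is the cheapest failure to catch.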

The Sentiment Paradox: Happy but Skeptical

Here’s where it gets interesting. Despite all the control and skepticism, developers overwhelmingly enjoy working with agents:

| Metric | Average Rating |
|---|---|
| Enjoyment (1-6 scale) | 5.11/6 |
| Task suitability (1-6 scale) | 4.73/6 |
| Code modification frequency (1-5 scale) | 3.0/5 (about half the time) |

77 out of 99 respondents rated their enjoyment in the top two categories. Developers found coding fun again: “This has made code fun again. I’m producing things that I didn’t have time or energy to do before. It’s like rediscovering computers again for the first time” (S8).

But the enjoyment comes from collaboration, not delegation. “I like coding alongside agents. Not vibe coding. But working with” (S96). The F1 car metaphor from another respondent captures it perfectly: “It felt like driving a F1 car. While it also felt like getting stuck in traffic jam a lot, I still felt optimistic about it” (S24).

The key insight: In these responses, enjoyment goes hand in hand with control, not with letting go. The developers quoted here enjoy agents while staying in the driver’s seat.

What This Means for Your Workflow

The study isn’t just academic. It maps directly onto practical strategies:

Prompt with specificity, not vibes. Include file names, function names, library references, domain objects, and expected behavior. Treat every prompt like a mini-specification. The example prompt from P3 packed seven types of context into a single request.

Chunk your plans. Even developers with plans of 70+ steps let agents execute only about two steps per prompt (2.1 on average). Verify after each chunk. As one participant put it: “Please do just step 1 now.”

Use agents for the right tasks. Boilerplate, scaffolding, tests, docs, simple refactoring, and code generation from clear specs. Pull back for complex logic, business rules, architecture decisions, and legacy code integration.

Test harder, not softer. Several participants reported that working with agents actually increased their testing discipline. Agent-generated code needs verification. Build that into the workflow rather than fighting it.

Stay the pilot. “I do everything with assistance but never let the agent be completely autonomous. I am always reading the output and steering” (S83). In the suitability data, no respondent rated agents as suitable for replacing human expertise (0 suitable vs. 12 unsuitable).

The Bottom Line

The vibe coding narrative is seductive. It suggests we’re one model upgrade away from describing apps in English and shipping them. The data tells a different story.

112 experienced developers, median 10 years of experience, using the best tools available in 2025, found that agents are powerful accelerators for well-defined, straightforward tasks. But the moment you stop controlling the agent, the moment you “give in to the vibes,” quality drops and frustration climbs.

The future of AI-assisted development isn’t vibing. It’s engineering with better tools. The developers who get the most out of agents are the ones who bring the most engineering discipline to the table.

As S28 summarized in the study’s own conclusion: “No matter how you slice it, agents are extremely accelerating. One word of warning to the young ones getting into the business: it’s still really important to know what you’re doing.”


Working with AI agents in your development workflow? I’d love to hear what control strategies work for you. Reach out on LinkedIn.