Practically every company today is posting roles for an “AI engineer,” likely assuming that one person can handle everything from product development to infrastructure to data integration. Most of the time, though, they’re going to be disappointed.
That’s because assessing the competency of engineers has always been hard, adding AI to the mix makes it even harder, and companies are often testing for the wrong thing. Under the umbrella of “AI engineer,” they’re collapsing at least three different types of technical work into a single job description, then wondering why the person they hired can’t do the job they need done.
At Andela, where we assess, train and vet software engineering talent as the core of our business, we’re finding that basic AI skills assessments produce an almost 75% fail rate.
My first reaction when I saw the 75% fail rate was that we had an assessment problem. But the more I dug in, the clearer it became that the problem was upstream. Those candidates weren’t failing because they lacked skills; many of them are exceptional engineers. They were failing because the entire industry relies on assessment frameworks that can’t distinguish between the types of AI work that need to be done.
Consider this situation. I was recently reviewing assessment results for a batch of AI engineering candidates. One candidate stood out: strong resume, passed the coding assessment and defined every concept we threw at them: RAG architectures, agentic search, vector databases, prompt chaining. On paper, this person had the skills.
Then we got to design. We presented a real enterprise scenario and asked which approach they’d use and why. The candidate described a RAG implementation. The solution was technically valid. But for this use case, a RAG implementation would have required significantly more engineering while producing less complete results than an agentic search approach. (The problem required dynamic reasoning across multiple data sources rather than retrieval from a fixed index.) The candidate knew the concepts but lacked the judgment to know which solution was dramatically better for the specific problem.
I’d call that a gap in technical taste: the ability to choose between valid options and find the one that’s right for a specific context. And it’s the gap our assessments, and almost every assessment pipeline in the industry, weren’t built to catch.
And it’s costing real money. Enterprises are burning months on mismatched hires, misaligned teams and AI initiatives that stall, not because the technology failed, but because the people doing the work were the wrong people for that particular work. ManpowerGroup’s 2026 Talent Shortage Survey underscores the difficulty: it found that AI skills have surpassed all others to become the hardest for employers to find globally, with 72% of employers reporting hiring difficulty.
Digging deeper
In my previous article, I spoke about how enterprises should seek to hire Forward Deployed Engineers (FDEs) who can bridge engineering, architecture and business strategy, to push AI past the ‘integration wall’ and into production. FDEs are the expedition leaders. No company has enough of them. No company can afford to hire enough of them for all the work ahead.
So, what do you do below the FDE layer? You have to dig deeper. For every one FDE, teams will need three or four engineers operating in more specific modes of work. In our experience, the AI work that enterprises need done falls along a spectrum defined by three archetypes.
- Prototypers. These are the rapid experimenters. They are engineers, product managers or designers who use AI tools to quickly test ideas, find value and throw away what doesn’t work. In a previous era, validating a new product concept meant scoping a project, building a team and committing to a six-month build cycle. Now one person with the right tools and good instincts can shortcut that entire process, testing and discarding dozens of ideas to find the ones worth investing in. The prototyper’s technical taste is about sensing what’s valuable before an organization commits real resources.
- Builders. The engineers who turn validated ideas into production systems. A builder needs to do more than ‘vibe code.’ They need to operate as agentic engineers: architecting the system, orchestrating the agents to build it, verifying the output and shipping with confidence. Critically, building in an AI context means building the full stack, including the data pipelines that organize content from disparate systems, the access controls that govern what the AI can reach and the integration layer that connects AI to the messy reality of enterprise data and infrastructure. Without this end-to-end capability, AI stays trapped in sandboxes. The builder’s technical taste is about choosing the right architecture and integration approach when multiple valid options exist and knowing which one will be dramatically better for a specific production context.
- Scalers. The engineers responsible for reliability, governance, observability and production AI operations. These professionals were, in a previous era, DevOps engineers. They know how to deploy LLMs and manage the liability of model output at enterprise scale. The scaler’s technical taste is about tradeoffs: performance versus cost, governance rigor versus development velocity and risk tolerance versus time to market.
These aren’t rigid job categories. They’re patterns of AI engagement. In practice, they blend. A backend engineer on a given project might spend 60% of their time doing builder work and 40% on scaling. The point isn’t to put people into boxes. It’s to give enterprises a vocabulary for decomposing what they need, so they stop collapsing fundamentally different work into a single job posting.
These patterns have different toolchains, different skill profiles and different hiring criteria. Companies that treat them as interchangeable will end up building subpar teams. Understanding where your AI initiatives fall along this spectrum is one of the most important change-management decisions enterprises face in an AI-first world, and it’s why companies that identify their specific location on this spectrum move dramatically faster than those hiring generically.
How to think about AI talent
Here’s where it gets practical. Prototypers, builders and scalers are not job titles. They’re lenses that sit on top of the domain roles enterprises have always hired for: frontend engineers, backend engineers, data engineers, DevOps/SRE and so on. To move from the vague ‘AI engineer’ to a structured picture of what you need, you have to think across three dimensions.
- Role is the foundation: what technical domain does this person work in? Backend, data engineering, DevOps/SRE, full stack? These are the roles enterprises have always hired for. They come with foundational skills like API design, database architecture and CI/CD pipelines. And they come in specific flavors: a Python backend engineer is not a Java backend engineer. This layer hasn’t changed because of AI.
- Seniority determines the level of judgment and autonomy you can expect. A senior backend engineer with 10 years of experience brings architectural instincts and decision-making under ambiguity that a two-year engineer doesn’t. Seniority is also where technical taste compounds. An engineer with deep experience has seen more tradeoffs, made more wrong calls and developed the pattern recognition that allows them to make better-than-default decisions. Not every role on an AI initiative requires a senior engineer, but the roles that involve system design decisions, risk trade-offs and client-facing judgment absolutely do.
- AI engagement pattern is how this person engages with AI systems. This is the archetype layer, and it’s what’s new. A backend engineer doing builder work (designing the orchestration logic for an agentic workflow and integrating it with enterprise data) needs fundamentally different technical tastes than that same backend engineer doing scaler work (deploying LLM infrastructure and building observability for model performance). The role is the noun. The archetype is the adjective. And it changes what you need to test for.
In practice, certain role families map naturally to certain AI engagement patterns. Prototypers can come from anywhere (engineering, product, design) and are often already on your team. They’re the person who’s always building side projects and testing ideas. Builders tend to draw from full-stack, frontend, backend, data engineering and AI/ML talent. Scalers tend to draw from DevOps/SRE, security, backend and infrastructure engineering. Forward-deployed engineers span all the above with business acumen and stakeholder fluency.
Hiring with precision
This multi-dimensional view is what allows enterprises to stop hiring for a vague ‘AI engineer’ and start composing teams with precision. It’s also what makes a credible assessment strategy possible, because now you know what you’re testing for at each level.
This article is published as part of the Foundry Expert Contributor Network.

