Technical Hiring in the AI Era: What I Evaluate Has Changed

by Sylvain Artois on Mar 31, 2026

  • #Management

The Black Place, 1943 - Georgia O’Keeffe - www.nga.gov
AI-generated translation

For years, my favourite method for technical hiring was the live code review. I would pick past merge requests from the codebase — debatable ones, imperfect ones, with design issues, questionable implementations — merge requests you can actually discuss. Sometimes I would pair this review with a live coding session via CodeSandbox or similar, where together we would reimplement a loop, a function, a small module.

This approach had huge advantages:

  • It removed the need for a take-home test. No weekend spent on an artificial exercise. The candidate showed up empty-handed.
  • It gave a precise reading of the candidate’s level. Not only on what they noticed or missed in the code, but on how they approached the problem. For example, many candidates do not scroll to the bottom of the merge request, or do not open the file tree in GitHub. Personally, I always do: I start by looking for the big picture, since the important file might not be the first one in the MR. So you are also evaluating analytical sense, a form of exploratory rigour.
  • It allowed alignment on values. I was showing real code, the code the candidate would be working on. I believe it is essential that candidates know what they are getting into. I have sometimes worked on projects with heavy technical debt and almost no test coverage. That kind of situation can be a no-go for some developers, and that is understandable. But you need to find out before signing.
  • It was a communication exercise. And communication in a development team is an undervalued soft skill. I have seen candidates paralysed by fear, despite all my efforts to make the atmosphere relaxed. I have seen others extremely critical of the code — which I can certainly accept, even encourage — but when it comes to communication, how you say it matters as much as what you say.

Right. That was before AI.

Our profession has changed. Completely.

In April 2025, Tobi Lütke, CEO of Shopify, sent an internal memo that went viral in the tech world: “Reflexive AI usage is now a baseline expectation of everyone at Shopify.” Even more radical: before requesting an additional hire, each manager must prove that the work cannot be done by AI. AI usage has become a performance evaluation criterion for all employees, without exception.

Data from the field confirms this shift. The Developer Skills Report 2025 by HackerRank reports that 97% of developers integrate AI into their workflow, with on average a third of code produced by AI assistants. 61% use two or more AI tools daily. On the recruiter side, 71% of hiring managers say that AI makes technical skills harder to evaluate.

The direct consequence: traditional technical tests are in crisis. A candidate can paste a problem statement into an AI assistant and get a working solution in seconds. Telling apart someone who understands the concepts from someone who relays generated code has become nearly impossible in a classic format.

Meta leads the way

The strongest signal of 2025 came from Meta, which introduced an AI-assisted coding interview during its onsites. The format: 60 minutes on CoderPad, with access to GPT-4o, Claude Sonnet, Gemini 2.5 Pro or Llama 4, at the candidate’s choice. The problem is no longer an isolated algorithmic exercise but a multi-file project on which the candidate must iterate. The guiding principle: use AI for well-defined subtasks, not to solve the problem end to end. Demonstrate critical review of each output.

This format is expected to be extended to all back-end and ops positions in 2026.

Kent Beck and the question of taste

Kent Beck, in his Substack Tidy First?, draws an essential distinction between vibe coding and augmented coding:

In vibe coding you don’t care about the code, just the behavior of the system. In augmented coding you care about the code, its complexity, the tests, & their coverage.

Beck argues that yak shaving — those tedious preparation tasks — disappears thanks to AI. The programmer refocuses on the decisions that matter: architecture, design, simplicity. And above all, taste. “When anyone can build anything, knowing what’s worth building becomes the skill.”

This is exactly the intuition that guides my thinking about hiring.

What I evaluate now

The ability to prompt

There are a few patterns that are essential to know:

  • Few-shot prompting: providing examples in the prompt to guide the model towards the expected format or reasoning.
  • Chain-of-thought prompting: asking the model to reason step by step, which significantly improves answer quality on complex problems.
  • Iterative refinement: not expecting a perfect answer on the first try, but refining progressively by giving the model feedback.

A candidate who knows and applies these patterns does not work the same way as one who sends a one-line prompt and hopes for a magic result.
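
For concreteness, the first two patterns can be sketched as plain prompt construction. The conversion task and the examples below are invented for illustration; iterative refinement would simply loop this, appending feedback to the conversation each time:

```python
# Sketch of the few-shot and chain-of-thought patterns as plain string
# assembly. The conversion task and the examples are invented for
# illustration; a real model call would consume `prompt` at the end.

FEW_SHOT_EXAMPLES = [
    "Input: 'user_name'  -> Output: 'userName'",
    "Input: 'created_at' -> Output: 'createdAt'",
]

def build_prompt(identifier: str, examples: list[str]) -> str:
    """Show examples first (few-shot), then ask for step-by-step
    reasoning (chain of thought) before the final answer."""
    lines = ["You convert snake_case identifiers to camelCase."]
    lines += examples  # few-shot: fix the expected format by example
    lines.append(f"Input: '{identifier}'")
    lines.append("Think step by step, then give only the final output.")
    return "\n".join(lines)

prompt = build_prompt("last_login_date", FEW_SHOT_EXAMPLES)
print(prompt)
```

The point is less the task than the shape: the examples pin down the expected output format, and the explicit reasoning instruction changes how the model approaches the problem.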

The ability to read and amend a PRD

A well-written Product Requirements Document has become the main productivity lever. Could the new technical test be to turn a product spec into an actionable PRD? The candidate receives a vague, deliberately incomplete specification and must produce a structured document that an AI agent or a junior developer could consume.

This test evaluates much more than writing: it evaluates the ability to ask the right questions, to identify gaps, to anticipate edge cases; in short, to think through the problem before coding.
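
As one illustration of what “structured” could mean here, a minimal PRD skeleton might look like the following. The section names are my assumption, not a prescribed format:

```python
# A hypothetical PRD skeleton for the exercise described above. The
# section list is an assumption about what "structured" means here,
# not a prescribed format.

SECTIONS = [
    ("Context", "Why this feature exists and who asked for it."),
    ("Goals", "Observable outcomes, phrased so they can be verified."),
    ("Non-goals", "What is deliberately out of scope."),
    ("Open questions", "Gaps in the spec that need a product answer."),
    ("Edge cases", "Inputs and states the spec did not mention."),
    ("Acceptance criteria", "Checks an agent or a junior can run through."),
]

def render_prd(title: str, sections=SECTIONS) -> str:
    """Render the skeleton as markdown, one heading per section."""
    parts = [f"# PRD: {title}", ""]
    for name, hint in sections:
        parts += [f"## {name}", hint, ""]
    return "\n".join(parts)

print(render_prd("CSV export"))
```

What the exercise would grade is precisely the content of “Open questions” and “Edge cases”: the sections the vague spec forces the candidate to invent.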

Using AI tools in a real context

Should we give access to Claude Code during a technical test? I believe it is essential. Evaluating a developer without their tools in 2026 is like evaluating a carpenter without their machines.

What I want to observe:

  • Exploration reflexes. Does the candidate launch preparatory audit prompts before coding? Do they use plan mode? Do they try to understand the codebase before producing code?
  • Tool navigation. Do they know the slash commands, skills, shortcuts? Can they switch between modes, launch a background agent, use /compact to manage context?
  • Workflow rigour. Do they ask for tests? Documentation? Do they specify constraints (no additional dependencies, respect existing patterns)? Or do they just vibe code and hope it holds?
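
To contrast the two attitudes in that last point, here is a hypothetical “rigorous” feature prompt next to a vague one. The feature and the constraints are invented for illustration:

```python
# A vague prompt versus a constraint-laden one. Everything here is
# hypothetical; the constraints mirror the checks listed above.

VAGUE = "Add CSV export to the report page."

def rigorous_prompt(feature: str, constraints: list[str]) -> str:
    """Wrap a feature request with explicit constraints so the
    agent's output can be reviewed against stated expectations."""
    lines = [f"Implement: {feature}", "Constraints:"]
    lines += [f"- {c}" for c in constraints]
    return "\n".join(lines)

prompt = rigorous_prompt(
    "CSV export on the report page",
    [
        "no additional dependencies",
        "reuse the existing export service if one exists",
        "add unit tests covering empty and malformed rows",
        "respect the patterns already used in the module",
    ],
)
print(prompt)
```

The vague version may well produce working code; the rigorous one produces code you can review against stated expectations, which is the difference the interview is trying to surface.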

Will I keep doing code reviews?

I am not sure. Code implementation has become a marginal problem. What I am looking to evaluate is a candidate’s ability to design and maintain a project over time.

Vibe coding a feature — almost everyone can do that today. But searching through the code — with the help of Claude or similar — for existing modules, shared services, implemented tests, established patterns, in order to make a solid and precise graft while minimising the code produced… that is what we are trying to evaluate.

So yes, I will probably keep giving access to a codebase, on non-confidential services, and asking candidates to implement a feature. But what I will be grading is how the candidate uses Claude: whether they go look at configuration files, launch audit prompts, activate plan mode, generate a PRD, ask for tests and documentation, and verify that the produced code integrates with the existing codebase rather than reinventing the wheel.

Code review remains an excellent exercise. But what we review has changed. We no longer review code written by a human. We review code steered by a human and produced by a machine. And the quality of the steering is what makes all the difference.

Further reading

1. Shopify, Reflexive AI usage is now a baseline expectation

Using AI effectively is now a fundamental expectation of everyone at Shopify.

Tobi Lütke’s memo (April 2025) that shifted the debate. AI is no longer a bonus, it is a prerequisite. Questions about AI usage are integrated into performance evaluations.

2. HackerRank, Developer Skills Report 2025

97% of developers use AI in their workflow. A third of code is AI-generated. The report documents the impact on hiring practices and the skills being sought.

3. Meta, AI-Assisted Coding Interview

The detailed format of Meta’s AI-assisted technical interview: 60 minutes, multi-file project, access to the LLMs of your choice. A strong signal on where the industry is heading.

4. Kent Beck, Augmented Coding: Beyond the Vibes

In vibe coding you don’t care about the code, just the behavior of the system. In augmented coding you care about the code, its complexity, the tests, & their coverage.

The fundamental distinction between vibe coding and augmented coding, and why taste and design remain at the heart of the profession.
