Evidence-based scorecards: what to cite and what to ignore

The Hure team · January 28, 2026 · 3 min read

A score without evidence is an opinion. It might be a good opinion — experienced interviewers have real instincts — but you cannot audit it, you cannot defend it to a candidate, and you cannot compare it fairly to anyone else’s. The fix is simple to state and harder to practice: anchor every rating to something the candidate actually said.

Anchor every rating to a transcript moment

The discipline is one rule. For each competency you score, point to the specific moment in the conversation that justifies the number. Not a summary of your impression — a citable quote.

When a “4 out of 5 on systems thinking” carries the line “I cut scope to the two tables that actually blocked the migration, then backfilled the rest after launch,” the score stops being a vibe. Anyone reviewing the scorecard can see the reasoning, agree or disagree with the evidence in front of them, and move on. The conversation shifts from “do you trust this interviewer?” to “is this evidence strong enough?” — which is the conversation you actually want to have.
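For teams that keep scorecards in code or a spreadsheet export, the rule can be checked mechanically. The sketch below is a minimal illustration, not Hure's actual data model: the `Rating` class, the `normalize` helper, and the `is_anchored` function are hypothetical names, and it assumes the transcript is available as plain text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    competency: str  # e.g. "systems thinking"
    score: int       # the number the interviewer assigned
    quote: str       # the candidate's words cited as evidence


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line wraps do not break matching."""
    return " ".join(text.lower().split())


def is_anchored(rating: Rating, transcript: str) -> bool:
    """True only if the rating cites a non-empty quote found in the transcript."""
    quote = normalize(rating.quote)
    return bool(quote) and quote in normalize(transcript)


# Illustrative transcript built around the quote used in this article.
transcript = (
    "Interviewer: How did you handle the deadline?\n"
    "Candidate: I cut scope to the two tables that actually blocked the "
    "migration, then backfilled the rest after launch."
)

anchored = Rating(
    "systems thinking",
    4,
    "I cut scope to the two tables that actually blocked the migration",
)
impression = Rating("systems thinking", 4, "Seemed very sharp overall")

print(is_anchored(anchored, transcript))    # the quote appears in the transcript
print(is_anchored(impression, transcript))  # a summary of an impression does not
```

A check like this only confirms that the quote exists in the transcript. Whether the quote is strong enough to justify the number is still a human judgment.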

What to cite

Cite content and behavior:

A concrete decision and the constraint that drove it.
A tradeoff the candidate named on their own, including what they gave up.
The failure case — what went wrong, or what would have, and how they handled it.
Ownership: what they did, distinct from what the team did around them.

These are observable in an answer. Two evaluators reading the same transcript will tend to find the same evidence, which is exactly what makes their scores comparable.

What to ignore

Just as important is what you deliberately leave out. Some signals feel informative and are actually noise — or worse, bias:

Accent and fluency. How someone speaks is not what they said.
Confidence and polish. Smooth delivery correlates with practice and privilege, not with the trait you are hiring for.
Biometrics — tone of voice, facial expression, “energy.” These should never touch a score. Hure does not use them, by design: it judges the content of what a candidate says, never how they look or sound.
Résumé halo. A familiar logo is not evidence of the competency in front of you.

If you cannot tie a signal to a citable moment about the actual competency, it does not belong in the score.

Levels, not vibes

Evidence works best against a rubric with observable levels. Instead of asking “how good was this answer?”, ask “which level does this answer demonstrate?” — where each level is a concrete behavior. The evidence you cited should map cleanly onto a level. If it doesn’t, that is a signal your rubric needs sharper language, not that the candidate needs a rounder number.
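One way to picture a rubric with observable levels is as a lookup from each score to the behavior it stands for. The sketch below is an assumption for illustration only: the three level descriptions and the `level_for` function are hypothetical, and a real rubric would be written from the role.

```python
# Hypothetical rubric for a single competency. Each score is defined by a
# behavior an evaluator can point to in the transcript.
RUBRIC = {
    1: "Describes what was done, with no constraint or tradeoff named",
    2: "Names the constraint that drove the decision",
    3: "Names the constraint and the tradeoff, including what was given up",
}


def level_for(rubric: dict[int, str], score: int) -> str:
    """Return the behavior that a given score claims the answer demonstrated."""
    if score not in rubric:
        raise ValueError(
            f"score {score} is not a defined level; choose one of {sorted(rubric)}"
        )
    return rubric[score]


print(level_for(RUBRIC, 2))
```

Rejecting a score that has no defined level is the point of the design: an in-between number has no behavior attached to it, so the evaluator either picks a defined level or the rubric gets sharper language.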

Make scorecards auditable and comparable

The payoff of evidence-based scoring shows up later — when a hiring manager questions a result, when a candidate asks why, when you look back across a whole funnel to check whether your bar held steady. Because every rating cites a transcript moment, every decision is auditable, and because everyone was measured against the same versioned rubric, the scores are comparable.
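Both properties can be spot-checked across a funnel. The sketch below assumes a simplified scorecard shape; `Scorecard`, `audit`, and the version strings are hypothetical and do not describe how Hure stores or checks results.

```python
from dataclasses import dataclass, field


@dataclass
class Scorecard:
    candidate: str
    rubric_version: str
    # competency -> (score, cited quote); an empty quote means no evidence
    ratings: dict[str, tuple[int, str]] = field(default_factory=dict)


def audit(scorecards: list[Scorecard]) -> list[str]:
    """Return readable problems; an empty list means nothing was flagged."""
    problems = []

    # Comparable: every scorecard should be measured against the same rubric.
    versions = {card.rubric_version for card in scorecards}
    if len(versions) > 1:
        problems.append(f"mixed rubric versions: {sorted(versions)}")

    # Auditable: every rating should carry a cited quote.
    for card in scorecards:
        for competency, (_score, quote) in card.ratings.items():
            if not quote.strip():
                problems.append(f"{card.candidate}: {competency} has no cited quote")

    return problems


cards = [
    Scorecard("A", "v2", {"ownership": (4, "I wrote the rollback plan myself")}),
    Scorecard("B", "v1", {"ownership": (3, "")}),
]

for problem in audit(cards):
    print(problem)
```

In this example the audit flags two things: the scorecards were produced against different rubric versions, and candidate B's ownership rating cites nothing.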

This is the model Hure runs end to end: a versioned rubric authored from the role, a natural voice interview that covers it consistently, and a scorecard where each rating is anchored to the transcript — with a human always making the final call. The scorecard’s job is not to decide for you. It is to make sure that when you decide, you are looking at evidence, not at an opinion in a costume.