Different jurors have different grading styles. Some grade harshly, some leniently.
Balanced scoring corrects for that so a project isn't punished for drawing harsh
jurors or rewarded for drawing lenient ones.
How it works
- For each juror, calculate their personal average and spread (standard deviation) across all the projects they scored in this round.
- Convert each individual score into "how many standard deviations above or below this juror's typical". If two jurors have the same spread, a 6 from a juror who averages 5 reads the same as a 9 from a juror who averages 8.
- Average those normalized values across the project's jurors.
- Rescale back onto the same 1–10 scale using the round's overall average and spread.
- The result is directly comparable to the raw average — same scale, but corrected for grading style.
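
The steps above can be sketched in Python as follows. This is a minimal illustration, not the product's actual implementation: the function name `balanced_scores`, the `(juror, project, score)` tuple layout, the use of population standard deviation (`pstdev`), and the treatment of a zero-spread juror as "exactly typical" are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def balanced_scores(evaluations):
    """Hypothetical sketch. evaluations: (juror, project, score) tuples from ONE round.

    Returns {project: balanced score} on the same scale as the raw scores.
    """
    all_scores = [score for _, _, score in evaluations]
    round_mean = mean(all_scores)
    round_sd = pstdev(all_scores)  # ASSUMPTION: population standard deviation

    # Step 1: each juror's personal average and spread.
    by_juror = defaultdict(list)
    for juror, _, score in evaluations:
        by_juror[juror].append(score)
    stats = {j: (mean(v), pstdev(v)) for j, v in by_juror.items()}

    # Step 2: express each score as standard deviations from its juror's typical.
    z_by_project = defaultdict(list)
    for juror, project, score in evaluations:
        j_mean, j_sd = stats[juror]
        # ASSUMPTION: a juror with zero spread contributes a neutral 0.0.
        z = (score - j_mean) / j_sd if j_sd else 0.0
        z_by_project[project].append(z)

    # Steps 3 and 4: average the normalized values, then rescale to the round.
    return {p: round_mean + mean(zs) * round_sd for p, zs in z_by_project.items()}


# Usage: a harsh juror (averages 5) and a lenient juror (averages 8).
evaluations = [
    ("harsh", "A", 6), ("harsh", "C", 4),
    ("lenient", "B", 9), ("lenient", "D", 7),
]
scores = balanced_scores(evaluations)
# Projects A and B end up with the same balanced score,
# although their raw scores are 6 and 9.
```

Both jurors in the usage example have the same spread, so a score one point above each juror's own average normalizes to the same value.
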
Worked example

| Juror | Their typical avg | Score for "Project X" | What that means |
| --- | --- | --- | --- |
| Juror A (lenient) | 8.2 | 9.0 | Just above their typical (+0.4σ) |
| Juror B (harsh) | 5.8 | 7.5 | Well above their typical (+1.5σ) |
| Juror C (typical) | 7.0 | 8.0 | Slightly above their typical (+0.7σ) |

Raw average: (9.0 + 7.5 + 8.0) / 3 = 8.2.
The balanced average takes each juror's offset from their own typical, averages the three offsets
(about +0.9σ), and rescales that to the round's overall scale, landing at roughly 8.4. Juror B's
strong endorsement (well above their harsh baseline) carries more weight than the raw 7.5 suggests.
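
The arithmetic behind this example can be checked with a few lines of Python. The per-juror offsets come from the table above. The round-wide average and spread are not stated in the example, so `ROUND_MEAN` and `ROUND_SD` below are hypothetical values chosen only to illustrate the rescaling step.

```python
# Raw scores and per-juror offsets (in standard deviations) from the table.
raw_scores = {"Juror A": 9.0, "Juror B": 7.5, "Juror C": 8.0}
z_offsets = {"Juror A": 0.4, "Juror B": 1.5, "Juror C": 0.7}

raw_average = sum(raw_scores.values()) / len(raw_scores)
mean_offset = sum(z_offsets.values()) / len(z_offsets)

# ASSUMPTION: hypothetical round-wide average and spread, not given in the text.
ROUND_MEAN = 7.0
ROUND_SD = 1.6

# Rescale the averaged offset back onto the round's scale.
balanced = ROUND_MEAN + mean_offset * ROUND_SD

print(round(raw_average, 1), round(mean_offset, 2), round(balanced, 1))
# prints: 8.2 0.87 8.4
```

With a different round-wide average or spread, the balanced figure would shift accordingly; the averaged offset of about +0.87σ is the part fixed by the table.
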
When it kicks in
- A juror needs at least 2 evaluations in the round for their spread to be computed; with fewer, that juror falls back to the round-wide average.
- Needs at least one juror with non-zero spread; if every juror gave identical scores, balanced equals raw.
- Computed within a single round only — a juror's grading style in an intake screening doesn't affect their balance in a deep evaluation.
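
The first two conditions could be expressed roughly as below. This is a sketch under assumptions: the helper names `juror_baseline` and `can_balance` are hypothetical, the spread is taken as a population standard deviation, and the fallback is assumed to replace the juror's spread as well as their average with the round-wide values.

```python
from statistics import mean, pstdev

MIN_EVALS = 2  # from the text: at least 2 evaluations to compute a spread


def juror_baseline(juror_scores, round_mean, round_sd):
    """Return the (average, spread) used to normalize one juror's scores."""
    if len(juror_scores) < MIN_EVALS:
        # Too few evaluations in this round: use the round-wide values.
        # ASSUMPTION: the spread falls back together with the average.
        return round_mean, round_sd
    return mean(juror_scores), pstdev(juror_scores)


def can_balance(scores_by_juror):
    """True if at least one juror with enough evaluations has non-zero spread.

    When this is False, the balanced score is simply the raw average.
    """
    return any(
        len(scores) >= MIN_EVALS and pstdev(scores) > 0
        for scores in scores_by_juror.values()
    )
```

Both helpers take scores from a single round only, which matches the rule that grading style in one round does not carry over to another.
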
Why we still show "Raw"
Both numbers are always shown so you can sanity-check the correction. The toggle at the top of the
side panel decides which one is used for ranking.