Commit Graph

3 Commits

Author SHA1 Message Date
Matt
70f1f64ea3 feat: factor balanced pass rate into composite rankings
The dashboard now computes its own composite ranking score on the
client, blending (balanced-or-raw) average score with (balanced-or-raw)
advance pass rate via the existing scoreWeight / passRateWeight
sliders. Both inputs are toggled independently:

- 'Balance juror grading style (score)' — existing useBalancedRanking
- 'Balance juror approval rate (advance vote)' — new useBalancedPassRate

Both default to true and persist per-round. The pass rate is balanced
the same way scores are: each juror's personal yes-rate gives them a
Bernoulli stddev, each vote is z-normalized against that, and the
project's mean z is rescaled to the round's overall yes rate. A 'yes'
from a juror who rarely says yes counts more than a 'yes' from a
lenient juror.

List rows now show two chips — score (Bal/Raw X.XX) and pass rate
(Bal Yes% / Yes% N%) — so admins can see what's driving the order.
The threshold cutoff and live re-sort effect both use the same
composite formula.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 14:28:49 +02:00
Matt
97d1f2a3af fix: compute z-context per-round in edition-mode rankings rollup
Previously the edition-level branch of analytics.getProjectRankings
(programId mode) pooled every juror's evaluations across every round
into a single z-normalization context. A juror's mean and stddev are
not stable across round types — quick intake screening produces a
very different grading profile than a deep evaluation round, and
mixing them yields a meaningless personal calibration.

The rollup now groups points by roundId, computes one balance context
per round, and aggregates per-project as the unweighted mean of the
per-round balanced averages. roundId mode is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 13:14:30 +02:00
Matt
982d5193c5 feat: surface juror-balanced scores and AI calibration advisory
All checks were successful
Build and Push Docker Image / build (push) Successful in 7m27s
Adds a shared juror-balancing utility (z-score normalization per juror,
rescaled back onto the raw 1-10 scale) and wires it into:

- Admin reports page: Top-10 project table now shows "Raw Avg" and
  "Balanced" columns side by side, and the summary stats row shows a
  balanced-average tile. Sort defaults to balanced so harsh and lenient
  graders no longer skew the ranking.
- Ranking dashboard: each project row shows a green/amber balanced-score
  chip next to the raw average when the two differ by ≥0.05, making it
  obvious when juror calibration moved a project's effective ranking.

Also adds AI Juror Calibration Advisory — a mutation that takes
anonymized per-juror stats, calls OpenAI, and produces a plain-language
explanation of the cohort's grading patterns plus per-juror severity
(normal / notable / outlier) with a one-sentence narrative. The advisory
describes the statistical balance that already runs; it does not
introduce a new weighting layer. Rendered as a panel in the Juror
Consistency tab when a specific round is selected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 16:19:00 +02:00