feat: shared 'How scores are calculated' explainer dialog
Reusable component used by admin and observer surfaces. Covers the algorithm, a five-step plain-language walkthrough, a worked example with three jurors of different grading styles, edge cases, and why both Raw and Balanced are always shown.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
109
src/components/shared/score-explainer-dialog.tsx
Normal file
@@ -0,0 +1,109 @@
'use client'

import {
  Dialog,
  DialogContent,
  DialogHeader,
  DialogTitle,
  DialogTrigger,
} from '@/components/ui/dialog'
import { Button } from '@/components/ui/button'
import { Info } from 'lucide-react'
import type { ReactNode } from 'react'

export function ScoreExplainerDialog({ trigger }: { trigger?: ReactNode }) {
  return (
    <Dialog>
      <DialogTrigger asChild>
        {trigger ?? (
          <Button variant="ghost" size="sm" className="h-7 gap-1 px-2 text-xs">
            <Info className="h-3.5 w-3.5" />
            How scores are calculated
          </Button>
        )}
      </DialogTrigger>
      <DialogContent className="max-w-xl max-h-[85vh] overflow-y-auto">
        <DialogHeader>
          <DialogTitle>How scores are calculated</DialogTitle>
        </DialogHeader>

        <div className="space-y-4 text-sm">
          <p>
            Different jurors have different grading styles. Some grade harshly, some
            leniently. Balanced scoring corrects for that so a project isn't
            punished for drawing harsh jurors or rewarded for drawing lenient ones.
          </p>

          <div>
            <h3 className="font-semibold mb-1">How it works</h3>
            <ol className="list-decimal pl-5 space-y-1">
              <li>For each juror, calculate their personal average and spread (standard deviation) across all the projects they scored in this round.</li>
              <li>Convert each individual score into "how many standard deviations above or below this juror's typical score". If two jurors have the same spread, a 6 from the one who averages 5 reads the same as a 9 from the one who averages 8.</li>
              <li>Average those normalized values across the project's jurors.</li>
              <li>Rescale back onto the same 1–10 scale using the round's overall average and spread.</li>
              <li>The result is directly comparable to the raw average: same scale, but corrected for grading style.</li>
            </ol>
          </div>

          <div>
            <h3 className="font-semibold mb-1">Worked example</h3>
            <table className="w-full text-xs border-collapse">
              <thead>
                <tr className="border-b">
                  <th className="py-1 text-left">Juror</th>
                  <th className="py-1 text-left">Their typical avg</th>
                  <th className="py-1 text-left">Score for "Project X"</th>
                  <th className="py-1 text-left">What that means</th>
                </tr>
              </thead>
              <tbody>
                <tr className="border-b">
                  <td className="py-1">Juror A (lenient)</td>
                  <td>8.20</td>
                  <td>9.00</td>
                  <td>Just above their typical (+0.4σ)</td>
                </tr>
                <tr className="border-b">
                  <td className="py-1">Juror B (harsh)</td>
                  <td>5.80</td>
                  <td>7.50</td>
                  <td>Well above their typical (+1.5σ)</td>
                </tr>
                <tr>
                  <td className="py-1">Juror C (typical)</td>
                  <td>7.00</td>
                  <td>8.00</td>
                  <td>Slightly above their typical (+0.7σ)</td>
                </tr>
              </tbody>
            </table>
            <p className="mt-2 text-xs text-muted-foreground">
              Raw average: (9.00 + 7.50 + 8.00) / 3 = <strong>8.17</strong>.
              The balanced average takes the mean of the three normalized values
              (about +0.87σ), rescales it to the round's overall scale, and lands
              at roughly <strong>8.40</strong>. Juror B's strong endorsement (well
              above their harsh baseline) carries more weight than the raw 7.50
              suggests.
            </p>
          </div>

          <div>
            <h3 className="font-semibold mb-1">When it kicks in</h3>
            <ul className="list-disc pl-5 space-y-1">
              <li>Each juror needs at least 2 evaluations in the round for their spread to be computed; otherwise that juror falls back to the round-wide average.</li>
              <li>Needs at least one juror with non-zero spread; if every juror gave identical scores, balanced equals raw.</li>
              <li>Computed within a single round only: a juror's grading style in an intake screening doesn't affect their balanced scores in a deep evaluation.</li>
            </ul>
          </div>

          <div>
            <h3 className="font-semibold mb-1">Why we still show "Raw"</h3>
            <p>
              Both numbers are always shown so you can sanity-check the correction. The
              toggle at the top of the side panel decides which one is used for ranking.
            </p>
          </div>
        </div>
      </DialogContent>
    </Dialog>
  )
}
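
The dialog describes the balancing steps only in prose, and this commit does not include the scoring code itself. The following is a minimal sketch of what those steps might look like, not the project's actual implementation. The `Evaluation` shape, the function names, the use of the population standard deviation, and the treatment of a zero-spread juror as a fallback case are all assumptions made for illustration.

```typescript
// Hypothetical shapes; the commit does not show the real evaluation model.
type Evaluation = { jurorId: string; projectId: string; score: number }
type Stats = { mean: number; sd: number }
type ProjectScore = { raw: number; balanced: number }

const mean = (xs: number[]): number => xs.reduce((a, b) => a + b, 0) / xs.length

// Population standard deviation (assumed; the real code may use the sample form).
const stdDev = (xs: number[]): number => {
  const m = mean(xs)
  return Math.sqrt(mean(xs.map((x) => (x - m) * (x - m))))
}

// Collect evaluations under a key such as jurorId or projectId.
function groupBy(
  evals: Evaluation[],
  key: (e: Evaluation) => string,
): Record<string, Evaluation[]> {
  const groups: Record<string, Evaluation[]> = {}
  for (const e of evals) {
    const k = key(e)
    const bucket = groups[k] ?? (groups[k] = [])
    bucket.push(e)
  }
  return groups
}

// Takes the evaluations of a single round and returns raw and balanced
// averages per project.
function balancedScores(evals: Evaluation[]): Record<string, ProjectScore> {
  const allScores = evals.map((e) => e.score)
  const round: Stats = { mean: mean(allScores), sd: stdDev(allScores) }

  // Step 1: each juror's own mean and spread within this round.
  // Jurors with fewer than 2 evaluations (or, by assumption, zero spread)
  // fall back to the round-wide stats. The exact comparison with 0 is fine
  // for a sketch; real code may want a small tolerance.
  const jurorStats: Record<string, Stats> = {}
  const byJuror = groupBy(evals, (e) => e.jurorId)
  for (const jurorId of Object.keys(byJuror)) {
    const scores = byJuror[jurorId]!.map((e) => e.score)
    const sd = scores.length >= 2 ? stdDev(scores) : 0
    jurorStats[jurorId] = sd > 0 ? { mean: mean(scores), sd } : round
  }

  const result: Record<string, ProjectScore> = {}
  const byProject = groupBy(evals, (e) => e.projectId)
  for (const projectId of Object.keys(byProject)) {
    const projectEvals = byProject[projectId]!
    const raw = mean(projectEvals.map((e) => e.score))
    if (round.sd === 0) {
      // Every score in the round is identical, so balanced equals raw.
      result[projectId] = { raw, balanced: raw }
      continue
    }
    // Steps 2-3: z-score each evaluation against its juror, then average.
    const meanZ = mean(
      projectEvals.map((e) => {
        const s = jurorStats[e.jurorId]!
        return (e.score - s.mean) / s.sd
      }),
    )
    // Step 4: rescale onto the round's overall mean and spread.
    result[projectId] = { raw, balanced: round.mean + meanZ * round.sd }
  }
  return result
}

// Made-up data: P1 drew only the harsh juror, P2 only the lenient one.
const demo = balancedScores([
  { jurorId: 'harsh', projectId: 'P1', score: 6 },
  { jurorId: 'harsh', projectId: 'P3', score: 4 },
  { jurorId: 'lenient', projectId: 'P2', score: 8 },
  { jurorId: 'lenient', projectId: 'P3', score: 10 },
])
console.log(demo)
```

In this made-up round, P1 has the lower raw average (6 against 8 for P2), but the 6 sits one standard deviation above the harsh juror's own mean while the 8 sits one below the lenient juror's, so the balanced scores rank P1 above P2. The sketch does not clamp results to the 1–10 range; whether the real implementation does is not stated in the commit.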