feat: shared 'How scores are calculated' explainer dialog

Reusable component used by admin and observer surfaces. Covers the algorithm, a five-step plain-language walkthrough, a worked example with three jurors of different grading styles, edge cases, and why both Raw and Balanced are always shown.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 13:25:20 +02:00
parent ee68f8af41
commit b4f5189a8e
1 changed file with 109 additions and 0 deletions
--- a/src/components/shared/score-explainer-dialog.tsx
+++ b/src/components/shared/score-explainer-dialog.tsx
@@ -0,0 +1,109 @@
+'use client'
+
+import {
+  Dialog,
+  DialogContent,
+  DialogHeader,
+  DialogTitle,
+  DialogTrigger,
+} from '@/components/ui/dialog'
+import { Button } from '@/components/ui/button'
+import { Info } from 'lucide-react'
+import type { ReactNode } from 'react'
+
+export function ScoreExplainerDialog({ trigger }: { trigger?: ReactNode }) {
+  return (
+    <Dialog>
+      <DialogTrigger asChild>
+        {trigger ?? (
+          <Button variant="ghost" size="sm" className="h-7 gap-1 px-2 text-xs">
+            <Info className="h-3.5 w-3.5" />
+            How scores are calculated
+          </Button>
+        )}
+      </DialogTrigger>
+      <DialogContent className="max-w-xl max-h-[85vh] overflow-y-auto">
+        <DialogHeader>
+          <DialogTitle>How scores are calculated</DialogTitle>
+        </DialogHeader>
+
+        <div className="space-y-4 text-sm">
+          <p>
+            Different jurors have different grading styles. Some grade harshly, some
+            leniently. Balanced scoring corrects for that so a project isn&apos;t
+            punished for drawing harsh jurors or rewarded for drawing lenient ones.
+          </p>
+
+          <div>
+            <h3 className="font-semibold mb-1">How it works</h3>
+            <ol className="list-decimal pl-5 space-y-1">
+              <li>For each juror, calculate their personal average and spread across all the projects they scored in this round.</li>
+              <li>Convert each individual score into &quot;how many standard deviations above or below this juror&apos;s typical score&quot;. If two jurors have the same spread, a 6 from one who averages 5 reads the same as a 9 from one who averages 8.</li>
+              <li>Average those normalized values across the project&apos;s jurors.</li>
+              <li>Rescale back onto the same 1–10 scale using the round&apos;s overall average and spread.</li>
+              <li>The result is directly comparable to the raw average — same scale, but corrected for grading style.</li>
+            </ol>
+          </div>
+
+          <div>
+            <h3 className="font-semibold mb-1">Worked example</h3>
+            <table className="w-full text-xs border-collapse">
+              <thead>
+                <tr className="border-b">
+                  <th className="py-1 text-left">Juror</th>
+                  <th className="py-1 text-left">Their typical avg</th>
+                  <th className="py-1 text-left">Score for &quot;Project X&quot;</th>
+                  <th className="py-1 text-left">What that means</th>
+                </tr>
+              </thead>
+              <tbody>
+                <tr className="border-b">
+                  <td className="py-1">Juror A (lenient)</td>
+                  <td>8.20</td>
+                  <td>9.00</td>
+                  <td>Just above their typical (+0.4σ)</td>
+                </tr>
+                <tr className="border-b">
+                  <td className="py-1">Juror B (harsh)</td>
+                  <td>5.80</td>
+                  <td>7.50</td>
+                  <td>Well above their typical (+1.5σ)</td>
+                </tr>
+                <tr>
+                  <td className="py-1">Juror C (typical)</td>
+                  <td>7.00</td>
+                  <td>8.00</td>
+                  <td>Slightly above their typical (+0.7σ)</td>
+                </tr>
+              </tbody>
+            </table>
+            <p className="mt-2 text-xs text-muted-foreground">
+              Raw average: (9.00 + 7.50 + 8.00) / 3 = <strong>8.17</strong>.
+              Balanced average rescales each juror&apos;s enthusiasm to the round&apos;s
+              overall scale and lands at roughly <strong>8.40</strong> — Juror B&apos;s
+              strong endorsement (well above their harsh baseline) carries more weight
+              than the raw 7.50 suggests.
+            </p>
+          </div>
+
+          <div>
+            <h3 className="font-semibold mb-1">When it kicks in</h3>
+            <ul className="list-disc pl-5 space-y-1">
+              <li>A juror needs at least 2 evaluations in the round for their spread to be computed; otherwise that juror falls back to the round-wide average.</li>
+              <li>The round needs at least one juror with non-zero spread; if every juror gave the same score to all of their projects, balanced equals raw.</li>
+              <li>Computed within a single round only — a juror&apos;s grading style in an intake screening doesn&apos;t affect their balance in a deep evaluation.</li>
+            </ul>
+          </div>
+
+          <div>
+            <h3 className="font-semibold mb-1">Why we still show &quot;Raw&quot;</h3>
+            <p>
+              Both numbers are always shown so you can sanity-check the correction. The
+              toggle at the top of the side panel decides which one is used for ranking.
+            </p>
+          </div>
+        </div>
+      </DialogContent>
+    </Dialog>
+  )
+}
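
The five steps in the dialog copy amount to per-juror z-score normalization followed by a rescale onto the round's own scale. The scoring code itself is not part of this commit, so what follows is only a minimal sketch under stated assumptions: `Evaluation`, `balancedScores`, `mean`, `stdDev`, and `push` are hypothetical names, the spread is taken to be the population standard deviation, and a juror with fewer than 2 evaluations or zero spread is normalized against the round-wide mean and spread.

```typescript
// Hypothetical shape; the real data model is not shown in this commit.
interface Evaluation {
  jurorId: string
  projectId: string
  score: number // on the 1–10 scale
}

function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / xs.length
}

// Population standard deviation (assumption; the real code may use the sample form).
function stdDev(xs: number[]): number {
  const m = mean(xs)
  return Math.sqrt(mean(xs.map((x) => (x - m) * (x - m))))
}

// Append a value to the list stored under a key, creating the list if needed.
function push(map: Map<string, number[]>, key: string, value: number): void {
  const list = map.get(key) ?? []
  list.push(value)
  map.set(key, list)
}

// Returns projectId -> balanced score for the evaluations of a single round.
function balancedScores(evals: Evaluation[]): Map<string, number> {
  const allScores = evals.map((e) => e.score)
  const roundMean = mean(allScores)
  const roundSd = stdDev(allScores)

  // Step 1: gather each juror's scores so their average and spread can be computed.
  const byJuror = new Map<string, number[]>()
  evals.forEach((e) => push(byJuror, e.jurorId, e.score))

  // Step 2: express each score in standard deviations from that juror's average.
  const zByProject = new Map<string, number[]>()
  evals.forEach((e) => {
    const own = byJuror.get(e.jurorId) ?? []
    const ownSd = stdDev(own)
    // Assumed fallback: too few evaluations or zero spread -> use round-wide stats.
    const usable = own.length >= 2 && ownSd > 0
    const m = usable ? mean(own) : roundMean
    const sd = usable ? ownSd : roundSd
    push(zByProject, e.projectId, sd > 0 ? (e.score - m) / sd : 0)
  })

  // Steps 3 and 4: average the normalized values per project, then rescale
  // using the round's overall average and spread.
  const result = new Map<string, number>()
  zByProject.forEach((zs, projectId) => {
    result.set(projectId, roundMean + mean(zs) * roundSd)
  })
  return result
}

// Usage: a lenient juror "a" and a harsh juror "b" scoring the same two projects.
const demo: Evaluation[] = [
  { jurorId: 'a', projectId: 'p1', score: 9 },
  { jurorId: 'a', projectId: 'p2', score: 8 },
  { jurorId: 'b', projectId: 'p1', score: 6 },
  { jurorId: 'b', projectId: 'p2', score: 5 },
]
balancedScores(demo).forEach((score, projectId) => {
  console.log(projectId, score.toFixed(2))
})
```

With this fallback, a round in which no juror has any spread reduces to the raw average, which matches the second edge case listed in the dialog. The sketch does not clamp results to the 1–10 range; the commit does not say whether the real implementation does.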