feat: shared 'How scores are calculated' explainer dialog

Reusable component used by admin and observer surfaces. Covers the algorithm, a five-step plain-language walkthrough, a worked example with three jurors of different grading styles, edge cases, and why both Raw and Balanced are always shown.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 13:25:20 +02:00
parent ee68f8af41
commit b4f5189a8e
1 changed files with 109 additions and 0 deletions
--- a/src/components/shared/score-explainer-dialog.tsx
+++ b/src/components/shared/score-explainer-dialog.tsx
@@ -0,0 +1,109 @@
 'use client'
 import {
  Dialog,
  DialogContent,
  DialogHeader,
  DialogTitle,
  DialogTrigger,
 } from '@/components/ui/dialog'
 import { Button } from '@/components/ui/button'
 import { Info } from 'lucide-react'
 import type { ReactNode } from 'react'
 export function ScoreExplainerDialog({ trigger }: { trigger?: ReactNode }) {
  return (
    <Dialog>
      <DialogTrigger asChild>
        {trigger ?? (
          <Button variant="ghost" size="sm" className="h-7 gap-1 px-2 text-xs">
            <Info className="h-3.5 w-3.5" />
            How scores are calculated
          </Button>
        )}
      </DialogTrigger>
      <DialogContent className="max-w-xl max-h-[85vh] overflow-y-auto">
        <DialogHeader>
          <DialogTitle>How scores are calculated</DialogTitle>
        </DialogHeader>
        <div className="space-y-4 text-sm">
          <p>
            Different jurors have different grading styles. Some grade harshly, some
            leniently. Balanced scoring corrects for that so a project isn&apos;t
            punished for drawing harsh jurors or rewarded for drawing lenient ones.
          </p>
          <div>
            <h3 className="font-semibold mb-1">How it works</h3>
            <ol className="list-decimal pl-5 space-y-1">
              <li>For each juror, calculate their personal average and spread across all the projects they scored in this round.</li>
              <li>Convert each individual score into &quot;how many standard deviations above or below this juror&apos;s typical score&quot;. If two jurors have the same spread, a 6 from one who averages 5 reads the same as a 9 from one who averages 8.</li>
              <li>Average those normalized values across the project&apos;s jurors.</li>
              <li>Rescale back onto the same 1–10 scale using the round&apos;s overall average and spread.</li>
              <li>The result is directly comparable to the raw average — same scale, but corrected for grading style.</li>
            </ol>
          </div>
          <div>
            <h3 className="font-semibold mb-1">Worked example</h3>
            <table className="w-full text-xs border-collapse">
              <thead>
                <tr className="border-b">
                  <th className="py-1 text-left">Juror</th>
                  <th className="py-1 text-left">Their typical avg</th>
                  <th className="py-1 text-left">Score for &quot;Project X&quot;</th>
                  <th className="py-1 text-left">What that means</th>
                </tr>
              </thead>
              <tbody>
                <tr className="border-b">
                  <td className="py-1">Juror A (lenient)</td>
                  <td>8.20</td>
                  <td>9.00</td>
                  <td>Just above their typical (+0.4σ)</td>
                </tr>
                <tr className="border-b">
                  <td className="py-1">Juror B (harsh)</td>
                  <td>5.80</td>
                  <td>7.50</td>
                  <td>Well above their typical (+1.5σ)</td>
                </tr>
                <tr>
                  <td className="py-1">Juror C (typical)</td>
                  <td>7.00</td>
                  <td>8.00</td>
                  <td>Moderately above their typical (+0.7σ)</td>
                </tr>
              </tbody>
            </table>
            <p className="mt-2 text-xs text-muted-foreground">
              Raw average: (9.00 + 7.50 + 8.00) / 3 = <strong>8.17</strong>.
              Balanced average rescales each juror&apos;s enthusiasm to the round&apos;s
              overall scale and lands at roughly <strong>8.40</strong> — Juror B&apos;s
              strong endorsement (well above their harsh baseline) carries more weight
              than the raw 7.50 suggests.
            </p>
          </div>
          <div>
            <h3 className="font-semibold mb-1">When it kicks in</h3>
            <ul className="list-disc pl-5 space-y-1">
              <li>A juror needs at least 2 evaluations in the round for their spread to be computed; otherwise that juror falls back to the round-wide average.</li>
              <li>Needs at least one juror with non-zero spread; if every juror gave identical scores, balanced equals raw.</li>
              <li>Computed within a single round only — a juror&apos;s grading style in an intake screening doesn&apos;t affect their balance in a deep evaluation.</li>
            </ul>
          </div>
          <div>
            <h3 className="font-semibold mb-1">Why we still show &quot;Raw&quot;</h3>
            <p>
              Both numbers are always shown so you can sanity-check the correction. The
              toggle at the top of the side panel decides which one is used for ranking.
            </p>
          </div>
        </div>
      </DialogContent>
    </Dialog>
  )
 }
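
The dialog describes the balancing steps only in prose, and the scoring code itself is not part of this commit. The following is a minimal sketch of what those steps could look like, not the project's actual implementation. The `Evaluation` shape and the `stats`, `groupBy`, and `balancedScores` names are hypothetical, the use of the population standard deviation is an assumption, and so is falling back to the round-wide spread (not only the round-wide average) for jurors without a usable spread of their own.

```typescript
// Hypothetical shape: one score from one juror for one project in a single round.
interface Evaluation {
  jurorId: string
  projectId: string
  score: number
}

interface Stats {
  mean: number
  sd: number
}

// Mean and population standard deviation (assumption: the real code may use the sample SD).
function stats(values: number[]): Stats {
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length
  const variance = values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length
  return { mean, sd: Math.sqrt(variance) }
}

function groupBy(evaluations: Evaluation[], key: (e: Evaluation) => string): Map<string, Evaluation[]> {
  const groups = new Map<string, Evaluation[]>()
  for (const e of evaluations) {
    const group = groups.get(key(e)) ?? []
    group.push(e)
    groups.set(key(e), group)
  }
  return groups
}

function balancedScores(evaluations: Evaluation[]): Map<string, { raw: number; balanced: number }> {
  const round = stats(evaluations.map((e) => e.score))

  // Step 1: each juror's personal average and spread within the round.
  const jurorStats = new Map<string, Stats>()
  groupBy(evaluations, (e) => e.jurorId).forEach((own, jurorId) => {
    const s = stats(own.map((e) => e.score))
    // Assumed fallback: fewer than 2 evaluations or zero spread uses the round-wide stats.
    jurorStats.set(jurorId, own.length >= 2 && s.sd > 0 ? s : round)
  })

  const result = new Map<string, { raw: number; balanced: number }>()
  groupBy(evaluations, (e) => e.projectId).forEach((own, projectId) => {
    const raw = stats(own.map((e) => e.score)).mean
    if (round.sd === 0) {
      // Every score in the round is identical, so there is nothing to correct.
      result.set(projectId, { raw, balanced: raw })
      return
    }
    // Steps 2 and 3: z-score each evaluation against its juror, then average.
    const zs = own.map((e) => {
      const j = jurorStats.get(e.jurorId)!
      return (e.score - j.mean) / j.sd
    })
    const meanZ = stats(zs).mean
    // Step 4: rescale onto the round's overall average and spread.
    result.set(projectId, { raw, balanced: round.mean + round.sd * meanZ })
  })
  return result
}

// Usage: a lenient juror (average 8) and a harsh juror (average 5) with the same spread.
const example = balancedScores([
  { jurorId: 'lenient', projectId: 'Y', score: 9 },
  { jurorId: 'lenient', projectId: 'A', score: 8 },
  { jurorId: 'lenient', projectId: 'B', score: 7 },
  { jurorId: 'harsh', projectId: 'X', score: 6 },
  { jurorId: 'harsh', projectId: 'C', score: 5 },
  { jurorId: 'harsh', projectId: 'D', score: 4 },
])
```

In this made-up data set, projects X and Y have different raw scores (6 and 9) but sit the same number of standard deviations above their juror's average, so the sketch gives them the same balanced score. When every juror's own scores are identical, each juror falls back to the round-wide stats and the balanced value collapses to the raw average, which matches the edge case listed in the dialog.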