# Juror-Balance Toggle + Round-Scoping Fixes — Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Fix cross-round contamination in two analytics procedures (and the three UI surfaces that consume them), add a per-round "use balanced scoring" toggle, replace the list-view delta annotation with a richer side-panel display, and ship a shared "How scores are calculated" explainer dialog.

**Architecture:** Server-side: extend `analytics.getProjectDetail` to accept a roundId and scope its evaluation query; rework `analytics.getProjectRankings` edition mode to compute one z-balance context per round before aggregating; add `useBalancedRanking` to `EvaluationConfigSchema` so it persists in `Round.configJson`. Client-side: pass roundId from each caller; rebuild the admin ranking dashboard side sheet to show both raw and balanced averages, the per-round toggle, per-juror balance contributions, and an affordance opening a shared explainer dialog component (`<ScoreExplainerDialog />`) reused on observer surfaces.

**Tech Stack:** Next.js 15 App Router, tRPC 11 with Zod, Prisma 6, Vitest 4 (file-parallelism off, forks pool), shadcn/ui (Dialog + Switch primitives already available), TypeScript strict.

**Spec:** `docs/superpowers/specs/2026-04-27-juror-balance-toggle-and-round-scoping-design.md`

---

## File Structure

### Server

| File | Responsibility |
|---|---|
| `src/server/routers/analytics.ts` | Modify `getProjectDetail` (Task 1) + `getProjectRankings` (Task 2). |
| `src/server/services/juror-balance.ts` | Add a small helper `computePerRoundBalanced(points)`, which takes a flat list of round-tagged score points and is consumed by `getProjectRankings` edition mode (Task 2). Keep existing functions untouched. |
| `src/types/competition-configs.ts` | Add `useBalancedRanking: z.boolean().default(true)` to `EvaluationConfigSchema` (Task 6). |

### Client

| File | Responsibility |
|---|---|
| `src/components/shared/score-explainer-dialog.tsx` | NEW. Reusable explainer dialog (Task 11). |
| `src/components/admin/round/ranking-dashboard.tsx` | Wire roundId through to `getProjectDetail`; rebuild side-sheet stats area; add toggle row; per-juror chips; remove list-row delta annotation; mount `<ScoreExplainerDialog />` (Tasks 3, 7, 8, 9, 10, 12, 14). |
| `src/components/observer/observer-project-detail.tsx` | Resolve default round and pass roundId; mount explainer dialog affordance (Tasks 5, 12). |
| `src/components/observer/reports/project-preview-dialog.tsx` | Accept and pass `roundId` prop; mount explainer affordance (Tasks 4, 12). |
| `src/app/(observer)/observer/projects/[projectId]/page.tsx` | Read `?round=` query param and pass to `<ObserverProjectDetail />` (Task 5). |
| `src/app/(admin)/admin/reports/page.tsx` | Decimal audit fix `toFixed(2)` → `toFixed(1)` (Task 13). |

### Tests

| File | Responsibility |
|---|---|
| `tests/unit/juror-balance-round-scoping.test.ts` | NEW. Vitest cases for getProjectDetail roundId filtering and getProjectRankings per-round z-context (Tasks 1, 2). |
| `tests/unit/round-config-balance-toggle.test.ts` | NEW. Vitest case for persisting `useBalancedRanking` via `round.update` (Task 6). |

---

## Test setup notes (for the implementer)

Vitest 4 is the framework; tests run sequentially (`fileParallelism: false`, `pool: 'forks'`). Use the helpers in `tests/helpers.ts` (`createTestUser`, `createTestProgram`, `createTestCompetition`, `createTestRound`, `createTestProject`, `createTestProjectRoundState`, `createTestAssignment`, `createTestEvaluation`, `createTestEvaluationForm`, `cleanupTestData`, `uid`), plus `createCaller(routerModule, user)` and the shared `prisma` client from `tests/setup.ts`. Always call `cleanupTestData(programId, userIds)` in `afterAll`.

Run a single test file with: `npx vitest run tests/unit/<file>.test.ts`. Run a single test by name with: `npx vitest run -t '<test name>'`.

---

## Task 1: Round-scope `analytics.getProjectDetail`

**Files:**
- Modify: `src/server/routers/analytics.ts:1370-1464`
- Create: `tests/unit/juror-balance-round-scoping.test.ts`

- [ ] **Step 1: Write the failing test (round filtering)**

Create `tests/unit/juror-balance-round-scoping.test.ts`:

```typescript
import { afterAll, beforeAll, describe, expect, it } from 'vitest'
import { prisma, createCaller } from '../setup'
import {
  createTestUser, createTestProgram, createTestCompetition, createTestRound,
  createTestProject, createTestProjectRoundState, createTestAssignment,
  createTestEvaluation, createTestEvaluationForm, cleanupTestData, uid,
} from '../helpers'
import { analyticsRouter } from '../../src/server/routers/analytics'

describe('analytics.getProjectDetail round scoping', () => {
  let programId: string
  let admin: { id: string; email: string; role: 'SUPER_ADMIN' }
  let projectId: string
  let roundAId: string
  let roundBId: string
  const userIds: string[] = []

  beforeAll(async () => {
    const program = await createTestProgram({ name: `bal-scope-${uid()}` })
    programId = program.id
    const competition = await createTestCompetition(programId)
    const roundA = await createTestRound(competition.id, { name: 'Round A', sortOrder: 0, status: 'ROUND_CLOSED' })
    const roundB = await createTestRound(competition.id, { name: 'Round B', sortOrder: 1, status: 'ROUND_ACTIVE' })
    roundAId = roundA.id
    roundBId = roundB.id

    const formA = await createTestEvaluationForm(roundA.id)
    const formB = await createTestEvaluationForm(roundB.id)

    const project = await createTestProject(programId)
    projectId = project.id
    await createTestProjectRoundState(projectId, roundA.id, { state: 'PASSED' })
    await createTestProjectRoundState(projectId, roundB.id, { state: 'IN_PROGRESS' })

    // 2 evaluations on Round A: 7.0, 8.0  (mean 7.5)
    for (const score of [7, 8]) {
      const juror = await createTestUser('JURY_MEMBER')
      userIds.push(juror.id)
      const a = await createTestAssignment(juror.id, projectId, roundA.id)
      await createTestEvaluation(a.id, formA.id, { status: 'SUBMITTED', globalScore: score, submittedAt: new Date() })
    }
    // 3 evaluations on Round B: 9.0, 8.0, 8.0  (mean 8.333…)
    for (const score of [9, 8, 8]) {
      const juror = await createTestUser('JURY_MEMBER')
      userIds.push(juror.id)
      const a = await createTestAssignment(juror.id, projectId, roundB.id)
      await createTestEvaluation(a.id, formB.id, { status: 'SUBMITTED', globalScore: score, submittedAt: new Date() })
    }

    const adminUser = await createTestUser('SUPER_ADMIN')
    userIds.push(adminUser.id)
    admin = { id: adminUser.id, email: adminUser.email, role: 'SUPER_ADMIN' }
  })

  afterAll(async () => {
    await cleanupTestData(programId, userIds)
  })

  it('returns only round-B stats when roundId=roundB is passed', async () => {
    const caller = createCaller(analyticsRouter, admin)
    const result = await caller.getProjectDetail({ id: projectId, roundId: roundBId })
    expect(result.stats).not.toBeNull()
    expect(result.stats!.totalEvaluations).toBe(3)
    expect(result.stats!.averageGlobalScore).toBeCloseTo(8.333, 2)
  })
})
```

- [ ] **Step 2: Run test to verify it fails**

Run: `npx vitest run tests/unit/juror-balance-round-scoping.test.ts -t 'returns only round-B stats'`
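Expected: FAIL. The current procedure never reads `roundId`, so it returns all 5 evaluations averaging 8.0. The call itself does not throw: a plain `z.object` strips undeclared keys by default (only `.strict()` schemas reject them), so the extra `roundId` is silently dropped before the query runs.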

- [ ] **Step 3: Add `roundId` to the input schema and scope the query**

Modify `src/server/routers/analytics.ts` around line 1371. Replace the input definition and the `submittedEvaluations` query:

```typescript
  getProjectDetail: observerProcedure
    .input(z.object({ id: z.string(), roundId: z.string().optional() }))
    .query(async ({ ctx, input }) => {
      const [projectRaw, projectTags, assignments, submittedEvaluations] = await Promise.all([
        ctx.prisma.project.findUniqueOrThrow({
          where: { id: input.id },
          include: {
            files: {
              select: {
                id: true, fileName: true, fileType: true, mimeType: true, size: true,
                bucket: true, objectKey: true, pageCount: true, textPreview: true,
                detectedLang: true, langConfidence: true, analyzedAt: true,
                roundId: true,
                requirementId: true,
                requirement: { select: { id: true, name: true, description: true, isRequired: true } },
              },
              orderBy: [{ createdAt: 'asc' }],
            },
            teamMembers: {
              include: {
                user: {
                  select: { id: true, name: true, email: true, profileImageKey: true, profileImageProvider: true },
                },
              },
              orderBy: { joinedAt: 'asc' },
            },
          },
        }),
        ctx.prisma.projectTag.findMany({
          where: { projectId: input.id },
          include: { tag: { select: { id: true, name: true, category: true, color: true } } },
          orderBy: { confidence: 'desc' },
        }).catch(() => [] as { id: string; projectId: string; tagId: string; confidence: number; tag: { id: string; name: string; category: string | null; color: string | null } }[]),
        ctx.prisma.assignment.findMany({
          where: { projectId: input.id },
          include: {
            user: { select: { id: true, name: true, email: true, profileImageKey: true, profileImageProvider: true } },
            round: { select: { id: true, name: true } },
            evaluation: {
              select: {
                id: true, status: true, submittedAt: true, globalScore: true,
                binaryDecision: true, criterionScoresJson: true, feedbackText: true,
              },
            },
          },
          orderBy: { createdAt: 'desc' },
        }),
        ctx.prisma.evaluation.findMany({
          where: {
            status: 'SUBMITTED',
            assignment: {
              projectId: input.id,
              ...(input.roundId ? { roundId: input.roundId } : {}),
            },
          },
        }),
      ])
```

Leave the rest of the procedure body untouched. The `stats = null` fallback (when no submitted evaluations match) already does the right thing.
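
The conditional spread in the `submittedEvaluations` filter is what keeps the unscoped path intact. A minimal sketch of the same pattern in isolation follows; `buildSubmittedEvaluationWhere` and `SubmittedEvaluationWhere` are hypothetical names used only for illustration, since the plan inlines the object literal rather than extracting a function.

```typescript
// Hypothetical helper for illustration only; the plan inlines this object
// in the Prisma `where` clause instead of extracting a function.
type SubmittedEvaluationWhere = {
  status: 'SUBMITTED'
  assignment: { projectId: string; roundId?: string }
}

function buildSubmittedEvaluationWhere(
  projectId: string,
  roundId?: string,
): SubmittedEvaluationWhere {
  return {
    status: 'SUBMITTED',
    assignment: {
      projectId,
      // Spread an empty object when roundId is absent so that no `roundId`
      // key (not even `roundId: undefined`) ends up in the filter.
      ...(roundId ? { roundId } : {}),
    },
  }
}

const scoped = buildSubmittedEvaluationWhere('proj_1', 'round_b')
const unscoped = buildSubmittedEvaluationWhere('proj_1')
```

Because the guard is a truthiness check, an empty-string `roundId` is treated the same as an omitted one.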

- [ ] **Step 4: Run test to verify it passes**

Run: `npx vitest run tests/unit/juror-balance-round-scoping.test.ts -t 'returns only round-B stats'`
Expected: PASS.

- [ ] **Step 5: Add a second test asserting unfiltered behavior is preserved**

Append to the same `describe` block:

```typescript
  it('returns aggregated stats across all rounds when roundId is omitted', async () => {
    const caller = createCaller(analyticsRouter, admin)
    const result = await caller.getProjectDetail({ id: projectId })
    expect(result.stats!.totalEvaluations).toBe(5)
  })
```

Run: `npx vitest run tests/unit/juror-balance-round-scoping.test.ts -t 'aggregated stats across all rounds'`
Expected: PASS (no further code change needed; when `roundId` is omitted the conditional spread adds no round filter, so all rounds are aggregated).

- [ ] **Step 6: Commit**

```bash
git add src/server/routers/analytics.ts tests/unit/juror-balance-round-scoping.test.ts
git commit -m "fix: scope analytics.getProjectDetail by optional roundId"
```

---

## Task 2: Per-round z-context in `analytics.getProjectRankings` edition mode

**Files:**
- Modify: `src/server/routers/analytics.ts:199-258`
- Modify: `src/server/services/juror-balance.ts` (add `computePerRoundBalanced` helper)
- Modify: `tests/unit/juror-balance-round-scoping.test.ts` (append cases)

- [ ] **Step 1: Write the failing test (edition-mode per-round grouping)**

Append to `tests/unit/juror-balance-round-scoping.test.ts`:

```typescript
describe('analytics.getProjectRankings per-round z-context (edition mode)', () => {
  let programId: string
  let admin: { id: string; email: string; role: 'SUPER_ADMIN' }
  let projectXId: string
  let projectYId: string
  const userIds: string[] = []

  beforeAll(async () => {
    const program = await createTestProgram({ name: `rank-edition-${uid()}` })
    programId = program.id
    const competition = await createTestCompetition(programId)
    const roundA = await createTestRound(competition.id, { name: 'A', sortOrder: 0 })
    const roundB = await createTestRound(competition.id, { name: 'B', sortOrder: 1 })
    const formA = await createTestEvaluationForm(roundA.id, [
      { id: 'c1', label: 'X', scale: '1-10', weight: 1 },
    ])
    const formB = await createTestEvaluationForm(roundB.id, [
      { id: 'c1', label: 'X', scale: '1-10', weight: 1 },
    ])

    const projX = await createTestProject(programId, { title: 'X' })
    const projY = await createTestProject(programId, { title: 'Y' })
    projectXId = projX.id
    projectYId = projY.id
    await createTestProjectRoundState(projX.id, roundA.id)
    await createTestProjectRoundState(projY.id, roundA.id)
    await createTestProjectRoundState(projX.id, roundB.id)
    await createTestProjectRoundState(projY.id, roundB.id)

    // Round A: a "lenient" juror grades 9 on X, 9 on Y. A "harsh" juror grades 6 on X, 4 on Y.
    // Pooling A+B into a single z-context would blur the two grading profiles. Per-round contexts:
    //   - In Round A: lenient mean=9 stddev=0 (fallback), harsh mean=5 stddev=1
    //   - In Round B: lenient mean=8 stddev=0 (fallback), harsh mean=6 stddev=1
    const lenient = await createTestUser('JURY_MEMBER')
    const harsh = await createTestUser('JURY_MEMBER')
    userIds.push(lenient.id, harsh.id)

    const writeEval = async (jurorId: string, projId: string, roundId: string, formId: string, c1: number) => {
      const a = await createTestAssignment(jurorId, projId, roundId)
      await prisma.evaluation.create({
        data: {
          assignmentId: a.id,
          formId,
          status: 'SUBMITTED',
          submittedAt: new Date(),
          criterionScoresJson: { c1 },
        },
      })
    }

    // Round A
    await writeEval(lenient.id, projX.id, roundA.id, formA.id, 9)
    await writeEval(lenient.id, projY.id, roundA.id, formA.id, 9)
    await writeEval(harsh.id, projX.id, roundA.id, formA.id, 6)
    await writeEval(harsh.id, projY.id, roundA.id, formA.id, 4)
    // Round B (different scoring profile so cross-round pooling skews things)
    await writeEval(lenient.id, projX.id, roundB.id, formB.id, 8)
    await writeEval(lenient.id, projY.id, roundB.id, formB.id, 8)
    await writeEval(harsh.id, projX.id, roundB.id, formB.id, 7)
    await writeEval(harsh.id, projY.id, roundB.id, formB.id, 5)

    const adminUser = await createTestUser('SUPER_ADMIN')
    userIds.push(adminUser.id)
    admin = { id: adminUser.id, email: adminUser.email, role: 'SUPER_ADMIN' }
  })

  afterAll(async () => {
    await cleanupTestData(programId, userIds)
  })

  it('aggregates per-project balanced score as the mean of per-round balanced averages', async () => {
    const caller = createCaller(analyticsRouter, admin)
    const result = await caller.getProjectRankings({ programId })
    const x = result.find((p) => p.id === projectXId)!
    const y = result.find((p) => p.id === projectYId)!

    // Hand-computed expected per-round balanced averages (population stddev throughout):
    // Round A: lenient stddev=0 (fallback to overall), harsh mean=5 stddev=1.
    //   Round A overall mean=7, variance=(4+4+1+9)/4=4.5, stddev=sqrt(4.5)=2.1213
    //   X: lenient z=fallback (9-7)/2.1213=+0.9428, harsh z=(6-5)/1=+1.0 → avg z=0.9714
    //   X balanced = 7 + 0.9714*2.1213 ≈ 9.06
    //   Y: lenient z=fallback (9-7)/2.1213=+0.9428, harsh z=(4-5)/1=-1.0 → avg z=-0.0286
    //   Y balanced = 7 - 0.0286*2.1213 ≈ 6.94
    // Round B: lenient mean=8 stddev=0 (fallback), harsh mean=6 stddev=1.
    //   Round B overall mean=7, variance=(1+1+0+4)/4=1.5, stddev=sqrt(1.5)=1.2247
    //   X: lenient z=(8-7)/1.2247=+0.8165, harsh z=(7-6)/1=+1.0 → avg z=0.9082
    //   X balanced = 7 + 0.9082*1.2247 ≈ 8.11
    //   Y: lenient z=(8-7)/1.2247=+0.8165, harsh z=(5-6)/1=-1.0 → avg z=-0.0918
    //   Y balanced = 7 - 0.0918*1.2247 ≈ 6.89
    // Project-level edition rollup = mean of per-round balanced averages:
    //   X ≈ (9.06 + 8.11)/2 ≈ 8.59
    //   Y ≈ (6.94 + 6.89)/2 ≈ 6.91
    expect(x.balancedScore!).toBeCloseTo(8.59, 1)
    expect(y.balancedScore!).toBeCloseTo(6.91, 1)
  })
})
```
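
The hand computation in the test comments can be cross-checked mechanically. The sketch below is a stand-in reconstructed only from those comments, not the real `juror-balance` service: it assumes population standard deviation, and assumes a juror with zero spread falls back to the round's overall mean and standard deviation. All function names here (`balancedForRound`, `editionRollup`, `push`) are hypothetical.

```typescript
// Stand-in for the juror-balance service, reconstructed from the test comments.
// Assumptions (not confirmed against the real service): population standard
// deviation, and a juror with zero spread falls back to the round's overall
// mean and standard deviation.
type Point = { projectId: string; userId: string; roundId: string; rawScore: number }

const mean = (xs: number[]): number => xs.reduce((a, b) => a + b, 0) / xs.length
const popStd = (xs: number[]): number => {
  const m = mean(xs)
  return Math.sqrt(mean(xs.map((x) => (x - m) * (x - m))))
}

// Append a value to the array stored under `key`, creating it on first use.
function push<T>(map: Map<string, T[]>, key: string, value: T): void {
  const arr = map.get(key) ?? []
  arr.push(value)
  map.set(key, arr)
}

// Balanced average per project within ONE round.
function balancedForRound(points: Point[]): Map<string, number> {
  const all = points.map((p) => p.rawScore)
  const overallMean = mean(all)
  const overallStd = popStd(all)

  const scoresByJuror = new Map<string, number[]>()
  points.forEach((p) => push(scoresByJuror, p.userId, p.rawScore))

  const zByProject = new Map<string, number[]>()
  points.forEach((p) => {
    const jurorScores = scoresByJuror.get(p.userId) ?? []
    const jurorStd = popStd(jurorScores)
    const fallback = jurorStd === 0
    const mu = fallback ? overallMean : mean(jurorScores)
    const sigma = fallback ? overallStd : jurorStd
    push(zByProject, p.projectId, sigma === 0 ? 0 : (p.rawScore - mu) / sigma)
  })

  const out = new Map<string, number>()
  zByProject.forEach((zs, projectId) => out.set(projectId, overallMean + mean(zs) * overallStd))
  return out
}

// Edition rollup: one z-context per round, then the mean of per-round results.
function editionRollup(points: Point[]): Map<string, number> {
  const byRound = new Map<string, Point[]>()
  points.forEach((p) => push(byRound, p.roundId, p))

  const perProject = new Map<string, number[]>()
  byRound.forEach((roundPoints) => {
    balancedForRound(roundPoints).forEach((v, projectId) => push(perProject, projectId, v))
  })

  const out = new Map<string, number>()
  perProject.forEach((values, projectId) => out.set(projectId, mean(values)))
  return out
}

// Same fixture as the test above.
const fixture: Point[] = [
  { projectId: 'X', userId: 'lenient', roundId: 'A', rawScore: 9 },
  { projectId: 'Y', userId: 'lenient', roundId: 'A', rawScore: 9 },
  { projectId: 'X', userId: 'harsh', roundId: 'A', rawScore: 6 },
  { projectId: 'Y', userId: 'harsh', roundId: 'A', rawScore: 4 },
  { projectId: 'X', userId: 'lenient', roundId: 'B', rawScore: 8 },
  { projectId: 'Y', userId: 'lenient', roundId: 'B', rawScore: 8 },
  { projectId: 'X', userId: 'harsh', roundId: 'B', rawScore: 7 },
  { projectId: 'Y', userId: 'harsh', roundId: 'B', rawScore: 5 },
]
const rollup = editionRollup(fixture)
console.log(rollup.get('X')?.toFixed(2), rollup.get('Y')?.toFixed(2))
```

If the real service uses sample standard deviation or a different zero-spread fallback, its numbers will differ from this stand-in; adjust the sketch to match before comparing against the test's expected values.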

- [ ] **Step 2: Run test to verify it fails**

Run: `npx vitest run tests/unit/juror-balance-round-scoping.test.ts -t 'mean of per-round balanced averages'`
Expected: FAIL — current code pools all 8 evaluations into one z-context.

- [ ] **Step 3: Add helper `computePerRoundBalanced` to juror-balance service**

Modify `src/server/services/juror-balance.ts`. Append at the end:

```typescript
export type RoundScopedScorePoint = ScorePoint & { roundId: string }

export type EditionRollupResult = {
  projectId: string
  rawAverage: number | null
  balancedAverage: number | null
  count: number
  roundCount: number
}

/**
 * Per-round balanced rollup: groups points by roundId, computes a balance
 * context per round, then averages the per-round balanced averages for each
 * project. Use when surfacing edition-level rankings. Never pool z-contexts
 * across rounds, because a juror's grading profile differs by round type.
 */
export function computePerRoundBalanced(
  points: RoundScopedScorePoint[],
): Map<string, EditionRollupResult> {
  const byRound = new Map<string, ScorePoint[]>()
  for (const p of points) {
    const arr = byRound.get(p.roundId) ?? []
    arr.push({ projectId: p.projectId, userId: p.userId, rawScore: p.rawScore })
    byRound.set(p.roundId, arr)
  }

  const perRoundResults: Array<Map<string, BalancedProjectResult>> = []
  for (const roundPoints of byRound.values()) {
    const ctx = computeBalanceContext(roundPoints)
    perRoundResults.push(computeBalancedProjectScores(roundPoints, ctx))
  }

  const accumulator = new Map<
    string,
    { rawSum: number; rawCount: number; balancedSum: number; balancedCount: number; count: number; roundCount: number }
  >()
  for (const roundMap of perRoundResults) {
    for (const [projectId, result] of roundMap.entries()) {
      const acc = accumulator.get(projectId) ?? {
        rawSum: 0, rawCount: 0, balancedSum: 0, balancedCount: 0, count: 0, roundCount: 0,
      }
      if (result.rawAverage != null) {
        acc.rawSum += result.rawAverage
        acc.rawCount += 1
      }
      if (result.balancedAverage != null) {
        acc.balancedSum += result.balancedAverage
        acc.balancedCount += 1
      }
      acc.count += result.count
      acc.roundCount += 1
      accumulator.set(projectId, acc)
    }
  }

  const out = new Map<string, EditionRollupResult>()
  for (const [projectId, acc] of accumulator.entries()) {
    out.set(projectId, {
      projectId,
      rawAverage: acc.rawCount > 0 ? acc.rawSum / acc.rawCount : null,
      balancedAverage: acc.balancedCount > 0 ? acc.balancedSum / acc.balancedCount : null,
      count: acc.count,
      roundCount: acc.roundCount,
    })
  }
  return out
}
```

- [ ] **Step 4: Update `getProjectRankings` to branch on roundId vs programId**

Modify `src/server/routers/analytics.ts`. Replace the imports near line 9:

```typescript
import {
  computeBalanceContext,
  computeBalancedProjectScores,
  computePerRoundBalanced,
  type ScorePoint,
  type RoundScopedScorePoint,
} from '../services/juror-balance'
```

Then replace the `getProjectRankings` body (lines 199-258) with:

```typescript
  getProjectRankings: observerProcedure
    .input(editionOrRoundInput.and(z.object({ limit: z.number().optional() })))
    .query(async ({ ctx, input }) => {
      const [projects, evaluations] = await Promise.all([
        ctx.prisma.project.findMany({
          where: projectWhere(input),
          select: { id: true, title: true, teamName: true, status: true },
        }),
        ctx.prisma.evaluation.findMany({
          where: evalWhere(input, { status: 'SUBMITTED' }),
          select: {
            criterionScoresJson: true,
            assignment: { select: { userId: true, projectId: true, roundId: true } },
          },
        }),
      ])

      const rawPoints: RoundScopedScorePoint[] = []
      for (const e of evaluations) {
        const scores = e.criterionScoresJson as Record<string, unknown> | null
        if (!scores) continue
        const vals = Object.values(scores).filter((s): s is number => typeof s === 'number')
        if (vals.length === 0) continue
        const rawScore = vals.reduce((a, b) => a + b, 0) / vals.length
        rawPoints.push({
          projectId: e.assignment.projectId,
          userId: e.assignment.userId,
          roundId: e.assignment.roundId,
          rawScore,
        })
      }

      // roundId mode: single-round z-context (existing behavior)
      // programId mode: per-round z-contexts aggregated as the mean of per-round balanced averages
      const balancedByProject: Map<string, { rawAverage: number | null; balancedAverage: number | null; count: number }> = (() => {
        if (input.roundId) {
          const flat: ScorePoint[] = rawPoints.map(({ projectId, userId, rawScore }) => ({ projectId, userId, rawScore }))
          const ctx = computeBalanceContext(flat)
          const out = computeBalancedProjectScores(flat, ctx)
          return out
        }
        return computePerRoundBalanced(rawPoints)
      })()

      const rankings = projects
        .map((project) => {
          const result = balancedByProject.get(project.id)
          return {
            id: project.id,
            title: project.title,
            teamName: project.teamName,
            status: project.status,
            averageScore: result?.rawAverage ?? null,
            balancedScore: result?.balancedAverage ?? null,
            evaluationCount: result?.count ?? 0,
          }
        })
        .sort((a, b) => {
          const aScore = a.balancedScore ?? a.averageScore
          const bScore = b.balancedScore ?? b.averageScore
          if (aScore !== null && bScore !== null) return bScore - aScore
          if (aScore !== null) return -1
          if (bScore !== null) return 1
          return 0
        })

      return input.limit ? rankings.slice(0, input.limit) : rankings
    }),
```
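
Two small pieces of the procedure above are easy to get subtly wrong: deriving `rawScore` from `criterionScoresJson`, and ordering rows when some scores are `null`. The sketch below isolates both so they can be exercised without a database. The function names `meanNumericScore` and `compareRanked` are hypothetical; the logic is copied from the procedure body.

```typescript
// Mirrors the inline logic in the procedure; the function names are
// hypothetical and exist only for this illustration.
function meanNumericScore(json: Record<string, unknown> | null): number | null {
  if (!json) return null
  // Non-numeric entries (text answers, booleans) are ignored.
  const vals = Object.values(json).filter((v): v is number => typeof v === 'number')
  if (vals.length === 0) return null
  return vals.reduce((a, b) => a + b, 0) / vals.length
}

type Ranked = { id: string; averageScore: number | null; balancedScore: number | null }

// Descending by balanced score, with the raw average as fallback.
// Rows that have neither score sort after every scored row.
function compareRanked(a: Ranked, b: Ranked): number {
  const aScore = a.balancedScore ?? a.averageScore
  const bScore = b.balancedScore ?? b.averageScore
  if (aScore !== null && bScore !== null) return bScore - aScore
  if (aScore !== null) return -1
  if (bScore !== null) return 1
  return 0
}

const rows: Ranked[] = [
  { id: 'unscored', averageScore: null, balancedScore: null },
  { id: 'raw-only', averageScore: 7.5, balancedScore: null },
  { id: 'balanced', averageScore: 6.0, balancedScore: 8.1 },
]
const order = [...rows].sort(compareRanked).map((r) => r.id)
```

Note that the averaging is unweighted: as written in the plan, the criterion `weight` defined on the evaluation form does not enter `rawScore`.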

- [ ] **Step 5: Run the test suite**

Run: `npx vitest run tests/unit/juror-balance-round-scoping.test.ts`
Expected: all 3 tests PASS.

- [ ] **Step 6: Run typecheck**

Run: `npm run typecheck`
Expected: no errors.

- [ ] **Step 7: Commit**

```bash
git add src/server/routers/analytics.ts src/server/services/juror-balance.ts tests/unit/juror-balance-round-scoping.test.ts
git commit -m "fix: compute z-context per-round in edition-mode rankings rollup"
```

---

## Task 3: Pass `roundId` from admin ranking dashboard side sheet

**Files:**
- Modify: `src/components/admin/round/ranking-dashboard.tsx` (around the existing `getProjectDetail.useQuery` call)

- [ ] **Step 1: Locate the existing useQuery call**

Open `src/components/admin/round/ranking-dashboard.tsx` and find where `selectedProjectId` drives `analytics.getProjectDetail`. The component already has `roundId` in scope (it's the dashboard's own prop / state).

Run: `grep -n "getProjectDetail\.useQuery" src/components/admin/round/ranking-dashboard.tsx`

- [ ] **Step 2: Add roundId to the input**

Edit the call to include `roundId`:

```typescript
  const { data: projectDetail, isLoading: detailLoading } =
    trpc.analytics.getProjectDetail.useQuery(
      { id: selectedProjectId!, roundId },
      { enabled: !!selectedProjectId },
    )
```

(Confirm the existing `enabled` guard and any other existing options stay intact.)

- [ ] **Step 3: Manual smoke test**

Start the dev server: `npm run dev`
Navigate to the admin ranking dashboard for a round where a project has had evaluations in earlier rounds. Click a project. Confirm:
- The "Evaluators" stat in the side sheet matches the count in the per-juror list below.
- "Avg Score" reflects only the current round's scores (one decimal).

- [ ] **Step 4: Commit**

```bash
git add src/components/admin/round/ranking-dashboard.tsx
git commit -m "fix: scope admin ranking side-sheet stats to current round"
```

---

## Task 4: Pass `roundId` to observer reports preview dialog

**Files:**
- Modify: `src/components/observer/reports/project-preview-dialog.tsx`
- Modify: caller(s) of `<ProjectPreviewDialog />` (find via grep)

- [ ] **Step 1: Find callers**

Run: `grep -rn "ProjectPreviewDialog" src --include="*.tsx"`

The observer reports page already tracks the active round (the round selector lives on the reports page itself per recent commit `2e080a5`). Capture the active roundId from the caller and thread it through.

- [ ] **Step 2: Add `roundId?: string` to the props and useQuery**

Edit `src/components/observer/reports/project-preview-dialog.tsx`:

```typescript
interface ProjectPreviewDialogProps {
  projectId: string | null
  roundId?: string
  open: boolean
  onOpenChange: (open: boolean) => void
}

export function ProjectPreviewDialog({ projectId, roundId, open, onOpenChange }: ProjectPreviewDialogProps) {
  const { data, isLoading } = trpc.analytics.getProjectDetail.useQuery(
    { id: projectId!, roundId },
    { enabled: !!projectId && open },
  )
  // …existing render…
}
```

- [ ] **Step 3: Update each caller to pass `roundId`**

For each call site identified in Step 1, pass the roundId from the page's existing round-selector state. If a caller has no round in scope and is not the observer reports page, omit the prop: the procedure's optional `roundId` falls back to aggregate stats and the dialog still renders correctly.

- [ ] **Step 4: Run typecheck**

Run: `npm run typecheck`
Expected: no errors.

- [ ] **Step 5: Manual smoke test**

`npm run dev`. From the observer reports page, open a project preview. Confirm Avg Score / Evaluators match the round selector at the top of the page.

- [ ] **Step 6: Commit**

```bash
git add src/components/observer/reports/project-preview-dialog.tsx <updated caller files>
git commit -m "fix: scope observer reports preview dialog to selected round"
```

---

## Task 5: Resolve default round on observer full project page

**Files:**
- Modify: `src/app/(observer)/observer/projects/[projectId]/page.tsx`
- Modify: `src/components/observer/observer-project-detail.tsx`

- [ ] **Step 1: Read the existing page wrapper**

Run: `cat src/app/\(observer\)/observer/projects/\[projectId\]/page.tsx`

It currently calls `<ObserverProjectDetail projectId={projectId} />` without round context.

- [ ] **Step 2: Read `?round=` from search params and pass it down**

Replace the page body:

```tsx
import { ObserverProjectDetail } from '@/components/observer/observer-project-detail'

export default async function ObserverProjectDetailPage({
  params,
  searchParams,
}: {
  params: Promise<{ projectId: string }>
  searchParams: Promise<{ round?: string }>
}) {
  const { projectId } = await params
  const sp = await searchParams
  return <ObserverProjectDetail projectId={projectId} initialRoundId={sp.round} />
}
```

- [ ] **Step 3: Modify `ObserverProjectDetail` to resolve the default**

In `src/components/observer/observer-project-detail.tsx`, update the props and resolve the default round:

```typescript
export function ObserverProjectDetail({ projectId, initialRoundId }: { projectId: string; initialRoundId?: string }) {
  const [activeRoundId, setActiveRoundId] = useState<string | undefined>(initialRoundId)

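  // Round resolution: ROUND_ACTIVE first, else most-recent ROUND_CLOSED.
  // Always fetch candidates (even when ?round= supplied initialRoundId) so the
  // round selector can render; the effect below only picks a default when none is set.
  const { data: roundCandidates } = trpc.analytics.getProjectRoundsForObserver.useQuery(
    { projectId },
    { enabled: !!projectId },
  )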

  useEffect(() => {
    if (activeRoundId || !roundCandidates) return
    const active = roundCandidates.find((r) => r.status === 'ROUND_ACTIVE')
    if (active) {
      setActiveRoundId(active.id)
      return
    }
    const closed = [...roundCandidates]
      .filter((r) => r.status === 'ROUND_CLOSED')
      .sort((a, b) => b.sortOrder - a.sortOrder)[0]
    if (closed) setActiveRoundId(closed.id)
  }, [roundCandidates, activeRoundId])

  const { data, isLoading } = trpc.analytics.getProjectDetail.useQuery(
    { id: projectId, roundId: activeRoundId },
    { refetchInterval: 30_000, enabled: !!projectId },
  )
  // …rest of component, with a small <select> chip near the stats card to switch rounds when len > 1…
```
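
The selection rule inside the `useEffect` can also be expressed as a pure function, which makes the precedence (active round first, then the closed round with the highest `sortOrder`) easy to unit test. This is a sketch only: `resolveDefaultRoundId` and the `RoundCandidate` type are hypothetical, and the plan keeps the logic inline in the component.

```typescript
// Hypothetical pure helper; the plan performs the same selection inline
// in the component's useEffect.
type RoundCandidate = { id: string; name: string; status: string; sortOrder: number }

function resolveDefaultRoundId(candidates: RoundCandidate[]): string | undefined {
  // An active round always wins, regardless of sortOrder.
  const active = candidates.find((r) => r.status === 'ROUND_ACTIVE')
  if (active) return active.id

  // Otherwise take the closed round with the highest sortOrder.
  // `filter` returns a new array, so sorting does not mutate the input.
  const closed = candidates
    .filter((r) => r.status === 'ROUND_CLOSED')
    .sort((a, b) => b.sortOrder - a.sortOrder)[0]
  return closed?.id
}

const exampleRounds: RoundCandidate[] = [
  { id: 'r1', name: 'Screening', status: 'ROUND_CLOSED', sortOrder: 0 },
  { id: 'r2', name: 'Semi-final', status: 'ROUND_CLOSED', sortOrder: 1 },
  { id: 'r3', name: 'Final', status: 'ROUND_ACTIVE', sortOrder: 2 },
]
const withActive = resolveDefaultRoundId(exampleRounds)
const closedOnly = resolveDefaultRoundId(exampleRounds.slice(0, 2))
const noCandidates = resolveDefaultRoundId([])
```

When the list is empty the function returns `undefined`, which corresponds to the component leaving `activeRoundId` unset and `getProjectDetail` returning aggregate stats.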

- [ ] **Step 4: Add the `getProjectRoundsForObserver` procedure**

In `src/server/routers/analytics.ts`, add a new procedure (place it next to other observer procedures):

```typescript
  getProjectRoundsForObserver: observerProcedure
    .input(z.object({ projectId: z.string() }))
    .query(async ({ ctx, input }) => {
      const states = await ctx.prisma.projectRoundState.findMany({
        where: { projectId: input.projectId },
        select: {
          round: { select: { id: true, name: true, status: true, sortOrder: true } },
        },
      })
      return states
        .map((s) => s.round)
        .filter((r) => r.status === 'ROUND_ACTIVE' || r.status === 'ROUND_CLOSED')
        .sort((a, b) => a.sortOrder - b.sortOrder)
    }),
```

- [ ] **Step 5: Render a round selector chip when multiple rounds are available**

Inside the stats card area of `<ObserverProjectDetail />`, when `roundCandidates.length > 1`, render a small select:

```tsx
{roundCandidates && roundCandidates.length > 1 && (
  <div className="text-xs">
    <select
      className="rounded border px-2 py-1 bg-background"
      value={activeRoundId ?? ''}
      onChange={(e) => setActiveRoundId(e.target.value)}
    >
      {roundCandidates.map((r) => (
        <option key={r.id} value={r.id}>{r.name}</option>
      ))}
    </select>
  </div>
)}
```

- [ ] **Step 6: Run typecheck**

Run: `npm run typecheck`
Expected: no errors.

- [ ] **Step 7: Manual smoke test**

`npm run dev`. Open an observer project page for a project that has been in multiple rounds. Confirm:
- With no `?round=`, the page defaults to the active round if any, otherwise the most recently closed.
- The selector chip is visible when there are ≥2 candidate rounds and switches the stats correctly.
- "Evaluators" matches the per-juror count.

- [ ] **Step 8: Commit**

```bash
git add src/app/\(observer\)/observer/projects/\[projectId\]/page.tsx src/components/observer/observer-project-detail.tsx src/server/routers/analytics.ts
git commit -m "feat: resolve observer project page round default + selector"
```

---

## Task 6: Add `useBalancedRanking` to `EvaluationConfigSchema`

**Files:**
- Modify: `src/types/competition-configs.ts:90-156`
- Create: `tests/unit/round-config-balance-toggle.test.ts`

- [ ] **Step 1: Write the failing test**

Create `tests/unit/round-config-balance-toggle.test.ts`:

```typescript
import { afterAll, beforeAll, describe, expect, it } from 'vitest'
import { prisma, createCaller } from '../setup'
import {
  createTestUser, createTestProgram, createTestCompetition, createTestRound,
  cleanupTestData, uid,
} from '../helpers'
import { roundRouter } from '../../src/server/routers/round'

describe('Round.configJson.useBalancedRanking', () => {
  let programId: string
  let admin: { id: string; email: string; role: 'SUPER_ADMIN' }
  const userIds: string[] = []

  beforeAll(async () => {
    const program = await createTestProgram({ name: `bal-toggle-${uid()}` })
    programId = program.id
    const adminUser = await createTestUser('SUPER_ADMIN')
    userIds.push(adminUser.id)
    admin = { id: adminUser.id, email: adminUser.email, role: 'SUPER_ADMIN' }
  })

  afterAll(async () => {
    await cleanupTestData(programId, userIds)
  })

  it('persists useBalancedRanking via round.update', async () => {
    const competition = await createTestCompetition(programId)
    const round = await createTestRound(competition.id)
    const caller = createCaller(roundRouter, admin)
    await caller.update({
      id: round.id,
      configJson: { useBalancedRanking: false },
    })
    const reloaded = await prisma.round.findUniqueOrThrow({ where: { id: round.id } })
    expect((reloaded.configJson as Record<string, unknown>).useBalancedRanking).toBe(false)
  })
})
```

- [ ] **Step 2: Run test to verify it fails or passes**

Run: `npx vitest run tests/unit/round-config-balance-toggle.test.ts`

If `round.update` accepts arbitrary configJson today (passthrough), the test may already pass — that's fine; the test still pins the behavior. If `round.update` schema-validates configJson against `EvaluationConfigSchema` and rejects unknown keys, the test FAILS until the field is added.

- [ ] **Step 3: Add `useBalancedRanking` to the schema**

Modify `src/types/competition-configs.ts`. Inside `EvaluationConfigSchema` (right above `// Ranking (Phase 3)` near line 153), add:

```typescript
  // Whether the ranking dashboard ranks projects by juror-balanced (z-normalized) average.
  // Defaulting to true preserves existing behavior. Toggled per-round via the dashboard side panel.
  useBalancedRanking: z.boolean().default(true),
```
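Rounds saved before this change carry no `useBalancedRanking` key, so any code that reads `configJson` directly (rather than through the parsed schema) needs the same default. A minimal sketch of such a reader, assuming a hypothetical helper name `readUseBalancedRanking`:

```typescript
// Hypothetical reader: returns the persisted flag, or true when the key is
// missing or not a boolean, mirroring the schema default.
function readUseBalancedRanking(configJson: unknown): boolean {
  if (typeof configJson !== 'object' || configJson === null) return true
  const value = (configJson as Record<string, unknown>).useBalancedRanking
  return typeof value === 'boolean' ? value : true
}
```

Unlike a bare type assertion, this version also tolerates malformed values by falling back to the default.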

- [ ] **Step 4: Run test to verify it passes**

Run: `npx vitest run tests/unit/round-config-balance-toggle.test.ts`
Expected: PASS.

- [ ] **Step 5: Run typecheck**

Run: `npm run typecheck`
Expected: no errors.

- [ ] **Step 6: Commit**

```bash
git add src/types/competition-configs.ts tests/unit/round-config-balance-toggle.test.ts
git commit -m "feat: add useBalancedRanking flag to round config schema"
```

---

## Task 7: Wire toggle UI into ranking dashboard side sheet

**Files:**
- Modify: `src/components/admin/round/ranking-dashboard.tsx`

- [ ] **Step 1: Read the dashboard's existing config-save plumbing**

Run: `grep -n "saveRankingConfig\|updateRoundMutation\|roundData?.configJson" src/components/admin/round/ranking-dashboard.tsx`

The component already loads `roundData.configJson` (~line 475-484) and saves via `updateRoundMutation` (`trpc.round.update`). Reuse the same plumbing.

- [ ] **Step 2: Add local state + initialization**

Near the other `useState` calls for local weights, add:

```typescript
  const [useBalanced, setUseBalanced] = useState(true)
```

In the existing `useEffect` that initializes from `roundData.configJson`, add:

```typescript
      setUseBalanced((cfg.useBalancedRanking as boolean | undefined) ?? true)
```

(The toggle should re-hydrate every time `roundData` refetches, without resetting other in-flight edits. Leave the `weightsInitialized.current` guard in place, but read `useBalancedRanking` before it rather than alongside the other `setLocal*` calls. Concretely:)

```typescript
  useEffect(() => {
    if (!roundData?.configJson) return
    const cfg = roundData.configJson as Record<string, unknown>
    setUseBalanced((cfg.useBalancedRanking as boolean | undefined) ?? true)
    if (weightsInitialized.current) return
    const saved = (cfg.criteriaWeights ?? {}) as Record<string, number>
    setLocalWeights(saved)
    setLocalCriteriaText((cfg.rankingCriteria as string) ?? '')
    setLocalScoreWeight((cfg.scoreWeight as number) ?? 5)
    setLocalPassRateWeight((cfg.passRateWeight as number) ?? 5)
    weightsInitialized.current = true
  }, [roundData])
```
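The ordering inside that effect can be illustrated with a pure function, stripped of React. This is a sketch under assumptions: `DashboardState` and `hydrate` are hypothetical names, and only one weight is modelled.

```typescript
interface DashboardState {
  useBalanced: boolean
  scoreWeight: number
  weightsInitialized: boolean
}

// Hypothetical pure version of the effect body: the toggle follows the server
// on every call, while the weight only hydrates the first time.
function hydrate(state: DashboardState, cfg: Record<string, unknown>): DashboardState {
  const flag = cfg['useBalancedRanking']
  const useBalanced = typeof flag === 'boolean' ? flag : true
  if (state.weightsInitialized) return { ...state, useBalanced }

  const weight = cfg['scoreWeight']
  const scoreWeight = typeof weight === 'number' ? weight : 5
  return { useBalanced, scoreWeight, weightsInitialized: true }
}
```

A second call after a refetch updates `useBalanced` but leaves a locally edited `scoreWeight` alone, which is the behavior the guard placement is meant to produce.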

- [ ] **Step 3: Add a toggle handler that persists immediately**

Add next to `saveRankingConfig`:

```typescript
  const persistUseBalanced = (next: boolean) => {
    setUseBalanced(next)
    if (!roundData?.configJson) return
    const cfg = roundData.configJson as Record<string, unknown>
    updateRoundMutation.mutate({
      id: roundId,
      configJson: { ...cfg, useBalancedRanking: next },
    })
  }
```
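The handler spreads the existing config before setting the flag. Whether `round.update` merges or replaces `configJson` is not stated here, so the spread is the safe choice either way. A minimal sketch of that merge, with `withUseBalanced` and the sample values as hypothetical stand-ins:

```typescript
type RoundConfig = Record<string, unknown>

// Returns a new config with only the flag changed; every other key survives.
function withUseBalanced(cfg: RoundConfig, next: boolean): RoundConfig {
  return { ...cfg, useBalancedRanking: next }
}

// Sample saved config (illustrative values only).
const saved: RoundConfig = { scoreWeight: 7, rankingCriteria: 'impact first' }
const updated = withUseBalanced(saved, false)
```

Here `updated` keeps `scoreWeight` and `rankingCriteria` and gains `useBalancedRanking: false`, while `saved` itself is not mutated.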

- [ ] **Step 4: Render a toggle row at the top of the side sheet**

Inside the `<SheetContent>` block, just below the header (right above the stats grid at line ~1004), insert:

```tsx
              <div className="mt-4 flex items-center justify-between rounded-lg border p-3">
                <div className="flex flex-col">
                  <span className="text-sm font-medium">Use balanced scoring for ranking</span>
                  <span className="text-xs text-muted-foreground">
                    Corrects for per-juror grading style. Off uses raw averages.
                  </span>
                </div>
                <Switch checked={useBalanced} onCheckedChange={persistUseBalanced} />
              </div>
```

Add to the imports at the top of the file (only if not already imported):

```typescript
import { Switch } from '@/components/ui/switch'
```

- [ ] **Step 5: Run typecheck**

Run: `npm run typecheck`
Expected: no errors.

- [ ] **Step 6: Manual smoke test**

`npm run dev`. Open the admin ranking dashboard for a round, click a project, flip the toggle. Reload — it should persist. Open another browser/session — the same value should appear.

- [ ] **Step 7: Commit**

```bash
git add src/components/admin/round/ranking-dashboard.tsx
git commit -m "feat: per-round 'use balanced scoring' toggle in side sheet"
```

---

## Task 8: List sort respects the toggle

**Files:**
- Modify: `src/components/admin/round/ranking-dashboard.tsx`

- [ ] **Step 1: Locate the sort sites**

Run: `grep -n "balancedAverage\|avgGlobalScore" src/components/admin/round/ranking-dashboard.tsx`

Two sort sites read balanced first: the initial sort (~line 417) and the composite computation (~line 879). Both use `evalScores.balanced[id]?.balancedAverage ?? raw ?? 0`.

- [ ] **Step 2: Replace the score selector with a helper**

Add inside the component, after `evalScores` and `useBalanced` are declared:

```typescript
  const pickRankingScore = (projectId: string, rawFallback: number | null | undefined): number => {
    const balanced = evalScores?.balanced[projectId]?.balancedAverage
    if (useBalanced && balanced != null) return balanced
    return rawFallback ?? 0
  }
```

Replace the two existing expressions:

- Around line 417, change:
  ```typescript
  evalScores.balanced[projectId]?.balancedAverage ?? raw ?? 0
  ```
  to:
  ```typescript
  pickRankingScore(projectId, raw)
  ```

- Around line 879, change:
  ```typescript
  return evalScores?.balanced[id]?.balancedAverage ?? e?.avgGlobalScore ?? 0
  ```
  to:
  ```typescript
  return pickRankingScore(id, e?.avgGlobalScore)
  ```

- [ ] **Step 3: Trigger re-sort when toggle flips**

The list memoizes order from `evalScores`. Add `useBalanced` to the dependency array of the `useMemo` / `useEffect` that computes ranking order.

Run: `grep -n "useMemo\|useEffect" src/components/admin/round/ranking-dashboard.tsx | head` and locate the order-computation block (look for the `useEffect` near line 393 noted in the existing comment "Wait for evalScores too — the initial sort uses balanced (juror-corrected)"). Add `useBalanced` to the dependency array there.
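To see why the order must be recomputed when the toggle flips, here is a sketch of the sort as a pure function. `RankedEntry`, `BalancedById`, and `sortEntries` are hypothetical names; the selection rule mirrors `pickRankingScore`.

```typescript
interface RankedEntry {
  projectId: string
  avgGlobalScore: number | null
}

type BalancedById = Record<string, { balancedAverage: number | null } | undefined>

// Same selection rule as pickRankingScore, written as a pure function.
function rankingScore(entry: RankedEntry, balanced: BalancedById, useBalanced: boolean): number {
  const b = balanced[entry.projectId]?.balancedAverage
  if (useBalanced && b != null) return b
  return entry.avgGlobalScore ?? 0
}

// Returns a new array sorted highest score first; the input is left untouched.
function sortEntries(
  entries: RankedEntry[],
  balanced: BalancedById,
  useBalanced: boolean,
): RankedEntry[] {
  return [...entries].sort(
    (a, b) => rankingScore(b, balanced, useBalanced) - rankingScore(a, balanced, useBalanced),
  )
}
```

Because `useBalanced` is an input to the sort, it has to appear in the dependency array of whatever hook caches the result; otherwise the list keeps the order computed for the previous toggle value.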

- [ ] **Step 4: Run typecheck**

Run: `npm run typecheck`
Expected: no errors.

- [ ] **Step 5: Manual smoke test**

Flip the toggle in the side sheet. The list under each category should re-sort live.

- [ ] **Step 6: Commit**

```bash
git add src/components/admin/round/ranking-dashboard.tsx
git commit -m "feat: list sort respects useBalancedRanking toggle"
```

---

## Task 9: Rebuild side-panel stats card (Raw + Balanced side-by-side)

**Files:**
- Modify: `src/components/admin/round/ranking-dashboard.tsx` (the side-sheet stats grid around lines 1005-1027)

- [ ] **Step 1: Remove the `⇢ X.X` annotation from list rows**

Run: `grep -n "⇢" src/components/admin/round/ranking-dashboard.tsx`

Locate the inner block around lines 204-220 (the "Raw + balanced averages shown side by side" comment). Replace the entire `{balancedScore != null && Math.abs(...) >= 0.05 && (…)}` JSX block with nothing. Keep the raw `{entry.avgGlobalScore.toFixed(1)}` rendering.

Also remove the `balancedScore` prop from the `<RankingRow />` component declaration and its prop interface, since the row no longer uses it, and drop the prop from the JSX at every call site.

- [ ] **Step 2: Compute balanced + raw averages for the open project**

Inside the side-sheet block, just before the stats grid, compute:

```tsx
              {(() => {
                const raw = evalScores?.balanced[selectedProjectId ?? '']?.rawAverage ?? null
                const balanced = evalScores?.balanced[selectedProjectId ?? '']?.balancedAverage ?? null
                const showBoth = raw != null || balanced != null
                if (!showBoth) return null
                return (
                  <div className="rounded-lg border p-3">
                    <p className="text-xs text-muted-foreground mb-2">Avg Score</p>
                    <div className="flex items-baseline gap-4">
                      <div className={`flex items-baseline gap-1 ${useBalanced ? 'text-muted-foreground' : 'font-semibold'}`}>
                        <span className="text-xs">Raw</span>
                        <span className="text-lg tabular-nums">{raw != null ? raw.toFixed(1) : '—'}</span>
                        {!useBalanced && <span className="ml-1 text-[10px] text-muted-foreground">← used for ranking</span>}
                      </div>
                      <div className={`flex items-baseline gap-1 ${useBalanced ? 'font-semibold' : 'text-muted-foreground'}`}>
                        <span className="text-xs">Balanced</span>
                        <span className="text-lg tabular-nums">{balanced != null ? balanced.toFixed(1) : '—'}</span>
                        {useBalanced && <span className="ml-1 text-[10px] text-muted-foreground">← used for ranking</span>}
                      </div>
                    </div>
                  </div>
                )
              })()}
```

The `← used for ranking` chip sits next to whichever number is active: the snippet above conditionally renders it inside each label block, and the bolded label reinforces which value drives the ranking.
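The card's display rules can be separated from the markup as a small view-model function, which makes them easy to check in isolation. A sketch only: `AvgCardModel`, `describeAvgCard`, and `PLACEHOLDER` are hypothetical names.

```typescript
const PLACEHOLDER = '\u2014' // the dash the card shows for a missing value

interface AvgCardModel {
  rawText: string
  balancedText: string
  active: 'raw' | 'balanced'
}

// Returns null when neither average exists, matching the card's early return.
function describeAvgCard(
  raw: number | null,
  balanced: number | null,
  useBalanced: boolean,
): AvgCardModel | null {
  if (raw == null && balanced == null) return null
  return {
    rawText: raw != null ? raw.toFixed(1) : PLACEHOLDER,
    balancedText: balanced != null ? balanced.toFixed(1) : PLACEHOLDER,
    active: useBalanced ? 'balanced' : 'raw',
  }
}
```

One edge case worth noting: when the toggle is on but a project has no balanced value, the Task 8 helper falls back to the raw score, so the highlighted number and the ranking input can differ for that project.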

- [ ] **Step 3: Replace the legacy 3-card grid**

The existing 3-card grid (Avg / Pass Rate / Evaluators) at lines 1006-1027 keeps Pass Rate + Evaluators but loses the Avg card (replaced by Step 2's combined card). Restructure into a vertical stack:

```tsx
              {projectDetail.stats && (
                <div className="space-y-3">
                  {/* Avg card (Step 2 above) */}
                  <div className="grid grid-cols-2 gap-3">
                    <div className="rounded-lg border p-3 text-center">
                      <p className="text-xs text-muted-foreground">Pass Rate</p>
                      <p className="mt-1 text-lg font-semibold">
                        {projectDetail.stats.totalEvaluations > 0
                          ? `${Math.round((projectDetail.stats.yesVotes / projectDetail.stats.totalEvaluations) * 100)}%`
                          : '—'}
                      </p>
                    </div>
                    <div className="rounded-lg border p-3 text-center">
                      <p className="text-xs text-muted-foreground">Evaluators</p>
                      <p className="mt-1 text-lg font-semibold">
                        {projectDetail.stats.totalEvaluations}
                      </p>
                    </div>
                  </div>
                </div>
              )}
```

- [ ] **Step 4: Run typecheck + manual smoke test**

Run: `npm run typecheck`
`npm run dev` and confirm the side panel renders both Raw + Balanced, with the active one bolded.

- [ ] **Step 5: Commit**

```bash
git add src/components/admin/round/ranking-dashboard.tsx
git commit -m "feat: side panel shows both raw and balanced averages, list view drops delta annotation"
```

---

## Task 10: Per-juror "typical / contributes as" chip

**Files:**
- Modify: `src/server/routers/ranking.ts` (extend the data the dashboard already fetches to include per-juror balance stats)
- Modify: `src/components/admin/round/ranking-dashboard.tsx`

- [ ] **Step 1: Extend the ranking router's evalScores response with per-juror stats**

Open `src/server/routers/ranking.ts` and locate the procedure that returns `byProject` + `balanced` (around lines 488-540). After computing `balanceCtx`, extend the response with a `jurorStats` map:

```typescript
      const jurorStats: Record<string, { mean: number; stddev: number; count: number }> = {}
      for (const [userId, s] of balanceCtx.jurorStats.entries()) {
        jurorStats[userId] = { mean: s.mean, stddev: s.stddev, count: s.count }
      }

      return { byProject, balanced, jurorStats, overallMean: balanceCtx.overallMean, overallStddev: balanceCtx.overallStddev }
```
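For orientation, the `mean`, `stddev`, and `count` fields could be derived from the round's scores roughly as follows. This is a sketch, not the existing `juror-balance.ts` code: `ScorePoint` and `computeJurorStats` are hypothetical names, and the use of a population standard deviation is an assumption (the real service may use a sample stddev or enforce a minimum count).

```typescript
interface ScorePoint {
  userId: string
  score: number
}

interface JurorStat {
  mean: number
  stddev: number
  count: number
}

// Groups scores by juror, then computes each juror's mean and population
// standard deviation. Assumption: population (divide by n), not sample.
function computeJurorStats(points: ScorePoint[]): Record<string, JurorStat> {
  const byJuror: Record<string, number[]> = {}
  for (const p of points) {
    const list = byJuror[p.userId] ?? []
    list.push(p.score)
    byJuror[p.userId] = list
  }

  const stats: Record<string, JurorStat> = {}
  for (const userId of Object.keys(byJuror)) {
    const scores = byJuror[userId] ?? []
    const count = scores.length
    const mean = scores.reduce((sum, s) => sum + s, 0) / count
    const variance = scores.reduce((sum, s) => sum + (s - mean) * (s - mean), 0) / count
    stats[userId] = { mean, stddev: Math.sqrt(variance), count }
  }
  return stats
}
```

A juror with a single evaluation ends up with `stddev: 0`, which is the case the chip in the next step handles by falling back to the round-wide statistics.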

- [ ] **Step 2: Render the chip in the per-juror list**

In the side sheet, find the per-juror row (around lines 1046-1090). After the existing `<Badge variant="outline">Score: {a.evaluation?.globalScore?.toFixed(1) ?? '—'}</Badge>`, render a chip when balanced is on AND we have stats for this juror:

```tsx
{useBalanced && (() => {
  const stats = evalScores?.jurorStats?.[a.userId]
  const score = a.evaluation?.globalScore
  if (!stats || score == null) return null
  const overallMean = evalScores!.overallMean
  const overallStddev = evalScores!.overallStddev
  if (overallStddev === 0) return null
  const z = stats.stddev > 0 ? (score - stats.mean) / stats.stddev : (score - overallMean) / overallStddev
  const contributesAs = overallMean + z * overallStddev
  return (
    <span className="ml-2 text-xs text-muted-foreground" title="Juror's personal scoring baseline → rescaled contribution">
      typical {stats.mean.toFixed(1)} → contributes {contributesAs.toFixed(1)}
    </span>
  )
})()}
```

(`a.userId` may be absent in the existing select. If so, add `userId: true` to the assignment select inside `getProjectDetail` and re-thread, or use `a.user?.id` if already selected. Run `grep -n "user: { select" src/server/routers/analytics.ts` to confirm.)
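The arithmetic inside the chip can be pulled into a pure helper for testing. The sketch below follows the same branches as the JSX above; `JurorBaseline` and `contributesAs` are hypothetical names.

```typescript
interface JurorBaseline {
  mean: number
  stddev: number
}

// Rescales one juror's score onto the round-wide scale. Returns null when the
// round has no spread, in which case the chip is not rendered.
function contributesAs(
  score: number,
  juror: JurorBaseline,
  overallMean: number,
  overallStddev: number,
): number | null {
  if (overallStddev === 0) return null
  const z =
    juror.stddev > 0
      ? (score - juror.mean) / juror.stddev
      : (score - overallMean) / overallStddev
  return overallMean + z * overallStddev
}
```

For example, a 7 from a juror with mean 6 and spread 2, in a round with mean 7 and spread 1.5, contributes as 7.75. For a juror with zero spread the fallback branch returns the score itself, up to floating-point error.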

- [ ] **Step 3: Run typecheck + manual smoke test**

Run: `npm run typecheck`
`npm run dev`. Open the side panel. With balanced on, each juror row should show "typical X.X → contributes Y.Y". With it off, the chip disappears.

- [ ] **Step 4: Commit**

```bash
git add src/server/routers/ranking.ts src/components/admin/round/ranking-dashboard.tsx
git commit -m "feat: side panel shows per-juror baseline and balanced contribution"
```

---

## Task 11: Build shared `<ScoreExplainerDialog />`

**Files:**
- Create: `src/components/shared/score-explainer-dialog.tsx`

- [ ] **Step 1: Create the dialog component**

```tsx
'use client'

import {
  Dialog,
  DialogContent,
  DialogHeader,
  DialogTitle,
  DialogTrigger,
} from '@/components/ui/dialog'
import { Button } from '@/components/ui/button'
import { Info } from 'lucide-react'
import type { ReactNode } from 'react'

export function ScoreExplainerDialog({ trigger }: { trigger?: ReactNode }) {
  return (
    <Dialog>
      <DialogTrigger asChild>
        {trigger ?? (
          <Button variant="ghost" size="sm" className="h-7 gap-1 px-2 text-xs">
            <Info className="h-3.5 w-3.5" />
            How scores are calculated
          </Button>
        )}
      </DialogTrigger>
      <DialogContent className="max-w-xl max-h-[85vh] overflow-y-auto">
        <DialogHeader>
          <DialogTitle>How scores are calculated</DialogTitle>
        </DialogHeader>

        <div className="space-y-4 text-sm">
          <p>
            Different jurors have different grading styles. Some grade harshly, some leniently.
            Balanced scoring corrects for that so a project isn&apos;t punished for drawing harsh
            jurors or rewarded for drawing lenient ones.
          </p>

          <div>
            <h3 className="font-semibold mb-1">How it works</h3>
            <ol className="list-decimal pl-5 space-y-1">
              <li>For each juror, calculate their personal average and spread across all the projects they scored in this round.</li>
              <li>Convert each individual score into &quot;how many standard deviations above or below this juror&apos;s typical&quot;: given equal spreads, a 6 from a juror who averages 5 reads the same as a 9 from a juror who averages 8.</li>
              <li>Average those normalized values across the project&apos;s jurors.</li>
              <li>Rescale back onto the same 1–10 scale using the round&apos;s overall average and spread.</li>
              <li>The result is directly comparable to the raw average — same scale, but corrected for grading style.</li>
            </ol>
          </div>

          <div>
            <h3 className="font-semibold mb-1">Worked example</h3>
            <table className="w-full text-xs border-collapse">
              <thead>
                <tr className="border-b">
                  <th className="py-1 text-left">Juror</th>
                  <th className="py-1 text-left">Their typical avg</th>
                  <th className="py-1 text-left">Score for &quot;Project X&quot;</th>
                  <th className="py-1 text-left">What that means</th>
                </tr>
              </thead>
              <tbody>
                <tr className="border-b">
                  <td className="py-1">Juror A (lenient)</td>
                  <td>8.2</td>
                  <td>9.0</td>
                  <td>Just above their typical (+0.4σ)</td>
                </tr>
                <tr className="border-b">
                  <td className="py-1">Juror B (harsh)</td>
                  <td>5.8</td>
                  <td>7.5</td>
                  <td>Well above their typical (+1.5σ)</td>
                </tr>
                <tr>
                  <td className="py-1">Juror C (typical)</td>
                  <td>7.0</td>
                  <td>8.0</td>
                  <td>Slightly above their typical (+0.7σ)</td>
                </tr>
              </tbody>
            </table>
            <p className="mt-2 text-xs text-muted-foreground">
              Raw average: (9.0 + 7.5 + 8.0) / 3 = <strong>8.2</strong>.
              Balanced average rescales each juror&apos;s enthusiasm to the round&apos;s overall scale and lands at
              roughly <strong>8.4</strong> — Juror B&apos;s strong endorsement (well above their harsh baseline)
              carries more weight than the raw 7.5 suggests.
            </p>
          </div>

          <div>
            <h3 className="font-semibold mb-1">When it kicks in</h3>
            <ul className="list-disc pl-5 space-y-1">
              <li>Each juror needs at least 2 evaluations in the round to compute their spread; otherwise that juror falls back to the round-wide average.</li>
              <li>Needs at least one juror with non-zero spread; if every juror gave identical scores, balanced equals raw.</li>
              <li>Computed within a single round only — a juror&apos;s grading style in an intake screening doesn&apos;t affect their balance in a deep evaluation.</li>
            </ul>
          </div>

          <div>
            <h3 className="font-semibold mb-1">Why we still show &quot;Raw&quot;</h3>
            <p>
              Both numbers are always shown so you can sanity-check the correction. The toggle at the top of the
              side panel decides which one is used for ranking.
            </p>
          </div>
        </div>
      </DialogContent>
    </Dialog>
  )
}
```
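As a sanity check on the worked example in the dialog copy, the numbers can be reproduced with a few lines. The z-scores come from the table; the round-wide mean (7.0) and spread (1.6) are assumptions chosen for illustration, since the copy does not state them.

```typescript
// z-scores and raw scores taken from the dialog's worked-example table.
const zScores = [0.4, 1.5, 0.7]
const rawScores = [9.0, 7.5, 8.0]

// Assumed round-wide statistics; the dialog copy does not state them.
const overallMean = 7.0
const overallStddev = 1.6

const mean = (xs: number[]): number => xs.reduce((sum, x) => sum + x, 0) / xs.length

// Steps 3 and 4 of "How it works": average the z-scores, then rescale.
const rawAverage = mean(rawScores)
const balancedAverage = overallMean + mean(zScores) * overallStddev

console.log(rawAverage.toFixed(1), balancedAverage.toFixed(1))
```

Under those assumed round statistics this prints `8.2 8.4`, consistent with the figures quoted in the copy. Different round-wide values would shift the balanced number, so treat 8.4 as illustrative rather than derived from the table alone.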

- [ ] **Step 2: Run typecheck**

Run: `npm run typecheck`
Expected: no errors.

- [ ] **Step 3: Commit**

```bash
git add src/components/shared/score-explainer-dialog.tsx
git commit -m "feat: shared 'How scores are calculated' explainer dialog"
```

---

## Task 12: Wire the explainer into admin + observer surfaces

**Files:**
- Modify: `src/components/admin/round/ranking-dashboard.tsx`
- Modify: `src/components/observer/observer-project-detail.tsx`
- Modify: `src/components/observer/reports/project-preview-dialog.tsx`

- [ ] **Step 1: Mount in admin side sheet**

Inside the side sheet, just below the Avg Score combined card from Task 9 Step 2, add the following (the `import` line belongs with the other imports at the top of the file):

```tsx
import { ScoreExplainerDialog } from '@/components/shared/score-explainer-dialog'
// …
<div className="flex justify-end">
  <ScoreExplainerDialog />
</div>
```

- [ ] **Step 2: Mount in observer full project detail**

Find the stats area in `src/components/observer/observer-project-detail.tsx` (search for "averageGlobalScore" or "Avg Score") and add the same `<ScoreExplainerDialog />` button next to it.

- [ ] **Step 3: Mount in observer reports preview dialog**

Inside `src/components/observer/reports/project-preview-dialog.tsx`, in the Evaluation summary block (around the existing Avg Score card), add the explainer button.

- [ ] **Step 4: Run typecheck + manual smoke test**

Run: `npm run typecheck`
`npm run dev`. Click the "How scores are calculated" button in each of the three locations and confirm the dialog renders.

- [ ] **Step 5: Commit**

```bash
git add src/components/admin/round/ranking-dashboard.tsx src/components/observer/observer-project-detail.tsx src/components/observer/reports/project-preview-dialog.tsx
git commit -m "feat: mount score explainer dialog in admin and observer surfaces"
```

---

## Task 13: Decimal display audit

**Files:**
- Modify: `src/app/(admin)/admin/reports/page.tsx:368`

- [ ] **Step 1: Replace toFixed(2) with toFixed(1)**

Find the line:

```typescript
{p.balancedScore == null ? '-' : p.balancedScore.toFixed(2)}
```

Change to:

```typescript
{p.balancedScore == null ? '-' : p.balancedScore.toFixed(1)}
```

- [ ] **Step 2: Grep for any other 2-decimal score displays**

Run: `grep -rn "toFixed(2)" src/components src/app --include="*.tsx" | grep -iE "balanced|avg|score"`

For any results that show balanced/raw scores, change to `toFixed(1)`. Skip any rate/percentage displays that should stay at 2 decimals.
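If the grep turns up several call sites, one option is to route them through a single formatter so the precision cannot drift again. This is a suggestion only; `formatScore` is a hypothetical helper that does not exist in the codebase as far as this plan states.

```typescript
// Hypothetical shared formatter: one decimal for any score, a placeholder
// for missing or non-numeric values.
function formatScore(value: number | null | undefined, placeholder = '-'): string {
  if (value == null || Number.isNaN(value)) return placeholder
  return value.toFixed(1)
}
```

The reports page line would then read `{formatScore(p.balancedScore)}`, and surfaces that use a different placeholder can pass it as the second argument.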

- [ ] **Step 3: Commit**

```bash
git add <files>
git commit -m "fix: standardize score displays on one decimal"
```

---

## Task 14: Verify list-view delta annotation removal

**Files:**
- (No new modification; verifies Task 9 Step 1 landed.)

- [ ] **Step 1: Grep for any remaining `⇢` characters**

Run: `grep -rn "⇢" src/components --include="*.tsx"`
Expected: no matches.

- [ ] **Step 2: Grep for the now-unused `balancedScore` prop on the row component**

Run: `grep -n "balancedScore" src/components/admin/round/ranking-dashboard.tsx`
Expected: occurrences only inside the side-sheet block, not on the row component's props or render.

If anything remains, remove it.

- [ ] **Step 3: Commit (if changes were made)**

```bash
git add src/components/admin/round/ranking-dashboard.tsx
git commit -m "chore: remove leftover balancedScore plumbing on list row"
```

---

## Task 15: Final verification

- [ ] **Step 1: Run the full test suite**

Run: `npx vitest run`
Expected: PASS.

- [ ] **Step 2: Run typecheck**

Run: `npm run typecheck`
Expected: no errors.

- [ ] **Step 3: Run a production build**

Run: `npm run build`
Expected: build completes successfully.

- [ ] **Step 4: Manual end-to-end smoke**

`npm run dev`. Walk through the spec's acceptance criteria:

1. With 3 round-scoped evaluations of 9, 8, 8, the side panel shows a raw Avg of 8.3 and Evaluators 3.
2. Flipping the toggle re-sorts the list view; persists across reload and across users.
3. List view shows no per-row delta annotation.
4. Side panel shows both Raw and Balanced; active one is highlighted.
5. Edition-mode rankings differ vs. before (compute by hand for one project — should match per-round rollup).
6. Observer project detail page defaults to active or most recently closed round.
7. All score displays show one decimal.
8. "How scores are calculated" opens from admin side panel, observer detail page, and observer preview dialog.
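Criterion 1 can be checked by hand with a sketch like the one below, which also shows what cross-round contamination would do to the numbers. `EvaluationRow` and `summarizeForRound` are hypothetical names, and the sample rows are made up for illustration.

```typescript
interface EvaluationRow {
  roundId: string
  globalScore: number
}

// Averages only the evaluations that belong to the requested round.
function summarizeForRound(
  rows: EvaluationRow[],
  roundId: string,
): { avg: string; evaluators: number } {
  const scores = rows.filter((r) => r.roundId === roundId).map((r) => r.globalScore)
  const total = scores.reduce((sum, s) => sum + s, 0)
  return {
    avg: scores.length > 0 ? (total / scores.length).toFixed(1) : '-',
    evaluators: scores.length,
  }
}

// Illustrative data: three evaluations in round r1, one stray from round r2.
const sampleRows: EvaluationRow[] = [
  { roundId: 'r1', globalScore: 9 },
  { roundId: 'r1', globalScore: 8 },
  { roundId: 'r1', globalScore: 8 },
  { roundId: 'r2', globalScore: 4 },
]
const scoped = summarizeForRound(sampleRows, 'r1')
```

Here `scoped` is `{ avg: '8.3', evaluators: 3 }`. Without the round filter the same rows would average 29 / 4 = 7.25 across 4 evaluators, which is the kind of mismatch the round-scoping fixes are meant to eliminate.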

- [ ] **Step 5: No new commit unless something needed fixing**

If any acceptance criterion fails, create a fix commit. Otherwise nothing to commit here.