# Juror-Balance Toggle + Round-Scoping Fixes — Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Fix cross-round contamination in two analytics procedures (and the three UI surfaces that consume them), add a per-round "use balanced scoring" toggle, replace the list-view delta annotation with a richer side-panel display, and ship a shared "How scores are calculated" explainer dialog.

**Architecture:** Server-side: extend `analytics.getProjectDetail` to accept a roundId and scope its evaluation query; rework `analytics.getProjectRankings` edition mode to compute one z-balance context per round before aggregating; add `useBalancedRanking` to `EvaluationConfigSchema` so it persists in `Round.configJson`. Client-side: pass roundId from each caller; rebuild the admin ranking dashboard side sheet to show both raw and balanced averages, the per-round toggle, per-juror balance contributions, and an affordance opening a shared explainer dialog component (`src/components/shared/score-explainer-dialog.tsx`) reused on observer surfaces.

**Tech Stack:** Next.js 15 App Router, tRPC 11 with Zod, Prisma 6, Vitest 4 (file-parallelism off, forks pool), shadcn/ui (Dialog + Switch primitives already available), TypeScript strict.

**Spec:** `docs/superpowers/specs/2026-04-27-juror-balance-toggle-and-round-scoping-design.md`

---

## File Structure

### Server

| File | Responsibility |
|---|---|
| `src/server/routers/analytics.ts` | Modify `getProjectDetail` (Task 1) + `getProjectRankings` (Task 2). |
| `src/server/services/juror-balance.ts` | Add a small helper `computePerRoundBalanced(points)` consumed by `getProjectRankings` edition mode (Task 2). Keep existing functions untouched. |
| `src/types/competition-configs.ts` | Add `useBalancedRanking: z.boolean().default(true)` to `EvaluationConfigSchema` (Task 6). |

### Client

| File | Responsibility |
|---|---|
| `src/components/shared/score-explainer-dialog.tsx` | NEW. Reusable explainer dialog (Task 11). |
| `src/components/admin/round/ranking-dashboard.tsx` | Wire roundId through to `getProjectDetail`; rebuild side-sheet stats area; add toggle row; per-juror chips; remove list-row delta annotation; mount the explainer dialog (Tasks 3, 7, 8, 9, 10, 12, 14). |
| `src/components/observer/observer-project-detail.tsx` | Resolve default round and pass roundId; mount explainer dialog affordance (Tasks 5, 12). |
| `src/components/observer/reports/project-preview-dialog.tsx` | Accept and pass `roundId` prop; mount explainer affordance (Tasks 4, 12). |
| `src/app/(observer)/observer/projects/[projectId]/page.tsx` | Read `?round=` query param and pass it to `ObserverProjectDetail` (Task 5). |
| `src/app/(admin)/admin/reports/page.tsx` | Decimal audit fix `toFixed(2)` → `toFixed(1)` (Task 13). |

### Tests

| File | Responsibility |
|---|---|
| `tests/unit/juror-balance-round-scoping.test.ts` | NEW. Vitest cases for getProjectDetail roundId filtering and getProjectRankings per-round z-context (Tasks 1, 2). |
| `tests/unit/round-config-balance-toggle.test.ts` | NEW. Vitest case for persisting `useBalancedRanking` via `round.update` (Task 6). |

---

## Test setup notes (for the implementer)

Vitest 4 is the framework; tests run sequentially (`fileParallelism: false`, `pool: 'forks'`). Use the helpers in `tests/helpers.ts` (`createTestUser`, `createTestProgram`, `createTestCompetition`, `createTestRound`, `createTestProject`, `createTestProjectRoundState`, `createTestAssignment`, `createTestEvaluation`, `createTestEvaluationForm`, `cleanupTestData`) and `createCaller(routerModule, user)` from `tests/setup.ts`. Always call `cleanupTestData(programId, userIds)` in `afterAll`.

Run a single test file with: `npx vitest run tests/unit/.test.ts`. Run a single test by name with: `npx vitest run -t ''`.
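
A note on the arithmetic that the Task 2 test relies on. The sketch below is a minimal, self-contained stand-in for juror balancing, not the code in `src/server/services/juror-balance.ts`. It assumes population standard deviation, a fallback to the round-wide distribution when a juror's own spread is zero, and a balanced average of `roundMean + meanZ * roundStd`. The names `Point`, `push`, `balanceRound`, and `balanceEdition` are hypothetical. It may be useful for sanity-checking hand-computed expectations before writing assertions.

```typescript
// Hypothetical stand-in for the real ScorePoint type in juror-balance.ts.
type Point = { projectId: string; userId: string; roundId: string; rawScore: number }

const mean = (xs: number[]): number => xs.reduce((a, b) => a + b, 0) / xs.length

// Population standard deviation (divide by n). Assumption: the real service may differ.
const stddev = (xs: number[]): number => {
  const m = mean(xs)
  return Math.sqrt(mean(xs.map((x) => (x - m) * (x - m))))
}

// Appends a value to the array stored under `key`, creating it if needed.
function push(groups: Record<string, number[]>, key: string, value: number): void {
  if (!groups[key]) groups[key] = []
  groups[key].push(value)
}

// Balanced average per project, using ONE round's points as the z-context.
function balanceRound(points: Point[]): Record<string, number> {
  const all = points.map((p) => p.rawScore)
  const roundMean = mean(all)
  const roundStd = stddev(all)

  const byJuror: Record<string, number[]> = {}
  for (const p of points) push(byJuror, p.userId, p.rawScore)

  const zByProject: Record<string, number[]> = {}
  for (const p of points) {
    const own = byJuror[p.userId]
    const ownStd = stddev(own)
    let z = 0
    if (ownStd > 0) z = (p.rawScore - mean(own)) / ownStd
    // Assumed fallback: a juror with zero spread is scored against the whole round.
    else if (roundStd > 0) z = (p.rawScore - roundMean) / roundStd
    push(zByProject, p.projectId, z)
  }

  const out: Record<string, number> = {}
  for (const projectId of Object.keys(zByProject)) {
    out[projectId] = roundMean + mean(zByProject[projectId]) * roundStd
  }
  return out
}

// Edition rollup: balance each round separately, then average per project.
function balanceEdition(points: Point[]): Record<string, number> {
  const byRound: Record<string, Point[]> = {}
  for (const p of points) {
    if (!byRound[p.roundId]) byRound[p.roundId] = []
    byRound[p.roundId].push(p)
  }
  const perProject: Record<string, number[]> = {}
  for (const roundId of Object.keys(byRound)) {
    const balanced = balanceRound(byRound[roundId])
    for (const projectId of Object.keys(balanced)) push(perProject, projectId, balanced[projectId])
  }
  const out: Record<string, number> = {}
  for (const projectId of Object.keys(perProject)) out[projectId] = mean(perProject[projectId])
  return out
}

// Round B scores from the Task 2 test: lenient gives 8/8, harsh gives 7/5.
const roundB: Point[] = [
  { projectId: 'X', userId: 'lenient', roundId: 'B', rawScore: 8 },
  { projectId: 'Y', userId: 'lenient', roundId: 'B', rawScore: 8 },
  { projectId: 'X', userId: 'harsh', roundId: 'B', rawScore: 7 },
  { projectId: 'Y', userId: 'harsh', roundId: 'B', rawScore: 5 },
]
const roundBBalanced = balanceRound(roundB)
console.log(roundBBalanced)
```

Under these assumptions, the Round B scores give roughly 8.11 for X and 6.89 for Y. `balanceEdition` applies the same per-round computation to every round and averages the results, which is the rollup shape Task 2 asks for.
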

---

## Task 1: Round-scope `analytics.getProjectDetail`

**Files:**
- Modify: `src/server/routers/analytics.ts:1370-1464`
- Create: `tests/unit/juror-balance-round-scoping.test.ts`

- [ ] **Step 1: Write the failing test (round filtering)**

Create `tests/unit/juror-balance-round-scoping.test.ts`:

```typescript
import { afterAll, beforeAll, describe, expect, it } from 'vitest'
import { prisma, createCaller } from '../setup'
import {
  createTestUser,
  createTestProgram,
  createTestCompetition,
  createTestRound,
  createTestProject,
  createTestProjectRoundState,
  createTestAssignment,
  createTestEvaluation,
  createTestEvaluationForm,
  cleanupTestData,
  uid,
} from '../helpers'
import { analyticsRouter } from '../../src/server/routers/analytics'

describe('analytics.getProjectDetail round scoping', () => {
  let programId: string
  let admin: { id: string; email: string; role: 'SUPER_ADMIN' }
  let projectId: string
  let roundAId: string
  let roundBId: string
  const userIds: string[] = []

  beforeAll(async () => {
    const program = await createTestProgram({ name: `bal-scope-${uid()}` })
    programId = program.id
    const competition = await createTestCompetition(programId)
    const roundA = await createTestRound(competition.id, { name: 'Round A', sortOrder: 0, status: 'ROUND_CLOSED' })
    const roundB = await createTestRound(competition.id, { name: 'Round B', sortOrder: 1, status: 'ROUND_ACTIVE' })
    roundAId = roundA.id
    roundBId = roundB.id
    const formA = await createTestEvaluationForm(roundA.id)
    const formB = await createTestEvaluationForm(roundB.id)
    const project = await createTestProject(programId)
    projectId = project.id
    await createTestProjectRoundState(projectId, roundA.id, { state: 'PASSED' })
    await createTestProjectRoundState(projectId, roundB.id, { state: 'IN_PROGRESS' })

    // 2 evaluations on Round A: 7.0, 8.0 (mean 7.5)
    for (const score of [7, 8]) {
      const juror = await createTestUser('JURY_MEMBER')
      userIds.push(juror.id)
      const a = await createTestAssignment(juror.id, projectId, roundA.id)
      await
        createTestEvaluation(a.id, formA.id, { status: 'SUBMITTED', globalScore: score, submittedAt: new Date() })
    }

    // 3 evaluations on Round B: 9.0, 8.0, 8.0 (mean 8.333…)
    for (const score of [9, 8, 8]) {
      const juror = await createTestUser('JURY_MEMBER')
      userIds.push(juror.id)
      const a = await createTestAssignment(juror.id, projectId, roundB.id)
      await createTestEvaluation(a.id, formB.id, { status: 'SUBMITTED', globalScore: score, submittedAt: new Date() })
    }

    const adminUser = await createTestUser('SUPER_ADMIN')
    userIds.push(adminUser.id)
    admin = { id: adminUser.id, email: adminUser.email, role: 'SUPER_ADMIN' }
  })

  afterAll(async () => {
    await cleanupTestData(programId, userIds)
  })

  it('returns only round-B stats when roundId=roundB is passed', async () => {
    const caller = createCaller(analyticsRouter, admin)
    const result = await caller.getProjectDetail({ id: projectId, roundId: roundBId })
    expect(result.stats).not.toBeNull()
    expect(result.stats!.totalEvaluations).toBe(3)
    expect(result.stats!.averageGlobalScore).toBeCloseTo(8.333, 2)
  })
})
```

- [ ] **Step 2: Run test to verify it fails**

Run: `npx vitest run tests/unit/juror-balance-round-scoping.test.ts -t 'returns only round-B stats'`

Expected: FAIL. The current procedure ignores `roundId` and returns 5 evaluations averaging 8.0. Note that `roundId` is not yet declared in the Zod input schema: a default `z.object` strips undeclared keys (a `.strict()` schema would reject them instead), so the value never reaches the query either way.

- [ ] **Step 3: Add `roundId` to the input schema and scope the query**

Modify `src/server/routers/analytics.ts` around line 1371.
Replace the input definition and the `submittedEvaluations` query:

```typescript
getProjectDetail: observerProcedure
  .input(z.object({ id: z.string(), roundId: z.string().optional() }))
  .query(async ({ ctx, input }) => {
    const [projectRaw, projectTags, assignments, submittedEvaluations] = await Promise.all([
      ctx.prisma.project.findUniqueOrThrow({
        where: { id: input.id },
        include: {
          files: {
            select: {
              id: true, fileName: true, fileType: true, mimeType: true, size: true,
              bucket: true, objectKey: true,
              pageCount: true, textPreview: true, detectedLang: true, langConfidence: true, analyzedAt: true,
              roundId: true,
              requirementId: true,
              requirement: { select: { id: true, name: true, description: true, isRequired: true } },
            },
            orderBy: [{ createdAt: 'asc' }],
          },
          teamMembers: {
            include: {
              user: {
                select: { id: true, name: true, email: true, profileImageKey: true, profileImageProvider: true },
              },
            },
            orderBy: { joinedAt: 'asc' },
          },
        },
      }),
      ctx.prisma.projectTag.findMany({
        where: { projectId: input.id },
        include: { tag: { select: { id: true, name: true, category: true, color: true } } },
        orderBy: { confidence: 'desc' },
      }).catch(() => [] as { id: string; projectId: string; tagId: string; confidence: number; tag: { id: string; name: string; category: string | null; color: string | null } }[]),
      ctx.prisma.assignment.findMany({
        where: { projectId: input.id },
        include: {
          user: { select: { id: true, name: true, email: true, profileImageKey: true, profileImageProvider: true } },
          round: { select: { id: true, name: true } },
          evaluation: {
            select: {
              id: true,
              status: true,
              submittedAt: true,
              globalScore: true,
              binaryDecision: true,
              criterionScoresJson: true,
              feedbackText: true,
            },
          },
        },
        orderBy: { createdAt: 'desc' },
      }),
      ctx.prisma.evaluation.findMany({
        where: {
          status: 'SUBMITTED',
          assignment: {
            projectId: input.id,
            // Only constrain by round when the caller supplied one.
            ...(input.roundId ? { roundId: input.roundId } : {}),
          },
        },
      }),
    ])
```

Leave the rest of the procedure body untouched.
The `stats = null` fallback (when no submitted evaluations match) already does the right thing.

- [ ] **Step 4: Run test to verify it passes**

Run: `npx vitest run tests/unit/juror-balance-round-scoping.test.ts -t 'returns only round-B stats'`

Expected: PASS.

- [ ] **Step 5: Add a second test asserting unfiltered behavior is preserved**

Append to the same `describe` block:

```typescript
  it('returns aggregated stats across all rounds when roundId is omitted', async () => {
    const caller = createCaller(analyticsRouter, admin)
    const result = await caller.getProjectDetail({ id: projectId })
    expect(result.stats!.totalEvaluations).toBe(5)
  })
```

Run: `npx vitest run tests/unit/juror-balance-round-scoping.test.ts -t 'aggregated stats across all rounds'`

Expected: PASS (no further code change needed; the `?:` fallback handles this).

- [ ] **Step 6: Commit**

```bash
git add src/server/routers/analytics.ts tests/unit/juror-balance-round-scoping.test.ts
git commit -m "fix: scope analytics.getProjectDetail by optional roundId"
```

---

## Task 2: Per-round z-context in `analytics.getProjectRankings` edition mode

**Files:**
- Modify: `src/server/routers/analytics.ts:199-258`
- Modify: `src/server/services/juror-balance.ts` (add `computePerRoundBalanced` helper)
- Modify: `tests/unit/juror-balance-round-scoping.test.ts` (append cases)

- [ ] **Step 1: Write the failing test (edition-mode per-round grouping)**

Append to `tests/unit/juror-balance-round-scoping.test.ts`:

```typescript
describe('analytics.getProjectRankings per-round z-context (edition mode)', () => {
  let programId: string
  let admin: { id: string; email: string; role: 'SUPER_ADMIN' }
  let projectXId: string
  let projectYId: string
  const userIds: string[] = []

  beforeAll(async () => {
    const program = await createTestProgram({ name: `rank-edition-${uid()}` })
    programId = program.id
    const competition = await createTestCompetition(programId)
    const roundA = await createTestRound(competition.id, { name: 'A', sortOrder: 0 })
    const roundB = await createTestRound(competition.id, { name: 'B', sortOrder: 1 })
    const formA = await createTestEvaluationForm(roundA.id, [
      { id: 'c1', label: 'X', scale: '1-10', weight: 1 },
    ])
    const formB = await createTestEvaluationForm(roundB.id, [
      { id: 'c1', label: 'X', scale: '1-10', weight: 1 },
    ])
    const projX = await createTestProject(programId, { title: 'X' })
    const projY = await createTestProject(programId, { title: 'Y' })
    projectXId = projX.id
    projectYId = projY.id
    await createTestProjectRoundState(projX.id, roundA.id)
    await createTestProjectRoundState(projY.id, roundA.id)
    await createTestProjectRoundState(projX.id, roundB.id)
    await createTestProjectRoundState(projY.id, roundB.id)

    // Round A: a "lenient" juror grades 9 on X, 9 on Y. A "harsh" juror grades 6 on X, 4 on Y.
    // Mixing A+B produces a misleading single z-context. Per-round contexts:
    //   - In Round A: lenient mean=9 stddev=0 (fallback), harsh mean=5 stddev=1
    //   - In Round B: same two jurors, different scores (lenient 8/8, harsh 7/5), separate context
    const lenient = await createTestUser('JURY_MEMBER')
    const harsh = await createTestUser('JURY_MEMBER')
    userIds.push(lenient.id, harsh.id)

    const writeEval = async (jurorId: string, projId: string, roundId: string, formId: string, c1: number) => {
      const a = await createTestAssignment(jurorId, projId, roundId)
      await prisma.evaluation.create({
        data: {
          assignmentId: a.id,
          formId,
          status: 'SUBMITTED',
          submittedAt: new Date(),
          criterionScoresJson: { c1 },
        },
      })
    }

    // Round A
    await writeEval(lenient.id, projX.id, roundA.id, formA.id, 9)
    await writeEval(lenient.id, projY.id, roundA.id, formA.id, 9)
    await writeEval(harsh.id, projX.id, roundA.id, formA.id, 6)
    await writeEval(harsh.id, projY.id, roundA.id, formA.id, 4)

    // Round B (different scoring profile so cross-round pooling skews things)
    await writeEval(lenient.id, projX.id, roundB.id, formB.id, 8)
    await writeEval(lenient.id, projY.id, roundB.id, formB.id, 8)
    await writeEval(harsh.id, projX.id, roundB.id, formB.id, 7)
    await writeEval(harsh.id,
      projY.id, roundB.id, formB.id, 5)

    const adminUser = await createTestUser('SUPER_ADMIN')
    userIds.push(adminUser.id)
    admin = { id: adminUser.id, email: adminUser.email, role: 'SUPER_ADMIN' }
  })

  afterAll(async () => {
    await cleanupTestData(programId, userIds)
  })

  it('aggregates per-project balanced score as the mean of per-round balanced averages', async () => {
    const caller = createCaller(analyticsRouter, admin)
    const result = await caller.getProjectRankings({ programId })
    const x = result.find((p) => p.id === projectXId)!
    const y = result.find((p) => p.id === projectYId)!

    // Hand-computed expected per-round balanced averages (population stddev throughout):
    // Round A: lenient mean=9 stddev=0 (fallback to overall), harsh mean=5 stddev=1.
    //   Round A overall mean=7, stddev=sqrt(4.5)=2.1213 (scores 9, 9, 6, 4)
    //   X: lenient z=fallback (9-7)/2.1213=+0.9428, harsh z=(6-5)/1=+1.0 → avg z=0.9714
    //      X balanced = 7 + 0.9714*2.1213 ≈ 9.06
    //   Y: lenient z=fallback (9-7)/2.1213=+0.9428, harsh z=(4-5)/1=-1.0 → avg z=-0.0286
    //      Y balanced = 7 - 0.0286*2.1213 ≈ 6.94
    // Round B: lenient mean=8 stddev=0 (fallback), harsh mean=6 stddev=1.
    //   Round B overall mean=7, stddev=sqrt(1.5)=1.2247 (scores 8, 8, 7, 5)
    //   X: lenient z=(8-7)/1.2247=+0.8165, harsh z=(7-6)/1=+1.0 → avg z=0.9082
    //      X balanced = 7 + 0.9082*1.2247 ≈ 8.11
    //   Y: lenient z=(8-7)/1.2247=+0.8165, harsh z=(5-6)/1=-1.0 → avg z=-0.0918
    //      Y balanced = 7 - 0.0918*1.2247 ≈ 6.89
    // Project-level edition rollup = mean of per-round balanced averages:
    //   X ≈ (9.06 + 8.11)/2 ≈ 8.59
    //   Y ≈ (6.94 + 6.89)/2 ≈ 6.91
    expect(x.balancedScore!).toBeCloseTo(8.59, 1)
    expect(y.balancedScore!).toBeCloseTo(6.91, 1)
  })
})
```

- [ ] **Step 2: Run test to verify it fails**

Run: `npx vitest run tests/unit/juror-balance-round-scoping.test.ts -t 'mean of per-round balanced averages'`

Expected: FAIL. The current code pools all 8 evaluations into one z-context.

- [ ] **Step 3: Add helper `computePerRoundBalanced` to juror-balance service**

Modify `src/server/services/juror-balance.ts`.
Append at the end:

```typescript
export type RoundScopedScorePoint = ScorePoint & { roundId: string }

export type EditionRollupResult = {
  projectId: string
  rawAverage: number | null
  balancedAverage: number | null
  count: number
  roundCount: number
}

/**
 * Per-round balanced rollup: groups points by roundId, computes a balance
 * context per round, then averages the per-round balanced averages for each
 * project. Use when surfacing edition-level rankings. Never pool z-contexts
 * across rounds, because a juror's grading profile differs by round type.
 */
export function computePerRoundBalanced(
  points: RoundScopedScorePoint[],
): Map<string, EditionRollupResult> {
  // 1. Bucket the points by round, dropping roundId from each point.
  const byRound = new Map<string, ScorePoint[]>()
  for (const p of points) {
    const arr = byRound.get(p.roundId) ?? []
    arr.push({ projectId: p.projectId, userId: p.userId, rawScore: p.rawScore })
    byRound.set(p.roundId, arr)
  }

  // 2. Compute one balance context and one result map per round.
  const perRoundResults: Array<ReturnType<typeof computeBalancedProjectScores>> = []
  for (const roundPoints of byRound.values()) {
    const ctx = computeBalanceContext(roundPoints)
    perRoundResults.push(computeBalancedProjectScores(roundPoints, ctx))
  }

  // 3. Accumulate per-project sums across rounds.
  const accumulator = new Map<
    string,
    { rawSum: number; rawCount: number; balancedSum: number; balancedCount: number; count: number; roundCount: number }
  >()
  for (const roundMap of perRoundResults) {
    for (const [projectId, result] of roundMap.entries()) {
      const acc = accumulator.get(projectId) ?? {
        rawSum: 0,
        rawCount: 0,
        balancedSum: 0,
        balancedCount: 0,
        count: 0,
        roundCount: 0,
      }
      if (result.rawAverage != null) {
        acc.rawSum += result.rawAverage
        acc.rawCount += 1
      }
      if (result.balancedAverage != null) {
        acc.balancedSum += result.balancedAverage
        acc.balancedCount += 1
      }
      acc.count += result.count
      acc.roundCount += 1
      accumulator.set(projectId, acc)
    }
  }

  // 4. Average the per-round averages for each project.
  const out = new Map<string, EditionRollupResult>()
  for (const [projectId, acc] of accumulator.entries()) {
    out.set(projectId, {
      projectId,
      rawAverage: acc.rawCount > 0 ? acc.rawSum / acc.rawCount : null,
      balancedAverage: acc.balancedCount > 0 ?
        acc.balancedSum / acc.balancedCount : null,
      count: acc.count,
      roundCount: acc.roundCount,
    })
  }
  return out
}
```

- [ ] **Step 4: Update `getProjectRankings` to branch on roundId vs programId**

Modify `src/server/routers/analytics.ts`. Replace the imports near line 9:

```typescript
import {
  computeBalanceContext,
  computeBalancedProjectScores,
  computePerRoundBalanced,
  type ScorePoint,
  type RoundScopedScorePoint,
} from '../services/juror-balance'
```

Then replace the `getProjectRankings` body (lines 199-258) with:

```typescript
getProjectRankings: observerProcedure
  .input(editionOrRoundInput.and(z.object({ limit: z.number().optional() })))
  .query(async ({ ctx, input }) => {
    const [projects, evaluations] = await Promise.all([
      ctx.prisma.project.findMany({
        where: projectWhere(input),
        select: { id: true, title: true, teamName: true, status: true },
      }),
      ctx.prisma.evaluation.findMany({
        where: evalWhere(input, { status: 'SUBMITTED' }),
        select: {
          criterionScoresJson: true,
          assignment: { select: { userId: true, projectId: true, roundId: true } },
        },
      }),
    ])

    // One point per evaluation: the unweighted mean of its numeric criterion scores.
    const rawPoints: RoundScopedScorePoint[] = []
    for (const e of evaluations) {
      const scores = e.criterionScoresJson as Record<string, unknown> | null
      if (!scores) continue
      const vals = Object.values(scores).filter((s): s is number => typeof s === 'number')
      if (vals.length === 0) continue
      const rawScore = vals.reduce((a, b) => a + b, 0) / vals.length
      rawPoints.push({
        projectId: e.assignment.projectId,
        userId: e.assignment.userId,
        roundId: e.assignment.roundId,
        rawScore,
      })
    }

    // roundId mode: single-round z-context (existing behavior)
    // programId mode: per-round z-contexts aggregated as the mean of per-round balanced averages
    // The value type lists only the fields read below; both branches are assumed to satisfy it.
    const balancedByProject: Map<
      string,
      { rawAverage: number | null; balancedAverage: number | null; count: number }
    > = (() => {
      if (input.roundId) {
        const flat: ScorePoint[] = rawPoints.map(({ projectId, userId, rawScore }) => ({ projectId, userId, rawScore }))
        const ctx = computeBalanceContext(flat)
        const out = computeBalancedProjectScores(flat, ctx)
        return out
      }
      return computePerRoundBalanced(rawPoints)
    })()

    const rankings = projects
      .map((project) => {
        const result = balancedByProject.get(project.id)
        return {
          id: project.id,
          title: project.title,
          teamName: project.teamName,
          status: project.status,
          averageScore: result?.rawAverage ?? null,
          balancedScore: result?.balancedAverage ?? null,
          evaluationCount: result?.count ?? 0,
        }
      })
      // Highest score first; projects without any score sink to the bottom.
      .sort((a, b) => {
        const aScore = a.balancedScore ?? a.averageScore
        const bScore = b.balancedScore ?? b.averageScore
        if (aScore !== null && bScore !== null) return bScore - aScore
        if (aScore !== null) return -1
        if (bScore !== null) return 1
        return 0
      })

    return input.limit ? rankings.slice(0, input.limit) : rankings
  }),
```

- [ ] **Step 5: Run the test suite**

Run: `npx vitest run tests/unit/juror-balance-round-scoping.test.ts`

Expected: all 3 tests PASS.

- [ ] **Step 6: Run typecheck**

Run: `npm run typecheck`

Expected: no errors.

- [ ] **Step 7: Commit**

```bash
git add src/server/routers/analytics.ts src/server/services/juror-balance.ts tests/unit/juror-balance-round-scoping.test.ts
git commit -m "fix: compute z-context per-round in edition-mode rankings rollup"
```

---

## Task 3: Pass `roundId` from admin ranking dashboard side sheet

**Files:**
- Modify: `src/components/admin/round/ranking-dashboard.tsx` (around the existing `getProjectDetail.useQuery` call)

- [ ] **Step 1: Locate the existing useQuery call**

Open `src/components/admin/round/ranking-dashboard.tsx` and find where `selectedProjectId` drives `analytics.getProjectDetail`. The component already has `roundId` in scope (it's the dashboard's own prop / state).

Run: `grep -n "getProjectDetail\.useQuery" src/components/admin/round/ranking-dashboard.tsx`

- [ ] **Step 2: Add roundId to the input**

Edit the call to include `roundId`:

```typescript
const { data: projectDetail, isLoading: detailLoading } = trpc.analytics.getProjectDetail.useQuery(
  { id: selectedProjectId!, roundId },
  { enabled: !!selectedProjectId },
)
```

(Confirm the existing `enabled` guard and any other existing options stay intact.)

- [ ] **Step 3: Manual smoke test**

Start the dev server: `npm run dev`

Navigate to the admin ranking dashboard for a round where a project has had evaluations in earlier rounds. Click a project. Confirm:
- The "Evaluators" stat in the side sheet matches the count in the per-juror list below.
- "Avg Score" reflects only the current round's scores (one decimal).

- [ ] **Step 4: Commit**

```bash
git add src/components/admin/round/ranking-dashboard.tsx
git commit -m "fix: scope admin ranking side-sheet stats to current round"
```

---

## Task 4: Pass `roundId` to observer reports preview dialog

**Files:**
- Modify: `src/components/observer/reports/project-preview-dialog.tsx`
- Modify: caller(s) of `ProjectPreviewDialog` (find via grep)

- [ ] **Step 1: Find callers**

Run: `grep -rn "ProjectPreviewDialog" src --include="*.tsx"`

The observer reports page already tracks the active round (the round selector lives on the reports page itself per recent commit `2e080a5`). Capture the active roundId from the caller and thread it through.

- [ ] **Step 2: Add `roundId?: string` to the props and useQuery**

Edit `src/components/observer/reports/project-preview-dialog.tsx`:

```typescript
interface ProjectPreviewDialogProps {
  projectId: string | null
  roundId?: string
  open: boolean
  onOpenChange: (open: boolean) => void
}

export function ProjectPreviewDialog({ projectId, roundId, open, onOpenChange }: ProjectPreviewDialogProps) {
  const { data, isLoading } = trpc.analytics.getProjectDetail.useQuery(
    { id: projectId!, roundId },
    // Only fetch while the dialog is open and a project is selected.
    { enabled: !!projectId && open },
  )
  // …existing render…
}
```

- [ ] **Step 3: Update each caller to pass `roundId`**

For each call site identified in Step 1, pass the roundId from the page's existing state. The observer reports page (per recent commit `2e080a5`) already lifts a round selector to the top of the page; find that state and thread it through. If a caller has no round in scope and is not the observer reports page, leave the prop omitted (the procedure's optional roundId will fall back to aggregate stats and the dialog will still render correctly).

- [ ] **Step 4: Run typecheck**

Run: `npm run typecheck`

Expected: no errors.

- [ ] **Step 5: Manual smoke test**

`npm run dev`. From the observer reports page, open a project preview. Confirm Avg Score / Evaluators match the round selector at the top of the page.

- [ ] **Step 6: Commit**

```bash
git add src/components/observer/reports/project-preview-dialog.tsx
git commit -m "fix: scope observer reports preview dialog to selected round"
```

---

## Task 5: Resolve default round on observer full project page

**Files:**
- Modify: `src/app/(observer)/observer/projects/[projectId]/page.tsx`
- Modify: `src/components/observer/observer-project-detail.tsx`

- [ ] **Step 1: Read the existing page wrapper**

Run: `cat src/app/\(observer\)/observer/projects/\[projectId\]/page.tsx`

It currently renders `ObserverProjectDetail` without round context.
- [ ] **Step 2: Read `?round=` from search params and resolve default** Replace the page body: ```tsx import { ObserverProjectDetail } from '@/components/observer/observer-project-detail' export default async function ObserverProjectDetailPage({ params, searchParams, }: { params: Promise<{ projectId: string }> searchParams: Promise<{ round?: string }> }) { const { projectId } = await params const sp = await searchParams return } ``` - [ ] **Step 3: Modify `ObserverProjectDetail` to resolve the default** In `src/components/observer/observer-project-detail.tsx`, update the props and resolve the default round: ```typescript export function ObserverProjectDetail({ projectId, initialRoundId }: { projectId: string; initialRoundId?: string }) { const [activeRoundId, setActiveRoundId] = useState(initialRoundId) // Round resolution: ROUND_ACTIVE first, else most-recent ROUND_CLOSED const { data: roundCandidates } = trpc.analytics.getProjectRoundsForObserver.useQuery( { projectId }, { enabled: !activeRoundId }, ) useEffect(() => { if (activeRoundId || !roundCandidates) return const active = roundCandidates.find((r) => r.status === 'ROUND_ACTIVE') if (active) { setActiveRoundId(active.id) return } const closed = [...roundCandidates] .filter((r) => r.status === 'ROUND_CLOSED') .sort((a, b) => b.sortOrder - a.sortOrder)[0] if (closed) setActiveRoundId(closed.id) }, [roundCandidates, activeRoundId]) const { data, isLoading } = trpc.analytics.getProjectDetail.useQuery( { id: projectId, roundId: activeRoundId }, { refetchInterval: 30_000, enabled: !!projectId }, ) // …rest of component, with a small setActiveRoundId(e.target.value)} > {roundCandidates.map((r) => ( ))} )} ``` - [ ] **Step 6: Run typecheck** Run: `npm run typecheck` Expected: no errors. - [ ] **Step 7: Manual smoke test** `npm run dev`. Open an observer project page for a project that has been in multiple rounds. 
Confirm:
- With no `?round=`, the page defaults to the active round if any, otherwise the most recently closed.
- The selector chip is visible when there are ≥2 candidate rounds and switches the stats correctly.
- "Evaluators" matches the per-juror count.

- [ ] **Step 8: Commit**

```bash
git add src/app/\(observer\)/observer/projects/\[projectId\]/page.tsx src/components/observer/observer-project-detail.tsx src/server/routers/analytics.ts
git commit -m "feat: resolve observer project page round default + selector"
```

---

## Task 6: Add `useBalancedRanking` to `EvaluationConfigSchema`

**Files:**
- Modify: `src/types/competition-configs.ts:90-156`
- Create: `tests/unit/round-config-balance-toggle.test.ts`

- [ ] **Step 1: Write the failing test**

Create `tests/unit/round-config-balance-toggle.test.ts`:

```typescript
import { afterAll, beforeAll, describe, expect, it } from 'vitest'
import { prisma, createCaller } from '../setup'
import {
  createTestUser,
  createTestProgram,
  createTestCompetition,
  createTestRound,
  cleanupTestData,
  uid,
} from '../helpers'
import { roundRouter } from '../../src/server/routers/round'

describe('Round.configJson.useBalancedRanking', () => {
  let programId: string
  let admin: { id: string; email: string; role: 'SUPER_ADMIN' }
  const userIds: string[] = []

  beforeAll(async () => {
    const program = await createTestProgram({ name: `bal-toggle-${uid()}` })
    programId = program.id
    const adminUser = await createTestUser('SUPER_ADMIN')
    userIds.push(adminUser.id)
    admin = { id: adminUser.id, email: adminUser.email, role: 'SUPER_ADMIN' }
  })

  afterAll(async () => {
    await cleanupTestData(programId, userIds)
  })

  it('persists useBalancedRanking via round.update', async () => {
    const competition = await createTestCompetition(programId)
    const round = await createTestRound(competition.id)
    const caller = createCaller(roundRouter, admin)
    await caller.update({
      id: round.id,
      configJson: { useBalancedRanking: false },
    })
    const reloaded = await
      prisma.round.findUniqueOrThrow({ where: { id: round.id } })
    expect((reloaded.configJson as Record<string, unknown>).useBalancedRanking).toBe(false)
  })
})
```

- [ ] **Step 2: Run test to verify it fails or passes**

Run: `npx vitest run tests/unit/round-config-balance-toggle.test.ts`

If `round.update` accepts arbitrary configJson today (passthrough), the test may already pass. That's fine; the test still pins the behavior. If `round.update` schema-validates configJson against `EvaluationConfigSchema` and rejects or strips unknown keys, the test FAILS until the field is added.

- [ ] **Step 3: Add `useBalancedRanking` to the schema**

Modify `src/types/competition-configs.ts`. Inside `EvaluationConfigSchema` (right above `// Ranking (Phase 3)` near line 153), add:

```typescript
  // Whether the ranking dashboard ranks projects by juror-balanced (z-normalized) average.
  // Defaulting to true preserves existing behavior. Toggled per-round via the dashboard side panel.
  useBalancedRanking: z.boolean().default(true),
```

- [ ] **Step 4: Run test to verify it passes**

Run: `npx vitest run tests/unit/round-config-balance-toggle.test.ts`

Expected: PASS.

- [ ] **Step 5: Run typecheck**

Run: `npm run typecheck`

Expected: no errors.

- [ ] **Step 6: Commit**

```bash
git add src/types/competition-configs.ts tests/unit/round-config-balance-toggle.test.ts
git commit -m "feat: add useBalancedRanking flag to round config schema"
```

---

## Task 7: Wire toggle UI into ranking dashboard side sheet

**Files:**
- Modify: `src/components/admin/round/ranking-dashboard.tsx`

- [ ] **Step 1: Read the dashboard's existing config-save plumbing**

Run: `grep -n "saveRankingConfig\|updateRoundMutation\|roundData?.configJson" src/components/admin/round/ranking-dashboard.tsx`

The component already loads `roundData.configJson` (~line 475-484) and saves via `updateRoundMutation` (`trpc.round.update`). Reuse the same plumbing.

- [ ] **Step 2: Add local state + initialization**

Near the other `useState` calls for local weights, add:

```typescript
const [useBalanced, setUseBalanced] = useState(true)
```

In the existing `useEffect` that initializes from `roundData.configJson`, add:

```typescript
setUseBalanced((cfg.useBalancedRanking as boolean | undefined) ?? true)
```

Place this line outside the `weightsInitialized.current` guard, so the toggle rehydrates every time `roundData` refetches while the guard still protects other in-flight edits from being reset. Concretely:

```typescript
useEffect(() => {
  if (!roundData?.configJson) return
  const cfg = roundData.configJson as Record<string, unknown>
  // Rehydrate the toggle on every refetch (deliberately before the guard).
  setUseBalanced((cfg.useBalancedRanking as boolean | undefined) ?? true)
  // Everything below runs once, so local weight edits are not overwritten.
  if (weightsInitialized.current) return
  const saved = (cfg.criteriaWeights ?? {}) as Record<string, number>
  setLocalWeights(saved)
  setLocalCriteriaText((cfg.rankingCriteria as string) ?? '')
  setLocalScoreWeight((cfg.scoreWeight as number) ?? 5)
  setLocalPassRateWeight((cfg.passRateWeight as number) ?? 5)
  weightsInitialized.current = true
}, [roundData])
```

- [ ] **Step 3: Add a toggle handler that persists immediately**

Add next to `saveRankingConfig`:

```typescript
const persistUseBalanced = (next: boolean) => {
  setUseBalanced(next)
  if (!roundData?.configJson) return
  const cfg = roundData.configJson as Record<string, unknown>
  // Spread the existing config so the other keys are preserved.
  updateRoundMutation.mutate({
    id: roundId,
    configJson: { ...cfg, useBalancedRanking: next },
  })
}
```

- [ ] **Step 4: Render a toggle row at the top of the side sheet**

Inside the side-sheet block, just below the header (right above the stats grid at line ~1004), insert:

```tsx
Use balanced scoring for ranking Corrects for per-juror grading style. Off uses raw averages.
```

Add to the imports at the top of the file (only if not already imported):

```typescript
import { Switch } from '@/components/ui/switch'
```

- [ ] **Step 5: Run typecheck**

Run: `npm run typecheck`

Expected: no errors.

- [ ] **Step 6: Manual smoke test**

`npm run dev`. Open the admin ranking dashboard for a round, click a project, flip the toggle. Reload: it should persist. Open another browser/session: the same value should appear.

- [ ] **Step 7: Commit**

```bash
git add src/components/admin/round/ranking-dashboard.tsx
git commit -m "feat: per-round 'use balanced scoring' toggle in side sheet"
```

---

## Task 8: List sort respects the toggle

**Files:**
- Modify: `src/components/admin/round/ranking-dashboard.tsx`

- [ ] **Step 1: Locate the sort sites**

Run: `grep -n "balancedAverage\|avgGlobalScore" src/components/admin/round/ranking-dashboard.tsx`

Two sort sites read balanced first: the initial sort (~line 417) and the composite computation (~line 879). Both use `evalScores.balanced[id]?.balancedAverage ?? raw ?? 0`.

- [ ] **Step 2: Replace the score selector with a helper**

Add near the top of the component:

```typescript
const pickRankingScore = (projectId: string, rawFallback: number | null | undefined): number => {
  const balanced = evalScores?.balanced[projectId]?.balancedAverage
  // Use the balanced average only when the toggle is on and a value exists.
  if (useBalanced && balanced != null) return balanced
  return rawFallback ?? 0
}
```

Replace the two existing expressions:

- Around line 417, change:
  ```typescript
  evalScores.balanced[projectId]?.balancedAverage ?? raw ?? 0
  ```
  to:
  ```typescript
  pickRankingScore(projectId, raw)
  ```
- Around line 879, change:
  ```typescript
  return evalScores?.balanced[id]?.balancedAverage ?? e?.avgGlobalScore ?? 0
  ```
  to:
  ```typescript
  return pickRankingScore(id, e?.avgGlobalScore)
  ```

- [ ] **Step 3: Trigger re-sort when toggle flips**

The list memoizes order from `evalScores`. Add `useBalanced` to the dependency array of the `useMemo` / `useEffect` that computes ranking order.
Run: `grep -n "useMemo\|useEffect" src/components/admin/round/ranking-dashboard.tsx | head` and locate the order-computation block (look for the `useEffect` near line 393 noted in the existing comment "Wait for evalScores too — the initial sort uses balanced (juror-corrected)"). Add `useBalanced` to the dependency array there. - [ ] **Step 4: Run typecheck** Run: `npm run typecheck` Expected: no errors. - [ ] **Step 5: Manual smoke test** Flip the toggle in the side sheet. The list under each category should re-sort live. - [ ] **Step 6: Commit** ```bash git add src/components/admin/round/ranking-dashboard.tsx git commit -m "feat: list sort respects useBalancedRanking toggle" ``` --- ## Task 9: Rebuild side-panel stats card (Raw + Balanced side-by-side) **Files:** - Modify: `src/components/admin/round/ranking-dashboard.tsx` (the side-sheet stats grid around lines 1005-1027) - [ ] **Step 1: Remove the `⇢ X.X` annotation from list rows** Run: `grep -n "⇢" src/components/admin/round/ranking-dashboard.tsx` Locate the inner block around lines 204-220 (the "Raw + balanced averages shown side by side" comment). Delete the entire `{balancedScore != null && Math.abs(...) >= 0.05 && (…)}` JSX block. Keep the raw `{entry.avgGlobalScore.toFixed(1)}` rendering. Also remove the `balancedScore` prop from the list-row component's declaration (and its prop interface), since the row no longer uses it, and drop the `balancedScore` JSX prop at every call site so callers stop passing it. - [ ] **Step 2: Compute balanced + raw averages for the open project** Inside the side-sheet block, just before the stats grid, compute: ```tsx {(() => { const raw = evalScores?.balanced[selectedProjectId ?? '']?.rawAverage ?? null const balanced = evalScores?.balanced[selectedProjectId ?? '']?.balancedAverage ?? null const showBoth = raw != null || balanced != null if (!showBoth) return null return (

Avg Score

Raw {raw != null ? raw.toFixed(1) : '—'} {!useBalanced && ← used for ranking}
Balanced {balanced != null ? balanced.toFixed(1) : '—'} {useBalanced && ← used for ranking}
) })()} ``` The `← used for ranking` chip sits next to whichever number is active: the snippet above conditionally renders it inline within each label block (`!useBalanced` for Raw, `useBalanced` for Balanced), and the active label is also bolded. If that markup gets unwieldy, the simpler alternative is to render the chip once at the end of the row and rely on the bolded label alone to point to the active number. - [ ] **Step 3: Replace the legacy 3-card grid** The existing 3-card grid (Avg / Pass Rate / Evaluators) at lines 1006-1027 keeps Pass Rate + Evaluators but loses the Avg card (replaced by Step 2's combined card). Restructure into a vertical stack: ```tsx {projectDetail.stats && (
{/* Avg card (Step 2 above) */}

Pass Rate

{projectDetail.stats.totalEvaluations > 0 ? `${Math.round((projectDetail.stats.yesVotes / projectDetail.stats.totalEvaluations) * 100)}%` : '—'}

Evaluators

{projectDetail.stats.totalEvaluations}

)} ``` - [ ] **Step 4: Run typecheck + manual smoke test** Run: `npm run typecheck`, then `npm run dev`, and confirm the side panel renders both Raw + Balanced, with the active one bolded. - [ ] **Step 5: Commit** ```bash git add src/components/admin/round/ranking-dashboard.tsx git commit -m "feat: side panel shows both raw and balanced averages, list view drops delta annotation" ``` --- ## Task 10: Per-juror "typical / contributes as" chip **Files:** - Modify: `src/server/routers/ranking.ts` (extend the data the dashboard already fetches to include per-juror balance stats) - Modify: `src/components/admin/round/ranking-dashboard.tsx` - [ ] **Step 1: Extend the ranking router's evalScores response with per-juror stats** Open `src/server/routers/ranking.ts` and locate the procedure that returns `byProject` + `balanced` (around lines 488-540). After computing `balanceCtx`, extend the response with a `jurorStats` map: ```typescript const jurorStats: Record<string, { mean: number; stddev: number; count: number }> = {} for (const [userId, s] of balanceCtx.jurorStats.entries()) { jurorStats[userId] = { mean: s.mean, stddev: s.stddev, count: s.count } } return { byProject, balanced, jurorStats, overallMean: balanceCtx.overallMean, overallStddev: balanceCtx.overallStddev } ``` - [ ] **Step 2: Render the chip in the per-juror list** In the side sheet, find the per-juror row (around lines 1046-1090). After the existing `Score: {a.evaluation?.globalScore?.toFixed(1) ?? '—'}`, render a chip when balanced is on AND we have stats for this juror: ```tsx {useBalanced && (() => { const stats = evalScores?.jurorStats?.[a.userId] const score = a.evaluation?.globalScore if (!stats || score == null) return null const overallMean = evalScores!.overallMean const overallStddev = evalScores!.overallStddev if (overallStddev === 0) return null const z = stats.stddev > 0 ? 
(score - stats.mean) / stats.stddev : (score - overallMean) / overallStddev const contributesAs = overallMean + z * overallStddev return ( typical {stats.mean.toFixed(1)} → contributes {contributesAs.toFixed(1)} ) })()} ``` (`a.userId` may be absent in the existing select. If so, either add `userId: true` to the assignment select inside `getProjectDetail` and thread it through to the row, or use `a.user?.id` if that is already selected. Run `grep -n "user: { select" src/server/routers/analytics.ts` to confirm. If you end up modifying `src/server/routers/analytics.ts`, include it in the Step 4 commit.) - [ ] **Step 3: Run typecheck + manual smoke test** Run: `npm run typecheck` `npm run dev`. Open the side panel. With balanced on, each juror row should show "typical X.X → contributes Y.Y". With it off, the chip disappears. - [ ] **Step 4: Commit** ```bash git add src/server/routers/ranking.ts src/components/admin/round/ranking-dashboard.tsx git commit -m "feat: side panel shows per-juror baseline and balanced contribution" ``` --- ## Task 11: Build shared `<ScoreExplainerDialog />` **Files:** - Create: `src/components/shared/score-explainer-dialog.tsx` - [ ] **Step 1: Create the dialog component** ```tsx 'use client' import { Dialog, DialogContent, DialogHeader, DialogTitle, DialogTrigger, } from '@/components/ui/dialog' import { Button } from '@/components/ui/button' import { Info } from 'lucide-react' import type { ReactNode } from 'react' export function ScoreExplainerDialog({ trigger }: { trigger?: ReactNode }) { return ( {trigger ?? ( )} How scores are calculated

Different jurors have different grading styles. Some grade harshly, some leniently. Balanced scoring corrects for that so a project isn't punished for drawing harsh jurors or rewarded for drawing lenient ones.

How it works

  1. For each juror, calculate their personal average and spread across all the projects they scored in this round.
  2. Convert each individual score into "how many standard deviations above or below this juror's typical": a 6 from a juror who averages 5 reads the same as a 9 from a juror who averages 8, provided both jurors have the same spread.
  3. Average those normalized values across the project's jurors.
  4. Rescale back onto the same 1–10 scale using the round's overall average and spread.
  5. The result is directly comparable to the raw average — same scale, but corrected for grading style.

Worked example

Juror Their typical avg Score for "Project X" What that means
Juror A (lenient) 8.2 9.0 Just above their typical (+0.4σ)
Juror B (harsh) 5.8 7.5 Well above their typical (+1.5σ)
Juror C (typical) 7.0 8.0 Slightly above their typical (+0.7σ)

Raw average: (9.0 + 7.5 + 8.0) / 3 = 8.2. Balanced average rescales each juror's enthusiasm to the round's overall scale and lands at roughly 8.4 — Juror B's strong endorsement (well above their harsh baseline) carries more weight than the raw 7.5 suggests.

When it kicks in

  • Needs at least 2 evaluations by the same juror in the round to compute that juror's spread; otherwise that juror's scores are measured against the round-wide average and spread.
  • Needs at least one juror with non-zero spread; if every juror gave identical scores, balanced equals raw.
  • Computed within a single round only — a juror's grading style in an intake screening doesn't affect their balance in a deep evaluation.

Why we still show "Raw"

Both numbers are always shown so you can sanity-check the correction. The toggle at the top of the side panel decides which one is used for ranking.

) } ``` - [ ] **Step 2: Run typecheck** Run: `npm run typecheck` Expected: no errors. - [ ] **Step 3: Commit** ```bash git add src/components/shared/score-explainer-dialog.tsx git commit -m "feat: shared 'How scores are calculated' explainer dialog" ``` --- ## Task 12: Wire the explainer into admin + observer surfaces **Files:** - Modify: `src/components/admin/round/ranking-dashboard.tsx` - Modify: `src/components/observer/observer-project-detail.tsx` - Modify: `src/components/observer/reports/project-preview-dialog.tsx` - [ ] **Step 1: Mount in admin side sheet** Inside the side sheet, just below the Avg Score combined card from Task 9 Step 2, add: ```tsx import { ScoreExplainerDialog } from '@/components/shared/score-explainer-dialog' // … <ScoreExplainerDialog />
``` - [ ] **Step 2: Mount in observer full project detail** Find the stats area in `src/components/observer/observer-project-detail.tsx` (search for "averageGlobalScore" or "Avg Score") and add the same `<ScoreExplainerDialog />` button next to it. - [ ] **Step 3: Mount in observer reports preview dialog** Inside `src/components/observer/reports/project-preview-dialog.tsx`, in the Evaluation summary block (around the existing Avg Score card), add the explainer button. - [ ] **Step 4: Run typecheck + manual smoke test** Run: `npm run typecheck` `npm run dev`. Click the "How scores are calculated" button in each of the three locations and confirm the dialog renders. - [ ] **Step 5: Commit** ```bash git add src/components/admin/round/ranking-dashboard.tsx src/components/observer/observer-project-detail.tsx src/components/observer/reports/project-preview-dialog.tsx git commit -m "feat: mount score explainer dialog in admin and observer surfaces" ``` --- ## Task 13: Decimal display audit **Files:** - Modify: `src/app/(admin)/admin/reports/page.tsx:368` - [ ] **Step 1: Replace toFixed(2) with toFixed(1)** Find the line: ```typescript {p.balancedScore == null ? '-' : p.balancedScore.toFixed(2)} ``` Change to: ```typescript {p.balancedScore == null ? '-' : p.balancedScore.toFixed(1)} ``` - [ ] **Step 2: Grep for any other 2-decimal score displays** Run: `grep -rn "toFixed(2)" src/components src/app --include="*.tsx" | grep -iE "balanced|avg|score"` For any results that show balanced/raw scores, change to `toFixed(1)`. Skip any rate/percentage displays that should stay at 2 decimals. - [ ] **Step 3: Commit** ```bash git add git commit -m "fix: standardize score displays on one decimal" ``` --- ## Task 14: Verify list-view delta annotation removal **Files:** - (No new modification; verifies Task 9 Step 1 landed.) - [ ] **Step 1: Grep for any remaining `⇢` characters** Run: `grep -rn "⇢" src/components --include="*.tsx"` Expected: no matches. 
- [ ] **Step 2: Grep for the now-unused `balancedScore` prop on the row component** Run: `grep -n "balancedScore" src/components/admin/round/ranking-dashboard.tsx` Expected: occurrences only inside the side-sheet block, not on the row component's props or render. If anything remains, remove it. - [ ] **Step 3: Commit (if changes were made)** ```bash git add src/components/admin/round/ranking-dashboard.tsx git commit -m "chore: remove leftover balancedScore plumbing on list row" ``` --- ## Task 15: Final verification - [ ] **Step 1: Run the full test suite** Run: `npx vitest run` Expected: PASS. - [ ] **Step 2: Run typecheck** Run: `npm run typecheck` Expected: no errors. - [ ] **Step 3: Run a production build** Run: `npm run build` Expected: build completes successfully. - [ ] **Step 4: Manual end-to-end smoke** `npm run dev`. Walk through the spec's acceptance criteria: 1. With 3 round-scoped evaluations of 9, 8, 8, the side panel shows Avg 8.3 and Evaluators 3. 2. Flipping the toggle re-sorts the list view; persists across reload and across users. 3. List view shows no per-row delta annotation. 4. Side panel shows both Raw and Balanced; active one is highlighted. 5. Edition-mode rankings differ vs. before (compute by hand for one project — should match per-round rollup). 6. Observer project detail page defaults to active or most recently closed round. 7. All score displays show one decimal. 8. "How scores are calculated" opens from admin side panel, observer detail page, and observer preview dialog. - [ ] **Step 5: No new commit unless something needed fixing** If any acceptance criterion fails, create a fix commit. Otherwise nothing to commit here.
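
For the hand checks in Step 4 (criterion 5, and the "typical / contributes" values from Task 10), it may help to have the balancing arithmetic as a standalone script. The sketch below follows the steps described in the explainer dialog: per-juror mean and spread, z-score per evaluation, average z per project, then rescale with the round's overall mean and spread. It is not the code in `src/server/services/juror-balance.ts`: the names `ScoreRow`, `describe`, `groupScores`, and `balanceRound` are hypothetical, and the use of the population standard deviation is an assumption that the real service may not share.

```typescript
// Hypothetical names throughout; this does not mirror juror-balance.ts.
type ScoreRow = { projectId: string; jurorId: string; score: number }
type Stats = { mean: number; stddev: number; count: number }
type ProjectScore = { raw: number; balanced: number }

// Mean and POPULATION standard deviation (assumption: the real service
// may use the sample standard deviation instead).
function describe(values: number[]): Stats {
  const count = values.length
  if (count === 0) return { mean: 0, stddev: 0, count: 0 }
  const mean = values.reduce((sum, v) => sum + v, 0) / count
  const variance =
    values.reduce((sum, v) => sum + (v - mean) * (v - mean), 0) / count
  return { mean, stddev: Math.sqrt(variance), count }
}

// Collect scores into buckets keyed by project or by juror.
function groupScores(
  rows: ScoreRow[],
  key: 'projectId' | 'jurorId',
): Record<string, number[]> {
  const groups: Record<string, number[]> = {}
  for (const row of rows) {
    const bucket = groups[row[key]] ?? []
    bucket.push(row.score)
    groups[row[key]] = bucket
  }
  return groups
}

// Balance the evaluations of ONE round. Call once per round; never mix rounds.
function balanceRound(rows: ScoreRow[]): Record<string, ProjectScore> {
  const overall = describe(rows.map((r) => r.score))

  // Per-juror baseline (mean and spread) within this round.
  const byJuror = groupScores(rows, 'jurorId')
  const jurorStats: Record<string, Stats> = {}
  for (const jurorId of Object.keys(byJuror)) {
    jurorStats[jurorId] = describe(byJuror[jurorId] ?? [])
  }

  // z-score each evaluation against its juror's baseline; fall back to the
  // round-wide baseline when the juror has fewer than 2 scores or no spread.
  const zByProject: Record<string, number[]> = {}
  for (const row of rows) {
    const s = jurorStats[row.jurorId]
    let z = 0
    if (s && s.count >= 2 && s.stddev > 0) {
      z = (row.score - s.mean) / s.stddev
    } else if (overall.stddev > 0) {
      z = (row.score - overall.mean) / overall.stddev
    }
    const bucket = zByProject[row.projectId] ?? []
    bucket.push(z)
    zByProject[row.projectId] = bucket
  }

  // Average z per project, then rescale onto the round's overall scale.
  const rawByProject = groupScores(rows, 'projectId')
  const result: Record<string, ProjectScore> = {}
  for (const projectId of Object.keys(rawByProject)) {
    const raw = describe(rawByProject[projectId] ?? []).mean
    const meanZ = describe(zByProject[projectId] ?? []).mean
    result[projectId] = { raw, balanced: overall.mean + meanZ * overall.stddev }
  }
  return result
}

// Two jurors with different baselines, overlapping on P2 only.
const demo = balanceRound([
  { projectId: 'P1', jurorId: 'harsh', score: 4 },
  { projectId: 'P2', jurorId: 'harsh', score: 6 },
  { projectId: 'P2', jurorId: 'lenient', score: 8 },
  { projectId: 'P3', jurorId: 'lenient', score: 10 },
])
console.log(demo)
```

In this toy round the overall mean is 7 and the overall spread is √5. P2 stays at 7.0 (one juror rated it above their baseline, the other below), while P1 moves from a raw 4.0 to roughly 4.8 and P3 from a raw 10.0 to roughly 9.2. For edition mode, the intent of Task 2 is that a function like this runs once per round before anything is aggregated across rounds.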