15 TDD-style tasks covering the round-scoping bug fixes for getProjectDetail and getProjectRankings, the per-round toggle, the side-panel deeper display, the shared score explainer dialog, and the decimal display audit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Juror-Balance Toggle + Round-Scoping Fixes — Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.
Goal: Fix cross-round contamination in two analytics procedures (and the three UI surfaces that consume them), add a per-round "use balanced scoring" toggle, replace the list-view delta annotation with a richer side-panel display, and ship a shared "How scores are calculated" explainer dialog.
Architecture: Server-side: extend analytics.getProjectDetail to accept a roundId and scope its evaluation query; rework analytics.getProjectRankings edition mode to compute one z-balance context per round before aggregating; add useBalancedRanking to EvaluationConfigSchema so it persists in Round.configJson. Client-side: pass roundId from each caller; rebuild the admin ranking dashboard side sheet to show both raw and balanced averages, the per-round toggle, per-juror balance contributions, and an affordance opening a shared explainer dialog component (<ScoreExplainerDialog />) reused on observer surfaces.
Tech Stack: Next.js 15 App Router, tRPC 11 with Zod, Prisma 6, Vitest 4 (file-parallelism off, forks pool), shadcn/ui (Dialog + Switch primitives already available), TypeScript strict.
Spec: docs/superpowers/specs/2026-04-27-juror-balance-toggle-and-round-scoping-design.md
File Structure
Server
| File | Responsibility |
|---|---|
| src/server/routers/analytics.ts | Modify getProjectDetail (Task 1) + getProjectRankings (Task 2); add getProjectRoundsForObserver (Task 5). |
| src/server/services/juror-balance.ts | Add a small helper computePerRoundBalanced(points) consumed by getProjectRankings edition mode (Task 2). Keep existing functions untouched. |
| src/types/competition-configs.ts | Add useBalancedRanking: z.boolean().default(true) to EvaluationConfigSchema (Task 6). |
Client
| File | Responsibility |
|---|---|
| src/components/shared/score-explainer-dialog.tsx | NEW. Reusable explainer dialog (Task 11). |
| src/components/admin/round/ranking-dashboard.tsx | Wire roundId through to getProjectDetail; rebuild side-sheet stats area; add toggle row; per-juror chips; remove list-row delta annotation; mount <ScoreExplainerDialog /> (Tasks 3, 7, 8, 9, 10, 12, 14). |
| src/components/observer/observer-project-detail.tsx | Resolve default round and pass roundId; mount explainer dialog affordance (Tasks 5, 12). |
| src/components/observer/reports/project-preview-dialog.tsx | Accept and pass roundId prop; mount explainer affordance (Tasks 4, 12). |
| src/app/(observer)/observer/projects/[projectId]/page.tsx | Read ?round= query param and pass to <ObserverProjectDetail /> (Task 5). |
| src/app/(admin)/admin/reports/page.tsx | Decimal audit fix toFixed(2) → toFixed(1) (Task 13). |
Tests
| File | Responsibility |
|---|---|
| tests/unit/juror-balance-round-scoping.test.ts | NEW. Vitest cases for getProjectDetail roundId filtering and getProjectRankings per-round z-context (Tasks 1, 2). |
| tests/unit/round-config-balance-toggle.test.ts | NEW. Vitest case for persisting useBalancedRanking via round.update (Task 6). |
Test setup notes (for the implementer)
Vitest 4 is the framework; tests run sequentially (fileParallelism: false, pool: 'forks'). Use the helpers in tests/helpers.ts (createTestUser, createTestProgram, createTestCompetition, createTestRound, createTestProject, createTestProjectRoundState, createTestAssignment, createTestEvaluation, createTestEvaluationForm, cleanupTestData) and createCaller(routerModule, user) from tests/setup.ts. Always cleanupTestData(programId, userIds) in afterAll.
Run a single test file with: npx vitest run tests/unit/<file>.test.ts. Run a single test by name with: npx vitest run -t '<test name>'.
Task 1: Round-scope analytics.getProjectDetail
Files:
- Modify: src/server/routers/analytics.ts:1370-1464
- Create: tests/unit/juror-balance-round-scoping.test.ts

- Step 1: Write the failing test (round filtering)
Create tests/unit/juror-balance-round-scoping.test.ts:
import { afterAll, beforeAll, describe, expect, it } from 'vitest'
import { prisma, createCaller } from '../setup'
import {
createTestUser, createTestProgram, createTestCompetition, createTestRound,
createTestProject, createTestProjectRoundState, createTestAssignment,
createTestEvaluation, createTestEvaluationForm, cleanupTestData, uid,
} from '../helpers'
import { analyticsRouter } from '../../src/server/routers/analytics'
describe('analytics.getProjectDetail round scoping', () => {
let programId: string
let admin: { id: string; email: string; role: 'SUPER_ADMIN' }
let projectId: string
let roundAId: string
let roundBId: string
const userIds: string[] = []
beforeAll(async () => {
const program = await createTestProgram({ name: `bal-scope-${uid()}` })
programId = program.id
const competition = await createTestCompetition(programId)
const roundA = await createTestRound(competition.id, { name: 'Round A', sortOrder: 0, status: 'ROUND_CLOSED' })
const roundB = await createTestRound(competition.id, { name: 'Round B', sortOrder: 1, status: 'ROUND_ACTIVE' })
roundAId = roundA.id
roundBId = roundB.id
const formA = await createTestEvaluationForm(roundA.id)
const formB = await createTestEvaluationForm(roundB.id)
const project = await createTestProject(programId)
projectId = project.id
await createTestProjectRoundState(projectId, roundA.id, { state: 'PASSED' })
await createTestProjectRoundState(projectId, roundB.id, { state: 'IN_PROGRESS' })
// 2 evaluations on Round A: 7.0, 8.0 (mean 7.5)
for (const score of [7, 8]) {
const juror = await createTestUser('JURY_MEMBER')
userIds.push(juror.id)
const a = await createTestAssignment(juror.id, projectId, roundA.id)
await createTestEvaluation(a.id, formA.id, { status: 'SUBMITTED', globalScore: score, submittedAt: new Date() })
}
// 3 evaluations on Round B: 9.0, 8.0, 8.0 (mean 8.333…)
for (const score of [9, 8, 8]) {
const juror = await createTestUser('JURY_MEMBER')
userIds.push(juror.id)
const a = await createTestAssignment(juror.id, projectId, roundB.id)
await createTestEvaluation(a.id, formB.id, { status: 'SUBMITTED', globalScore: score, submittedAt: new Date() })
}
const adminUser = await createTestUser('SUPER_ADMIN')
userIds.push(adminUser.id)
admin = { id: adminUser.id, email: adminUser.email, role: 'SUPER_ADMIN' }
})
afterAll(async () => {
await cleanupTestData(programId, userIds)
})
it('returns only round-B stats when roundId=roundB is passed', async () => {
const caller = createCaller(analyticsRouter, admin)
const result = await caller.getProjectDetail({ id: projectId, roundId: roundBId })
expect(result.stats).not.toBeNull()
expect(result.stats!.totalEvaluations).toBe(3)
expect(result.stats!.averageGlobalScore).toBeCloseTo(8.333, 2)
})
})
- Step 2: Run test to verify it fails
Run: npx vitest run tests/unit/juror-balance-round-scoping.test.ts -t 'returns only round-B stats'
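Expected: FAIL. The current procedure ignores roundId and returns 5 evaluations averaging 8.0. The Zod input schema does not declare roundId yet, so the key never reaches the handler: z.object() strips unknown keys by default (a .strict() schema would reject the input instead).
- Step 3: Add roundId to the input schema and scope the query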
Modify src/server/routers/analytics.ts around line 1371. Replace the input definition and the submittedEvaluations query:
getProjectDetail: observerProcedure
.input(z.object({ id: z.string(), roundId: z.string().optional() }))
.query(async ({ ctx, input }) => {
const [projectRaw, projectTags, assignments, submittedEvaluations] = await Promise.all([
ctx.prisma.project.findUniqueOrThrow({
where: { id: input.id },
include: {
files: {
select: {
id: true, fileName: true, fileType: true, mimeType: true, size: true,
bucket: true, objectKey: true, pageCount: true, textPreview: true,
detectedLang: true, langConfidence: true, analyzedAt: true,
roundId: true,
requirementId: true,
requirement: { select: { id: true, name: true, description: true, isRequired: true } },
},
orderBy: [{ createdAt: 'asc' }],
},
teamMembers: {
include: {
user: {
select: { id: true, name: true, email: true, profileImageKey: true, profileImageProvider: true },
},
},
orderBy: { joinedAt: 'asc' },
},
},
}),
ctx.prisma.projectTag.findMany({
where: { projectId: input.id },
include: { tag: { select: { id: true, name: true, category: true, color: true } } },
orderBy: { confidence: 'desc' },
}).catch(() => [] as { id: string; projectId: string; tagId: string; confidence: number; tag: { id: string; name: string; category: string | null; color: string | null } }[]),
ctx.prisma.assignment.findMany({
where: { projectId: input.id },
include: {
user: { select: { id: true, name: true, email: true, profileImageKey: true, profileImageProvider: true } },
round: { select: { id: true, name: true } },
evaluation: {
select: {
id: true, status: true, submittedAt: true, globalScore: true,
binaryDecision: true, criterionScoresJson: true, feedbackText: true,
},
},
},
orderBy: { createdAt: 'desc' },
}),
ctx.prisma.evaluation.findMany({
where: {
status: 'SUBMITTED',
assignment: {
projectId: input.id,
...(input.roundId ? { roundId: input.roundId } : {}),
},
},
}),
])
Leave the rest of the procedure body untouched. The stats = null fallback (when no submitted evaluations match) already does the right thing.
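The effect of the optional filter can be modeled in memory. This is only an illustrative sketch: scopedStats and the SubmittedEvaluation shape are hypothetical names that do not exist in analytics.ts, and the stats object is reduced to the two fields the tests assert on.

```typescript
// Hypothetical in-memory model of the round-scoped stats; not the real procedure.
type SubmittedEvaluation = { roundId: string; globalScore: number }
type Stats = { totalEvaluations: number; averageGlobalScore: number }

function scopedStats(evaluations: SubmittedEvaluation[], roundId?: string): Stats | null {
  // Mirrors the conditional spread: no roundId means no round filter at all.
  const matching = roundId ? evaluations.filter((e) => e.roundId === roundId) : evaluations
  // Mirrors the stats = null fallback when nothing matches.
  if (matching.length === 0) return null
  const sum = matching.reduce((acc, e) => acc + e.globalScore, 0)
  return { totalEvaluations: matching.length, averageGlobalScore: sum / matching.length }
}

// Same scores as the Task 1 fixture: Round A has 7 and 8, Round B has 9, 8 and 8.
const task1Fixture: SubmittedEvaluation[] = [
  { roundId: 'roundA', globalScore: 7 },
  { roundId: 'roundA', globalScore: 8 },
  { roundId: 'roundB', globalScore: 9 },
  { roundId: 'roundB', globalScore: 8 },
  { roundId: 'roundB', globalScore: 8 },
]

const roundBStats = scopedStats(task1Fixture, 'roundB')
const allRoundsStats = scopedStats(task1Fixture)
const unknownRoundStats = scopedStats(task1Fixture, 'no-such-round')
```

With this fixture, roundBStats holds 3 evaluations averaging about 8.33, allRoundsStats holds 5 averaging 8.0, and unknownRoundStats is null.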
- Step 4: Run test to verify it passes
Run: npx vitest run tests/unit/juror-balance-round-scoping.test.ts -t 'returns only round-B stats'
Expected: PASS.
- Step 5: Add a second test asserting unfiltered behavior is preserved
Append to the same describe block:
it('returns aggregated stats across all rounds when roundId is omitted', async () => {
const caller = createCaller(analyticsRouter, admin)
const result = await caller.getProjectDetail({ id: projectId })
expect(result.stats!.totalEvaluations).toBe(5)
})
Run: npx vitest run tests/unit/juror-balance-round-scoping.test.ts -t 'aggregated stats across all rounds'
Expected: PASS (no further code change needed; with roundId omitted, the conditional spread adds no round filter).
- Step 6: Commit
git add src/server/routers/analytics.ts tests/unit/juror-balance-round-scoping.test.ts
git commit -m "fix: scope analytics.getProjectDetail by optional roundId"
Task 2: Per-round z-context in analytics.getProjectRankings edition mode
Files:
- Modify: src/server/routers/analytics.ts:199-258
- Modify: src/server/services/juror-balance.ts (add computePerRoundBalanced helper)
- Modify: tests/unit/juror-balance-round-scoping.test.ts (append cases)

- Step 1: Write the failing test (edition-mode per-round grouping)
Append to tests/unit/juror-balance-round-scoping.test.ts:
describe('analytics.getProjectRankings per-round z-context (edition mode)', () => {
let programId: string
let admin: { id: string; email: string; role: 'SUPER_ADMIN' }
let projectXId: string
let projectYId: string
const userIds: string[] = []
beforeAll(async () => {
const program = await createTestProgram({ name: `rank-edition-${uid()}` })
programId = program.id
const competition = await createTestCompetition(programId)
const roundA = await createTestRound(competition.id, { name: 'A', sortOrder: 0 })
const roundB = await createTestRound(competition.id, { name: 'B', sortOrder: 1 })
const formA = await createTestEvaluationForm(roundA.id, [
{ id: 'c1', label: 'X', scale: '1-10', weight: 1 },
])
const formB = await createTestEvaluationForm(roundB.id, [
{ id: 'c1', label: 'X', scale: '1-10', weight: 1 },
])
const projX = await createTestProject(programId, { title: 'X' })
const projY = await createTestProject(programId, { title: 'Y' })
projectXId = projX.id
projectYId = projY.id
await createTestProjectRoundState(projX.id, roundA.id)
await createTestProjectRoundState(projY.id, roundA.id)
await createTestProjectRoundState(projX.id, roundB.id)
await createTestProjectRoundState(projY.id, roundB.id)
// Round A: a "lenient" juror grades 9 on X, 9 on Y. A "harsh" juror grades 6 on X, 4 on Y.
// Pooling A+B would produce a single, misleading z-context. Per-round contexts:
// - In Round A: lenient mean=9 stddev=0 (fallback), harsh mean=5 stddev=1
// - In Round B: lenient mean=8 stddev=0 (fallback), harsh mean=6 stddev=1 (separate context)
const lenient = await createTestUser('JURY_MEMBER')
const harsh = await createTestUser('JURY_MEMBER')
userIds.push(lenient.id, harsh.id)
const writeEval = async (jurorId: string, projId: string, roundId: string, formId: string, c1: number) => {
const a = await createTestAssignment(jurorId, projId, roundId)
await prisma.evaluation.create({
data: {
assignmentId: a.id,
formId,
status: 'SUBMITTED',
submittedAt: new Date(),
criterionScoresJson: { c1 },
},
})
}
// Round A
await writeEval(lenient.id, projX.id, roundA.id, formA.id, 9)
await writeEval(lenient.id, projY.id, roundA.id, formA.id, 9)
await writeEval(harsh.id, projX.id, roundA.id, formA.id, 6)
await writeEval(harsh.id, projY.id, roundA.id, formA.id, 4)
// Round B (different scoring profile so cross-round pooling skews things)
await writeEval(lenient.id, projX.id, roundB.id, formB.id, 8)
await writeEval(lenient.id, projY.id, roundB.id, formB.id, 8)
await writeEval(harsh.id, projX.id, roundB.id, formB.id, 7)
await writeEval(harsh.id, projY.id, roundB.id, formB.id, 5)
const adminUser = await createTestUser('SUPER_ADMIN')
userIds.push(adminUser.id)
admin = { id: adminUser.id, email: adminUser.email, role: 'SUPER_ADMIN' }
})
afterAll(async () => {
await cleanupTestData(programId, userIds)
})
it('aggregates per-project balanced score as the mean of per-round balanced averages', async () => {
const caller = createCaller(analyticsRouter, admin)
const result = await caller.getProjectRankings({ programId })
const x = result.find((p) => p.id === projectXId)!
const y = result.find((p) => p.id === projectYId)!
// Hand-computed expected per-round balanced averages (population stddev throughout):
// Round A: scores 9, 9, 6, 4 → overall mean=7, variance=18/4=4.5, stddev=sqrt(4.5)=2.1213.
// lenient stddev=0 (fallback to overall), harsh mean=5 stddev=1.
// X: lenient z=fallback (9-7)/2.1213=+0.9428, harsh z=(6-5)/1=+1.0 → avg z=0.9714
// X balanced = 7 + 0.9714*2.1213 ≈ 9.06
// Y: lenient z=fallback (9-7)/2.1213=+0.9428, harsh z=(4-5)/1=-1.0 → avg z=-0.0286
// Y balanced = 7 - 0.0286*2.1213 ≈ 6.94
// Round B: lenient mean=8 stddev=0 (fallback), harsh mean=6 stddev=1.
// B overall mean=7, stddev=sqrt(1.5)=1.2247
// X: lenient z=(8-7)/1.2247=+0.8165, harsh z=(7-6)/1=+1.0 → avg z=0.9082
// X balanced = 7 + 0.9082*1.2247 ≈ 8.11
// Y: lenient z=(8-7)/1.2247=+0.8165, harsh z=(5-6)/1=-1.0 → avg z=-0.0917
// Y balanced = 7 - 0.0917*1.2247 ≈ 6.89
// Project-level edition rollup = mean of per-round balanced averages:
// X ≈ (9.06 + 8.11)/2 ≈ 8.59
// Y ≈ (6.94 + 6.89)/2 ≈ 6.91
expect(x.balancedScore!).toBeCloseTo(8.59, 1)
expect(y.balancedScore!).toBeCloseTo(6.91, 1)
})
})
- Step 2: Run test to verify it fails
Run: npx vitest run tests/unit/juror-balance-round-scoping.test.ts -t 'mean of per-round balanced averages'
Expected: FAIL — current code pools all 8 evaluations into one z-context.
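The Round B arithmetic in the test comment can be reproduced with a short standalone sketch. Everything here is an assumption inferred from that comment, not from juror-balance.ts: balanceRound is a hypothetical name, standard deviations are population values (divide by n), a juror whose own stddev is 0 falls back to the round-wide mean and stddev, and a project's balanced score is the round mean plus the average z times the round stddev.

```typescript
// Hypothetical sketch of one round's z-balance, inferred from the test comment.
type ScorePoint = { projectId: string; userId: string; rawScore: number }

const mean = (xs: number[]): number => xs.reduce((a, b) => a + b, 0) / xs.length

// Population standard deviation (divide by n), matching sqrt(1.5) for Round B.
const stddev = (xs: number[]): number => {
  const m = mean(xs)
  return Math.sqrt(mean(xs.map((x) => (x - m) * (x - m))))
}

function balanceRound(points: ScorePoint[]): Record<string, number> {
  const all = points.map((p) => p.rawScore)
  const roundMean = mean(all)
  const roundStd = stddev(all)

  // Collect each juror's scores so their own mean and stddev can be computed.
  const byJuror: Record<string, number[]> = {}
  for (const p of points) {
    const list = byJuror[p.userId] || []
    list.push(p.rawScore)
    byJuror[p.userId] = list
  }

  // z-score each point against its juror, falling back to the round-wide context.
  const zByProject: Record<string, number[]> = {}
  for (const p of points) {
    const own = byJuror[p.userId]
    const ownStd = stddev(own)
    let z = 0 // stays 0 if the whole round has no spread either
    if (ownStd > 0) z = (p.rawScore - mean(own)) / ownStd
    else if (roundStd > 0) z = (p.rawScore - roundMean) / roundStd
    const zs = zByProject[p.projectId] || []
    zs.push(z)
    zByProject[p.projectId] = zs
  }

  // Map the average z back onto the round's own scale.
  const balanced: Record<string, number> = {}
  for (const projectId of Object.keys(zByProject)) {
    balanced[projectId] = roundMean + mean(zByProject[projectId]) * roundStd
  }
  return balanced
}

const roundBPoints: ScorePoint[] = [
  { projectId: 'X', userId: 'lenient', rawScore: 8 },
  { projectId: 'Y', userId: 'lenient', rawScore: 8 },
  { projectId: 'X', userId: 'harsh', rawScore: 7 },
  { projectId: 'Y', userId: 'harsh', rawScore: 5 },
]
const roundBBalanced = balanceRound(roundBPoints)
```

For the Round B fixture this gives about 8.11 for X and 6.89 for Y, the per-round values quoted in the test comment. If the real service uses a different fallback or a sample stddev, the expected values in the test would need to be recomputed accordingly.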
- Step 3: Add helper computePerRoundBalanced to juror-balance service
Modify src/server/services/juror-balance.ts. Append at the end:
export type RoundScopedScorePoint = ScorePoint & { roundId: string }
export type EditionRollupResult = {
projectId: string
rawAverage: number | null
balancedAverage: number | null
count: number
roundCount: number
}
/**
* Per-round balanced rollup: groups points by roundId, computes a balance
* context per round, then averages the per-round balanced averages for each
* project. Use when surfacing edition-level rankings; never pool z-contexts
* across rounds, because a juror's grading profile differs by round type.
*/
export function computePerRoundBalanced(
points: RoundScopedScorePoint[],
): Map<string, EditionRollupResult> {
const byRound = new Map<string, ScorePoint[]>()
for (const p of points) {
const arr = byRound.get(p.roundId) ?? []
arr.push({ projectId: p.projectId, userId: p.userId, rawScore: p.rawScore })
byRound.set(p.roundId, arr)
}
const perRoundResults: Array<Map<string, BalancedProjectResult>> = []
for (const roundPoints of byRound.values()) {
const ctx = computeBalanceContext(roundPoints)
perRoundResults.push(computeBalancedProjectScores(roundPoints, ctx))
}
const accumulator = new Map<
string,
{ rawSum: number; rawCount: number; balancedSum: number; balancedCount: number; count: number; roundCount: number }
>()
for (const roundMap of perRoundResults) {
for (const [projectId, result] of roundMap.entries()) {
const acc = accumulator.get(projectId) ?? {
rawSum: 0, rawCount: 0, balancedSum: 0, balancedCount: 0, count: 0, roundCount: 0,
}
if (result.rawAverage != null) {
acc.rawSum += result.rawAverage
acc.rawCount += 1
}
if (result.balancedAverage != null) {
acc.balancedSum += result.balancedAverage
acc.balancedCount += 1
}
acc.count += result.count
acc.roundCount += 1
accumulator.set(projectId, acc)
}
}
const out = new Map<string, EditionRollupResult>()
for (const [projectId, acc] of accumulator.entries()) {
out.set(projectId, {
projectId,
rawAverage: acc.rawCount > 0 ? acc.rawSum / acc.rawCount : null,
balancedAverage: acc.balancedCount > 0 ? acc.balancedSum / acc.balancedCount : null,
count: acc.count,
roundCount: acc.roundCount,
})
}
return out
}
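One consequence of the accumulator above is that rawAverage in edition mode is a mean of per-round averages, not a pooled mean, so a round with more evaluations does not carry more weight. The sketch below illustrates the difference with the Task 1 scores. It assumes BalancedProjectResult.rawAverage is the per-round raw mean for a project, which this plan does not show, and pooledMean and meanOfRoundMeans are hypothetical names used only for illustration.

```typescript
// Hypothetical illustration: pooled mean versus mean of per-round means.
// Both helpers assume every round has at least one score.
const average = (xs: number[]): number => xs.reduce((a, b) => a + b, 0) / xs.length

function pooledMean(scoresByRound: Record<string, number[]>): number {
  const all: number[] = []
  for (const roundId of Object.keys(scoresByRound)) {
    for (const score of scoresByRound[roundId]) all.push(score)
  }
  return average(all)
}

function meanOfRoundMeans(scoresByRound: Record<string, number[]>): number {
  const roundMeans = Object.keys(scoresByRound).map((roundId) => average(scoresByRound[roundId]))
  return average(roundMeans)
}

// One project's scores: Round A has 7 and 8, Round B has 9, 8 and 8.
const scoresByRound: Record<string, number[]> = { roundA: [7, 8], roundB: [9, 8, 8] }

const pooled = pooledMean(scoresByRound) // every evaluation weighted equally
const rolledUp = meanOfRoundMeans(scoresByRound) // every round weighted equally
```

Here pooled is 8.0 while rolledUp is about 7.92, because Round A's two evaluations count as much as Round B's three. The returned count still sums evaluations across rounds, and roundCount records how many rounds contributed.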
- Step 4: Update getProjectRankings to branch on roundId vs programId
Modify src/server/routers/analytics.ts. Replace the imports near line 9:
import {
computeBalanceContext,
computeBalancedProjectScores,
computePerRoundBalanced,
type ScorePoint,
type RoundScopedScorePoint,
} from '../services/juror-balance'
Then replace the getProjectRankings body (lines 199-258) with:
getProjectRankings: observerProcedure
.input(editionOrRoundInput.and(z.object({ limit: z.number().optional() })))
.query(async ({ ctx, input }) => {
const [projects, evaluations] = await Promise.all([
ctx.prisma.project.findMany({
where: projectWhere(input),
select: { id: true, title: true, teamName: true, status: true },
}),
ctx.prisma.evaluation.findMany({
where: evalWhere(input, { status: 'SUBMITTED' }),
select: {
criterionScoresJson: true,
assignment: { select: { userId: true, projectId: true, roundId: true } },
},
}),
])
const rawPoints: RoundScopedScorePoint[] = []
for (const e of evaluations) {
const scores = e.criterionScoresJson as Record<string, unknown> | null
if (!scores) continue
const vals = Object.values(scores).filter((s): s is number => typeof s === 'number')
if (vals.length === 0) continue
const rawScore = vals.reduce((a, b) => a + b, 0) / vals.length
rawPoints.push({
projectId: e.assignment.projectId,
userId: e.assignment.userId,
roundId: e.assignment.roundId,
rawScore,
})
}
// roundId mode: single-round z-context (existing behavior)
// programId mode: per-round z-contexts aggregated as the mean of per-round balanced averages
const balancedByProject: Map<string, { rawAverage: number | null; balancedAverage: number | null; count: number }> = (() => {
if (input.roundId) {
const flat: ScorePoint[] = rawPoints.map(({ projectId, userId, rawScore }) => ({ projectId, userId, rawScore }))
const ctx = computeBalanceContext(flat)
const out = computeBalancedProjectScores(flat, ctx)
return out
}
return computePerRoundBalanced(rawPoints)
})()
const rankings = projects
.map((project) => {
const result = balancedByProject.get(project.id)
return {
id: project.id,
title: project.title,
teamName: project.teamName,
status: project.status,
averageScore: result?.rawAverage ?? null,
balancedScore: result?.balancedAverage ?? null,
evaluationCount: result?.count ?? 0,
}
})
.sort((a, b) => {
const aScore = a.balancedScore ?? a.averageScore
const bScore = b.balancedScore ?? b.averageScore
if (aScore !== null && bScore !== null) return bScore - aScore
if (aScore !== null) return -1
if (bScore !== null) return 1
return 0
})
return input.limit ? rankings.slice(0, input.limit) : rankings
}),
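The sort at the end of the procedure prefers the balanced score, falls back to the raw average, and pushes unscored projects to the bottom. A standalone sketch of that comparator follows; RankedProject and compareByRankingScore are hypothetical names, and the four sample rows are made up for illustration.

```typescript
// Hypothetical standalone version of the ranking comparator.
type RankedProject = { id: string; averageScore: number | null; balancedScore: number | null }

function compareByRankingScore(a: RankedProject, b: RankedProject): number {
  // Prefer the balanced score; fall back to the raw average when it is null.
  const aScore = a.balancedScore ?? a.averageScore
  const bScore = b.balancedScore ?? b.averageScore
  if (aScore !== null && bScore !== null) return bScore - aScore // descending
  if (aScore !== null) return -1 // scored projects sort before unscored ones
  if (bScore !== null) return 1
  return 0
}

const sampleRankings: RankedProject[] = [
  { id: 'a', averageScore: 7, balancedScore: null }, // ranked by raw average 7
  { id: 'b', averageScore: null, balancedScore: null }, // no scores, sorts last
  { id: 'c', averageScore: 6, balancedScore: 9 }, // ranked by balanced 9
  { id: 'd', averageScore: 9.5, balancedScore: 8 }, // ranked by balanced 8
]

// Copy before sorting so the input array is left untouched.
const rankedIds = sampleRankings.slice().sort(compareByRankingScore).map((p) => p.id)
```

The resulting order is c, d, a, b. Project d has the highest raw average but ranks below c because the balanced score takes precedence whenever it is present.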
- Step 5: Run the test suite
Run: npx vitest run tests/unit/juror-balance-round-scoping.test.ts
Expected: all 3 tests PASS.
- Step 6: Run typecheck
Run: npm run typecheck
Expected: no errors.
- Step 7: Commit
git add src/server/routers/analytics.ts src/server/services/juror-balance.ts tests/unit/juror-balance-round-scoping.test.ts
git commit -m "fix: compute z-context per-round in edition-mode rankings rollup"
Task 3: Pass roundId from admin ranking dashboard side sheet
Files:
- Modify: src/components/admin/round/ranking-dashboard.tsx (around the existing getProjectDetail.useQuery call)

- Step 1: Locate the existing useQuery call
Open src/components/admin/round/ranking-dashboard.tsx and find where selectedProjectId drives analytics.getProjectDetail. The component already has roundId in scope (it's the dashboard's own prop / state).
Run: grep -n "getProjectDetail\.useQuery" src/components/admin/round/ranking-dashboard.tsx
- Step 2: Add roundId to the input
Edit the call to include roundId:
const { data: projectDetail, isLoading: detailLoading } =
trpc.analytics.getProjectDetail.useQuery(
{ id: selectedProjectId!, roundId },
{ enabled: !!selectedProjectId },
)
(Confirm the existing enabled guard and any other existing options stay intact.)
- Step 3: Manual smoke test
Start the dev server: npm run dev
Navigate to the admin ranking dashboard for a round where a project has had evaluations in earlier rounds. Click a project. Confirm:
- The "Evaluators" stat in the side sheet matches the count in the per-juror list below.
- "Avg Score" reflects only the current round's scores (one decimal).

- Step 4: Commit
git add src/components/admin/round/ranking-dashboard.tsx
git commit -m "fix: scope admin ranking side-sheet stats to current round"
Task 4: Pass roundId to observer reports preview dialog
Files:
- Modify: src/components/observer/reports/project-preview-dialog.tsx
- Modify: caller(s) of <ProjectPreviewDialog /> (find via grep)

- Step 1: Find callers
Run: grep -rn "ProjectPreviewDialog" src --include="*.tsx"
The observer reports page already tracks the active round (the round selector lives on the reports page itself per recent commit 2e080a5). Capture the active roundId from the caller and thread it through.
- Step 2: Add roundId?: string to the props and useQuery
Edit src/components/observer/reports/project-preview-dialog.tsx:
interface ProjectPreviewDialogProps {
projectId: string | null
roundId?: string
open: boolean
onOpenChange: (open: boolean) => void
}
export function ProjectPreviewDialog({ projectId, roundId, open, onOpenChange }: ProjectPreviewDialogProps) {
const { data, isLoading } = trpc.analytics.getProjectDetail.useQuery(
{ id: projectId!, roundId },
{ enabled: !!projectId && open },
)
// …existing render…
}
- Step 3: Update each caller to pass roundId
For each call site identified in Step 1, pass the roundId from the page's existing state. The observer reports page (per recent commit 2e080a5) already lifts a round selector to the top of the page — find that state and thread it through. If a caller has no round in scope and is not the observer reports page, leave the prop omitted (the procedure's optional roundId will fall back to aggregate stats and the dialog will still render correctly).
- Step 4: Run typecheck
Run: npm run typecheck
Expected: no errors.
- Step 5: Manual smoke test
npm run dev. From the observer reports page, open a project preview. Confirm Avg Score / Evaluators match the round selector at the top of the page.
- Step 6: Commit
git add src/components/observer/reports/project-preview-dialog.tsx <updated caller files>
git commit -m "fix: scope observer reports preview dialog to selected round"
Task 5: Resolve default round on observer full project page
Files:
- Modify: src/app/(observer)/observer/projects/[projectId]/page.tsx
- Modify: src/components/observer/observer-project-detail.tsx
- Modify: src/server/routers/analytics.ts (add getProjectRoundsForObserver, Step 4)

- Step 1: Read the existing page wrapper
Run: cat src/app/\(observer\)/observer/projects/\[projectId\]/page.tsx
It currently calls <ObserverProjectDetail projectId={projectId} /> without round context.
- Step 2: Read ?round= from search params and resolve default
Replace the page body:
import { ObserverProjectDetail } from '@/components/observer/observer-project-detail'
export default async function ObserverProjectDetailPage({
params,
searchParams,
}: {
params: Promise<{ projectId: string }>
searchParams: Promise<{ round?: string }>
}) {
const { projectId } = await params
const sp = await searchParams
return <ObserverProjectDetail projectId={projectId} initialRoundId={sp.round} />
}
- Step 3: Modify ObserverProjectDetail to resolve the default
In src/components/observer/observer-project-detail.tsx, update the props and resolve the default round:
export function ObserverProjectDetail({ projectId, initialRoundId }: { projectId: string; initialRoundId?: string }) {
const [activeRoundId, setActiveRoundId] = useState<string | undefined>(initialRoundId)
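// Round resolution: ROUND_ACTIVE first, else most-recent ROUND_CLOSED.
// Keep this query enabled even when a round is already selected (for example via ?round=),
// otherwise roundCandidates stays undefined and the round selector in Step 5 never renders.
const { data: roundCandidates } = trpc.analytics.getProjectRoundsForObserver.useQuery(
{ projectId },
{ enabled: !!projectId },
)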
useEffect(() => {
if (activeRoundId || !roundCandidates) return
const active = roundCandidates.find((r) => r.status === 'ROUND_ACTIVE')
if (active) {
setActiveRoundId(active.id)
return
}
const closed = [...roundCandidates]
.filter((r) => r.status === 'ROUND_CLOSED')
.sort((a, b) => b.sortOrder - a.sortOrder)[0]
if (closed) setActiveRoundId(closed.id)
}, [roundCandidates, activeRoundId])
const { data, isLoading } = trpc.analytics.getProjectDetail.useQuery(
{ id: projectId, roundId: activeRoundId },
{ refetchInterval: 30_000, enabled: !!projectId },
)
// …rest of component, with a small <select> chip near the stats card to switch rounds when len > 1…
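The resolution rule inside the effect can also be expressed as a pure function, which is easier to unit test than the hook. This is a sketch only: resolveDefaultRoundId and RoundCandidate are hypothetical names, and the candidate shape is limited to the fields the effect reads.

```typescript
// Hypothetical pure version of the default-round rule used in the effect.
type RoundCandidate = {
  id: string
  status: 'ROUND_ACTIVE' | 'ROUND_CLOSED'
  sortOrder: number
}

function resolveDefaultRoundId(candidates: RoundCandidate[]): string | undefined {
  // 1. Prefer the first active round, if any.
  const active = candidates.filter((r) => r.status === 'ROUND_ACTIVE')[0]
  if (active) return active.id
  // 2. Otherwise take the closed round with the highest sortOrder.
  const closed = candidates
    .filter((r) => r.status === 'ROUND_CLOSED')
    .sort((a, b) => b.sortOrder - a.sortOrder)[0] // filter() already returned a copy
  return closed ? closed.id : undefined
}

const withActive: RoundCandidate[] = [
  { id: 'r1', status: 'ROUND_CLOSED', sortOrder: 0 },
  { id: 'r2', status: 'ROUND_CLOSED', sortOrder: 1 },
  { id: 'r3', status: 'ROUND_ACTIVE', sortOrder: 2 },
]
const closedOnly = withActive.filter((r) => r.status === 'ROUND_CLOSED')

const defaultWithActive = resolveDefaultRoundId(withActive)
const defaultClosedOnly = resolveDefaultRoundId(closedOnly)
const defaultEmpty = resolveDefaultRoundId([])
```

With these candidates the active round r3 wins, the closed-only list resolves to r2 (the highest sortOrder), and an empty list resolves to undefined, in which case getProjectDetail keeps returning aggregate stats.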
- Step 4: Add the getProjectRoundsForObserver procedure
In src/server/routers/analytics.ts, add a new procedure (place it next to other observer procedures):
getProjectRoundsForObserver: observerProcedure
.input(z.object({ projectId: z.string() }))
.query(async ({ ctx, input }) => {
const states = await ctx.prisma.projectRoundState.findMany({
where: { projectId: input.projectId },
select: {
round: { select: { id: true, name: true, status: true, sortOrder: true } },
},
})
return states
.map((s) => s.round)
.filter((r) => r.status === 'ROUND_ACTIVE' || r.status === 'ROUND_CLOSED')
.sort((a, b) => a.sortOrder - b.sortOrder)
}),
- Step 5: Render a round selector chip when multiple rounds are available
Inside the stats card area of <ObserverProjectDetail />, when roundCandidates.length > 1, render a small select:
{roundCandidates && roundCandidates.length > 1 && (
<div className="text-xs">
<select
className="rounded border px-2 py-1 bg-background"
value={activeRoundId ?? ''}
onChange={(e) => setActiveRoundId(e.target.value)}
>
{roundCandidates.map((r) => (
<option key={r.id} value={r.id}>{r.name}</option>
))}
</select>
</div>
)}
- Step 6: Run typecheck
Run: npm run typecheck
Expected: no errors.
- Step 7: Manual smoke test
npm run dev. Open an observer project page for a project that has been in multiple rounds. Confirm:
- With no ?round=, the page defaults to the active round if any, otherwise the most recently closed.
- The selector chip is visible when there are ≥2 candidate rounds and switches the stats correctly.
- "Evaluators" matches the per-juror count.

- Step 8: Commit
git add src/app/\(observer\)/observer/projects/\[projectId\]/page.tsx src/components/observer/observer-project-detail.tsx src/server/routers/analytics.ts
git commit -m "feat: resolve observer project page round default + selector"
Task 6: Add useBalancedRanking to EvaluationConfigSchema
Files:
- Modify: src/types/competition-configs.ts:90-156
- Create: tests/unit/round-config-balance-toggle.test.ts

- Step 1: Write the failing test
Create tests/unit/round-config-balance-toggle.test.ts:
import { afterAll, beforeAll, describe, expect, it } from 'vitest'
import { prisma, createCaller } from '../setup'
import {
createTestUser, createTestProgram, createTestCompetition, createTestRound,
cleanupTestData, uid,
} from '../helpers'
import { roundRouter } from '../../src/server/routers/round'
describe('Round.configJson.useBalancedRanking', () => {
let programId: string
let admin: { id: string; email: string; role: 'SUPER_ADMIN' }
const userIds: string[] = []
beforeAll(async () => {
const program = await createTestProgram({ name: `bal-toggle-${uid()}` })
programId = program.id
const adminUser = await createTestUser('SUPER_ADMIN')
userIds.push(adminUser.id)
admin = { id: adminUser.id, email: adminUser.email, role: 'SUPER_ADMIN' }
})
afterAll(async () => {
await cleanupTestData(programId, userIds)
})
it('persists useBalancedRanking via round.update', async () => {
const competition = await createTestCompetition(programId)
const round = await createTestRound(competition.id)
const caller = createCaller(roundRouter, admin)
await caller.update({
id: round.id,
configJson: { useBalancedRanking: false },
})
const reloaded = await prisma.round.findUniqueOrThrow({ where: { id: round.id } })
expect((reloaded.configJson as Record<string, unknown>).useBalancedRanking).toBe(false)
})
})
- Step 2: Run test to verify it fails or passes
Run: npx vitest run tests/unit/round-config-balance-toggle.test.ts
If round.update accepts arbitrary configJson today (passthrough), the test may already pass — that's fine; the test still pins the behavior. If round.update schema-validates configJson against EvaluationConfigSchema and rejects unknown keys, the test FAILS until the field is added.
- Step 3: Add useBalancedRanking to the schema
Modify src/types/competition-configs.ts. Inside EvaluationConfigSchema (right above // Ranking (Phase 3) near line 153), add:
// Whether the ranking dashboard ranks projects by juror-balanced (z-normalized) average.
// Defaulting to true preserves existing behavior. Toggled per-round via the dashboard side panel.
useBalancedRanking: z.boolean().default(true),
- Step 4: Run test to verify it passes
Run: npx vitest run tests/unit/round-config-balance-toggle.test.ts
Expected: PASS.
- Step 5: Run typecheck
Run: npm run typecheck
Expected: no errors.
- Step 6: Commit
git add src/types/competition-configs.ts tests/unit/round-config-balance-toggle.test.ts
git commit -m "feat: add useBalancedRanking flag to round config schema"
Task 7: Wire toggle UI into ranking dashboard side sheet
Files:
- Modify: src/components/admin/round/ranking-dashboard.tsx

- Step 1: Read the dashboard's existing config-save plumbing
Run: grep -n "saveRankingConfig\|updateRoundMutation\|roundData?.configJson" src/components/admin/round/ranking-dashboard.tsx
The component already loads roundData.configJson (~line 475-484) and saves via updateRoundMutation (trpc.round.update). Reuse the same plumbing.
- Step 2: Add local state + initialization
Near the other useState calls for local weights, add:
const [useBalanced, setUseBalanced] = useState(true)
In the existing useEffect that initializes from roundData.configJson, add:
setUseBalanced((cfg.useBalancedRanking as boolean | undefined) ?? true)
Place this line before the weightsInitialized.current guard rather than next to the other setLocal* calls inside it: the toggle should rehydrate every time roundData refetches, while the guard keeps other in-flight edits from being reset. Concretely:
useEffect(() => {
if (!roundData?.configJson) return
const cfg = roundData.configJson as Record<string, unknown>
setUseBalanced((cfg.useBalancedRanking as boolean | undefined) ?? true)
if (weightsInitialized.current) return
const saved = (cfg.criteriaWeights ?? {}) as Record<string, number>
setLocalWeights(saved)
setLocalCriteriaText((cfg.rankingCriteria as string) ?? '')
setLocalScoreWeight((cfg.scoreWeight as number) ?? 5)
setLocalPassRateWeight((cfg.passRateWeight as number) ?? 5)
weightsInitialized.current = true
}, [roundData])
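The cast in the effect trusts whatever is stored under useBalancedRanking. If a stricter read is wanted, a small guard can default to true for anything that is not a real boolean. This is a sketch under that assumption; readUseBalancedRanking is a hypothetical helper, not something the dashboard currently has.

```typescript
// Hypothetical guard: only a real boolean overrides the default of true.
function readUseBalancedRanking(configJson: unknown): boolean {
  if (typeof configJson !== 'object' || configJson === null) return true
  const value = (configJson as Record<string, unknown>).useBalancedRanking
  return typeof value === 'boolean' ? value : true
}

const fromMissingKey = readUseBalancedRanking({ scoreWeight: 5 }) // key absent
const fromExplicitFalse = readUseBalancedRanking({ useBalancedRanking: false })
const fromWrongType = readUseBalancedRanking({ useBalancedRanking: 'false' }) // string, not boolean
const fromNullConfig = readUseBalancedRanking(null)
```

Only the explicit false yields false here; the missing key, the string 'false' and the null config all fall back to true, matching the schema default.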
- Step 3: Add a toggle handler that persists immediately
Add next to saveRankingConfig:
const persistUseBalanced = (next: boolean) => {
setUseBalanced(next)
if (!roundData?.configJson) return
const cfg = roundData.configJson as Record<string, unknown>
updateRoundMutation.mutate({
id: roundId,
configJson: { ...cfg, useBalancedRanking: next },
})
}
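The handler builds its payload by spreading the last fetched configJson and overriding one key, so every other saved setting is sent back unchanged. A minimal sketch of that merge as a pure function follows; withUseBalanced and the sample config values are illustrative assumptions.

```typescript
type RoundConfig = Record<string, unknown>

// Hypothetical helper: returns a new config object with the flag set,
// leaving the input object and all of its other keys untouched.
function withUseBalanced(cfg: RoundConfig, next: boolean): RoundConfig {
  return { ...cfg, useBalancedRanking: next }
}

// Sample values are made up for illustration.
const savedConfig: RoundConfig = { scoreWeight: 5, passRateWeight: 5 }
const updatedConfig = withUseBalanced(savedConfig, false)

console.log(updatedConfig.useBalancedRanking)       // false
console.log(updatedConfig.scoreWeight)              // 5
console.log('useBalancedRanking' in savedConfig)    // false
```

Because the spread copies only what the server last returned, local weight edits that have not been saved yet are not part of this payload.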
- Step 4: Render a toggle row at the top of the side sheet
Inside the <SheetContent> block, just below the header (right above the stats grid at line ~1004), insert:
<div className="mt-4 flex items-center justify-between rounded-lg border p-3">
<div className="flex flex-col">
<span className="text-sm font-medium">Use balanced scoring for ranking</span>
<span className="text-xs text-muted-foreground">
Corrects for per-juror grading style. Off uses raw averages.
</span>
</div>
<Switch checked={useBalanced} onCheckedChange={persistUseBalanced} />
</div>
Add to the imports at the top of the file (only if not already imported):
import { Switch } from '@/components/ui/switch'
- Step 5: Run typecheck
Run: npm run typecheck
Expected: no errors.
- Step 6: Manual smoke test
npm run dev. Open the admin ranking dashboard for a round, click a project, flip the toggle. Reload — it should persist. Open another browser/session — the same value should appear.
- Step 7: Commit
git add src/components/admin/round/ranking-dashboard.tsx
git commit -m "feat: per-round 'use balanced scoring' toggle in side sheet"
Task 8: List sort respects the toggle
Files:
- Modify: src/components/admin/round/ranking-dashboard.tsx
- Step 1: Locate the sort sites
Run: grep -n "balancedAverage\|avgGlobalScore" src/components/admin/round/ranking-dashboard.tsx
Two sort sites read balanced first: the initial sort (~line 417) and the composite computation (~line 879). Both use evalScores.balanced[id]?.balancedAverage ?? raw ?? 0.
- Step 2: Replace the score selector with a helper
Add near the top of the component:
const pickRankingScore = (projectId: string, rawFallback: number | null | undefined): number => {
const balanced = evalScores?.balanced[projectId]?.balancedAverage
if (useBalanced && balanced != null) return balanced
return rawFallback ?? 0
}
Replace the two existing expressions:
- Around line 417, change:
evalScores.balanced[projectId]?.balancedAverage ?? raw ?? 0
to:
pickRankingScore(projectId, raw)
- Around line 879, change:
return evalScores?.balanced[id]?.balancedAverage ?? e?.avgGlobalScore ?? 0
to:
return pickRankingScore(id, e?.avgGlobalScore)
- Step 3: Trigger re-sort when toggle flips
The list memoizes order from evalScores. Add useBalanced to the dependency array of the useMemo / useEffect that computes ranking order.
Run: grep -n "useMemo\|useEffect" src/components/admin/round/ranking-dashboard.tsx | head and locate the order-computation block (look for the useEffect near line 393 noted in the existing comment "Wait for evalScores too — the initial sort uses balanced (juror-corrected)"). Add useBalanced to the dependency array there.
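To see why the toggle has to be in the dependency array, here is a sketch of the selector from Step 2 as a pure function, with a sort built on top of it. The toggle and the balanced map are passed in explicitly rather than read from component state. rankRows, the sample rows, and the assumption that higher scores rank first are illustrative and not taken from the dashboard code.

```typescript
type BalancedEntry = { balancedAverage: number | null }
type BalancedMap = Record<string, BalancedEntry | undefined>
type Row = { projectId: string; raw: number | null }

// Pure variant of pickRankingScore: same selection rule as Step 2.
function pickRankingScore(
  useBalanced: boolean,
  balanced: BalancedMap,
  projectId: string,
  rawFallback: number | null | undefined,
): number {
  const b = balanced[projectId]?.balancedAverage
  if (useBalanced && b != null) return b
  return rawFallback ?? 0
}

// Hypothetical helper: orders project ids by the picked score, highest first.
function rankRows(rows: Row[], balanced: BalancedMap, useBalanced: boolean): string[] {
  const score = (r: Row) => pickRankingScore(useBalanced, balanced, r.projectId, r.raw)
  return [...rows].sort((a, b) => score(b) - score(a)).map((r) => r.projectId)
}

// Made-up sample data: p1 leads on raw, p2 leads once balanced.
const rows: Row[] = [
  { projectId: 'p1', raw: 8.2 },
  { projectId: 'p2', raw: 8.0 },
]
const balanced: BalancedMap = {
  p1: { balancedAverage: 7.9 },
  p2: { balancedAverage: 8.4 },
}

console.log(rankRows(rows, balanced, true))  // [ 'p2', 'p1' ]
console.log(rankRows(rows, balanced, false)) // [ 'p1', 'p2' ]
```

The same inputs produce a different order depending on the flag, which is why the memoized order goes stale if useBalanced is missing from the dependencies.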
- Step 4: Run typecheck
Run: npm run typecheck
Expected: no errors.
- Step 5: Manual smoke test
Flip the toggle in the side sheet. The list under each category should re-sort live.
- Step 6: Commit
git add src/components/admin/round/ranking-dashboard.tsx
git commit -m "feat: list sort respects useBalancedRanking toggle"
Task 9: Rebuild side-panel stats card (Raw + Balanced side-by-side)
Files:
- Modify: src/components/admin/round/ranking-dashboard.tsx (the side-sheet stats grid around lines 1005-1027)
- Step 1: Remove the ⇢ X.X annotation from list rows
Run: grep -n "⇢" src/components/admin/round/ranking-dashboard.tsx
Locate the inner block around lines 204-220 (the "Raw + balanced averages shown side by side" comment). Replace the entire {balancedScore != null && Math.abs(...) >= 0.05 && (…)} JSX block with nothing. Keep the raw {entry.avgGlobalScore.toFixed(1)} rendering.
Also remove the balancedScore prop from the <RankingRow /> component declaration (and its prop interface), since the row no longer uses it, and stop passing it at every call site.
- Step 2: Compute balanced + raw averages for the open project
Inside the side-sheet block, just before the stats grid, compute:
{(() => {
const entry = evalScores?.balanced[selectedProjectId ?? '']
const raw = entry?.rawAverage ?? null
const balanced = entry?.balancedAverage ?? null
// Render the card when at least one of the two averages is available.
if (raw == null && balanced == null) return null
return (
<div className="rounded-lg border p-3">
<p className="text-xs text-muted-foreground mb-2">Avg Score</p>
<div className="flex items-baseline gap-4">
<div className={`flex items-baseline gap-1 ${useBalanced ? 'text-muted-foreground' : 'font-semibold'}`}>
<span className="text-xs">Raw</span>
<span className="text-lg tabular-nums">{raw != null ? raw.toFixed(1) : '—'}</span>
{!useBalanced && <span className="ml-1 text-[10px] text-muted-foreground">← used for ranking</span>}
</div>
<div className={`flex items-baseline gap-1 ${useBalanced ? 'font-semibold' : 'text-muted-foreground'}`}>
<span className="text-xs">Balanced</span>
<span className="text-lg tabular-nums">{balanced != null ? balanced.toFixed(1) : '—'}</span>
{useBalanced && <span className="ml-1 text-[10px] text-muted-foreground">← used for ranking</span>}
</div>
</div>
</div>
)
})()}
The ← used for ranking chip sits next to whichever number is active: the snippet above conditionally renders it inline within each label block, and the bolded label reinforces which number is in use.
- Step 3: Replace the legacy 3-card grid
The existing 3-card grid (Avg / Pass Rate / Evaluators) at lines 1006-1027 keeps Pass Rate + Evaluators but loses the Avg card (replaced by Step 2's combined card). Restructure into a vertical stack:
{projectDetail.stats && (
<div className="space-y-3">
{/* Avg card (Step 2 above) */}
<div className="grid grid-cols-2 gap-3">
<div className="rounded-lg border p-3 text-center">
<p className="text-xs text-muted-foreground">Pass Rate</p>
<p className="mt-1 text-lg font-semibold">
{projectDetail.stats.totalEvaluations > 0
? `${Math.round((projectDetail.stats.yesVotes / projectDetail.stats.totalEvaluations) * 100)}%`
: '—'}
</p>
</div>
<div className="rounded-lg border p-3 text-center">
<p className="text-xs text-muted-foreground">Evaluators</p>
<p className="mt-1 text-lg font-semibold">
{projectDetail.stats.totalEvaluations}
</p>
</div>
</div>
</div>
)}
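The Pass Rate card guards against dividing by zero before rounding to a whole percent. A minimal sketch of that arithmetic as a standalone function follows; formatPassRate is a hypothetical name, and the card itself keeps the expression inline.

```typescript
// Hypothetical helper mirroring the inline expression in the Pass Rate card.
// '\u2014' is the dash placeholder the card shows when there is no data.
function formatPassRate(yesVotes: number, totalEvaluations: number): string {
  if (totalEvaluations <= 0) return '\u2014'
  return `${Math.round((yesVotes / totalEvaluations) * 100)}%`
}

console.log(formatPassRate(2, 3)) // "67%"
console.log(formatPassRate(3, 3)) // "100%"
console.log(formatPassRate(0, 0)) // placeholder dash
```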
- Step 4: Run typecheck + manual smoke test
Run: npm run typecheck
npm run dev and confirm the side panel renders both Raw + Balanced, with the active one bolded.
- Step 5: Commit
git add src/components/admin/round/ranking-dashboard.tsx
git commit -m "feat: side panel shows both raw and balanced averages, list view drops delta annotation"
Task 10: Per-juror "typical / contributes as" chip
Files:
- Modify: src/server/routers/ranking.ts (extend the data the dashboard already fetches to include per-juror balance stats)
- Modify: src/components/admin/round/ranking-dashboard.tsx
- Step 1: Extend the ranking router's evalScores response with per-juror stats
Open src/server/routers/ranking.ts and locate the procedure that returns byProject + balanced (around lines 488-540). After computing balanceCtx, extend the response with a jurorStats map:
const jurorStats: Record<string, { mean: number; stddev: number; count: number }> = {}
for (const [userId, s] of balanceCtx.jurorStats.entries()) {
jurorStats[userId] = { mean: s.mean, stddev: s.stddev, count: s.count }
}
return { byProject, balanced, jurorStats, overallMean: balanceCtx.overallMean, overallStddev: balanceCtx.overallStddev }
- Step 2: Render the chip in the per-juror list
In the side sheet, find the per-juror row (around lines 1046-1090). After the existing <Badge variant="outline">Score: {a.evaluation?.globalScore?.toFixed(1) ?? '—'}</Badge>, render a chip when balanced is on AND we have stats for this juror:
{useBalanced && (() => {
const stats = evalScores?.jurorStats?.[a.userId]
const score = a.evaluation?.globalScore
if (!stats || score == null) return null
const overallMean = evalScores!.overallMean
const overallStddev = evalScores!.overallStddev
if (overallStddev === 0) return null
const z = stats.stddev > 0 ? (score - stats.mean) / stats.stddev : (score - overallMean) / overallStddev
const contributesAs = overallMean + z * overallStddev
return (
<span className="ml-2 text-xs text-muted-foreground" title="Juror's personal scoring baseline → rescaled contribution">
typical {stats.mean.toFixed(1)} → contributes {contributesAs.toFixed(1)}
</span>
)
})()}
(a.userId may be absent in the existing select. If so, add userId: true to the assignment select inside getProjectDetail and re-thread, or use a.user?.id if already selected. Run grep -n "user: { select" src/server/routers/analytics.ts to confirm.)
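The chip's arithmetic can be checked in isolation. The sketch below lifts the z-score and rescale steps from the snippet above into a pure function; contributesAs as a function name and the sample numbers are illustrative assumptions.

```typescript
type JurorStats = { mean: number; stddev: number; count: number }

// Same rule as the chip: normalise against the juror's own baseline when the
// juror has spread, otherwise against the round-wide baseline, then rescale
// onto the round's scale. Returns null when the round itself has no spread.
function contributesAs(
  score: number,
  stats: JurorStats,
  overallMean: number,
  overallStddev: number,
): number | null {
  if (overallStddev === 0) return null
  const z =
    stats.stddev > 0
      ? (score - stats.mean) / stats.stddev
      : (score - overallMean) / overallStddev
  return overallMean + z * overallStddev
}

// Made-up numbers: a harsh juror (typical 5, spread 2) gives a 7 in a round
// whose overall mean is 7 and spread is 1, so z = 1 and the 7 counts as 8.
console.log(contributesAs(7, { mean: 5, stddev: 2, count: 6 }, 7, 1)) // 8
// A juror with no spread falls back to the round baseline: the score is unchanged.
console.log(contributesAs(6, { mean: 6, stddev: 0, count: 1 }, 7, 2)) // 6
console.log(contributesAs(7, { mean: 5, stddev: 2, count: 6 }, 7, 0)) // null
```

In the zero-spread fallback the rescaled value equals the raw score, so the chip would read as "no adjustment" for that juror.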
- Step 3: Run typecheck + manual smoke test
Run: npm run typecheck
npm run dev. Open the side panel. With balanced on, each juror row should show "typical X.X → contributes Y.Y". With it off, the chip disappears.
- Step 4: Commit
git add src/server/routers/ranking.ts src/components/admin/round/ranking-dashboard.tsx
git commit -m "feat: side panel shows per-juror baseline and balanced contribution"
Task 11: Build shared <ScoreExplainerDialog />
Files:
- Create: src/components/shared/score-explainer-dialog.tsx
- Step 1: Create the dialog component
'use client'
import {
Dialog,
DialogContent,
DialogHeader,
DialogTitle,
DialogTrigger,
} from '@/components/ui/dialog'
import { Button } from '@/components/ui/button'
import { Info } from 'lucide-react'
import type { ReactNode } from 'react'
export function ScoreExplainerDialog({ trigger }: { trigger?: ReactNode }) {
return (
<Dialog>
<DialogTrigger asChild>
{trigger ?? (
<Button variant="ghost" size="sm" className="h-7 gap-1 px-2 text-xs">
<Info className="h-3.5 w-3.5" />
How scores are calculated
</Button>
)}
</DialogTrigger>
<DialogContent className="max-w-xl max-h-[85vh] overflow-y-auto">
<DialogHeader>
<DialogTitle>How scores are calculated</DialogTitle>
</DialogHeader>
<div className="space-y-4 text-sm">
<p>
Different jurors have different grading styles. Some grade harshly, some leniently.
Balanced scoring corrects for that so a project isn't punished for drawing harsh
jurors or rewarded for drawing lenient ones.
</p>
<div>
<h3 className="font-semibold mb-1">How it works</h3>
<ol className="list-decimal pl-5 space-y-1">
<li>For each juror, calculate their personal average and spread across all the projects they scored in this round.</li>
<li>Convert each individual score into "how many standard deviations above or below this juror's typical" — a 6 from a juror who averages 5 reads the same as a 9 from a juror who averages 8.</li>
<li>Average those normalized values across the project's jurors.</li>
<li>Rescale back onto the same 1–10 scale using the round's overall average and spread.</li>
<li>The result is directly comparable to the raw average — same scale, but corrected for grading style.</li>
</ol>
</div>
<div>
<h3 className="font-semibold mb-1">Worked example</h3>
<table className="w-full text-xs border-collapse">
<thead>
<tr className="border-b">
<th className="py-1 text-left">Juror</th>
<th className="py-1 text-left">Their typical avg</th>
<th className="py-1 text-left">Score for "Project X"</th>
<th className="py-1 text-left">What that means</th>
</tr>
</thead>
<tbody>
<tr className="border-b">
<td className="py-1">Juror A (lenient)</td>
<td>8.2</td>
<td>9.0</td>
<td>Just above their typical (+0.4σ)</td>
</tr>
<tr className="border-b">
<td className="py-1">Juror B (harsh)</td>
<td>5.8</td>
<td>7.5</td>
<td>Well above their typical (+1.5σ)</td>
</tr>
<tr>
<td className="py-1">Juror C (typical)</td>
<td>7.0</td>
<td>8.0</td>
<td>Slightly above their typical (+0.7σ)</td>
</tr>
</tbody>
</table>
<p className="mt-2 text-xs text-muted-foreground">
Raw average: (9.0 + 7.5 + 8.0) / 3 = <strong>8.2</strong>.
Balanced average rescales each juror's enthusiasm to the round's overall scale and lands at
roughly <strong>8.4</strong> — Juror B's strong endorsement (well above their harsh baseline)
carries more weight than the raw 7.5 suggests.
</p>
</div>
<div>
<h3 className="font-semibold mb-1">When it kicks in</h3>
<ul className="list-disc pl-5 space-y-1">
<li>Needs at least 2 evaluations from the round to compute a juror's spread; otherwise that juror falls back to the round-wide average.</li>
<li>Needs at least one juror with non-zero spread; if every juror gave identical scores, balanced equals raw.</li>
<li>Computed within a single round only — a juror's grading style in an intake screening doesn't affect their balance in a deep evaluation.</li>
</ul>
</div>
<div>
<h3 className="font-semibold mb-1">Why we still show "Raw"</h3>
<p>
Both numbers are always shown so you can sanity-check the correction. The toggle at the top of the
side panel decides which one is used for ranking.
</p>
</div>
</div>
</DialogContent>
</Dialog>
)
}
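The worked example in the dialog can be sanity-checked with a few lines of arithmetic. The z-values and raw scores below come from the table. The round-wide mean and spread are not stated anywhere in the dialog, so 7.0 and 1.6 are assumptions picked only to illustrate the rescale step.

```typescript
// From the dialog's table: each juror's score for "Project X" and how far it
// sits from that juror's typical average, in standard deviations.
const rawScores = [9.0, 7.5, 8.0]
const zScores = [0.4, 1.5, 0.7]

// ASSUMPTION: illustrative round-wide mean and spread; not given in the dialog.
const roundMean = 7.0
const roundStddev = 1.6

const average = (xs: number[]): number => xs.reduce((a, b) => a + b, 0) / xs.length

const rawAverage = average(rawScores)
// Average the normalised values, then rescale onto the round's scale.
const balancedAverage = roundMean + average(zScores) * roundStddev

console.log(rawAverage.toFixed(1))      // "8.2"
console.log(balancedAverage.toFixed(1)) // "8.4"
```

Under these assumed round statistics the result matches the dialog's "roughly 8.4"; a different round mean or spread would shift the balanced figure.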
- Step 2: Run typecheck
Run: npm run typecheck
Expected: no errors.
- Step 3: Commit
git add src/components/shared/score-explainer-dialog.tsx
git commit -m "feat: shared 'How scores are calculated' explainer dialog"
Task 12: Wire the explainer into admin + observer surfaces
Files:
- Modify: src/components/admin/round/ranking-dashboard.tsx
- Modify: src/components/observer/observer-project-detail.tsx
- Modify: src/components/observer/reports/project-preview-dialog.tsx
- Step 1: Mount in admin side sheet
Inside the side sheet, just below the Avg Score combined card from Task 9 Step 2, add:
import { ScoreExplainerDialog } from '@/components/shared/score-explainer-dialog'
// …
<div className="flex justify-end">
<ScoreExplainerDialog />
</div>
- Step 2: Mount in observer full project detail
Find the stats area in src/components/observer/observer-project-detail.tsx (search for "averageGlobalScore" or "Avg Score") and add the same <ScoreExplainerDialog /> button next to it.
- Step 3: Mount in observer reports preview dialog
Inside src/components/observer/reports/project-preview-dialog.tsx, in the Evaluation summary block (around the existing Avg Score card), add the explainer button.
- Step 4: Run typecheck + manual smoke test
Run: npm run typecheck
npm run dev. Click the "How scores are calculated" button in each of the three locations and confirm the dialog renders.
- Step 5: Commit
git add src/components/admin/round/ranking-dashboard.tsx src/components/observer/observer-project-detail.tsx src/components/observer/reports/project-preview-dialog.tsx
git commit -m "feat: mount score explainer dialog in admin and observer surfaces"
Task 13: Decimal display audit
Files:
- Modify: src/app/(admin)/admin/reports/page.tsx:368
- Step 1: Replace toFixed(2) with toFixed(1)
Find the line:
{p.balancedScore == null ? '-' : p.balancedScore.toFixed(2)}
Change to:
{p.balancedScore == null ? '-' : p.balancedScore.toFixed(1)}
- Step 2: Grep for any other 2-decimal score displays
Run: grep -rn "toFixed(2)" src/components src/app --include="*.tsx" | grep -iE "balanced|avg|score"
For any results that show balanced/raw scores, change to toFixed(1). Skip any rate/percentage displays that should stay at 2 decimals.
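One way to keep score displays from drifting apart again is to route them through a single formatter. This is only a sketch: formatScore is a hypothetical helper, and the plan itself changes the call sites inline.

```typescript
// Hypothetical shared formatter: one decimal for any score, with a
// configurable placeholder for missing values ('-' matches the reports page).
function formatScore(value: number | null | undefined, placeholder = '-'): string {
  return value == null ? placeholder : value.toFixed(1)
}

console.log(formatScore(24.5 / 3))          // "8.2"
console.log(formatScore(25 / 3))            // "8.3"
console.log(formatScore(7))                 // "7.0"
console.log(formatScore(null))              // "-"
console.log(formatScore(undefined, 'n/a'))  // "n/a"
```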
- Step 3: Commit
git add <files>
git commit -m "fix: standardize score displays on one decimal"
Task 14: Verify list-view delta annotation removal
Files:
- (No new modification; verifies Task 9 Step 1 landed.)
- Step 1: Grep for any remaining ⇢ characters
Run: grep -rn "⇢" src/components --include="*.tsx"
Expected: no matches.
- Step 2: Grep for the now-unused balancedScore prop on the row component
Run: grep -n "balancedScore" src/components/admin/round/ranking-dashboard.tsx
Expected: occurrences only inside the side-sheet block, not on the row component's props or render.
If anything remains, remove it.
- Step 3: Commit (if changes were made)
git add src/components/admin/round/ranking-dashboard.tsx
git commit -m "chore: remove leftover balancedScore plumbing on list row"
Task 15: Final verification
- Step 1: Run the full test suite
Run: npx vitest run
Expected: PASS.
- Step 2: Run typecheck
Run: npm run typecheck
Expected: no errors.
- Step 3: Run a production build
Run: npm run build
Expected: build completes successfully.
- Step 4: Manual end-to-end smoke
npm run dev. Walk through the spec's acceptance criteria:
- With 3 round-scoped evaluations of 9, 8, 8, the side panel shows Avg 8.3 and Evaluators 3.
- Flipping the toggle re-sorts the list view, and the setting persists across reloads and across users.
- List view shows no per-row delta annotation.
- Side panel shows both Raw and Balanced; the active one is highlighted.
- Edition-mode rankings differ from before (compute one project by hand; it should match the per-round rollup).
- Observer project detail page defaults to the active or most recently closed round.
- All score displays show one decimal.
- "How scores are calculated" opens from admin side panel, observer detail page, and observer preview dialog.
- Step 5: No new commit unless something needed fixing
If any acceptance criterion fails, create a fix commit. Otherwise nothing to commit here.