18 KiB
Codebase Concerns
Analysis Date: 2026-02-26
Tech Debt
Award Router is a Stub:
- Issue: Entire award router is commented out and non-functional. All award procedures were deleted when the Pipeline/Track models were removed and have not been reimplemented.
- Files:
src/server/routers/award.ts - Impact: Any UI that references award management procedures will fail at runtime. The
SpecialAwardmodel still exists in the schema but has no tRPC exposure via this router. - Fix approach: Reimplement the router procedures against the current
SpecialAward→CompetitionFK relationship. See the TODO comment at line 9 for the list of procedures to reimplement.
Deliberation Page Has Incomplete Implementation:
- Issue: The jury-facing deliberation page has two hardcoded stub values that break actual vote submission.
juryMemberId: ''— submitted vote will have an empty juror ID.const hasVoted = false— the "already voted" guard never fires, allowing duplicate vote submissions.
- Files:
src/app/(jury)/jury/competitions/deliberation/[sessionId]/page.tsxlines 34 and 66 - Impact: Jury members can submit blank/duplicate votes in deliberation sessions. The submitted vote will be associated with an empty string
juryMemberId, which will likely fail at the Prisma level or silently create bad data. - Fix approach: Derive
juryMemberIdfromsession.participantsby matchingctx.user.id. DerivehasVotedby checking if aDeliberationVotewith the current user's jury member ID already exists insession.votes.
Audit Middleware is a No-Op:
- Issue: The
withAuditLogmiddleware insrc/server/trpc.ts(lines 99–114) identifies mutation calls by path pattern but does nothing with them — the body contains only a comment:// We'll implement this in the audit service. - Files:
src/server/trpc.ts - Impact: Automatic centralised audit logging for all admin mutations does not occur through this middleware. Manual
logAudit()/DecisionAuditLog.create()calls are present in many routers but coverage is inconsistent. - Fix approach: Implement the middleware body to call
logAudit()withctx.user.id,path, and serialized input/output. This provides a safety net for any procedure that doesn't calllogAuditmanually.
In-Memory Rate Limiter Not Suitable for Multi-Instance Deployment:
- Issue:
src/lib/rate-limit.tsuses a module-levelMapfor rate limit state. This works in a single process but does not share state across multiple Node.js instances or after a process restart. - Files:
src/lib/rate-limit.ts - Impact: Rate limits can be trivially bypassed by hitting different server instances. Auth brute-force protection (5-attempt lockout) also uses this same in-memory store (
src/lib/auth.tsline 12). - Fix approach: Replace with Redis-based rate limiting (e.g.,
@upstash/ratelimitorioredis). The comment at line 5 already acknowledges this: "For production with multiple instances, replace with Redis-based solution."
configJson Widely Cast Without Validation:
- Issue:
configJson(a PrismaJsonfield) is cast directly toRecord<string, unknown>in 65 locations across server routers and services without running it through the Zod validators. The validators (safeValidateRoundConfig,EvaluationConfigSchema.safeParse) are only called in 4 locations. - Files:
src/server/routers/assignment.ts,src/server/routers/evaluation.ts,src/server/routers/filtering.ts,src/server/services/round-engine.ts, and many others. - Impact: Stale or malformed config JSON stored in the database can cause silent runtime failures deep in business logic (e.g., missing criteria, wrong field names) without a clear validation error.
- Fix approach: Extract a typed
parseRoundConfig(roundType, configJson)utility that returns a typed config or throws aTRPCError. Replace bareas Record<string, unknown>casts with this utility at query boundaries.
Known Bugs
Tag Rename Performs N+1 Database Writes:
- Symptoms: Renaming a tag iterates over every user and every project that has the tag, issuing one
UPDATEper record. - Files:
src/server/routers/tag.tslines 361–389 and 421–438 - Trigger: Admin renames any tag that is widely used.
- Workaround: None. Will time out for large datasets.
- Fix approach: Use a raw SQL
UPDATE ... SET tags = array_replace(tags, $old, $new)or a Prisma$executeRawto perform the rename in a single query.
Jury Deliberation Vote: juryMemberId Is Hardcoded Empty String:
- Symptoms: Votes submitted via the jury deliberation page will have
juryMemberId: ''. - Files:
src/app/(jury)/jury/competitions/deliberation/[sessionId]/page.tsxline 34 - Trigger: Any jury member visits a deliberation session and submits a vote.
- Workaround: None — the vote will silently pass or fail depending on Prisma validation.
Security Considerations
IP Header Spoofing on Rate Limiter:
- Risk: All rate limiters extract the client IP from
x-forwarded-forwithout validating that the header originates from a trusted proxy. A client can set this header to any value, bypassing per-IP rate limits. - Files:
src/app/api/trpc/[trpc]/route.tslines 13–18,src/app/api/auth/[...nextauth]/route.tslines 7–10,src/app/api/email/change-password/route.tsline 36,src/app/api/email/verify-credentials/route.tsline 20,src/server/context.tsline 13. - Current mitigation: Nginx passes
X-Forwarded-Forfrom upstream; in single-proxy deployment this reduces (but does not eliminate) risk. - Recommendations: Pin IP extraction to
req.headers.get('x-real-ip')set by Nginx only, or validate the forwarded-for chain against a trusted proxy list.
Cron Secret Compared with !== (Non–Timing-Safe):
- Risk: String equality check
cronSecret !== process.env.CRON_SECRETis vulnerable to timing side-channel attacks on the secret value. - Files:
src/app/api/cron/audit-cleanup/route.ts,src/app/api/cron/digest/route.ts,src/app/api/cron/draft-cleanup/route.ts,src/app/api/cron/reminders/route.ts— all at line 8. - Current mitigation: Cron endpoints are not user-facing and rate limited at Nginx level.
- Recommendations: Replace with
timingSafeEqual(Buffer.from(cronSecret), Buffer.from(process.env.CRON_SECRET))— the same approach already used insrc/lib/storage/local-provider.tsline 75.
No Content-Security-Policy Header:
- Risk: No CSP is set in
next.config.tsor via middleware headers. If an XSS vector exists, there is no second line of defence to limit script execution. - Files:
next.config.ts(missingheaders()function),docker/nginx/mopc-platform.conf(missing CSP directive). - Current mitigation: Nginx sets
X-Frame-Options,X-Content-Type-Options, andX-XSS-Protection, but these are legacy headers. No HSTS header is configured in Nginx either (only set post-certbot). - Recommendations: Add
Content-Security-PolicyandStrict-Transport-Securityvia the Next.jsheaders()config function.
MinIO Fallback Credentials Hardcoded:
- Risk: When
MINIO_ACCESS_KEY/MINIO_SECRET_KEYare not set in non-production environments, the client defaults tominioadmin/minioadmin. - Files:
src/lib/minio.tslines 28–29 - Current mitigation: Production throws an error if credentials are missing (line 20–22). The fallback only applies in development.
- Recommendations: Remove the hardcoded fallback entirely; require credentials in all environments to prevent accidental exposure of a non-production MinIO instance.
NEXT_PUBLIC_MINIO_ENDPOINT Undefined in Production:
- Risk: Two admin learning pages read
process.env.NEXT_PUBLIC_MINIO_ENDPOINTat runtime. This variable is not defined indocker-compose.ymland has noNEXT_PUBLIC_entry. Next.js requires public env vars to be present at build time; at runtime this will always resolve to the fallbackhttp://localhost:9000, making file previews broken in production. - Files:
src/app/(admin)/admin/learning/new/page.tsxline 112,src/app/(admin)/admin/learning/[id]/page.tsxline 165. - Fix approach: Add
NEXT_PUBLIC_MINIO_ENDPOINTtodocker-compose.ymlenv section, or use the server-sideMINIO_PUBLIC_ENDPOINTvia a tRPC query rather than client-side env.
Performance Bottlenecks
Unbounded findMany Queries in Analytics Router:
- Problem:
src/server/routers/analytics.tscontains approximately 25findManycalls with notakelimit. With large competitions (hundreds of projects, thousands of evaluations) these will perform full table scans filtered only byroundIdorcompetitionId. - Files:
src/server/routers/analytics.ts— queries at lines 38, 80, 265, 421, 539, 582, 649, 749, 794, 1207, 1227, 1346, 1406, 1481, 1498, 1654, 1677, 1700. - Cause: Analytics queries are built for correctness, not scale. They load entire result sets into Node.js memory before aggregation.
- Improvement path: Move aggregation into the database using Prisma
groupByand_count/_avgaggregations, or write$queryRawSQL for complex analytics. Add pagination or date-range limits to the procedure inputs.
Tag Rename N+1 Pattern:
- Problem: Renaming a tag issues one DB write per entity (user or project) that carries the tag rather than a single bulk update.
- Files:
src/server/routers/tag.tslines 355–390 - Cause: Prisma does not support
array_replacenatively; the current implementation works around this with a loop. - Improvement path: Use
prisma.$executeRawwith PostgreSQL'sarray_replacefunction.
assignment.ts Router is 3,337 Lines:
- Problem: The assignment router is the single largest file and handles jury assignments, AI assignment, manual overrides, transfer, COI, and coverage checks in one module.
- Files:
src/server/routers/assignment.ts - Cause: Organic growth without module splitting.
- Improvement path: Extract into separate files:
src/server/routers/assignment/manual.ts,assignment/ai.ts,assignment/coverage.ts. This will also improve IDE performance and test isolation.
Fragile Areas
Round Engine: COMPLETED State Has No Guards on Re-Entry:
- Files:
src/server/services/round-engine.tslines 57–64 - Why fragile:
COMPLETEDis defined as a terminal state (empty transitions array). However, there is no server-side guard preventing a direct Prisma update to aCOMPLETEDproject state outside oftransitionProjectState(). If a bug or data migration bypasses the state machine, projects can end up in unexpected states. - Safe modification: Always transition through
transitionProjectState(). Any admin data repair scripts should call this function rather than usingprisma.projectRoundState.updatedirectly. - Test coverage: No unit tests for project state transitions. Only
tests/unit/assignment-policy.test.tsexists, covering a different subsystem.
Prisma.$transaction Typed as any:
- Files:
src/server/services/round-engine.tsline 129,src/server/services/result-lock.tslines 87, 169,src/server/services/mentor-workspace.tslines 39, 254, and 50+ other locations. - Why fragile:
tx: anydisables all type-checking inside transaction callbacks. A mistakenly called method ontx(e.g.,tx.round.deleteinstead oftx.round.update) will compile successfully but may cause silent data corruption. - Safe modification: Type the callback as
(tx: Parameters<Parameters<typeof prisma.$transaction>[0]>[0]) => ...or define aTransactionalPrismatype alias. ThePrismaClient | anyunion also defeats the purpose of typing.
email.ts is 2,175 Lines:
- Files:
src/lib/email.ts - Why fragile: All email templates, SMTP transport logic, and dynamic config loading are in one file. Adding a new email type requires navigating 2,000+ lines, and any change to transport setup affects all templates.
- Safe modification: Extract individual email functions into
src/lib/email/subdirectory with one file per template type. Keep shared transport logic insrc/lib/email/transport.ts. - Test coverage: No tests for email sending. Email errors are caught and logged but not surfaced to callers consistently.
admin/rounds/[roundId]/page.tsx is 2,398 Lines:
- Files:
src/app/(admin)/admin/rounds/[roundId]/page.tsx - Why fragile: The entire round management UI (config, assignments, filtering, deliberation controls) lives in a single client component. State from one section can accidentally affect another, and the component re-renders on any state change.
- Safe modification: Extract tab sections into separate
'use client'components with scoped state. Consider converting to a tab-based layout with lazy loading.
SSE Live Voting Stream Relies on Polling:
- Files:
src/app/api/live-voting/stream/route.tslines 184–194 - Why fragile: The SSE endpoint polls the database every 2 seconds per connected client. Under live ceremony conditions with many simultaneous audience connections, this can produce significant database load.
- Safe modification: Introduce a Redis pub/sub channel that the vote submission path writes to, and have the SSE stream subscribe to the channel rather than polling. Alternatively, implement a debounce on the poll and share the result across all open SSE connections via a singleton broadcaster.
Scaling Limits
In-Memory State (Rate Limiter, Login Attempts):
- Current capacity: Works correctly for a single Node.js process.
- Limit: Breaks under horizontal scaling or after a process restart (all rate limit windows reset).
- Scaling path: Replace
src/lib/rate-limit.tswith a Redis-backed solution. Replace thefailedAttemptsMap insrc/lib/auth.tswith Redis counters or database fields on theUsermodel.
SSE Connection Count vs. Database Poll Rate:
- Current capacity: Each SSE client issues 1 database query per 2 seconds.
- Limit: At 100 concurrent audience connections, this is 50 queries/second to
liveVotingSessionand related tables during a ceremony. - Scaling path: Shared broadcaster pattern (one database poll, fan-out to all SSE streams) or Redis pub/sub as described above.
Dependencies at Risk
next-auth v5 (Auth.js) — Beta API:
- Risk: Auth.js v5 was in release candidate status at time of integration. The API surface (
authConfig+handlers+auth) differs significantly from v4. Upgrading to a stable v5 release may require changes tosrc/lib/auth.tsandsrc/lib/auth.config.ts. - Impact: Session type definitions, adapter interfaces, and middleware patterns may change.
- Migration plan: Monitor the Auth.js v5 stable release. Changes are likely limited to
src/lib/auth.tsandsrc/types/next-auth.d.ts.
Missing Critical Features
No Database Backup Configuration:
- Problem:
docker-compose.ymlhas no scheduled backup service or volume snapshot configuration for the PostgreSQL container. - Blocks: Point-in-time recovery after data loss or accidental deletion.
- Recommendation: Add a sidecar backup service (e.g.,
prodrigestivill/postgres-backup-local) or configure WAL archiving to MinIO.
No Error Monitoring / Observability:
- Problem: There is no Sentry, Datadog, or equivalent error monitoring integration. Application errors are only logged to stdout via
console.error. In production, these are only visible if the Docker logs are actively monitored. - Files: No integration found in
src/instrumentation.tsor anywhere else. - Blocks: Proactive detection of runtime errors, AI service failures, and payment/submission edge cases.
- Recommendation: Add Sentry (
@sentry/nextjs) insrc/instrumentation.ts— Next.js has native support for this. Filter out expected errors (e.g.,TRPCErrorwithNOT_FOUND) to reduce noise.
No Automated Tests for Core Business Logic (Round Engine, Evaluation, Filtering):
- Problem: Only one test file exists:
tests/unit/assignment-policy.test.ts. The round engine state machine (src/server/services/round-engine.ts), evaluation submission flow (src/server/routers/evaluation.ts), and AI filtering pipeline (src/server/services/ai-filtering.ts) have no automated tests. - Blocks: Confident refactoring of the state machine, advance-criterion logic, and scoring.
- Recommendation: Add unit tests for
activateRound,closeRound,transitionProjectState(happy path + guard failures), andsubmitEvaluation(COI check, advance criterion logic, score validation).
Test Coverage Gaps
Round Engine State Machine:
- What's not tested: All
activateRound,closeRound,archiveRound,transitionProjectStatetransitions, including guard conditions (e.g., activating an ARCHIVED round, transitioning a COMPLETED project). - Files:
src/server/services/round-engine.ts - Risk: A regression in state transition guards could allow data corruption (e.g., re-activating a closed round, double-passing a project).
- Priority: High
Evaluation Submission (advance criterion, COI, scoring):
- What's not tested:
submitEvaluationmutation — specifically the advance criterion auto-transition logic (lines 1637–1646 ofsrc/server/routers/evaluation.ts), COI auto-reassignment ondeclareCOI, andupsertFormcriterion validation. - Files:
src/server/routers/evaluation.ts - Risk: Regression in advance criterion will silently skip project advancement. COI declaration failures are caught and logged but untested.
- Priority: High
AI Anonymization:
- What's not tested:
sanitizeText,anonymizeProject,validateNoPersonalInfoinsrc/server/services/anonymization.ts. These are GDPR-critical functions. - Files:
src/server/services/anonymization.ts - Risk: A PII leak in AI calls would violate GDPR without detection.
- Priority: High
Assignment Policy Execution:
- What's not tested: End-to-end
executeAssignmentflow insrc/server/services/round-assignment.ts— specifically the COI filtering, geo-diversity penalty, familiarity bonus, and under-coverage gap-fill. - Files:
src/server/services/round-assignment.ts,src/server/services/smart-assignment.ts - Risk: Silent over- or under-assignment when constraints interact.
- Priority: Medium
Concerns audit: 2026-02-26