Optimize AI system with batching, token tracking, and GDPR compliance

- Add AIUsageLog model for persistent token/cost tracking
- Implement batched processing for all AI services:
  - Assignment: 15 projects/batch
  - Filtering: 20 projects/batch
  - Award eligibility: 20 projects/batch
  - Mentor matching: 15 projects/batch
- Create unified error classification (ai-errors.ts)
- Enhance anonymization with comprehensive project data
- Add AI usage dashboard to Settings page
- Add usage stats endpoints to settings router
- Create AI system documentation (5 files)
- Create GDPR compliance documentation (2 files)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 11:58:12 +01:00
parent a72e815d3a
commit 928b1c65dc
19 changed files with 4103 additions and 601 deletions
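
Note on the batched processing named in the commit message: the sketch below shows one way a project list could be split into fixed-size batches and processed sequentially. The batch sizes are the ones listed in the message; the names `BATCH_SIZES`, `chunk`, and `processInBatches` are hypothetical and are not taken from the commit, so the actual services may batch differently.

```typescript
// Hypothetical lookup table; the numbers come from the commit message.
const BATCH_SIZES = {
  ASSIGNMENT: 15,
  FILTERING: 20,
  AWARD_ELIGIBILITY: 20,
  MENTOR_MATCHING: 15,
} as const

type AIAction = keyof typeof BATCH_SIZES

// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error('size must be a positive integer')
  }
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size))
  }
  return batches
}

// Run an async handler once per batch, one batch at a time,
// and concatenate the per-batch results in order.
async function processInBatches<T, R>(
  items: readonly T[],
  action: AIAction,
  handler: (batch: T[]) => Promise<R[]>,
): Promise<R[]> {
  const results: R[] = []
  for (const batch of chunk(items, BATCH_SIZES[action])) {
    results.push(...(await handler(batch)))
  }
  return results
}
```

Processing batches sequentially keeps each request within a predictable token budget; whether the real services run batches sequentially or in parallel is not stated in the commit.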
--- a/docs/architecture/ai-services.md
+++ b/docs/architecture/ai-services.md
@@ -0,0 +1,249 @@
+# AI Services Reference
+
+## 1. AI Filtering Service
+
+**File:** `src/server/services/ai-filtering.ts`
+
+**Purpose:** Evaluate projects against admin-defined criteria text
+
+### Input
+- List of projects (anonymized)
+- Criteria text (e.g., "Projects must be based in the Mediterranean region")
+- Rule configuration (PASS/REJECT/FLAG actions)
+
+### Output
+Per-project result:
+- `meets_criteria` - Boolean
+- `confidence` - 0-1 score
+- `reasoning` - Explanation
+- `quality_score` - 1-10 rating
+- `spam_risk` - Boolean flag
+
+### Configuration
+- **Batch Size:** 20 projects per API call
+- **Description Limit:** 500 characters
+- **Token Usage:** ~1500-2500 tokens per batch
+
+### Example Criteria
+- "Filter out any project without a description"
+- "Only include projects founded after 2020"
+- "Reject projects with fewer than 2 team members"
+- "Projects must be based in the Mediterranean region"
+
+### Usage
+```typescript
+import { aiFilterProjects } from '@/server/services/ai-filtering'
+
+const results = await aiFilterProjects(
+  projects,
+  'Only include projects with ocean conservation focus',
+  userId,
+  roundId
+)
+```
+
+---
+
+## 2. AI Assignment Service
+
+**File:** `src/server/services/ai-assignment.ts`
+
+**Purpose:** Match jurors to projects based on expertise alignment
+
+### Input
+- List of jurors with expertise tags
+- List of projects with tags/category
+- Constraints:
+  - Required reviews per project
+  - Max assignments per juror
+  - Existing assignments (to avoid duplicates)
+
+### Output
+Suggested assignments:
+- `jurorId` - Juror to assign
+- `projectId` - Project to assign
+- `confidenceScore` - 0-1 match confidence
+- `expertiseMatchScore` - 0-1 expertise overlap
+- `reasoning` - Explanation
+
+### Configuration
+- **Batch Size:** 15 projects per batch (all jurors included)
+- **Description Limit:** 300 characters
+- **Token Usage:** ~2000-3500 tokens per batch
+
+### Fallback Algorithm
+When AI is unavailable, the service falls back to:
+1. Tag overlap scoring (60% weight)
+2. Load balancing (40% weight)
+3. Constraint satisfaction
+
+### Usage
+```typescript
+import { generateAIAssignments } from '@/server/services/ai-assignment'
+
+const result = await generateAIAssignments(
+  jurors,
+  projects,
+  {
+    requiredReviewsPerProject: 3,
+    maxAssignmentsPerJuror: 10,
+    existingAssignments: [],
+  },
+  userId,
+  roundId
+)
+```
+
+---
+
+## 3. Award Eligibility Service
+
+**File:** `src/server/services/ai-award-eligibility.ts`
+
+**Purpose:** Determine which projects qualify for special awards
+
+### Input
+- Award criteria text (plain language)
+- List of projects (anonymized)
+- Optional: Auto-tag rules (field-based matching)
+
+### Output
+Per-project:
+- `eligible` - Boolean
+- `confidence` - 0-1 score
+- `reasoning` - Explanation
+- `method` - 'AI' or 'AUTO'
+
+### Configuration
+- **Batch Size:** 20 projects per API call
+- **Description Limit:** 400 characters
+- **Token Usage:** ~1500-2500 tokens per batch
+
+### Auto-Tag Rules
+Deterministic rules can be combined with AI:
+```typescript
+const rules: AutoTagRule[] = [
+  { field: 'country', operator: 'equals', value: 'Italy' },
+  { field: 'competitionCategory', operator: 'equals', value: 'STARTUP' },
+]
+```
+
+### Usage
+```typescript
+import { aiInterpretCriteria, applyAutoTagRules } from '@/server/services/ai-award-eligibility'
+
+// Deterministic matching
+const autoResults = applyAutoTagRules(rules, projects)
+
+// AI-based criteria interpretation
+const aiResults = await aiInterpretCriteria(
+  'Projects focusing on marine biodiversity',
+  projects,
+  userId,
+  awardId
+)
+```
+
+---
+
+## 4. Mentor Matching Service
+
+**File:** `src/server/services/mentor-matching.ts`
+
+**Purpose:** Recommend mentors for projects based on expertise
+
+### Input
+- Project details (single or batch)
+- Available mentors with expertise tags and availability
+
+### Output
+Ranked list of mentor matches:
+- `mentorId` - Mentor ID
+- `confidenceScore` - 0-1 overall match
+- `expertiseMatchScore` - 0-1 expertise overlap
+- `reasoning` - Explanation
+
+### Configuration
+- **Batch Size:** 15 projects per batch
+- **Description Limit:** 350 characters
+- **Token Usage:** ~1500-2500 tokens per batch
+
+### Fallback Algorithm
+Keyword-based matching is used when AI is unavailable:
+1. Extract keywords from project tags/description
+2. Match against mentor expertise tags
+3. Factor in availability (current assignments vs. maximum)
+
+### Usage
+```typescript
+import {
+  getAIMentorSuggestions,
+  getAIMentorSuggestionsBatch
+} from '@/server/services/mentor-matching'
+
+// Single project
+const matches = await getAIMentorSuggestions(prisma, projectId, 5, userId)
+
+// Batch processing
+const batchResults = await getAIMentorSuggestionsBatch(
+  prisma,
+  projectIds,
+  5,
+  userId
+)
+```
+
+---
+
+## Common Patterns
+
+### Token Logging
+All services log usage to `AIUsageLog`:
+```typescript
+await logAIUsage({
+  userId,
+  action: 'FILTERING',
+  entityType: 'Round',
+  entityId: roundId,
+  model,
+  promptTokens: usage.promptTokens,
+  completionTokens: usage.completionTokens,
+  totalTokens: usage.totalTokens,
+  batchSize: projects.length,
+  itemsProcessed: projects.length,
+  status: 'SUCCESS',
+})
+```
+
+### Error Handling
+All services use unified error classification:
+```typescript
+try {
+  // AI call
+} catch (error) {
+  const classified = classifyAIError(error)
+  logAIError('ServiceName', 'functionName', classified)
+
+  if (classified.retryable) {
+    // Retry logic
+  } else {
+    // Fall back to algorithm
+  }
+}
+```
+
+### Anonymization
+All services anonymize project data before sending it to the AI:
+```typescript
+const { anonymized, mappings } = anonymizeProjectsForAI(projects, 'FILTERING')
+
+if (!validateAnonymizedProjects(anonymized)) {
+  throw new Error('GDPR compliance check failed')
+}
+```
+
+## See Also
+
+- [AI System Architecture](./ai-system.md)
+- [AI Configuration Guide](./ai-configuration.md)
+- [AI Error Handling](./ai-errors.md)
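
Note on the Fallback Algorithm in section 2 of the file above: the document gives the weights (60% tag overlap, 40% load balancing) but not the formula. The sketch below is one possible reading, under stated assumptions: tag overlap is the fraction of project tags the juror also lists, load is the juror's remaining capacity, and constraint satisfaction means skipping jurors already at their cap. All type and function names here are hypothetical and not taken from `ai-assignment.ts`.

```typescript
// Hypothetical shapes; the real models likely carry more fields.
interface Juror {
  id: string
  expertiseTags: string[]
  currentAssignments: number
}

interface Project {
  id: string
  tags: string[]
}

const TAG_WEIGHT = 0.6 // weight from the document
const LOAD_WEIGHT = 0.4 // weight from the document

// Assumed metric: share of the project's tags found in the juror's expertise.
function tagOverlap(juror: Juror, project: Project): number {
  if (project.tags.length === 0) return 0
  const expertise = new Set(juror.expertiseTags.map((t) => t.toLowerCase()))
  const shared = project.tags.filter((t) => expertise.has(t.toLowerCase())).length
  return shared / project.tags.length
}

// Assumed metric: 1 for a juror with no assignments, 0 for a juror at the cap.
function loadScore(juror: Juror, maxAssignmentsPerJuror: number): number {
  if (maxAssignmentsPerJuror <= 0) return 0
  return Math.max(0, 1 - juror.currentAssignments / maxAssignmentsPerJuror)
}

function fallbackScore(juror: Juror, project: Project, maxAssignmentsPerJuror: number): number {
  return (
    TAG_WEIGHT * tagOverlap(juror, project) +
    LOAD_WEIGHT * loadScore(juror, maxAssignmentsPerJuror)
  )
}

// Pick the best-scoring jurors for one project, skipping jurors at their cap.
function pickJurors(
  jurors: Juror[],
  project: Project,
  requiredReviewsPerProject: number,
  maxAssignmentsPerJuror: number,
): string[] {
  return jurors
    .filter((j) => j.currentAssignments < maxAssignmentsPerJuror)
    .map((j) => ({ id: j.id, score: fallbackScore(j, project, maxAssignmentsPerJuror) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, requiredReviewsPerProject)
    .map((s) => s.id)
}
```

A full fallback would also update `currentAssignments` after each pick and exclude pairs already present in `existingAssignments`; those steps are omitted here to keep the sketch short.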