Optimize AI system with batching, token tracking, and GDPR compliance
- Add AIUsageLog model for persistent token/cost tracking
- Implement batched processing for all AI services:
  - Assignment: 15 projects/batch
  - Filtering: 20 projects/batch
  - Award eligibility: 20 projects/batch
  - Mentor matching: 15 projects/batch
- Create unified error classification (ai-errors.ts)
- Enhance anonymization with comprehensive project data
- Add AI usage dashboard to Settings page
- Add usage stats endpoints to settings router
- Create AI system documentation (5 files)
- Create GDPR compliance documentation (2 files)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
docs/architecture/ai-prompts.md (new file, 222 lines)
# AI Prompts Reference

This document describes the prompts used by each AI service. All prompts are optimized for token efficiency while maintaining accuracy.

## Design Principles

1. **Concise system prompts** - Under 100 tokens where possible
2. **Structured output** - JSON format for reliable parsing
3. **Clear field names** - Consistent naming across services
4. **Score ranges** - 0-1 for confidence, 1-10 for quality

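Because every service returns structured JSON with fixed score ranges, responses can be range-checked before use. A minimal sketch, assuming the filtering response shape documented below (the interface and function names are illustrative, not from the codebase):

```typescript
// Hypothetical validator for one filtering result, enforcing the
// documented score ranges: confidence in [0, 1], quality_score in [1, 10].
interface FilteringResult {
  project_id: string;
  meets_criteria: boolean;
  confidence: number;    // expected in [0, 1]
  reasoning: string;
  quality_score: number; // expected integer in [1, 10]
  spam_risk: boolean;
}

function validateFilteringResult(r: FilteringResult): string[] {
  const errors: string[] = [];
  if (r.confidence < 0 || r.confidence > 1) {
    errors.push(`confidence out of range: ${r.confidence}`);
  }
  if (!Number.isInteger(r.quality_score) || r.quality_score < 1 || r.quality_score > 10) {
    errors.push(`quality_score out of range: ${r.quality_score}`);
  }
  return errors;
}
```

A result with out-of-range scores yields one error message per violation, which a caller could use to discard or retry that item.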
## Filtering Prompt

**Purpose:** Evaluate projects against admin-defined criteria

### System Prompt

```
Project screening assistant. Evaluate each project against the criteria.
Return JSON: {"projects": [{project_id, meets_criteria: bool, confidence: 0-1, reasoning: str, quality_score: 1-10, spam_risk: bool}]}
Assess description quality and relevance objectively.
```

### User Prompt Template

```
CRITERIA: {criteria_text}
PROJECTS: {anonymized_project_array}
Evaluate each project against the criteria. Return JSON.
```

### Example Response

```json
{
  "projects": [
    {
      "project_id": "P1",
      "meets_criteria": true,
      "confidence": 0.9,
      "reasoning": "Project focuses on coral reef restoration, matching ocean conservation criteria",
      "quality_score": 8,
      "spam_risk": false
    }
  ]
}
```

---

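Filling the user prompt template is a plain string interpolation. A sketch, assuming the `{criteria_text}` and `{anonymized_project_array}` placeholders above (the helper name is an assumption, not the actual implementation):

```typescript
// Hypothetical helper: build the filtering user prompt from the template.
// Projects are serialized as a compact JSON array to keep tokens low.
function buildFilteringPrompt(criteriaText: string, projects: object[]): string {
  return [
    `CRITERIA: ${criteriaText}`,
    `PROJECTS: ${JSON.stringify(projects)}`,
    "Evaluate each project against the criteria. Return JSON.",
  ].join("\n");
}
```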
## Assignment Prompt

**Purpose:** Match jurors to projects by expertise

### System Prompt

```
Match jurors to projects by expertise. Return JSON assignments.
Each: {juror_id, project_id, confidence_score: 0-1, expertise_match_score: 0-1, reasoning: str (1-2 sentences)}
Distribute workload fairly. Avoid assigning jurors at capacity.
```

### User Prompt Template

```
JURORS: {anonymized_juror_array}
PROJECTS: {anonymized_project_array}
CONSTRAINTS: {N} reviews/project, max {M}/juror
EXISTING: {existing_assignments}
Return JSON: {"assignments": [...]}
```

### Example Response

```json
{
  "assignments": [
    {
      "juror_id": "juror_001",
      "project_id": "project_005",
      "confidence_score": 0.85,
      "expertise_match_score": 0.9,
      "reasoning": "Juror expertise in marine biology aligns with coral restoration project"
    }
  ]
}
```

---

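The `CONSTRAINTS` line only asks the model to respect capacity; it cannot guarantee it. A post-check on the returned assignments is therefore advisable. A sketch under that assumption (the function name is illustrative):

```typescript
// Hypothetical post-check: find jurors the model assigned beyond the
// "max {M}/juror" constraint, so those assignments can be rejected or redone.
interface Assignment {
  juror_id: string;
  project_id: string;
}

function findOverloadedJurors(assignments: Assignment[], maxPerJuror: number): string[] {
  const counts = new Map<string, number>();
  for (const a of assignments) {
    counts.set(a.juror_id, (counts.get(a.juror_id) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .filter(([, n]) => n > maxPerJuror)
    .map(([id]) => id);
}
```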
## Award Eligibility Prompt

**Purpose:** Determine project eligibility for special awards

### System Prompt

```
Award eligibility evaluator. Evaluate projects against criteria, return JSON.
Format: {"evaluations": [{project_id, eligible: bool, confidence: 0-1, reasoning: str}]}
Be objective. Base evaluation only on provided data. No personal identifiers in reasoning.
```

### User Prompt Template

```
CRITERIA: {criteria_text}
PROJECTS: {anonymized_project_array}
Evaluate eligibility for each project.
```

### Example Response

```json
{
  "evaluations": [
    {
      "project_id": "P3",
      "eligible": true,
      "confidence": 0.95,
      "reasoning": "Project is based in Italy and focuses on Mediterranean biodiversity"
    }
  ]
}
```

---

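Model output is not guaranteed to be valid JSON, so responses like the one above are best parsed defensively. A sketch (the interface and fallback behavior are assumptions, not the actual implementation):

```typescript
// Hypothetical defensive parser for the eligibility response. Malformed
// output returns an empty list so the caller can retry or flag the batch.
interface EligibilityEvaluation {
  project_id: string;
  eligible: boolean;
  confidence: number;
  reasoning: string;
}

function parseEligibilityResponse(raw: string): EligibilityEvaluation[] {
  try {
    const parsed = JSON.parse(raw);
    return Array.isArray(parsed.evaluations) ? parsed.evaluations : [];
  } catch {
    return []; // not valid JSON at all
  }
}
```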
## Mentor Matching Prompt

**Purpose:** Recommend mentors for projects

### System Prompt

```
Match mentors to projects by expertise. Return JSON.
Format for each project: {"matches": [{project_id, mentor_matches: [{mentor_index, confidence_score: 0-1, expertise_match_score: 0-1, reasoning: str}]}]}
Rank by suitability. Consider expertise alignment and availability.
```

### User Prompt Template

```
PROJECTS:
P1: Category=STARTUP, Issue=HABITAT_RESTORATION, Tags=[coral, reef], Desc=Project description...
P2: ...

MENTORS:
0: Expertise=[marine biology, coral], Availability=2/5
1: Expertise=[business development], Availability=0/3
...

For each project, rank top {N} mentors.
```

### Example Response

```json
{
  "matches": [
    {
      "project_id": "P1",
      "mentor_matches": [
        {
          "mentor_index": 0,
          "confidence_score": 0.92,
          "expertise_match_score": 0.95,
          "reasoning": "Marine biology expertise directly matches coral restoration focus"
        }
      ]
    }
  ]
}
```

---

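Because mentors are sent to the model as bare positional indexes rather than real IDs, the `mentor_index` values in the response must be mapped back to mentor records locally, discarding any out-of-range index the model may invent. A sketch (names are illustrative):

```typescript
// Hypothetical mapping step: resolve model-returned mentor indexes back
// to real mentor IDs, dropping indexes outside the array bounds.
interface MentorMatch {
  mentor_index: number;
  confidence_score: number;
}

function resolveMentorIds(matches: MentorMatch[], mentorIds: string[]): string[] {
  return matches
    .filter((m) => Number.isInteger(m.mentor_index) && m.mentor_index >= 0 && m.mentor_index < mentorIds.length)
    .map((m) => mentorIds[m.mentor_index]);
}
```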
## Anonymized Data Structure

All projects sent to AI use this structure:

```typescript
interface AnonymizedProjectForAI {
  project_id: string            // P1, P2, etc.
  title: string                 // Sanitized (PII removed)
  description: string           // Truncated + PII stripped
  category: string | null       // STARTUP | BUSINESS_CONCEPT
  ocean_issue: string | null
  country: string | null
  region: string | null
  institution: string | null
  tags: string[]
  founded_year: number | null
  team_size: number
  has_description: boolean
  file_count: number
  file_types: string[]
  wants_mentorship: boolean
  submission_source: string
  submitted_date: string | null // YYYY-MM-DD
}
```

### What Gets Stripped

- Team/company names
- Email addresses
- Phone numbers
- External URLs
- Real project/user IDs
- Internal comments

---

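For the pattern-matchable items in the list above, stripping can be approximated with regular expressions. This is only an illustrative sketch of that idea, not the actual anonymizer; items like team/company names and real IDs need lookups against known values rather than regexes:

```typescript
// Illustrative sketch: regex-based stripping of emails, external URLs,
// and phone-like number runs. Real anonymization is more comprehensive.
function stripObviousPII(text: string): string {
  return text
    .replace(/[\w.+-]+@[\w-]+\.[\w.]+/g, "[email]")   // email addresses
    .replace(/https?:\/\/\S+/g, "[url]")              // external URLs
    .replace(/\+?\d[\d\s().-]{7,}\d/g, "[phone]");    // phone-like sequences
}
```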
## Token Optimization Tips

1. **Batch projects** - Process 15-20 per request
2. **Truncate descriptions** - 300-500 chars based on task
3. **Use abbreviated fields** - `desc` vs `description`
4. **Compress constraints** - Inline in prompt
5. **Request specific fields** - Only what you need

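Tips 1 and 2 can be sketched as a pair of small helpers. The batch size and truncation lengths below come from the ranges stated above; the helper names themselves are illustrative:

```typescript
// Split items into fixed-size batches (the services use 15-20 per request).
function toBatches<T>(items: T[], batchSize: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Truncate a description to a task-specific limit (300-500 chars),
// marking the cut with an ellipsis.
function truncate(text: string, maxChars: number): string {
  return text.length <= maxChars ? text : text.slice(0, maxChars - 3) + "...";
}
```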
## Prompt Versioning

| Service | Version | Last Updated |
|---------|---------|--------------|
| Filtering | 2.0 | 2025-01 |
| Assignment | 2.0 | 2025-01 |
| Award Eligibility | 2.0 | 2025-01 |
| Mentor Matching | 2.0 | 2025-01 |

## See Also

- [AI System Architecture](./ai-system.md)
- [AI Services Reference](./ai-services.md)
- [AI Configuration Guide](./ai-configuration.md)