Optimize AI system with batching, token tracking, and GDPR compliance

- Add AIUsageLog model for persistent token/cost tracking
- Implement batched processing for all AI services:
  - Assignment: 15 projects/batch
  - Filtering: 20 projects/batch
  - Award eligibility: 20 projects/batch
  - Mentor matching: 15 projects/batch
- Create unified error classification (ai-errors.ts)
- Enhance anonymization with comprehensive project data
- Add AI usage dashboard to Settings page
- Add usage stats endpoints to settings router
- Create AI system documentation (5 files)
- Create GDPR compliance documentation (2 files)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

217
docs/gdpr/ai-data-processing.md
Normal file
@@ -0,0 +1,217 @@
# AI Data Processing - GDPR Compliance Documentation

## Overview

This document describes how project data is processed by AI services in the MOPC Platform, ensuring compliance with GDPR Articles 5, 6, 13-14, 25, and 32.

## Legal Basis

| Processing Activity | Legal Basis | GDPR Article |
|---------------------|-------------|--------------|
| AI-powered project filtering | Legitimate interest | Art. 6(1)(f) |
| AI-powered jury assignment | Legitimate interest | Art. 6(1)(f) |
| AI-powered award eligibility | Legitimate interest | Art. 6(1)(f) |
| AI-powered mentor matching | Legitimate interest | Art. 6(1)(f) |

**Legitimate Interest Justification:** AI processing is used to efficiently evaluate ocean conservation projects and match appropriate reviewers, directly serving the platform's purpose of managing the Monaco Ocean Protection Challenge.

## Data Minimization (Article 5(1)(c))

The AI system applies strict data minimization:

- **Only necessary fields** sent to AI (no names, emails, phone numbers)
- **Descriptions truncated** to 300-500 characters maximum
- **Team size** sent as count only (no member details)
- **Dates** sent as year-only or ISO date (no timestamps)
- **IDs replaced** with sequential anonymous identifiers (P1, P2, etc.)

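The minimization rules above can be sketched as a single mapping step. This is an illustrative sketch only: the `ProjectRecord` shape, the `minimizeProject` name, and the choice of 500 as the truncation limit are assumptions, not the platform's actual code. PII stripping of the title and description is a separate step and is not shown here.

```typescript
// Hypothetical input shape; field names are illustrative, not the platform's schema.
interface ProjectRecord {
  id: string
  title: string
  description: string
  teamMembers: { name: string; email: string }[]
  foundedAt: Date | null
  submittedAt: Date
}

// Shape of what would be sent to the AI service under the rules listed above.
interface MinimizedProject {
  project_id: string
  title: string
  description: string
  team_size: number
  founded_year: number | null
  submitted_date: string
}

// Assumed limit: the upper bound of the 300-500 character range.
const MAX_DESCRIPTION_LENGTH = 500

function minimizeProject(project: ProjectRecord, index: number): MinimizedProject {
  return {
    // Sequential anonymous identifier instead of the real ID
    project_id: `P${index + 1}`,
    title: project.title,
    // Truncate long descriptions
    description: project.description.slice(0, MAX_DESCRIPTION_LENGTH),
    // Count only, no member details
    team_size: project.teamMembers.length,
    // Year only, not the full date
    founded_year: project.foundedAt ? project.foundedAt.getUTCFullYear() : null,
    // ISO date without the time component
    submitted_date: project.submittedAt.toISOString().slice(0, 10),
  }
}
```

Building a new object field by field, rather than deleting fields from the original record, means a column added to the schema later is excluded by default.
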
## Anonymization Measures

### Data NEVER Sent to AI

| Data Type | Reason |
|-----------|--------|
| Personal names | PII - identifying |
| Email addresses | PII - identifying |
| Phone numbers | PII - identifying |
| Physical addresses | PII - identifying |
| External URLs | Could identify individuals |
| Internal project/user IDs | Could be cross-referenced |
| Team member details | PII - identifying |
| Internal comments | May contain PII |
| File content | May contain PII |

### Data Sent to AI (Anonymized)

| Field | Type | Purpose | Anonymization |
|-------|------|---------|---------------|
| project_id | String | Reference | Replaced with P1, P2, etc. |
| title | String | Spam detection | PII patterns removed |
| description | String | Criteria matching | Truncated, PII stripped |
| category | Enum | Filtering | As-is (no PII) |
| ocean_issue | Enum | Topic filtering | As-is (no PII) |
| country | String | Geographic eligibility | As-is (country name only) |
| region | String | Regional eligibility | As-is (zone name only) |
| institution | String | Student identification | As-is (institution name only) |
| tags | Array | Keyword matching | As-is (no PII expected) |
| founded_year | Number | Age filtering | Year only, not full date |
| team_size | Number | Team requirements | Count only |
| file_count | Number | Document checks | Count only |
| file_types | Array | File requirements | Type names only |
| wants_mentorship | Boolean | Mentorship filtering | As-is |
| submission_source | Enum | Source filtering | As-is |
| submitted_date | String | Deadline checks | Date only, no time |

## Technical Safeguards

### PII Detection and Stripping

```typescript
// Patterns detected and removed before AI processing
const PII_PATTERNS = {
  email: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g,
  phone: /(\+?\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}/g,
  url: /https?:\/\/[^\s]+/g,
  ssn: /\d{3}-\d{2}-\d{4}/g,
  ipv4: /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/g,
}
```

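As a rough illustration of how these patterns might be applied, the sketch below replaces every match with a placeholder. The `stripPII` name and the `[REDACTED]` placeholder are assumptions made for this example; the actual stripping function is not shown in this document.

```typescript
// Same patterns as in the section above, repeated so the sketch is self-contained.
const PII_PATTERNS: Record<string, RegExp> = {
  email: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g,
  phone: /(\+?\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}/g,
  url: /https?:\/\/[^\s]+/g,
  ssn: /\d{3}-\d{2}-\d{4}/g,
  ipv4: /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/g,
}

// Hypothetical helper: replaces every match of every pattern with a placeholder.
function stripPII(text: string, placeholder: string = '[REDACTED]'): string {
  let result = text
  for (const pattern of Object.values(PII_PATTERNS)) {
    // String.prototype.replace with a global regex replaces all occurrences
    result = result.replace(pattern, placeholder)
  }
  return result
}
```

Note that the `phone` pattern expects a 3-3-4 digit grouping, so numbers written in other national formats would need patterns of their own.
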
### Validation Before Every AI Call

```typescript
// GDPR compliance enforced before EVERY API call
export function enforceGDPRCompliance(data: unknown[]): void {
  for (const item of data) {
    const { valid, violations } = validateNoPersonalData(item)
    if (!valid) {
      throw new Error(`GDPR compliance check failed: ${violations.join(', ')}`)
    }
  }
}
```

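The body of `validateNoPersonalData` is not shown in this document. The sketch below is one hypothetical way such a check could work, assuming a deny-list of field names and a single email pattern; the real implementation may check more patterns or use different rules.

```typescript
interface ValidationResult {
  valid: boolean
  violations: string[]
}

// Assumed deny-list of field names; illustrative only.
const FORBIDDEN_KEYS = new Set(['email', 'name', 'phone', 'address'])

// Non-global regex so that test() keeps no state between calls.
const EMAIL_PATTERN = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/

// Hypothetical sketch: walks the value recursively and collects violations.
function validateNoPersonalData(item: unknown): ValidationResult {
  const violations: string[] = []

  const visit = (value: unknown, where: string): void => {
    if (typeof value === 'string') {
      if (EMAIL_PATTERN.test(value)) {
        violations.push(`${where}: contains an email address`)
      }
    } else if (Array.isArray(value)) {
      value.forEach((entry, i) => visit(entry, `${where}[${i}]`))
    } else if (value !== null && typeof value === 'object') {
      for (const [key, entry] of Object.entries(value as Record<string, unknown>)) {
        if (FORBIDDEN_KEYS.has(key.toLowerCase())) {
          violations.push(`${where}.${key}: forbidden field`)
        }
        visit(entry, `${where}.${key}`)
      }
    }
  }

  visit(item, 'item')
  return { valid: violations.length === 0, violations }
}
```

Returning a list of violations, rather than a bare boolean, lets the thrown error say which field failed without echoing the offending value.
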
### ID Anonymization

Real IDs are never sent to AI. Instead:
- Projects: `cm1abc123...` → `P1`, `P2`, `P3`
- Jurors: `cm2def456...` → `juror_001`, `juror_002`
- Results are mapped back to the real IDs using secure mapping tables

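A two-way mapping of this kind could be sketched as follows. The `buildIdMapping`, `projectAlias`, and `jurorAlias` names are hypothetical, and the sketch assumes the input IDs are unique; how the platform actually stores its mapping tables is not described here.

```typescript
interface IdMapping {
  toAnonymous: Map<string, string>
  toReal: Map<string, string>
}

// Hypothetical helper: assigns sequential aliases and keeps both directions,
// so AI results keyed by alias can be resolved back to real IDs.
function buildIdMapping(realIds: string[], format: (n: number) => string): IdMapping {
  const toAnonymous = new Map<string, string>()
  const toReal = new Map<string, string>()
  realIds.forEach((realId, index) => {
    const alias = format(index + 1)
    toAnonymous.set(realId, alias)
    toReal.set(alias, realId)
  })
  return { toAnonymous, toReal }
}

// Alias formats matching the examples above: P1, P2, ... and juror_001, juror_002, ...
const projectAlias = (n: number): string => `P${n}`
const jurorAlias = (n: number): string => `juror_${String(n).padStart(3, '0')}`
```

In this sketch the mapping lives only in memory for the duration of one request, so the alias-to-ID link never needs to leave the server.
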
## Data Retention

| Data Type | Retention | Deletion Method |
|-----------|-----------|-----------------|
| AI usage logs | 12 months | Automatic deletion |
| Anonymized prompts | Not stored | Sent directly to API |
| AI responses | Not stored | Parsed and discarded |

**Note:** OpenAI does not retain API data for training (per their API Terms). API data is retained for up to 30 days for abuse monitoring, configurable to 0 days.

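The 12-month automatic deletion implies a cutoff date that a scheduled job would compare log timestamps against. The sketch below shows one way to compute it; the function names are assumptions, and the deletion job itself is not part of this document.

```typescript
const AI_USAGE_LOG_RETENTION_MONTHS = 12

// Returns the instant before which logs are past their retention period.
// Works in UTC so the result does not depend on the server's time zone.
function retentionCutoff(now: Date, months: number = AI_USAGE_LOG_RETENTION_MONTHS): Date {
  const cutoff = new Date(now.getTime())
  // setUTCMonth accepts negative values and rolls the year back as needed
  cutoff.setUTCMonth(cutoff.getUTCMonth() - months)
  return cutoff
}

// True when a log created at `createdAt` should be deleted as of `now`.
function isPastRetention(createdAt: Date, now: Date): boolean {
  return createdAt.getTime() < retentionCutoff(now).getTime()
}
```

A scheduled job could then pass the cutoff to a bulk delete, for example a Prisma `deleteMany` filtered on a timestamp column older than the cutoff, assuming the `AIUsageLog` model has such a column.
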
## Subprocessor: OpenAI

| Aspect | Details |
|--------|---------|
| Subprocessor | OpenAI, Inc. |
| Location | United States |
| DPA Status | Data Processing Agreement in place |
| Safeguards | Standard Contractual Clauses (SCCs) |
| Compliance | SOC 2 Type II, GDPR-compliant |
| Data Use | API data NOT used for model training |

**OpenAI DPA:** https://openai.com/policies/data-processing-agreement

## Audit Trail

All AI processing is logged:

```typescript
await prisma.aIUsageLog.create({
  data: {
    userId: ctx.user.id,   // Who initiated
    action: 'FILTERING',   // What type
    entityType: 'Round',   // What entity
    entityId: roundId,     // Which entity
    model: 'gpt-4o',       // What model
    totalTokens: 1500,     // Resource usage
    status: 'SUCCESS',     // Outcome
  },
})
```

## Data Subject Rights

### Right of Access (Article 15)

Users can request:
- What data was processed by AI
- When AI processing occurred
- What decisions were made

**Implementation:** Export AI usage logs for user's projects.

### Right to Erasure (Article 17)

When a user requests deletion:
- AI usage logs for their projects can be deleted
- No data is kept long-term at OpenAI: API data is not used for training and is retained for at most 30 days for abuse monitoring (see Data Retention)

**Note:** Since only anonymized data is sent to AI, there is no personal data at OpenAI to delete.

### Right to Object (Article 21)

Users can request to opt out of AI processing:
- Admin can disable AI features per round
- Manual review fallback available for all AI features

## Risk Assessment

### Risk: PII Leakage to AI Provider

| Factor | Assessment |
|--------|------------|
| Likelihood | Very Low |
| Impact | Medium |
| Mitigation | Automated PII detection, validation before every call |
| Residual Risk | Very Low |

### Risk: AI Decision Bias

| Factor | Assessment |
|--------|------------|
| Likelihood | Low |
| Impact | Low |
| Mitigation | Human review of all AI suggestions, algorithmic fallback |
| Residual Risk | Very Low |

### Risk: Data Breach at Subprocessor

| Factor | Assessment |
|--------|------------|
| Likelihood | Very Low |
| Impact | Low (only anonymized data) |
| Mitigation | OpenAI SOC 2 compliance, no PII sent |
| Residual Risk | Very Low |

## Compliance Checklist

- [x] Data minimization applied (only necessary fields)
- [x] PII stripped before AI processing
- [x] Anonymization validated before every API call
- [x] DPA in place with OpenAI
- [x] Audit logging of all AI operations
- [x] Manual fallback available when AI processing is declined
- [x] Usage logs retained for 12 months only
- [x] No personal data stored at subprocessor

## Contact

For questions about AI data processing:
- Data Protection Officer: [DPO email]
- Technical Contact: [Tech contact email]

## See Also

- [Platform GDPR Compliance](./platform-gdpr-compliance.md)
- [AI System Architecture](../architecture/ai-system.md)
- [AI Services Reference](../architecture/ai-services.md)

324
docs/gdpr/platform-gdpr-compliance.md
Normal file
@@ -0,0 +1,324 @@
# MOPC Platform - GDPR Compliance Documentation

## 1. Data Controller Information

| Field | Value |
|-------|-------|
| **Data Controller** | Monaco Ocean Protection Challenge |
| **Contact** | [Data Protection Officer email] |
| **Platform** | monaco-opc.com |
| **Jurisdiction** | Monaco |


---

## 2. Personal Data Collected

### 2.1 User Account Data

| Data Type | Purpose | Legal Basis | Retention |
|-----------|---------|-------------|-----------|
| Email address | Account identification, notifications | Contract performance | Account lifetime + 2 years |
| Name | Display in platform, certificates | Contract performance | Account lifetime + 2 years |
| Phone number (optional) | WhatsApp notifications | Consent | Until consent withdrawn |
| Profile photo (optional) | Platform personalization | Consent | Until deleted by user |
| Role | Access control | Contract performance | Account lifetime |
| IP address | Security, audit logging | Legitimate interest | 12 months |
| User agent | Security, debugging | Legitimate interest | 12 months |

### 2.2 Project/Application Data

| Data Type | Purpose | Legal Basis | Retention |
|-----------|---------|-------------|-----------|
| Project title | Competition entry | Contract performance | Program lifetime + 5 years |
| Project description | Evaluation | Contract performance | Program lifetime + 5 years |
| Team information | Contact, evaluation | Contract performance | Program lifetime + 5 years |
| Uploaded files | Evaluation | Contract performance | Program lifetime + 5 years |
| Country/Region | Geographic eligibility | Contract performance | Program lifetime + 5 years |

### 2.3 Evaluation Data

| Data Type | Purpose | Legal Basis | Retention |
|-----------|---------|-------------|-----------|
| Jury evaluations | Competition judging | Contract performance | Program lifetime + 5 years |
| Scores and comments | Competition judging | Contract performance | Program lifetime + 5 years |
| Evaluation timestamps | Audit trail | Legitimate interest | Program lifetime + 5 years |

### 2.4 Technical Data

| Data Type | Purpose | Legal Basis | Retention |
|-----------|---------|-------------|-----------|
| Session tokens | Authentication | Contract performance | Session duration |
| Magic link tokens | Passwordless login | Contract performance | 15 minutes |
| Audit logs | Security, compliance | Legitimate interest | 12 months |
| AI usage logs | Cost tracking, debugging | Legitimate interest | 12 months |


---

## 3. Data Processing Purposes

### 3.1 Primary Purposes

1. **Competition Management** - Managing project submissions, evaluations, and results
2. **User Authentication** - Secure access to the platform
3. **Communication** - Sending notifications about evaluations, deadlines, results

### 3.2 Secondary Purposes

1. **Analytics** - Understanding platform usage (aggregated, anonymized)
2. **Security** - Detecting and preventing unauthorized access
3. **AI Processing** - Automated filtering and matching (anonymized data only)


---

## 4. Third-Party Data Sharing

### 4.1 Subprocessors

| Subprocessor | Purpose | Data Shared | Location | DPA |
|--------------|---------|-------------|----------|-----|
| OpenAI | AI processing | Anonymized project data only | USA | Yes |
| MinIO/S3 | File storage | Uploaded files | [Location] | Yes |
| Poste.io | Email delivery | Email addresses, notification content | [Location] | Yes |

### 4.2 Data Shared with OpenAI

**Sent to OpenAI:**
- Anonymized project titles (PII sanitized)
- Truncated descriptions (500 chars max)
- Project category, tags, country
- Team size (count only)
- Founded year (year only)

**NEVER sent to OpenAI:**
- Names of any individuals
- Email addresses
- Phone numbers
- Physical addresses
- External URLs
- Internal database IDs
- File contents

For full details, see [AI Data Processing](./ai-data-processing.md).


---

## 5. Data Subject Rights

### 5.1 Right of Access (Article 15)

Users can request a copy of their personal data via:
- Profile → Settings → Download My Data
- Email to [DPO email]

**Response Time:** Within 30 days

### 5.2 Right to Rectification (Article 16)

Users can update their data via:
- Profile → Settings → Edit Profile
- Contact support for assistance

**Response Time:** Immediately for self-service, 72 hours for support

### 5.3 Right to Erasure (Article 17)

Users can request deletion via:
- Profile → Settings → Delete Account
- Email to [DPO email]

**Exceptions:** Data required for legal obligations or ongoing competitions

**Response Time:** Within 30 days

### 5.4 Right to Restrict Processing (Article 18)

Users can request processing restrictions by contacting [DPO email]

**Response Time:** Within 72 hours

### 5.5 Right to Data Portability (Article 20)

Users can export their data in machine-readable format (JSON) via:
- Profile → Settings → Export Data

**Format:** JSON file containing all user data

### 5.6 Right to Object (Article 21)

Users can object to processing based on legitimate interests by contacting [DPO email]

**Response Time:** Within 72 hours


---

## 6. Security Measures (Article 32)

### 6.1 Technical Measures

| Measure | Implementation |
|---------|----------------|
| Encryption in transit | TLS 1.3 for all connections |
| Encryption at rest | AES-256 for sensitive data |
| Authentication | Magic link (passwordless) or OAuth |
| Rate limiting | 100 requests/minute per IP |
| Session management | Secure cookies, automatic expiry |
| Input validation | Zod schema validation on all inputs |

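To make the rate-limiting row concrete, the sketch below shows a fixed-window limiter of 100 requests per minute per IP. The document does not say which algorithm or library the platform uses, so the fixed-window approach, the in-memory `Map`, and the `FixedWindowRateLimiter` name are all assumptions for illustration.

```typescript
const WINDOW_MS = 60 * 1000 // one minute
const MAX_REQUESTS = 100    // per IP per window, as in the table above

interface WindowState {
  windowStart: number
  count: number
}

// Hypothetical in-memory limiter; a multi-instance deployment would need shared storage.
class FixedWindowRateLimiter {
  private readonly windows = new Map<string, WindowState>()

  // Returns true if the request is allowed, false if the IP is over its limit.
  allow(ip: string, nowMs: number = Date.now()): boolean {
    const state = this.windows.get(ip)
    // Start a fresh window on first sight or once the previous window has elapsed
    if (!state || nowMs - state.windowStart >= WINDOW_MS) {
      this.windows.set(ip, { windowStart: nowMs, count: 1 })
      return true
    }
    if (state.count >= MAX_REQUESTS) {
      return false
    }
    state.count += 1
    return true
  }
}
```

A fixed window is the simplest variant; it can admit short bursts across a window boundary, which a sliding-window or token-bucket limiter would smooth out.
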
### 6.2 Access Controls

| Control | Implementation |
|---------|----------------|
| RBAC | Role-based permissions (SUPER_ADMIN, PROGRAM_ADMIN, JURY_MEMBER, etc.) |
| Least privilege | Users only see assigned projects/programs |
| Session expiry | Configurable timeout (default 24 hours) |
| Audit logging | All sensitive actions logged |

### 6.3 Infrastructure Security

| Measure | Implementation |
|---------|----------------|
| Firewall | iptables rules on VPS |
| DDoS protection | Cloudflare (if configured) |
| Updates | Regular security patches |
| Backups | Daily encrypted backups, 90-day retention |
| Monitoring | Error logging, performance monitoring |


---

## 7. Data Retention Policy

| Data Category | Retention Period | Deletion Method |
|---------------|------------------|-----------------|
| Active user accounts | Account lifetime | Soft delete → hard delete after 30 days |
| Inactive accounts | 2 years after last login | Automatic anonymization |
| Project data | Program lifetime + 5 years | Archived, then anonymized |
| Audit logs | 12 months | Automatic deletion |
| AI usage logs | 12 months | Automatic deletion |
| Session data | Session duration | Automatic expiration |
| Backup data | 90 days | Automatic rotation |


---

## 8. International Data Transfers

### 8.1 OpenAI (USA)

| Aspect | Details |
|--------|---------|
| Transfer Mechanism | Standard Contractual Clauses (SCCs) |
| DPA | OpenAI Data Processing Agreement |
| Data Minimization | Only anonymized data transferred |
| Risk Assessment | Low (no PII transferred) |

### 8.2 Data Localization

| Service | Location |
|---------|----------|
| Primary database | [EU location] |
| File storage | [Location] |
| Email service | [Location] |


---

## 9. Cookies and Tracking

### 9.1 Essential Cookies

| Cookie | Purpose | Duration |
|--------|---------|----------|
| `session_token` | User authentication | Session |
| `csrf_token` | CSRF protection | Session |

### 9.2 Optional Cookies

The platform does **not** use:
- Marketing cookies
- Analytics cookies that track individuals
- Third-party tracking


---

## 10. Data Protection Impact Assessment (DPIA)

### 10.1 AI Processing DPIA

| Factor | Assessment |
|--------|------------|
| **Risk** | Personal data sent to third-party AI |
| **Mitigation** | Strict anonymization before processing |
| **Residual Risk** | Low (no PII transferred) |

### 10.2 File Upload DPIA

| Factor | Assessment |
|--------|------------|
| **Risk** | Sensitive documents uploaded |
| **Mitigation** | Pre-signed URLs, access controls, virus scanning |
| **Residual Risk** | Medium (users control uploads) |

### 10.3 Evaluation Data DPIA

| Factor | Assessment |
|--------|------------|
| **Risk** | Subjective opinions about projects/teams |
| **Mitigation** | Access controls, audit logging |
| **Residual Risk** | Low |


---

## 11. Breach Notification Procedure

### 11.1 Detection (Within 24 hours)

1. Automated monitoring alerts
2. User reports
3. Security audit findings

### 11.2 Assessment (Within 48 hours)

1. Identify affected data and individuals
2. Assess severity and risk
3. Document incident details

### 11.3 Notification (Within 72 hours)

**Supervisory Authority:**
- Notify if the breach poses a risk to individuals
- Include: nature of breach, categories of data, number affected, consequences, measures taken

**Affected Individuals:**
- Notify without undue delay if the breach poses a high risk
- Include: nature of breach, likely consequences, measures taken, contact for information

### 11.4 Documentation

All breaches are documented, regardless of whether notification is required.

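The assessment and notification windows above can be turned into concrete deadlines once a breach is known. The sketch below assumes both windows are measured from the moment the controller becomes aware of the breach; the procedure does not state the reference point explicitly, and the `breachDeadlines` name is hypothetical.

```typescript
const HOUR_MS = 60 * 60 * 1000

interface BreachDeadlines {
  assessmentBy: Date   // 48-hour assessment window
  notificationBy: Date // 72-hour notification window
}

// Assumption: both windows start when the controller becomes aware of the breach.
function breachDeadlines(awareAt: Date): BreachDeadlines {
  return {
    assessmentBy: new Date(awareAt.getTime() + 48 * HOUR_MS),
    notificationBy: new Date(awareAt.getTime() + 72 * HOUR_MS),
  }
}

// Hours left before the notification deadline; negative once it has passed.
function hoursUntilNotification(awareAt: Date, now: Date): number {
  return (breachDeadlines(awareAt).notificationBy.getTime() - now.getTime()) / HOUR_MS
}
```
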

---

## 12. Contact Information

| Role | Contact |
|------|---------|
| **Data Protection Officer** | [DPO name] |
| **Email** | [DPO email] |
| **Address** | [Physical address] |

**Supervisory Authority:**
Commission de Contrôle des Informations Nominatives (CCIN)
[Address in Monaco]


---

## 13. Document History

| Version | Date | Changes |
|---------|------|---------|
| 1.0 | 2025-01 | Initial version |

---

## See Also

- [AI Data Processing](./ai-data-processing.md)
- [AI System Architecture](../architecture/ai-system.md)