MOPC-Portal

MOPC/MOPC-Portal

Commit Graph

Author

SHA1

Message

Date

Matt

ed5e782f61

Fix document analysis: switch to unpdf + mammoth for PDF/Word parsing

Build and Push Docker Image / build (push) Successful in 11m26s

pdf-parse v2 requires DOMMatrix (browser API) which fails in Node.js.
Replaced with unpdf (serverless PDF.js build) for PDFs and mammoth for
Word .docx files. Also fixed the same broken pdf-parse usage in
file-content-extractor.ts used by AI filtering.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-17 10:27:36 +01:00
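
The commit above routes PDFs to unpdf and Word .docx files to mammoth. The repository code is not shown here, so the following is only a minimal sketch of how such a routing decision could look; `ParserKind`, `DOCX_MIME`, and `pickParser` are hypothetical names, and the actual parsing calls are deliberately left out.

```typescript
// Hypothetical sketch: decide which parser a file should be routed to.
// None of these names come from the repository.
type ParserKind = "unpdf" | "mammoth" | "unsupported";

const DOCX_MIME =
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

function pickParser(fileName: string, mimeType?: string): ParserKind {
  // Take the text after the last dot as the extension (empty if there is none).
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";

  if (mimeType === "application/pdf" || ext === "pdf") return "unpdf";
  if (mimeType === DOCX_MIME || ext === "docx") return "mammoth";

  // Assumption: legacy .doc and all other types are left unparsed.
  return "unsupported";
}
```

Under these assumptions, `pickParser("Report.PDF")` yields `"unpdf"` and `pickParser("old.doc")` yields `"unsupported"`. Keeping the decision in one pure function would let both the analyzer and file-content-extractor.ts share it.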

Matt

c9640c6086

Add document analysis: page count, text extraction & language detection

Build and Push Docker Image / build (push) Successful in 11m7s

Introduces a document analyzer service that extracts page count (via pdf-parse),
text preview, and detected language (via franc) from uploaded files. Analysis runs
automatically on upload (configurable via SystemSettings) and can be triggered
retroactively for existing files. Results are displayed as badges in the FileViewer
and fed to AI screening for language-based filtering criteria.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-17 10:09:04 +01:00
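
The analysis step described in this commit (page count, text preview, detected language, an on-upload toggle, and retroactive runs) could be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's code: every type and function name is hypothetical, the settings object only stands in for SystemSettings, and the language detector is injected so that a library such as franc could be plugged in.

```typescript
// Hypothetical shapes; none of these names are taken from the repository.
interface AnalysisSettings {
  analyzeOnUpload: boolean; // stand-in for the SystemSettings toggle
  previewChars: number; // how many characters of text to keep as a preview
}

interface DocumentAnalysis {
  pageCount: number | null;
  textPreview: string;
  language: string; // whatever code the injected detector returns
}

type LanguageDetector = (text: string) => string;

function analyzeExtracted(
  text: string,
  pageCount: number | null,
  settings: AnalysisSettings,
  detect: LanguageDetector,
  force = false, // true for a retroactive, manually triggered run
): DocumentAnalysis | null {
  // Skip automatic analysis when the toggle is off, unless explicitly forced.
  if (!settings.analyzeOnUpload && !force) return null;

  // Collapse whitespace so the preview is compact.
  const normalized = text.replace(/\s+/g, " ").trim();

  return {
    pageCount,
    textPreview: normalized.slice(0, settings.previewChars),
    // "und" is the ISO 639-3 code for an undetermined language.
    language: normalized.length > 0 ? detect(normalized) : "und",
  };
}
```

Injecting the detector keeps the sketch free of third-party imports and makes it easy to test with a stub.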

2 Commits