Agentic Document Analyser

Articles 11 + 18 · Infrastructure

Converts unstructured compliance documents — risk assessments, model cards, contracts, audit logs — into structured JSON using Vision-Language Models. Acts as the evidence processing layer for the AiExponent compliance toolchain. Feeds Article 11 technical documentation and Article 18 log preservation workflows.

Quick Start

    docker compose up

Features

  • Vision-Language Model (Qwen2-VL) for unified layout analysis and OCR in a single pass
  • Detects and classifies document elements: text blocks, headings, tables, figures, form fields, signatures
  • Returns precise bounding boxes for every detected element
  • Parallel page processing for multi-page PDFs
  • Structured JSON output consumable by downstream compliance tools
  • Docker Compose deployment — four microservices, one command
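The features above can be illustrated with a minimal sketch of what a structured result might look like and how pages could be processed in parallel. This is not the tool's actual schema or code: the `Element` fields, the type labels, the bounding-box convention `(x0, y0, x1, y1)`, and the `analyse_page` stub standing in for the Vision-Language Model call are all assumptions made for illustration.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, asdict

# Element categories come from the feature list; the string labels are assumed.
ELEMENT_TYPES = {"text_block", "heading", "table", "figure", "form_field", "signature"}


@dataclass
class Element:
    page: int                                # 1-based page number (assumed convention)
    type: str                                # one of ELEMENT_TYPES
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed layout
    text: str = ""

    def __post_init__(self):
        if self.type not in ELEMENT_TYPES:
            raise ValueError(f"unknown element type: {self.type}")
        x0, y0, x1, y1 = self.bbox
        if x1 < x0 or y1 < y0:
            raise ValueError(f"malformed bounding box: {self.bbox}")


def analyse_page(page_number: int) -> list[Element]:
    """Hypothetical stand-in for the model call; returns canned elements."""
    return [
        Element(page_number, "heading", (72.0, 72.0, 540.0, 96.0), f"Section {page_number}"),
        Element(page_number, "table", (72.0, 120.0, 540.0, 400.0)),
    ]


def analyse_document(page_count: int, max_workers: int = 4) -> str:
    """Analyse pages concurrently and return a single JSON document."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so pages stay sorted.
        pages = list(pool.map(analyse_page, range(1, page_count + 1)))
    elements = [asdict(e) for page in pages for e in page]
    return json.dumps({"page_count": page_count, "elements": elements}, indent=2)


if __name__ == "__main__":
    print(analyse_document(page_count=2))
```

A thread pool suits this sketch because a remote inference call is I/O-bound; validating each element on construction keeps malformed boxes out of the JSON that downstream tools consume.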

EU AI Act Context

Articles 11 + 18 · Evidence Processing

Structures unstructured compliance documents into machine-readable JSON for Article 11 technical documentation packages and Article 18 log preservation workflows. Does not implement Article 9 risk management — use RiskForge for that.

Known Limitations

  • Requires Docker Compose; no standalone pip package available.
  • Depends on a Fireworks AI API key; offline or local inference is not available by default.
  • No persistent storage; results are not retained between container restarts.
  • No authentication on the /analyze endpoint — not suitable for public deployment without a reverse proxy.
  • Alpha quality: no production hardening, rate limiting, or database backend yet.
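Because results are not retained between container restarts, a caller may want to keep its own copy of each result. The sketch below shows one possible client-side workaround, not a feature of the tool: it writes each result to a JSON file named after the SHA-256 digest of the source document. The function names and the directory layout are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def _result_path(document_bytes: bytes, store_dir: Path) -> Path:
    """Derive a stable file name from the document's SHA-256 digest."""
    digest = hashlib.sha256(document_bytes).hexdigest()
    return store_dir / f"{digest}.json"


def save_result(document_bytes: bytes, result: dict, store_dir: Path) -> Path:
    """Write an analysis result to disk and return the file path."""
    store_dir.mkdir(parents=True, exist_ok=True)
    path = _result_path(document_bytes, store_dir)
    path.write_text(json.dumps(result, indent=2, sort_keys=True), encoding="utf-8")
    return path


def load_result(document_bytes: bytes, store_dir: Path):
    """Return the stored result for this document, or None if none was saved."""
    path = _result_path(document_bytes, store_dir)
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))
```

Keying on the content digest means the same document always maps to the same file, so a repeated analysis can be skipped when `load_result` returns a value.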

For the most current status, see GitHub issues.

Contributing

Contributions are welcome — Apache 2.0 licensed. See the contributing guide and open issues.

License

Licensed under the Apache License 2.0.

The Compound Moat

One tool is a start. The chain is the moat.

Each AiExponent tool produces structured evidence the next tool consumes. Browse the full toolchain — from Article 5 screening through Article 72 post-market monitoring.
