License Compliance Checker

Article 53 · Flagship · v1.1.0

Scans AI models, software packages, and agentic pipelines for license compliance across 8 ecosystems. Detects HuggingFace model references in code and in GGUF/ONNX files, and generates EU AI Act Article 53 audit evidence with an honest dataset risk registry.

Quick Start

```bash
pip install license-compliance-checker
```

```bash
# Scan a project for license violations
lcc scan . --policy eu_ai_act --format json

# Scan with transitive dependency analysis (requires lock file)
lcc scan . --include-transitive --policy permissive

# Detect AI model licenses referenced in code
lcc scan . --format sarif --output report.sarif
```

Features

  • Detects AI model license references in Python/YAML/JSON code (from_pretrained, model=, etc.)
  • Scans GGUF and ONNX files — covers Ollama and llama.cpp model formats
  • Multi-ecosystem: Python, Node.js, Go, Rust, Ruby, Java, .NET, HuggingFace
  • AI license registry: RAIL, OpenRAIL, Llama, Gemma, Mistral, BigScience and more
  • Dataset risk registry: flags OpenAI API outputs, ShareGPT, Books3 as high/critical risk
  • EU AI Act Article 53 assessor with honest scope framing
  • SBOM export: CycloneDX and SPDX formats
  • FastAPI server + CLI + JSON/SARIF/CSV report formats
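The model-reference detection above can be sketched in a few lines. This is an illustrative regex scanner, not LCC's actual implementation; the pattern and the `find_model_refs` helper are hypothetical:

```python
import re

# Hypothetical sketch: match HuggingFace-style model references such as
# from_pretrained("org/model") or model="org/model" in source text.
MODEL_REF = re.compile(
    r'(?:from_pretrained\(|model\s*=\s*)["\']([\w.-]+/[\w.-]+)["\']'
)

def find_model_refs(source: str) -> list[str]:
    """Return HuggingFace repo IDs referenced in a source string."""
    return MODEL_REF.findall(source)

code = '''
tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3-8B")
pipe = pipeline("text-generation", model="mistralai/Mistral-7B-v0.1")
'''
print(find_model_refs(code))  # both repo IDs are captured
```

A real scanner would also walk YAML/JSON config files and resolve each repo ID against the license registry; the regex here only illustrates the code-side detection step.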

Regulatory Foundation

Article 53 · Obligations for providers of general-purpose AI models · Application date: 2025-08-02 · Enforced

Read the full pillar: EU AI Act Article 53 explainer →

What the regulation requires

1. Providers of general-purpose AI models shall:

(a) draw up and keep up-to-date the technical documentation of the model, including its training and testing process and the results of its evaluation, which shall contain, at a minimum, the information set out in Annex XI for the purpose of providing it, upon request, to the AI Office and the national competent authorities;

(c) put in place a policy to comply with Union law on copyright and related rights, and in particular to identify and comply with, including through state-of-the-art technologies, a reservation of rights expressed pursuant to Article 4(3) of Directive (EU) 2019/790;

(d) draw up and make publicly available a sufficiently detailed summary about the content used for training of the general-purpose AI model, according to a template provided by the AI Office.
53(1)(a) · 53(1)(c) · 53(1)(d)

What you face if you don't comply

Article 53 has been enforceable since 2 August 2025 — GPAI providers placing models on the EU market today owe technical documentation, a copyright-compliance policy aligned with Directive (EU) 2019/790 Article 4(3), and a public training-data summary on the AI Office template. Non-compliance is sanctionable by the Commission under Article 101 at up to €15M or 3% of global annual turnover. The operational reality is that the copyright-policy and training-data-summary obligations require auditable evidence at the dataset level, not assertions in a model card.

Up to €15M or 3% of global annual turnover
Article 101(1) · Penalties

How License Compliance Checker addresses this

  • 53(1)(a): Generates per-model SBOM-style training-data manifests fit for inclusion in the Annex XI technical documentation pack
  • 53(1)(c): Detects rights-reservation signals (TDM opt-outs, robots.txt, ai.txt) across training corpora to evidence the Art. 4(3) Directive 2019/790 policy
  • 53(1)(c): Flags incompatible-license content (NC, ND, viral copyleft) before it enters a training run, with provenance trail
  • 53(1)(d): Produces the public training-content summary in the AI Office template format, regenerated on each dataset revision

Source: eur-lex.europa.eu/…/CELEX:32024R1689 · Retrieved
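The rights-reservation detection in the 53(1)(c) items can be illustrated with a toy robots.txt check. One recognised TDM opt-out signal is a site-wide disallow rule for known AI training crawlers; the function and crawler list below are a hypothetical sketch, not LCC's real logic:

```python
# Hypothetical sketch of one rights-reservation signal: a robots.txt body
# that disallows known AI training crawlers from the site root, a common
# machine-readable TDM opt-out under Directive (EU) 2019/790 Art. 4(3).
AI_CRAWLERS = {"gptbot", "ccbot", "google-extended", "anthropic-ai"}

def tdm_opt_out(robots_txt: str) -> bool:
    """True if any known AI crawler is disallowed from the site root."""
    agent = None
    for line in robots_txt.lower().splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if line.startswith("user-agent:"):
            agent = line.split(":", 1)[1].strip()
        elif line.startswith("disallow:") and agent in AI_CRAWLERS:
            if line.split(":", 1)[1].strip() == "/":
                return True
    return False

robots = """
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
print(tdm_opt_out(robots))  # True: GPTBot is blocked site-wide
```

A production check would also look for ai.txt files and HTTP-level signals, and record each finding with provenance for the audit trail.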

Frequently asked questions

Direct answers to common questions about License Compliance Checker and Article 53. Regulatory citations reference EUR-Lex CELEX:32024R1689.

What does EU AI Act Article 53 require?
Providers of general-purpose AI models must keep up-to-date Annex XI technical documentation, put a copyright-compliance policy in place aligned with Directive (EU) 2019/790 Article 4(3), and publish a sufficiently detailed training-data summary using the AI Office template. Source: Regulation (EU) 2024/1689 Article 53(1).
Is LCC a substitute for legal review?
No. LCC produces audit evidence — SBOMs, license-conflict reports, training-data risk registries — that legal counsel reviews. The tool does not provide legal opinions or substitute for qualified counsel.
What ecosystems and file formats does LCC scan?
Eight package ecosystems (Python, Node.js, Go, Rust, Ruby, Java, .NET, HuggingFace) plus AI model files in GGUF and ONNX formats — covering Ollama and llama.cpp deployments. The full feature list is in the documentation.
Does LCC detect AI model licenses, not just code dependencies?
Yes. LCC includes an AI license registry covering RAIL, OpenRAIL, Llama, Gemma, Mistral, BigScience and other AI-specific licenses. It also detects HuggingFace model references in Python, YAML, and JSON code (e.g. `from_pretrained`, `model=`).
Can LCC generate the AI Office training-data summary template?
LCC produces inputs for that summary — a per-model training-data manifest with provenance and licensing. The final AI Office template completion is a documentation task; LCC supplies the structured data needed to fill it. Treat the output as evidence, not as the certified summary itself.
What is the penalty for non-compliance with Article 53?
Up to €15M or 3% of global annual turnover, whichever is higher, imposed by the European Commission under Article 101(1). Note that GPAI fines are Commission-imposed under Art. 101 — distinct from the Article 99 fines that member-state market-surveillance authorities impose for high-risk-system violations.
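The "whichever is higher" rule is worth making concrete, since for large providers the 3% branch dominates. The figures come from Article 101(1); the helper function itself is illustrative:

```python
def art101_cap(global_turnover_eur: float) -> float:
    """Maximum GPAI fine under Article 101(1): the higher of EUR 15M
    or 3% of total worldwide annual turnover."""
    return max(15_000_000, 0.03 * global_turnover_eur)

# A provider with EUR 2bn turnover faces a cap of EUR 60M, not 15M;
# the flat EUR 15M floor only binds below EUR 500M turnover.
print(art101_cap(2_000_000_000))  # 60000000.0
```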
Is LCC really free? What is the catch?
LCC is Apache 2.0 licensed, free for any use including commercial. There is no telemetry, no remote calls, no enterprise tier locked behind paywalls. The "catch" is that you run it on your own infrastructure and review the output yourself.
Does LCC scan transitive dependencies?
Yes, when a lock file is present (poetry.lock, package-lock.json, etc.). Without a lock file, LCC scans declared direct dependencies only and warns about the transitive gap.
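Lock-file-based transitive scanning can be sketched for the npm case. A v2/v3 package-lock.json lists every installed package, direct and transitive, under a "packages" map keyed by its node_modules path; the `locked_packages` helper below is a hypothetical illustration, not LCC's parser:

```python
import json

# Hypothetical sketch: extract transitive dependencies and their declared
# licenses from an npm v2/v3 package-lock.json. The root project is the
# entry with the empty-string key and is skipped.
def locked_packages(lock_json: str) -> dict[str, str]:
    data = json.loads(lock_json)
    out = {}
    for path, meta in data.get("packages", {}).items():
        if path:  # skip the root project entry ""
            name = path.split("node_modules/")[-1]
            out[name] = meta.get("license", "UNKNOWN")
    return out

lock = json.dumps({
    "packages": {
        "": {"name": "my-app"},
        "node_modules/left-pad": {"version": "1.3.0", "license": "WTFPL"},
        "node_modules/chalk": {"version": "5.3.0", "license": "MIT"},
    }
})
print(locked_packages(lock))  # {'left-pad': 'WTFPL', 'chalk': 'MIT'}
```

Without such a lock file there is no pinned dependency graph to walk, which is why declared-only scans warn about the transitive gap.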

Known Limitations

  • HuggingFace Hub API scanning works only for models referenced by Hub repo ID, not for local-only downloads.
  • SPDX compound expressions (AND/OR) are flagged for manual review, not auto-resolved.
  • Transitive dependency resolution requires a lock file (poetry.lock, package-lock.json).
  • Article 53 assessment covers documentation completeness only; it is not a legal compliance determination.
  • The training-data risk registry covers the top 50 known datasets; unknown datasets are flagged for review.
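The SPDX limitation above is cheap to reason about: compound expressions are simply detected and routed to a human rather than resolved. A minimal sketch of such a check (illustrative, not LCC's code):

```python
def needs_review(spdx_expr: str) -> bool:
    """Flag compound SPDX expressions (AND / OR / WITH) for manual
    review instead of attempting automatic resolution."""
    tokens = spdx_expr.upper().split()
    return any(t in {"AND", "OR", "WITH"} for t in tokens)

print(needs_review("MIT"))                  # False: simple license ID
print(needs_review("Apache-2.0 OR MIT"))    # True: dual-licensed choice
```

Auto-resolving "OR" would require a policy decision about which branch the project actually exercises, which is exactly why it is left to a reviewer.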

For the most current status, see GitHub issues.

Contributing

Contributions are welcome — Apache 2.0 licensed. See the contributing guide and open issues.

License

Licensed under the Apache License 2.0. Not legal advice. Not a notified body.

The Compound Moat

One tool is a start. The chain is the moat.

Each AiExponent tool produces structured evidence the next tool consumes. Browse the full toolchain — from Article 5 screening through Article 72 post-market monitoring.

See all tools →