Dimension 2 — Model card completeness (20 points)

Assesses whether the model is properly documented. A complete model card is the primary transparency mechanism for model authors and is directly referenced in EU AI Act Article 13. The dimension reads two sources in parallel: the structured cardData block in the model's YAML frontmatter, and the prose sections in the README markdown body.

| Signal | Points | Source |
| --- | --- | --- |
| Model card exists (README body present and non-empty) | 4 | README presence |
| Intended Use section present with body | 3 | README heading detector |
| Limitations / Known Limitations section present with body | 3 | README heading detector |
| Bias, Risks, and Limitations section present with body | 2 | README heading detector |
| Out of Scope Use section present with body | 1 | README heading detector |
| Get Started code paired with declared library_name | 1 | README + cardData (conditional combo) |
| Training Data section present with body | 1 | README heading detector |
| Datasets: Hugging Face Hub-resolved | 2 | cardData.datasets resolved against Hub |
| Datasets: declared but unresolved | 1 | cardData.datasets declared, not on Hub |
| Evaluation results present (model-index OR README "Evaluation" section) | 3 | cardData.model-index OR README |

Each signal is assessed using a combination of structural checks (does the section exist?) and content checks (does it contain substantive material?). A heading with no body text below it does not earn the points; the parser walks each section's body window and only credits the section when at least one non-blank, non-heading line of prose follows the heading. Heading detection is case-insensitive and tolerates the common variants (e.g. Limitations and Known Limitations; Bias, Risks, and Limitations and Risks and Limitations).
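The heading-plus-body rule above can be sketched roughly as follows. This is a minimal illustration, not Dokima's parser: the `HEADING_VARIANTS` table, the `credited_sections` function, and the choice to end a body window at the next heading of any level are all assumptions made for the example, and it does not skip fenced code blocks.

```python
import re
from typing import Optional, Set

# Hypothetical alias table: canonical signal name -> accepted heading variants.
HEADING_VARIANTS = {
    "limitations": {"limitations", "known limitations"},
    "bias_risks": {"bias, risks, and limitations", "risks and limitations"},
}

# ATX-style markdown heading, e.g. "## Known Limitations".
HEADING_RE = re.compile(r"^\s{0,3}#{1,6}\s+(.*?)\s*#*\s*$")


def credited_sections(readme: str) -> Set[str]:
    """Return canonical names of sections that have a heading AND a body."""
    credited: Set[str] = set()
    current: Optional[str] = None  # section whose body window we are inside
    for line in readme.splitlines():
        match = HEADING_RE.match(line)
        if match:
            # Case-insensitive lookup against the known variants.
            title = match.group(1).strip().lower()
            current = next(
                (name for name, variants in HEADING_VARIANTS.items()
                 if title in variants),
                None,
            )
        elif current is not None and line.strip():
            # First non-blank, non-heading line credits the section.
            credited.add(current)
    return credited
```

With this sketch, a "Known Limitations" heading followed by a sentence is credited under `limitations`, while a "Risks and Limitations" heading followed directly by another heading is not credited.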

Out of Scope Use is a separate signal (v0.4 onwards). Authors who actively document where their model should NOT be used are doing harder and more useful safety work than authors who only describe what the model is for. The dimension was rebalanced in v0.4 to reward this with its own one-point sub-rule. To absorb the change, the Training Data sub-rule was trimmed from two points to one, because the dataset signal below now covers training-data documentation more precisely than a free-prose section heading.

Get Started + library_name is a conditional combo. Many authors paste a pip install snippet into a README without declaring which library the model is built against. That is partial documentation — a downstream user reads the snippet and still has to guess whether the model wants transformers, vllm, diffusers, or something else. Dokima awards the one-point Get Started bonus only when both signals are present together: the README has a "How to Get Started" section AND the model has either a library_name tag or a populated pipeline_tag. The combo rewards coherent documentation rather than isolated documentation.
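The conditional combo reduces to a logical AND over two signals. The sketch below assumes the README check has already produced a boolean and that the card metadata is available as a plain dict; the function name `get_started_points` and the dict shape are hypothetical.

```python
def get_started_points(has_get_started_section: bool, card_data: dict) -> int:
    """Award the one-point bonus only when both signals are present together."""
    # Either a declared library_name or a populated pipeline_tag counts.
    declares_stack = bool(card_data.get("library_name")) or bool(
        card_data.get("pipeline_tag")
    )
    return 1 if has_get_started_section and declares_stack else 0
```

A README with a "How to Get Started" section but no `library_name` or `pipeline_tag` scores 0 here, as does a declared `library_name` with no Get Started section.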

Datasets resolution is a three-tier ladder. Authors declare datasets in the cardData.datasets field of the YAML frontmatter. Dokima takes each declared id and asks Hugging Face whether that id resolves to a real Hub-hosted dataset. A model that declares glue and the id resolves on the Hub earns the full two points; a model that declares internal-corpus-v3 (a real dataset, but not on the Hub) earns one point because the author at least documented the source; a model that declares no datasets earns zero. This is content-addressable provenance: the higher tier requires the dataset itself to be inspectable, not just named.
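As a rough sketch, the ladder can be written as a function of the declared ids and a resolver. The resolver is passed in as a callable so the example stays offline; `dataset_points` and `resolves_on_hub` are hypothetical names. The text does not say how a list mixing resolved and unresolved ids is scored, so this sketch assumes a single resolved id is enough for the top tier.

```python
from typing import Callable, Iterable


def dataset_points(
    declared: Iterable[str],
    resolves_on_hub: Callable[[str], bool],
) -> int:
    """Score cardData.datasets on the 0 / 1 / 2 ladder."""
    ids = [dataset_id for dataset_id in declared if dataset_id]
    if not ids:
        return 0  # nothing declared
    # Assumption: one Hub-resolved id earns the top tier.
    if any(resolves_on_hub(dataset_id) for dataset_id in ids):
        return 2
    return 1  # declared, but not inspectable on the Hub
```

For example, with a resolver that only knows `glue`, declaring `["glue"]` scores 2, declaring `["internal-corpus-v3"]` scores 1, and declaring nothing scores 0. In a real pipeline the resolver could wrap a Hub metadata lookup; that wiring is left out here.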

The integer point allocation here flattens an internal half-point tier from the design discussion (free-prose dataset mentions in the README that don't appear in cardData.datasets). Detecting those mentions reliably needs a regex sweep over the prose against a curated dataset-name list; that piece ships in a subsequent recalibration. Until then, free-prose dataset mentions floor to zero and the integer-tier scoring is what is published. This is called out so that authors who read the methodology before reading their score know exactly why a model with an empty cardData.datasets field loses the point.

No-card behaviour: if no README exists at all and cardData is empty, every sub-check fails and the dimension scores 0/20. If the README exists but exceeds the parser size cap (1 MB), the existence sub-rule fires but the section sub-rules score zero — the model is documented but not in a form Dokima can read at scale.
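The README gate described above might look like the following sketch. It assumes "1 MB" means 10**6 bytes of UTF-8 and that the cap affects only the README section sub-rules; `MAX_README_BYTES` and `readme_gate` are hypothetical names.

```python
from typing import Optional, Tuple

MAX_README_BYTES = 1_000_000  # assumption: "1 MB" read as 10**6 bytes
EXISTENCE_POINTS = 4  # from the signal table


def readme_gate(readme: Optional[str]) -> Tuple[int, bool]:
    """Return (existence points, whether README section sub-rules should run)."""
    if readme is None or not readme.strip():
        return 0, False  # no card: existence and section rules all fail
    if len(readme.encode("utf-8")) > MAX_README_BYTES:
        # Documented, but too large to parse: existence fires, sections do not.
        return EXISTENCE_POINTS, False
    return EXISTENCE_POINTS, True
```

A caller would run the heading detector only when the second element is true, and add the first element to the dimension total either way.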