Known-malicious registry

The known-malicious registry is a curated list of Hugging Face model identifiers (and the weights they ship) that have been independently flagged for malicious payloads, supply-chain compromise patterns, or namespace-hijacking signatures. When a model identifier matches the registry, or when its weight files fingerprint-match a previously-flagged entry, Dokima surfaces an attestation on the score report citing the registry as a source.

This page documents what the registry contains, how new entries land in it, how each entry is verifiable, and how authors who dispute an entry can have it reviewed.

What's in the registry

Every entry carries the following structured fields:

Field	What it stores
`model_id`	Hugging Face identifier the entry was first observed against (`author/model`).
`weights_fingerprint`	SHA-256 hashes of the weight files at the moment the entry was first observed. Used by the mirror and fork detection path to catch byte-identical republications under a new namespace.
`severity`	Classification of the finding: critical (active malware), high (vulnerable dependency or hijack-pattern match), medium (suspicious metadata, awaiting confirmation), info (contextual notes; never affects grade).
`source`	The authority that surfaced the finding — Hugging Face safety scanner, OSV.dev, Protect AI Guardian disclosure (where still available), security-research disclosure programs, or Dokima operator manual review.
`source_url`	An authoritative click-target verifying the entry against its upstream source. Every entry must carry at least one source URL; the dispute process can challenge any entry whose source URL fails to support the finding.
`added_at` / `last_seen_at`	UTC timestamps. Together they bracket the entry's history: when the entry was created and when a scan last positively confirmed the finding.
`notes`	A short analyst-readable explanation of the finding, written for a sharp 17-year-old (per the Dokima documentation register).
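
As an illustration only, the fields above could be modelled as a small validated record. This is a minimal sketch, not Dokima's actual schema: the class names `RegistryEntry` and `Severity`, the validation rules, and the placeholder values in the usage lines are all assumptions made for the example. The check that `last_seen_at` does not precede `added_at` is likewise an assumption, inferred from the description of the timestamp pair.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from urllib.parse import urlparse


class Severity(Enum):
    CRITICAL = "critical"  # active malware
    HIGH = "high"          # vulnerable dependency or hijack-pattern match
    MEDIUM = "medium"      # suspicious metadata, awaiting confirmation
    INFO = "info"          # contextual notes; never affects grade


@dataclass(frozen=True)
class RegistryEntry:
    """Hypothetical in-memory shape of one registry entry."""

    model_id: str
    weights_fingerprint: tuple[str, ...]  # SHA-256 hex digests of weight files
    severity: Severity
    source: str
    source_url: str
    added_at: datetime
    last_seen_at: datetime
    notes: str = ""

    def __post_init__(self) -> None:
        author, _, model = self.model_id.partition("/")
        if not author or not model or "/" in model:
            raise ValueError("model_id must look like 'author/model'")
        parsed = urlparse(self.source_url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError("every entry needs a source URL")
        for ts in (self.added_at, self.last_seen_at):
            # utcoffset() is None for naive datetimes, so they are rejected too.
            if ts.utcoffset() != timedelta(0):
                raise ValueError("timestamps must be timezone-aware UTC")
        if self.last_seen_at < self.added_at:  # assumed invariant
            raise ValueError("last_seen_at cannot precede added_at")


# Usage with placeholder values (not a real registry entry).
now = datetime.now(timezone.utc)
entry = RegistryEntry(
    model_id="example-author/example-model",
    weights_fingerprint=("0" * 64,),
    severity=Severity.MEDIUM,
    source="Dokima operator manual review",
    source_url="https://example.org/disclosure",
    added_at=now,
    last_seen_at=now,
    notes="Placeholder entry for illustration.",
)
```

Rejecting an entry with no source URL at construction time mirrors the rule that every entry must be verifiable against an upstream source.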

Seed methodology

The registry started with a small curated seed list (under 100 entries) covering:

Publicly disclosed supply-chain incidents. Repositories cited in security advisories, conference talks, or vendor disclosures as confirmed malicious or compromised at a specific point in time. Each seed entry carries the disclosure URL as `source_url`.
Hugging Face safety-scanner positives. Files that the platform's own detection flagged as canonical malicious formats (e.g. pickle files with `__reduce__`-based code execution paths).
Security-research disclosure artifacts. Intentionally malicious models published for vulnerability-disclosure programs (huntr, CVE submissions, responsible-disclosure write-ups). These models are flagged at F grade but carry an additional self-declared-purpose note distinguishing them from supply-chain attacks. See the calibration policy for the wider context on how the rubric handles intentional security-research artifacts.

The seed list errs on the side of small. Verified disclosures only. No rumours, no speculation, no entries based on unverified user reports.

Auto-append discipline

The registry grows in three structurally-disciplined ways:

Detector promotion. When Dokima's scanning path detects an active malware pattern through a Hugging Face safety-scanner positive, an OSV.dev advisory match against the model's declared dependencies, or a Dokima-side static-analysis hit, the finding is recorded as an attestation on the verdict and the model identifier is appended to the registry with the relevant severity and source fields. The source URL on the appended entry is the same URL surfaced to the score-report viewer, so the audit trail is symmetric end to end.
Fingerprint matches. When a newly scanned model's weights are byte-identical to those of an existing registry entry, the new identifier is appended as a sibling entry. The original entry's source URL is copied; the sibling carries a "mirror of" note linking to the original. This catches the rebrand-and-republish pattern, where an adversary deletes a flagged model and re-uploads byte-identical weights under a new namespace.
Operator manual review. Where a finding originates from a source Dokima does not currently automate against (e.g. a CVE disclosure with no machine-readable advisory, or a vendor blog post citing a specific repository), the Dokima operator may manually append the entry. Manual entries follow the same shape as automated entries: every manual entry must carry at least one source URL pointing to the disclosure that motivated it.
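
The fingerprint-match path can be sketched roughly as follows. This is a hedged illustration under stated assumptions, not Dokima's implementation: the function names `fingerprint_weights` and `append_mirror_if_match`, the dict-based registry, and the choice to treat a match as exact equality of the full set of file hashes are all hypothetical. Copying the original's severity onto the sibling is also an assumption; the text above only states that the source URL is copied.

```python
import hashlib


def fingerprint_weights(paths):
    """Return a frozenset of SHA-256 hex digests, one per weight file."""
    digests = set()
    for path in paths:
        h = hashlib.sha256()
        with open(path, "rb") as fh:
            # Read in 1 MiB chunks so large weight files are never fully in memory.
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                h.update(chunk)
        digests.add(h.hexdigest())
    return frozenset(digests)


def append_mirror_if_match(registry, model_id, fingerprint):
    """Append a sibling entry if `fingerprint` equals an existing entry's.

    `registry` is a list of dicts using the field names from the table above.
    Returns the new sibling entry, or None if nothing was appended.
    """
    if not fingerprint:
        return None  # nothing to compare against
    if any(entry["model_id"] == model_id for entry in registry):
        return None  # identifier already registered
    for entry in registry:
        # Assumption: a match means the whole set of file hashes is identical.
        if frozenset(entry["weights_fingerprint"]) == fingerprint:
            sibling = {
                "model_id": model_id,
                "weights_fingerprint": fingerprint,
                "severity": entry["severity"],      # assumed to carry over
                "source": entry["source"],
                "source_url": entry["source_url"],  # copied from the original
                "notes": f"mirror of {entry['model_id']}",
            }
            registry.append(sibling)
            return sibling
    return None
```

Because the comparison uses content hashes rather than names, a re-upload under a different namespace still resolves to the original entry and its source URL.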

How entries are verifiable

Every entry on a public score report carries the source URL the entry was added under. Viewers and external researchers can verify the claim by following the link: read the disclosure, examine the upstream advisory, inspect the flagged file on Hugging Face. This makes the registry an audit surface in its own right — not a black box.

Where multiple sources confirm the same finding, the registry stores a single entry, and the score report surfaces the strongest source URL; the other sources are accessible via the appeals process documented below.

Disputes and removal

Model authors who believe their model has been wrongly registered may dispute the entry via the appeals process. Disputes are reviewed within 5 business days. Confirmed errors (entries whose source URL fails to support the finding, entries that confused the model with a similarly-named repository, entries where the underlying vulnerability has been remediated and the finding no longer applies) are corrected, logged in the methodology changelog, and the affected model is automatically rescanned with the corrected registry state.

Entries are not removed simply because the author has since cleaned up the repository — the historical attestation remains as evidence of the state at scan time, and the calibration corpus depends on that history. Where a remediation is genuine and the underlying weights no longer match the malicious fingerprint, the registry stores a "remediated on" timestamp on the entry rather than deleting it, so the trail stays auditable.
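
The keep-and-annotate behaviour described above might look like the following sketch. It is an assumption-laden illustration, not Dokima's code: the function `mark_remediated`, the `remediated_on` key name, and the conservative rule that a model counts as remediated only when none of its current weight files match a flagged hash are all hypothetical choices made for the example.

```python
from datetime import datetime, timezone


def mark_remediated(entry, current_fingerprint, now=None):
    """Return a copy of `entry` with a remediation timestamp, if warranted.

    The entry is never deleted. If any currently shipped weight file still
    matches a flagged hash, the entry is returned unchanged.
    """
    flagged = frozenset(entry["weights_fingerprint"])
    current = frozenset(current_fingerprint)
    if not current:
        # Assumption: an empty or deleted repository is not a remediation.
        return entry
    if not flagged.isdisjoint(current):
        return entry  # at least one flagged file is still being shipped
    updated = dict(entry)  # copy, so the historical record is left untouched
    stamp = now or datetime.now(timezone.utc)
    updated["remediated_on"] = stamp.isoformat()
    return updated
```

Returning an annotated copy instead of mutating or removing the entry keeps the state at scan time available for audit.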

Out of scope

The registry does not store anti-gaming heuristics or detection signatures — those live in the private detection layer per the open-core split. The registry stores what has been flagged, with audit links to where the flag came from. It does not store how Dokima decided to flag it; that's the rubric and the per-dimension detectors, all of which are documented elsewhere in this methodology.