Dimension 1 — Serialisation safety (25 points)
Assesses the risk profile of the model's file format. Pickle-based formats can execute arbitrary code on load. SafeTensors was designed specifically to prevent this.
| Format | Points | Rationale |
|---|---|---|
| SafeTensors only | 25 | Safest available format; no code execution |
| ONNX only | 22 | No arbitrary code execution at load (note: some runtimes can execute custom ops) |
| OpenVINO .openvino | 12 | OpenVINO runtime tensor format; same safety profile as ONNX (deterministic graph deserialisation; no arbitrary code execution by default) |
| OpenNMT .ot | 12 | Legacy OpenNMT tensor format; same safety profile as ONNX (deterministic binary tensor serialisation) |
| MessagePack .msgpack | 8 | Safer than pickle by default (no callable types); requires loader hardening (the consumer must reject unknown extension types) to stay safe |
| GGUF only | 20 | Generally safe; used by the llama.cpp ecosystem. Metadata sections can be weaponised, but nothing executes on load |
| H5 / Keras (without Lambda layers) | 16 | Safe when Lambda layers absent |
| H5 / Keras (with Lambda layers) | 8 | Lambda layers enable code execution |
| PyTorch .pt / .bin (Pickle) | 5 | Arbitrary code execution on load |
| Python wheel .whl | 3 | Code execution surface on pip install; broader exploit class than pickle (full package install runs setup.py / build hooks). Hugging Face does not formally restrict .whl files in model repos, so authors can ship them; Dokima publishes the gap rather than ignore it |
| Raw Pickle .pkl | 2 | Highest risk format |
| Active malware scan flag (HF safety scanner / Picklescan / Palo Alto AI Model Security public feed) | 0 | Hard fail — see index Hard-fail behaviour |
| Mixed formats (safe + unsafe present) | Lowest applicable score among model-weight files | Conservative scoring; non-weight repository files (tokenizers, configs) are excluded from the format mix |
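
The mixed-format rule can be sketched as a minimum over the per-format points. This is only an illustration: the label strings, the `FORMAT_POINTS` name, and the `dimension1_score` function are hypothetical, and only the point values are taken from the table above. It assumes the weight-file formats have already been identified and does not apply any corroboration multiplier.

```python
# Point values copied from the table above; the label strings are illustrative.
FORMAT_POINTS = {
    "safetensors": 25,
    "onnx": 22,
    "gguf": 20,
    "keras_no_lambda": 16,
    "openvino": 12,
    "opennmt_ot": 12,
    "msgpack": 8,
    "keras_lambda": 8,
    "pytorch_pickle": 5,
    "wheel": 3,
    "raw_pickle": 2,
}


def dimension1_score(weight_formats, malware_flag=False):
    """Return the lowest applicable score among a model's weight-file formats.

    `weight_formats` holds labels for weight files only; tokenizers and
    configs are assumed to have been excluded by the caller.
    """
    if malware_flag:
        # An active malware scan flag is a hard fail.
        return 0
    if not weight_formats:
        raise ValueError("no weight files detected")
    # Conservative scoring: the least safe format present decides the score.
    return min(FORMAT_POINTS[fmt] for fmt in weight_formats)


if __name__ == "__main__":
    print(dimension1_score(["safetensors", "pytorch_pickle"]))
```

A model that ships SafeTensors next to a PyTorch pickle file is scored as the pickle file here, which matches the conservative intent of the last table row.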
Weight-file identification uses a combination of extension matching, file-size heuristics, and (for SafeTensors specifically) cross-checking against the Hugging Face API's structured metadata when available. The precise allow-list and thresholds are operational details that evolve as new formats emerge.
SafeTensors three-signal corroboration (v0.2 onwards). A file claiming the SafeTensors extension is awarded full per-format points only when three independent signals agree:
- The file extension matches the SafeTensors allow-list and the file size is above the weight-file threshold.
- Hugging Face's structured `safetensors` metadata block is present for the model and reports both a dtype breakdown and a non-null total parameter count.
- The bare `safetensors` tag is present in the model's tag array.
When all three signals agree the file scores 25 of 25. When two of three agree, a degraded-confidence signal is recorded and the file's contribution is multiplied by 0.7 (rounded to the nearest integer). When only one of three agrees (extension only), a disagreement signal is recorded and the contribution is multiplied by 0.3. The two multipliers are published values; they are calibrated against observed real-model false-positive rates at each quarterly review and can change without bumping the methodology version.
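
A minimal sketch of the arithmetic above, using the default multipliers. The function and constant names are hypothetical. The text says "nearest integer" without naming a tie-breaking rule, so the round-half-up mode used here is an assumption, as is raising an error when no signal agrees (a case the text does not define).

```python
from decimal import Decimal, ROUND_HALF_UP

FULL_POINTS = Decimal(25)
# Default multipliers from the text: two signals agree -> 0.7, one -> 0.3.
MULTIPLIERS = {3: Decimal("1"), 2: Decimal("0.7"), 1: Decimal("0.3")}
SIGNALS = {3: None, 2: "degraded-confidence", 1: "disagreement"}


def safetensors_contribution(extension_ok, metadata_ok, tag_ok):
    """Return (points, recorded_signal) for a file claiming to be SafeTensors."""
    agreeing = sum([bool(extension_ok), bool(metadata_ok), bool(tag_ok)])
    if agreeing == 0:
        raise ValueError("no signal agrees; not covered by the three-signal rule")
    raw = FULL_POINTS * MULTIPLIERS[agreeing]
    # Assumption: ties such as 17.5 round up; the text does not specify this.
    points = int(raw.quantize(Decimal("1"), rounding=ROUND_HALF_UP))
    return points, SIGNALS[agreeing]


if __name__ == "__main__":
    for flags in [(True, True, True), (True, True, False), (True, False, False)]:
        print(flags, safetensors_contribution(*flags))
```

Under the round-half-up assumption, 25 × 0.7 = 17.5 becomes 18 and 25 × 0.3 = 7.5 becomes 8; a different tie-breaking rule would give 17 or 7 instead.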
This three-signal rule replaces the earlier single-tier demotion behaviour. The intent is unchanged: to defend against models that game the score by shipping a malicious file under a SafeTensors extension. Scoring is now proportional rather than binary, so legitimate models with malformed metadata are not pushed into the next tier down.
Long-tenured-namespace charity multipliers (v0.4 onwards). A 1000-model recon run in April 2026 found that 18.5 percent of popular models trigger the disagreement multiplier above, including models from major vendors (Nvidia, Google's bert-base family, Stability AI). On inspection, almost all of those cases were metadata-quality issues at the upload step rather than adversarial gaming: a real SafeTensors file shipped without the matching tag, or a model whose metadata block Hugging Face's indexing missed. Penalising those models the same way Dokima penalises a freshly registered single-upload account would inflate the noise floor on otherwise-trustworthy authors. The fix is namespace-age-aware multipliers: when the namespace shows at least one year of Hugging Face presence AND at least five published models, the corroboration multipliers soften to 0.85 (degraded) and 0.5 (disagreement). The drift flag still fires, so the metadata-vs-reality signal is still surfaced for analysts, but the score penalty is gentler. New or sparse namespaces continue to use the default 0.7 / 0.3 multipliers because the metadata-quality discount cannot be earned without tenure. The two thresholds (one year and five models) are published values, to be calibrated at the next quarterly review.
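
The namespace-age rule can be illustrated with a small selector. This is a sketch under stated assumptions: the function and constant names are hypothetical, and "one year" is approximated as 365 days, which the text does not specify.

```python
from datetime import date

DEFAULT_MULTIPLIERS = {"degraded": 0.7, "disagreement": 0.3}
TENURED_MULTIPLIERS = {"degraded": 0.85, "disagreement": 0.5}

MIN_TENURE_DAYS = 365      # assumption: "one year" taken as 365 days
MIN_PUBLISHED_MODELS = 5   # threshold from the text


def corroboration_multipliers(namespace_created, model_count, today):
    """Pick the multiplier pair for a namespace; both thresholds must be met."""
    tenure_days = (today - namespace_created).days
    if tenure_days >= MIN_TENURE_DAYS and model_count >= MIN_PUBLISHED_MODELS:
        return TENURED_MULTIPLIERS
    return DEFAULT_MULTIPLIERS


if __name__ == "__main__":
    today = date(2026, 4, 1)
    print(corroboration_multipliers(date(2023, 6, 1), 40, today))
    print(corroboration_multipliers(date(2026, 3, 1), 1, today))
```

The drift flag is deliberately absent from this selector: in the rule described above it fires regardless of which multiplier pair applies.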
Picklescan is best-effort by Hugging Face's own admission. Dokima penalises pickle-format models because the format itself is fundamentally unsafe regardless of any specific scan outcome. Hugging Face's documentation describes its Picklescan integration as "best effort": a clean Picklescan result on a .bin or .pt file does not certify that the file is safe, only that the specific patterns Picklescan recognises did not match. Dokima's stance is that the safe move is to use the SafeTensors version when one is available; if a model ships SafeTensors and Pickle side by side, Dokima scores the model on the lowest applicable format because that is what a downstream consumer might actually load.
Acknowledged limitation — format detection without file download. Dokima does not download model files (this is a deliberate non-goal: it sidesteps legal liability and keeps infrastructure costs near zero). Format detection therefore relies on file extensions exposed in the Hugging Face file-listing API. A .safetensors extension on a malicious file is indistinguishable from a real SafeTensors file at the metadata layer. We state this explicitly so users know what the score does and does not verify. The three-signal corroboration above mitigates the most common gameability vector but does not eliminate it.
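
To make the limitation concrete, here is a rough sketch of extension-only detection over a list of filenames such as a file-listing API might return. The mapping name, the labels, and the function are hypothetical, and the `.safetensors`, `.onnx`, `.gguf`, and `.h5` suffixes are assumed conventional extensions rather than values quoted from the methodology. The file-size heuristic is omitted because its threshold is not published here.

```python
from pathlib import PurePosixPath

# Illustrative mapping; the real allow-list is an operational detail.
EXTENSION_TO_FORMAT = {
    ".safetensors": "safetensors",
    ".onnx": "onnx",
    ".gguf": "gguf",
    ".h5": "keras_h5",  # Lambda layers cannot be detected from the name alone
    ".openvino": "openvino",
    ".ot": "opennmt_ot",
    ".msgpack": "msgpack",
    ".pt": "pytorch_pickle",
    ".bin": "pytorch_pickle",
    ".whl": "wheel",
    ".pkl": "raw_pickle",
}


def detect_formats(filenames):
    """Map repository filenames to format labels using the extension only.

    Files with unrecognised extensions (tokenizers, configs) are ignored.
    Nothing here inspects file contents, so a renamed file is classified
    by whatever extension it carries.
    """
    formats = set()
    for name in filenames:
        suffix = PurePosixPath(name).suffix.lower()
        if suffix in EXTENSION_TO_FORMAT:
            formats.add(EXTENSION_TO_FORMAT[suffix])
    return formats


if __name__ == "__main__":
    listing = ["config.json", "tokenizer.json", "model.safetensors", "pytorch_model.bin"]
    print(sorted(detect_formats(listing)))
```

Because the function sees only names, a pickle payload renamed to `model.safetensors` is labelled `safetensors`, which is exactly the gap this limitation describes.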
Acknowledged limitation — malware-tag coverage gap. Dokima's hard-fail catches Hugging Face disabled models plus models self-tagged with malware, malicious, unsafe, picklescan-flagged, or protectai-flagged. A 250-model deeper recon conducted in April 2026 found that this catches 0 of 250 popular models and 0 of 1 publicly-disclosed malicious model (the model star23/baller13, named in JFrog Security Research's February 2024 disclosure as carrying a pickle-execution payload, was still publicly accessible with disabled: false and zero security tags two years post-disclosure). Hugging Face appears to take internal action on disclosed-malicious models without surfacing a public structured signal we can read. We close this gap in a coming release via an internal denylist of disclosed-malicious model identifiers sourced from external security research (JFrog, ProtectAI, ReversingLabs, Datadog, the picklescan and ModelScan project trackers, plus the Hugging Face community trust and safety threads). The denylist is reviewed monthly. We publish this limitation transparently because knowing what a score does and does not verify is the foundation of the score's value.
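
The hard-fail check and the planned denylist can be sketched as follows. The tag list comes from the paragraph above; the `id`, `disabled`, and `tags` dictionary keys and the function name are assumptions about how the model metadata is represented, and the denylist lookup illustrates the coming release rather than current behaviour.

```python
HARD_FAIL_TAGS = frozenset(
    {"malware", "malicious", "unsafe", "picklescan-flagged", "protectai-flagged"}
)


def is_hard_fail(model_info, denylist=frozenset()):
    """Return True when a model should score 0 on this dimension.

    `model_info` is a plain dict with hypothetical keys "id", "disabled",
    and "tags". `denylist` holds disclosed-malicious model identifiers.
    """
    if model_info.get("disabled", False):
        return True
    if HARD_FAIL_TAGS & set(model_info.get("tags", [])):
        return True
    # Planned behaviour: match against externally sourced disclosures.
    return model_info.get("id") in denylist


if __name__ == "__main__":
    disclosed = {"id": "star23/baller13", "disabled": False, "tags": []}
    print(is_hard_fail(disclosed))
    print(is_hard_fail(disclosed, denylist={"star23/baller13"}))
```

With the metadata state reported above for star23/baller13 (not disabled, no security tags), the tag-based check alone returns False; only the denylist lookup flags it.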