Every Dokima score is auditable

Dokima is built so that every score it produces is independently reproducible: given the same Hugging Face metadata, the same methodology version, and the same weights configuration, two scans of the same model produce byte-identical Verdict JSON, apart from the scanned_at timestamp, which records when the scan ran. This is a load-bearing property for regulated AI deployers (banks, insurance, defence, healthcare) who need to audit how a third-party scoring tool reached a verdict on a model they intend to deploy.

This page documents the invariants that make the property true and the path a third party can take to verify it.

What "auditable" means

For each score Dokima publishes, the following are independently reproducible without any access to Dokima's internal infrastructure:

  1. The score itself. Re-running the open-source dokima CLI against the same model at the same Hugging Face commit SHA produces the same score_total, the same per-dimension breakdown, the same drift flags, and the same grade.
  2. The methodology applied. Each Verdict is stamped with methodology_version (the v0.X version published in this documentation) so the rubric in effect at scan time is unambiguous.
  3. The weights applied. Each Verdict is stamped with weights_sha256, the SHA-256 of the weights.toml configuration file consumed by the engine. Two scans with the same weights_sha256 were scored under identical weights.
  4. The serialisation order. Every map in the Verdict JSON is keyed in canonical order (the engine uses BTreeMap at every serialisation boundary), so re-scoring on a different machine or tomorrow does not produce a JSON that differs in field order.
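
The canonical-ordering point in item 4 can be illustrated with a minimal, dependency-free sketch. It is not the engine's serialiser: the hand-rolled to_json helper, the build helper, and the dimension names are hypothetical, chosen only to show that a BTreeMap yields the same key order regardless of insertion order.

```rust
use std::collections::BTreeMap;

/// Serialise a flat map of scores to a JSON object string.
/// Hand-rolled to keep the sketch dependency-free; keys are assumed
/// to contain no characters that need JSON escaping.
fn to_json(scores: &BTreeMap<String, u32>) -> String {
    let fields: Vec<String> = scores
        .iter() // BTreeMap always iterates in ascending key order
        .map(|(k, v)| format!("\"{}\":{}", k, v))
        .collect();
    format!("{{{}}}", fields.join(","))
}

/// Build a map from pairs given in an arbitrary insertion order.
fn build(pairs: &[(&str, u32)]) -> BTreeMap<String, u32> {
    pairs.iter().map(|(k, v)| (k.to_string(), *v)).collect()
}

fn main() {
    // Hypothetical dimension names, inserted in two different orders.
    let a = build(&[("provenance", 18), ("licence", 12), ("format_safety", 20)]);
    let b = build(&[("format_safety", 20), ("provenance", 18), ("licence", 12)]);

    // Insertion order does not leak into the serialised bytes.
    assert_eq!(to_json(&a), to_json(&b));
    println!("{}", to_json(&a));
    // prints {"format_safety":20,"licence":12,"provenance":18}
}
```

A HashMap in the same position would iterate in an order that can change between processes, which is why its iteration order must never reach stored output.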

The invariants

These are the load-bearing rules from the project's robustness discipline. Each is enforced in code:

| Invariant | What it guarantees | Where it's enforced |
| --- | --- | --- |
| Wall-clock timestamps captured once per scan and reused | A multi-step audit pipeline does not see two timestamps for the same scan | dokima-scoring/src/audit.rs AuditRegistry::run captures Utc::now() once and reuses it |
| HashMap iteration order must not leak into stored output | Two runs of the same scan produce byte-identical JSON | Every serialisation path uses BTreeMap, not HashMap |
| Idempotent score persistence | Two writes of the same (model_id, commit_sha) are a no-op | ScoreStore::put contract; SurrealDB upsert |
| Cache key includes the commit SHA | Cached content is immutable per SHA | ScoreKey::as_cache_key returns score::{author}/{model}@{commit_sha} |
| Methodology version stamp on every Verdict | The rubric in effect at scan time is unambiguous | Verdict::Scored.methodology_version |
| Weights SHA-256 stamp on every Verdict | The weights in effect at scan time are unambiguous | Verdict::Scored.weights_sha256 |
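
The cache-key and idempotent-persistence invariants can be sketched as follows. This is an illustration under stated assumptions, not the engine's code: the ScoreKey field names, the MemoryStore type, and the first-write-wins put are hypothetical stand-ins (the table above only states that the real store honours a ScoreStore::put contract backed by a SurrealDB upsert). Only the key format score::{author}/{model}@{commit_sha} is taken from this page.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the engine's ScoreKey; field names are assumptions.
struct ScoreKey {
    author: String,
    model: String,
    commit_sha: String,
}

impl ScoreKey {
    /// Follows the documented format: score::{author}/{model}@{commit_sha}
    fn as_cache_key(&self) -> String {
        format!("score::{}/{}@{}", self.author, self.model, self.commit_sha)
    }
}

/// Hypothetical in-memory store with an idempotent put.
#[derive(Default)]
struct MemoryStore {
    rows: BTreeMap<String, String>,
}

impl MemoryStore {
    /// Returns true if a new row was written, false if the key already existed.
    fn put(&mut self, key: &ScoreKey, verdict_json: &str) -> bool {
        let k = key.as_cache_key();
        if self.rows.contains_key(&k) {
            return false; // same model at the same commit: nothing to do
        }
        self.rows.insert(k, verdict_json.to_string());
        true
    }
}

fn main() {
    // Illustrative values only.
    let key = ScoreKey {
        author: "acme".into(),
        model: "tiny-bert".into(),
        commit_sha: "abc123".into(),
    };
    let mut store = MemoryStore::default();

    assert!(store.put(&key, "{\"score_total\":87}")); // first write lands
    assert!(!store.put(&key, "{\"score_total\":87}")); // second write is a no-op
    assert_eq!(store.rows.len(), 1);

    println!("{}", key.as_cache_key()); // prints score::acme/tiny-bert@abc123
}
```

Because the commit SHA is part of the key, a new commit to the same model produces a new key rather than overwriting an existing entry, which is what keeps cached content immutable per SHA.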

An explicit canary test runs the in-process scan 100 times against a fixture model and asserts the JSON is byte-identical across every run (with scanned_at masked because that field is by-design the time of the scan, not a function of the data). The test lives at crates/dokima-cli/tests/scan_reproducibility.rs and runs as part of the standard cargo test workspace gate.
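
The shape of such a canary can be sketched without any dependencies. The following is not the test in crates/dokima-cli/tests/scan_reproducibility.rs: fake_scan is a hypothetical stand-in for the in-process scan, the field values are invented, and mask_scanned_at assumes scanned_at is a plain string field with no escaped quotes.

```rust
use std::collections::BTreeMap;
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical stand-in for an in-process scan: deterministic fields
/// plus a wall-clock stamp that differs from run to run.
fn fake_scan() -> String {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is before the Unix epoch")
        .as_nanos();
    let mut fields: BTreeMap<&str, String> = BTreeMap::new();
    fields.insert("grade", "\"B\"".to_string());
    fields.insert("scanned_at", format!("\"{}\"", nanos));
    fields.insert("score_total", "87".to_string());
    let body: Vec<String> = fields
        .iter()
        .map(|(k, v)| format!("\"{}\":{}", k, v))
        .collect();
    format!("{{{}}}", body.join(","))
}

/// Replace the value of the "scanned_at" string field with a fixed placeholder.
fn mask_scanned_at(json: &str) -> String {
    let marker = "\"scanned_at\":\"";
    match json.find(marker) {
        Some(start) => {
            let value_start = start + marker.len();
            // The value ends at the next double quote (no escapes assumed).
            let value_end = value_start
                + json[value_start..]
                    .find('"')
                    .expect("unterminated scanned_at value");
            format!("{}<masked>{}", &json[..value_start], &json[value_end..])
        }
        None => json.to_string(),
    }
}

fn main() {
    let reference = mask_scanned_at(&fake_scan());
    for run in 0..100 {
        let got = mask_scanned_at(&fake_scan());
        assert_eq!(got, reference, "run {} diverged from the reference", run);
    }
    println!("{}", reference);
    // prints {"grade":"B","scanned_at":"<masked>","score_total":87}
}
```

Masking only the one field that is time-dependent by design keeps the comparison strict everywhere else: any other byte-level difference fails the assertion.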

How to verify a score independently

The verification path uses the open-source dokima CLI and the Hugging Face fixture corpus that ships with the repo:

    # 1. Clone the public Dokima repo (engine + stubs are AGPL-3.0-or-later).
    git clone https://github.com/The-Malware-Files/dokima
    cd dokima

    # 2. Build the CLI.
    cargo build --release -p dokima-cli

    # 3. Score a model offline against the bundled fixture corpus.
    ./target/release/dokima scan fixtures/safetensors-only-model \
      --replay crates/dokima-hf-client/fixtures \
      --json > my-verdict.json

    # 4. Compare against the same fixture's reference verdict.
    diff my-verdict.json crates/dokima-cli/fixtures/expected-verdicts/safetensors-only.json

For a live model rather than the fixture corpus, set HUGGING_FACE_READ_KEY and run:

    ./target/release/dokima scan {author}/{model} --json

Two scans of the same {author}/{model}@{commit_sha} will produce the same Verdict (modulo scanned_at).

What is NOT auditable

In the interest of honesty about the open-core split, the following surfaces are not publicly auditable (see also coverage-disclosure.md):

  • The hijacking-signature implementations in the private model-security-core crate are NOT publicly auditable. Their categories are documented on the Dim 4 page, but the specific signature definitions stay private to prevent adversarial route-around-testing.
  • The curated benchmark vocabulary for Dim 5's suspicious-metric flag is similarly private.
  • The disclosed-malicious-model denylist (the mitigation for the malware-tag coverage gap) is reviewed monthly but not published; its input sources (JFrog, ProtectAI, ReversingLabs, etc.) are public.

Public-engine-only OSS builds run with stub implementations of these private surfaces. The Verdict JSON from a public OSS build is structurally identical to a production Verdict; the difference is the production binary may emit additional drift flags from the private detection layer that the OSS build's stubs cannot. Both are auditable per the invariants above; the production build's additional flags trace back to the private crate's own internal review process.

Why this matters

Lighthouse-style scoring tools that produce a single number without a reproducibility trail are useful for product feedback but cannot survive an audit conversation. A regulated AI deployer who opens a pilot conversation with "we ran this scoring tool and our model scored X" needs to be able to answer "show us how X was computed" without depending on the scoring vendor's continued availability or goodwill. The invariants above are what make that answer possible.

This is the Dokima moat for regulated-AI-deployer prospects: open methodology + reproducible-by-design + content-addressable cache + offline replay + version stamps. No competitor scoring tool currently ships all five.