Population baselines
The Dokima recon corpus produces population baselines that shape what a per-dimension score actually means in context. A reader interpreting a low score on a fresh recalibration corpus should know which low scores reflect the population norm and which mark genuine outliers.
These rates are recomputed at every quarterly recalibration cycle (see Calibration), and the table is republished here whenever any base rate moves by more than one percentage point.
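The republish rule can be sketched as a simple comparison between the published and recomputed rates. This is a minimal illustration, not Dokima's implementation: the function name, the dictionary shape, and the recomputed values are all hypothetical, and only the one-percentage-point trigger comes from the text above.

```python
def rates_needing_republish(published, recomputed, threshold_pp=1.0):
    """Return signals whose rate moved by more than threshold_pp percentage points.

    Both arguments map a signal name to a rate in percent (hypothetical shape).
    """
    moved = {}
    for signal, old_rate in published.items():
        new_rate = recomputed.get(signal)
        if new_rate is None:
            continue  # signal not recomputed this cycle; nothing to compare
        delta = new_rate - old_rate
        if abs(delta) > threshold_pp:
            moved[signal] = round(delta, 2)
    return moved


# Published values are from the table below; recomputed values are invented.
published = {"model_index": 7.0, "library_name": 72.0, "pipeline_tag": 73.0}
recomputed = {"model_index": 7.4, "library_name": 70.5, "pipeline_tag": 73.9}

moved = rates_needing_republish(published, recomputed)
print(moved)
```

In this invented example only library_name crosses the trigger, so the table would be republished on its account alone.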
Recon baselines (April 2026, 1500-model corpus)
| Signal | Population rate | What it means |
|---|---|---|
| model-index populated (Dim 5 evidence) | ~7% | A 0/15 in Dim 5 is the population norm, not an outlier. Models that publish a model-index are doing something the overwhelming majority of authors do not. |
| library_name populated | ~72% | 28% of repos have no library_name. Largely structural: Hugging Face discontinued automatic library detection for repos created after August 2024, so newer repos that don't manually set the field end up empty even when the underlying library is obvious. |
| pipeline_tag populated | ~73% | 27% missing. Same dynamic as library_name: newer repos that did not opt into the manual pipeline-tag declaration end up empty. |
| Structured safetensors metadata block populated | ~45% | This is the upper bound on Dim 1's three-signal corroboration earning full credit; below that ceiling sits the metadata-quality long tail. |
| Models with zero structured signals across the four-signal set (library_name, pipeline_tag, model-index, safetensors block) | ~18% | These models score the floor on multiple dimensions simultaneously by population baseline rather than author failure. Dokima emits a LowSignalDensity drift flag on these models for analyst review; the flag is observability-only at v0.4. |
| Q-65 SafeTensors corroboration disagreement (Dim 1) | ~18.5% | Includes major vendors (Nvidia, Google's bert-base family, Stability AI). Almost all observed cases are metadata-quality issues at the upload step rather than adversarial gaming. The long-tenured-namespace charity multiplier in Dim 1 was added in v0.4 to soften the penalty for established authors. |
| HF isVerified == true on namespaces | ~3% | Verified-status applies to a small minority of namespaces; the bonus mechanic in Dim 4 is sized accordingly so unverified accounts are not penalised. |
| cardData.gated non-false | ~6% | Author-controlled access is rare; the +1 bonus in Dim 4 rewards the small fraction of authors who set the friction layer. |
| EvalResultsUnknownShape drift flag fires per scan | ~10.8% | Recorded against an early multi-thousand-model recon. The HF .eval_results/*.yaml schema is partial; this rate measures how often a YAML file carries top-level keys we do not recognise. The defensive-parsing path (Dim 5) consumes the recognised fields without crashing the scan. |
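The zero-signal row in the table can be illustrated with a short check over the four-signal set. This is a sketch under stated assumptions: the record shape, the key names, and the rule for what counts as "populated" are hypothetical and are not taken from Dokima's code.

```python
# Hypothetical key names standing in for the four-signal set named in the table.
FOUR_SIGNAL_SET = ("library_name", "pipeline_tag", "model_index", "safetensors_metadata")


def is_populated(value):
    # Assumption: None, empty strings, and empty containers count as missing.
    return value is not None and value != "" and value != [] and value != {}


def low_signal_density(record):
    """True when none of the four structured signals is populated."""
    return not any(is_populated(record.get(key)) for key in FOUR_SIGNAL_SET)


def low_signal_rate_pct(records):
    """Share of records, in percent, with zero structured signals."""
    if not records:
        raise ValueError("no records to summarise")
    return 100.0 * sum(low_signal_density(r) for r in records) / len(records)


# Invented records for illustration only.
records = [
    {"library_name": "transformers", "pipeline_tag": "text-generation"},
    {"library_name": "", "model_index": []},
    {},
    {"safetensors_metadata": {"format": "pt"}},
]
print(low_signal_rate_pct(records))
```

Because the flag is observability-only at v0.4, a check like this would feed analyst review rather than change any score.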
How baselines are used
The baselines are a calibration tool, not a scoring input. A 0/15 in Dim 5 with the population baseline at 93 percent zero-rate is not exceptional; a 0/15 in Dim 5 with the baseline at 30 percent zero-rate would be. Knowing the baseline lets a reader of the methodology contextualise a specific score.
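The arithmetic behind that comparison is small enough to sketch. The snippet below converts a populated rate into a zero rate and attaches a reading to it; the 50 percent cutoff and the function names are assumptions chosen for illustration, since the methodology does not state a numeric boundary between "norm" and "exceptional".

```python
def zero_rate_pct(populated_rate_pct):
    """Share of the population scoring zero, given the share with the signal populated."""
    return 100.0 - populated_rate_pct


def contextualise_zero(populated_rate_pct, norm_cutoff_pct=50.0):
    # norm_cutoff_pct is a hypothetical cutoff, not a documented Dokima threshold.
    rate = zero_rate_pct(populated_rate_pct)
    label = "population norm" if rate >= norm_cutoff_pct else "worth a closer look"
    return rate, label


# model-index is populated in roughly 7% of the corpus, so 93% score zero.
print(contextualise_zero(7.0))
# A hypothetical signal populated in 70% of the corpus leaves a 30% zero rate.
print(contextualise_zero(70.0))
```

The label never feeds back into the score; it only frames how a reader should interpret it.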
Drift alerts (see Calibration §automated-drift-alerts) trigger manual review when a per-dimension distribution moves away from these published baselines by more than a documented threshold. The alert path is what catches gameability drift between full sweeps.
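One way to picture a drift alert is to recompute a zero share from a fresh sample of scores and compare it with the published baseline. This is a hedged sketch only: the function names, the sample, and the 5-point threshold are hypothetical, and the actual thresholds are the ones documented under Calibration.

```python
def zero_share_pct(scores):
    """Percentage of scores in the sample that are exactly zero."""
    if not scores:
        raise ValueError("no scores to summarise")
    return 100.0 * sum(1 for s in scores if s == 0) / len(scores)


def needs_manual_review(scores, baseline_zero_pct, threshold_pp):
    """Return (flag, observed): flag is True when the observed zero share
    differs from the baseline by more than threshold_pp percentage points."""
    observed = zero_share_pct(scores)
    return abs(observed - baseline_zero_pct) > threshold_pp, observed


# Invented Dim 5 sample: 80 zeros out of 100 scores against a 93% baseline.
sample = [0] * 80 + [5] * 20
flag, observed = needs_manual_review(sample, baseline_zero_pct=93.0, threshold_pp=5.0)
print(flag, observed)
```

A sudden drop in the zero share, as in this invented sample, is the kind of movement that could indicate authors adapting to the scoring rather than a real change in the population.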