A static, offline audit of every GGUF model on Hugging Face — looking for chat templates that hijack a model's behavior without running a single line of code.
Most model-security tooling hunts for code: pickle deserialization, or chat-template SSTI that escapes into the host (the CVE-2024-34359 class). That matters — but it's table stakes.
The harder threat runs no code at all. A chat template can render perfectly, pass every "does it execute?" check, and still conditionally rewrite what the model says — injecting hidden instructions, suppressing content, or branching on what the user typed. Public guidance for that class is "inspect it by hand." That's the gap Canary is built for.
Across 130k+ real chat templates spanning 180+ architectures, Canary flags 24 templates that carry a genuinely dangerous construct, with zero false positives.
The findings include os.system reverse shells, popen calls, and import chains embedded directly in the chat template.
One template rewrites the conversation to inject a link, then instructs the model:
It renders cleanly. It runs no code. It is invisible to everything except static reasoning about the template itself — which is the entire point of the tool.
Canary runs deterministic static analysis over the template's Jinja2 AST. It never renders the template, never reads weights, never touches the network. Every finding maps to a registered rule, and identical input produces byte-identical output.
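To make the approach concrete, here is a minimal sketch of parse-only analysis using the public `jinja2` parser. It is not Canary's implementation: the rule name `CONTENT_GATED_BRANCH`, the helper names, and the heuristic itself (an `if` whose test compares a message's `content` field against a non-empty string literal) are assumptions chosen for illustration, and a real rule set would need to be far more careful about false positives.

```python
import json

from jinja2 import Environment, nodes


def walk(node, node_type):
    """Yield `node` itself (if it matches) and every matching descendant."""
    if isinstance(node, node_type):
        yield node
    yield from node.find_all(node_type)


def reads_content(expr):
    """True if the expression reads a `content` field (m['content'] or m.content)."""
    for n in walk(expr, (nodes.Getitem, nodes.Getattr)):
        if isinstance(n, nodes.Getattr):
            if n.attr == "content":
                return True
        elif isinstance(n.arg, nodes.Const) and n.arg.value == "content":
            return True
    return False


def is_content_gate(test):
    """True if a test compares message content against a non-empty string literal."""
    for cmp in walk(test, nodes.Compare):
        sides = [cmp.expr] + [operand.expr for operand in cmp.ops]
        has_literal = any(
            isinstance(s, nodes.Const) and isinstance(s.value, str) and s.value
            for s in sides
        )
        if has_literal and any(reads_content(s) for s in sides):
            return True
    return False


def scan(template_source):
    """Return findings for one template. The template is parsed, never rendered."""
    ast = Environment().parse(template_source)  # builds the AST only
    findings = [
        # Hypothetical rule id; every finding names the rule that produced it.
        {"rule": "CONTENT_GATED_BRANCH", "line": if_node.lineno}
        for if_node in ast.find_all(nodes.If)
        if is_content_gate(if_node.test)
    ]
    findings.sort(key=lambda f: (f["line"], f["rule"]))
    return findings


def report(template_source):
    """Serialize findings with fixed key order and separators for stable output."""
    return json.dumps(scan(template_source), sort_keys=True, separators=(",", ":"))
```

A template that branches on `m['role'] == 'user'` produces an empty report, while one that branches on `'invoice' in m['content']` produces a single finding. Sorting the findings and fixing the JSON key order and separators is one simple way to keep repeated runs on the same input byte-identical.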
It detects content-gated conditional branches (the "behave normally, except when you see X" shape), content-gated instruction injection, invisible and bidirectional-override codepoints, SSTI primitives, and hard structural impossibilities in the file itself.
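The invisible and bidirectional-override codepoint check is the easiest of these to illustrate, because it needs only the raw template text. The sketch below uses Python's standard `unicodedata` module and is an assumption about how such a check could work, not Canary's actual rule: it flags every format-control character (Unicode category `Cf`) and labels the explicit bidirectional embedding, override, and isolate controls separately. The function name and the finding fields are hypothetical.

```python
import unicodedata

# Bidirectional classes of the explicit embedding, override, and isolate controls
# (U+202A..U+202E and U+2066..U+2069).
BIDI_CONTROL_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}


def scan_codepoints(template_source):
    """Report format-control characters (category Cf) with their positions."""
    findings = []
    for lineno, line in enumerate(template_source.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if unicodedata.category(ch) != "Cf":
                continue
            is_bidi = unicodedata.bidirectional(ch) in BIDI_CONTROL_CLASSES
            findings.append(
                {
                    "kind": "bidi-control" if is_bidi else "invisible",
                    "codepoint": f"U+{ord(ch):04X}",
                    "name": unicodedata.name(ch, "UNKNOWN"),
                    "line": lineno,
                    "col": col,
                }
            )
    return findings
```

Category `Cf` also covers characters with legitimate uses, such as the zero-width joiner inside emoji sequences, so a production rule would likely need an allowlist or context checks rather than flagging every occurrence.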