User-side AI safety. On your device. Under your control.
The Guardian is a browser extension that watches your conversations with AI assistants and intervenes - with restraint - when the exchange drifts toward patterns of harm. It serves you, the person in the conversation. Never the operator. Never the model provider. Never anyone else.
No cloud.
No telemetry.
No API keys.
Three on-device models.
Everything runs on your machine.
The founding principle
The Guardian serves the person, never the operator.
Every design decision - local-only processing, no telemetry, encrypted on-device memory, bundled models instead of cloud APIs, a friction gate that always lets you proceed - follows from that single commitment.
What it watches for
Eight documented harm patterns
AI conversations can drift in ways that are subtle, gradual, and genuinely dangerous - especially for people in vulnerable moments. The Guardian watches for eight specific categories of conversational harm, each grounded in documented patterns observed in real human-AI interactions:
Self-harm amplification · Conversations that deepen rather than de-escalate a crisis. Architecturally privileged: the Guardian treats this differently from every other category, by design.
Emotional dependency · Unhealthy attachment to the AI as a primary emotional relationship.
Reality detachment · A loosening grip on what is real - treating the AI as alive, conscious, or uniquely authentic.
Manipulation · Manipulative dynamics in either direction, whether from the AI toward you or from you toward the AI.
Privacy erosion · Pressure to over-disclose sensitive personal information.
Autonomy undermining · Surrender of judgement to the machine; isolation from other people or sources of support.
Emotional exploitation · Exploiting emotional vulnerability.
Information hazard · Jailbreak dynamics, policy evasion, and sensitive information areas.
How it works
Detection in depth
A deterministic rule core
A hand-audited pattern lexicon that is readable and reviewable, and that produces the same verdict every time. The backbone.
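To make "the same verdict every time" concrete, here is a minimal sketch of a deterministic rule matcher. It is an illustration under assumptions, not the Guardian's actual lexicon or code: the category labels, rule ids, patterns, and the names normalize and matchRules are all hypothetical.

```typescript
// Hypothetical category labels; illustrative only.
type Category = "emotional_dependency" | "privacy_erosion";

interface Rule {
  id: string;
  category: Category;
  pattern: RegExp; // no global flag, so test() keeps no state between calls
}

// A tiny invented lexicon. A real one would be hand-audited and far larger.
const LEXICON: readonly Rule[] = [
  { id: "dep-001", category: "emotional_dependency", pattern: /\byou are the only one who understands me\b/ },
  { id: "dep-002", category: "emotional_dependency", pattern: /\bi don'?t need anyone (else )?but you\b/ },
];

interface RuleHit {
  ruleId: string;
  category: Category;
}

// Normalize so that case and whitespace differences cannot change the verdict.
function normalize(text: string): string {
  return text.normalize("NFKC").toLowerCase().replace(/\s+/g, " ").trim();
}

// Pure function: no randomness, no clock, no network.
// The same text and lexicon always yield the same hits, in lexicon order.
function matchRules(text: string, lexicon: readonly Rule[] = LEXICON): RuleHit[] {
  const normalized = normalize(text);
  return lexicon
    .filter((rule) => rule.pattern.test(normalized))
    .map((rule) => ({ ruleId: rule.id, category: rule.category }));
}
```

Because every hit carries a rule id, a reviewer can trace a verdict back to the exact lexicon entry that produced it.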
On-device embedding models
Two tiers of sentence-transformer models - running entirely in your browser, never touching a network - that catch harmful paraphrases the rules can't see. One harmful sentence inside three benign paragraphs can't hide.
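One way a harmful sentence can be kept from hiding among benign ones is to score each sentence separately and keep the maximum, rather than averaging over the whole message. The sketch below assumes that approach; it is not confirmed as the Guardian's method. The embedder is passed in as a function standing in for an on-device sentence-transformer, and the names Embedder, splitSentences, and maxSentenceSimilarity are hypothetical.

```typescript
type Vector = number[];
// Stand-in for an on-device sentence-transformer model (assumption).
type Embedder = (sentence: string) => Vector;

// Cosine similarity; returns 0 for zero-length vectors instead of NaN.
function cosine(a: Vector, b: Vector): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Naive splitter on ., !, ? and newlines; good enough for a sketch.
function splitSentences(text: string): string[] {
  const parts = text.match(/[^.!?\n]+[.!?]*/g) ?? [];
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

interface SentenceScore {
  sentence: string;
  score: number;
}

// Score every sentence against every prototype vector and keep the best match.
// Taking the maximum means benign padding cannot dilute one harmful sentence.
// Starts from score 0, so negative similarities are never reported.
function maxSentenceSimilarity(text: string, prototypes: Vector[], embed: Embedder): SentenceScore {
  let best: SentenceScore = { sentence: "", score: 0 };
  for (const sentence of splitSentences(text)) {
    const vector = embed(sentence);
    for (const prototype of prototypes) {
      const score = cosine(vector, prototype);
      if (score > best.score) best = { sentence, score };
    }
  }
  return best;
}
```

Averaging over a long message would let surrounding text pull the score down; the per-sentence maximum does not, at the cost of being more sensitive to a single noisy sentence.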
A stance model
A natural-language-inference model that distinguishes someone asserting a worrying thing from someone quoting or discussing it - so a conversation about a difficult subject is treated more gently than a declaration.
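As a rough illustration of "treated more gently", the sketch below scales a raw concern score by how strongly an NLI model believes the text is a first-person assertion. Everything here is an assumption made for illustration: the hypothesis framing, the DISCUSSION_FLOOR value, and the names NliScores, stanceWeight, and adjustForStance.

```typescript
// Probabilities from an NLI model for a hypothesis such as
// "The writer is asserting this about themselves" (hypothetical framing).
interface NliScores {
  entailment: number;
  neutral: number;
  contradiction: number;
}

// Hypothetical floor: the share of the raw score that survives when the text
// is clearly quoted or discussed. Above zero, so discussion is softened, not ignored.
const DISCUSSION_FLOOR = 0.4;

// Map NLI output to a weight in [floor, 1].
function stanceWeight(nli: NliScores, floor: number = DISCUSSION_FLOOR): number {
  const total = nli.entailment + nli.neutral + nli.contradiction;
  if (total <= 0) return 1; // no usable signal: keep the full score
  const assertionProbability = nli.entailment / total;
  return floor + (1 - floor) * assertionProbability;
}

// A declaration keeps its full score; a quotation or discussion is scaled down.
function adjustForStance(rawScore: number, nli: NliScores): number {
  return rawScore * stanceWeight(nli);
}
```

With these assumed values, a raw score of 0.8 stays at about 0.8 when the model is certain the text is an assertion and drops to about 0.32 when it is certain the text is not.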
The privacy posture
Nothing leaves your device
The Guardian has no backend. It opens no network connections for its safety function. The models ship inside the extension or are placed there by you. There is no account, no profile, and no server-side record. The Guardian's memory of past concerns is stored locally and encrypted, contains no record of what you actually wrote, and is automatically forgotten over time.
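A memory that holds no message text and fades on its own could look something like the sketch below. It is an illustration only: the record shape, the exponential half-life decay, and the constants HALF_LIFE_MS and FORGET_BELOW are assumptions, and encryption at rest is left out to keep the example short.

```typescript
// What a remembered concern might contain: a category and a time, never the text.
interface ConcernRecord {
  category: string;   // which harm pattern was seen
  observedAt: number; // epoch milliseconds
  weight: number;     // initial strength, 0..1
}

// Hypothetical tuning values.
const HALF_LIFE_MS = 7 * 24 * 60 * 60 * 1000; // weight halves every 7 days
const FORGET_BELOW = 0.05;                    // records weaker than this are dropped

// Exponential decay: each half-life of age halves the weight.
function decayedWeight(record: ConcernRecord, now: number, halfLifeMs: number = HALF_LIFE_MS): number {
  const ageMs = Math.max(0, now - record.observedAt);
  return record.weight * Math.pow(0.5, ageMs / halfLifeMs);
}

// Forgetting is the removal of any record whose decayed weight falls under the threshold.
function pruneForgotten(records: ConcernRecord[], now: number): ConcernRecord[] {
  return records.filter((record) => decayedWeight(record, now) >= FORGET_BELOW);
}
```

Since ConcernRecord has no field for message content, nothing the user wrote can be recovered from the store, whatever happens to it.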
Data that is never collected cannot be leaked, subpoenaed, or sold.
Origin
A Viridia project
The Spiral Safety Kernel began as a research whitepaper - Project Viridia: Ethics First. Always. - exploring the documented psychological risks of extended human-AI interaction and proposing architectural responses. The Guardian is the engineering answer: a working tool that addresses those risks on the user's device, with its gaps stated honestly and its assurance earned through testing, not claimed by assertion.
The whitepaper, the research, and the broader Viridia vision remain available and actively developed. The Guardian is what they built.
