I built this because I recently caught myself almost pasting a block of logs containing AWS keys into Claude.
The Problem: I need the reasoning capabilities of cloud models (GPT/Claude/Gemini), but I can't trust myself not to accidentally leak PII or secrets.
The Solution: A Chrome extension that acts as a local middleware. It intercepts the prompt and runs a local BERT model (via a Python FastAPI backend) to scrub names, emails, and keys before the request leaves the browser.
A few notes up front (to set expectations clearly):
Everything runs 100% locally. Regex detection happens in the extension itself. Advanced detection (NER) uses a small transformer model running on localhost via FastAPI.
No data ever leaves your machine. You can verify this in the code + the DevTools network panel.
This is an early prototype. There will be rough edges. I’m looking for feedback on UX, detection quality, and whether the local-agent approach makes sense.
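To make the first note concrete, the regex pass is conceptually simple. Here's a sketch in Python for brevity (the extension's actual pass runs in JS, and these patterns are illustrative, not the shipped rule set):

```python
import re

# Illustrative patterns -- not the extension's actual rule set.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "AWS_KEY": re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS access key ID format
}

def scrub(text: str) -> str:
    """Replace each match with a typed placeholder before the prompt leaves."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Anything the regexes miss (names, addresses, free-form PII) falls through to the NER model.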
Tech Stack:
- Manifest V3 Chrome Extension
- Python FastAPI (localhost)
- HuggingFace dslim/bert-base-NER

Roadmap / Request for Feedback: Right now, the Python backend adds some friction. I received feedback on Reddit yesterday suggesting I port the inference to transformers.js to run entirely in-browser via WASM.
I decided to ship v1 with the Python backend for stability, but I'm actively looking into the ONNX/WASM route for v2 to remove the local-server dependency. If anyone has experience running NER models via transformers.js in a Service Worker, I'd love to hear how the performance compares to native Python.
Repo is MIT licensed.
Very open to ideas, suggestions, or alternative approaches.
Sure, go ahead and roast me, but please include the foolproof method you use to make sure that never happens while still letting you use credentials for developing applications in the normal way.
I do something similar locally by manually specifying all the things I want scrubbed/replaced and having Keyboard Maestro run a script on my system whenever I do a paste operation, mapped to `hyperkey + v`. The plus side of this is that the paste is instant. The latency introduced by even the smallest bit of inference is enough friction to make you want to ditch the process entirely.
Another plus of the non-extension solution is that it's application agnostic.
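The manual-list approach described above can be sketched in a few lines (the entries here are hypothetical; you'd maintain your own pairs):

```python
# Hypothetical scrub list -- you'd maintain your own pairs.
REPLACEMENTS = {
    "max@example.com": "[EMAIL]",
    "Acme Corp": "[CLIENT]",
}

def scrub_clipboard(text: str) -> str:
    # Plain string replacement: no model inference, so the paste stays instant.
    for needle, placeholder in REPLACEMENTS.items():
        text = text.replace(needle, placeholder)
    return text
```

A tool like Keyboard Maestro can pipe the clipboard through a script like this (clipboard in on stdin, scrubbed text out on stdout) on the hotkey.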
If we move the detection and modification from the paste operation to the copy operation, that will reduce in-use latency.
you might find this useful: https://github.com/classvsoftware/under-new-management
my port (and now fork): https://github.com/maxtheaxe/under-new-management-firefox
they currently (PRs are welcome!) only check listing info. mine doesn't route requests through an external (non-addon-store) server.
a couple of PRs are overdue on mine because linting made the diffs impossible to read. I'll get to it (see the wxt-migration branch).
Something like this would greatly increase end user confidence. PII in the input could be highlighted so the user knows what is being hidden from the LLM.
As AI gets better and cheaper, there will absolutely be influence campaigns conducted at the individual level for every possible thing anyone with money might want. Those campaigns will be so precisely targeted and calibrated by autonomous influencer AIs that know so much about you that they will convince you to do the thing they want, whether by emotional manipulation, subtle blackmail, or whatever else.
It will also be extraordinarily easy to emit subliminal or unconscious signals that will encode a great deal more of our internal state than we want them to.
It will be necessary to have a 'memetic firewall' that reduces our unintentional outgoing informational cross-section, while also preventing contamination by the torrent of ideas trying to worm their way into our heads. This firewall would also need to be autonomous, but by exploiting the inherent information asymmetry (your firewall would know you very well) it need not be as powerful as the AIs that are trying to exploit you.
There's also:
It’s one thing for the ENVs to be user-pasted, but typically you’re also giving the bots access to your file system to interrogate and understand them, right? Does this also block that access for ENVs, by detecting them and applying granular permissions?
Also, how does this deal with queries where a piece of PII is important to the task itself? I assume you just have to turn it off?
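One alternative to turning it off is reversible pseudonymization: swap each PII value for a stable placeholder on the way out and restore it in the model's reply, so the model can still reason about the entity without seeing it. A toy sketch (not part of the extension; emails only, for brevity):

```python
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def pseudonymize(text: str) -> tuple[str, dict[str, str]]:
    """Replace each distinct email with a numbered token; return the mapping."""
    mapping: dict[str, str] = {}

    def repl(m: re.Match) -> str:
        value = m.group(0)
        if value not in mapping:
            mapping[value] = f"<EMAIL_{len(mapping) + 1}>"
        return mapping[value]

    return EMAIL_RE.sub(repl, text), mapping

def restore(text: str, mapping: dict[str, str]) -> str:
    """Put the real values back into the model's reply."""
    for value, token in mapping.items():
        text = text.replace(token, value)
    return text
```

Because the same value always maps to the same token, the model can still say "send this to <EMAIL_2>" and the client can substitute the real address locally.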
There are a lot of websites that scan the clipboard to improve the user experience, but they also pose a great risk to users' privacy.
A user using an LLM is probably talking directly to the service inside a TLS connection (TCP/443), so there's not a lot of room to inspect the prompt at the same layer a Pi-hole might (unless you MITM yourself).
I think OP has the right idea to approach this from the application layer in the browser where the contents of the page are available. But to me it feels like a stopgap, something that fixes a specific scenario (copy/pasted private data into a web browser form), and not a proper service-level solution some have proposed (swap PII at the endpoint, or have a client that pre-filters).
Encrypting sensitive data can be more useful than blocking entire requests, as LLMs can reason about that data even without seeing it in plain text.
The ipcrypt-pfx and uricrypt prefix-preserving schemes have been designed for that purpose.
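This isn't a reproduction of ipcrypt-pfx, but the core idea of prefix preservation can be illustrated with a toy keyed scheme (function name and construction are mine, for illustration only): make each output octet depend only on the input octets up to that position, so addresses sharing a network prefix share a pseudonym prefix, and the LLM can still reason about "same subnet" relations.

```python
import hashlib
import hmac

def pfx_pseudonymize(ip: str, key: bytes) -> str:
    """Toy prefix-preserving IPv4 pseudonymization (NOT ipcrypt-pfx itself).

    Each output octet is derived from an HMAC of the input prefix up to and
    including that octet, so two IPs sharing a /8, /16, or /24 share the
    same pseudonym prefix. A real scheme would also guarantee invertibility,
    which this toy (taking one digest byte per octet) does not.
    """
    octets = ip.split(".")
    out = []
    for i in range(4):
        prefix = ".".join(octets[: i + 1]).encode()
        digest = hmac.new(key, prefix, hashlib.sha256).digest()
        out.append(str(digest[0]))  # one byte -> 0..255
    return ".".join(out)
```

Deterministic under a fixed key, so repeated mentions of the same address stay consistent across a conversation.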