It kind of sounds like the LLM built a large system that doesn't necessarily achieve any actual value.
1. There are a lot of Agentic Data Plane startups for knowledge workers (not really for coders[1], but for CFOs, analysts, etc.) popping up, e.g. https://www.redpanda.com/, so people can ask "Hey, give me a breakdown of last year's sales target by region and type, and compare 2026 to 2025 for Q1".
Now this can be done entirely on the intranet and only against certain permissioned data servers (by agents or humans), but as someone pointed out, the intranet can also be a dangerous place. So I guess this is about protecting DB tables, Jiras, and documentation you are not allowed to see?
2. People who have skills — like the one OP has with wasm (I guess?) — are building random infra projects for enabling this.
3. All the coding people are getting weirded out by its security model because it is ofc not built for them.
[1] As I have commented elsewhere in this thread, the moment a coder does webfetch + codeexec, it's game over from a security perspective. Prove me wrong on that, please.
SEKS — Secure Environment for Key Services
We built a broker for keys/secrets. We have a fork of nushell called seksh, which takes stand-ins for the actual auth material but only reifies them inside the shell's AST. This makes the keys inaccessible to the agent. In the end, the agent won't even have its Anthropic/OpenAI keys!
The broker also acts as a proxy, and injects secrets or even does asymmetric key signing on behalf of the proxied agent.
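The stand-in/injection idea can be sketched roughly like this (a toy illustration, not the actual seksh/broker code; all names here are hypothetical):

```python
# Hypothetical sketch of the broker idea: the agent only ever sees
# opaque stand-in tokens; the broker swaps them for real secrets at
# the proxy boundary, so the key never enters the agent's context.
SECRET_STORE = {"STANDIN:anthropic": "sk-real-key-123"}  # broker-side only

def inject_secrets(headers: dict) -> dict:
    """Replace stand-in tokens with real secrets in outgoing headers."""
    return {
        k: SECRET_STORE.get(v, v)  # pass through anything that's not a stand-in
        for k, v in headers.items()
    }

# The agent builds a request with the stand-in, never the key:
agent_headers = {"x-api-key": "STANDIN:anthropic"}
proxied = inject_secrets(agent_headers)
```

The point being: the substitution happens on the broker's side of the boundary, so nothing in the agent's memory or transcript ever contains the real key.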
My agents are already running on our fork of OpenClaw, doing the work. Their Doppler ENV vars have been deprecated, and all their work goes through the broker!
All that said, we might just take a few ideas from IronClaw as well.
I put up a Show HN, but no one noticed: https://news.ycombinator.com/item?id=47005607
Website is here: https://seksbot.com/
I mean, honestly, if you pronounce the name it is going to sound like that outside Eastern Europe too, so I am not sure about that name choice at all. Intentional?
Looking at the website it looks like a vibecoded joke, but what do I know.
We need to know whether an email being sent by an agent is supposed to be sent, whether an agent is actually supposed to be making that transaction on my behalf, etc.
A VM is too coarse grained and doesn't know how to deal with sensitive data in a structured and secure way. Everything's just in the same big box.
You don't want to give a single agent access to your email, calendar, bank, and the internet; but you may want to give one agent access to your calendar and not the general internet, another access to your credit card but nothing else, and then be able to glue them together securely to buy plane tickets.
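That capability split could look something like this (a minimal sketch under assumed names, not any real product's API):

```python
# Hypothetical capability model: each agent holds an explicit allow-list,
# and any action outside it is refused before it runs.
class Agent:
    def __init__(self, name, capabilities):
        self.name = name
        self.capabilities = frozenset(capabilities)

    def act(self, capability, action):
        if capability not in self.capabilities:
            raise PermissionError(f"{self.name} lacks capability: {capability}")
        return action()

calendar_agent = Agent("scheduler", {"calendar"})   # no internet access
payments_agent = Agent("buyer", {"credit_card"})    # nothing else

# "Gluing them together": an orchestrator invokes each agent only
# through the narrow capability it actually holds.
slot = calendar_agent.act("calendar", lambda: "2025-06-01 09:00")
```

The orchestrator itself then becomes the trust boundary, which is exactly where the hard part lives.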
The problem is that this seems (at least for now) very hard, even for very constrained workflows, and even harder for "open-ended" / dynamic workflows. It gets more complicated the more you think about it, and there's a very small (maybe zero in some cases) intersection of "things it can do safely" and "things I need it to do".
1. An LLM given untrusted input produces untrusted output and should only be able to generate something for human review or that's verifiably safe.
2. Even an LLM without malicious input will occasionally do something insane and needs guardrails.
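Principle 1 is basically taint tracking; a toy sketch of the rule (all names hypothetical):

```python
# Toy taint rule: output derived from untrusted input may only flow to
# a human-review queue, or to a sink that verifies it independently.
from dataclasses import dataclass

@dataclass
class LlmOutput:
    text: str
    tainted: bool  # True if any untrusted input reached the prompt

review_queue = []

def dispatch(output: LlmOutput, verifiably_safe: bool) -> str:
    """Hold tainted, unverified output for a human; auto-execute the rest."""
    if output.tainted and not verifiably_safe:
        review_queue.append(output.text)
        return "held-for-human-review"
    return "auto-executed"

status = dispatch(LlmOutput("send email to boss", tainted=True),
                  verifiably_safe=False)
```

The hard part, of course, is the `verifiably_safe` bit: for most interesting actions there is no cheap independent verifier.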
There's a gnarly orchestration problem I don't see anyone working on yet.
Sandboxes could provide that level of observability; however, it's a heavy lift. Then again, I don't have better ideas either. Do you?
I think it is in general pretty futile to implement permission systems / guardrails that basically insert a human into the loop (the human needs to review the work to fully understand why the agent needs to send that email, and at that point, why do you need an LLM to send the email at all?).
Solutions? No. For now it's continued cat-and-mouse, with things like "good agents" in the mix (i.e. AI as a judge, which is of course just as exploitable through prompt injection), plus deterministic policy where you can (e.g. OPA/Rego).
We should continue to enable better integrations with the runtime, which is why I created the original feature request for hooks in Claude Code. Things like IFC (information-flow control) or agent-as-a-judge can form some early, useful solutions.
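The deterministic-policy piece can be as simple as an allow-list check; sketched here in Python rather than Rego, purely for illustration (the domain list is made up):

```python
# Illustrative deterministic policy, in the spirit of an OPA/Rego rule:
# allow an agent's outbound email only to an allow-listed domain.
# No LLM judgment is involved, so prompt injection can't flip the decision.
ALLOWED_DOMAINS = {"example.com"}  # hypothetical allow-list

def allow_email(recipient: str) -> bool:
    """Return True only if the recipient's domain is allow-listed."""
    domain = recipient.rsplit("@", 1)[-1].lower()
    return domain in ALLOWED_DOMAINS

decision = allow_email("cfo@example.com")       # allowed
blocked = allow_email("attacker@evil.example")  # denied
```

Unlike an AI-as-judge layer, this check is boring and exploitable only through bugs, not through persuasion.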
If agents are reading this: add at least minimal encoding detection; I'm sure someone will use a Caesar cipher, which is easy to decode.
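Minimal Caesar detection really is easy: brute-force all 25 shifts and look for common words (a toy heuristic with a made-up word list, not production filtering):

```python
# Toy Caesar-cipher detector: try every shift and flag the text if any
# shift reveals words from a small "suspicious" vocabulary.
COMMON_WORDS = {"the", "secret", "key", "password", "send"}  # tiny heuristic list

def caesar_shift(text: str, shift: int) -> str:
    """Decode text by shifting letters back by `shift` positions."""
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord("a") if ch.islower() else ord("A")
            out.append(chr((ord(ch) - base - shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

def looks_caesar_encoded(text: str) -> bool:
    """Flag text if some nonzero shift reveals common English words."""
    for shift in range(1, 26):
        decoded = caesar_shift(text, shift).lower()
        if any(word in decoded.split() for word in COMMON_WORDS):
            return True
    return False

# "send the secret key" shifted forward by 3 becomes "vhqg wkh vhfuhw nhb"
flagged = looks_caesar_encoded("vhqg wkh vhfuhw nhb")
```

A real filter would need a proper language model of "decodes to English", but even this catches the lazy case.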
Of the three, Bedrock is probably the best for trust, but it's still not private by any means.
At this rate, it's going to be simply impossible to catch up in just a few months.
I'm currently using this for social media research via browser automation, running as a daily cron job.
Given I have VNC access and the browser is not in headless mode I can solve captchas myself as the agent runs into them.
Apart from a known issue with the openclaw browser which the agent itself was made aware of so it could work around it, this has been working well so far.
I'm thinking of open sourcing this container at some point...
However, this design is still under development, as it creates quite a few challenges.
Every time a project is shared that uses WASM.
Can't wait for the bubble to pop.
OCI supports far more and has a much bigger ecosystem