At least for Codex, the agent runs commands inside an OS-provided sandbox (Seatbelt on macOS, and other stuff on other platforms). It does not end up "making the agent mostly useless".
The YOLO mode is also good, but having a small ‘baby setting mode’ that’s not full-blown system access would make sense for basic security. Just a sensible layer of "pls don't blow my machine" without killing the freedom :)
I only wish the author changed his stance on vendor extensions: https://github.com/badlogic/pi-mono/discussions/254
This makes it even more baffling why anthropic went with bun, a runtime without any sandboxing or security architecture and will rely in apple seatbelt alone?
I built on ADK (Agent Development Kit), which comes with many of the features discussed in the post.
Building a full, custom agent setup is surprisingly easy and a great learning experience for this transformational technology. Getting into instruction and tool crafting was where I found the most ROI.
I hadn't realized that Pi is the agent harness used by OpenClaw.
edit: referring to Anthropic and the like
Also data, see https://news.ycombinator.com/item?id=46637328
(this is also why all the labs, including some chinese ones, are subsidising / metoo-ing coding agents)
One thing I do find is that subagents are helpful for performance -- offloading tasks to smaller models (gpt-oss specifically for me) gets data to the bigger model quicker.
You can sandbox off the data.
Small and observable is excellent.
Letting your agent read traces of other sessions is an interesting method of context trimming.
Especially, "always Yolo" and "no background tasks". The LLM can manage Unix processes just fine with bash (e.g. ps, lsof, kill), and if you want you can remind it to use systemd, and it will. (It even does it without rolling it's eyes, which I normally do when forced to deal with systemd.)
Something he didn't mention is git: talk to your agent a commit at a time. Recently I had a colleague check in his minimal, broken PoC on a new branch with the commit message "work in progress". We pointed the agent at the branch and said, "finish the feature we started" and it nailed it in one shot. No context whatsoever other than "draw the rest of the f'ing owl" and it just.... did it. Fascinating.