What I would love to see here is:
- some kind of sorting, e.g. by execution time or order, so I can see the slowest queries.
- search/filter feature.
- faster scrolling with pgup/pgdown keys.
- maybe a count of how often the same query was executed, so I could check the code and optimize those queries.
I just tried it and it works smoothly. For those who don't want to plug agents into their database directly, I built a similar tool, https://dbfor.dev, for exactly this purpose: it embeds PGLite and implements the PG wire protocol to spin up quick PG databases with a traffic viewer included.
There are two main problems with said proxies:
* Latency. Yes, yes, yes, they add "microseconds" vs "milliseconds for queries", and that's true, but it's only part of the story. There's an extra hop. There are two extra sets of TCP layers being traversed. If the hop is local (say a sidecar, as we do in StackGres) it adds complexity to its deployment and management (something we solved with automation, but it was an extra problem to solve) and consumes resources. If it's a network hop, it adds milliseconds, not microseconds.
* Performance. It's not that hard to write a functioning PG wire proxy (it's not trivial either). But it is extremely hard to make it perform well under high-load scenarios. Most of the proxies I have seen break down under moderate to high load.
What's the solution then? The Postgres extension model to capture the metrics (we also experimented with eBPF, but it causes too many kernel-user space context switches when you can do the same in an extension without them), and a small sidecar to push the metrics out via a standardized protocol like OTEL.
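For illustration, here's a minimal sketch of the sidecar half of that idea in Go (not StackGres's actual implementation): it polls pg_stat_statements as a stand-in for a custom extension's stats view and pushes the numbers out over OTLP with the OpenTelemetry SDK. The DSN, metric name and polling interval are made up:

```go
// A minimal sketch of the "small sidecar" half of the idea (not StackGres's
// actual implementation): poll a stats view inside Postgres -- here
// pg_stat_statements stands in for a custom extension's view -- and push the
// numbers out as OTLP metrics. Only aggregates leave the host; the query
// traffic itself never takes an extra hop.
package main

import (
	"context"
	"database/sql"
	"log"
	"time"

	_ "github.com/jackc/pgx/v5/stdlib" // pgx as a database/sql driver
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp"
	"go.opentelemetry.io/otel/metric"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func main() {
	ctx := context.Background()

	// OTLP/HTTP exporter; endpoint comes from OTEL_EXPORTER_OTLP_ENDPOINT.
	exp, err := otlpmetrichttp.New(ctx)
	if err != nil {
		log.Fatal(err)
	}
	provider := sdkmetric.NewMeterProvider(
		sdkmetric.WithReader(sdkmetric.NewPeriodicReader(exp)),
	)
	otel.SetMeterProvider(provider)

	meter := otel.Meter("pg-metrics-sidecar")
	execTime, err := meter.Float64Histogram("db.statement.mean_exec_time",
		metric.WithUnit("ms"))
	if err != nil {
		log.Fatal(err)
	}

	db, err := sql.Open("pgx", "postgres://metrics:secret@localhost:5432/postgres")
	if err != nil {
		log.Fatal(err)
	}

	for range time.Tick(10 * time.Second) {
		rows, err := db.QueryContext(ctx,
			`SELECT queryid::text, mean_exec_time FROM pg_stat_statements`)
		if err != nil {
			log.Print(err)
			continue
		}
		for rows.Next() {
			var queryID string
			var mean float64
			if err := rows.Scan(&queryID, &mean); err != nil {
				continue
			}
			execTime.Record(ctx, mean,
				metric.WithAttributes(attribute.String("db.queryid", queryID)))
		}
		rows.Close()
	}
}
```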
[1]: https://www.envoyproxy.io/docs/envoy/latest/configuration/li...
[2]: https://www.cncf.io/blog/2020/08/13/envoy-1-15-introduces-a-...
[3]: https://stackgres.io
Edit: formatting
I'm just not sure it is much of a slight on such proxies.
You don't need to run this inline in production at all times to get amazingly useful results. Yes, there are lots of production insight solutions out there, but lots of modern stacks are complex enough that just getting a quick handle on how the page you're debugging talks to your DBs can be incredibly useful, which is where I love the idea of a solution like this.
Sure, it is mytop / pgtop, but offered at a different layer and with a modern interface. Seems useful to me.
The extension model is great, but it doesn't work with existing postgres providers (RDS, Aurora, etc.). Unless one such extension becomes standard enough that all providers will support it. That would be ideal, IMO.
To be clear, I don't mean pg_stat_statements, that is standard enough, but an extension that pushes the actual queries in real-time.
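To illustrate the gap: without such an extension (or a proxy), the closest you get on a managed provider is polling pg_stat_activity, which only catches whatever happens to be running at the instant you look, so short-lived queries are invisible. A rough sketch, with a placeholder connection string:

```go
// Sketch of the gap, not the wished-for extension: on a managed provider the
// closest you get without a proxy is polling pg_stat_activity, which only sees
// queries that are in flight at the moment you look. DSN is a placeholder.
package main

import (
	"database/sql"
	"fmt"
	"log"
	"time"

	_ "github.com/jackc/pgx/v5/stdlib" // pgx as a database/sql driver
)

func main() {
	db, err := sql.Open("pgx", "postgres://viewer:secret@my-rds-host:5432/app")
	if err != nil {
		log.Fatal(err)
	}
	for range time.Tick(time.Second) {
		rows, err := db.Query(`
			SELECT pid,
			       EXTRACT(EPOCH FROM now() - query_start)::float8 AS running_secs,
			       query
			  FROM pg_stat_activity
			 WHERE state = 'active' AND pid <> pg_backend_pid()`)
		if err != nil {
			log.Print(err)
			continue
		}
		for rows.Next() {
			var pid int
			var secs float64
			var query string
			if err := rows.Scan(&pid, &secs, &query); err != nil {
				continue
			}
			// Anything that finishes between two polls is simply never seen.
			fmt.Printf("[pid %d] %.3fs  %s\n", pid, secs, query)
		}
		rows.Close()
	}
}
```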
> If it's a network hop, then adds milliseconds, and not microseconds.
Are you talking about connection establishment time or query delay? I think it should normally be under a millisecond for the latter.
What you can also do is add frontend and backend users to the proxy, so agents never get the actual DB user and password. You can make the frontend credentials throwaway, or even mint them just in time if you want.
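A hypothetical sketch of that credential mapping (names and types are invented for illustration, not taken from any particular proxy):

```go
// Hypothetical sketch: the proxy mints throwaway frontend credentials and maps
// them to the real backend ones, which agents never see.
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"sync"
	"time"
)

type backendCreds struct {
	user, password string // the real database credentials, kept inside the proxy
}

type frontendCred struct {
	backend backendCreds
	expires time.Time
}

type credBroker struct {
	mu    sync.Mutex
	byPwd map[string]frontendCred // throwaway frontend password -> mapping
}

// Issue mints a just-in-time frontend credential with a limited lifetime.
func (b *credBroker) Issue(be backendCreds, ttl time.Duration) (user, password string) {
	buf := make([]byte, 16)
	rand.Read(buf)
	password = hex.EncodeToString(buf)
	b.mu.Lock()
	defer b.mu.Unlock()
	b.byPwd[password] = frontendCred{backend: be, expires: time.Now().Add(ttl)}
	return "agent_tmp", password
}

// Resolve is what the proxy would call when a client authenticates: swap the
// frontend credential for the backend one, or reject it if unknown/expired.
func (b *credBroker) Resolve(password string) (backendCreds, bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	fc, ok := b.byPwd[password]
	if !ok || time.Now().After(fc.expires) {
		return backendCreds{}, false
	}
	return fc.backend, true
}

func main() {
	broker := &credBroker{byPwd: map[string]frontendCred{}}
	user, pw := broker.Issue(backendCreds{user: "app", password: "real-secret"}, 10*time.Minute)
	fmt.Println("hand this to the agent:", user, pw)
}
```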
Traditionally this was database activity monitoring, which kind of fell out of fashion, but I think it is going to come back with the advent of agents.
I've been building a Postgres wire protocol proxy in Go and the latency concern is the thing people always bring up first, but it's the wrong thing to worry about. A proxy adds microseconds, your queries take milliseconds. Nobody will ever notice. The actual hard part — the thing that will eat weeks of your life — is implementing the wire protocol correctly. Everyone starts with simple query messages and thinks they're 80% done. Then you hit the extended query protocol (Parse/Bind/Execute), prepared statements, COPY, notifications, and you realize the simple path was maybe 20% of what Postgres actually does. Once you get through that though, monitoring becomes almost a side effect. You're already parsing every query, so you can filter them, enforce policies, do tenant-level isolation, rotate credentials — things that are fundamentally impossible with any passive approach.
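To make the "simple path" concrete: after startup, every frontend message is a one-byte type tag plus a four-byte big-endian length that includes itself, so a toy frame reader is short. Everything after it (extended protocol, COPY, notifications, error paths) is where the weeks go. A sketch, not code from the proxy described above:

```go
// A toy reader for the easy 20%: after startup, every frontend message is a
// one-byte type tag plus a four-byte big-endian length that counts itself.
// Illustration only, not code from the proxy described above.
package main

import (
	"bufio"
	"encoding/binary"
	"fmt"
	"io"
	"os"
)

// readMessage reads one post-startup frontend message. Note the startup
// message itself is the exception: it has no type byte at all.
func readMessage(r *bufio.Reader) (byte, []byte, error) {
	typ, err := r.ReadByte()
	if err != nil {
		return 0, nil, err
	}
	var length uint32
	if err := binary.Read(r, binary.BigEndian, &length); err != nil {
		return 0, nil, err
	}
	if length < 4 {
		return 0, nil, fmt.Errorf("bad message length %d", length)
	}
	payload := make([]byte, length-4) // length includes itself, not the type byte
	if _, err := io.ReadFull(r, payload); err != nil {
		return 0, nil, err
	}
	return typ, payload, nil
}

func main() {
	// Feed a captured client->server byte stream (post-startup) on stdin.
	r := bufio.NewReader(os.Stdin)
	for {
		typ, payload, err := readMessage(r)
		if err != nil {
			return // EOF or a broken stream
		}
		switch typ {
		case 'Q': // simple query: one null-terminated SQL string
			fmt.Printf("Query: %s\n", payload[:len(payload)-1])
		case 'P': // Parse: statement name, SQL, parameter types (extended protocol)
			fmt.Println("Parse (extended protocol)")
		case 'B', 'E', 'D', 'S': // Bind, Execute, Describe, Sync
			fmt.Printf("extended protocol message %q\n", typ)
		default:
			fmt.Printf("message %q, %d payload bytes\n", typ, len(payload))
		}
	}
}
```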
Why would I inspect this data? Maybe to find the cause of a problem... are there any other reasons?
If so, would you mind sharing which model(s) you used and what tooling?