I think there are solutions for that scale of data already, and simplicity is the best feature of DuckDB (at least for me).
What you need is a multi-tenant shared infrastructure that is elastic.
I think this is where Spark shuffling comes in? But how does it work here?
https://duckdb.org/docs/stable/guides/performance/how_to_tun...