We started out building an AI agent dev tool, but somewhere along the way it turned into Sims for AI agents.
Demo video: https://www.youtube.com/watch?v=sRPnX_f2V_c.
The original idea was simple: make it easy to create AI agents. We started with Jupyter Notebooks, where each cell could be callable via MCP—so agents could turn them into tools for themselves. It worked well enough that the system became self-improving, churning out content and acting like a co-pilot that helped you build new agents.
But when we stepped back, what we had were these endless walls of text. And even though it worked, honestly, it was just boring. We were also convinced it would be swallowed up by the next model’s capabilities. We wanted to build something else—something that made AI less of a black box and more engaging. Why type into a chat box all day if you could look your agents in the face, see their confusion, and watch when and how they interact?
Both of us grew up on simulation games—RollerCoaster Tycoon 3, Age of Empires, SimCity—so we started experimenting with running LLM agents inside a 3D world. At first it was pure curiosity, but right away, watching agents interact in real time was much more interesting than anything we’d done before.
The very first version was small: a single Unity room, an MCP server, and a chat box. Even getting two agents to take turns took weeks. Every run surfaced quirks—agents refusing to talk at all, or only “speaking” by dancing or pulling facial expressions to show emotion. That unpredictability kept us building.
Now it’s a desktop app (Tauri + Unity via WebGL) where humans and agents share 3D tile-based rooms. Agents receive structured observations every tick and can take actions that change the world. You can edit the rules between runs—prompts, decision logic, even how they see chat history—without rebuilding.
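To make the tick loop concrete, here's a rough Python sketch of the kind of structured observation an agent receives. The class and field names are invented for illustration; the real schema is richer.

    # Illustrative only: the real observation schema differs, and these
    # class/field names are invented for the example.
    from dataclasses import dataclass, field

    @dataclass
    class Agent:
        name: str
        position: tuple  # (x, y) on the tile grid

    @dataclass
    class World:
        agents: list = field(default_factory=list)
        chat_log: list = field(default_factory=list)

    def build_observation(agent: Agent, world: World, tick: int) -> dict:
        """What one agent 'sees' this tick, packed as structured data."""
        return {
            "tick": tick,
            "position": agent.position,
            "others": [a.name for a in world.agents if a is not agent],
            "recent_chat": world.chat_log[-5:],  # visible history is an editable rule
        }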
On the technical side, we built a Unity bridge with MCP and multi-provider routing via LiteLLM, with local model support via Mistral.rs coming next. All system prompts are editable, so you can directly experiment with coordination strategies—tuning how “chatty” agents are versus how much they move or manipulate the environment.
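For the routing piece, a LiteLLM call has one shape across providers; you swap models by changing the model string. The system prompt below is a placeholder, not what we actually ship.

    from litellm import completion

    SYSTEM_PROMPT = "You are an agent in a shared 3D room. Prefer talking over moving."

    def ask(model: str, observation: str) -> str:
        resp = completion(
            model=model,  # e.g. "openai/gpt-4o" or "anthropic/claude-3-5-sonnet-20240620"
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": observation},
            ],
        )
        return resp.choices[0].message.content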
We then added a tilemap editor so you can design custom rooms, set tile-based events with conditions and actions, and turn them into puzzles or hazards. There’s community sharing built in, so you can post rooms you make.
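A tile event boils down to a condition plus a list of actions. Here's a hypothetical example, not the editor's actual schema:

    # Hypothetical tile-event definition: condition -> actions.
    landmine = {
        "tile": (4, 7),
        "condition": {"on_enter": True},  # fires when an agent steps on the tile
        "actions": [
            {"type": "explode", "radius": 1},
            {"type": "set_tile", "pos": (4, 7), "state": "crater"},
        ],
    }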
Watching agents collude or negotiate through falling tiles, teleports, landmines, fire, “win” and “lose” tiles, and tool calls for things like lethal fires or disco floors is a much more fun way to spend our days.
Under the hood, Unity’s ECS drives a whole state machine and event system. And because humans and AI share the same space in real time, every negotiation, success, or failure also becomes useful multi-agent, multimodal data for post-training or world models.
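Concretely, each step can be logged as something like the record below. This is a simplified, invented shape; the real logs are richer and multimodal, but the idea is one (observation, action, outcome) tuple per tick.

    step_record = {
        "tick": 1042,
        "agent": "agent_2",
        "observation": {"position": (3, 5), "recent_chat": ["agent_1: truce?"]},
        "action": {"type": "say", "text": "deal, but you go first"},
        "outcome": {"tile_events": [], "alive": True},
    }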
Our early users are already using it for prompt-injection testing, social engineering scenarios, cooperative games, and model comparisons. The bigger vision is to build an open-ended, AI-native sim-game where you can build and interact with anything or anyone. You can design puzzles, levels, and environments, have agents compete or collaborate, set up games, or even replay your favorite TV shows.
The fun part is that no two interactions are ever the same. Everything is emergent, not hard-coded, so the same level played six times will play out differently each time.
The plan is to keep expanding—bigger rooms, more in-world tools for agents, and then multiplayer hosting. It’s live now, no waitlist. Free to play. You can bring your own API keys, or start with $10 in credits and run agents right away: www.TheInterface.com.
We’d love feedback on scenarios worth testing and what to build next. Tell us the weird stuff you’d throw at this—we’ll be in the comments.
Keep it up! Looking forward to what you figure out.
The Sims was my first experience of getting lost in a game to the point that it had a negative impact on my life. I had to do most of a two-week 5th- or 6th-grade geography project in the span of two days after playing The Sims instead of working on it.
LLMs with some decent harnesses could build up unpredictable - but internally consistent - strategies for each new game you play.
This is close to a proof of concept for those improvements.
Otherwise you run into the risk of "TOTAL NUCLEAR FINANCIAL LEGAL DESTRUCTION" ;)
Another thought that follows is that any kind of generative behavior, not just LLMs, runs the risk of endless, pointless blandness. That is, as with any art form, we want there to be a point.
If those games are to feature LLM AI, it would have to stand on its own, with someone like these guys having thought it through.
No amount of dialogue is going to save that.
The actual story dialogue is usually interesting enough already.
They win by the sheer quantity and by giving you a lot of subsystems to play with.
So LLM-generated quest text probably feels like it belongs here. It wouldn't, for example, in something with The Witcher 3's story quality.
By the way there are LLM dialog mods for Skyrim and everyone thinks they’re a joke because they suck.
Typically there are some easy micro and macro tricks that make the AI do something very stupid. That's why kiting is so ubiquitous in games - the AI just keeps following you while you whittle it down. Doesn't really work against a real player if they're microing the units.
The AI on higher difficulty starts a few centuries more technologically advanced than you, and gets multipliers on the starting resources like cities.
It’s not particularly fun to compete against.
OpenRA's bots are a bit more clever, and also don't need to magically see into fog-of-war.
Skirmish was a blast: I'd turtle until I had the enormous battleships (cruisers?) that could fire onto land. Loads of fun when I was like 12.
The other challenge I think you'll run into in general is that there's a huge knee jerk reaction against any use of LLMs or other popular types of gen AI in games in places like Reddit or Bluesky.
Fine-tuned LLMs with actual experience of the game, though, maybe?
I.e. what's the goal, how do you know you're doing well (or not), what makes it fun etc?
While fun, this game-like interface is too casual, and it certainly has a lower bit rate, which limits the exchange of information between an AI and the human operator.
It will be a fine abstraction if the goal is a high-level overview, though.
Same thing happened when I tried hitting the URL directly. Do I have to accept the ToS before I'm allowed to read it?
The important thing is that you can plug in new objects without reprogramming the people.
Sims objects (including characters) have a list of "advertisements" of possible interactions (usually shown as items on the user control pie menu, but also including invisible and special orchestration actions).
Each enabled action of every live object broadcasts its character-specific scores to each of the characters, adjusted to each character's personality, current stats, location, relationships, history, optionally tweaked by code.
Then, to keep the sims from acting like perfectly optimized robots, they have a "behavioral dithering" that chooses randomly among the top-scoring few advertisements.
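Here's a toy Python version of that scoring-plus-dithering loop, with invented numbers and names (the real scoring functions were SimAntics scripts; see the links below):

    import random

    def score(ad: dict, sim: dict) -> float:
        """Score one advertisement for one sim: the lower the sim's current
        satisfaction of the advertised motive, the higher the score."""
        satisfaction = sim["needs"][ad["motive"]]  # 0..100
        return ad["strength"] * (100 - satisfaction) / 100

    def choose_action(ads: list, sim: dict, top_n: int = 4) -> dict:
        """Behavioral dithering: pick randomly among the top-scoring few ads,
        so sims don't act like perfectly optimized robots."""
        ranked = sorted(ads, key=lambda ad: score(ad, sim), reverse=True)
        return random.choice(ranked[:top_n])

    sim = {"needs": {"hunger": 20, "fun": 70}}  # very hungry, moderately amused
    ads = [
        {"object": "fridge", "motive": "hunger", "strength": 60},
        {"object": "tv",     "motive": "fun",    "strength": 50},
    ]
    print(choose_action(ads, sim)["object"])  # "fridge" or "tv", at random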
Here's a video of "Will Wright - Maxis - Interfacing to Microworlds - 1996-4-26" where he shows a pre-release version called "Dollhouse" and explains the design:
https://www.youtube.com/watch?v=nsxoZXaYJSk
Jamie Doornbos gave a talk at GDC shortly after we released The Sims 1 in 2000, "Those Darned Sims: What Makes Them Tick?" in which he explains everything:
https://www.gdcvault.com/play/1013969/Those-Darned-Sims-What...
Transcript:
https://dn721906.ca.archive.org/0/items/gdc-2001-those-darne...
Yoann Bourse wrote this paper "Artificial Intelligence in The Sims series":
https://yo252yo.com/old/ens/sims-rapport.pdf
In The Sims 4 it's all been rewritten in Python and has fancier features, but it still uses the same essential model of objects and advertisements.
The Sims 1 used a visual programming language called "SimAntics" to script the objects and characters, including functions that are run to score advertisements for each character.
But with LLMs you can write scoring functions and behavioral control in natural language!
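For example (purely speculative; the model and prompt here are placeholders), a scoring function could become a rubric the LLM evaluates:

    # Speculative: an LLM-evaluated rubric instead of a SimAntics scoring script.
    # Assumes the model replies with a bare number.
    from litellm import completion

    RUBRIC = ("You are scoring actions for a shy, tidy, very hungry sim. "
              "Reply with only a number from 0 to 100 for how appealing this is: {action}")

    def llm_score(action: str) -> int:
        resp = completion(
            model="openai/gpt-4o-mini",  # placeholder model
            messages=[{"role": "user", "content": RUBRIC.format(action=action)}],
        )
        return int(resp.choices[0].message.content.strip())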
I thought it was just another YouTube video with no audio.
Which is a lot of words to offer: be careful tossing out Luddite accusations just because something happens to be AI-adjacent; that's rarely the whole story.