Overview: https://youtu.be/GEOMYDXv6pE
Control demo: https://youtu.be/_Cw5fGa8i3s
Videos of autonomous use-cases: https://docs.innate.bot/welcome/mars-example-use-cases
Quickstart: https://docs.innate.bot/welcome/mars-quick-start
Our last thread: https://news.ycombinator.com/item?id=42451707
When we started, we felt there was no good, affordable general-purpose robot that anyone could build on. There’s no lack of demand: Hugging Face’s SO-100 and LeKiwi are pretty clear successes already. But the hardware is unreliable, the software experience is bare-bones and keeps changing, and you often need to buy hidden extras to make them work (starting with a computer with a good GPU). The Turtlebots were good, but are getting outdated.
The open-source hobbyist movement lacks really good platforms to build on, and we wanted something robust and accessible. MARS is our attempt at making a first intuitive AI robot for everyone.
What it is:
- It comes assembled and calibrated
- Has onboard compute with a Jetson Orin Nano 8GB
- A 5DoF arm with a wrist camera
- Sensors: RGBD wide-angle cam, 2D LiDAR, speakers
- Control via a dedicated app and a leader arm that plugs into iPhone and Android
- 2 additional USB ports + GPIO pins for extra sensors or effectors.
- And our novel SDK called BASIC that lets you run it like an AI agent with VLAs.
It boots in a minute, can be controlled via phone, is programmable in depth with a PC, and the onboard agent lets it see, talk, plan, and act in real time.
Our SDK BASIC lets you create “behaviors” (our name for programs) ranging from a simple hello world to a very complex long-horizon task involving reasoning, planning, navigation and manipulation. You can create skills that behaviors can run autonomously, either by training the arm or by writing code tools, as you would for an AI agent.
You can also call the ROS2 topics to control the robot at a low level. And anything created on top of this SDK can be easily shared with anyone else by just sharing the files.
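To make the behavior/skill split concrete, here is a minimal Python sketch. It does not use the BASIC API: `Skill`, `Behavior`, `register`, `execute`, and the skill names are all hypothetical, chosen only to illustrate the idea of a behavior running a sequence of skills, where a skill is either a plain code tool or a stand-in for a trained arm policy.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Skill:
    """A named capability (hypothetical): a code tool or a trained policy."""
    name: str
    run: Callable[[], str]  # returns a short status string


@dataclass
class Behavior:
    """A program (hypothetical) that runs registered skills in a fixed order."""
    name: str
    skills: Dict[str, Skill] = field(default_factory=dict)
    plan: List[str] = field(default_factory=list)

    def register(self, skill: Skill) -> None:
        self.skills[skill.name] = skill

    def execute(self) -> List[str]:
        # Run each planned skill in order and collect its status.
        return [self.skills[step].run() for step in self.plan]


def make_demo_behavior() -> Behavior:
    behavior = Behavior(name="fetch_cup")
    # A code tool: plain Python, no learning involved.
    behavior.register(Skill("say_hello", lambda: "said: hello world"))
    # Placeholders for navigation and a trained manipulation policy.
    behavior.register(Skill("navigate_to_kitchen", lambda: "arrived: kitchen"))
    behavior.register(Skill("pick_up_cup", lambda: "grasped: cup"))
    behavior.plan = ["say_hello", "navigate_to_kitchen", "pick_up_cup"]
    return behavior


if __name__ == "__main__":
    for status in make_demo_behavior().execute():
        print(status)
```

In this sketch the plan is a fixed list. In the system described above, the onboard agent decides what to run and when, so treat the fixed order as a simplification.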
This is intended for hobbyist builders and education, and we would love to have your feedback!
p.s. If you want to try it, there’s a temporary code HACKERNEWS-INNATE-MARS that lowers the price to $1,799.
p.p.s. The hardware and software will be open-sourced too. If some of you want to contribute or help us prepare it properly, feel free to join our discord at https://discord.gg/YvqQbGKH
This isn't so clear though: https://docs.innate.bot/main/software/basic/connecting-to-ba...
> BASIC is accessible for free to all users of Innate robots for 300 cumulative hours - and probably more if you ask us.
Is BASIC used just to create the behaviours or to run them too? It sounds like this is an API you host that turns a behaviour like "pick up socks" into ROS2 motor commands for the robot. Are you open sourcing this too, so anyone can run the (presumably GPU heavy) backend?
Does the robot need an internet connection to work?
Also, more importantly, what does it look like with googly eyes stuck on?
It is required to run them, not to create them. And it's not about running "pick_up_socks"; that one can already run on your robot. BASIC is required to chain it with other tasks, for example navigating to another part of your house and then running another skill to drop the sock somewhere.
Thank you for the remark, we will make it clearer in the docs.
As a consequence: The robot does not necessarily require Internet to run, but if you want it to chain tasks while talking and using memory, yes it does.
As for the googly eyes, give me a minute...
Can it do complex tasks like "pick up socks from room A, drive to room B, and put in basket"? Is the intention to allow hobbyists to do actual work with it or is this version purely novelty rather than a functional "personal robot"?
Additionally, what is the limitation on speed of movement? It seems very slow in movement, is that intentional for safety or is that purely because of running the AI model locally?
https://docs.innate.bot/main/welcome/mars-example-use-cases
I take from your comment that the full capabilities of the robot are not properly represented, and I've made a note to film longer ones. And it can definitely do what you just asked. And multiple times in a row. I will note that it depends on the training quality of course.
On speed of movement, I now realize we didn't mention it anywhere so I added it in the overview, but it's pretty fast: 0.7 m/s for the base, and the arm can be tormented quite a bit. Just took this video for you:
Does the MARS hardware really remove the hidden extras (computer with a GPU) mentioned as the downside of HF SO-101 or LeKiwi? While a Jetson is good for inference, I feel like to train VLAs you would need access to a powerful GPU regardless. For LeRobot-based hardware, training ACT was relatively low profile if you use low resolution for the camera feeds, but with increased resolution or with more than one camera I already saw it needing more than 8GB of VRAM. If VLA is on the table, finetuning something like the open-sourced version of pi0 should already necessitate access to more than one 4090 or above, I think.
Also, do you have plans for community-level datasets? I think LeRobot sort of does this with their data recording pipeline and HF integration.
The training does require external GPUs (but we provide that infra for free, straight from the app!), but the onboard Jetson can run the trained models, as you can see in the examples. Everything you see in the vids is running onboard when it comes to manipulation, because we use a special version of ACT made specifically by us for this robot, that also includes a reward model (like DYNA does).
We have developed this system to also be able to run the other components smoothly so it also does SLAM, and has room for more processing even when running our ACT.
Now indeed this cannot run Pi-0 but from our experience - and the whole community in general - VLAs are not particularly better than ACT in the low data regime, and need a lot more compute.
As for community-level datasets, yes this is the plan. Anything you train can already be shared with others - just share the files. We haven't yet developed a centralized place for sharing datasets and behaviors, but it is planned.
You could simply host the raw grunt in a base station somewhere else in the premises, keeping the device lighter and lower power.
This one is really, really convenient and intuitive. Turn it on anywhere, even outside, it just works. Even when I want to dev on it, it's super convenient.
On some level I truly believe robotics has to become more "complete", we can't always just piece things together, it makes it very hard to have a beautiful product.
I realize this is more of a philosophical answer, but I also think it is the right one to take this field to the next level
How does that fit into your ‘complete’ ethos?
For this one, it's just the only feasible way we found to bring the kind of experience we created to folks.
Also I am curious about a couple of the parts, if you don't mind sharing - are those wheels the direct drive wheels from waveshare? And what is the RGBD camera? (Fwiw, even if it's hefty the MARS price tag seems fair to me).
There is also a possibility for it to tip the base if the arm is fully extended. And the SO-101 has quite poor repeatability.
The base is also slow to move, and depending on which surface you are on, the omniwheels can pick up dirt quickly.
Finally, external compute means you need in particular to teleoperate from your computer, so you have to be far from the robot and not necessarily in the same orientation as it, which is very, very uncomfortable. This app system we made is one of the things people love the most about MARS.
Ah, and RGBD really does matter for navigation AND for learning (augmenting ACT with depth yields better results).
The wheels are indeed these ones, and the camera in the video is a Luxonis OAK-D Wide, pretty expensive but comfortable to work with. However, the version we're shipping includes a much cheaper stereo-depth camera that we calibrate ourselves - I can't get you the reference right now cause it's late at night but feel free to reach out on discord
Just wondering what the main function of the open onboard agentic OS built on top of ROS2 is? Does it have a dedicated name, or is it just a plugin extension for ROS2?
But the interest of it is how it is packaged and that it can be triggered by this VLM-based agent called BASIC that we created.
This agent has a special kind of architecture which gives it spatial memory, the capability to react in time, and the ability to navigate. This means that skills can be triggered in the right place at the right time, like a true entity. BASIC can interrupt skills in execution depending on how you configure it. So if it's chilling navigating around and it sees you, it would interrupt navigation and get to you, shake hands and ask if you need help instead of finishing the task without reacting.
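The interruption described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how BASIC is implemented: skills are modeled as generators so they can be stopped between steps, and `navigate`, `greet`, `run_with_interrupts`, and the `person_seen_at_step` parameter are hypothetical names standing in for the agent's perception and configuration.

```python
from typing import Iterator, List, Optional


def navigate(waypoints: List[str]) -> Iterator[str]:
    """A skill written as a generator so it can be stopped between steps."""
    for waypoint in waypoints:
        yield f"navigating to {waypoint}"


def greet() -> Iterator[str]:
    """The skill that takes over when a person is detected."""
    yield "approaching person"
    yield "shaking hands"
    yield "asking if help is needed"


def run_with_interrupts(
    skill: Iterator[str],
    person_seen_at_step: Optional[int],
    interruptible: bool = True,
) -> List[str]:
    """Run a skill step by step, preempting it if a person shows up.

    person_seen_at_step simulates perception: the index of the step at
    which a person is detected, or None if nobody appears.
    """
    log: List[str] = []
    for step_index, status in enumerate(skill):
        if interruptible and person_seen_at_step == step_index:
            # Abandon the current skill and switch to the greeting.
            log.append("interrupt: person detected")
            log.extend(greet())
            return log
        log.append(status)
    return log


if __name__ == "__main__":
    route = ["hall", "kitchen", "bedroom"]
    for line in run_with_interrupts(navigate(route), person_seen_at_step=1):
        print(line)
```

Setting `interruptible=False` models a skill configured to always run to completion, which corresponds to the "depending on how you configure it" part.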
Hope this helps!
On the motors, these are Dynamixels from ROBOTIS, and we provide all three of position, velocity, and effort in the low-level SDK (in ROS2 too),
and much higher-level interfaces for interaction and AI manipulation, like directly recording episodes of training data so that the arm can use a VLA instead of simple IK.
You've got a $250 computer, some lidar+camera sensor for maybe $1-200, 6 servos, and cheap plastic. Plus you want to charge a $50/mo software subscription fee for some software product, whatever I guess that's beside the point.
No shade on the idea because low-cost robotics is an unsolved need for the future. But this current iteration is just not competing well with other alternatives. Perhaps this is more of a comment on what we can accomplish in the West vs what's possible in Asia.
Why would I not go for this guy for $1600, and attach an arm? https://www.unitree.com/go2
It's not an apples-to-apples product comparison, but you get the point. There's just so much more raw value offered per dollar elsewhere.
As for the unitree robot, this one is not unlocked for development, does not have onboard GPU, and does not have an arm. If you want it, check the price they give, it's very prohibitive.
You could attach a cheap arm to it but it would also not be stable enough for AI algorithms to run it. We're researchers ourselves, we would have made it cheaper if we could, but then you just can't do anything with it.
Our platform will deliver the experience of a real AI robot, anything cheaper than that is kind of a lie - or forces you to assemble and calibrate, which we do for you here. It is just the nature of trying to deliver a really complete product that works, and we want to stand for that.
EDIT: You can take a look at our autonomous demos there, you need something reliable for these: https://docs.innate.bot/welcome/mars-example-use-cases
Sure, the package is really interesting and definitely got me interested. But not one of the demos seems like a good use of the hardware. If you want to position yourselves mainly as an educational tool I don't think that is a problem. But if you want to target the 'maker' community I think you should put some thought into that.
For example, you could change the 'security guard' demo into a 'housekeeper' demo. You make it roam your house during the day and keep an up-to-date list of things you need to buy. I think this should work reasonably well for laundry and cleaning products. And after you have some historical data you could even do some forecasts about when you need to buy things again.
Another example would be to have it integrated with weather data and when it starts to rain it goes around the house to check if all windows are properly closed. On this same note it could keep track of the window state during the day and send you a reminder to open/close some windows if the temperature/humidity is above/below some threshold.
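A rule like the one suggested here is easy to prototype before involving the robot at all. The following Python sketch assumes the window states, the rain flag, and the humidity reading are already available as inputs; `Window`, `window_reminders`, and the 65% threshold are hypothetical and only there to show the shape of the logic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Window:
    room: str
    is_open: bool


def window_reminders(
    windows: List[Window],
    raining: bool,
    humidity_pct: float,
    humidity_open_above: float = 65.0,  # hypothetical threshold
) -> List[str]:
    """Turn window states plus weather data into reminder messages."""
    reminders: List[str] = []
    for window in windows:
        if raining and window.is_open:
            # Rain always wins: an open window should be closed.
            reminders.append(f"close {window.room} window (rain)")
        elif not raining and not window.is_open and humidity_pct > humidity_open_above:
            # Dry weather and a humid house: suggest airing the room.
            reminders.append(
                f"open {window.room} window (humidity {humidity_pct:.0f}%)"
            )
    return reminders


if __name__ == "__main__":
    house = [Window("kitchen", True), Window("bedroom", False)]
    print(window_reminders(house, raining=True, humidity_pct=50.0))
    print(window_reminders(house, raining=False, humidity_pct=70.0))
```

The hard part in practice is filling in `is_open` reliably from what the robot sees; the rule itself stays this small.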
I think that by having some more 'useful' examples you should be able to get more attention from the 'maker' community. My guess is that a lot of folks that are heavily into home automation would love to have a device like that help with random things around the house.
Best of luck with your product, and I hope you succeed because this idea looks really exciting.
As for humidity, we don't have a sensor for it yet BUT the platform was made to be extensible specifically for this so it's easy to add one (see https://docs.innate.bot/main/robots/mars/extending-mars)
Thank you!
As someone who's dabbled in this before, I guess I'd rather just sit down and plan a BOM and do it myself if that's your markup anyways. Not that it's totally unreasonable for people who just want something super simple out of the box that works.
My general commentary is just that it's sad how much basic servos and what not cost in North America. We've completely ceded this industry to Asia.
Also, fair to say that if indeed you're the kind of person who likes to assemble all of this yourself, you're not directly in our target :)
This is more for AI / software folks who don't want to have to assemble and calibrate everything and risk having an arm that is not repeatable and thus can't actually properly learn. We have seen many folks spend a weekend or more trying to put these together, end up with a barely working platform, and then be put off AI robotics altogether.
I would say the problem is that most manufacturers, including Chinese ones, sell you platforms that are not reliable enough for AI manipulation, and there's a race to the bottom that we try not to participate in.
Pretty lofty claims though, really think you're so above everyone on quality at this price point? I know what dynamixels are capable of, and I see the jitter in the demo videos.
Why aren't the manipulator specs easily accessible on the website? Have you run a real repeatability test? Payload even?
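For context, a position repeatability test of the kind asked about here is usually scored along these lines: command the same target many times, measure where the end effector actually lands, and report the mean distance to the centroid of the attained points plus three sample standard deviations (the form used in ISO 9283-style reporting). The Python sketch below assumes the measurements already exist as 3D points in millimeters; the function name and the sample data are made up for illustration and are not results for any real arm.

```python
import math
from statistics import mean, stdev
from typing import List, Tuple

Point = Tuple[float, float, float]


def position_repeatability(attained: List[Point]) -> float:
    """Return mean distance to the centroid plus 3 sample standard deviations."""
    if len(attained) < 2:
        raise ValueError("need at least two measurements")
    # Centroid of all attained positions, computed axis by axis.
    centroid = tuple(mean(axis) for axis in zip(*attained))
    # Distance of each attained position from that centroid.
    distances = [math.dist(point, centroid) for point in attained]
    return mean(distances) + 3 * stdev(distances)


if __name__ == "__main__":
    # Made-up measurements (mm) for repeated moves to one target.
    samples = [
        (100.2, 50.1, 30.0),
        (99.9, 49.8, 30.3),
        (100.1, 50.0, 29.8),
        (99.8, 50.2, 30.1),
    ]
    print(f"repeatability: {position_repeatability(samples):.2f} mm")
```

A real test would also fix the payload, speed, and number of cycles, since the result depends on all three.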
It's a neat high-fidelity garage build platform, but I don't see any reason to assume this price premium is due to hardware quality.
You can see however in these demos: https://docs.innate.bot/main/welcome/mars-example-use-cases
that it is indeed pretty smooth.
Also, sorry the arm specs were not there! You can now have them at: https://docs.innate.bot/robots/mars/arm
Final comment I'll say, it's a weird and tough price point. Actual research labs would rather spend $20,000 on a very high quality and likely larger high-fidelity platform. A random hacker or grad student will need some real convincing to shell out $2,000, sub $1K might better serve them. So what's the target customer profile exactly?
I encountered similar issues developing a $3K plug and play robot research arm in the past. The economics are awkward. You can actually just spend $5K and get a really good second-hand industrial robot (maybe even first-hand now from China). Or you could spend $500 and get a 6 DOF platform at least as good as your current platform's arm and then buy the sensor separately and bolt it to your workspace - bam, done. And no, the software isn't that important, servos are easy to work with...
Therefore my 'in between platform' was stuck in a hard place. I made some one-off sales, but never really scaled the business, which is what would be needed for any fancy "we're the platform where people do AI" vision to manifest to investors. Hardware is tough - they'll see your numbers and easily pass. They'll realize you need sales in quantity to get anywhere meaningful otherwise.
So I wanted to share criticisms and my experience so you can look ahead to likely challenges and hopefully get further. Best of luck.
And yeah, I agree this mid-market is indeed tough, but this is the upper price I was looking for when I started with my AI background and bought a similar-price turtlebot then struggled to put a cheap arm on it. Anything under this is really bad for algorithms, although you can reduce it with just the arm and clamping it to your workspace as you suggested but then you don't have mobility.
I will keep your comment in mind, and thank you for the thoughtfulness. You might be interested to know that we intend to show something bigger not long from now. But this is, as you said, more for investors.
For now I'm content if there's enough people that want this one
Did you mean 250gr? Otherwise, what is this? A robot arm for ants?
Quasi-Lego-style robo dog for RPi is $100-150 on AMZN
This kind of architecture is very similar to what Physical Intelligence used for Pi-0.5 (a VLM triggering a VLA in different areas), albeit at a smaller scale for now.
You can also see some example (autonomous!) use-cases here: https://docs.innate.bot/welcome/mars-example-use-cases
I especially like that you’re using ACT and BC to bootstrap the authoring process. Hopefully behaviors are modular and transportable - which I assume they will be given the arch.
That is the correct approach in my opinion.
Happy to have you onboard!
Sigh.
I have been running performance checks during the morning and tried with other browsers, used the debugging tools. I can see why it could be slow but I definitely need more trials to understand where it comes from. Bear with us while we're on it :)
And as I said earlier, this is really helpful for us to know, so thank you.
The commenter was actually very considerate and raised a warning where it might be seen. And they were kind enough to attempt this with two different browsers. After that you can buy my troubleshooting time at its usual hourly rate.
(Because it's a frequent enough issue: I wouldn't see that warning as being about a one-off obscure bug that will affect few people and doesn't matter. It's a warning that the web site probably did not take compatibility sufficiently into consideration - and was approved without such consideration.)
"Runs fine for me" is an absurd bar for reliability / compatibility, no?
Not to say I dismiss the comment, definitely looking at it cause it might come from somewhere else. I just don't yet see what the bottleneck is.
Do you have anything plug-and-play for Jetson Nano?
All the WebRTC features really only become relevant when you want to control the robot over the Internet, i.e., not just locally where you can assume a reliable network.
So now employers will not only surveil employees but also physically punish them remotely if their gaze veers off screen?
What a time to be alive!
It's very difficult. It is hard to transfer the norms, rituals, and intuitive social cues that are passed on organically and that drive human actions and evolution by enabling adaptive cooperation, empathy, and innovation in diverse societies.
For example, which books to read and whom to trust. You often make decisions on gut feeling which is hard to transfer.
The product looks promising. Hoping for the best.