I’ve been a Siri power user for a long time, mainly because I’ve always liked using voice as an interface. But legacy voice assistants like Siri, Google Assistant, and Alexa were never well integrated or reliable enough to actually save time. Maybe 1 in 5 commands executes as smoothly as you expect, and the most useful things they do are play a song or set an alarm. The advent of LLMs seemed like a great opportunity to push the state of the art forward a notch or two!
Our goal is to do 2 things better:
1) Deeper integrations with productivity-related apps you use every day, like calendar, email, messages, WhatsApp, and soon Google Docs, Slack, and phone calls.
2) Better memory of each user based on their past conversations and integrations, so Martin can start to anticipate parameters in the user’s commands (e.g. text the guy from yesterday about the plans we made this morning).
One way our early users get a lot out of Martin is morning syncs and evening debriefs. At the start and end of each day, they’ll have a 5-10 minute sync about their to-dos, and Martin will brief them on upcoming tasks and news they’re typically interested in.
Something else Martin does which is unlike other voice assistants is it can have full text conversations with your contacts on your behalf from its own phone number. For example, you can tell it to plan a lunch with a friend, and it can text back and forth with that friend to figure out a time and place. After the text conversation between your friend and Martin is over, Martin reports back to you via a notification and a text. You can also monitor all of its messages with your contacts in the app.
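The back-and-forth flow described above can be pictured as a small state machine: Martin proposes a plan, handles counter-proposals, and only reports back once the contact confirms. This is a purely illustrative sketch (the states, keywords, and function names are hypothetical, not Martin's actual implementation):

```python
# Hypothetical sketch of the scheduling negotiation described above.
# States and the crude "yes" check are illustrative only.

PROPOSED, COUNTERED, CONFIRMED, REPORTED = "proposed", "countered", "confirmed", "reported"

def next_state(state: str, reply: str) -> str:
    """Advance the negotiation based on the contact's latest reply."""
    if state in (PROPOSED, COUNTERED):
        if "yes" in reply.lower():
            return CONFIRMED      # contact accepted the proposed time/place
        return COUNTERED          # contact suggested a change; keep negotiating
    if state == CONFIRMED:
        return REPORTED           # notify the user and log the thread in-app
    return state

# A two-message negotiation: a counter-proposal, then an acceptance.
state = PROPOSED
for reply in ["Can we do 1pm instead?", "Yes, 1pm works!"]:
    state = next_state(state, reply)
print(state)  # → confirmed
```

A real implementation would replace the keyword check with an LLM call that classifies the reply and extracts the agreed time and place, but the overall accept/counter/report loop is the same shape.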
We started building Martin exactly 1 year ago, during our YC batch. It’s definitely a hard product to “complete” because of the many unsolved technical challenges, but we’re making progress step by step. First was the voice interface, which Siri still hasn’t gotten right after more than a decade. We have 2 modes: push-to-talk and handsfree. Handsfree is great for obvious reasons. We’ve gotten our latency down to a couple of seconds at most for most commands, and we’ve tuned our own voice activity detection model to minimize the chance of Martin cutting you off (a common problem with voiceGPTs). But even then, Martin may still cut you off if you pause for 3-5 seconds in the middle of a thought, so we made a push-to-talk mode. For those cases where you want to describe something in detail or just brain-dump to Martin, you might need 20-30 seconds to finish speaking. So just hold down, speak, and release when you’re done, like a walkie-talkie.
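The cut-off problem comes down to endpointing: the assistant has to guess, from trailing silence, that you're done talking. A minimal sketch of that trade-off (the thresholds and function names here are illustrative, not Martin's actual VAD model):

```python
# Toy energy-based endpointer: the turn ends after `silence_timeout`
# seconds of continuous silence. A longer timeout means fewer cut-offs
# but more latency; push-to-talk removes the guess entirely because
# releasing the button signals end-of-turn. All values are illustrative.

FRAME_SECONDS = 0.03          # 30 ms audio frames
ENERGY_THRESHOLD = 0.01       # RMS energy above this counts as speech

def is_speech(frame_rms: float) -> bool:
    return frame_rms > ENERGY_THRESHOLD

def end_of_turn(frame_energies, silence_timeout=2.0):
    """Return the index of the frame where the turn is considered over,
    or None if the speaker never pauses long enough."""
    trailing_silence = 0.0
    for i, rms in enumerate(frame_energies):
        if is_speech(rms):
            trailing_silence = 0.0   # any speech resets the silence clock
        else:
            trailing_silence += FRAME_SECONDS
            if trailing_silence >= silence_timeout:
                return i
    return None
```

With a 2-second timeout, a mid-thought pause of 3-5 seconds will always trip the endpointer, which is exactly the failure mode push-to-talk avoids.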
We’ve also had to tackle a very long tail of integrations, and we want to do each one well. For example, when we launched Google Calendar, we wanted to make sure you could add a Google Meet link, invite your contacts to events, and access secondary calendars. And you should be able to say things like “set reminders leading up to the event” or “text Eric the details of this event.” So, we pretty much release one new major integration every month.
Finally, there’s the problem of personalization / LLM memory, which is still largely unsolved. From each conversation that a user has with their Martin, we try to infer what the user is busy with, worried about, or looking forward to, so in their next “morning sync” or “evening debrief”, Martin can proactively suggest to-dos or goals/topics to discuss with the user. Right now, we use a few different LLMs and many chain-of-thought steps to extract clues from each conversation and have Martin “reflect” periodically to build its memory. With all that said, we still have a lot of work to do here, and this is just a start!
You can try Martin by going to our website (https://www.trymartin.com) and starting a 7-day free trial. Once you start your trial, you’ll get an access code emailed to you along with the download link for our iOS app. After you enter your access code into the app, you can integrate your calendar, contacts, etc. If you find Martin useful after the trial, we charge our early users (who are generally productivity gurus and prosumers with multiple AI subscriptions) a $30/month subscription.
We can’t wait to hear your thoughts. Any cool experiences with Siri, things you wish a voice assistant could do, or ideas about LLM memory, tool calling, etc. - I’d love to discuss any of these topics with you!
We recently got our CASA Tier-2 compliance done (Cloud Application Security Assessment). We've also gone through Google's OAuth compliance process for every new Google-related integration we add. These assessments scan our app and make sure that our software meets pretty stringent standards when it comes to data security and encryption, and that we're not using the data for anything other than the specific features we promise (i.e. not sharing or selling to advertisers, etc.). You can read more about CASA here (https://appdefensealliance.dev/casa). We haven't gone through SOC2 yet, but we're planning to soon, once we have a few more integrations.
Does personal information get sent to OpenAI or Claude as part of the functionality? Can users request deletion of their data, and if so, what is the process? Are there specific protocols in place to ensure security? (i.e. Do you use encryption at rest?).
Unless you intend to personally audit their code, I'd argue it couldn't possibly matter. Even businesses like Apple publish all kinds of documentation that belies the reality of their infrastructure. The iMessage Security Overview doesn't mention the NSA's retention period for encrypted communiqués; the push notification documentation doesn't tell you about the government middleman processing each alert.
You either trust people blindly, or you validate them personally. Getting a pinkie-promise about privacy from the CEO is worth absolutely nothing in real-world security terms.
So... in Apple's own words, they are allowed to cherry-pick who's allowed to read their code and audit their privacy, in the same way they strategically deny researchers the ability to audit certain iOS features.
You're still taking them on their word, here.
[0] https://arstechnica.com/tech-policy/2023/12/apple-admits-to-...
The measurement logs will be publicly available.
As you develop your messaging, I wanted to share the questions I had, as I think a lot of users will ask the same:
1. What powers Martin? Is it a custom LLM, or is it powered by OpenAI or Anthropic?
2. Is any of my data ever used in training?
3. Will I always be notified before new texts / calls / actions are taken on my behalf? Does the AI present as me, or are my contacts aware that it's an AI assistant that may provide incorrect information?
4. Can I easily and quickly remove all my data and context?
I’ve been putting it through its paces and it’s handling some complicated requests correctly the first time. For example:
“There is an art festival in my city this weekend. They have a jazz stage my wife and I would like to check out. Find the schedule for each day and create one event every day it’s happening. In the event description put the schedule for each day, and invite my wife.”
It got it right the first time. Pretty amazing.
I see some folks saying it’s just a “wrapper for an LLM” like that’s easy to do. LLMs are not faerie powder that just work for every use case. The personal assistant use case is extremely difficult, which is why the big players haven’t done it yet.
So bravo for the bravado and actually making it work. Privacy is a concern, but honestly I’m not that worried that you can find out which art festival I’ll be at this weekend. But an oncology appointment? I might.
You should create a system where you cannot access user data, and it can never be shared with third parties. Make that system open source to prove it. Give up the potential upside of using this data for revenue so that Martin becomes what it can be. Otherwise, I’ll never feel confident telling Martin anything I don’t want advertisers to know.
We can certainly publish more privacy guarantees in the future - thanks for the suggestions. Our business model is subscriptions, so we won't be going anywhere near ads or data sharing.
https://ios.gadgethacks.com/how-to/60-ios-features-apple-sto...
Apple’s strength and weakness is that there is only one Apple way to do things for ease of use and for it to “just work”. However, not everyone fits into their cookie cutter design regardless of how good it is. Customization and options beyond what Apple does tends to be the way to go.
They pivoted mid-YC, it sounds like. YC chooses founders, not ideas.
I'm sure it won't be long before we see apps that listen, record any "Hey Siri" they hear, and then synthesize that voice to give your phone commands to "tell me my passwords", or more insidious and difficult-to-detect commands.
It seems Apple's new version will be facing this problem too.
- How did you solve the long-term memory problem? What kind of issues are you facing with scaling the number of tools?
- I like the idea but there's one crucial thing missing for me. I will happily pay for your app if it lets me bring my own API keys/ endpoints for models that I can host, so that I know my data is private and secure.
- Right now, we use a combination of RAG and chain of thought for storing memories. At different time intervals, we'll create memories at different levels of granularity. For example, at the end of every conversation, we'll embed some vectors based on specific commands from the user. At the end of each day, we'll have the LLM reflect on key questions related to a user's routine. And every few days, it'll reflect on the user's short/long term goals. This has worked to some degree, but we're still in the very early stages of figuring out how to do long-term memory for an assistant.
- Scaling the number of tools is definitely a struggle since we want to make our integrations as thorough as we can. It takes time, so we just try to keep growing the list consistently. We have an internal goal of adding at least one new major integration every month.
- Love the idea of bringing your own API keys/endpoints. We've gotten this feedback before, so we'll seriously consider it in our next few sprints!
As someone who is very interested in using this, may I make two suggestions:
1. Have a list of integrations somewhere on the homepage. It might be there, but if so I missed it. I immediately wanted to know if it can integrate with Obsidian, for example, or Omnifocus. I'm sure others will want to know if "email" means Google only, or Outlook, etc.
2. Make the trial longer. When I see 7 days, what I immediately think is "not enough time to really test this". I'm a busy person, I'm not going to change habits overnight, and unless this thing will immediately integrate into my daily routine (it won't), I'll probably only use it casually the first few times. It would be much better to give me more time to test it. (This is not business advice - maybe I'm wrong and 7 days is better to actually convert users! I'm just giving my immediate reaction.)
My main problem with startups around this is that it’s just a big ask to get access to all my data and store it in their cloud.
...after almost 10 years of internal stagnation and killing Siri's other upgrade projects: https://www.macrumors.com/2023/04/27/report-details-turmoil-...
It's just baffling. Same goes for Microsoft and how utterly unusable Cortana is, these features should be more than afterthought integrations that limp along because they're too cheap to get rid of. The slow "evolution" of voice assistants is pretty much ensuring that nobody wants to use them, at least among the people I know that own smartphones. Something tells me that AI won't be the selling point Apple thinks it is, especially when anyone with a web browser can use ChatGPT for free.
I’ve read through Apple’s Intelligence architecture and I liked what I saw (the context graph on device; how the integrations work; when doing remote inference, sending only the relevant parts; the architecture of private compute).
I haven’t upgraded my phone in a while, and this will be the final push for me to get their next phone, personally, and I’ve been using ChatGPT and now Claude from the moment they got to a sensible level.
That all of them are taking their sweet time with this makes me think it was just hard to get something production-ready out that will work for 99% of the population well.
So let me get this straight. You want AI features, but not the ChatGPT 4o functionality you can go use for free right now. You intend to upgrade to one of the Pro phones (the only ones guaranteed to get local AI functionality) for subpar AI functionality and less freedom to select an LLM that works for you?
I don't get it. Since WWDC I've heard so much contradiction around what people want from Apple AI. Maybe I'm misinformed, but I think it's absolute nonsense that someone would trust OpenAI only when Apple is the middleman. I certainly know the majority of smartphone customers couldn't care less.
That's very obvious from your comment.
Apple Intelligence is an on-device LLM backed by a private Apple hosted LLM. So it has access to your private data and is designed to provide capabilities that can leverage it e.g. it will have knowledge of all your personal interactions across multiple apps. It is fundamentally a personal experience.
It is completely different from ChatGPT which is a single LLM that is used to provide a public experience. And nothing is stopping you from using Claude, Mistral or any other LLMs through their existing apps.
That's what I'm talking about, really. Why would you buy a new iPhone to get functionality you can receive from the App Store?
I can see two reasons. You're either one of the evergreen sycophants that buys into Apple's security theater, or you're an obsessive Apple-maniac that will use any of their products no matter how bad they are. Those are two minuscule audiences, in the market of current and future iPhone owners. You would have to show me market research to convince me that even 5% of iPhone owners fall into either category.
So I'll ask again: does anyone really think an inferior local AI with access to your contacts list is going to drive upgrades? Mind you, in the EU Apple isn't even shipping these features because they have to mysteriously and arbitrarily prevent developers from offering competing AI integrations.
Apple's LLM is not a competitor for ChatGPT, Claude, Mistral etc.
It is an on-device model that has access to all of your private data e.g. health, contacts, mail, photos, messages etc. Something that no other LLM will ever get access to. Which means you can ask highly contextualised questions like "show me the photos I took on my last holiday". Again something no other LLM can do.
And it may not drive upgrades on its own but people asked for a better Siri and Apple unquestionably delivered on it.
Why? There's no technical reason it's impossible. LLMs don't need an internet connection or persistent local storage; Apple could safely run any LLM with personal data in a sandbox denied internet access. Running Llama or Phi on my data is no less secure than Apple's own local models.
Even if Apple does thwart antitrust efforts and becomes the new exclusive offline model provider for iPhone... how does that drive sales? People predisposed to avoid Siri aren't going to be enticed by an AI-powered upgrade, unless they wanted Siri to be wrong more often. These people have access to better LLMs for free, on any device they choose. The only thing Apple can leverage against them is handling the data they store on the device they own... something Apple is very obviously afraid to attempt in markets where they're already scrutinized.
Again, besides the perennial Apple investors I do not know a single iPhone owner that cares about anything you just said.
Importantly, I have a fair bit of trust towards OpenAI, but I’m not willing to share the details of half my digital life with them.
After reading Apple’s Intelligence architecture, I feel fairly comfortable with having it use the data from all my various apps for context.
I would take those whitepapers with a grain of salt if I were you. Apple's documentation has been known for redacting certain... details, per the request of the US government. Plus, it's not like you or I can go audit the tech (or even the code) and confirm it's the same as in the documentation.
Then again, I trust OpenAI and Apple both as far as I can throw them. Maybe it is me being overly discriminate about what I consider secure and not.
“Alexa, how much are you costing Amazon per user and how much do they want to charge me to revamp it?”
“Thank you for asking. Here is a top rated result”: https://www.ftc.gov/business-guidance/blog/2023/06/hey-alexa...
Sorry, she meant to show you this: https://www.reuters.com/technology/amazon-mulls-5-10-monthly...
Not sure how it will affect this startup.
This is not directly in contradiction to what you said, but I feel like there’s a lot of confusion around this.
It has been a few years and my Android assistant is still dumb. I would have expected the iOS/Android assistants to be much better by now.
I'm guessing it's against their philosophy to release a product that only works sometimes from the get-go. That's why they've been so demure about the whole AI thing.
But have you used Siri?
“Only works sometimes”, yes if all you ever ask is “what time is it”, what’s the weather, and maybe 5-6 other canned prompts.
Half the time it doesn’t even understand or catch what’s being said. (Mainly talking about the HomePod, it generally works fine on an iPhone)
Siri is a great example of Apple being perfectly fine letting a product languish so what might’ve been great 10 years ago is a punchline now.
Anyway, a few thoughts...
1. I find the event planning stuff to be kind of stale. Like maybe it will be cool this time, but so far it's part of every demo and concept around AI assistants and it's never ACTUALLY been cool. I wish this was trying to be cool in a new way.
2. The turn-taking for voice input looks kind of awkward. I get why it has to be that way, and there's not really a better solution, but... well, maybe it would be possible to use visual output and voice input, or generally make them complement each other. Many details are better to show visually and can be tedious to listen to.
3. I like the patient and attentive secretary model more than the turn-taking chat. The confirmation turn-taking is a trust exercise (did the AI _really_ hear and understand what I said?) but I think there's other ways to handle that trust. Like being more trustworthy (modern non-streaming speech recognition works really well!), making things easy to undo, or detecting unlikely commands and require verification.
4. For example, when reviewing a to-do list, I'd rather it show the list and I can just say "yeah, I finished item 1 and 2, and I was able to pick up milk but there's still some other groceries I need to get for tonight" and have it complete and revise entries based on that.
5. Generally to-do and task management is 10x more interesting to me than calendaring. But you should have a theory, not just be a layer over something else. I should be able to break down tasks, complete subtasks, identify partial completion and have it identify the remaining portions, get suggestions on breakdown, get advice on which tasks to complete when, etc.
6. Another interesting thing would be a kind of personal database. I would love to be able to unload a lot of information from my head and know that it will be put someplace where it can be meaningfully retrieved, combined with other data, etc. Like if I have certain bill payments or house maintenance I want to remember or something, I don't want to turn that into calendar items. Lots of them aren't even fully articulated, or the structure will emerge as more information becomes available. But I want to get started before I have carefully defined the task, and an AI assistant could do that.
We'll be experimenting a lot and releasing updates to Martin's home screen layout and feed soon! There's a good chance this will come with changes in how we do task management altogether.
Also totally hear you on the personal database idea. We've been toying with similar ideas for a while. Many users already do this with Martin, basically brain dump in a long voice session, and it'll suggest reminders/calendar events for you. We're still figuring out how to display this personal DB info to the user in a UI though, so would love to hear your suggestions.
Maybe the founders they are funding are not diverse enough. Is there too much tracking on which universities they went to? So the same set is applying and getting funded?
The exception for me would be situations where I can't use my hands, like driving. I don't want to have to look at a screen. If a voice agent could replicate the functionality of CarPlay, that would be really useful.
Also, I feel you about running into all of the challenges you're facing with LLMs. We've run into quite a few roadblocks, but your comment summarizes it best. Just keep working on it step by step.
- do some research on a given company/individual/website and give me a summary.
- preferably also identify a contact email.
- handle selecting a good time for meetings according to my availability and preferences.
- handle the communication with the other party.
- let me know when it is arranged, or if it's given up.
I signed up and gave it a UK phone number, and got a UK number back for texting Martin. I'm not sure why it has to be SMS when it could be an in-app chat. I was expecting to get a confirmation SMS or similar, but it just accepted it straight away. When I texted the number I was given (several times), it was delivered but there was no reply.
Martin sent me an email welcoming me. I replied asking it to set up a meeting for early next week with another email address. Martin replied saying it is unable to email people on my behalf, and suggested I set it up myself.
> Unfortunately, I am currently unable to send emails to other people on your behalf. However, you can easily send an email to ** to schedule the meeting for early next week in the afternoon.
I reminded Martin that there is an example on the website homepage of doing just that, and it replied saying it can indeed schedule meetings, and asked for the details again. I replied with the same details, and it confirmed the meeting was set up.
I checked my other email, and there was no message setting anything up. I told Martin that the other party needs to know about the email, and it replied with:
> Understood. I'll make sure to inform ** about the meeting details.
Still nothing received. Furthermore, I checked the app and I haven't even connected my calendar, so I'm surprised it didn't warn me or prompt me to do this when I asked for a meeting.
I gave up with that and decided to try something else. I forwarded Martin an email thread from a lead, which included a lot of back story on their organization, offering, and some areas that they think we could potentially collaborate on. I asked Martin to find out more about the company, and evaluate the options for collaboration.
This lead is in the AI space, with their primary product being a document digitisation solution to help surface and discover business documents.
Martin replied describing it as a "nearbound revenue platform to streamline revenue operations", with a key feature being "Automated lead scoring and distribution to prioritize high potential leads". As far as evaluating the collaboration opportunities, it instead gave me a list of collaboration features within the platform, none of which exist.
At the end, it linked to a blog post to their recent funding round. Except, the blog post was from a completely unrelated company with a similar name. Bear in mind that the originally forwarded email was from their business email account, and the body contained multiple links and references to their website.
I decided to try one more test, and asked it to do some research on my own business website and let me know what it finds out. It's been 20 minutes, and I haven't had a reply. I checked the app to see if there was any indication it's working on something for me, but nothing there either.
I love the idea of Martin, but I'll be canceling my trial - it just doesn't seem anywhere near ready yet - especially given I have to trust it to communicate on my behalf.
- I totally resonate with your criteria for an AI PA - this is very much what we're working towards with our email integration. We had been focused on voice for a while, but recently started tackling all the email use cases. Really want to get these right for you!
- Sorry for the poor onboarding job - we should make it more clear that you have to sync your calendar before we send you an email inviting you to send and forward scheduling items to Martin.
- For sending emails to contacts, this is one of our upcoming integrations that we've been building for a while - it's just not ready yet! We want it to be able to send/reply to emails and fully act on threads that you attach it to. This means issuing you a unique email address for "your Martin" and managing its behavior on threads and its memory of other contacts. It's a harder problem than we first anticipated, so we're working through it steadily! It should be ready in the next month or so. For now, the communications feature is limited to texting contacts on your behalf.
- For "deep searches", it definitely isn't the greatest at digging into a topic or generating a thorough briefing for you right now. We're not sure how deep we'll go into this use case in the future, but we do plan on integrating with more specialized functions, like LinkedIn, Twitter, Maps, etc. which should make this a lot better.
Sorry again for the poor onboarding experience. I think we also got an email from you, so will reply there as well to ask for more feedback!
I think this is the greatest drawback to using a tool like this. It does well with the low-hanging-fruit tasks like responding to a text. But it does poorly at a looong range of long-tail tasks like company research, summarizing a topic, gathering the latest headlines on a topic, finding leads, or alerting me when someone mentions my brand on Twitter or links to my blog post.
And when someone tries to do any of those things and gets a bad user experience, they'll give up on your tool.
Want big $$$? Support Exchange / Microsoft 365 via the MS Graph API, or whatever today's preferred API is.
Exchange is on our list! We started the Microsoft compliance process a couple of months ago - expecting to roll out support for at least Outlook or Word this year.
Please update the video to bleep out “Siri”
If the contact thought they were talking to me, it would be fucking dystopian. I would immediately break contact with someone who gaslighted me with AI.
Please make some sort of pledge or guarantee that your company will never impersonate a human and that it will always identify itself as a machine when communicating with others.
Nice job!
They even announced it during the WWDC this year.
Again, I’m a founder myself. Not trying to poopoo on this. I’m just curious how this idea has legs with all the might of Apple looking poised to crush this.