Current LLMs are stateless—they forget everything between sessions. This limitation leads to repetitive interactions, a lack of personalization, and increased computational costs because developers must repeatedly include extensive context in every prompt.
When we were building Embedchain (an open-source RAG framework with over 2M downloads), users constantly shared their frustration with LLMs’ inability to remember anything between sessions. They had to repeatedly input the same context, which was costly and inefficient. We realized that for AI to deliver more useful and intelligent responses, it needed memory. That’s when we started building Mem0.
Mem0 employs a hybrid datastore architecture that combines graph, vector, and key-value stores to store and manage memories effectively. Here is how it works:
Adding memories: When you use Mem0 with your AI app, it takes in any messages or interactions and automatically detects the important parts to remember.
Organizing information: Mem0 sorts this information into different categories:
- Facts and structured data go into a key-value store for quick access.
- Connections between things (like people, places, or objects) are saved in a graph store that understands relationships between different entities.
- The overall meaning and context of conversations are stored in a vector store that allows for finding similar memories later.
Retrieving memories: When given an input query, Mem0 searches for and retrieves related stored information using a combination of graph traversal, vector similarity, and key-value lookups. It prioritizes the most important, relevant, and recent information, making sure the AI always has the right context no matter how much memory is stored.
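To make that concrete, here is roughly what the add/search loop looks like with the open-source Python library (a minimal sketch; configuration and exact keyword arguments may differ from the current API):

```python
from mem0 import Memory

m = Memory()  # default config: local vector store, LLM key from the environment

# Adding: Mem0 extracts the parts of a raw message worth remembering.
m.add("I'm vegetarian and I'm planning a trip to New Orleans.", user_id="alice")

# Retrieving: related memories come back ranked by relevance and recency.
for hit in m.search("What food should I recommend?", user_id="alice"):
    print(hit)
```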
Unlike traditional AI applications that operate without memory, Mem0 introduces a continuously learning memory layer. This reduces the need to repeatedly include long blocks of context in every prompt, which lowers computational costs and speeds up response times. As Mem0 learns and retains information over time, AI applications become more adaptive and provide more relevant responses without relying on large context windows in each interaction.
We’ve open-sourced the core technology that powers Mem0—specifically the memory management functionality in the vector and graph databases, as well as the stateful memory layer—under the Apache 2.0 license. This includes the ability to add, organize, and retrieve memories within your AI applications.
However, certain features that are optimized for production use, such as low-latency inference and the scalable graph and vector datastores for real-time memory updates, are part of our paid platform. These advanced capabilities are not in the open-source package but are available for those who need to scale memory management in production environments.
We’ve made both our open-source version and platform available for HN users. You can check out our GitHub repo (https://github.com/mem0ai/mem0) or explore the platform directly at https://app.mem0.ai/playground.
We’d love to hear what you think! Please feel free to dive into the playground, check out the code, and share any thoughts or suggestions with us. Your feedback will help shape where we take Mem0 from here!
One question that I've heard a few times now: will you support the open-source version as a first-class citizen for the long term? A lot of open-source projects with a paid version follow a similar strategy: they use the open-source repo to get traction, but then the open-source version gets neglected and users are eventually pushed to the paid version. How committed are you to supporting the open-source version long term?
- Inclusion prompt: User's travel preferences and food choices
- Exclusion prompt: Credit card details, passport number, SSN, etc.
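As a sketch of how this could look against the platform client (the includes/excludes parameter names here are our reading of the docs; treat them as assumptions and check the current API):

```python
from mem0 import MemoryClient

client = MemoryClient(api_key="your-api-key")

# Hypothetical inclusion/exclusion prompts steering what gets stored:
# travel and food preferences are kept, sensitive identifiers are dropped.
client.add(
    [{"role": "user", "content": "I love Creole food. My SSN is 123-45-6789."}],
    user_id="alice",
    includes="User's travel preferences and food choices",
    excludes="Credit card details, passport number, SSN etc.",
)
```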
That said, we definitely think there is scope to make it better, and we are actively working on it. Please let us know if you have feedback or suggestions. Thanks!
I messed around with the playground onboarding...here's the output:
With Memory (Mem0.ai): I know that you like to collect records from New Orleans artists, and you enjoy running.
Relevancy: 9/10
Without Memory: I don't have any personal information about you. I don't have the ability to know or remember individual users. My main function is to provide information and answer questions to the best of my knowledge and training. How can I assist you today?
Relevancy: 4/10
--
It's interesting that "With Memory" is 9/10 Relevancy even though it is 100% duplication of what I had said. It feels like that would be 10/10.
It's also interesting that "Without Memory" is 4/10 — it seems to be closer to 0/10?
Curious how you're thinking about calculating relevancy.
Mem0 currently handles outdated information in two ways:
1. Automatically deprioritizing older memories when new, contradictory information is added.
2. Adjusting memory relevance based on changing contexts.
We're working on improving this system to give developers more control. Future plans include:
1. Time-based decay of unused memories
2. Customizable relevance scoring
3. Manual removal options for obsolete information
These improvements aim to create a more flexible "forgetting" mechanism, allowing AI applications to maintain up-to-date and relevant knowledge bases over time.
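To make the first two ideas concrete, here is one simple way time-based decay could combine with a relevance score (an illustrative sketch, not Mem0's actual implementation):

```python
import time

def decayed_score(similarity: float, last_access_ts: float,
                  half_life_days: float = 30.0) -> float:
    """Blend vector similarity with exponential time decay: a memory
    untouched for one half-life counts half as much at retrieval time."""
    age_days = (time.time() - last_access_ts) / 86400
    return similarity * 0.5 ** (age_days / half_life_days)
```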
We're open to user feedback on how to best implement these features in practical applications.
(I hope it's ok to share something I've built along a similar vein here.)
I wanted to get long-term memory with Claude, and as different tools excel at different use cases, I wanted to share this memory across the different tools.
So I created MemoryPlugin (https://www.memoryplugin.com). It's a very simple tool that provides your AI tools with a list of memories and instructs them on how to add new memories. It's available as a Chrome extension that works with ChatGPT, Claude, Gemini, and LibreChat; a Custom GPT for ChatGPT on mobile; and a plugin for TypingMind. Think of it as the ChatGPT memory feature, but for all your AI tools, and your memories aren't locked into any one tool but shared across all of them.
This is meant for end-users instead of developers looking to add long-term memory to their own apps.
So after using Mem0 a bit for a hackathon project, I have sort of two thoughts:
1. Memory is extremely useful and almost a requirement when it comes to building next-level agents, and Mem0 is probably the best-designed/easiest way to get there.
2. I think the interface between structured and unstructured memory still needs some thinking.
What I mean by that is: when I look at the memory feature of OpenAI, it's completely unstructured, free-form text, and that makes sense for a general-use product.
At the same time, for more vertical-specific use cases, there are usually very specific things we want to remember about our customers (for advertising, say: age range, location, etc.). However, as the use of LLMs in chatbots increases, we may also want to remember less structured details.
So the killer app here would be something that can remember and synthesize both structured and unstructured information about the user in a way that's natural for a developer.
I think the graph integration is a step in this direction but still more on the unstructured side for now. Look forward to seeing how it develops.
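One way to picture the interface being asked for here: a record with fixed, queryable slots plus free-form extracted memories. A hypothetical sketch (the field names are made up):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class CustomerMemory:
    # Structured slots a vertical app always wants (hypothetical fields).
    age_range: Optional[str] = None
    location: Optional[str] = None
    # Free-form memories extracted from conversation by the LLM.
    notes: list[str] = field(default_factory=list)

profile = CustomerMemory(location="New Orleans")
profile.notes.append("collects records from local artists")
```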
- Control over what to remember/forget
- Ability to set how detailed memories should be (some want more detailed vs. less detailed)
- Different structure of the memories based on the use case
I believe the answer is "no, you can only run the memory management code in Python; the JavaScript code is only a client SDK for interacting with the managed solution". In which case, no worries, still looks awesome!
Disclaimer: I built it.
Context: We are using mem0 in another open-source project of ours (TypeScript) and had the same questions, so we went ahead and built a small API server for ourselves.
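For anyone in the same spot, the wrapper can be quite small. A minimal sketch of such an API server, assuming mem0's Python Memory interface (the endpoint shapes are our own, not an official spec):

```python
from fastapi import FastAPI
from pydantic import BaseModel
from mem0 import Memory

app = FastAPI()
memory = Memory()  # open-source memory layer, configured via defaults/env

class AddRequest(BaseModel):
    text: str
    user_id: str

@app.post("/memories")
def add_memory(req: AddRequest):
    # Any HTTP-capable client (e.g. a TypeScript app) can add memories.
    return memory.add(req.text, user_id=req.user_id)

@app.get("/memories/search")
def search_memories(q: str, user_id: str):
    return memory.search(q, user_id=user_id)
```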
I am inclined to like SPARQL databases because of their multiscale nature. You can have a tiny SPARQL database in RAM that you use like a hashtable and also have a big one with a few billion triples. It is a common situation that you want to gather all the facts to make a decision about a case (such as handling a customer at a call center), and it is reasonable to fetch all of that and get it in RAM.
Two major problems w/ SPARQL databases are:
(1) Even though RDF has two official ways to represent ordered collections (and there is an unofficial one that works very well), SPARQL does not have facilities for working with ordered collections like you would have in N1QL, AQL, or similar document-oriented query languages. This could be added, but it hasn't been done; see the sketch after point (2).
(2) If you are writing transactional or agentic systems in SQL, you get a lot of help from the fact that a "row" is the unit for inserts, deletes, and updates. It is not so easy to get this right when you are updating one triple at a time. There are algorithms to define the part of a graph that forms a "record" (e.g., go to the right from a starting node, passing through blank nodes but not through URIs), but this is all stuff you have to implement yourself.
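To illustrate: even from Python you end up walking rdf:first/rdf:rest cells by hand. A minimal sketch using rdflib (this helper is our own; it is exactly the facility SPARQL lacks per point (1), and a small taste of the roll-it-yourself work point (2) describes):

```python
from rdflib import Graph, RDF

def rdf_list_items(g: Graph, head):
    """Manually walk an rdf:List (rdf:first / rdf:rest), collecting
    its items in order -- SPARQL has no built-in way to do this."""
    items = []
    node = head
    while node is not None and node != RDF.nil:
        items.append(g.value(node, RDF.first))
        node = g.value(node, RDF.rest)
    return items
```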
---
Salesforce.com has a recently expired patent that covers a triple store which automatically profiles itself and builds indexes for very efficient query execution. If this were built into graph database products it could be game-changing, but so far it isn't.
---
There is "graph as a universal data structure" as in "the graph of pointers in a C program" and then there are the "graph algorithms" that Mark Newman writes about. The later are much less interesting than the former (go bomb the #1 centrality node in a terrorist network -- did you win the war?)
If you are doing the latter, or any kind of really intensive job, you may be better off doing it as a batch job; in fact, back in the day I developed Hadoop-based pipelines to do things like reconstruct the relationships inside the Freebase data dump.
----
For quite a few projects I've used ArangoDB, which is a great product, but the license really sucks. I have something I've been working on for a while that uses it, and if I'm going to either open-source or commercialize it I'm going to have to switch to something else.
Vector databases are typically used for storing embeddings and are great for tasks like similarity search. However, they are generally read-only and don't natively support the concept of time or state transitions. Let's take the example of tracking the state of a task from your to-do list in a vector database:
You might store the task's states like:
- Task 1 in backlog
- Task 1 in progress
- Task 1 canceled
But there's no concept of "latest state" or memory of how the task evolved over time. You'd have to store multiple versions and manually track changes.
With a memory-enabled system like Mem0, you could track Task 1 (current state: in progress) along with a memory of its previous states (backlog, canceled, etc.). This gives your AI app a more stateful understanding of the world, allowing it to update and reflect the current context automatically.
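A sketch of what that could look like with the open-source library (illustrative; exact calls and return shapes may differ):

```python
from mem0 import Memory

m = Memory()

# Each state transition is added as a memory; the memory layer reconciles
# contradictory facts instead of just appending more vectors.
m.add("Task 1 is in the backlog", user_id="todo_app")
m.add("Task 1 is now in progress", user_id="todo_app")

# Retrieval surfaces the current state rather than three stale rows.
print(m.search("What is the current state of Task 1?", user_id="todo_app"))
```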
Traditional databases, on the other hand, are designed for structured, relational data with fixed schemas, like customer information in a table. These are great for handling transactional data but aren't optimal for cases where the data is unstructured.
As mentioned in the post, we use a hybrid datastore approach that handles these cases effectively and that's where the graph aspect comes into picture.
What??
====
Bot: wassup?
Me: I have some more thoughts on Project X. They will be rambly so please also create an edited version as well as the usual synopsis. I will say 'I'm finished' when I've finished.
Bot: ok hit me
Me: bla bla bla bla etc etc. I'm finished.
Bot: this looks like part of the introduction text of Project X, is that correct?
Me: yes. What meta tags do you suggest? Etc.
====
I'm assuming that a custom GPT or equivalent is necessary to set out the 'terms of engagement' and agent objectives. Can you offer any advice about building such a system, and how mem0 could help?
It's for a very similar reason that using a graph for RAG can get you much more accurate responses than vector-only RAG. See a blog post I wrote about it: https://www.falkordb.com/blog/knowledge-graph-vs-vector-data...
The only AI memory solution I work with every day is ChatGPT's memory feature. How does mem0 compare to it?
1. LLM Compatibility: Mem0 works with various AI providers (OpenAI, Anthropic, Groq, etc.), while ChatGPT memory is tied to OpenAI's models only.
2. Target Audience: Mem0 is built for developers creating AI applications, whereas ChatGPT memory is for ChatGPT users.
3. Quality and Performance: Our evaluations show Mem0 outperforms ChatGPT memory in several areas:
- Consistency: Mem0 updates memories more reliably across multiple instances.
- Reliability: ChatGPT memory can be inconsistent with the same prompts, while Mem0 aims for more consistent performance.
- Speed: Mem0 typically creates memories in about 2 seconds, compared to ChatGPT's 30-40 seconds to reflect new memories.
4. Flexibility: Mem0 offers more customization options for developers, allowing better integration into various AI applications.

These differences make Mem0 a better choice for developers building AI apps that need efficient memory capabilities.
Exciting work overall!
makes me nostalgic for ChatScript's fact triples
This is my main concern with most AI providers. They are based in the US, with unclear GDPR compliance, making most of them a non-starter for me.
Claude Prompt Caching and Mem0's memory system have several key differences:
1. Purpose and duration: Claude's cache is designed for short-term reuse, expiring after about 5 minutes. In contrast, Mem0 is built for long-term information storage, retaining data indefinitely unless instructed otherwise.
2. Flexibility and control: Mem0 offers more flexibility, allowing developers to update, delete, or modify stored information as needed. Claude's cache is more static: new information creates additional cache entries rather than updating existing ones.
3. Content management: Claude has minimum length requirements for caching (1024 tokens for Sonnet, 2048 for Haiku). Mem0 can handle information of any length, from short facts to longer contexts.
4. Customization: Developers have greater control over Mem0's memory management, including options for prioritizing or deprioritizing information based on relevance or time. Claude's caching system offers less direct control.
5. Information retrieval: Mem0 is designed for precise, targeted information retrieval, while Claude's cache works with broader contextual blocks.
These differences reflect the distinct purposes of each system. Claude's cache aims to maintain recent context in ongoing conversations, while Mem0 is built to serve as a flexible, long-term knowledge base for AI applications.
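For reference, here is roughly how a cacheable prompt is marked with the Anthropic Python SDK under the prompt-caching beta current at the time of this thread (the beta header and model name may have changed since):

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Placeholder for the large shared context to cache; it must clear the
# model's minimum cacheable length (1024 tokens for Sonnet).
reference_docs = "...your long system context here..."

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=256,
    # Prompt caching shipped behind a beta header at launch.
    extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},
    system=[{
        "type": "text",
        "text": reference_docs,
        "cache_control": {"type": "ephemeral"},  # cache entry lives ~5 minutes
    }],
    messages=[{"role": "user", "content": "Answer using the reference docs."}],
)
print(response.content[0].text)
```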
See the "Comments" section of the Launch HN instructions for YC startups (https://news.ycombinator.com/yli.html) for an example of how strongly we emphasize this.
There's only one level of removal beyond that ("[deleted]") but we never do that as moderators. "[deleted]" always means either that the author deleted the post themselves or asked us to do it for them.