We opt for the central server as a super-peer and use the Yjs differential update system to avoid keeping docs in memory for too long. While there are many things about local-first that are a huge pain in the ass, the UX benefits are pretty huge. The DX can be nice too! Getting to focus on product and not on data transit (once you've got a robust sync system) is pretty sweet. The first four weeks after launching our Yjs-based system were rough, though; lots of bugs that virally replicated between peers. It requires a really paranoid eye for defensive coding; after several years, we have multiple layers of self-healing and validation on the client.
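(For anyone unfamiliar with the differential update API being referenced: Yjs can merge and diff updates as raw byte arrays, so the server never has to instantiate a Y.Doc. A rough sketch of that flow, where the load/save helpers are hypothetical placeholders for whatever persistence layer you actually use:)

    import * as Y from 'yjs'

    // Hypothetical persistence helpers -- stand-ins for whatever the server uses.
    declare function loadStoredUpdate(docId: string): Promise<Uint8Array>
    declare function saveStoredUpdate(docId: string, update: Uint8Array): Promise<void>

    // Merge an incoming client update into the stored document without
    // instantiating a Y.Doc on the server.
    async function mergeClientUpdate(docId: string, clientUpdate: Uint8Array) {
      const stored = await loadStoredUpdate(docId)
      await saveStoredUpdate(docId, Y.mergeUpdates([stored, clientUpdate]))
    }

    // Compute only the changes a client is missing, based on the state vector
    // it reports -- again without building the full document structure in memory.
    async function updatesForClient(docId: string, clientStateVector: Uint8Array) {
      const stored = await loadStoredUpdate(docId)
      return Y.diffUpdate(stored, clientStateVector)
    }

The state vector a client sends is tiny, so the server can answer "what am I missing?" by working on stored update bytes rather than a live document.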
I built a simple SaaS [1] to get a sense of what's missing, and while React Router + a syncing local-first database [2] + $5/month Cloudflare gets you pretty far, I still found myself needing to think through a lot of pieces.
[1] https://usequickcheck.com/ [2] https://fireproof.storage/
I haven’t gotten my hands on Zero yet, but the gist I get from those who have is that it gives you the same kind of experience with less work: the client just operates from the local cache. I’d assume there will be some way of lazily loading the entire data set into the cache, which would give you offline capabilities if desired, while still functioning as a traditional web app if the data set is too large.
What features do you wish Yjs had that would make your life easier?
I get that there is networking and integration that a modern application will typically need to do (depending on its core purpose), and syncing state to and from servers is a special concern (especially if conflict management is necessary) that native desktop applications rarely had to deal with in years past.
But at the end of the day, it sure does feel like we've come full circle. For a long time, every single application was "local first" by default. And now we're writing research papers and doing side POCs (I'm speaking generally, nothing to do with the author or their article) trying to figure out how to implement these things.
If I were to frame it succinctly, I would say those applications were local-only, which is distinct from local-first.
I mean, not to get really low level or abstract, but there's a reason that operating systems have the concept of a virtual file system: where and how your data is persisted is something that can be abstracted from the rest of the system. Add CRDTs or another conflict resolution solution to that layer and ... I don't want to pretend that it's simple, far from it ... but it's not new tech, is all I'm really saying.
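To make the analogy concrete, here's roughly what such a layer could look like; this is purely illustrative, not any particular library's API:

    // Illustrative only: a VFS-style persistence interface where the conflict
    // resolution strategy is pluggable, so callers never care where bytes live.
    interface MergeStrategy<T> {
      merge(local: T, remote: T): T   // CRDT merge, last-writer-wins, etc.
    }

    interface Store<T> {
      read(key: string): Promise<T | undefined>
      write(key: string, value: T): Promise<void>
    }

    // One possible backing: a local store plus a remote replica, reconciled on read.
    class ReplicatedStore<T> implements Store<T> {
      constructor(
        private local: Store<T>,
        private remote: Store<T>,
        private strategy: MergeStrategy<T>,
      ) {}

      async read(key: string) {
        const [l, r] = await Promise.all([this.local.read(key), this.remote.read(key)])
        if (l === undefined) return r
        if (r === undefined) return l
        return this.strategy.merge(l, r)
      }

      async write(key: string, value: T) {
        await Promise.all([this.local.write(key, value), this.remote.write(key, value)])
      }
    }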
Distributed systems were a very hot topic in the 90s. We even went through one very awkward and short-lived fad where we toyed with the idea of having a single application distributed across a network of computers ... not to persist data across multiple systems per se but for computation performance. This entire concept got abandoned when people realized how unnecessarily complicated it was (all cost / very little reward). But it was a thing for a while.
Even RAID distributes data persistence across physical devices and needs to care about data integrity as a result.
It just seems like the longer I'm in the industry, the more I realize that there is very little that is actually new ... despite the fact that we have a large number of enthusiastic young software developers who are looking at these concepts with starry eyes and youthful ignorance because they weren't around decades ago when this stuff was being researched, developed and experimented with.
I had thought that the advantage of CRDTs was that you do not need a centralized server, and that if you do have a central server, Operational Transforms are easier. Am I missing why CRDTs are used here?
- First and (maybe most importantly), WebRTC in browsers requires a central server for signaling. So unless web browsers loosen that constraint, a "true" P2P web app without a central server is unfortunately infeasible.
- My understanding is that with Operational Transforms, the server is "special" — it's responsible for re-ordering the clients' operations to prevent conflicts. I mention a little later in the article that Y-Sweet is just running plain Yjs under the hood. So it is a central server, but it's easily replaceable with any other instance of Y-Sweet; you could fork the code and run your own and it would work just as well.
- Peers will only sync their changes if they're online at the same time. That means that the longer peers go without being online simultaneously, the more their local documents will diverge. From a user experience point of view, that means people will tend to do a lot of work in a silo and then receive a big batch of others' changes all at once — not ideal! Having a "cloud peer" that's always online mitigates that (this is true for any algorithm).
FWIW, though, the author of ShareJS had said some pretty strong things pro-CRDT in the past and even kind of lamenting his work on OT, so...
OT would work fine to make this collaboratively editable. It’s just not local first. (If that matters to you.)
With an OT-based system like sharejs or google docs, the server is the hub, and clients are spokes connecting to that hub. Or to put it another way, the server acts as the central source of truth. If the server goes down, you’ve not only lost the ability to collaboratively edit; you've usually lost your data too. (You can store a local copy, but sharejs isn't designed to be able to restore the server’s data from whatever is cached on the clients.)
With Yjs (and similar libraries), the entire data set is usually stored on each peer; the server is just one node you happen to connect to and use to relay messages. Because they’re using Yjs, the author of this travel app could easily set up a parallel webrtc channel with his wife’s computer (in the same house). Any edits made would be broadcast through all available pipes. Then even when they’re on the road and the internet goes down, their devices could still stay in sync. And if the server was somehow wiped, you could spin up another one; the first client that connects would automatically populate the server with all of the data.
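That "broadcast through all available pipes" idea maps pretty directly onto how Yjs providers compose: attach more than one provider to the same doc. A rough sketch (the room name and server URL here are made up):

    import * as Y from 'yjs'
    import { WebsocketProvider } from 'y-websocket'
    import { WebrtcProvider } from 'y-webrtc'

    const doc = new Y.Doc()

    // "Cloud peer": a relay server that keeps everyone in sync even when the
    // other person's machine is asleep. (URL and room name are placeholders.)
    const ws = new WebsocketProvider('wss://sync.example.com', 'travel-plans', doc)

    // Direct peer-to-peer channel, e.g. two laptops in the same house.
    const rtc = new WebrtcProvider('travel-plans', doc)

    // Both providers observe the same Y.Doc, so every local edit goes out over
    // whichever connections are up, and remote updates merge into the same state.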
But whether these local first characteristics matter to you is another question. They might be a hindrance - for commercial data, centralisation is often desirable. I can think of plenty of use cases where replicating your entire database to your customers’ computers (and replicating any changes they make back!) would be a disaster. It depends on your use case.
You’re right, that is one of the advantages of CRDTs, but it turns out to be hard to realize on the web — aside from RTC (which has its own dragons), you still need a server in the mix.
The other thing an authoritative server solves is persisting the data. Because one server is the authority for a document at a time, you can use S3 or R2 for persistence without worrying about different servers with different versions of the document colliding and erasing each other’s changes.
Plus, if you want the data to persist so that two people can collaborate even if they are never online at the same time, you need a server anyway.
CRDTs as data structures support peer-to-peer, it’s just that in many use cases that aspect of CRDTs is not needed.
Bonus points: you could potentially rip out the bus and replace it with something that involves peer to peer connectivity without changing client data structures.
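Something like this (purely illustrative) is all the document layer would need to know about the bus; whether the implementation behind it is a websocket to a central relay or a mesh of WebRTC connections doesn't change the CRDT data structures at all:

    // Illustrative: the CRDT layer only ever sees opaque update payloads, so
    // the "bus" underneath could be a websocket to a server, WebRTC, or both.
    interface UpdateBus {
      broadcast(update: Uint8Array): void
      onUpdate(handler: (update: Uint8Array) => void): void
    }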
I got confused by this comment though:
> To determine when to re-render, “reactive” frameworks like Svelte and Solid track property access using Proxies, whereas “immutable” frameworks like React rely on object identity.
I thought React was just as reactive as all the other JS frameworks, and that the state/setState code would look similar. Great article, congrats on releasing!
https://dev.to/this-is-learning/how-react-isn-t-reactive-and...
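The linked article explains it well, but the distinction can be sketched without either framework's internals; this is a toy illustration, not how Solid or React are actually implemented:

    // Toy "reactive" tracking: a Proxy records which properties get read, so a
    // framework could re-run only the effects that touched a changed property.
    function reactive<T extends object>(target: T, onRead: (key: PropertyKey) => void): T {
      return new Proxy(target, {
        get(obj, key, receiver) {
          onRead(key)                        // dependency-tracking hook
          return Reflect.get(obj, key, receiver)
        },
      })
    }

    const state = reactive({ count: 0 }, (key) => console.log('read:', key))
    state.count                              // logs "read: count"

    // Toy "immutable" check: React-style frameworks instead compare identity,
    // so you signal a change by producing a new object rather than mutating.
    const prev = { count: 1 }
    const next = { ...prev, count: 2 }
    console.log(prev === next)               // false -> treated as changed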
We also use a hub-and-spoke model, but we still rely on a central server (PocketBase) for user management flows like authorization and billing.
Obsidian is such a fantastic editor, and it fits so naturally with local-first collaboration.
Change my view!
1. That "in existing approaches" qualifier is important — local-first is still very much a nascent paradigm, and there are still a lot of common features that we don't really know how to implement yet (such as permissioning). You might be correct for the moment, but watch this space!
2. I think most apps that would benefit from a local-first architecture do not have the monotonically growing dataset you're describing here. Think word processors, image editors, etc.
3. That said, there are some apps that do have that problem, and local-first probably just isn't the right architecture for them! There are plenty of apps for which client-server is a fundamentally better architecture, and that's okay.
4. People love sorting things into binaries, but it doesn't have to be zero-sum. You can build local-first features (or, if you prefer, offline-first features) into a client-server app, and vice versa. For example, the core experience of Twitter is fundamentally client-server, but the bookmarking feature would benefit from being local-first so people can use it even when they're offline.
The other issue with relying on just the server to build these highly collaborative apps is that you can't wait for a round trip to the server for each interaction if you want it to feel fast. Sure, you can for those rare cases where your user is on a stable Wi-Fi network, on a fast connection, and near their data; however, a lot of computing is now on mobile, where pings are much, much higher than 10ms, and on top of that, when you have two people collaborating from different regions, someone will be too far away to rely on round trips.
Ultimately, you're going to need a client caching component (at least optimistic mutations) so you can either lean into that or try to create a complicated mess where all of your logic needs to be duplicated in the frontend and backend (potentially in different programming languages!).
The best approach IMO (again biased to what Triplit does) is to have the same query engine on both client and server so the exact query you use to get data from your central server can also be used to resolve from the clients cache.
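Not Triplit's actual API, just the shape of the idea: one query definition that the same engine can resolve in two places, so the logic isn't written twice.

    // Illustrative only: the same query value can be run against the local
    // cache for instant results and against the server for authoritative ones.
    type Query<T> = { collection: string; where: (row: T) => boolean }

    interface QueryEngine {
      fetch<T>(q: Query<T>): Promise<T[]>
    }

    async function fetchWithCache<T>(
      cache: QueryEngine,        // runs in the browser over synced/optimistic data
      server: QueryEngine,       // runs the identical query against the source of truth
      q: Query<T>,
      render: (rows: T[]) => void,
    ) {
      render(await cache.fetch(q))    // instant, possibly slightly stale
      render(await server.fetch(q))   // reconciled once the round trip completes
    }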
Probably it'd require ECC RAM to prevent in memory bitrot, multiple copies of blocks (or even multiple physical block devices) with strong checksums.
Perhaps this data should somehow "automagically" sync between all locally available devices, again protected with strong checksums at every step.
(This idea requires some refining.)
Then locally available devices can compare changelogs and sync only the delta.
No need for a checksum, since you can use monotonically increasing version numbers and CRDTs!
How does that help against random bit flips?
You could still build a server-side search index over those documents, which never needs to be sent to the client.
An example that matches that document structure is Figma; each document is individually small enough to be synced with the client, document metadata is indexed on the server, and queries over documents take place on the server.
- this app didn’t need fancy graph querying, so didn’t have to implement it.
- if it did, there’s a natural way to extend this approach to support it.
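A rough sketch of that Figma-style split, with an in-memory stand-in for a real search index: full documents sync to clients, but only the server keeps the index, which clients hit through a normal request/response API.

    // Illustrative: the index never needs to be sent to the client.
    const index = new Map<string, Set<string>>()   // term -> document ids

    function indexDocument(docId: string, fullText: string) {
      for (const term of fullText.toLowerCase().split(/\W+/)) {
        if (!term) continue
        if (!index.has(term)) index.set(term, new Set())
        index.get(term)!.add(docId)
      }
    }

    function search(term: string): string[] {
      return [...(index.get(term.toLowerCase()) ?? [])]
    }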
I mean, I don't, personally. I'm writing a couple small apps to scratch my own itches and I might sell them to anyone else who wants an individual copy for personal use.
Remember when you could just buy a copy of a program and use it on your own computer? And it would never get updated to remove functionality or break because some servers were shut down? That's the experience I'm seeking from local-first software.
I think designing for casual, personal-sized data is extremely easy if you give up the idea that every program needs to be some bloated Enterprise-Ready junkware.
You are Google's dream user. :)
For this to work, my home computer needs to upload the changes somewhere my phone can access them. For example, a home server or a dumb box in the cloud.
It’s very difficult to make this work without a server kicking around somewhere. So long as the server is fungible (it can easily be replaced for basically any other server), I don’t really see the problem with keeping a server around to relay messages.
Jake makes creating a local-first multiplayer app seem so simple.
A PoC of using Dropbox for a local-first app
Currently Android only; I don't have an iOS dev subscription. https://play.google.com/store/apps/details?id=com.yedev.habi...
I'm currently looking into TinyBase to make working with high latency decentralised services more bearable.
Would be cool if there were a better comparison of the different solutions for storage and sync, with features, good use-cases, etc.
I work on PowerSync and we did a Show HN last year: https://news.ycombinator.com/item?id=38473743
Also see InstantDB Show HN: https://news.ycombinator.com/item?id=41322281
EDIT: actually... it looks like mongo may have just announced the EOL for the server-side component a couple of weeks ago... bad timing!
I love that the documents are stored in S3; it's probably going to be way cheaper than hosting them in a database elsewhere. Can't wait to try it out soon.