I wrote this because I wanted a queue with all the bells and whistles - searching, scheduling into the future, observability, and rate limiting - all the things that many modern task queue systems have.
But I didn't want to rewrite my app, which was already using SQS. And I was frustrated that many of the best solutions out there (BullMQ, Oban, Sidekiq) were language-specific.
So I made an SQS-compatible replacement. All you have to do is replace the endpoint using AWS' native library in your language of choice.
For example, the queue works with Celery - you just change the connection string. From there, you can see all of your messages and their status, which is hard today in the SQS console (and flower doesn't support SQS).
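For the Go SDK (v2), for example, it's just an endpoint override. A minimal sketch - the port and queue URL shape here are placeholders, not necessarily what a real instance uses:

    package main

    import (
        "context"
        "log"

        "github.com/aws/aws-sdk-go-v2/aws"
        "github.com/aws/aws-sdk-go-v2/config"
        "github.com/aws/aws-sdk-go-v2/service/sqs"
    )

    func main() {
        cfg, err := config.LoadDefaultConfig(context.TODO(), config.WithRegion("us-east-1"))
        if err != nil {
            log.Fatal(err)
        }

        // Same stock SQS client, just pointed at SmoothMQ instead of AWS.
        client := sqs.NewFromConfig(cfg, func(o *sqs.Options) {
            o.BaseEndpoint = aws.String("http://localhost:3001")
        })

        _, err = client.SendMessage(context.TODO(), &sqs.SendMessageInput{
            QueueUrl:    aws.String("http://localhost:3001/myqueue"), // assumed URL shape
            MessageBody: aws.String("hello"),
        })
        if err != nil {
            log.Fatal(err)
        }
    }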
It is written to be pluggable. The queue implementation uses SQLite, but I've been experimenting with RocksDB as a backend and you could even write one that uses Postgres. Similarly, you could implement multiple protocols (AMQP, PubSub, etc) on top of the underlying queue. I started with SQS because it is simple and I use it a lot.
It is written to be as easy to deploy as possible - a single go binary. I'm working on adding distributed and autoscale functionality as the next layer.
Today I have search, observability (via Prometheus), unlimited message sizes, and the ability to schedule messages arbitrarily far in the future.
In terms of monetization, the goal is to just have a hosted queue system. I believe this can be cheaper than SQS without sacrificing performance. Just as Backblaze and Minio have had success competing in the S3 space, I wanted to take a crack at queues.
I'd love your feedback!
+1 for k8s, cloud native, self-hosted, edge-enabled at low cost or no cost.
I ran rq and minio for years on k8s, but I've been watching SQLite as a drop-in replacement since most of my work has been early stage, at or near the edge.
Private cloud matters. This is an enabler. We've done too much already in public cloud where many things don't belong.
BTLE sensors are perfectly happy talking to my Apple Watch directly with enough debugging.
I'd argue the trip through cloud was not a win and should be corrected in the next generation of tools like this, where mobile is already well-primed for SQLite.
Asking from a business perspective - I of course intend to keep developing this, but am also really trying to think through the business case as well.
The answer depends on funding, i.e. in my own never-leaves-my-house case it is always self-host, much like SOC work.
In the case of startup or research lab work (day job, for lack of a better descriptor), it's frequently a slice of AWS, GCP, or Azure, i.e. 6-figure/mo cloud bills.
I think those two broad cases are worth considering.
If you look at the history from J2EE to the k8s prototype in Java to what we have now, encapsulating all of these things into a single container is a great idea, particularly at Google scale. But many unintended consequences arise from complexity accruing to features and functions that were never actually requirements for your particular project, i.e. YAGNI, because few orgs have Google-scale problems. If yours does, great! Carry on... If not, consider k3s or Aptible or more emergent platforms I haven't actually used.
The mere presence of unneeded items in source and documentation presents a "why am I here?" choice paradox. That's before we even get into keeping track of deprecations in the never-at-rest source/release evolution.
Federation is a good example. I've worked at places that needed it and places that didn't.
Continuous Glucose Monitors (CGMs). Those used to be a diabetic-only product until Abbott, Dexcom, and other vendors expanded their markets beyond diagnosed diabetics into pre-diabetes, exercise, health, and well-being applications like:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10635370/
This has exploded beyond a hundred years of ketogenic research, especially as research on glycemic variability in mental health has grown.
My current kit includes an Apple Watch series 9 and an iPhone SE. Prior to that it was Google Pixel and a Fitbit Sense, though direct BTLE->Watch was not an option in that generation.
I have two watch complications, both CGMs: Abbott FreeStyle Libre 3 and Dexcom G7. The Dexcom integration is entirely proprietary. The Abbott one is accomplished via a series of 3rd-party hacks:
https://www.youtube.com/watch?v=YqUZjXo5VXY
I'd say open source, but I'm not certain that every link in the chain is open source. I've run many generations of various open source tools on Android and iPhone since no vendor ships a complete end-to-end solution that is perfect.
Nightscout and watchdrip are two open source examples.
When the G7 originally shipped last year, sensor data left the sensor and used the iPhone as a proxy to cloud storage, as has become the default mode across many IoT devices: easy and obvious, despite the unintended consequences of that design choice.
At that point the data has already taken a long and perilous journey into the cloud, when BTLE->Watch would have been dramatically shorter, cheaper (in terms of hops and the requirement for service), and arguably better. Hence, any data request pays that full routing price into and out of the cloud, even for the most trivial display, such as the watch.
After a belated Dexcom BTLE->Apple Watch update a year later, I don't even need to carry my iPhone anymore, despite the fact that I don't have service beyond WiFi on my Apple Watch, since the data exchange is entirely BTLE.
The sad fact is that straightforward questions around a person's glycemia were not answerable directly without an entire belt-worn cloud ecosystem being paid for and fully functional 24x7x365.
BTLE->Apple Watch is by no means perfect, but it's dramatically better than my previous 5 years of 24x7x365 routing through cloud.
HTH!
It looks like this may solve at least another part of my quest to replace various back-ends with sqlite.
Makes me think quite positively of Go, and of the dev for designing it this way. I can understand why teams like it: easier maintenance.
Quite elegant, from my uneducated POV.
A number of people don't like it for limiting their expression and abilities, and I understand that feeling too. But as a middle-aged programmer I realized that readability trumps conciseness and cleverness in the long run.
Two experienced Rust devs tackle the same task, and their solutions will be worlds apart.
(I write both, but I do love Go for its simplicity)
- The dev for making it both fast & simple to understand
- Golang for making the codebase easy to follow
are you monetizing this as a separate business from: https://www.ycombinator.com/companies/scratch-data
But I started working on this as something I wished existed, as opposed to having some big VC strategy and pitch deck behind it.
(Also, I appreciate all of your feedback on this a month ago! It was really helpful to encourage me to keep looking into this and also figuring out the "first" things to launch with!)
The monetization paragraph reads really weird, as if you believe HN is a community of VC-adjacent people looking for new ways to make money (it isn't), and talking about how you plan to exploit your new project is mandatory (it isn't either).
What isn't irrelevant, I think, is the accusation that monetizing something you've worked on is somehow exploiting it. If this brings value to people, and they can make money using it, or make more money using it, the OP deserves to charge them for it if they want to. There's nothing wrong or exploitative about that. You are free to donate all your time and not charge anyone for anything, but it's silly to frame someone charging for their work as exploitation of either that thing or the people paying for it.
It kinda is. There are all sorts of people here, but HN is owned by YC, a well-known VC fund. That doesn't mean that everyone here is one, but it certainly influences the community here.
> and talking about how you plan to exploit your new project is mandatory
It's not mandatory, but it's a frequently asked question. OP might as well answer it while they have the mic.
Where was this claim made? I missed it.
> Projects that were created to "scratch one's itch" tend to fare better than those built to make money. Devs put more love and less stress into them than into things they want to build a business out of.
This project is based on SQLite, which is several people's livelihood. As a result, it's rock-solid, reliable, and available to all without fee or restriction.
So it's an odd choice of project to choose to express this sort of eminently debatable sentiment.
The main difference I expect with a hosted solution are things like multiple tenants or billing integrations. These aren’t core to the product and only necessary when you need to host someone else’s data.
My enthusiasm has instantly waned.
Why is AGPL needed? Just use MIT and make it easy for people, especially if you're not planning on monetising it.
I won't use AGPL code just on principle.
Not trying to be snarky, I'm genuinely curious why you'd be so vehemently opposed to it.
Permissive licenses allow for proprietary forks, which may become more successful than the upstream project.
AGPL would be able to benefit from any improvements from any fork, and all those will remain OSS for everyone.
Nothing written here is related in any way to monetization.
Which fulfills the developer's goal of attracting higher-quality users who plan to collaborate with him. You got it right :)
You got something wrong though: if you're anywhere close to the free software/open source space, you should know that when using his AGPL SQS replacement, the only thing that would need to be public is whatever you change in it, not the things you build on top of it.
Every man and his dog has made a message queue with Postgres. Message queues are everywhere on github and often posted on HN.
So far I've had no problems with ElasticMQ.
I'm intrigued by SmoothMQ's small LOC count. Compared to ElasticMQ it's much smaller (probably because it leans on SQLite's features).
I assume this would work without much issue with Litestream, though I'm curious if you've already tried it. This would make a great ephemeral queue system without having to worry about coordinating backend storage.
The nice thing about queues is that backend storage doesn't really need to be coordinated. Like, you could have two servers, with two sets of messages, and the client can just pull from them round robin. They (mostly) don't need to coordinate at all for this to work.
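A sketch of what I mean (endpoints and queue URL made up): the consumer just alternates between two nodes that know nothing about each other:

    package main

    import (
        "context"
        "fmt"

        "github.com/aws/aws-sdk-go-v2/aws"
        "github.com/aws/aws-sdk-go-v2/config"
        "github.com/aws/aws-sdk-go-v2/service/sqs"
        "github.com/aws/aws-sdk-go-v2/service/sqs/types"
    )

    // Round-robin consumption from two completely independent queue nodes.
    func consume(ctx context.Context, cfg aws.Config, queueURL string, process func(types.Message)) {
        endpoints := []string{"http://node1:3001", "http://node2:3001"}

        clients := make([]*sqs.Client, len(endpoints))
        for i, ep := range endpoints {
            ep := ep
            clients[i] = sqs.NewFromConfig(cfg, func(o *sqs.Options) {
                o.BaseEndpoint = aws.String(ep)
            })
        }

        for i := 0; ; i++ {
            out, err := clients[i%len(clients)].ReceiveMessage(ctx, &sqs.ReceiveMessageInput{
                QueueUrl:            aws.String(queueURL),
                MaxNumberOfMessages: 10,
            })
            if err != nil {
                continue // a dead node just gets skipped this round
            }
            for _, m := range out.Messages {
                process(m) // neither node knows the other exists
            }
        }
    }

    func main() {
        cfg, _ := config.LoadDefaultConfig(context.TODO())
        consume(context.TODO(), cfg, "http://node1:3001/myqueue", func(m types.Message) {
            fmt.Println(aws.ToString(m.Body))
        })
    }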
However, this is different for replication where we have multiple nodes as backups.
I also love writing AWS API-compatible services. That's why I did Dyna53 [1] ;P
(I know, unrelated, but hopefully funny)
Move all the structs from models/ into the root directory.
This allows users of the package to have nice, short names like q.Message and q.Queue, and avoids import naming conflicts if the user has their own "models" package.
First, the project does not yet have a distributed implementation, you’re correct. Stay tuned!
Second, SQLite is incidental. Data is still stored on disk, just as SQS must be, but I’ve chosen SQLite as the file format for now.
Having said that, I think it is safe to assume that SQS probably stores information to non-volatile storage.
Nice work on the project!
If it faithfully reproduces the SQS API, what could possibly stop me from using this product now (if he ever does hosted) and then switching to SQS if the scale is ever justified?
I'm all for a full suite of solutions targeting the same API. Can I run this thing on a single server alongside my dashboard app for the 3-person team I'm developing for, where all it costs them is tech support hours and server hardware, and deploy my multi-million-user app on SQS, with effectively the same code?
I swear they reimplement stuff we have just so there are more places to bill us.
If you are greenfield and scared of hitching yourself to Amazon, why not go with something like RabbitMQ? There are RabbitMQ cloud providers as well.
An HTTP-based solution is easy to understand and implement (if basic functionality is enough for your use case). Real MQs obviously have more complex protocols, and not all client libraries are perfect.
10m requests for $4.
You will need hundreds of millions of requests per month for it to be noticeable.
And can an implementation like this even help you at that point?
Add to that, it’s enormously scalable in terms of throughput and retained messages.
And it’s globally available.
A single file app is no comparison, really. The value of SQS is in the engineering.
Some enterprise production questions:
* Is it HA? What HA topology is it designed for?
* How many messages per second can it ingest?
* What about with multiple clients?
* Can you front-end it via a load balancer?
* Can it guarantee only-once delivery?
* How does it handle errors? ie: when a message is retrieved by a client and the client errors or the server dies, what happens to the message - does it redrive or expire correctly when the server comes back up?
* How does it guarantee consistency in a multi-node environment?
* How does it behave when a node dies/runs out of memory?
* How do you monitor it?
* How do you recover a corrupted instance?
* What happens if/when the SQLite partition runs out of space?
etc etc.
I use SQS as a main pipeline, and I would want to know a bunch of these things before I'd even consider replacing it. But for someone with one box on a VPS somewhere, you probably only care if it runs for a year without crashing
single executable binary
Golang with SQLite, so fast
minimal config, so no ridiculous hours or days of config and operations
This blitzes other ideas for use cases that do not require distributed queues - and that is likely many use cases.
The "TODO: check for errors" comment, combined with what seems like disabling foreign key constraint checks, makes me a bit hesitant to try this out.
In practice, I did not find they were necessary - I was only using foreign keys to automatically CASCADE deletes when messages were removed. But instead of relying on SQLite to do that, I do it myself and wrap the delete statements in transactions.
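Roughly like this - simplified, and the table names here are illustrative rather than the actual schema:

    package main

    import (
        "database/sql"

        _ "github.com/mattn/go-sqlite3"
    )

    // Delete a message and its dependent rows atomically, instead of
    // relying on ON DELETE CASCADE.
    func deleteMessage(db *sql.DB, id int64) error {
        tx, err := db.Begin()
        if err != nil {
            return err
        }
        defer tx.Rollback() // no-op once Commit succeeds

        if _, err := tx.Exec("DELETE FROM kv WHERE message_id = ?", id); err != nil {
            return err
        }
        if _, err := tx.Exec("DELETE FROM messages WHERE id = ?", id); err != nil {
            return err
        }
        return tx.Commit()
    }

    func main() {
        db, _ := sql.Open("sqlite3", "queue.db")
        _ = deleteMessage(db, 42)
    }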
There are many TODOs and missing error checks that I will clean up over time. I'm glad you've pointed them out - that's the great thing about open source: you at least know what you're getting into and can help shine a light on things to improve!
The celery Backends and Brokers docs compare SQS and RabbitMQ AMQP: https://docs.celeryq.dev/en/stable/getting-started/backends-...
Celery's flower utility doesn't work with SQS or GCP's {Cloud Tasks, Cloud Pub/Sub, Firebase Cloud Messaging FWIU} but does work with AMQP, which is a reliable messaging protocol.
RabbitMQ is backed by Mnesia, an Erlang/OTP library for distributed, durable data storage. Mnesia: https://en.wikipedia.org/wiki/Mnesia
SQLite is written in C and has lots of tests because of aerospace requirements, IIUC.
There are many extensions of SQLite: rqlite, cr-sqlite, postlite, electricsql, sqledge, and also WASM: sqlite-wasm, sqlite-wasm-http.
celery/kombu > Transport brokers support / comparison table: https://github.com/celery/kombu?tab=readme-ov-file#transport...
Kombu has supported Apache Kafka since 2022, but celery doesn't yet support Kafka: https://github.com/celery/celery/issues/7674#issuecomment-12...
RabbitMQ and other MOMs like Kafka are very versatile. What is the use case for not using SQS right now, but maybe later?
And if there is a use case (e.g. production-grade on-premise deployment), why not a client-side facade for a production-grade MOM (e.g. in Celery, amqp:// instead of sqs://)? Most MOMs should be more feature-rich than SQS. At-least-once delivery without pub/sub is usually the baseline and easy to configure.
I mean, if this project reaches its goal of providing an SQS-compatible replacement, that is nice, but I wonder if such maturity comes with the complexity this project originally wants to avoid.
SQS and heavier ESBs are overkill for some applications, and underkill for others where an HA configuration for the MQ / task queue is necessary.
Why would you want to use the SQS protocol in production without targeting the SQS "broker" as well? The timing and the AWS imposed quotas are bound to be different.
There are plenty of brokers that fit different needs. I don't see the SQS protocol, especially with security and so on in mind, as a good fit in this case.
SQS is not a reliable exactly-once messaging protocol like AMQP, and it doesn't do task-level accounting or result storage (which SQLite also solves for).
Apache Kafka > See also: https://en.wikipedia.org/wiki/Apache_Kafka
I don't know what you are getting at. I work on a project with SQS and I've worked with RabbitMQ, ArtemisMQ, and other MOM technologies as well. At-least-once delivery is something you can achieve easily. This would be the common ground that SQS can also provide. The same is true for Kafka.
https://github.com/crowdwave/sasquatch
sasquatch is also a message queue, also written in Golang and also based on sqlite.
sasquatch implements behaviour very similar to SQS but does not attempt to be a drop-in replacement.
sasquatch is not a complete project though, nor even really a prototype, just early code. Likely it does not compile.
HOWEVER - sasquatch is MIT license (versus this project which is AGPL) so you are free to do with it as you choose.
sasquatch is a single file of 700 lines so easy to get your head around: https://raw.githubusercontent.com/crowdwave/sasquatch/main/s...
Just remember, as I say, it's early code and won't even compile yet, but functionally it should be complete.
I never cared to figure out what parts of SQS are client-side and which are server-side, but - does SmoothMQ support long polling, batch delivery, visibility timeouts, error handling, and - triggers? Or are triggers left to whatever is implementing the queue? Both FIFO and simple queues? Do you have throughput numbers?
As an SQS user, a table of SQS features vs SmoothMQ would be handy. If it's just an API-compatible front-end then that would be good to know. But if it does more that would also be good to know.
The reason you'd use this is because there are lots of clients who still want on-prem solutions (go figure). Being able to switch targets this way would be handy.
It implements many of these features so far (ie, visibility timeouts) and there are some that are still in progress (long polling). A compatibility table is a good idea.
Totally agree on adding more metrics and information to the UI - but how much of that should be on the dashboard vs exposed as a Prometheus metric for someone to use with their dashboard tool of choice?
I am not very good at visual design and have chosen the simplest possible tech to build it (static rendering + Fomantic-UI). I sometimes wonder if the lack of React or Tailwind or truly beautiful elements will hold the project back.
go get github.com/poundifdef/SmoothMQ/models
go: github.com/poundifdef/SmoothMQ@v0.0.0-20240630162953-46f8b2266d60 requires go >= 1.22.2; switching to go1.22.4
go: github.com/poundifdef/SmoothMQ@v0.0.0-20240630162953-46f8b2266d60 (matching github.com/poundifdef/SmoothMQ/models@upgrade) requires github.com/poundifdef/SmoothMQ@v0.0.0-20240630162953-46f8b2266d60: parsing go.mod:
module declares its path as: q
but was required as: github.com/poundifdef/SmoothMQ
My lab also developed an SQS-esque system based on the filesystem, so no dependencies whatsoever and nothing operational to run other than the OS. It doesn't support all SQS commands (because we haven't needed them), but it also supports commands that SQS doesn't have (like releasing all messages to visible status).
Each queue node can operate (mostly) independently, and this is good. As a consumer, I don't really care where my next message comes from, so I can minimize the amount of data that needs to have a "leader".
The only data that needs to be synced is the list of queues, which doesn't change often. If one server is full, it should be able to route a request to another server.
When we downscale, we can use S3/Dynamo (GCS/firestore) to store items and redistribute.
There's more nitty gritty here (what about FIFO queues? What about replication?) but the fact that the main actions, "enqueue" and "dequeue", don't require lots of coordination makes this easier to reason about compared to a full RDBMS.
Enqueue absolutely requires coordination, if not via a leader then at least amongst multiple nodes, if you want to guarantee at-least-once delivery.
If you don't guarantee that, cool, but you're not competing with sqs
Or if you truly only need to store simple values in a distributed fashion, you could probably use etcd for that part.
[0]: https://rqlite.io/
disclaimer: I am one of the maintainers
It really depends on the semantics of the MQ you want to provide. There is rqlite if you want a distributed SQLite over Raft.
The question is what sort of guarantees you'd like to provide, and how much latency/performance you are willing to compromise.
I'll grant it's small to infinitesimal, but you asked for feedback.
Are there maintenance actions that the admin needs to perform on the database? How are those done?
re: maintenance - I have tried to build this to be hands-off. The only storage this uses is SQLite, and I have the code set to automatically vacuum to reclaim space as messages are deleted.
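Concretely, the mechanism is along these lines (a simplified sketch using SQLite's incremental auto-vacuum, not the exact code):

    package main

    import (
        "database/sql"
        "time"

        _ "github.com/mattn/go-sqlite3"
    )

    func main() {
        db, err := sql.Open("sqlite3", "queue.db")
        if err != nil {
            panic(err)
        }

        // Must be set before tables are created (or followed by a full VACUUM).
        db.Exec("PRAGMA auto_vacuum = INCREMENTAL")

        // Periodically hand freed pages back to the OS.
        go func() {
            for range time.Tick(time.Minute) {
                db.Exec("PRAGMA incremental_vacuum(1000)") // reclaim up to 1000 pages
            }
        }()

        select {} // rest of the server elided
    }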
It also has a /metrics endpoint which reports disk size. This is going to be used for two things in the future: first, as a metric for autoscaling (scale when the disk is full), and second, so that a server can stop serving requests when its disk is full (to prevent catastrophic failure).
It looks like LocalStack is a supported Testcontainer, and they do support SQS (but I haven't tried it myself).
The specialized Testcontainers just pass down or expose more config parameters, IMHO, because they possess more knowledge about the Docker container they are about to start. For SQS, I could imagine it being convenient to expose the URI, ARN, or even auth parameters that one can then refer to when setting up tests with dynamic config and when setting up the SQS client.
LocalStack I have not used so far. I often use containers for database tests as well.
Rate limiting is something I miss on Sidekiq (only available on the premium plan that I can't afford) and the gems that extend it break compatibility often.
I'll probably try poking at it directly through the HTTP API rather than an SDK ... does it need AWS V4 auth signatures or anything?
If you do use your language's AWS SDK, the code handles [1] all of the V4 auth stuff. https://github.com/poundifdef/SmoothMQ/blob/main/protocols/s...
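If you do want to poke it raw, the classic SQS query protocol is just a form-encoded POST, along these lines - the endpoint and queue URL are placeholders, and whether a given deployment enforces signature validation is something I'd double-check:

    package main

    import (
        "fmt"
        "net/http"
        "net/url"
        "strings"
    )

    func main() {
        // The classic SQS "query" protocol: a form-encoded POST.
        // Against real SQS this request would also need SigV4 headers.
        form := url.Values{}
        form.Set("Action", "SendMessage")
        form.Set("Version", "2012-11-05")
        form.Set("QueueUrl", "http://localhost:3001/myqueue")
        form.Set("MessageBody", "hello")

        resp, err := http.Post(
            "http://localhost:3001",
            "application/x-www-form-urlencoded",
            strings.NewReader(form.Encode()),
        )
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()
        fmt.Println(resp.Status)
    }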
I'd love your feedback! Particularly the difficulties you find in running it, of which I'm sure there are many, so I can fix them or update docs.
1. Is there some threshold where this would make sense financially (n billions of messages)?
2. Are the extra developer features (ie, larger message sizes, observability, DAGs) worth it for people to switch?
Would love your thoughts - what, if anything, would make you even entertain moving to a different queue system?
Maybe you could add a web admin GUI as a paid add on?
My own vision is to take a queue that is relatively dumb and make it smarter. I want it to be able to, for example, allow you to rate limit workers without needing to implement this client side. And so on for all of the other bits that one needs to implement in the course of distributed processing.
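For example, server-side rate limiting could mean the queue gating deliveries per queue, conceptually something like this (a sketch with illustrative names, not the shipped code):

    package main

    import (
        "context"
        "fmt"

        "golang.org/x/time/rate"
    )

    // Gate dequeues per queue on the server so workers don't need their
    // own client-side throttling. Names here are illustrative.
    type Queue struct {
        limiter *rate.Limiter
    }

    func NewQueue(perSecond float64) *Queue {
        return &Queue{limiter: rate.NewLimiter(rate.Limit(perSecond), 1)}
    }

    func (q *Queue) Dequeue(ctx context.Context) (string, error) {
        // Blocks until this queue's rate budget allows another delivery.
        if err := q.limiter.Wait(ctx); err != nil {
            return "", err
        }
        return q.popMessage(), nil
    }

    func (q *Queue) popMessage() string { return "..." } // storage lookup elided

    func main() {
        q := NewQueue(100) // e.g. 100 deliveries/sec for this queue
        msg, _ := q.Dequeue(context.Background())
        fmt.Println(msg)
    }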
I’m still figuring out the market. Very large firms spending thousands on queues? Or developers who want a one-stop solution built on familiar tech? Or hosting companies who want to offer their own queue as a service?
I don't think the cost of queues is a problem anywhere (I'm sure it is somewhere, but not a market's worth). The problems created by queues, on the other hand, are myriad and expensive.
Loving all this self-hosted KISS stuff :)
I think the queue itself will end up using a number of technologies: SQLite for some data (ie, organizing messages by date and queue), RocksDB for other things (fast lookup for messages), DuckDB (for message statistics and metadata), and other data structures.
I find that some of the most performant software often uses a mix of different data structures and algorithms by introspecting the nature of the data it has. And if I can make one node really sing, then I can have confidence that distributing that will yield results.
I think SQLite is a great start, but I really do want the software to be able to utilize the full resources of the underlying hardware.
People don't bother with a Google search after, what, 20 years in town?
I've been playing with benchmarks, yes! I haven't written them up yet because I worry about doing them the "right" way. But since you asked, I did a quick test with a single server and single client on a t2.nano. With 3 sending threads and 5 receiving threads, and a message size of 2kb, I can send 700 msgs/s and receive 500 msgs/s.
It is way faster on my laptop, and I have a lot of tricks that I've been playing with to improve this.
I’d also recommend reducing the number of threads as that can increase performance on single-processor machines (context switching will kill you). Try to find the sweet spot (for us, it was 2x-4x the number of cpus to threads, at least, for our workload).
If you are using Go, it kinda sucks at single-CPU things; if you detect that you have a single core, you can lock a goroutine to a thread, and that sometimes helps.
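Something like:

    package main

    import "runtime"

    func main() {
        // On a single-core box, pin the hot goroutine to one OS thread
        // to cut down on context switching.
        if runtime.NumCPU() == 1 {
            runtime.LockOSThread()
            defer runtime.UnlockOSThread()
        }
        // ... hot send loop goes here
    }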
Also, try to batch sends (aka, a for-loop) to attempt to saturate the send side, so by the time you come back with your next batch there are still messages sitting in a network buffer somewhere. For example, we have a channel that wakes up our sender, then we honest-to-god wait like 60ms in the hopes there will be more than one message to pick up - and there usually is in production.
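The shape of it, roughly (the 60ms is the number we landed on; everything else here is illustrative):

    package main

    import (
        "fmt"
        "time"
    )

    // Collect messages for a short window so each send carries a batch.
    func sender(wake chan []byte, send func([][]byte)) {
        for msg := range wake {
            batch := [][]byte{msg}
            timeout := time.After(60 * time.Millisecond) // linger, hoping for more
        collect:
            for {
                select {
                case m := <-wake:
                    batch = append(batch, m)
                case <-timeout:
                    break collect
                }
            }
            send(batch)
        }
    }

    func main() {
        wake := make(chan []byte, 1024)
        go sender(wake, func(batch [][]byte) { fmt.Println("sending", len(batch)) })
        wake <- []byte("hello")
        time.Sleep(100 * time.Millisecond)
    }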
1. When new messages are inserted, immediately append to a file on disk. Then, in batches, insert into SQLite (rough sketch after this list).
2. When dequeuing messages, keep n message IDs in memory as a ready queue, and then keep dequeued message IDs in another list. Those can be served immediately (using a SELECT which is fast) and then updating messages to the dequeued status can happen in batch.
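For (1), a simplified sketch - no fsync policy, framing, or crash recovery shown, and the schema is made up:

    package main

    import (
        "database/sql"
        "os"
        "time"

        _ "github.com/mattn/go-sqlite3"
    )

    func main() {
        // Fast path target: an O_APPEND log we can ack against quickly.
        wal, _ := os.OpenFile("incoming.log", os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
        db, _ := sql.Open("sqlite3", "queue.db")
        db.Exec("CREATE TABLE IF NOT EXISTS queue (message TEXT)")

        pending := make(chan string, 1024)

        // Enqueue: append to the log, hand off for batching.
        enqueue := func(msg string) {
            wal.WriteString(msg + "\n")
            pending <- msg
        }

        // Slow path: fold the log into SQLite, one transaction per tick.
        go func() {
            for range time.Tick(50 * time.Millisecond) {
                tx, _ := db.Begin()
                for n := len(pending); n > 0; n-- {
                    tx.Exec("INSERT INTO queue (message) VALUES (?)", <-pending)
                }
                tx.Commit()
            }
        }()

        enqueue("hello")
        time.Sleep(100 * time.Millisecond)
    }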
Appreciate the tips!
For 2. you could probably do something like this:
    BEGIN TRANSACTION;

    -- Select the oldest pending messages
    SELECT id, message FROM queue WHERE status = 'pending' ORDER BY created_at ASC LIMIT 100;

    -- Mark the selected batch as 'processing'; ? is the max created_at
    -- from the SELECT above (assuming created_at is monotonic)
    UPDATE queue SET status = 'processing' WHERE status = 'pending' AND created_at <= ?;

    COMMIT;
Basically, select a batch and then abuse the ordering properties to batch-mark them. Then all messages in your select you can dispatch evenly to sender threads. Sender threads can then signal a buffered channel that they've completed/failed, and the database can be updated. At startup, you can just SELECT where status = 'processing' and recover. This is a pretty decent translation of how ours works.
Love the red sweater btw, very classy yet simple.
The goal is for this to be able to run well on commodity hardware, which I think is possible. If I can run this on $platform with $cheap instances and an $autoscaler then that would be my ideal design goal because I think it matches the setup that most people have access to.
However, I agree - I am a big fan of Hetzner's servers and am excited to try out benchmarks on beefier hardware too.
Any type of licensing condition like AGPL 3.0 that forces you to share your code is a no-go for many, many companies.
I understand the desire to make money, and you can do that without AGPL 3.0.
What's unclear or not transparent about that?
I had no idea there would be so much push back against this.
The AGPL, for something like this, is a total non-starter, IMO, and just changes the conversation.
It'd be one thing to choose the AGPL because you believe in the FOSS movement. Or if you're releasing something end-users can benefit from, directly, by self hosting it.
To release something that will always be a component of something bigger, license it as AGPL, then talk about monetization… just cut to the chase and release it as “source available,” with a license people can actually use, even if it has a bunch of not-really-FOSS strings attached.
Because who can realistically use this as is? Who can download the source and actually do something with it? What AGPL-compatible FOSS project is dying to use a drop-in AWS SQS replacement?
An MIT/BSD license that imposes no control over how the code is used is the only FOSS we are allowed to use, and we do donate regularly to maintainers working selflessly.
What makes me angry is that somebody then uses that MIT/BSD license code, makes some modifications and releases it as AGPL with clear intent to make a buck from it without paying that original maintainer any money.
The problem is that AGPL is a total no starter at “day job,” and even at “side gig,” I'm obviously not going to release the source code of the entire service because of a message queue dependency.
I mean, if I was trying to switch clouds for whatever reason, I might want this, and could even be persuaded to pay to self host it.
But at this point there's no pay option. Just source I couldn't possibly use, not even for a month, not even as a trial, company policy or no company policy.
It makes no sense.
I'd much rather people just release closed-source proprietary software that I can pay for, where the value proposition is clear, than be led into surprises down the road where the community version is neglected.