Show HN: Off Grid – Run AI text, image gen, vision offline on your phone

Your phone has a GPU more powerful than most 2018 laptops. Right now it sits idle while you pay monthly subscriptions to run AI on someone else's server, sending your conversations, your photos, and your voice to companies whose privacy policies you've never read. Off Grid is an open-source app that puts that hardware to work: text generation, image generation, vision AI, and voice transcription, all running on your phone, all offline, with nothing ever uploaded.

That means you can use AI on a flight with no wifi. In a country with internet censorship. In a hospital where cloud services are a compliance nightmare. Or just because you'd rather not have your journal entries sitting in someone's training data.

The tech: llama.cpp for text (15-30 tok/s, any GGUF model), Stable Diffusion for images (5-10s on Snapdragon NPU), Whisper for voice, SmolVLM/Qwen3-VL for vision. Hardware-accelerated on both Android (QNN, OpenCL) and iOS (Core ML, ANE, Metal).
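
To put the 15-30 tok/s figure in perspective, here is a rough back-of-the-envelope sketch. The 300-token reply length is an assumption chosen for illustration, not a number from the app, and the calculation ignores prompt processing time.

```python
def reply_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady decode rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


REPLY_TOKENS = 300  # assumed reply length, for illustration only

# The two ends of the quoted 15-30 tok/s range.
for rate in (15, 30):
    print(f"{rate} tok/s -> {reply_seconds(REPLY_TOKENS, rate):.0f} s")
```

Under those assumptions, a reply of a few paragraphs takes roughly 10 to 20 seconds.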

MIT licensed. Android APK on GitHub Releases. Build from source for iOS.

sangaya · 4 hours ago

Putting the power and the data of the users in the hands of the users themselves! Well done. Getting it set up was easy. I wish the app detected when the keyboard is displayed so the bottom menu and chat box weren't hidden under it.

ali_chherawalla · 3 hours ago

Thank you!

nine_k · 3 hours ago

Is there something similar, but geared towards a Linux desktop / laptop? I suppose this would be relatively easy to adapt.

ali_chherawalla · 3 hours ago

LM Studio solves that pretty well, I think. It doesn't do image gen etc., though.

resonious · 5 hours ago

On my Samsung phone it doesn't move the screen up to make room for the keyboard so I can't see what I'm typing.

Really awesome idea though. I want this to work.

dotancohen · 3 hours ago

I can confirm this bug on a Samsung S24 Ultra.

ali_chherawalla · 3 hours ago

Sorry about that one. I'm taking a look and fixing it right away.

ali_chherawalla · 2 hours ago

Hey, just pushed a fix for it here: https://github.com/alichherawalla/off-grid-mobile/releases/t...

Thanks for spotting and reporting this.

bkmeneguello · 6 hours ago

Very nice, but I'm gonna wait for the F-Droid build.

ali_chherawalla · 3 hours ago

OK, let me figure that one out.

flyingkiwi44 · 5 hours ago

The repository is listed as offgrid-mobile everywhere on that page, but it is actually off-grid-mobile.

So the latest release is at https://github.com/alichherawalla/off-grid-mobile/releases/l...

And the clone would be: git clone https://github.com/alichherawalla/off-grid-mobile.git

ali_chherawalla · 3 hours ago

Hey, yes. Thanks, just pushed that fix out.

flyingkiwi44 · 1 hour ago

Looks like the SDK and NDK requirements in the build instructions don't match build.gradle.

The build instructions list:

- Android SDK (API 34)

- Android NDK r26

But build.gradle has:

compileSdkVersion = 36

targetSdkVersion = 36

ndkVersion = "27.1.12297006"
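
A mismatch like this can be caught automatically. The following is a minimal sketch, not part of the project: the `BUILD_GRADLE` string is reconstructed from the values quoted in this comment rather than copied from the repository, and `DOCUMENTED`, `parse_gradle`, and `mismatches` are hypothetical names introduced for illustration.

```python
import re

# Reconstructed from the values quoted above, not the real file.
BUILD_GRADLE = '''
compileSdkVersion = 36
targetSdkVersion = 36
ndkVersion = "27.1.12297006"
'''

# What the build instructions claimed (API 34, NDK r26).
DOCUMENTED = {"compileSdkVersion": "34", "ndkMajor": "26"}


def parse_gradle(text: str) -> dict:
    """Extract SDK levels and the NDK major version from build.gradle text."""
    found = {}
    for key in ("compileSdkVersion", "targetSdkVersion"):
        match = re.search(rf"{key}\s*=\s*(\d+)", text)
        if match:
            found[key] = match.group(1)
    # NDK release rNN corresponds to a version string starting with "NN."
    match = re.search(r'ndkVersion\s*=\s*"(\d+)\.', text)
    if match:
        found["ndkMajor"] = match.group(1)
    return found


def mismatches(documented: dict, actual: dict) -> dict:
    """Map each differing key to a (documented, actual) pair."""
    return {
        key: (value, actual.get(key))
        for key, value in documented.items()
        if actual.get(key) != value
    }


print(mismatches(DOCUMENTED, parse_gradle(BUILD_GRADLE)))
```

Run against the values above, this flags both the SDK level and the NDK major version as out of date in the instructions.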

ali_chherawalla · 42 minutes ago

Yup, just updated. Thanks!

vachina · 5 hours ago

OK, it lists the instructions to build for iOS, but how do you sideload?

behole · 5 hours ago

I found more info: https://github.com/alichherawalla/off-grid-mobile/blob/main/...

ali_chherawalla · 3 hours ago

Yeah, that's right. For now, on iOS you'll actually have to run pod install etc. yourself.

Thanks for pointing this out.

instagib · 4 hours ago

Wonder if you can use GitHub Actions to build for iOS.

I found a guide for running macOS in VirtualBox, which failed on Intel, then another for Hyper-V, but I haven't tried that one yet.

derac · 6 hours ago

I haven't run it, but I looked through the repo. It looks very well thought out, and the UI is nice. I appreciate the ethos behind the local/offline design. Cheers.

ali_chherawalla · 3 hours ago

thank you!

snicky · 5 hours ago

GitHub Releases link is broken.

The dash in "off-grid" is missing.

ali_chherawalla · 3 hours ago

yup, just took a look at that and fixed it. My bad!

chr15m · 3 hours ago

This rules. Godspeed!

ali_chherawalla · 3 hours ago

wow. thank you

noodletheworld · 1 hour ago

Nice idea, but isn't this kind of daft?

There are basically no useful models that run on phone hardware.

> Results vary by model size and quantization.

I bet they do.

Look, if you can't run models on your desktop, there's no way in hell they run on your phone.

The problem with all of these self hosting solutions is that the actual models you can run on them aren't any good.

Not like, “ChatGPT a year ago” not good.

Like, “it's a potato pop pop” no good.

Unsloth has a good guide on running Qwen3 (1), and the TL;DR is basically that it's not really good unless you run a big version.

The iPhone 17 Pro has 12 GB of RAM.

That is, to be fair, enough to run some small Stable Diffusion models, but it isn't enough to run a decent quant of Qwen3.

You need about 64 GB for that.

So… I dunno. This feels like a bunch of empty promises; yes, technically it can run some models, but how useful is it actually?

Self hosting needs next gen hardware.

This gen of desktop hardware isn't good enough, even remotely, to compare to server API options.

Running on mobile devices is probably still a way away.

(1) - https://unsloth.ai/docs/models/qwen3-how-to-run-and-fine-tun...

resonious · 1 hour ago

The app is basically just a wrapper that makes it super easy to set this up, which I'm very thankful for. I sometimes want to toy with this stuff but the amount of tinkering and gluing things together needed to just get a chat going is always too much for me. The fact that the quality of the AI isn't good is just the models not being quite there yet. If the models get better, this app will be killer.

If there's a similar app for desktop that can set up the stronger models for me, I'd love to hear about it.

ali_chherawalla · 53 minutes ago

LM Studio does it well. Along with being a system integrator for SD and text models, I've tried to create a very good chat experience. So there's some sauce over there with prompt enhancements, auto-detection of images, English transcription support, etc.

ali_chherawalla · 51 minutes ago

I actually think you should give it a spin. IMO you don't need Claude-level performance for a lot of day-to-day tasks. Qwen3 8B, or even 4B quantized, is actually quite good. Take a look at it. You can offload to the GPU as well, which should really help with speed. There's a setting for it.
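
As a rough sanity check on the model sizes discussed in this thread, the memory needed for the weights alone is approximately parameters times bits per weight divided by 8. The sketch below assumes about 4.5 bits per weight as a stand-in for a typical 4-bit quantization (an assumption, since the real figure depends on the quant format) and leaves out the KV cache and runtime overhead, so actual usage will be higher.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


ASSUMED_BITS = 4.5  # assumed average for a 4-bit quant; varies by format

# Model sizes mentioned in the thread (1B, 4B, 8B parameters).
for label, params in (("1B", 1), ("4B", 4), ("8B", 8)):
    print(f"{label}: ~{weight_gib(params, ASSUMED_BITS):.1f} GiB of weights")
```

By this estimate an 8B model at roughly 4 bits needs on the order of 4 GiB for weights, which is why the smaller models are the safer choice on phones with 8 GB of RAM.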

iddan · 1 hour ago

I think if people know how accessible it is to run local LLMs on their devices, they will consider buying devices with more memory that can run better models. Local LLMs are game changers in the long run.

ali_chherawalla · 50 minutes ago

I agree. I mean mobile devices have only been getting more and more powerful.

ImPostingOnHN · 53 minutes ago

It seems like a good solution for those living under a regime that censors communication, free information flow, and LLM usage. Especially with a model that contains useful information.

imposter · 52 minutes ago

[dead]

wittlesus · 3 hours ago

This is genuinely exciting. The fact that you're getting 15-30 tok/s for text gen on phone hardware is wild — that's basically usable for real conversations.

Curious about a couple things: what GGUF model sizes are practical on a mid-range phone (say 8GB RAM)? And how's the battery impact during sustained inference — does it drain noticeably faster than, say, a video call?

The privacy angle is the real killer feature here IMO. There are so many use cases (journaling, health tracking, sensitive work notes) where people self-censor because they know it's going to a server somewhere. Removing that barrier entirely changes what people are willing to use AI for.

durhamg · 2 hours ago

This sounds exactly like Claude wrote it. I've noticed Claude saying "genuinely" a lot lately, and the "real killer feature" segue just feels like Claude being asked to review something.

ali_chherawalla · 2 hours ago

I've added a section for recommended models, so you can choose from there.

I'd recommend going for any quantized 1B-parameter model. You can look at Llama 3.2 1B, Gemma 3 1B, or Qwen3-VL 2B (if you'd like vision).

Appreciate the kind words!

add-sub-mul-div · 2 hours ago

> that's basically usable for real conversations.

That's using the word "real" very loosely.