I'm just a bit skeptical about this quote:
> Harper takes advantage of decades of natural language research to analyze exactly how your words come together.
But it's just a rather small collection of hard-coded rules:
https://docs.rs/harper-core/latest/harper_core/linting/trait...
Where did the decades of classical NLP go? No gold-standard resources like WordNet? No statistical methods?
There's nothing wrong with this; the solution is a good pragmatic choice. It's just interesting how our collective consciousness of expansive scientific fields can be so thoroughly purged when a new paradigm arises.
LLMs have completely overshadowed ML NLP methods from 10 years ago, and they themselves replaced decades of statistical NLP work, which in turn replaced another few decades of symbolic grammar-based NLP work.
Progress is good, but it's important not to forget all those hard-earned lessons; it can sometimes be a real superpower to be able to leverage that old toolbox in modern contexts. In many ways, we had much more advanced methods in the 60s for solving this problem than what Harper is doing here by naively reinventing the wheel.
Before our rule engine has a chance to touch the document, we run several pre-processing steps that imbue the words it reads with semantic meaning.
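To make that concrete for readers: here's a hypothetical sketch (not Harper's actual code or API; the lexicon, types, and rule below are invented) of what "tag first, then run rules" can look like. Tokens get word-class metadata from a lexicon, so rules can match on classes rather than raw strings:

    use std::collections::HashMap;

    #[derive(Debug, Clone, Copy, PartialEq)]
    enum WordClass { Determiner, Noun, Verb, Unknown }

    struct Token<'a> { text: &'a str, class: WordClass }

    // Pre-processing step: attach a word class to every token.
    fn tag<'a>(text: &'a str, lexicon: &HashMap<&str, WordClass>) -> Vec<Token<'a>> {
        text.split_whitespace()
            .map(|w| Token { text: w, class: *lexicon.get(w).unwrap_or(&WordClass::Unknown) })
            .collect()
    }

    fn main() {
        // Toy lexicon; a real checker would use a full dictionary with affix data.
        let lexicon = HashMap::from([
            ("the", WordClass::Determiner),
            ("dog", WordClass::Noun),
            ("barks", WordClass::Verb),
        ]);
        let tokens = tag("the barks dog", &lexicon);
        // A rule can now inspect classes instead of surface strings.
        for pair in tokens.windows(2) {
            if pair[0].class == WordClass::Determiner && pair[1].class == WordClass::Verb {
                println!("possible error: determiner '{}' directly before verb '{}'",
                         pair[0].text, pair[1].text);
            }
        }
    }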
> LLMs have completely overshadowed ML NLP methods from 10 years ago, and they themselves replaced decades of statistical NLP work, which in turn replaced another few decades of symbolic grammar-based NLP work.
This is a drastic oversimplification. I'll admit that transformer-based approaches are indeed quite prevalent, but I do not believe that "LLMs" in the conventional sense are "replacing" a significant fraction of NLP research.
I appreciate your skepticism and attention to detail. For a rough picture of the lineage that led to modern LLMs, these are good reads:
1. https://jalammar.github.io/illustrated-word2vec/
2. https://jalammar.github.io/visualizing-neural-machine-transl...
3. https://jalammar.github.io/illustrated-transformer/
4. https://jalammar.github.io/illustrated-bert/
5. https://jalammar.github.io/illustrated-gpt2/
And from there it's mostly work on improving optimization (both at training and inference time), training techniques (many stages), data (quality and modality), and scale.
---
There are also state space models, but I don't believe they've gone mainstream yet.
https://newsletter.maartengrootendorst.com/p/a-visual-guide-...
And diffusion models, but I'm struggling to find a good resource, so: https://ml-gsai.github.io/LLaDA-demo/
---
All this being said: many tasks are solved very well using a linear model and TF-IDF, and the results are actually interpretable.
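A minimal sketch of that TF-IDF-plus-linear-model recipe on a toy sentiment task (the corpus, learning rate, and epoch count are purely illustrative, not from any real dataset):

    use std::collections::HashMap;

    // TF-IDF vector for one document, given document frequencies over the corpus.
    fn tfidf<'a>(doc: &'a str, df: &HashMap<&str, f64>, n_docs: f64) -> HashMap<&'a str, f64> {
        let mut tf: HashMap<&str, f64> = HashMap::new();
        for w in doc.split_whitespace() {
            *tf.entry(w).or_insert(0.0) += 1.0;
        }
        tf.into_iter()
            .map(|(w, c)| {
                let idf = (n_docs / (1.0 + df.get(w).copied().unwrap_or(0.0))).ln() + 1.0;
                (w, c * idf)
            })
            .collect()
    }

    fn main() {
        // Toy labelled corpus: 1.0 = positive, 0.0 = negative.
        let docs = [
            ("great product works well", 1.0),
            ("terrible waste of money", 0.0),
            ("works great highly recommend", 1.0),
            ("money wasted terrible quality", 0.0),
        ];

        // Document frequencies for the IDF term.
        let mut df: HashMap<&str, f64> = HashMap::new();
        for &(doc, _) in &docs {
            let mut seen: Vec<&str> = Vec::new();
            for w in doc.split_whitespace() {
                if !seen.contains(&w) {
                    seen.push(w);
                    *df.entry(w).or_insert(0.0) += 1.0;
                }
            }
        }
        let n_docs = docs.len() as f64;

        // Logistic regression trained with plain gradient descent.
        let mut weights: HashMap<&str, f64> = HashMap::new();
        let lr = 0.5;
        for _ in 0..200 {
            for &(doc, label) in &docs {
                let x = tfidf(doc, &df, n_docs);
                let z: f64 = x.iter()
                    .map(|(w, v)| weights.get(w).copied().unwrap_or(0.0) * v)
                    .sum();
                let p = 1.0 / (1.0 + (-z).exp());
                for (w, v) in &x {
                    *weights.entry(*w).or_insert(0.0) += lr * (label - p) * v;
                }
            }
        }

        // The interpretability payoff: one signed weight per word.
        let mut by_weight: Vec<_> = weights.into_iter().collect();
        by_weight.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
        for (w, wt) in by_weight {
            println!("{w:>12} {wt:+.3}");
        }
    }

After training, each word carries a single signed weight you can read off directly, which is where the interpretability comes from.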
Indeed, before that there was a lot of work on applying classical ML classifiers (Naive Bayes, Decision Trees, SVMs, Logistic Regression...) and clustering algorithms (fancily referred to as unsupervised ML) to bag-of-words vectors. This was a big field, with some overlap with Information Retrieval, which led to fancier weightings and normalizations of bag-of-words vectors (TF-IDF, BM25). There was also the whole field of Topic Modeling.
Before that there was a ton of statistical NLP modeling (Markov chains and such), primarily focused on machine translation before neural networks got good enough (like the early versions of Google Translate).
And before that there were a few decades of research on grammars (starting with Chomsky), with a lot of overlap with compilers, theoretical CS (state-machines and such) and symbolic AI (lisps, logic programming, expert systems...).
I myself don't have a very clear picture of all of this. I learned some in undergrad and read a few ancient NLP books (60s - 90s) out of curiosity. I started around the time when NLP, and AI in general, had been rather stagnant for a decade or two. It was rather boring and niche, believe it or not, but was starting to be revitalized by the new wave of ML, and then by word2vec and DNNs.
The Neovim configuration for the LSP looks neat: https://writewithharper.com/docs/integrations/neovim
The whole thing seems cool. Automattic should mention this on their homepage. Tools like this are the future of something.
(^^ alien language that was developed in less than a decade)
Not an insurmountable problem: ChatGPT will use "aight fam" only in context-sensitive ways and will remove it if you ask it to rephrase to sound more like a professor. But RLHFing slang into predictable use is likely a bigger potential challenge than simply ensuring the word list of an open-source program is up to date enough to include slang whose etymology dates back to the noughties or nineties, if phrasing things in that particular vernacular is even a target for your grammar-linting tool...
aight, trippin, fr (at least the spoken version), and fam were all very common in the 1990s (which was the last decade I was able to speak like that without getting jeered at by peers).
Also, it takes at most a few developers to write those rules into a grammar-checking system, compared to the millions and more who need to learn a given piece of "evolved" language as it becomes impossible to avoid. It's not only fast enough to do this manually; it's also much less work-intensive and more scalable.
Or, in other words: if you "just" want a utility that can learn speech on the fly, you don't need a rigid grammar checker, just a good enough approximator. If you want to check if a document contains errors, you need to define what an error is, and then if you want to define it in a strict manner, at that point you need a rule engine of some sort instead of something probabilistic.
Languages are used to successfully communicate. To achieve this, all parties involved in the communication must know the language well enough to send and receive messages. This obviously includes messages that transmit changes in the language, for instance, if you tried to explain to your parents the meaning of the current short-lived meme and fad nouns/adjectives like "skibidi ohio gyatt rizz".
It takes time for a language feature to become widespread and de-facto standardized among a population. This is because people need to asynchronously learn it, start using it themselves, and gain critical mass, so that even people who do not like the feature need to start respecting its presence. This inertia is the main source of the slowness I mention, and also a requirement for any kind of grammar-checking software. From the point of view of such software, a language feature that (almost) nobody understands is not a language feature, but an error.
> You also seem to be making an assumption that natural languages (English, I'm assuming) can be well defined by a simple set of rigid patterns/rules?
Yes, that set of patterns is called a language's grammar. Even dialects and slang have grammars of their own, even if they're different, less popular, have fewer formal materials describing them, and/or aren't taught in schools.
I find that clear-cut, rigid rules tend to be the least helpful ones in writing. Obviously this class of rule is also easy/easier to represent in software, so it also tends to be the source of false positives and frustration that lead me to disable such features altogether.
When writing for utility and communication, though, English grammar is simple and standard enough. Browsing Harper sources, https://github.com/Automattic/harper/blob/0c04291bfec25d0e93... seems to have a lot of the basics already nailed down. Natural language grammar can often be represented as "what is allowed to, should, or should not, appear where, when, and in which context" - IIUC, Harper seems to tackle the problem the same way.
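As a toy illustration of that "what should appear where, in which context" framing (this is not Harper's code, and the rule is deliberately naive, ignoring exceptions like "an hour" or "a university"):

    // Flag "a" before a vowel-initial word and "an" before a consonant-initial word.
    fn check_articles(text: &str) -> Vec<String> {
        let words: Vec<&str> = text.split_whitespace().collect();
        let mut lints = Vec::new();
        for pair in words.windows(2) {
            let (article, next) = (pair[0].to_lowercase(), pair[1].to_lowercase());
            let vowel_start = next.chars().next().map_or(false, |c| "aeiou".contains(c));
            if article == "a" && vowel_start {
                lints.push(format!("Use `an` before `{}`", pair[1]));
            } else if article == "an" && !vowel_start {
                lints.push(format!("Use `a` before `{}`", pair[1]));
            }
        }
        lints
    }

    fn main() {
        for lint in check_articles("I ate a apple and an banana") {
            println!("{lint}");
        }
    }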
Even these few posts follow innumerable “rules” which make it easier to (try to) communicate.
Perhaps what you’re angling against is where rules of language get set in stone and fossilized until the “Official” language is so diverged from the “vulgar tongue” that it’s incomprehensibly different.
Like church/legal Latin compared to Italian, perhaps. (Fun fact - the Vulgate translation of the Bible was INTO the vulgar tongue at the time: Latin).
Certainly we would never want our language to be less expressive. There’s no point to that.
And what would be the point of changing for the sake of change? Sure, we blop use the word ‘blop’ instead of the word ‘could’ without losing or gaining anything, but we’d incur the cost of changing books and schooling for … no gain.
Ah, but it’d be great to increase expressiveness, right? The thing is, as far as I am aware all human languages are about equal in terms of expressiveness. Changes don’t really move the needle.
So, what would the point of evolution be? If technology impedes it … fine.
Being equally expressive overall, but more focussed where current needs are.
OTOH, I don't think anything is going to stop language from evolving in that way.
Agreed. Same with those non-ASCII single and double quotes.
https://github.com/languagetool-org/languagetool
I generally run it in a Docker container on my local machine:
https://hub.docker.com/r/erikvl87/languagetool
I haven't messed with Harper closely but I am aware of its existence. It's nice to have options, though.
It would sure be nice if the Harper website made clear that one of the two competitors it compares itself to can also be run locally.
https://dev.languagetool.org/finding-errors-using-n-gram-dat...
I would suggest diving into it more because it seems like you missed how customizable it is.
I also only ever used the web app, copy+pasting into it, since installing the app is for all intents and purposes installing a keylogger.
Grammar works on rules; I'm not sure why that needs an LLM. Grammarly certainly worked better for me when it was dumber and rule-based.
It's not a problem; I make the determination which option I like better, but it is funny.
Not that I think an LLM is always better, but it would be interesting to compare these two approaches.
Given LISP was supposed to build "The AI"... pretty sad that a dumb LLM is taking its place now.
They must have acquired fantastic data for their models, especially given the business language and professional translations they focus on.
They keep your intended message intact and just refine it, like post-editing a book. Grammarly and other tools force you to sound the way they think is best.
DeepL shows, in my opinion, how much more useful a model trained for specific uses is.
So just like English teachers I see
I've relied on Grammarly to spellcheck all my writing for a few years (dyslexia prevents me from seeing the errors even when reading it 10 times). However, its increasing focus on LLMs and its insistence on rewriting sentences in more verbose ways bother me a lot. (It removes personality and makes human-written text read like AI text.)
So I've tried out alternatives, and Harper is the closest I've found at the moment... but I still feel like Grammarly does a better job at basic word suggestions.
Really, all I wish for is a spellcheck that can use the context of the sentence to suggest words. Most ordinary dictionary spellchecks can pick the wrong word because it's closer in spelling. They may replace "though" with "thought" because I wrote "thougt", when the sentence clearly indicates "though" is correct; and I see no difference visually between any of the three words.
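Here's roughly what that kind of context-aware suggestion could look like: candidates close in spelling are re-ranked by how well they fit the neighbouring words. The bigram counts below are invented for illustration; a real checker would use corpus statistics or a small language model.

    use std::collections::HashMap;

    // Standard Levenshtein edit distance with a rolling row.
    fn edit_distance(a: &str, b: &str) -> usize {
        let (a, b): (Vec<char>, Vec<char>) = (a.chars().collect(), b.chars().collect());
        let mut prev: Vec<usize> = (0..=b.len()).collect();
        for i in 1..=a.len() {
            let mut cur = vec![i];
            for j in 1..=b.len() {
                let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
                cur.push((prev[j] + 1).min(cur[j - 1] + 1).min(prev[j - 1] + cost));
            }
            prev = cur;
        }
        prev[b.len()]
    }

    fn main() {
        let dictionary = ["though", "thought", "tough", "through"];
        // Made-up counts of (previous word, candidate) pairs.
        let bigrams: HashMap<(&str, &str), u32> = HashMap::from([
            (("even", "though"), 50),
            (("even", "thought"), 2),
            (("i", "thought"), 40),
        ]);

        let previous = "even";
        let typo = "thougt";

        let mut candidates: Vec<(&str, usize, u32)> = dictionary
            .iter()
            .map(|&w| (w, edit_distance(typo, w), *bigrams.get(&(previous, w)).unwrap_or(&0)))
            .filter(|&(_, dist, _)| dist <= 2)
            .collect();
        // Prefer a close spelling, then the candidate that best fits the context.
        candidates.sort_by(|a, b| a.1.cmp(&b.1).then(b.2.cmp(&a.2)));
        println!("suggestions for `{previous} {typo}`: {candidates:?}");
    }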
There are some areas where it seems like LLMs (or even SLMs) should be way more capable. For example, when I touch a word on my Kindle, I'd think Amazon would know how to pick the most relevant definition. Yet it just grabs the most common definition. For example, consider the proper definition of "toilet" in this passage: "He passed ten hours out of the twenty-four in Saville Row, either in sleeping or making his toilet."
No errors detected. So this needs a lot of rule contributions to get to Grammarly level.
> In large, this is _how_ anything crawler-adjacent tends to be
It suggests
> In large, this is how _to_ anything crawler-adjacent tends to be
Even in British English I'm not sure how widely they actually use it - do they say "I've a car" and "I haven't a car"?
Contractions are common in Australian English too, though becoming less so due to the influence of US English.
Has to be a bug in their rule-based system?
Using an LLM would also help make it multilingual. Both Grammarly and Harper only support English and will likely never support more than a few dozen very popular languages. LLMs could help cover a much wider range of languages.
LLMs are trained so hard to be helpful that it's really difficult to confine them to other tasks.
It is, of course, mostly very good at it, but it's very far from "trustworthy", and it tends to mirror the mistakes you make.
https://writewithharper.com/docs/integrations/language-serve...
https://automattic.com/2024/11/21/automattic-welcomes-harper...
I honestly don't trust Grammarly... I mean, it's essentially a keylogger.
I did try it a bit once, and I never seemed to have it work that well for me. But I am multilingual, so maybe that's part of my hurdle.
Do you have a setup where this is possible or do you copy paste between text fields? (Genuine question. I’d love to use a local LLM integrating with an LSP).
I used Grammarly briefly when it came out and liked the idea. Admittedly it has more polish than Vale [0] for people writing in Google Docs, &c. Still, I stick with Vale. Is there any case for moving to Harper?
[0] https://vale.sh/
It’s missing a default rule set with rules that are generally okay without being too opinionated.
Passes.
For reference: https://youtu.be/w-R_Rak8Tys?si=h3zFCq2kyzYNRXBI
Otherwise, it's great work. There should be an option to import/export the correction rules though.
I guess it's a nice and lightweight enhancement on top of the good old spellchecker, though.
I wonder whether it will impact performance (in Firefox) and make things noticeably slower...
Recently I noticed that highlighting extensions in Firefox were slowing things down significantly, not just when loading but also while scrolling up and down web pages.
Why would you pass a writing job to someone who isn't 100% fluent in the language and then make up for it by buying complex tools?
Instead tell me how it compares to the built-in spellcheck in my browser/IDE/word processor/OS.
The Chrome enhanced grammar checker is still awful after decades.
Maybe the AI hype will finally fix this? I'm still surprised this wasn't the first thing they did.
https://tidbits.com/2025/01/30/why-grammarly-beats-apples-wr...
(George Carlin or something, quote's veracity depends on what you mean by “average.”)
I think everybody could benefit from having something like Grammarly on their computer. None of us writes perfectly, and it's always beneficial to strive for improvement.
Also, once I asked an LLM to check a message. It said everything looked fine and included a copy of the message in its response with one sentence in the middle removed.
We've had some contributors have a go at adding LaTeX support in the past, but they've yet to succeed with a truly polished option. The irregularity of LaTeX makes it somewhat difficult to parse.
We accept contributions, if anyone is interested in getting us across the finish line.
Is there any reason why there is no Firefox extension?
If Harper does better at this I’d change in a minute.
I.e. if you write a "MISTAEK" and then scroll, the highlight follows you around the page.
I tried with the following phrase -- "This should can't logic be done me." --
No errors.
> We currently only support English and its dialects British, American, Canadian, and Australian. Other languages are on the horizon, but we want our English support to be truly amazing before we diversify.
Then, post-COVID, with the increase in screen-sharing video calls, I soon realised that nearly every non-native English speaker from countries around the world heavily relied on it in their job, as I could see it installed when people shared screens.
Huge market, good luck.
https://writewithharper.com/docs/rules
https://github.com/Automattic/harper/blob/0c04291bfec25d0e93...
"PointIsMoot" => (
["your point is mute"],
["your point is moot"],
"Did you mean `your point is moot`?",
"Typo: `moot` (meaning debatable) is correct rather than `mute`."
),
That it doesn't use LLMs is its advantage: it runs in under 10 ms, can be easily embedded in software, and still provides useful grammar checking, even if it's not exhaustive.