Since you are just interested in the ranking, not the actual distance, you could also consider skipping the sqrt. This gives the same ranking, but will be a little faster.
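Something like this sketch (names are mine, not from the article): compare squared distances directly, since sqrt is monotonic and leaves the ordering unchanged.

function squaredDistance(a, b) {
  // Sum of squared component differences. No Math.sqrt needed:
  // sqrt preserves ordering for non-negative inputs, so the
  // closest match is the same either way.
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}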
Acerola worked a bit on this in 2024[1], using edge detection to layer correctly oriented |/-\ over the usual brightness-only pass. I think either technique has cases where one looks better than the other.
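Roughly, the oriented-glyph part could look like this (my sketch of the idea, not Acerola's actual code): quantize the local gradient direction into one of the four line characters.

function edgeChar(gx, gy) {
  // The gradient points across an edge; rotate 90 degrees to get the
  // edge direction itself. Note: with y growing downward (screen
  // coordinates), '/' and '\' may need swapping.
  let theta = (Math.atan2(gy, gx) * 180) / Math.PI + 90;
  theta = ((theta % 180) + 180) % 180; // fold into [0, 180)
  if (theta < 22.5 || theta >= 157.5) return "-";
  if (theta < 67.5) return "/";
  if (theta < 112.5) return "|";
  return "\\";
}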
Bravo, beautiful article! The rest of this blog is at this same level of depth, worth a sub: https://alexharri.com/blog
> It may seem odd or arbitrary to use circles instead of just splitting the cell into two rectangles, but using circles will give us more flexibility later on.
I still don’t really understand why the inner part of the rectangle can’t just be split in a 2x3 grid. Did I miss the explanation?
Not to take away from this truly amazing write-up (wow), but there's at least one generator that uses shape:
https://meatfighter.com/ascii-silhouettify/
See particularly the image right above where it says "Note how the algorithm selects the largest characters that fit within the outlines of each colored region."
There's also a description at the bottom of how its algorithm works, if anyone wants to compare.
Taking into account the shape of different ASCII characters is brilliant, though!
It reminds me of how chafa uses an 8x8 bitmap for each glyph: https://github.com/hpjansson/chafa/blob/master/chafa/interna...
There are a lot of nitty-gritty concerns I haven't dug into: how to make it fast, how to handle colorspaces, or, like the author mentions, how to exaggerate contrast for certain scenes. But I think 99% of the time it will be hard to beat chafa. Such a good library.
EDIT - a gallery of (Unicode-heavy) examples, in case you haven't seen chafa yet: https://hpjansson.org/chafa/gallery/
Here's a copy-paste snippet where you can try chafa-ascii-fying images in your own terminal, if you have uvx:
uvx --with chafa-py python -c '
from chafa import *
from chafa.loader import Loader
import sys

# Load the image given on the command line
img = Loader(sys.argv[1])

# Fit the canvas to the image (0.5 ~ terminal font cell aspect ratio)
config = CanvasConfig()
config.calc_canvas_geometry(img.width, img.height, 0.5, True, False)

# Restrict output to plain ASCII symbols
symbol_map = SymbolMap()
symbol_map.add_by_tags(SymbolTags.CHAFA_SYMBOL_TAG_ASCII)
config.set_symbol_map(symbol_map)

# Monochrome: default fg/bg only, no colors
config.canvas_mode = CanvasMode.CHAFA_CANVAS_MODE_FGBG

canvas = Canvas(config)
canvas.draw_all_pixels(img.pixel_type, img.get_pixels(), img.width, img.height, img.rowstride)
print(canvas.print().decode())
' \
myimage.jpg
But the results are not as good as the OP's work: https://wonger.dev/assets/chafa-ascii-examples.png So I'll revise my claim: chafa is great for Unicode-heavy, colorful environments, but hand-tailored ASCII-only work like the OP's is worth the effort. And damn, that article is so cool. What a rabbit hole.
Lovely article, and the dynamic examples are :chefs-kiss:
I think there's a small problem with intermediate values in this code snippet:
const maxValue = Math.max(...samplingVector)
samplingVector = samplingVector.map((value) => {
value = x / maxValue; // Normalize
value = Math.pow(x, exponent);
value = x * maxValue; // Denormalize
return value;
})
Replace x with value, as sketched below. It probably produces a different-looking result, though.
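That is, assuming the intent is normalize, then pow, then denormalize:

const maxValue = Math.max(...samplingVector)
samplingVector = samplingVector.map((value) => {
  value = value / maxValue; // Normalize
  value = Math.pow(value, exponent);
  value = value * maxValue; // Denormalize
  return value;
})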
GitHub: https://github.com/symisc/ascii_art/blob/master/README.md Docs: https://pixlab.io/art
How do you arrive at that? It's presented like it's a natural conclusion, but if I were trying to adjust contrast... I don't see the connection.
> Consider how an exponent affects values between 0 and 1. Numbers close to 0 experience a strong pull towards 0, while larger numbers experience less pull. For example, 0.1^2 = 0.01, a 90% reduction, while 0.9^2 = 0.81, only a reduction of 10%.
That's exactly the reason why it works; it's even nicely visualized below. If you've dealt with similar problems before, you might know this in the back of your head. E.g. you may have had a problem where you wanted to measure distance from 0 but wanted to remove the sign. You may have tried absolute value and squaring, and noticed that the latter has the additional effect described above, as in the one-liner below.
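A trivial illustration:

console.log(Math.abs(-0.1), Math.pow(-0.1, 2)); // 0.1 vs ~0.01: both drop the sign, but squaring also shrinks small values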
It's a bit like a math undergrad wondering about a proof 'I understand the argument, but how on earth do you come up with this?'. The answer is to keep doing similar problems and at some point you've developed an arsenal of tricks.
I feel confident stating that - unless fed something comprehensive like this post as input, and perhaps not even then - an LLM could not do something novel and complex like this, and will not be able to for some time, if ever. I’d love to read about someone proving me wrong on that.
Everyone seems familiar with hallucinations by now: when a model's knowledge is lacking but it is fine-tuned to give an answer anyway. A simplistic calculation says that if an accurate answer gets you 100%, then giving an answer at all gets you 50% and being accurate gets you the other 50%. Hallucinations are trying to get partial credit for bullshit. Teaching a model that a wrong answer is worse than no answer is the obvious solution; turning that lesson into training methods is harder.
That's a bit of a digression, but I think it helps explain why I think a model would find writing an article like this difficult.
Models have difficulty understanding what is important. The degree to which they do achieve this is amazing, but they are still trained on data that heavily biases their conclusions toward mainstream thinking. In that respect I'm not even sure it is a fundamental lack in what they could do. It seems they are implicitly made to think of problems as "it's one of those, I'll do what people do when faced with one of those".
There are even hints in fiction that this is what we were going to do. There is a fairly common sci-fi trope of an AI giving a thorough and reasoned analysis of a problem only to be cut off by a human wanting the simple and obvious answer. If not done carefully RLHF becomes the embodiment of this trope in action.
This gives a result that makes the most people immediately happy, without regard for what is best long term, or indeed what is actually needed. Asimov explored the notion of robots lying so as to not hurt feelings. Much of the point of the robot books was to express the notion that what we want AI to be is more complicated than it appears at first glance.
Simply trying every character, considering its entire bitmap, and keeping the one that minimizes the distance to the target gives better results, at the cost of more CPU (sketch below).
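As a sketch (glyphBitmaps and cellPixels are hypothetical same-length brightness arrays, not from the article):

function bestGlyph(cellPixels, glyphBitmaps) {
  // Brute force: compare the target cell against every character's
  // full bitmap and keep the closest match by squared error.
  let bestChar = " ";
  let bestError = Infinity;
  for (const [ch, bitmap] of Object.entries(glyphBitmaps)) {
    let error = 0;
    for (let i = 0; i < bitmap.length; i++) {
      const d = bitmap[i] - cellPixels[i];
      error += d * d;
    }
    if (error < bestError) {
      bestError = error;
      bestChar = ch;
    }
  }
  return bestChar;
}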
This is a well-known problem because early computers with monitors could only display characters, not arbitrary pixels.
At some point we were able to define custom character bitmaps, but not enough custom characters to cover the entire screen, so the problem became more complex: which new characters do you create to reproduce an image optimally?
And separately we could choose the foreground/background color of individual characters, which opened up more possibilities.
I'd probably arrive at a very different solution if coming at this from a "you've got infinite compute resources, maximize quality" angle.
For example, limiting output to a small set of characters gives it a more uniform look which may be nicer. Then also there’s the “retro” effect of using certain characters over others.
Reminds me of this underrated library, which uses the Braille alphabet to draw lines. Behold:
https://github.com/tammoippen/plotille
It's a really nice plotting tool for the terminal. For me it increases the utility of LLMs.
The code for this post is all in PR #15 if you want to take a look.
I am, however, struck by the (from an outsider's POV) highly niche-specific terminology used in the title.
"ASCII rendering".
Yes, I know what ASCII is. I understand text rendering in sometimes painful detail. This was something else.
Yes, it's a niche and niches have their own terminologies that may or may not make sense in a broader context.
The HN guidelines say: "Otherwise please use the original title, unless it is misleading or linkbait; don't editorialize."
I'm not sure what the best course of action is here; perhaps nothing. I keep bumping into this issue all the time at HN, though: the titles very often don't include the context/niche.
More seriously, with colors (probably not trivial, as it adds another dimension) and some select Unicode characters, this could produce really fancy renderings in consoles!