> Indo-European languages typically use the Latin alphabet
After the first sentence, "Indo-European" seems to have transformed to just "European" in the author's mind. Hindi and Bengali, languages more widely spoken than half the languages in that list, seem to have been forgotten, along with their Devanagari and Bengali scripts.
(Over the course of the article, it's seeming like the author just wanted to say European languages, or languages using Latin script, and for some reason chose to use Indo-European instead, despite clearly stating the definition themself.)
Yes, you are right. I am not a linguist, so I have little knowledge of the "Indo" languages.
Originally I used the term "Germanic languages", then I found that Spanish is not a Germanic language, so I switched to "Indo-European languages".
This needs a fix for sure.
Also worth noting that out of those you listed, Russian is also not a Germanic language (it's Slavic), and does not use the Latin alphabet.
Can we just say Greek-based alphabets?
• Indic scripts need the renderer to support complex text shaping, or else the text will generally be illegible, as though you were drawing your letters wrong, stacking some vowels on top of each other, and other nonsense things like that. As an example, if you’re not familiar with Indic scripts, see the code points used to write my name in Telugu, and how they contribute to the rendering: https://temp.chrismorgan.info/%E0%B0%95%E0%B1%8D%E0%B0%B0%E0.... It’s basically “letter ka, delete the vowel, letter ra, delete the vowel, add vowel i, letter sa, delete the vowel”, but the “kri” will normally be joined together into a conjunct, with the vowel sign drawn on the first consonant, and the second consonant being drawn in a completely different way from normal, which may even affect layout by font—the r conjunct can be a semicircle below, as in that font, but it can also be a curve beginning on the left, shifting the k to the right. (Me, I like the curve style for no particular reason, but the semicircle seems more popular these days. If this concept seems weird to you, reflect that English has allographs too <https://en.wikipedia.org/wiki/Allograph>, though mostly not particularly affecting layout.)
• But as regards line breaking, Indic scripts are much the same as English.
• CJK shaping/rendering can have a bit of complexity because of Han unification <https://en.wikipedia.org/wiki/Han_unification>, and definitely has a lot more nuanced stuff like mixing horizontal and vertical writing modes, and what to do when you mix scripts (which happens much more than with Indic scripts), especially digits, and especially when combining vertical and horizontal. But if your engine doesn’t support any of this, your document should still at least be fully intelligible—just uglier.
• CJK line breaking is awful: where most languages have settled on using spaces to separate words, most CJK languages mostly don’t (Korean does, I believe), and so you pretty much need to know the language to avoid breaking in the middle of words. So you end up with things a bit like hyphenation dictionaries to try to do a good-enough job of it. Again, if your engine doesn’t support this, your document should still be intelligible, just uglier.
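To make the Telugu example in the first bullet concrete, here is a small Python sketch that lists the code points behind that name. The string is the same six-code-point sequence; `describe` is just a helper name made up for this illustration, and the sketch says nothing about how a shaper then combines them.

```python
import unicodedata

# The six code points of the Telugu name discussed above.
NAME = "\u0c15\u0c4d\u0c30\u0c3f\u0c38\u0c4d"


def describe(text: str) -> list[tuple[str, str]]:
    """Return (U+XXXX, Unicode character name) for each code point in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for code_point, name in describe(NAME):
    print(code_point, name)
```

The two VIRAMA entries are the "delete the vowel" steps. Everything after this point (forming the conjunct, placing the vowel sign) is the renderer's job, not something visible in the code points themselves.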
Regarding CJK line-breaking, my understanding was that it was only Thai and closely related languages that required dictionary-based line breaking, and that Chinese/Japanese had simpler rules mostly concerning punctuation. But I'm not certain about that.
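For anyone wondering what "dictionary-based" means in practice, a rough sketch of the simplest variant (greedy longest match) might look like this in Python. The dictionary and the unspaced Latin input are toy stand-ins, not real Thai data, and real engines use much larger dictionaries and smarter matching.

```python
def segment(text: str, dictionary: set[str]) -> list[str]:
    """Greedy longest-match segmentation.

    Characters that start no dictionary word become single-character words.
    """
    longest = max((len(word) for word in dictionary), default=1)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until something matches.
        for size in range(min(longest, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


# Toy dictionary standing in for a real word list.
TOY_DICTIONARY = {"line", "break", "breaking", "is", "hard"}
print(segment("linebreakingishard", TOY_DICTIONARY))
```

Each boundary between the returned words is then a candidate line-break position. Greedy matching picks "breaking" over "break" here, which is the right call in this case but can go wrong on ambiguous input.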
So _this_ must be why the Affinity suite doesn't properly render Devanagari, yet Inkscape can.
It's not true for Chinese. Chinese allows line breaks after any character.
- Vertical text - in particular when it comes to how CJK punctuation differs in horizontal and vertical environments (not yet solved)
- Staying on CJK, making sure the punctuation marks that follow a character don't break and remain with their preceding characters at all times. (I expect the same holds for opening quotes etc but haven't experimented).
- Highly ligatured fonts - Devanagari, Arabic, etc - there's no solution to styling individual characters within a word that I could find.
- Talking of styling ... underlining text is a nightmare - especially if you want to get the little gap between hanging characters and the underline that HTML/CSS browsers do out of the box
- Formatting Thai fonts ... is another World of Hurt[1]
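On the CJK punctuation point in the list above, a minimal Python sketch of the idea is: allow a break between any two characters, except before closing punctuation and after opening punctuation. The two character sets below are a small illustrative subset chosen for this example, and `break_opportunities` and `wrap` are made-up names, not any library's API.

```python
# Illustrative subsets only; real rule sets are much larger.
NO_BREAK_BEFORE = set("，。、！？；：）」』】》")
NO_BREAK_AFTER = set("（「『【《")


def break_opportunities(text: str) -> list[int]:
    """Return indices i where a line may break between text[i-1] and text[i]."""
    return [
        i for i in range(1, len(text))
        if text[i] not in NO_BREAK_BEFORE and text[i - 1] not in NO_BREAK_AFTER
    ]


def wrap(text: str, width: int) -> list[str]:
    """Greedy wrap by character count, breaking only at allowed positions."""
    allowed = set(break_opportunities(text))
    lines, start = [], 0
    while len(text) - start > width:
        # Prefer the last allowed break that keeps the line within width.
        cut = next((i for i in range(start + width, start, -1) if i in allowed), None)
        if cut is None:
            # Nothing fits: overflow to the next allowed break, or to the end.
            cut = next(
                (i for i in range(start + width + 1, len(text)) if i in allowed),
                len(text),
            )
        lines.append(text[start:cut])
        start = cut
    if start < len(text):
        lines.append(text[start:])
    return lines


print(wrap("你好，世界。", 3))
```

This counts characters rather than measuring glyph widths, so it is only a model of the rule, not something to drop into a canvas renderer as-is.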
<div style="font-size: 10em;">ع<span style="color: blue;">ر</span>ب<span style="color: red;">ي</span></div>
That uses harfbuzzjs to do shaping and, for the segments that it has to, it paints paths instead of using fillText. There is an even better method which Mozilla's pdfjs uses: for all the glyphs that you want to draw, build a font (easy with HarfBuzz) that maps sequential characters to those glyphs. Then use fillText with that font and the character that corresponds to the glyph that you want for each glyph. That's nearly as fast as fillText on the whole string.
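A rough sketch of the bookkeeping behind that second method, in Python for brevity: give every distinct glyph ID a sequential character, then draw the run as an ordinary string with a font whose character map encodes that table. Using the Private Use Area for the sequential characters is an assumption made for this sketch, the glyph IDs are invented, and actually building the font is left out.

```python
PUA_START = 0xE000  # first code point of the BMP Private Use Area
PUA_END = 0xF8FF    # last code point of the BMP Private Use Area


def remap_glyphs(glyph_ids: list[int]) -> tuple[dict[int, str], str]:
    """Assign each distinct glyph ID a sequential private-use character.

    Returns the glyph-to-character table (what a generated font's character
    map would encode) and the string to draw with that font.
    """
    table: dict[int, str] = {}
    for gid in glyph_ids:
        if gid not in table:
            code_point = PUA_START + len(table)
            if code_point > PUA_END:
                raise ValueError("too many distinct glyphs for the BMP private use area")
            table[gid] = chr(code_point)
    return table, "".join(table[gid] for gid in glyph_ids)


# Hypothetical glyph IDs, as a shaper might return for a short run.
table, text = remap_glyphs([412, 87, 412, 530])
print([f"U+{ord(ch):04X}" for ch in text])
```

The resulting string is what would be handed to fillText, one character per shaped glyph, so the browser's own shaping has nothing left to do.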
The points you make are really important. I rant about how even Google Sheets doesn't do rich text correctly because of fillText's simplicity here: [1]. But I think many of your points could be solved by using HarfBuzz. I dream of having shapeText and fillGlyphs methods on the canvas as an alternative to HarfBuzz because it would be less wasteful. Leave the high-level APIs up to client-side libraries like dropflow and scrawl.
Google has proposed a placeElement method [2] that allows you to render HTML and CSS into a canvas, but that destroys what's so great about canvas, which is that it's crazy fast. DOM is very heavy-weight.
[1] https://github.com/chearon/dropflow#harfbuzz [2] https://github.com/WICG/canvas-place-element
<div style=font-size:10em><span style=color:#e01b24>క</span><span style=color:#5e5c64>్</span><span style=color:#e66100>ర</span><span style=color:#e5a50a>ి</span><span style=color:#26a269>స</span><span style=color:#1a5fb4>్</span></div>
Ideally it’d render about the same as my manual splitting/colouring: https://temp.chrismorgan.info/%E0%B0%95%E0%B1%8D%E0%B0%B0%E0....
Now sometimes you can colour parts differently: if you stick with the inherent vowel on a conjunct, here making it LETTER KA, SIGN VIRAMA, LETTER RA, ditching the VOWEL SIGN I, then essentially the K from KA and the A from RA will be coloured LETTER KA, and the R from LETTER RA will be coloured SIGN VIRAMA. I haven’t decided yet if that’s an improvement! But this split colouring I only see in Dropflow; I’m not experiencing it in Firefox or Chromium, both of which do split Arabic colouring.
You might be able to turn OpenType features off in those browsers to make it look like your manual coloring, I'm not sure.
> if you stick with the inherent vowel on a conjunct, here making it LETTER KA, SIGN VIRAMA, LETTER RA, ditching the VOWEL SIGN I, then essentially the K from KA and the A from RA will be coloured LETTER KA, and the R from LETTER RA will be coloured SIGN VIRAMA. I haven’t decided yet if that’s an improvement!
Maybe because it changes the shaping results? I don't know enough about the writing system to understand this yet :)
I've just read it. Oh, dear ... what an awful proposal!
My initial thoughts, reacting to the README at https://github.com/WICG/canvas-place-element
> There’s a strong need for better text support on Canvas. [...] This includes not only visual features but also the possibility of supporting the same level of user interaction as the rest of the web
Agreed, but ... adding HTML/CSS to a raster (or WebGL etc) image is not the way to do it. I much prefer your idea of incorporating HarfBuzz-like functionality into the canvas/text APIs - especially given that HarfBuzz is included as part of most browsers' code base.
I've played with mixing HTML/CSS with canvas in my canvas library. The results are ... interesting[1][4], but (probably) not the ideal solution. Making it easy for developers to build canvas interactions with HTML/events anywhere on the page is much more productive and useful[2].
> There is currently no guarantee that the canvas fallback content currently used for accessibility always matches the rendered content, and such fallback content can be hard to generate.
My thinking is that directly reflecting the canvas text back into the DOM is often not useful for people using screen readers. They don't need to hear every number on the chart axis when instead they could be presented with just the measure and range of the axis. If the HTML element rapidly/repeatedly updates its content, it's going to be a very unpleasant experience for the user. I've experimented with this sort of thing in [2], but need feedback from real screenreader users to understand if the solution meets their needs.
> Access to live interactive forms, links, editable content with the same quality as the web. This will help to close the app gap with Flash.
Flash is dead. Please leave its bones in the crypt.
> A limited set of CSS shaders, such as filter effects, are already available, but there is a desire to use general WebGL shaders with HTML
I only work with 2D canvas, but I can understand the desire here. A different approach might be to convince browser devs to work on improving SVG (and CSS) filters to support WebGL shaders, which can then be used by the canvas? Though Safari still doesn't support using SVG filters in the canvas so maybe convince them first?
Playing with filter effects is one of the joys I get from working on my canvas library[3] - but that's got nothing to do with text layout ... except when applying the filter to text, of course![4]
[1] - Use stacked DOM artefact corners as pivot points https://scrawl-v8.rikweb.org.uk/demo/dom-015.html
[2] - London crime charts https://scrawl-v8.rikweb.org.uk/demo/modules-001.html
[3] - A gallery of compound filter effects https://scrawl-v8.rikweb.org.uk/demo/filters-103.html
[4] - Editable header text colorizer and animation effect snippets https://scrawl-v8.rikweb.org.uk/demo/snippets-006.html
I've taken a similar approach to layering canvases with normal HTML (typically the HTML is on top). I don't have a problem logically representing what's painted on the canvas and doing my own hit detection either. Shaders and text shaping in canvas sound a lot more attractive to me than placeElement, but I guess we'll see. I should get around to campaigning for ctx.shapeText and ctx.fillGlyphs but I don't know how much folks care about it.
For me, groff/LaTeX/SILE/Typst all belong to the same category, i.e., the author writes in some markup language, which is then run through a processor to produce the output. I chose LaTeX and Typst as the representative ones in my post:
- LaTeX is the classic, old-school typesetting engine.
- Typst is clearly more modern, with many advanced design choices such as incremental compilation, WASM and a web app, instant preview, and better error messages.
For the others, groff and SILE, to be honest I don't have time to dive into each of these.
----
Do you see any advantages of groff over LaTeX?
Pain points include many customization points: re-creating exact document specifications provided externally, using specific typefaces, creating your own macros... Oh, and leaving ASCII (or ISO-8859-1) for multi-script characters. These are not blocking points (there are ways to overcome them, if you find the solutions), but are still major pains.
Today's groff is very fine software, if you are satisfied with its default settings and your task is in the domain it handles.
With my most recent book, I've moved my PDF generation to Typst. LaTeX, you served me well, but I'm more than happy to never use you again. Typst is better (or decent enough) in every dimension.
I guess: 1) the incremental compilation speed, 2) the modern user experience (better error messages, better syntax, etc.)?
I asked what was the best book for learning LaTeX. The response... "There is no book. Sit next to someone writing their dissertation."
Typst on the other hand, feels modern, is readable, and is fast. (4 seconds vs 2 minutes for some of my books.) The developers are responsive.
My only complaint is that some of my code broke during the latest release. I'll not complain too much because it is a nascent project and still making quick progress.
After I realized that Typst had the features I required for my books, I immediately moved to it.
Good riddance LaTeX. You served me well, but I felt like there was never a better option... Until now.
I went back to beamer.
All from the same content using includes, variables, flags, etc. Show interactive plots directly in your presentations, tons of other features.
Almost every project I create now whether it's documentation, presentation, report, website, or anything else project-related can fit within `quarto create project`.
https://github.com/Dherse/codly/issues/35 and in particular, this comment from the author of codly where he sounds constrained by Typst: https://github.com/Dherse/codly/issues/35#issuecomment-24667...
I mentioned https://polytype.dev/ at the end of the post, which has pages.js included.
It is not that hard to simulate pagination with JavaScript; the deal breaker for me is still line breaking and the nuances of mixed-language typesetting.
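To show why the pagination part is the easy bit, here is a minimal greedy sketch (in Python rather than JavaScript, to keep it short). It assumes block heights have already been measured and that blocks are never split across pages; `paginate` and the sample heights are invented for this illustration.

```python
def paginate(block_heights: list[float], page_height: float) -> list[list[int]]:
    """Greedily assign block indices to pages without splitting any block."""
    pages: list[list[int]] = []
    current: list[int] = []
    used = 0.0
    for index, height in enumerate(block_heights):
        # Start a new page when the block would overflow a non-empty page.
        if current and used + height > page_height:
            pages.append(current)
            current, used = [], 0.0
        current.append(index)
        used += height
    if current:
        pages.append(current)
    return pages


print(paginate([300, 300, 500, 200, 700], page_height=800))
```

All of the hard work is hidden in producing those heights, which depends on the line breaking and mixed-script shaping discussed above.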
You also didn't mention the new (Chrome only) CSS text-wrap: pretty
https://developer.chrome.com/blog/css-text-wrap-pretty
https://docs.google.com/document/d/1jJFD8nAUuiUX6ArFZQqQo8yT...