395 points by ibobev 1 day ago | 29 comments
vecplane 1 day ago
Subpixel font rendering is critical for readability but, as the author points out, it's a tragedy that we can't get pixel layout specs from the existing display standards.
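For readers who want the mechanics: the classic trick (assuming a horizontal RGB stripe layout) is to rasterize glyph coverage at 3x horizontal resolution, low-pass filter across neighboring subsamples to tame color fringing, and assign each filtered subsample to one color channel. A minimal sketch using a simple triangular 5-tap filter (real implementations tune these weights carefully):

```python
def subpixel_filter(coverage3x, weights=(1/9, 2/9, 3/9, 2/9, 1/9)):
    """Turn one row of 3x-horizontal-resolution coverage samples (one
    per subpixel) into per-pixel (R, G, B) intensities. The 5-tap FIR
    filter spreads each subsample over its neighbors to reduce color
    fringing. Assumes horizontal RGB stripes; illustrative only."""
    n = len(coverage3x)
    filtered = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(weights):
            j = i + k - 2            # center the 5-tap kernel on i
            if 0 <= j < n:
                acc += w * coverage3x[j]
        filtered.append(acc)
    # group filtered subsamples into (R, G, B) triples per output pixel
    return [tuple(filtered[i:i + 3]) for i in range(0, n, 3)]
```

A fully covered interior pixel still comes out as (1, 1, 1), i.e. plain white-on-black coverage, while edges pick up fractional per-channel values.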
crazygringo 1 day ago
Only on standard resolution displays. And it's not even "critical" then, it's just a nice-to-have.

But the world has increasingly moved to Retina-type displays, and there's very little reason for subpixel rendering there.

Plus it just has so many headaches, like screenshots get tied to one subpixel layout, you can't scale bitmaps, etc.

It was a temporary innovation for the LCD era between CRT and Retina, but at this point it's backwards-looking. There's a good reason Apple removed it from macOS years ago.

NoGravitas 19 hours ago
Even on standard resolution displays with standard subpixel layout, I see color fringing with subpixel rendering. I don't actually have hidpi displays anywhere but my phone, but I still don't want subpixel text rendering. People act like it's a panacea, but honestly the history of how we ended up with it is pretty specific and kind of weird.
zozbot234 19 hours ago
> ...I see color fringing with subpixel rendering.

Have you tried adjusting your display gamma for each RGB subchannel? Subpixel antialiasing relies on accurate color space information, even more than other types of anti-aliased rendering.
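The reason gamma matters here: coverage blending has to happen in linear light, and subpixel AA blends each channel separately, so an error in any one channel's transfer curve shows up as a color tint rather than just a brightness error. A hedged sketch (the 2.2 exponent is a stand-in for a real per-channel calibration value):

```python
def blend_channel(fg, bg, alpha, gamma=2.2):
    """Blend one color channel in linear light. fg/bg are 0..255
    gamma-encoded values; alpha is the (sub)pixel coverage for this
    channel. The gamma exponent would ideally come from a per-channel
    display calibration, as the parent comment suggests."""
    fg_lin = (fg / 255) ** gamma
    bg_lin = (bg / 255) ** gamma
    out_lin = alpha * fg_lin + (1 - alpha) * bg_lin
    return round(255 * out_lin ** (1 / gamma))
```

Naively blending black text on white at 50% coverage in gamma space gives 128, which renders too dark; linear-light blending gives roughly 186, which is why a wrong gamma per channel is visible as fringing.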

jeroenhd 15 hours ago
> the world has increasingly moved to Retina-type displays

Not my world. Even the display hooked up to the crispy work MacBook is still 1080p (which looks really funky on macOS for some reason).

Even in tech circles, almost everyone I know still has a 1080p laptop. Maybe some funky 1200p resolution to make the screen a bit bigger, but the world is not as retina as you may think it is.

For some reason, there's actually quite a price jump from 1080p to 4k unless you're buying a television. I know the panels are more expensive, but I doubt the manufacturer is indeed paying twice the price for them.

josephg 10 hours ago
My desktop monitor is a 47” display … also running at 4k. It’s essentially a TV, adapted into a computer monitor. It takes up the whole width of my desk.

It’s an utterly glorious display for programming. I can have 3 full width columns of code side by side. Or 2 columns and a terminal window.

But the pixels are still the “normal” size. Text looks noticeably sharper with sub-pixel rendering. I get that subpixel rendering is complex and difficult to implement correctly, but it’s good tech. It’s still much cheaper to have a low resolution display with subpixel font rendering than render 4x as many pixels. To get the same clean text rendering at this size, I’d need an 8k display. Not only would that cost way more money, but rendering an 8k image would bring just about any computer to its knees.

It’s too early to kill sub pixel font rendering. It’s good. We still need it.

HappMacDonald 12 hours ago
Reading this message on a 4k (3840x2160 UHD) monitor I bought ten (10) years ago for $250usd.

Still bemoaning the loss of the basically impossible (50"? I can't remember precisely) 4k TV we bought that same year for $800usd when every other 4k model that existed at the time was $3.3k and up.

Its black point was "when rendering a black frame, the set 100% appears to be unpowered" and its white point was "congratulations, this is what it looks like to stare into baseball stadium floodlights". We kept it at 10% brightness as a matter of course, and even then, playing arbitrary content obviated the need for any other form of lighting in our living room and dining room combined at night.

It was too pure for this world and got destroyed by one of the kids throwing something about in the living room. :(

MindSpunk 7 hours ago
MacOS looks garbage on non-retina displays largely because they don't do any sub pixel AA for text.
zajio1am 5 hours ago
AFAIK MacOS looks garbage on standard resolution displays mainly because they don't do any grid-fitting.

I (on Linux) do not use any sub-pixel AA (just regular AA), but I use aggressive grid-fitting and I have nice, sharp text.

f33d5173 1 day ago
Because Apple controls all their hardware, they can assume that everyone has a particular set of features and not care about those without. The rest of the industry doesn't have that luxury.
akdor1154 23 hours ago
Apple could easily have ensured screens across their whole ecosystem had a specific subpixel alignment - yet they still nixed the feature.
LoganDark 23 hours ago
The artifacts created by subpixel AA are dumb and unnecessary when the pixel density is high enough for grayscale to look good. Plus, with display scaling, subpixel AA creates artifacts. (Not like display scaling itself doesn't also create artifacts - I cannot tolerate the scaling artifacts on iPad, for example)
wpm 17 hours ago
Apple cannot guarantee the pixel density will actually be high enough. They make computers and tablets that can attach to any external monitor.

macOS looks *awful* on anything that isn't precisely 218ppi. Other than Apple's overpriced profit-machine displays, there are two displays that reach this: LG's Ultrafine 5K, and Dell's 6K (with its ugly, extraneous webcam attached to the top). Other 6K monitors were shown at CES this year but so far, I haven't actually found any for sale. EDIT: Correction, LG apparently doesn't sell the 5K Ultrafine anymore, at least on their website.

That means the odds are incredibly high that unless you buy the LG, or drop a wad on an overpriced Studio Display or the even-worse-value Pro Display, your experience with macOS on an external monitor will be awful.

That's even before we get into the terrible control we have in the OS over connection settings. I shouldn't have to buy BetterDisplay to pick a refresh rate I know my display is capable of on the port it's plugged into.

eviks 22 hours ago
But the world has done nothing of the sort: what's your assessment of what percentage of *all* displays in use are of the Retina type?
jeroenhd 15 hours ago
The funny thing is that in some ways it's true. Modern phones are all Retina-class (because even 1080p at phone size is indistinguishable from pixel-less). Tablets, even cheap ones, have impressive screen resolutions. I think the highest-res device I own may be my Galaxy Tab S7 FE at 1600x2500.

Computers, on the other hand, have stuck with 1080p, unless you're spending a fortune.

I can only attribute it to penny-pinching by the large computer manufacturers: with high-res tablets coming to market at Chromebook prices, I doubt they'd be unable to put a similarly high-res display in a similarly sized laptop without bumping the price up by 500 euros like I've seen them do.

gfody 12 hours ago
> like screenshots get tied to one subpixel layout

we could do with a better image format for screenshots - something that preserves vectors and text instead of rasterizing. HDR screenshots on Windows are busted for similar reasons.

zozbot234 1 day ago
It looks like the DisplayID standard (the modern successor to EDID) is at least intended to allow for this, per https://en.wikipedia.org/wiki/DisplayID#0x0C_Display_device_... . Do display manufacturers not implement this? Either way, it's information that could be easily derived and stored in a hardware-info database, at least for the most common display models.
jeroenhd 15 hours ago
I don't think any OS exposes an API for this. There's a Linux tool I sometimes use to control the brightness of my screen that works by basically talking directly to the hardware over the GPU.

Unfortunately, EDID isn't always reliable, either: you need to know the screen's orientation as well or rotated screens are going to look awful. You're probably going to need administrator access on computers to even access the hardware to get the necessary data, which can also be a problem for security and ease-of-use reasons.

Plus, some vendors just seem to lie in the EDID. Like with other information tables (ACPI comes to mind), it looks almost like they just copy the config from another product and adjust whatever metadata they remember to update before shipping.

jasonthorsness 1 day ago
I don't understand why; this has been a thing for decades :(. The article is excellent, and it links to this "subpixel zoo" highlighting the variety: https://geometrian.com/resources/subpixelzoo/
layer8 17 hours ago
“Tragedy” is overstating it a bit. Each OS could provide the equivalent of Windows' former ClearType Tuner for that purpose, and remember the results per screen or monitor model. You'd also want that for the inevitable case where monitors report the wrong layout.
mrob 1 day ago
Subpixel rendering isn't necessary in most languages. Bitmap fonts or hinted vector fonts without antialiasing give excellent readability. Only if the language uses characters with very intricate details such as Chinese or Japanese is subpixel rendering important.
Fraterkes 22 hours ago
Ah so only 20% of the global population? Nbd
osor_io 23 hours ago
Author here, didn't expect the post to make it here! Thanks so much to everyone who's reading it and participating in the interesting chat <3
muglug 18 hours ago
It's a great post!

What happened to the dot of the italic "j" in the first video?

kvemkon 1 day ago
GTK4 moved rendering to the GPU and gave up on RGB subpixel rendering. I've heard that this GPU-centric decision made it impractical to continue with RGB subpixel rendering. The article shows it is possible. So perhaps GTK's reason was a different one, or the presented solution has disadvantages, or it just wouldn't integrate into the stack...
dbcooper 1 day ago
Cosmic Text (Cosmic DE) might do this on the GPU via swash. It has subpixel rendering.
xiaoiver 1 day ago
If you're interested in how to implement SDF and MSDF in WebGL / WebGPU, take a look at this tutorial I wrote: https://infinitecanvas.cc/guide/lesson-015#msdf.
Buttons840 1 day ago
This looks great. I have some interest in WGPU (Rust's WebGPU implementation), and your tutorial here appears to be an advanced course on it, though it doesn't advertise itself as such. I've translated JavaScript examples to Rust before, and it's ideal for learning: I can't just copy/paste code, but the APIs are close enough that porting is easy, and it gives you an excuse to get used to the WGPU docs.
tamat 22 hours ago
wow, I love the format of the site.

Can you tell me more about it? I love making tutorials about GPU stuff and I would love to structure them like yours.

Is it an existing template? Is it part of some sort of course?

jama_ 1 hour ago
Looks like a repurposed VitePress docs template, which is a perfectly fine solution for text-heavy content. The site appears to be open-source, there are links to the repo at the bottom of each page: https://github.com/xiaoiver/infinite-canvas-tutorial
dcrazy 1 day ago
The Slug library [1] is a commercial middleware that implements such a GPU glyph rasterizer.

[1]: https://sluglibrary.com/

bschwindHN 19 hours ago
They describe a fair amount of their algorithm directly on their website. Do they have patents for it? It would be fun to make an open source wgpu version, maybe using some stuff from cosmic-text for font parsing and layout. But if at the end of that I'd get sued by Slug, that would be no fun.
grovesNL 19 hours ago
Slug is patented but there are other similar approaches being worked on (e.g., vello https://news.ycombinator.com/item?id=44236423 that uses wgpu).

I also created glyphon (https://github.com/grovesNL/glyphon) which renders 2D text using wgpu and cosmic-text. It uses a dynamic glyph texture atlas, which works fine in practice for most 2D use cases (I use it in production).
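For the curious, the "dynamic glyph texture atlas" idea can be sketched with a toy shelf packer. This is illustrative only, not glyphon's actual allocator:

```python
class ShelfAtlas:
    """Minimal shelf packer of the kind a dynamic glyph atlas might
    use: glyph bitmaps are packed left-to-right into horizontal
    shelves, opening a new shelf when none fits."""
    def __init__(self, width, height):
        self.width, self.height = width, height
        self.shelves = []            # list of (y, shelf_height, x_cursor)
        self.next_y = 0

    def allocate(self, w, h):
        """Return the (x, y) atlas position for a w-by-h glyph,
        or None if the atlas is full (callers would then evict or
        grow the texture)."""
        for i, (y, sh, x) in enumerate(self.shelves):
            if h <= sh and x + w <= self.width:
                self.shelves[i] = (y, sh, x + w)
                return (x, y)
        if w <= self.width and self.next_y + h <= self.height:
            y = self.next_y          # open a new shelf
            self.shelves.append((y, h, w))
            self.next_y += h
            return (0, y)
        return None
```

Rendering then samples the glyph's sub-rectangle of one big texture, which is exactly the kind of access GPUs are optimized for.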

bschwindHN 16 hours ago
I did something similar with cosmic-text and glium, but it would be fun to have a vector rendering mode to do fancier stuff with glyph outlines and transforms for games and 3D stuff. And open source, of course.

I suppose vello is heading there but whenever I tried it the examples always broke in some way.

mxplerin 11 hours ago
I have an abandoned proof-of-concept of something similar that might be worth checking out https://github.com/mxple/fim
bschwindHN 6 hours ago
Very interesting! I think I would get motion sickness judging by what I saw in the video, but I see what you're getting at with bezier rendering.
oofabz 1 day ago
Very impressive work. For those who aren't familiar with this field, Valve invented SDF text rendering for their games. They published a groundbreaking paper on the subject in 2007. It remains a very popular technique in video games with few changes.

In 2012, Behdad Esfahbod wrote Glyphy, an implementation of SDF that runs on the GPU using OpenGL ES. It has been widely admired for its performance and enabling new capabilities like rapidly transforming text. However it has not been widely used.

Modern operating systems and web browsers do not use either of these techniques, preferring to rely on 1990s-style Truetype rasterization. This is a lightweight and effective approach but it lacks many capabilities. It can't do subpixel alignment or arbitrary subpixel layout, as demonstrated in the article. Zooming carries a heavy performance penalty and more complex transforms like skew, rotation, or 3d transforms can't be done in the text rendering engine. If you must have rotated or transformed text you are stuck resampling bitmaps, which looks terrible as it destroys all the small features that make text legible.

Why the lack of advancement? Maybe it's just too much work and too much risk for too little gain. Can you imagine rewriting a modern web browser engine to use GPU-accelerated text rendering? It would be a daunting task. Rendering glyphs is one thing but how about handling line breaking? Seems like it would require a lot of communication between CPU and GPU, which is slow, and deep integration between the software and the GPU, which is difficult.

chrismorgan 22 hours ago
> Can you imagine rewriting a modern web browser engine to use GPU-accelerated text rendering? […] Rendering glyphs is one thing but how about handling line breaking?

I’m not sure why you’re saying this: text shaping and layout (including line breaking) are almost completely unrelated to rendering.

zozbot234 21 hours ago
> Can you imagine rewriting a modern web browser engine to use GPU-accelerated text rendering?

https://github.com/servo/pathfinder uses GPU compute shaders to do this, which has way better performance than trying to fit this task into the hardware 3D rendering pipeline (the SDF approach).

Someone 20 hours ago
> Can you imagine rewriting a modern web browser engine to use GPU-accelerated text rendering?

It is tricky, but I thought they already (partly) do that. https://keithclark.co.uk/articles/gpu-text-rendering-in-webk... (2014):

“If an element is promoted to the GPU in current versions of Chrome, Safari or Opera then you lose subpixel antialiasing and text is rendered using the greyscale method”

So, what’s missing? Given that comment, at least part of the step from UTF-8 string to bitmap can be done on the GPU, can’t it?

zozbot234 19 hours ago
The issue is not subpixel rendering per se (at least if you're willing to go with the GPU compute shader approach, for a pixel-perfect result), it's just that you lose the complex software hinting that TrueType and OpenType fonts have. But then the whole point of rendering fonts on the GPU is to support smooth animation, whereas a software-hinted font is statically "snapped" to the pixel/subpixel grid. The two use cases are inherently incompatible.
kevingadd 17 hours ago
Just for the record, text rendering - including with subpixel antialiasing - has been GPU accelerated on Windows for ages and in Chrome/Firefox for ages. Probably Safari too but I can't testify to that personally.

The idea that the state of the art or what's being shipped to customers haven't advanced is false.

vendiddy 23 hours ago
Thanks for the breakdown! I love reading quick overviews like this.
moron4hire 1 day ago
SDF is not a panacea.

SDF works by encoding a localized _D_istance from a given pixel to the edge of character as a _F_ield, i.e. a 2d array of data, using a _S_ign bit to indicate whether that distance is inside or outside of the character. Each character has its own little map of data that gets packed together into an image file of some GPU-friendly type (generically called a "map" when it does not represent an image meant for human consumption), along with a descriptor file of where to find the sub-image of each character in that image, to work with the SDF rendering shader.

This definition of a character turns out to be very robust against linear interpolation between field values, enabling near-perfect zoom capability for relatively low resolution maps. And GPUs are pretty good at interpolating pixel values in a map.
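A toy version of the above, to make the pipeline concrete: build a signed distance field from a binary glyph mask by brute force, then sample it with bilinear interpolation (the operation GPUs provide for free) and threshold at zero. Real tools work from the font's contours and use much faster algorithms; this only demonstrates the data flow:

```python
def make_sdf(mask):
    """Brute-force signed distance field from a binary glyph mask:
    positive inside, negative outside (distance to the nearest
    opposite-valued texel). O(n^2) per texel -- a sketch, not a tool."""
    h, w = len(mask), len(mask[0])
    sdf = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best = float("inf")
            for yy in range(h):
                for xx in range(w):
                    if mask[yy][xx] != mask[y][x]:
                        d = ((x - xx) ** 2 + (y - yy) ** 2) ** 0.5
                        best = min(best, d)
            sdf[y][x] = best if mask[y][x] else -best
    return sdf

def sample(sdf, x, y):
    """Bilinear interpolation -- what the GPU texture unit does."""
    x0, y0 = int(x), int(y)
    fx, fy = x - x0, y - y0
    a = sdf[y0][x0] * (1 - fx) + sdf[y0][x0 + 1] * fx
    b = sdf[y0 + 1][x0] * (1 - fx) + sdf[y0 + 1][x0 + 1] * fx
    return a * (1 - fy) + b * fy

# Rendering at any zoom is then just: inside = sample(sdf, u, v) > 0
```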

But most significantly, those maps have to be pre-processed during development from existing font systems for every character you care to render. Every. Character. Your. Font. Supports. It's significantly less data than rendering every character at high resolution to a bitmap font. But, it's also significantly more data than the font contour definition itself.

Anything that wants to support all the potential text of the world--like an OS or a browser--cannot use SDF as the text rendering system because it would require the SDF maps for the entire Unicode character set. That would be far too large for consumption. It really only works for games because games can (generally) get away with not being localized very well, not displaying completely arbitrary text, etc.

The original SDF also cannot support Emoji, because it only encodes distance to the edges of a glyph and not anything about color inside the glyph. Though there are enhancements to the algorithm to support multiple colors (Multichannel SDF), the total number of colors is limited.

Indeed, if you look closely at games that A) utilize SDF for in-game text and B) have chat systems in which global communities interact, you'll very likely see differences in the text rendering for the in-game text and the chat system.

rudedogg 1 day ago
If I understand correctly, the author's approach doesn't really have this problem, since they only upload the glyphs being used to the GPU (at runtime). Yes, you still have to pre-compute them for your font, but that should be fine.
chii 1 day ago
but the grandparent post is talking about a browser - how would a browser pre-compute a font, when the fonts are specified by the webpage being loaded?
onion2k 1 day ago
The most common way this is done is by parsing the font and generating the SDF fields on the fly (usually using Troika - https://github.com/protectwise/troika/blob/main/packages/tro...). It slows down the time to first render, but only by hundreds of ms rather than seconds, and as part of rendering 3D on a webpage, no one really expects it to start up that fast.
fc417fc802 22 hours ago
> It slows down the time to the first render

Would caching (domain restricted ofc) not trivially fix that? I don't expect a given website to use very many fonts or that they would change frequently.

krona 23 hours ago
webassembly hosting freetype in a webworker. not too difficult.
cyberax 23 hours ago
Why not prepare SDFs on-demand, as the text comes in? Realistically, even for CJK fonts you only need a couple thousand characters. Ditto for languages with complex characters.
kevingadd 17 hours ago
Generating SDFs is really slow, especially if you can't use the GPU to do it, and if you use a faster algorithm it tends to produce fields with glitches in them.
meindnoch 23 hours ago
Because it's slow.
Am4TIfIsER0ppos 23 hours ago
> complex transforms like skew, rotation, or 3d transforms can't be done

Good. My text document viewer only needs to render text in straight lines left to right. I assume right to left is almost as easy. Do the Chinese still want top to bottom?

Fraterkes 22 hours ago
God I hope that you don’t work on anything text-related
ulfbert_inc 21 hours ago
If you work with ASCII-only monospaced-only text, then yeah sure. It gets weird real quick outside of those boundaries.
Philpax 17 hours ago
Believe it or not, other people who aren't you exist.
adwn 20 hours ago
> Good. My text document viewer only needs to render text in straight lines left to right.

Yes, inconceivable that somebody might ever want to render text in anything but a "text document viewer"!

ChrisClark 16 hours ago
A classic example of main character syndrome, pun not intended :D
tuna74 15 hours ago
To all the people who want sub-pixel rendering: unless you know the sub-pixel grid of the display, it is going to look worse. Therefore, the only good UX is to ask the user, for every display they use, whether they want to turn it on for that specific display. The OS also has to handle rotation etc. as well.
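The rotation bookkeeping mentioned above can be sketched as a lookup table. Note that the exact 90°/270° results depend on the rotation convention (clockwise is assumed here), so treat this as illustrative:

```python
# A horizontal RGB stripe panel rotated 90 degrees presents vertical
# stripes to the renderer, and rotating 180 degrees turns RGB into BGR.
# "V" prefixes denote vertical stripe orders; clockwise rotation assumed.
ROTATE = {
    ("RGB", 0): "RGB",   ("RGB", 90): "VRGB",
    ("RGB", 180): "BGR", ("RGB", 270): "VBGR",
    ("BGR", 0): "BGR",   ("BGR", 90): "VBGR",
    ("BGR", 180): "RGB", ("BGR", 270): "VRGB",
}

def effective_layout(panel_layout, rotation_degrees):
    """Map the panel's native horizontal stripe layout plus the current
    screen rotation to the layout the rasterizer should assume."""
    return ROTATE[(panel_layout, rotation_degrees % 360)]
```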
strongpigeon 15 hours ago
Even better would be as the author suggests: having a way for the display to indicate its subpixel structure to the system.
mananaysiempre 14 hours ago
I always think about the Samsung SyncMaster 173P I used to have. It was good for its time, but not usable with any kind of subpixel antialiasing (even on GNOME, which allowed you to choose between horizontal and vertical RGB and BGR): the subpixel grid on it was diagonal. Absolutely tractable as far as the signal-processing math goes, yet unlikely to fit in any reasonable protocol.
tuna74 13 hours ago
There is, but displays do not send out correct information unfortunately.
atoav 14 hours ago
Well, it is a style of text you could use to emphasize certain words, which for most people translates to a different pronunciation of said word in their heads.
meindnoch 23 hours ago
Impressive work!

But subpixel AA is futile in my opinion. It was a nice hack in the aughts when we had 72dpi monitors, but on modern "retina" screens it's imperceptible. And for a teeny tiny improvement, you get many drawbacks:

- it only works over opaque backgrounds

- can't apply any effect on the rasterized results (e.g. resizing, mirroring, blurring, etc.)

- screenshots look bad when viewed on a different display

fleabitdev 22 hours ago
Getting rid of subpixel AA would be a huge simplification, but quite a lot of desktop users are still on low-DPI monitors. The Firefox hardware survey [1] reports that 16% of users have a display resolution of 1366x768.

This isn't just legacy hardware; 96dpi monitors and notebooks are still being produced today.

[1]: https://data.firefox.com/dashboard/hardware

layer8 17 hours ago
Even more strikingly, two-thirds are using a resolution of FHD or lower, and only around a sixth are using QHD or 4K. Low-DPI is still the predominant display situation on the desktop.
vitorsr 20 hours ago
See also Linux Hardware Database (developer biased) [1] and Steam Hardware & Software Survey (gamer biased) [2].

[1] https://linux-hardware.org/?view=mon_resolution

[2] https://store.steampowered.com/hwsurvey

ahartmetz 22 hours ago
What you're saying is "I have a high-DPI screen, I don't care about those who don't". Because these other arguments are really unimportant compared to the better results of subpixel rendering where applicable.
NoGravitas 19 hours ago
Not sure about that. I don't really like subpixel rendering on a 100dpi screen very much because of color fringing. But add in the other disadvantages and it just seems not worth it.
ahartmetz 18 hours ago
Subpixel rendering is configurable. Some algorithms are patented, but the patents have expired. I'm not sure if the "good" algorithms have made it to all corners of computing. I use latest Kubuntu, slight hinting and subpixel rendering. It looks very good to me.

On my rarely used Windows partition, I have used ClearType Tuner (name?) to set up ClearType to my preferences. The results are still somewhat grainy and thin, but that's a general property of Windows font rendering.

mistercow 22 hours ago
Also, even if, as the author wishes, there were a protocol for learning the subpixel layout of a display, and that got widespread adoption, you can bet that some manufacturers would screw it up and cause rendering issues that would be very difficult for end users to understand.
ahartmetz 18 hours ago
This kind of problem has been dealt with before. It has a known solution:

- A protocol to ask the hardware

- A database of quirks about hardware that is known to provide wrong information

- A user override for when neither of the previous options do the job
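A sketch of that precedence chain (all names illustrative, not a real API):

```python
def resolve_subpixel_layout(edid_layout, quirks, user_override, model_id):
    """Resolve the subpixel layout for one display, in the order the
    parent comment describes: a user override beats the quirks
    database, which beats whatever the hardware reports."""
    if user_override is not None:
        return user_override
    if model_id in quirks:             # hardware known to report wrong info
        return quirks[model_id]
    return edid_layout if edid_layout is not None else "unknown"
```

The same three-layer pattern is used for EDID mode quirks, ACPI table overrides, and similar hardware-misreporting problems.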

cchance 15 hours ago
After seeing the cursive all i immediately thought was "who the fuck ever thought cursive was a good idea" lol
jml7c5 12 hours ago
People who handwrote. (And especially people who handwrote with quills and fountain pens — usable ballpoint pens are only 70 years old.)
adiabatichottub 14 hours ago
People who wrote lots of letters, that's who. The internet and free long-distance calling killed cursive.
rossant 1 day ago
I can't find the link to the code. Is it available?
pjmlp 1 day ago
While the article is great, I am missing a WebGL/WebGPU demo to go along with the article, instead of videos only.
xiaoiver 1 day ago
Maybe you can take a look at this tutorial I wrote: https://infinitecanvas.cc/guide/lesson-015#msdf.
kh_hk 17 hours ago
This is a good resource and looks very well written. Many thanks for sharing!
pjmlp 23 hours ago
Thanks, looks like a nice reading over the weekend.
z3t4 1 day ago
When making a text editor from scratch my biggest surprise was how slow/costly text rendering is.
Bengalilol 23 hours ago
Amazing read, I am so envious of being able to go down such "holes".

As a side note, from the first "menu UI" until the end, I had the Persona music in my head ^^ (It was a surprise reading the final words)

neurostimulant 19 hours ago
I wonder if editors that use gpu text rendering like Zed would use something like this to improve their text rendering. Or maybe they already do?
corysama 8 hours ago
Here's an interview with the creator of Zed. IIRC, the rendering portion of the code is apparently pretty simple.

https://www.youtube.com/watch?v=fV4aPy1bmY0

favorited 1 day ago
I watched a conference talk[0] about using MSDFs for GPU text rendering recently, really interesting stuff!

[0] https://www.youtube.com/watch?v=eQefdC2xDY4

adamrezich 16 hours ago
Nobody here seems to have noticed but the “pseudocode” in the article is in fact Jai code, which you can tell by the `xx` in

    base_slot_coordinates := decode_morton2_16(xx index);
which in Jai means “autocast”.
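For reference, `decode_morton2_16` de-interleaves the even and odd bits of a Z-order (Morton) index back into 2D coordinates, which keeps spatially nearby cells close together in memory. A Python equivalent of what such a helper presumably does:

```python
def decode_morton2_16(index):
    """Decode a 2D Morton (Z-order) index into (x, y) by
    de-interleaving its even and odd bits."""
    def compact(v):
        # squeeze the bits at even positions down into a dense integer
        v &= 0x55555555
        v = (v ^ (v >> 1)) & 0x33333333
        v = (v ^ (v >> 2)) & 0x0F0F0F0F
        v = (v ^ (v >> 4)) & 0x00FF00FF
        v = (v ^ (v >> 8)) & 0x0000FFFF
        return v
    return compact(index), compact(index >> 1)
```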
king_geedorah 15 hours ago
Indeed it is. Nice.
dustbunny 1 day ago
This is incredibly well written, interesting and useful.
vFunct 1 day ago
I still don't understand why we need text rendered offline and stored in an atlas alongside tricks like SDFs, when GPUs have practically infinite vertex/pixel drawing capabilities. Even the article mentions writing glyph curves to an atlas. Why can't the shaders render text directly? There has to be a way to convert Bézier curves to triangle meshes. I'm about to embark on a GPU text renderer for a CAD app and I hope to figure out why soon.
modeless 1 day ago
It's simply less expensive in most cases to cache the results of rendering when you render the same glyph over and over. GPUs are fast but not infinitely fast, and they are extremely good at sampling from prerendered textures.

Also it's not just about speed, but power consumption. Once you are fast enough to hit the monitor frame rate then further performance improvements won't improve responsiveness, but you may notice your battery lasting longer. So there's no such thing as "fast enough" when it comes to rendering. You can always benefit from going faster.
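The caching argument in miniature (the `rasterize` callback stands in for any expensive glyph-to-bitmap function; names are illustrative):

```python
class GlyphCache:
    """Memoize rasterized glyphs keyed by (glyph id, pixel size).
    Text reuses a small set of glyphs heavily, so after warm-up
    almost every lookup is a cheap dictionary hit instead of a
    re-rasterization."""
    def __init__(self, rasterize):
        self.rasterize = rasterize
        self.cache = {}
        self.misses = 0

    def get(self, glyph_id, size_px):
        key = (glyph_id, size_px)
        if key not in self.cache:
            self.misses += 1
            self.cache[key] = self.rasterize(glyph_id, size_px)
        return self.cache[key]
```

Rendering the string "hellohello" rasterizes only the four distinct glyphs once; everything after that is amortized away, which is the speed and power argument above.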

account42 1 day ago
> Once you are fast enough to hit the monitor frame rate then further performance improvements won't improve responsiveness

This is not true, if your rendering is faster then you can delay the start of rendering and processing of input to be closer to the frame display time thus reducing input latency.

modeless 18 hours ago
This is true but very rarely implemented.
animal531 21 hours ago
That's true, but in this case one can cache the output of e.g. a rendered letter and re-use that, without the intermediate SDF etc. steps.

Of course for e.g. games that breaks if the font size changes, letters rotate and/or become skewed etc.

MindSpunk 1 day ago
The triangle density of even a basic font is crazy high at typical display sizes. All modern GPU architectures are very bad at handling high density geometry. It's very inefficient to just blast triangles at the GPU for these cases compared to using an atlas or some other scheme.

Most GPUs dispatch pixel shaders in groups of 4. If all your triangles are only 1 pixel big then you end up with 3 of those shader threads not contributing to the output visually. It's called 'quad overdraw'. You also spend a lot of time processing vertices for no real reason too.

ben-schaaf 1 day ago
GPUs don't have infinite vertex/pixel drawing capabilities. Rendering text directly is simply more expensive. Yes, you can do it, but you'll be giving up a portion of your frame budget and increasing power usage for no real benefit.
account42 1 day ago
To expand on this, GPUs cannot rasterize text directly because they only work with triangles. You need to either implement the rasterization in shaders or convert the smooth curves in the font into enough triangles that the result doesn't look different (number of triangles required increases with font pixel size).
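A sketch of that curve-to-segments conversion: recursive de Casteljau subdivision of a quadratic Bézier with a pixel-space flatness tolerance. Scaling the glyph up makes the same curve need more segments, which is the growing triangle count described above:

```python
def flatten_quadratic(p0, p1, p2, tolerance=0.25, depth=0):
    """Adaptively flatten a quadratic Bezier into a polyline by
    splitting at t=0.5 until the control point is within `tolerance`
    (in pixels) of the chord midpoint -- a cheap flatness proxy."""
    def mid(a, b):
        return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    chord_mid = mid(p0, p2)
    err = abs(p1[0] - chord_mid[0]) + abs(p1[1] - chord_mid[1])
    if err <= tolerance or depth > 16:
        return [p0, p2]
    l1, r1 = mid(p0, p1), mid(p1, p2)
    m = mid(l1, r1)                       # de Casteljau split point
    left = flatten_quadratic(p0, l1, m, tolerance, depth + 1)
    right = flatten_quadratic(m, r1, p2, tolerance, depth + 1)
    return left[:-1] + right              # drop the duplicated midpoint
```

Each split cuts the flatness error by roughly 4x, so the segment count grows with the square root of the pixel size: a 10x larger glyph needs several more subdivision levels for the same on-screen smoothness.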
WithinReason 1 day ago
Triangles are the wrong choice, but otherwise you make a good point. This guy uses an atlas because he renders fonts by supersampling Bézier curves with up to 512 samples per pixel, which is very expensive. However, you could instead compute the integral of the intersection of the Bézier curve's area with the subpixel's area much faster, which I think could run in real time without needing an atlas and would be more accurate than supersampling.
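The analytic-coverage idea in its simplest form: the exact area of a pixel cut by a straight edge, computed by clipping the pixel square against a half-plane and taking the shoelace area, with no supersampling at all. Extending this to full Bézier regions is considerably more involved; this only shows the flavor:

```python
def halfplane_coverage(px, py, a, b, c):
    """Exact coverage of the unit pixel [px,px+1]x[py,py+1] by the
    half-plane a*x + b*y + c <= 0, via Sutherland-Hodgman clipping
    followed by the shoelace area formula."""
    poly = [(px, py), (px + 1, py), (px + 1, py + 1), (px, py + 1)]
    out = []
    for i in range(len(poly)):
        p, q = poly[i], poly[(i + 1) % len(poly)]
        fp = a * p[0] + b * p[1] + c
        fq = a * q[0] + b * q[1] + c
        if fp <= 0:
            out.append(p)
        if (fp <= 0) != (fq <= 0):        # edge crosses the boundary
            t = fp / (fp - fq)
            out.append((p[0] + t * (q[0] - p[0]),
                        p[1] + t * (q[1] - p[1])))
    area = 0.0
    for i in range(len(out)):             # shoelace formula
        x0, y0 = out[i]
        x1, y1 = out[(i + 1) % len(out)]
        area += x0 * y1 - x1 * y0
    return abs(area) / 2
```

An edge through the middle of a pixel yields exactly 0.5 coverage, where supersampling only approximates it.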
dxuh 1 day ago
GPUs are very fast, but not quite infinite. If you spend your GPU time on text, you can't spend it on something else. And almost always you would like to spend it on something else. Also the more GPU time you require, the faster the minimum required hardware needs to be. Text is cool and important, but maybe not important enough to lose users or customers.
andsoitis 1 day ago
> Why can't the shaders render text directly?

https://sluglibrary.com/ implements Dynamic GPU Font Rendering and Advanced Text Layout

spookie 1 day ago
Thin tris -> performance nuke
account42 1 day ago
GPU rasterizers don't do sub-pixel rendering. This is OK for most 3D geometry but for small text you want to take advantage of any additional resolution you can squeeze out.

On the other hand, if you are rendering to an atlas anyway then you don't really need to bother with a GPU implementation for that an can just use an existing software font rasterizer like FreeType to generate that atlas for you.

dustbunny 1 day ago
As far as I understand it, this is exactly what this article is about
jbrooks84 1 day ago
Love high ppi retina displays for crispy text
b0a04gl 20 hours ago
if we can stream video textures to gpu in real time, why can’t we stream sdf glyphs the same way? what makes text rendering need so much prep upfront?
shmerl 1 day ago
> One of those new OLEDs that look so nice, but that have fringing issues because of their non-standard subpixel structure

From what I understood, it's even worse: not just non-standard, but multiple incompatible subpixel layouts across different OLEDs. That's the reason FreeType didn't implement subpixel rendering for OLEDs, and it's a reason to avoid OLEDs when you need to work with text. It's also not limited to FreeType; a lot of things like GUI toolkits (Qt, GTK, etc.) need to play along too.

Not really sure if there is any progress on solving this.

> I really wish that having access to arbitrary subpixel structures of monitors was possible, perhaps given via the common display protocols.

Yeah, this is a good point. Maybe this should be communicated in EDIDs.

account42 1 day ago
There are oleds with somewhat standard subpixel layouts. E.g. my laptop has a vertical(!) BGR layout that FreeType and KDE support just fine.

I think the weird layouts are mostly due to needing different sizes for the different colors in HDR displays in order to not burn out one color (blue) too fast.

shmerl 17 hours ago
Maybe, but I've seen bug reports with a bunch of layouts and nothing looks standard there. The Steam Deck OLED is one such example, along with Lenovo laptops, LG UltraGear OLEDs, etc. I don't really see any commonality.

* https://bugs.kde.org/show_bug.cgi?id=472340

* https://gitlab.freedesktop.org/freetype/freetype/-/issues/11...

ipsum2 1 day ago
In theory, yes, but in practice, I write code on a 4k OLED display and haven't noticed any artifacting.
shmerl 1 day ago
Higher resolution might mask the issue more, but it's still there.
eptcyka 1 day ago
Would be great if the videos in the article were muted so that iOS didn’t stop playing my music whilst reading this.
elia_42 17 hours ago
Really interesting!
EnPissant 1 day ago
It's important to point out that SDFs compute a pixel distance to the closest edge, while a more traditional font renderer computes pixel coverage. Pixel coverage is optimal. For small fonts, SDFs can look bad in places where edges meet. Maybe this is less of an issue on high PPI displays. Source: I implemented a SDF renderer and it looked worse than freetype.
account42 1 day ago
The coverage/distance distinction isn't relevant - you can trivially compute the coverage in your distance field renderer.

The point about intersections (or hard corners in general) is the issue with distance fields though. You can counteract it a bit by having multiple distance fields and rendering the intersection of them. See e.g. https://github.com/Chlumsky/msdfgen
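Turning a signed distance into coverage really is trivial; a hedged one-liner sketch (positive distance = inside; in an actual fragment shader the `pixel_width` term would come from `fwidth()` on the distance value, names here are illustrative):

```python
def coverage_from_distance(d, pixel_width=1.0):
    """Approximate pixel coverage from a signed distance (positive inside
    the glyph): a linear ramp over one pixel footprint, clamped to [0, 1].
    This is the usual clamp/smoothstep trick used in SDF text shaders."""
    return min(max(d / pixel_width + 0.5, 0.0), 1.0)
```

A pixel whose center sits exactly on the edge gets coverage 0.5, and anything more than half a pixel away from the edge is fully opaque or fully transparent.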

EnPissant 23 hours ago
You don't have the information to compute the coverage. Hard corners are only an issue when scaling up. I am saying they are worse even at 100% scale.
exDM69 22 hours ago
Very cool stuff, text rendering is a really hairy problem.

I also got nerd sniped by Sebastian Lague's recent video on text rendering [0] (also linked to in the article) and started writing my own GPU glyph rasterizer.

In the video, Lague makes a key observation: most curves in fonts (at least for the Latin alphabet) are monotonic. Monotonic Bezier curves are contained within the bounding box of their end points (this applies to any monotonic curve, not just Bezier). The curves that are not monotonic are very easy to split by solving the zeros of the derivative (a linear equation) and then splitting the curve at that point. This is also where Lague went astray and attempted a complex procedure using geometric invariants, when it's trivially easy to split Beziers using de Casteljau's algorithm as described in [1]. It made for entertaining video content, but I was yelling at the screen for him to open Pomax's Bezier curve primer [1] and just get on with it.
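Both steps fit in a few lines; a sketch in Python (illustrative names, not Lague's or Pomax's code; points are (x, y) tuples):

```python
def split_quadratic(p0, p1, p2, t):
    """Split a quadratic Bezier at parameter t via de Casteljau's algorithm.
    Returns the two halves; the point on the curve at t is shared."""
    lerp = lambda a, b, s: (a[0] + (b[0] - a[0]) * s, a[1] + (b[1] - a[1]) * s)
    q0 = lerp(p0, p1, t)
    q1 = lerp(p1, p2, t)
    r = lerp(q0, q1, t)  # point on the curve at t
    return (p0, q0, r), (r, q1, p2)

def monotonic_split_ts(p0, p1, p2):
    """Parameters where the curve stops being monotonic, per axis.
    B'(t) = 2*((1-t)*(p1-p0) + t*(p2-p1)); setting each component to zero
    is a linear equation with solution t = d0 / (d0 - d1)."""
    ts = []
    for axis in (0, 1):
        d0 = p1[axis] - p0[axis]
        d1 = p2[axis] - p1[axis]
        if d0 - d1 != 0.0:
            t = d0 / (d0 - d1)
            if 0.0 < t < 1.0:
                ts.append(t)
    return sorted(ts)
```

For the curve (0,0), (1,2), (2,0) the y-derivative vanishes at t = 0.5, and splitting there yields two monotonic halves meeting at (1, 1).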

For monotonic curves, it is computationally easy to solve the winding number for any pixel outside the bounding box of the curve: it's +1 if the pixel is to the right of or below the bounding box, -1 if to the left or above, and 0 in the diagonal regions outside that "plus sign" shaped area.
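The per-pixel version of this classification can be sketched as follows (a hedged illustration, not exDM69's actual code: it uses only a horizontal ray cast to the right, y-up with upward crossings counted +1; sign conventions vary):

```python
import math

def winding_contribution(px, py, p0, p1, p2):
    """Signed crossing of a horizontal ray from (px, py) to the right
    against ONE monotonic quadratic Bezier (p1 must lie in the AABB of
    p0, p2). The quadratic is only solved when the point is inside the
    AABB; everywhere else the answer follows from the box alone."""
    ymin, ymax = min(p0[1], p2[1]), max(p0[1], p2[1])
    if not (ymin <= py < ymax):       # half-open band avoids double counting
        return 0
    direction = 1 if p2[1] > p0[1] else -1
    xmin, xmax = min(p0[0], p2[0]), max(p0[0], p2[0])
    if px > xmax:                      # box entirely left of the ray start
        return 0
    if px <= xmin:                     # box entirely right: guaranteed crossing
        return direction
    # Inside the AABB: solve B_y(t) = py for the unique t in [0, 1].
    a = p0[1] - 2.0 * p1[1] + p2[1]
    b = 2.0 * (p1[1] - p0[1])
    c = p0[1] - py
    if abs(a) < 1e-12:                 # curve is effectively linear in y
        t = -c / b
    else:
        disc = math.sqrt(max(b * b - 4.0 * a * c, 0.0))
        t0 = (-b + disc) / (2.0 * a)
        t1 = (-b - disc) / (2.0 * a)
        t = t0 if 0.0 <= t0 <= 1.0 else t1
    u = 1.0 - t
    x = u * u * p0[0] + 2.0 * u * t * p1[0] + t * t * p2[0]
    return direction if x >= px else 0
```

Only the last branch does any real arithmetic; the point of the monotonic-AABB trick is that most pixels (or whole warps of pixels) take one of the early returns.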

Furthermore, this can be expanded to solving the winding number for an entire axis-aligned box. This can be done for an entire GPU warp (32 to 64 threads): each thread in a warp looks at one curve and checks whether the winding number is the same for the whole warp; if so it accumulates it, and if not, it sets a bit marking that the curve needs to be evaluated per thread.

In this way, very few pixels actually need to solve the quadratic equation for a curve in the contour.

There's still one optimization I haven't done: solving the quadratic equation for 2x2 pixel quads. I solve both the vertical and horizontal winding numbers for good robustness of horizontal and vertical lines. But the solution of the horizontal quadratic for a pixel and the pixel below it is the same +/- 1, and ditto for vertical. So you can solve the quadratics for two curves (a square root and a division, expensive arithmetic ops) for the price of one if you do it for 2x2 quads and use a warp-level swap to exchange the results and add or subtract 1. This can only be done in orthographic projection without rotation, but the rest of the method also works with perspective, rotation and skew.

For a bit of added robustness, Jim Blinn's "How to solve a quadratic equation?" [2] can be used to get rid of some pesky numerical instability.
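The usual numerically stable formulation in the spirit of Blinn's paper looks like this (a generic sketch, not the code from this project):

```python
import math

def solve_quadratic_stable(a, b, c):
    """Real roots of a*t^2 + b*t + c = 0, avoiding catastrophic cancellation.

    The textbook formula (-b ± sqrt(disc)) / (2a) subtracts nearly equal
    numbers for one root when b*b >> 4*a*c. Instead, compute the
    large-magnitude root from q = -(b + sign(b)*sqrt(disc)) / 2 and recover
    the other from the product of roots: t1 * t2 = c / a  =>  t2 = c / q.
    Also handles the linear case a == 0; a double root appears twice."""
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return ()  # no real roots
    q = -0.5 * (b + math.copysign(math.sqrt(disc), b))
    roots = []
    if a != 0.0:
        roots.append(q / a)
    if q != 0.0:
        roots.append(c / q)
    return tuple(sorted(roots))
```

For an ill-conditioned case like t^2 - 1e8*t + 1 = 0 the naive formula loses the small root (~1e-8) almost entirely, while the stable form recovers it to full precision.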

I'm not quite done yet, and I've only got a rasterizer, not the other parts you need for a text rendering system (font file i/o, text shaping etc).

But the results are promising: I started at 250 ms per frame at a 4k rendering of a '@' character with 80 quadratic Bezier curves, evaluating each curve at each pixel, but I got down to 15 ms per frame by applying the warp vs. monotonic bounding box optimizations.

These numbers are not very impressive because they were measured on a 10-year-old integrated laptop GPU. It's so much faster on a discrete gaming GPU that I could stop optimizing here if that were my target HW. But it's already fast enough for real-time practical use on the laptop, because I was drawing entire screen-sized glyphs for the benchmark.

[0] https://www.youtube.com/watch?v=SO83KQuuZvg

[1] https://pomax.github.io/bezierinfo/#splitting

[2] https://ieeexplore.ieee.org/document/1528437

meindnoch 21 hours ago
>Monotonic Bezier curves are contained within the bounding box of its end points

What's a "monotonic Bezier curve"?

Btw, every Bezier curve is contained within its control points' convex hull. It follows from the fact that all points on a Bezier curve are some convex combination of the control points. In other words, the Bezier basis functions sum to 1, and are nonnegative everywhere.
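The partition-of-unity claim is easy to verify numerically (illustrative sketch):

```python
from math import comb

def bernstein(n, i, t):
    # i-th Bernstein basis polynomial of degree n
    return comb(n, i) * t ** i * (1 - t) ** (n - i)

# Partition of unity: by the binomial theorem, (t + (1 - t))^n = 1, so the
# basis sums to 1 for any t. Every point on the curve is therefore a convex
# combination of the control points and lies inside their convex hull.
for t in (0.0, 0.25, 0.5, 0.9):
    assert abs(sum(bernstein(3, i, t) for i in range(4)) - 1.0) < 1e-12
```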

exDM69 21 hours ago
> What's a "monotonic Bezier curve"?

Good question!

It's a Bezier curve whose derivative is non-zero in both the x and y components for t = 0..1 (exclusive), i.e. it's monotonic in x and y separately.

Just your high school calculus definition of monotonic.

To get from a general quadratic Bezier to monotonic sections, you solve the derivative for zeros in x and y direction (a linear equation). If the zeros are between 0 and 1 (exclusive), split the Bezier curve using de Casteljau's at t=t_0x and t=t_0y. For each quadratic Bezier you get one to three monotonic sections.

> every Bezier curve is contained within its control points' convex hull.

This is true, but only monotonic Bezier curves are contained within the AABB formed by the two end points (so the middle control points don't need to be considered when computing the AABB).

For a quadratic Bezier this means that it is monotonic iff the middle control point is inside the AABB of the two end points.
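That check is one line (an illustrative sketch, not from the post; points are (x, y) tuples):

```python
def is_monotonic_quadratic(p0, p1, p2):
    """A quadratic Bezier is monotonic in x and y iff its middle control
    point lies inside the axis-aligned bounding box of the two endpoints:
    then neither x'(t) nor y'(t) changes sign on [0, 1]."""
    return (min(p0[0], p2[0]) <= p1[0] <= max(p0[0], p2[0]) and
            min(p0[1], p2[1]) <= p1[1] <= max(p0[1], p2[1]))
```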

The monotonicity is a requirement for all the GPU warp-level AABB magic to happen (which is a nice 10x to 20x perf increase in my benchmarks). At worst you'd have to deal with 3x the number of curves after splitting (still a win), but because most curves in fonts are already monotonic, the splitting doesn't increase the number of curves much in practice.

Monotonicity also implies that the quadratic equations have a unique solution for any horizontal or vertical line. No need to classify the roots as in Lengyel's method.

purplesyringa 21 hours ago
This sounds amazing, personally I'd love to take a look at the project or read a blog post when you're done.
exDM69 18 hours ago
Thanks for the words of encouragement. I need more time to turn this prototype into a practical piece of software and/or a publication (blog post or research paper). Unfortunately there are only so many hours in a day.

To actually render text I would still need to add text shaping to get from strings of text to pixels on the screen.

But a bigger problem is how to integrate it into someone else's software project. I have the rasterizer GPU shader, some CPU preprocessing code and the required 3D API code (I'm using Vulkan). I'm not sure how people usually integrate this kind of component into their software: do they just want the shader and preprocessor, or do they expect the 3D API code too? Packaging it into a library that includes the Vulkan code has its own interoperability problems.

If it rains during my summer vacation, maybe this project will see some progress :)

animal531 20 hours ago
I find it amazing how many people are working and/or playing with text rendering at the moment. I work in Unity and have also been dipping my toes in, trying to think of some modern enhancements to the whole thing.

The Unity solution is quite limiting. Someone made a great asset about 10 years ago and was actively developing it, but then Unity bought it and integrated it into their subsystem, after which all development basically stopped. Now in modern times a lot of games are using Slug, and it renders some really beautiful text, but unfortunately they only do large-scale licensing to big companies.

Some thoughts I've come up with:

1. Creating text is basically a case of texture synthesis. We used to just throw it into textures and let the GPU handle it, but obviously it's a general-purpose device and not meant to know which information (such as edges) is more important than the rest. Even a 2k texture of a letter doesn't look 100% when you view it at full screen size.

2. Edge Lines: I have a little business card here from my Doctor's office. At e.g. 20-30cm normal human reading distance the text on it looks great, but zooming in close you can see a lot of differences. The card material isn't anywhere close to flat and the edge lines are really all over the place, in all kinds of interesting ways.

3. Filling: The same happens for the center filled part of a letter or vector: when you zoom in you can see a lot of flaws creep in, e.g. 5% of the visible layer has some color flaw or other. There are black flecks on the white card, white/black flecks in a little apple logo they have, etc.

4. So basically I want to add a distance parameter as well as using size. Both of these are relatively irrelevant for rendering normal 2D text, but in 3D people will often go stand right up against something, so the extra detailing will add a lot. For the synthesis part, there's no reason any of the lines should be a solid fill instead of, for example, some artistic brush stroke, or using some derivative noise curves to create extra/stable information/detailing.

5. Another thing to look at might be work subdivision. Instead of rendering a whole set of letters in N time, if the camera is relatively stable we can refine those over successive frames to improve detailing, for example go from 2 to M subdivisions per curve.

6. There are numerous available examples such as the following concurrent B-Tree: https://www.youtube.com/watch?v=-Ce-XC4MtWk In their demo they fly into e.g. a planetary body and keep subdividing the visible terrain on-screen to some minimum size; then they can synthesize extra coherent detailing that matches that zoom level from some map, in their case e.g. noise for the moon, or code for water etc.

I find that a lot of the people working on text are sort of doing their own thing data structure wise, instead of looking at these already implemented and proven concurrent solutions. Here is another interesting paper, DLHT: A Non-blocking Resizable Hashtable with Fast Deletes and Memory-awareness / https://arxiv.org/abs/2406.09986

Not to say that those are the way to parallelize the work, or even if that is necessary, but it might be an interesting area where one can increase detailing.

pimlottc 17 hours ago
“Crisp text” would be more accurate, I thought maybe this was going to be about rendering intentionally degraded text, like in memes
fatih-erikli-cg 9 hours ago
[dead]
moralestapia 9 hours ago
[flagged]
tomhow 9 hours ago
We don't know why people downvote things but it may be because people think it's nitpicking. A few people upvoted it, a few downvoted it. It's no big deal and nothing new. Please stop doing this.
enriquto 22 hours ago
> You might want to place the glyph at any position on the screen, not necessarily aligned with the pixel grid

No. I don't. This is a horrifying concept. It implies that the same character may look different every time it is printed! This is extremely noticeable and really awful. For example, when you align equal signs on consecutive lines of code, you notice straight away whether the characters are different.

Nowadays pixels are so small that I don't understand why we don't all just use good quality bitmap fonts. I do, and couldn't be happier with them. They are crisp to a fault, and their correct rendering does not depend on the gamma of the display (a serious problem that TFA does not even get into).

purplesyringa 21 hours ago
I mean, there's literally an example animation in the post. Maybe you don't want subpixel positioning in long texts, but you absolutely need it whenever you need animated transitions or animation in general.