Making fontmake faster
We want to drastically speed up fontmake: https://github.com/googlei18n/fontmake/issues/367
Other than switching booleanOperations to something vector-based and fast (https://github.com/typemytype/booleanOperations/issues/40), the other big offenders are basically 1. MutatorMath / fontMath, and 2. defcon and UFO reading/writing in general. For example, when James replaced defcon use with tiny objects, he saw about a 20% speedup:
https://github.com/googlei18n/ufo2ft/commit/2e65007df653515762ec95888f9c29b5137dec28
Cosimo also reports that using the `lxml` module instead of the standard library's `xml` can significantly speed up UFO reading.
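As a rough illustration of that swap, the snippet below parses a minimal .glif-style document with `lxml.etree` when available, falling back to the standard library's `xml.etree.ElementTree`. The XML here is a hand-written toy example, not a real UFO glyph file, and this is only a sketch of the substitution, not ufoLib's actual reading code:

```python
# Parse a minimal .glif-like (UFO glyph) document, preferring lxml when installed.
try:
    from lxml import etree  # fast C-based parser, if available
except ImportError:
    import xml.etree.ElementTree as etree  # stdlib fallback, same basic API

# Toy example document, not a real UFO file.
GLIF = b"""<?xml version="1.0" encoding="UTF-8"?>
<glyph name="A" format="2">
  <advance width="500"/>
  <outline>
    <contour>
      <point x="10" y="20" type="line"/>
    </contour>
  </outline>
</glyph>
"""

glyph = etree.fromstring(GLIF)
name = glyph.get("name")
width = float(glyph.find("advance").get("width"))
points = [(float(p.get("x")), float(p.get("y"))) for p in glyph.iter("point")]
print(name, width, points)
```

Because `lxml.etree` mirrors the ElementTree API for this kind of read-only traversal, the calling code stays identical and only the import changes.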
Since https://github.com/unified-font-object/ufoLib is essentially a reference implementation, I'd like to propose keeping it that way, with validators and converters, but adding a trimmed-down version in fontTools.ufoLib that is optimized for speed. I suggest the API for this be strictly a subset of the upstream ufoLib's, so that users can choose one or the other based on their needs.
I'd also like to add `fs` module support to it, as we need that for build-system experimentation. Which brings us to the topic of the UFO4 branch: https://github.com/unified-font-object/ufoLib/tree/ufo4 In general, I suggest we avoid these big revolutionary major-version bumps that take forever to roll out and cause dependency hell, and instead work on an evolutionary UFO that changes slowly over time in the master branch.
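The appeal of an `fs`-style filesystem abstraction is that the same UFO-reading code can target a plain directory, a zip archive, or an in-memory filesystem. The sketch below uses only the standard library to illustrate the idea; the function name and file layout are hypothetical, and this is not the actual `fs` (PyFilesystem2) API:

```python
import os
import tempfile
import zipfile

def read_text(root, relpath):
    """Read a text file from either a plain directory or a .zip archive."""
    if os.path.isdir(root):
        with open(os.path.join(root, relpath), encoding="utf-8") as f:
            return f.read()
    with zipfile.ZipFile(root) as zf:
        return zf.read(relpath).decode("utf-8")

# Demo: the same call reads "metainfo.plist" from a directory-based UFO
# and from a zipped copy of it (both created here just for illustration).
tmp = tempfile.mkdtemp()
ufo_dir = os.path.join(tmp, "Test.ufo")
os.makedirs(ufo_dir)
with open(os.path.join(ufo_dir, "metainfo.plist"), "w", encoding="utf-8") as f:
    f.write("<plist/>")

ufo_zip = os.path.join(tmp, "Test.ufoz")
with zipfile.ZipFile(ufo_zip, "w") as zf:
    zf.write(os.path.join(ufo_dir, "metainfo.plist"), "metainfo.plist")

print(read_text(ufo_dir, "metainfo.plist") == read_text(ufo_zip, "metainfo.plist"))
```

A real `fs`-backed reader would take a filesystem object instead of a path, so callers could pass `OSFS`, `ZipFS`, or `MemoryFS` interchangeably.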
Please discuss.
@typesupply @typemytype @anthrotype @justvanrossum @LettError @jenskutilek @madig @brawer
Issue Analytics
- Created: 6 years ago
- Comments: 251 (136 by maintainers)
Top GitHub Comments
Great discussion. I have input on various aspects. Different goals justify different actions at this point. I’ll try to group them logically:
The main reason fontmake is slow is this: it was developed with “correctness first” in mind, and with limited resources. In my view it was going to be rewritten at some point anyway.
Based on Simon’s flamegraph in https://github.com/fonttools/fonttools/issues/1095#issuecomment-797755327, below I identify low-hanging fruit that can be sped up with relative ease in the existing fontmake, while addressing the “to Python or not Python” and “Babelfont’s 5x faster” questions implicitly. In a subsequent comment I’ll share some ideas that I had back then for what a rewrite might look like.
For the sake of discussion, I'd like to name three different use-cases for fontmake:
The rest of this comment is mostly involved with speeding up narrow builds, as in “make me a variable font”.
Low-hanging fruit
For reference, here's the flame-graph (see the linked comment for the image):
I think we can make it 2x faster by trimming fat from the following:
- The middle of the graph (the wide column with `_writeTable` etc.) suggests that individual master binaries are generated; indeed, two columns to its right, the mention of “expand” suggests that the binaries are then being loaded again. That's unnecessary: we just need to build a Python TTFont, with no need to compile it before passing it to varLib to make a varfont. Removing that unnecessary save (and fixing up any breakage from it) should save at least 1 babelfont of work.
- Another ~1 babelfont of time is spent in `iup_delta_optimize`. There are multiple lessons here. `iup_delta_optimize` is simple dynamic programming; its runtime is well understood, and it would probably be over 1000x faster if done in C. It has two parts: simple cases are handled first (entire contour shifted, etc.; no actual interpolation involved), then the full algorithm. The full algorithm's gains are extremely tiny: maybe a dozen bytes in a mid-sized font. So if it's very slow and the gains are very small, maybe it shouldn't be enabled by default. If we had optimization levels like compilers do, this would be enabled by `-O3` or `-Os`/`-Oz`, but not by `-O2`, which would be the recommended default.
- There's no point trying to speed up `gvar` generation using fancy facilities; `gvar` is already generated very optimally, since entire `GlyphCoordinates` objects are passed to VariationModel. Now, `GlyphCoordinates` uses an `array.array` for storage (this was done when I removed NumPy), so its `__add__`/`__mul__` etc. are implemented in Python. They are still fast, but making them happen in C would give us a significant speedup in `gvar` generation. Try Cythonizing them? Ideally, maybe share this code / work with `fontTools.misc.vector`.
- Currently `OTTableWriter` keeps a list of “items” that are concatenated at the end. The items are typically `bytes()` objects that are 2 or 4 bytes long; that's insane overhead, using a Python object to store 2 bytes. Now I'm thinking that switching `OTTableWriter` to using a `bytearray` internally as a bytes-builder makes more sense. The offsets and other calculated ints need more bookkeeping to know their position inside the table, but that can be done easily. I have a feeling this could make GSUB/GPOS table compilation many times faster.
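To make the `bytearray`-as-bytes-builder idea concrete, here is a toy sketch (hypothetical class and method names; not `OTTableWriter`'s actual code): values are packed into one growing `bytearray` instead of many tiny `bytes` objects, and an offset whose target isn't known yet is reserved by position and backfilled later:

```python
import struct

class TinyWriter:
    """Toy bytes-builder: one bytearray instead of a list of tiny bytes objects."""
    def __init__(self):
        self.data = bytearray()

    def write_ushort(self, value):
        self.data += struct.pack(">H", value)

    def reserve_offset(self):
        # Remember where the 2-byte offset lives; backfill it later.
        pos = len(self.data)
        self.data += b"\x00\x00"
        return pos

    def fill_offset(self, pos, value):
        struct.pack_into(">H", self.data, pos, value)

w = TinyWriter()
w.write_ushort(1)                 # e.g. a format field
off = w.reserve_offset()          # offset to a subtable, not yet known
w.write_ushort(2)                 # more fixed-size data
w.fill_offset(off, len(w.data))   # the subtable starts here (offset = 6)
w.write_ushort(0xBEEF)            # the "subtable" payload
print(w.data.hex())               # -> 000100060002beef
```

The position returned by `reserve_offset` is exactly the extra bookkeeping mentioned above: instead of a Python object per 2-byte item, each pending offset is just an integer index into the buffer.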
As a tangent: the GSUB/GPOS compiler in fonttools is REALLY slow if overflows happen. I started working on fixing that in the 99proof branch back in 2018:
https://github.com/fonttools/fonttools/tree/99proof
I talked about it extensively at Robothon 2018, in the so-called “99 Proof Small Batch Distillery” talk:
https://vimeo.com/330981972
But never finished it.
Later in 2018 / 2019, I wrote a design document for the HarfBuzz Subsetter, building on top of those ideas but suggesting a different approach, called “Faster Horse Freezer”:
https://goo.gl/bHvnTn
but I never implemented that. Fortunately, @garretrieger implemented most of the reordering ideas from 99proof into an `hb-repacker` module, which we are landing in HarfBuzz today: https://github.com/harfbuzz/harfbuzz/pull/2857
The code is very isolated from the rest of the HarfBuzz subsetter. It basically takes a graph of objects and links, and tries to produce an ordering that wouldn't overflow. It shouldn't be too much work to pass the `OTTableWriter` graph to C and call the repacker on it…

Incidentally, I tried an experiment wrapping the Rust norad library in Python. It's quite a speedup.
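As a toy illustration of the repacker idea described above (the table names, sizes, and link structure here are hypothetical, and this is not hb-repacker's actual API): take a graph of tables and parent→child links, order it parents-first so every link points forward, lay the tables out, and check that each offset fits in 16 bits:

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Hypothetical table graph: node -> (size in bytes, children it links to).
tables = {
    "GPOS":       (10, ["LookupList"]),
    "LookupList": (6,  ["Lookup1", "Lookup2"]),
    "Lookup1":    (40, []),
    "Lookup2":    (40, []),
}

# graphlib yields dependencies (here: children) first, so reverse to get
# parents-first; that makes all offsets forward, as OpenType requires.
order = list(TopologicalSorter(
    {node: children for node, (_, children) in tables.items()}
).static_order())
order.reverse()

# Lay tables out in that order and verify each parent->child offset
# (measured from the start of the parent table) fits in an unsigned 16-bit int.
pos, positions = 0, {}
for node in order:
    positions[node] = pos
    pos += tables[node][0]

for node, (_, children) in tables.items():
    for child in children:
        offset = positions[child] - positions[node]
        assert 0 < offset <= 0xFFFF, f"offset overflow on {node}->{child}"
print(order, positions)
```

A real repacker does much more than this (it searches among valid orderings, and can split or duplicate subtables when no ordering fits), but the core data model is the same: a graph of sized objects plus links, and a layout that keeps every link representable.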