Allsorts Font Shaping Engine 0.6 Release

· Wesley Moore — Developer

In the last six months we've added support for shaping new scripts, glyph positioning, accessing glyph contours, improved performance, and more.

Allsorts is a Rust crate (library) that can parse OpenType, WOFF, and WOFF2 fonts, shape text, and subset fonts (font shaping and font subsetting are defined at the end of this post). We use it in Prince for all font parsing, shaping, and subsetting.

Since the Allsorts 0.5 release in Dec 2020 we've continued to improve Allsorts and expand its capabilities. The 0.6 release adds support for shaping Khmer, Lao, Thai, and Sinhala scripts; laying out glyphs and obtaining their positions; accessing glyph contours; improved performance; and a few bug fixes as well.

New Scripts

We continue to expand the scripts Allsorts can shape and have added support for Khmer, Lao, Thai, and Sinhala scripts. Each new script is accompanied by extensive corpus-based tests with text extracted from real-world sources like Wikipedia.

Extract from the Khmer translation of The Tower of Babel shaped in Prince
Extract from the Lao translation of The Tower of Babel shaped in Prince
Extract from the Thai translation of The Tower of Babel shaped in Prince
Extract from the Sinhala translation of The Tower of Babel shaped in Prince

Glyph Positioning and Contours

You can now lay out glyphs and obtain their positions, as well as access glyph contours. This makes it possible to draw text by pairing Allsorts with any vector graphics library that supports move to, line to, quadratic curve to, cubic curve to, and close path operations.

GlyphLayout is used to obtain the positions for a collection of shaped glyphs. The position for each glyph includes its horizontal and vertical advance as well as any (x, y) offset from the origin. Horizontal layout in left-to-right and right-to-left directions is supported. Basic (but incomplete) support for vertical text is present too.

The positions of a series of glyphs are determined from an initial pen position (which the application using Allsorts maintains) that is incremented by the advance of each glyph as it is processed. The position of a particular glyph is the current pen position plus that glyph's X and Y offsets.
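The pen-position rule above can be sketched in a few lines of Rust. This is a minimal illustration, not the Allsorts API: the `Position` struct and the `place_glyphs` function are hypothetical names invented for this example, and the sketch assumes that the advance on the axis not used by the text direction is zero.

```rust
/// Hypothetical stand-in for the per-glyph data described above
/// (advances plus an offset from the origin). Not the Allsorts type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Position {
    pub hori_advance: i32,
    pub vert_advance: i32,
    pub x_offset: i32,
    pub y_offset: i32,
}

/// Returns the point at which each glyph should be drawn, starting
/// from an initial pen position maintained by the caller.
pub fn place_glyphs(start: (i32, i32), positions: &[Position]) -> Vec<(i32, i32)> {
    let (mut pen_x, mut pen_y) = start;
    let mut origins = Vec::with_capacity(positions.len());
    for pos in positions {
        // A glyph is drawn at the current pen position plus its offsets.
        origins.push((pen_x + pos.x_offset, pen_y + pos.y_offset));
        // The pen then moves on by the glyph's advance. Assumption: the
        // advance on the unused axis is zero, so adding both is harmless.
        pen_x += pos.hori_advance;
        pen_y += pos.vert_advance;
    }
    origins
}
```

In this sketch a combining mark would typically have a zero advance and a non-zero offset, so it is drawn relative to the pen without moving it, and the glyph after it starts where the mark's base glyph ended.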

The outline module provides access to the contours of glyphs as a series of foundational drawing instruction callbacks. Currently we support reading glyph contours from glyf and CFF tables. The allsorts tool has been extended to make use of this in order to generate SVGs of glyphs.
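To illustrate the callback style, here is a minimal sketch of a receiver that turns the five drawing operations into SVG path data. The `PathSink` trait and `SvgPath` type are hypothetical names made up for this example and are not the types exported by Allsorts; the real outline module's trait and point types may differ. The sketch also ignores the fact that font coordinates have the Y axis pointing up while SVG has it pointing down.

```rust
/// Hypothetical receiver for the five drawing operations named above.
/// Not the Allsorts trait; points are plain (x, y) tuples here.
pub trait PathSink {
    fn move_to(&mut self, to: (f32, f32));
    fn line_to(&mut self, to: (f32, f32));
    fn quadratic_curve_to(&mut self, control: (f32, f32), to: (f32, f32));
    fn cubic_curve_to(&mut self, c1: (f32, f32), c2: (f32, f32), to: (f32, f32));
    fn close(&mut self);
}

/// Accumulates the contents of an SVG path's `d` attribute.
#[derive(Default)]
pub struct SvgPath {
    pub d: String,
}

impl SvgPath {
    // Appends one path command, separating commands with a space.
    fn push(&mut self, segment: String) {
        if !self.d.is_empty() {
            self.d.push(' ');
        }
        self.d.push_str(&segment);
    }
}

impl PathSink for SvgPath {
    fn move_to(&mut self, to: (f32, f32)) {
        self.push(format!("M{} {}", to.0, to.1));
    }

    fn line_to(&mut self, to: (f32, f32)) {
        self.push(format!("L{} {}", to.0, to.1));
    }

    fn quadratic_curve_to(&mut self, control: (f32, f32), to: (f32, f32)) {
        self.push(format!("Q{} {} {} {}", control.0, control.1, to.0, to.1));
    }

    fn cubic_curve_to(&mut self, c1: (f32, f32), c2: (f32, f32), to: (f32, f32)) {
        self.push(format!("C{} {} {} {} {} {}", c1.0, c1.1, c2.0, c2.1, to.0, to.1));
    }

    fn close(&mut self) {
        self.push("Z".to_string());
    }
}
```

A caller would create an `SvgPath::default()`, let the outline reader drive the callbacks for one glyph, and then place the resulting `d` string in a `<path>` element. Any other vector graphics backend could be plugged in the same way by implementing the same five methods.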

SVG generated with the allsorts CLI tool presenting glyphs from TeX Gyre Termes.

In turn, we used this SVG generation ability to hook Allsorts up to the Unicode text rendering tests. Allsorts passes the tests for features that we support. We hope to integrate Allsorts into the upstream text rendering tests project soon.

Sample test from the Unicode text rendering tests showing Allsorts produces the expected output.

Performance Improvements

Several code paths have been made more efficient, leading to mild speedups in our tests:

  • Optimised GPOS duplicate lookup removal.
  • Optimised handling of CFF fonts with custom character sets.
  • Fixed a mistake that resulted in the vmtx table being repeatedly cloned.
  • Avoided some allocations when working with glyphs.

Miscellaneous Improvements and Fixes

  • API improvements:
  • Replaced rental with ouroboros as rental was no longer maintained. (Thanks @est31)
  • Handle post tables that map more than one glyph to the same name index.
  • Apply Unicode mark reordering to more scripts, not just Arabic and Indic scripts.
  • Always apply default shaping for complex scripts.
  • Handle non-adjacent cursive connections in GPOS.
  • Support version 1.1 GPOS/GSUB tables.

Font shaping is the process of taking text in the form of Unicode codepoints and a font, and laying out glyphs according to the text. This involves honouring kerning, ligatures, and substitutions specified by the font and performing glyph reordering according to the script-specific OpenType rules.

Font subsetting refers to decreasing the size of a font by only including the data for a reduced set of glyphs.
