glyphs/case mapping between caps and lowercases

For example, a font I am trying to onboard has Ydieresis but not ydieresis. We need a case mapping check, because if someone capitalises ÿ… then there won’t be any Ÿ. We could have the same problem with small caps too.

There will be a few exceptions, like Dz, which doesn’t have a lowercase relative.
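
A minimal sketch of the kind of check described above, assuming the font's encoded characters are available as a plain set of Unicode codepoints. The function name missing_case_counterparts and the sample cmap set are hypothetical and not taken from any existing tool. Only single-character mappings from Python's built-in str.upper() and str.lower() are considered.

```python
def missing_case_counterparts(codepoints):
    """Return (codepoint, counterpart) pairs where the single-character
    upper- or lowercase counterpart is absent from `codepoints`."""
    present = set(codepoints)
    missing = []
    for cp in sorted(present):
        char = chr(cp)
        for mapped in (char.upper(), char.lower()):
            # Skip characters that do not change, and multi-character
            # results such as "SS", which need separate handling.
            if mapped != char and len(mapped) == 1 and ord(mapped) not in present:
                missing.append((cp, ord(mapped)))
    return missing


# Hypothetical character set: contains Ÿ (U+0178) but not ÿ (U+00FF).
cmap = {0x0041, 0x0061, 0x0178}
for cp, counterpart in missing_case_counterparts(cmap):
    print(f"U+{cp:04X} has no counterpart U+{counterpart:04X} in the font")
```

Working on codepoints rather than glyph names keeps the sketch independent of any naming convention, but it also means unencoded glyphs such as small caps are not covered.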

Issue Analytics

State:
Created 2 years ago
Comments:5 (5 by maintainers)

Top GitHub Comments

RosaWagner commented, Sep 13, 2021

To what @chrissimpkins mentioned I would add another exception: uni0237 (dotless j) doesn’t have a capital counterpart either.

For the small caps mapping, it logically has to be the same mapping as for the uppercase.
IMO severity=10 / FAIL, because if a font shows tofu in caps but not in lowercase, then it can be considered broken.
Would it be a problem to implement this check before having an exhaustive list of exceptions? It is really hard to check case mapping with human eyes, and the exception list could be completed when something comes up.
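
One way to ship the check before the exception list is complete is to make the exceptions a parameter with a small default. The following is only a sketch under that assumption: the names CASE_MAPPING_EXCEPTIONS and check_case_mapping are hypothetical, the two default entries are simply the examples raised in this thread, and the returned "FAIL"/"PASS" strings stand in for whatever status values a real checker uses.

```python
# Starting exception set, taken from the examples in this thread. It is
# deliberately incomplete and meant to grow as new cases come up.
CASE_MAPPING_EXCEPTIONS = frozenset({
    0x01F2,  # Dz (titlecase digraph)
    0x0237,  # dotless j
})


def check_case_mapping(codepoints, exceptions=CASE_MAPPING_EXCEPTIONS):
    """Return (status, problems) for a set of encoded codepoints."""
    present = set(codepoints)
    problems = []
    for cp in sorted(present - set(exceptions)):
        char = chr(cp)
        for mapped in (char.upper(), char.lower()):
            # Only single-character mappings are compared here.
            if mapped != char and len(mapped) == 1 and ord(mapped) not in present:
                problems.append(f"U+{cp:04X} maps to missing U+{ord(mapped):04X}")
    return ("FAIL" if problems else "PASS"), problems


# Hypothetical font that encodes ÿ (U+00FF) but not Ÿ (U+0178).
print(check_case_mapping({0x00FF}))
```

New exceptions can then be added to the set without touching the check itself.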

chrissimpkins commented, Apr 2, 2021

Can we come up with a list of them all?

It looks like maybe this Unicode chart is a start? https://www.unicode.org/charts/case/chart_NoCaseMapping.html

Defined as the following:

If characters have a decomposition containing a cased character, but do not have a case mapping (lower, title, upper, or fold), then they are listed in NoCaseMapping.
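
The definition above can be approximated with Python's standard library. This is a rough sketch, not the procedure used to build the Unicode chart: it assumes the compatibility decomposition (NFKD) is the relevant one, and it treats a character as cased when its general category is Lu, Ll or Lt, which is narrower than the Unicode Cased property. All function names are hypothetical.

```python
import unicodedata


def is_cased(char):
    # Approximation of the Cased property using general categories only.
    return unicodedata.category(char) in ("Lu", "Ll", "Lt")


def has_case_mapping(char):
    # True if any of lower, title, upper or fold changes the character.
    return any(
        mapped != char
        for mapped in (char.lower(), char.title(), char.upper(), char.casefold())
    )


def in_no_case_mapping(char):
    """Decomposes to something containing a cased character, yet has no
    case mapping of its own."""
    decomposed = unicodedata.normalize("NFKD", char)
    if decomposed == char:
        return False
    return any(is_cased(c) for c in decomposed) and not has_case_mapping(char)


# U+00AA FEMININE ORDINAL INDICATOR decomposes to "a" but has no case mapping.
print(in_no_case_mapping("\u00AA"))
```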

Also relevant from the Unicode case mapping docs:

There are a number of complications to case mappings that occur once the repertoire of characters is expanded beyond ASCII.

In most cases, the titlecase is the same as the uppercase, but not always. For example, the titlecase of U+01F1 “DZ” capital dz is U+01F2 “Dz” capital d with small z.

Case mappings may produce strings of different length than the original. For example, the German character U+00DF “ß” small letter sharp s expands when uppercased to the sequence of two characters “SS”. This also occurs where there is no precomposed character corresponding to a case mapping, such as with U+0149 “ŉ” latin small letter n preceded by apostrophe.

There are some characters that require special handling, such as U+0345 combining iota subscript.

Characters may also have different case mappings, depending on the context. For example, U+03A3 “Σ” capital sigma lowercases to U+03C3 “σ” small sigma if it is followed by another letter, but lowercases to U+03C2 “ς” small final sigma if it is not.

Characters may have case mappings that depend on the locale. For example, in Turkish the letter U+0049 “I” capital letter i lowercases to U+0131 “ı” small dotless i.

Since many characters are really caseless (most of the IPA block, for example) and have no matching uppercase, the process of uppercasing a string does not mean that it will no longer contain any lowercase letters.
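
Several of these complications can be observed directly with Python's built-in string methods, which is one reason a check cannot assume a one-to-one mapping. The snippet below is an illustration only; the dictionary name case_examples and its labels are made up. Note that Python's str methods are locale-independent, so the Turkish dotless i mapping is not applied.

```python
# Each entry pairs a label with the result of a built-in str method.
case_examples = {
    # Titlecase differs from uppercase: U+01F1 titlecases to U+01F2.
    "title(U+01F1)": "\u01F1".title(),
    # One character expands to two when uppercased.
    "upper(U+00DF)": "\u00DF".upper(),
    # Capital sigma followed by another letter lowercases to U+03C3.
    "lower(sigma, medial)": "\u03A3\u0391".lower(),
    # Capital sigma at the end of a word lowercases to U+03C2.
    "lower(sigma, final)": "\u0391\u03A3".lower(),
    # No locale tailoring: this is U+0069, not the Turkish U+0131.
    "lower(I)": "I".lower(),
}

for label, result in case_examples.items():
    print(label, [f"U+{ord(c):04X}" for c in result])
```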

It might be possible to pull this data out of the ICU library using something like the Cased or Changes_When_* properties?
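
Without assuming any particular ICU binding, the Changes_When_* properties can be approximated from their derivation in the Unicode Character Database, where for example Changes_When_Uppercased holds when toUppercase(toNFD(X)) differs from toNFD(X). The sketch below uses only Python's standard library; the function names are hypothetical, and the results depend on the Unicode version bundled with the Python interpreter, which may differ from the one in ICU.

```python
import unicodedata


def changes_when_uppercased(char):
    # Derived definition: toUppercase(toNFD(X)) != toNFD(X)
    nfd = unicodedata.normalize("NFD", char)
    return nfd.upper() != nfd


def changes_when_lowercased(char):
    # Derived definition: toLowercase(toNFD(X)) != toNFD(X)
    nfd = unicodedata.normalize("NFD", char)
    return nfd.lower() != nfd


# Dotless j (U+0237) has no uppercase mapping, so it would not be
# expected to have a capital counterpart in a font.
for cp in (0x0061, 0x00DF, 0x0237):
    print(f"U+{cp:04X}", changes_when_uppercased(chr(cp)))
```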
