Detect typeface styles

Congrats on the fantastic tool. Have you given any thought to making the network aware of italic, bold, etc., as well as different types of typefaces? As far as I can tell this should (hopefully) be a relatively small change.

Here’s how I imagine it could be implemented:

  1. Treat all “stylistic” info (the specific font, whether it’s bold, italic, etc.) as an extra closed-class classification problem. The person doing the training is responsible for declaring which stylistic labels occur in the training data. E.g. if the training data has two typefaces, a main font and an alternate font, and the alternate can optionally be italic, then the new stylistic classifier has the following classes: main, alternate, alternate_italic.
  2. The training data is annotated with stylistic info. I imagine this is the slightly more annoying bit to implement. One could use some kind of XML markup to mark runs of characters set in a font other than the main one, e.g.
    This is the main font, then we have <alternate>some text in
    the alternate font</alternate> and finally
    <alternate_italic>the alternate font in italic</alternate_italic>
    
  3. In the forward pass of the network, the old character classifier is kept, but additionally the new stylistic classifier is also run to predict the correct font.
  4. ???
  5. Profit!
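
A minimal sketch of how the markup from step 2 could be turned into per-character style labels for the extra classifier. It assumes flat, non-nested tags exactly like the example above; the function name `parse_styled_line` and the `main` default class are made up for illustration and are not part of Calamari.

```python
import re

# Assumed tag syntax, taken from the example above: <style>...</style>, no nesting.
TAG_RE = re.compile(r"<(?P<style>\w+)>(?P<body>.*?)</(?P=style)>", re.DOTALL)


def parse_styled_line(line, default_style="main"):
    """Return (plain_text, styles), where styles[i] is the class of plain_text[i]."""
    text_parts = []
    styles = []
    pos = 0
    for match in TAG_RE.finditer(line):
        # Everything before the tag is set in the default (main) font.
        before = line[pos:match.start()]
        text_parts.append(before)
        styles.extend([default_style] * len(before))
        # The tag body gets the tag name as its style class.
        body = match.group("body")
        text_parts.append(body)
        styles.extend([match.group("style")] * len(body))
        pos = match.end()
    # Trailing text after the last tag is again in the default font.
    tail = line[pos:]
    text_parts.append(tail)
    styles.extend([default_style] * len(tail))
    return "".join(text_parts), styles


line = ("main font, <alternate>alt font</alternate> and "
        "<alternate_italic>italic</alternate_italic>")
text, styles = parse_styled_line(line)
classes = sorted(set(styles))  # the closed set of style classes seen in this line
```

The plain text can then go to the existing character model unchanged, while `styles` serves as the target for the stylistic classifier.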

Issue Analytics

  • State: open
  • Created 4 years ago
  • Reactions: 1
  • Comments: 8 (5 by maintainers)

Top GitHub Comments

2 reactions
chreul commented, Apr 27, 2019

We experimented with something similar when working on a historical lexicon (https://zenodo.org/record/1451482#.XMReFegzY2w). In this case study we decided to treat the task as two separate sequence classification problems: textual OCR and typography tagging. The respective models were trained and applied separately, and the results were combined in a postprocessing step using the positional information from Calamari’s extended prediction data output. As of now I think that this is the best way to do it. Of course, the computational effort increases, but the codecs stay minimal and each model can focus on its specific subtask. I would love a generic implementation of this but @ChWick is a little busy (i.e. lazy) right now 😃.
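
A rough sketch of what such a position-based postprocessing step could look like. The `Span` type, its pixel-column fields and `merge_predictions` are assumptions made for illustration; they do not reflect the actual format of Calamari's extended prediction data.

```python
from dataclasses import dataclass


@dataclass
class Span:
    label: str  # a character (OCR model) or a style class (typography model)
    start: int  # first pixel column, inclusive
    end: int    # last pixel column, exclusive


def overlap(a, b):
    """Number of pixel columns shared by two spans."""
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def merge_predictions(chars, styles, default_style="main"):
    """Give each OCR character the style whose span overlaps it the most."""
    merged = []
    for ch in chars:
        best = max(styles, key=lambda s: overlap(ch, s), default=None)
        if best is None or overlap(ch, best) == 0:
            # No typography prediction covers this character.
            merged.append((ch.label, default_style))
        else:
            merged.append((ch.label, best.label))
    return merged


ocr = [Span("a", 0, 10), Span("b", 10, 20), Span("c", 20, 30)]
typo = [Span("main", 0, 12), Span("italic", 12, 30)]
result = merge_predictions(ocr, typo)
```

Here the character "b" straddles the style boundary and is assigned to "italic" because 8 of its 10 columns fall inside that span.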

0 reactions
ChWick commented, Aug 7, 2020

Just random answers:

  1. You can use the ATR model as “pretrained weights” so you do not have to start from scratch. One alternative is to train both models in parallel, i.e. sharing the conv, pool and LSTM layers, and adding two FC layers (one for OCR, one for typography) and two loss functions. I tested this, but it performed very similarly (even a bit worse) to having two models. This code is, however, not integrated in Calamari.
  2. Having bbbbbbbb instead of one b has several advantages: a) it is possible to capture typographic changes within a word (we had a project where this was the case quite often), b) it is straightforward to use the pretrained OCR weights. I think @chreul tested this approach.
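
As I read point 2, the typography target carries one style code per character rather than one code per word. A small sketch of that encoding, with made-up single-letter codes and helper names (`encode_typography`, `decode_runs`) that are not taken from Calamari:

```python
from itertools import groupby


def encode_typography(segments, codes):
    """One style code per character, so the target is as long as the text."""
    return "".join(codes[style] * len(text) for text, style in segments)


def decode_runs(text, target):
    """Group consecutive characters that share a style code back into segments."""
    out = []
    pos = 0
    for code, run in groupby(target):
        n = len(list(run))
        out.append((text[pos:pos + n], code))
        pos += n
    return out


codes = {"normal": "n", "bold": "b"}  # hypothetical single-letter style codes
segments = [("Calamari", "bold"), ("OCR", "normal")]
text = "".join(t for t, _ in segments)
target = encode_typography(segments, codes)  # "bbbbbbbbnnn"
```

Because text and target have the same length, a style change inside a word is simply a change of code in the middle of the target, e.g. "nnnnbbbb" for "typeface" with only "face" in bold.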

Using a PC to determine the typography at each “Pixel” seems an interesting idea and I would assume good results, however:

  1. The word/character level annotation might not be very accurate
  2. This must be fully implemented (a lot of work).

The big advantage is that the “alignment” step is omitted, and I like that! So feel free to test this approach! It will work if you have enough time and training data.

It is also possible to “share” some code. I use the positional prediction of the Calamari types to obtain a pixel-wise labeling (similar to your PC approach!) to solve the alignment.
