"smart punctuation", i.e. dynamic display of input characters

Some Markdown processors offer features called smart dashes, smart quotes, or smart punctuation, which translate certain sequences of basic characters available from the keyboard into specialized characters. When the source sequences are displayed literally in the editor pane, the text looks less appealing than it does with the translated characters.

It would be a wonderful enhancement to apply the conversions before displaying text in the editor.

The desired translations depend on localization, but to begin, I have provided a list for English (see below), some items of which may also apply to other languages.

Some translations are simple pattern substitution, whereas others are context dependent.

I also attempted an example set (see bottom).

Note that some editors translate input characters in the saved Markdown source. I would not recommend such a feature unless it is enabled only by user preference.

Note also that backslash escaping would need to be available to suppress translation where desired.
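
One possible reading of the escaping requirement, sketched in Python for the three-full-stops rule only. The function name `ellipsis_with_escapes` is hypothetical, and hiding the backslash in the displayed text is an assumption; the proposal does not say whether the escape character itself should stay visible.

```python
import re

ELLIPSIS = "\u2026"

# Match either an escaped character (backslash + any character)
# or a whole run of full stops, scanning left to right.
_TOKEN = re.compile(r"\\(.)|\.+", re.DOTALL)

def ellipsis_with_escapes(text: str) -> str:
    """Return display text; the source string itself is never modified."""
    def repl(m: re.Match) -> str:
        if m.group(1) is not None:
            # Escaped: show the character literally and drop the backslash
            # (assumption: the escape is hidden in the editor display).
            return m.group(1)
        # Only a run of exactly three full stops becomes an ellipsis.
        return ELLIPSIS if m.group(0) == "..." else m.group(0)
    return _TOKEN.sub(repl, text)
```

With this sketch, `wait...` is displayed with a single ellipsis character, while `wait\...` is displayed as three literal full stops, because the escaped first full stop breaks up the run.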


| Source literal pattern | Target Unicode characters | Descriptive name |
| --- | --- | --- |
| Exactly one ASCII hyphen: `-` | U+2010, or U+2012 if immediately preceded and followed by digits with no intervening space | Hyphen or figure dash |
| Exactly two ASCII hyphens: `--` | U+2013 | En dash |
| Exactly three ASCII hyphens: `---` | U+2014 | Em dash |
| Exactly three ASCII full stops: `...` | U+2026 | Horizontal ellipsis |
| ASCII double quote: `"` | U+201D if not followed by any letter in word, otherwise U+201C | Left and right double quote mark |
| ASCII apostrophe: `'` | U+2018 if not preceded by any letter in word and if matched by a later character in the paragraph interpreted as a right single quote mark, otherwise U+2019. (Interpret as a right single quote if not followed by any letter in word and if matching an earlier character in the paragraph interpreted as a left single quote.) | Left and right single quote mark |
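
The dash and ellipsis rows are the simple pattern substitutions. A minimal Python sketch of those four rows follows; the function name `smart_dashes` is hypothetical, and leaving runs of any other length untouched is an assumption drawn from the word "exactly" in the list.

```python
import re

HYPHEN = "\u2010"
FIGURE_DASH = "\u2012"
EN_DASH = "\u2013"
EM_DASH = "\u2014"
ELLIPSIS = "\u2026"

# Match whole runs of hyphens or full stops so that "exactly N" can be enforced.
_RUN = re.compile(r"-+|\.+")

def smart_dashes(text: str) -> str:
    """Return display text with dashes and ellipses translated."""
    def repl(m: re.Match) -> str:
        run = m.group(0)
        if run == "...":
            return ELLIPSIS
        if run == "---":
            return EM_DASH
        if run == "--":
            return EN_DASH
        if run == "-":
            s = m.string
            before = s[m.start() - 1] if m.start() > 0 else ""
            after = s[m.end()] if m.end() < len(s) else ""
            # Figure dash only when digits sit directly on both sides.
            if before.isdigit() and after.isdigit():
                return FIGURE_DASH
            return HYPHEN
        return run  # any other run length is left as typed (assumption)
    return _RUN.sub(repl, text)
```

For instance, `smart_dashes("Ages 15-20")` yields a figure dash between the digits, while the hyphen in `stock-market` becomes U+2010.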

| Original text | Formatted output |
| --- | --- |
| `The Nov--Dec period is busiest for retail outlets.` | The Nov–Dec period is busiest for retail outlets. |
| `I went shopping yesterday---I go every Sunday---but the market was closed.` | I went shopping yesterday—I go every Sunday—but the market was closed. |
| `Ages 15-20 are considered formative for adult personality.` | Ages 15‒20 are considered formative for adult personality. |
| `It seems he wants us to follow him...` | It seems he wants us to follow him… |
| `The '29 stock-market crash precipitated the Great Depression.` | The ’29 stock-market crash precipitated the Great Depression. |
| `Marsha Robinson's dog is ill.` | Marsha Robinson’s dog is ill. |
| `At least when I asked Tom, he said, "Marsha's dog is ill."` | At least when I asked Tom, he said, “Marsha’s dog is ill.” |
| `I told you, "Tom said, 'Marsha's dog is ill'".` | I told you, “Tom said, ‘Marsha’s dog is ill’”. |
| `So if you see a sick dog, you can say, 'This dog is the Robinsons''.` | So if you see a sick dog, you can say, ‘This dog is the Robinsons’’. |
| `Be sure not to say, 'This dog is the *Petersons'*'!` | Be sure not to say, ‘This dog is the Petersons’’! |
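
The quote rules are the context-dependent ones. The following Python sketch is a simplified reading of them, not a full implementation: the name `smart_quotes` is hypothetical, digits are treated as word characters (an assumption beyond the proposal's wording), and any later closing candidate counts as a "match" instead of pairing quotes one to one.

```python
LEFT_SINGLE = "\u2018"
RIGHT_SINGLE = "\u2019"
LEFT_DOUBLE = "\u201c"
RIGHT_DOUBLE = "\u201d"

def smart_quotes(paragraph: str) -> str:
    """Return display text for one paragraph with quotes translated."""
    n = len(paragraph)

    def word_char_at(i: int) -> bool:
        # Assumption: letters and digits both count as "letter in word".
        return 0 <= i < n and paragraph[i].isalnum()

    # Apostrophes that could close a quotation: not followed by a word character.
    closers = [i for i, c in enumerate(paragraph)
               if c == "'" and not word_char_at(i + 1)]

    out = list(paragraph)
    for i, c in enumerate(paragraph):
        if c == '"':
            # Left mark if a word follows, otherwise right mark.
            out[i] = LEFT_DOUBLE if word_char_at(i + 1) else RIGHT_DOUBLE
        elif c == "'":
            # Left mark only when no word character precedes it and some
            # later apostrophe could close it; everything else (closing
            # quotes and true apostrophes) becomes U+2019.
            opens = not word_char_at(i - 1) and any(j > i for j in closers)
            out[i] = LEFT_SINGLE if opens else RIGHT_SINGLE
    return "".join(out)
```

Under these assumptions the sketch reproduces the quote examples in the table, including the unmatched apostrophe in `'29`. It would mislabel an abbreviation like `'29` whenever a closing candidate appears later in the same paragraph, which is one reason the proposal calls these rules context dependent.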

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 3
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

brainchild0 commented, Oct 30, 2020 (1 reaction)

You are right, now that I think of it, the German and Czech styles, which I am familiar with, use different quotation standards (different signs and placement).

And users writing in such languages load packages that alter processing to follow the appropriate conventions. Only English-language users may lazily avoid these packages because of the American origin of the software. Ultimately, extensibility is required in any case to meet the demands of all users worldwide for a feature as complicated as the one proposed.


Does LaTeX create an en dash even when processing a single hyphen between digits?

No, it produces a hyphen (U+002D).

Then LaTeX behaves the same as in the rules of the proposal, except that the proposal requires a figure dash in special cases. I won’t change the proposal now, because this distinction is far from the central purpose of the discussion, but I am open to any design choice that favors adherence to some authoritative convention.

brainchild0 commented, Oct 30, 2020 (1 reaction)

Thanks for the comments.

For numeric ranges such as page ranges in a citation, the correct character to use is the en dash. Being used to LaTeX, I’m used to typing pp. 78--103 in the source code and seeing an en dash in the output. This would be a great feature.

Are you sure? I believe that a figure dash is always preferred for numeric ranges, which LaTeX (or any similar software) should identify as the conversion target if the source character is surrounded by digits (as opposed to word characters, e.g. letters). Actually, I am not familiar with a convention of a double hyphen in the source when surrounded by digits. LaTeX may convert it to an em dash as it would otherwise, but I doubt that this result is desirable typographically. An em dash has a different purpose, usually to set apart clauses.

See the Wikipedia articles on the characters for clarification.
