Issue inline parser of Chinese

When I parse input like **Foo:**Bar, the result I want is <strong>Foo:</strong>Bar, but it is rendered as the literal text **Foo:**Bar, unchanged from the input.

I think this is correct under the CommonMark specification for English text. But in Chinese, for input like **标题:**这是一个例子, we would prefer it to render as <strong>标题:</strong>这是一个例子, rather than having to add a space between ** and 这是 (which is all I can do right now), because Chinese text does not put a space after the colon the way English does. In fact, that is how most people write.

I wonder whether an option could be added to turn this “strict verification mode” on or off. I could not find a similar issue.

For example, in InlineParserImpl.java#L538, if I always keep canClose=true, I get the output I need.
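
For readers unfamiliar with the rule behind canClose: in the CommonMark spec, a run of `*` can close emphasis only if it is "right-flanking", and a run that is preceded by punctuation is right-flanking only when it is followed by whitespace or punctuation. The sketch below is a standalone illustration of that rule, not the commonmark-java code; the class and method names are hypothetical, and it simplifies by looking at single `char` values (BMP only) and by using the punctuation definition of ASCII punctuation plus the Unicode P categories.

```java
// Standalone sketch of the CommonMark right-flanking rule for a run of '*'.
// Not the commonmark-java implementation; all names here are hypothetical.
class FlankingSketch {

    // Unicode whitespace per the spec: category Zs, tab, LF, FF or CR.
    // Positions outside the string (start/end of input) count as whitespace.
    static boolean isWhitespaceAt(String s, int i) {
        if (i < 0 || i >= s.length()) {
            return true;
        }
        char c = s.charAt(i);
        return c == '\t' || c == '\n' || c == '\f' || c == '\r'
                || Character.getType(c) == Character.SPACE_SEPARATOR;
    }

    // ASCII punctuation, or Unicode general categories Pc, Pd, Pe, Pf, Pi, Po, Ps.
    static boolean isPunctuationAt(String s, int i) {
        if (i < 0 || i >= s.length()) {
            return false;
        }
        char c = s.charAt(i);
        if (c < 128) {
            // Printable ASCII that is neither a letter nor a digit.
            return c > ' ' && c < 127 && !Character.isLetterOrDigit(c);
        }
        switch (Character.getType(c)) {
            case Character.CONNECTOR_PUNCTUATION:
            case Character.DASH_PUNCTUATION:
            case Character.END_PUNCTUATION:
            case Character.FINAL_QUOTE_PUNCTUATION:
            case Character.INITIAL_QUOTE_PUNCTUATION:
            case Character.OTHER_PUNCTUATION:
            case Character.START_PUNCTUATION:
                return true;
            default:
                return false;
        }
    }

    // A '*' run can close emphasis only if it is right-flanking:
    // not preceded by whitespace, and either not preceded by punctuation,
    // or followed by whitespace or punctuation.
    static boolean canCloseStar(String s, int runStart, int runLength) {
        int before = runStart - 1;
        int after = runStart + runLength;
        return !isWhitespaceAt(s, before)
                && (!isPunctuationAt(s, before)
                    || isWhitespaceAt(s, after)
                    || isPunctuationAt(s, after));
    }

    public static void main(String[] args) {
        String[] inputs = {"**Foo:**Bar", "**Foo:** Bar", "**标题:**这是一个例子"};
        for (String input : inputs) {
            int closing = input.lastIndexOf("**"); // start of the second '**' run
            System.out.println(input + " -> canClose=" + canCloseStar(input, closing, 2));
        }
        // Prints canClose=false, canClose=true, canClose=false in that order.
    }
}
```

The second `**` in **Foo:**Bar comes right after a colon and right before a letter, so it is not right-flanking and cannot close; adding a space after it makes it right-flanking. Chinese characters count as letters here, so the Chinese example fails in the same way.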

I don’t know much about CommonMark, so if I am wrong, please correct me. Thank you! 😄

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5

Top GitHub Comments

robinst commented, Dec 23, 2021

Ok. We don’t want to deviate from the spec on this, so we’re not going to change the parsing logic in commonmark-java.

You can try raising an issue against the spec at https://github.com/commonmark/commonmark-spec, but I have a feeling that making this work would break other common cases.

boyshell commented, Dec 19, 2021

Hello! Your email has been received.
