How do I just tokenize, without using color themes?

Hi - shiki looks great! I’m trying to use it to replace prism.js on https://tigyog.app/.

But I’m a bit confused about how to just get the raw tokens, or to get HTML output that uses normal CSS variables. I need this so I can generate HTML that uses the existing CSS classes and theme.

I can see codeToHtml, but this adds raw CSS colors to the HTML depending on theme. And I can see codeToThemedTokens, but that also seems to give me colors, rather than syntactic token types like variable, comment etc. I eventually found the 'css-variables' theme, but this seems hacky – it gives me strings like "var(--shiki-token-comment)", where I just want the string "comment". I could parse the "comment" string out of this, but it feels like I’m using it wrongly.
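For reference, the string-parsing workaround mentioned above could look roughly like the sketch below. It assumes the 'css-variables' theme emits colors shaped like "var(--shiki-token-<name>)"; the helper name `tokenTypeFromCssVariable` is hypothetical and not part of shiki.

```typescript
// Assumption: token colors from the 'css-variables' theme look like
// "var(--shiki-token-<name>)". Anything else (hex colors, the default
// text color variable, undefined) is treated as "no token type".
const SHIKI_TOKEN_VAR = /^var\(--shiki-token-([a-z-]+)\)$/;

// Hypothetical helper: extract "comment" from "var(--shiki-token-comment)".
function tokenTypeFromCssVariable(color: string | undefined): string | null {
  if (color === undefined) return null;
  const match = SHIKI_TOKEN_VAR.exec(color.trim());
  return match?.[1] ?? null;
}
```

This only recovers the handful of categories the theme defines, not the full TextMate scope of each token.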

Here’s the API I expected:

interface IToken {
    /**
     * The content of the token
     */
    content: string;
    /**
     * E.g. "comment", "variable", etc
     */
    type: string;

    explanation?: ITokenExplanation[];
}

codeToTokens(code: string, lang?: StringLiteralUnion<Lang>): IToken[][];

Basically, I just want to use the tokenizer functionality, without the theme features. Is there a way to do this idiomatically with the shiki API?

Issue Analytics

  • State: closed
  • Created: 9 months ago
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

2 reactions
Gerrit0 commented, Dec 12, 2022

If you don’t want any styling info, you probably don’t want Shiki… using vscode-textmate directly will give you the tokens you’re after, without attaching theming info. You’ll want grammar.tokenizeLine.
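A minimal sketch of that approach, under stated assumptions: the interfaces below are local structural stand-ins for the vscode-textmate types (tokens with `startIndex`, `endIndex`, and `scopes`; `tokenizeLine` returning `tokens` and `ruleStack`), and `tokenizeWithoutTheme` and `PlainToken` are hypothetical names. Loading a grammar through the library's registry is not shown, and the library's `INITIAL` constant is what is typically passed as the starting state.

```typescript
// Local stand-ins for the vscode-textmate shapes, so the sketch stands alone.
interface ScopedToken {
  startIndex: number;
  endIndex: number;
  scopes: string[]; // assumed order: least specific first, most specific last
}

interface LineTokenizer<State> {
  tokenizeLine(
    line: string,
    prevState: State | null
  ): { tokens: ScopedToken[]; ruleStack: State };
}

// Hypothetical output shape, close to the API requested in the issue.
interface PlainToken {
  content: string;
  type: string; // most specific scope, e.g. "comment.line.double-slash.js"
}

function tokenizeWithoutTheme<State>(
  grammar: LineTokenizer<State>,
  code: string,
  initialState: State | null = null
): PlainToken[][] {
  let state: State | null = initialState;
  return code.split(/\r?\n/).map((line) => {
    const result = grammar.tokenizeLine(line, state);
    // Carry the rule stack forward so multi-line constructs keep their state.
    state = result.ruleStack;
    return result.tokens.map((token) => ({
      content: line.slice(token.startIndex, token.endIndex),
      type: token.scopes[token.scopes.length - 1] ?? "",
    }));
  });
}
```

The `type` here is a full TextMate scope name rather than a short label like "comment", so mapping scopes onto existing CSS classes (for example by prefix) is still up to the caller.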

You could also set includeExplanation when calling codeToThemedTokens and check the explanation property, but performance will be worse: that option essentially tokenizes everything twice and slices strings many more times than without it.
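As a rough sketch of reading that property: the interfaces below are assumed, locally declared subsets of what codeToThemedTokens returns with includeExplanation enabled (each explanation carrying `content` and a list of scopes with a `scopeName`), so check them against your shiki version. `typedTokensFromExplanations` and `TypedToken` are hypothetical names.

```typescript
// Assumed shapes, declared locally so the sketch does not depend on shiki.
interface ScopeExplanation {
  scopeName: string;
}
interface TokenExplanation {
  content: string;
  scopes: ScopeExplanation[]; // assumed order: most specific scope last
}
interface ThemedToken {
  content: string;
  color?: string;
  explanation?: TokenExplanation[];
}

// Hypothetical output shape.
interface TypedToken {
  content: string;
  type: string;
}

function typedTokensFromExplanations(lines: ThemedToken[][]): TypedToken[][] {
  return lines.map((line) => {
    const out: TypedToken[] = [];
    for (const token of line) {
      // Without an explanation, keep the token text and leave the type empty.
      const parts: TokenExplanation[] =
        token.explanation ?? [{ content: token.content, scopes: [] }];
      for (const part of parts) {
        const last = part.scopes[part.scopes.length - 1];
        out.push({ content: part.content, type: last ? last.scopeName : "" });
      }
    }
    return out;
  });
}
```

One themed token can carry several explanations, so the result may contain more, smaller tokens than the themed output did.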

1 reaction
wooorm commented, Dec 12, 2022

Maybe my project https://github.com/wooorm/starry-night might be of help in this case. It’s similar but different: it exposes an AST and uses the same dependencies.

I also wanted to bring this to the attention of the maintainers here: some recent issues are solved by starry-night, so if features don’t really make sense in Shiki, you can send users to starry-night.
