New markdown parser
As per this discussion, mdsvex v1 will have a custom flavour of markdown that aims to be simpler and to come with a few additional features that are pretty much expected at this point. This is a tracking issue for that work.
This does not include the svelte parser; the aim is to keep the svelte and markdown parsers separate and compose them in another step. This may prove too challenging, since svelte syntax can appear almost anywhere in an mdsvex file, but I’ll see how that pans out.
The parsing strategy listed in the markdown spec is not really appropriate here, because HTML (and Svelte) syntax will be parsed into a full AST and does not match all markdown semantics.
Broadly speaking there are a few different ‘contexts’ that you can be in when parsing a markdown file.
- document can contain leaf_block or container_block.
- leaf_block cannot contain other blocks but can contain inline.
- container_block can contain leaf_block or additional container_block; container blocks are recursive.
- inline cannot contain blocks, but some inline nodes can nest other inline nodes.
→ Document →
  → Leaf Block ↑
    → Inline ⟳
  → Container Block ⟳
    → Leaf Block ↑
    → Container Block ⟳
      → ...
Specific nodes have additional rules on top of these very basic ones.
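As a rough illustration of the basic containment rules, here is a minimal TypeScript sketch. The `Kind` names mirror the contexts listed above, but `AnyNode`, `ALLOWED`, `canContain` and `validate` are hypothetical names made up for this example and are not part of mdsvex; the real AST may look quite different.

```typescript
// Hypothetical sketch of the four parsing contexts; not mdsvex's actual AST.
type Kind = "document" | "leaf_block" | "container_block" | "inline";

interface AnyNode {
  kind: Kind;
  children?: AnyNode[];
}

// Which child contexts each context may contain, per the basic rules only.
// Node-specific rules (e.g. which inline nodes may nest) would sit on top of this.
const ALLOWED: Record<Kind, ReadonlyArray<Kind>> = {
  document: ["leaf_block", "container_block"],
  leaf_block: ["inline"],
  container_block: ["leaf_block", "container_block"], // recursive
  inline: ["inline"], // only *some* inline nodes nest; treated as allowed here
};

function canContain(parent: Kind, child: Kind): boolean {
  return ALLOWED[parent].indexOf(child) !== -1;
}

// Walk a tree depth-first and collect every parent/child pairing that breaks the rules.
function validate(node: AnyNode): string[] {
  const errors: string[] = [];
  for (const child of node.children ?? []) {
    if (!canContain(node.kind, child.kind)) {
      errors.push(`${node.kind} cannot contain ${child.kind}`);
    }
    errors.push(...validate(child));
  }
  return errors;
}
```

Keeping the rules in a lookup table rather than in the type system makes it easy to layer node-specific restrictions on later, at the cost of checking them at runtime instead of compile time.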
Probably stuff I’ve missed here, will be updated as we go.
Setup
Leaf blocks
Inline
Container Blocks
Cross-cutting
Top GitHub Comments
Why does that matter? As long as the maintainers are comfortable maintaining it then it isn’t an issue.
I also have a preference for large files over many files. It has no bearing on the API and users won’t know the difference. This will be no different: most things will be inlined for performance reasons.
Just my 2 cents, but what stands in the way of a generated parser (like lezer) for generating the AST from the markdown/svelte source code? While the grammar may end up being complex, it should be a good middle ground between a modular and a fully hand-written approach. It may also allow you (given the generator is popular) to use an existing markdown grammar and just extend it with the svelte parts as you see fit.
Edit: It seems I underestimated markdown’s parsability, though the lezer implementation may still be a good starting point.