Provide a way to populate field values from Markdown/MDX processing pipeline
Feature: I'd like to use an MDX plugin to create programmatic meta (variable declarations) that can then be exposed in the final object contentlayer provides.
Use case: An example of where this would be useful is generating a TOC, creating a list of unique keywords, etc.
Workaround: Use computed fields (which don't expose the raw AST)
Related issues: https://github.com/kentcdodds/mdx-bundler/issues/169
Consider the following config structure:

```js
export default makeSource({
  contentDirPath: "content",
  documentTypes: [Posts, Things, People],
  mdx: {
    rehypePlugins: [rehypeSlug, rehypeAutolinkHeadings, searchMeta],
  },
});
```
Here, `searchMeta` looks at paragraph nodes of the hast tree, grabs a list of unique words, and adds them to the metadata as `searchMeta`.
A markdown file with the structure of:
```md
---
title: Hello World
slug: hello-world
---

Hello World! Please say Hello!
```
Would generate a final object of:
```json
{
  "title": "Hello World",
  "slug": "hello-world",
  "searchMeta": ["hello", "world", "please", "say"],
  "code": "().....",
  "_raw": "..."
}
```
For the sake of complete, if not ugly, code, here's a working example of the plugin that adds `searchMeta` to the `data` attribute of the vfile in the rehype plugin chain.
```js
import { visit } from "unist-util-visit";

export default function searchMeta() {
  return (tree, file) => {
    // Collect unique words across every paragraph node, rather than
    // overwriting the list on each visit.
    const words = new Set();

    visit(tree, { tagName: "p" }, (node) => {
      node.children.forEach((current) => {
        if (typeof current.value !== "string") return;

        current.value
          .split(" ")
          .filter((word) => !word.includes(":"))
          .map((word) => word.toLowerCase().replace(/[^a-z0-9]/gi, ""))
          .filter((word) => word.length > 3)
          .forEach((word) => words.add(word));
      });
    });

    file.data.searchMeta = [...words];
  };
}
```
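For reference, the plugin can be exercised outside of contentlayer with a plain unified pipeline. This is only a sketch to show where the data ends up; the `./search-meta.js` path is assumed, and the packages are the standard unified/remark/rehype ones:

```js
import { unified } from "unified";
import remarkParse from "remark-parse";
import remarkRehype from "remark-rehype";
import rehypeStringify from "rehype-stringify";
import searchMeta from "./search-meta.js";

const file = await unified()
  .use(remarkParse)
  .use(remarkRehype) // markdown (mdast) -> HTML tree (hast)
  .use(searchMeta) // runs on the hast tree and writes to file.data
  .use(rehypeStringify)
  .process("Hello World! Please say Hello!");

console.log(file.data.searchMeta); // ["hello", "world", "please"] with the >3-character filter
```

The words land on the vfile, which is exactly the data this issue asks contentlayer to surface on the generated document.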
Top GitHub Comments
The more I think about it, accessing vfile.data from computed fields would totally solve my use-case. It’d still be nice to be able to do all the work “in” the handler, but being able to do visit work during the initial parsing and then passing that along with the payload would be more than sufficient. What do you think a reasonable timeline on that would be?
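To make that concrete, here's a hypothetical sketch of what a vfile-aware computed field could look like. The extra `vfileData` argument to `resolve` does not exist in contentlayer's current API; it is purely illustrative of the proposal:

```js
import { defineDocumentType } from "contentlayer/source-files";

export const Post = defineDocumentType(() => ({
  name: "Post",
  filePathPattern: "**/*.mdx",
  fields: {
    title: { type: "string", required: true },
    slug: { type: "string", required: true },
  },
  computedFields: {
    searchMeta: {
      type: "json",
      // Hypothetical: "vfileData" would be the vfile.data populated by the
      // rehype plugin during the initial parse. Not part of contentlayer today.
      resolve: (doc, vfileData) => vfileData.searchMeta ?? [],
    },
  },
}));
```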
As a temporary workaround, one could consider defining a computed field that parses the raw output from contentlayer. Here's an example of extracting the table of contents of a markdown file and making it available as a `toc` property in contentlayer.
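The original snippet isn't reproduced above, but a minimal sketch of that kind of workaround might look like the following. The `Post` document type and the naive heading regex are assumptions for illustration (the regex will also match `#` lines inside fenced code blocks):

```js
import { defineDocumentType, makeSource } from "contentlayer/source-files";

export const Post = defineDocumentType(() => ({
  name: "Post",
  filePathPattern: "**/*.md",
  fields: {
    title: { type: "string", required: true },
    slug: { type: "string", required: true },
  },
  computedFields: {
    toc: {
      type: "json",
      // doc.body.raw is the unprocessed markdown source that contentlayer
      // exposes on every document; scan it for ATX-style headings.
      resolve: (doc) =>
        Array.from(doc.body.raw.matchAll(/^(#{1,6})\s+(.+)$/gm)).map(
          ([, hashes, text]) => ({ depth: hashes.length, text: text.trim() })
        ),
    },
  },
}));

export default makeSource({ contentDirPath: "content", documentTypes: [Post] });
```

A remark-based pass over `doc.body.raw` (walking heading nodes instead of using a regex) is a common refinement that avoids those false positives.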