Provide a way to populate field values from Markdown/MDX processing pipeline
Feature: I'd like to use an MDX plugin to create programmatic meta (variable declarations) that can then be exposed in the final object contentlayer provides.
Use case: An example of where this would be useful is generating a TOC, creating a list of unique keywords, etc.
Workaround: Use computed fields (which don't expose the raw AST)
Related issues: https://github.com/kentcdodds/mdx-bundler/issues/169
Consider the following config structure:

```js
export default makeSource({
  contentDirPath: "content",
  documentTypes: [Posts, Things, People],
  mdx: {
    rehypePlugins: [rehypeSlug, rehypeAutolinkHeadings, searchMeta],
  },
});
```
Here, `searchMeta` looks at paragraph nodes of the hast tree, grabs a list of unique words, and adds them to the metadata as `searchMeta`.
A markdown file with the structure of:
```md
---
title: Hello World
slug: hello-world
---

Hello World! Please say Hello!
```
Would generate a final object of:
```json
{
  "title": "Hello World",
  "slug": "hello-world",
  "searchMeta": ["hello", "world", "please", "say"],
  "code": "().....",
  "_raw": "..."
}
```
For the sake of complete, if not ugly, code, here's a working example of the plugin that adds `searchMeta` to the `data` attribute of the vfile in the rehype plugin chain.
```js
import { visit } from "unist-util-visit";

export default function searchMeta() {
  return (tree, file) => {
    // Collect unique words across every paragraph node, rather than
    // overwriting the list on each visit.
    const words = new Set();

    visit(tree, { tagName: "p" }, (node) => {
      node.children.forEach((current) => {
        if (typeof current.value !== "string") return;

        current.value
          .split(" ")
          .filter((word) => !word.includes(":"))
          .map((word) => word.toLowerCase().replace(/[^a-z0-9]/gi, ""))
          .filter((word) => word.length > 3)
          .forEach((word) => words.add(word));
      });
    });

    file.data.searchMeta = [...words];
  };
}
```
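For reference, the plugin can be exercised outside of contentlayer with a plain unified pipeline. This is only a sketch to show where the data ends up; the `./search-meta.js` path is assumed, and the packages are the standard unified/remark/rehype ones:

```js
import { unified } from "unified";
import remarkParse from "remark-parse";
import remarkRehype from "remark-rehype";
import rehypeStringify from "rehype-stringify";
import searchMeta from "./search-meta.js";

const file = await unified()
  .use(remarkParse)
  .use(remarkRehype) // markdown (mdast) -> HTML tree (hast)
  .use(searchMeta) // runs on the hast tree and writes to file.data
  .use(rehypeStringify)
  .process("Hello World! Please say Hello!");

console.log(file.data.searchMeta); // ["hello", "world", "please"] with the >3-character filter
```

The words land on the vfile, which is exactly the data this issue asks contentlayer to surface on the generated document.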
Top GitHub Comments
The more I think about it, accessing vfile.data from computed fields would totally solve my use-case. It’d still be nice to be able to do all the work “in” the handler, but being able to do visit work during the initial parsing and then passing that along with the payload would be more than sufficient. What do you think a reasonable timeline on that would be?
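To make that concrete, here's a hypothetical sketch of what a vfile-aware computed field could look like. The extra `vfileData` argument to `resolve` does not exist in contentlayer's current API; it is purely illustrative of the proposal:

```js
import { defineDocumentType } from "contentlayer/source-files";

export const Post = defineDocumentType(() => ({
  name: "Post",
  filePathPattern: "**/*.mdx",
  fields: {
    title: { type: "string", required: true },
    slug: { type: "string", required: true },
  },
  computedFields: {
    searchMeta: {
      type: "json",
      // Hypothetical: "vfileData" would be the vfile.data populated by the
      // rehype plugin during the initial parse. Not part of contentlayer today.
      resolve: (doc, vfileData) => vfileData.searchMeta ?? [],
    },
  },
}));
```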
As a temporary workaround, one could consider defining a computed field that parses the raw output from contentlayer. Here's an example of extracting the table of contents of a markdown file and making it available as a `toc` property in contentlayer.
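The original snippet isn't reproduced above, but a minimal sketch of that kind of workaround might look like the following. The `Post` document type and the naive heading regex are assumptions for illustration (the regex will also match `#` lines inside fenced code blocks):

```js
import { defineDocumentType, makeSource } from "contentlayer/source-files";

export const Post = defineDocumentType(() => ({
  name: "Post",
  filePathPattern: "**/*.md",
  fields: {
    title: { type: "string", required: true },
    slug: { type: "string", required: true },
  },
  computedFields: {
    toc: {
      type: "json",
      // doc.body.raw is the unprocessed markdown source that contentlayer
      // exposes on every document; scan it for ATX-style headings.
      resolve: (doc) =>
        Array.from(doc.body.raw.matchAll(/^(#{1,6})\s+(.+)$/gm)).map(
          ([, hashes, text]) => ({ depth: hashes.length, text: text.trim() })
        ),
    },
  },
}));

export default makeSource({ contentDirPath: "content", documentTypes: [Post] });
```

A remark-based pass over `doc.body.raw` (walking heading nodes instead of using a regex) is a common refinement that avoids those false positives.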