Improvements to the {{excerpt}} helper
It’s unquestionable, given the vast number of issues, PRs, forum posts, support requests and other mentions of our {{excerpt}} helper, that it leaves a lot to be desired.
Yet, as demonstrated by the wide range of different ideas on how it should be improved, it’s hard to find consensus on what ‘better’ actually looks like. Having looked through the issues, discussions, PRs and what themes are currently using, the concerns fall into two broad categories: the first is improving the excerpt that Ghost generates from the content, and the second is adding features for custom excerpts.
Custom excerpts are a niche requirement, and we want to focus our efforts on making apps a possibility, so that you can add an excerpt field, a subtitle field, a standfirst field or whatever custom field suits your use case, rather than adding these things to core. Therefore this issue exists purely to address the former: improving the generated excerpts.
There are two key things which make the current {{excerpt}} helper’s output quite undesirable:
- It cuts off mid-sentence (ugly)
- It strips all formatting (confusing)
The formatting problem has meant many themes and Ghost users are using {{content}} instead of the {{excerpt}} helper. This is not ideal, as it outputs images and other media that don’t really make sense in an excerpt, and it creates the need for some sort of appending feature to make it possible to display read more links. It also still doesn’t solve the cut off problem.
To solve the cut off problem, we want to introduce {{excerpt paragraphs="X"}}, which will return the first X text paragraphs from the content.
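As a rough illustration of the paragraph-based selection, here is a minimal sketch. It assumes the content is already an HTML string, and `firstParagraphs` is a hypothetical name rather than anything in Ghost. It relies on a naive regular expression, so it is not robust against malformed or nested markup; a real implementation would more likely use a proper HTML-aware library.

```javascript
// Naive sketch: return the first `count` non-empty <p> elements from an HTML string.
// `firstParagraphs` is a hypothetical helper name, not part of Ghost.
function firstParagraphs(html, count) {
    // Match each <p ...>...</p> block non-greedily, across newlines.
    const paragraphPattern = /<p\b[^>]*>[\s\S]*?<\/p>/gi;
    const paragraphs = html.match(paragraphPattern) || [];

    return paragraphs
        // Skip paragraphs that have no visible text once tags are removed.
        .filter((p) => p.replace(/<[^>]*>/g, '').replace(/&nbsp;/g, ' ').trim() !== '')
        .slice(0, count)
        .join('\n');
}

const html = '<p>First.</p><img src="a.png"><p> </p><p>Second <em>one</em>.</p><p>Third.</p>';
console.log(firstParagraphs(html, 2));
// <p>First.</p>
// <p>Second <em>one</em>.</p>
```

Because whole paragraphs are returned, the output never stops mid-sentence, and inline formatting inside each paragraph is left untouched.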
To solve the formatting problem, it makes sense to change the helper so that it leaves valuable links and text formatting in place, rather than stripping all HTML.
The combination of these two things working in tandem should lead to a better excerpt. The media, script and all other non-formatting tags should be stripped from the content first, and then from the remaining text content we can return as many paragraphs as the {{excerpt}} helper requires.
Moving forward, I think the {{excerpt}} helper’s default should also be changed from words="50" to paragraphs="1".
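One way to picture the proposed default is the sketch below. The function name, the shape of the returned object and the rule that paragraphs takes precedence when both options are passed are all assumptions made for illustration; the issue itself only specifies the change of default.

```javascript
// Hypothetical option resolution for the helper; names and precedence are assumptions.
function resolveExcerptOptions(hash = {}) {
    // Accept only positive integers; anything else counts as "not provided".
    const toPositiveInt = (value) => {
        const n = parseInt(value, 10);
        return Number.isInteger(n) && n > 0 ? n : null;
    };

    const paragraphs = toPositiveInt(hash.paragraphs);
    const words = toPositiveInt(hash.words);

    // Assumption: an explicit paragraphs option wins over words.
    if (paragraphs !== null) return {unit: 'paragraphs', limit: paragraphs};
    if (words !== null) return {unit: 'words', limit: words};

    // Proposed new default, replacing words="50".
    return {unit: 'paragraphs', limit: 1};
}

console.log(resolveExcerptOptions());              // { unit: 'paragraphs', limit: 1 }
console.log(resolveExcerptOptions({words: '30'})); // { unit: 'words', limit: 30 }
```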
When processing HTML in this way, it’s important to do it in such a way that bad HTML doesn’t trip up the code. At the moment we make heavy use of the very smart downsize library for truncating HTML; however, the excerpt helper does its own brute force stripping of HTML. Therefore this is likely to need a bit of a rethink.
The following is a list of elements that will be permitted in excerpts:
a, abbr, b, bdi, bdo, blockquote, br, cite, code, data, dd, del, dfn, dl, dt, em, i, ins, kbd, li, mark, ol, p, pre, q, rp, rt, rtc, ruby, s, samp, small, span, strong, sub, sup, time, u, ul, var, wbr
This list was generated from https://developer.mozilla.org/en/docs/Web/HTML/Element, and includes all block and inline text formatting elements. Once all other elements are removed, the first X paragraphs should then be returned, not including any empty paragraphs.
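The allowlist step could look roughly like the sketch below. It uses the element list above verbatim, but everything else is an assumption: `stripDisallowedTags` is a hypothetical name, and the choice to remove script, style and embedded media together with their contents (while merely unwrapping other disallowed elements such as div) is one possible reading of "stripped". The regular expressions are deliberately simple and will misbehave on malformed HTML or attributes containing ">", which is exactly the kind of fragility a parser-based approach would avoid.

```javascript
// Elements permitted in excerpts, copied from the list above.
const ALLOWED_TAGS = new Set([
    'a', 'abbr', 'b', 'bdi', 'bdo', 'blockquote', 'br', 'cite', 'code', 'data',
    'dd', 'del', 'dfn', 'dl', 'dt', 'em', 'i', 'ins', 'kbd', 'li', 'mark', 'ol',
    'p', 'pre', 'q', 'rp', 'rt', 'rtc', 'ruby', 's', 'samp', 'small', 'span',
    'strong', 'sub', 'sup', 'time', 'u', 'ul', 'var', 'wbr'
]);

// Assumption: these elements are dropped together with everything inside them.
const DROP_WITH_CONTENT = ['script', 'style', 'iframe', 'video', 'audio', 'object'];

// Hypothetical helper, not part of Ghost or downsize.
function stripDisallowedTags(html) {
    // Step 1: remove elements whose contents should never leak into the excerpt.
    const dropPattern = new RegExp(
        `<(${DROP_WITH_CONTENT.join('|')})\\b[^>]*>[\\s\\S]*?<\\/\\1\\s*>`, 'gi'
    );
    const withoutDropped = html.replace(dropPattern, '');

    // Step 2: keep allowlisted tags as they are, remove any other tag but keep its text.
    return withoutDropped.replace(/<\/?([a-z][a-z0-9-]*)[^>]*>/gi, (tag, name) =>
        ALLOWED_TAGS.has(name.toLowerCase()) ? tag : '');
}

const input = '<div><p>Hello <strong>world</strong><img src="x.png"></p><script>alert(1)</script></div>';
console.log(stripDisallowedTags(input));
// <p>Hello <strong>world</strong></p>
```

Note that this sketch keeps the attributes of allowed elements unchanged; whether attributes also need filtering is a separate question that the list above does not address.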
In the long term, the excerpt tag allowlist will become extensible via a filter, so that extensions to the editor can also declare additional elements that should appear in excerpts (I’m thinking of things like MathML here).
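To make the extensibility idea concrete, here is a small sketch of a filter-based allowlist. The registry functions and the filter name `excerptAllowedTags` are invented for this example and do not reflect Ghost's actual filter API; the base list is shortened for brevity. The MathML element names (`math`, `mi`, `mo`, `mn`) are real, but which ones an extension would register is an assumption.

```javascript
// Hypothetical, minimal filter registry (not Ghost's real filter system).
const registeredFilters = {};

function registerFilter(name, fn) {
    (registeredFilters[name] = registeredFilters[name] || []).push(fn);
}

function applyFilters(name, value) {
    // Each filter receives the previous filter's result and returns a new value.
    return (registeredFilters[name] || []).reduce((acc, fn) => fn(acc), value);
}

// Shortened base allowlist, for illustration only.
const BASE_ALLOWED_TAGS = ['a', 'em', 'p', 'strong'];

function getExcerptAllowedTags() {
    // Pass a copy so filters cannot mutate the base list.
    return new Set(applyFilters('excerptAllowedTags', BASE_ALLOWED_TAGS.slice()));
}

// An editor extension that adds math support declares the extra elements it needs.
registerFilter('excerptAllowedTags', (tags) => tags.concat(['math', 'mi', 'mo', 'mn']));

console.log(getExcerptAllowedTags().has('math')); // true
console.log(getExcerptAllowedTags().has('img'));  // false
```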
In the short term, the next step here is to review the downsize library and determine whether these features can be added or whether this needs a bit of a re-think.
Issue Analytics
- Created 9 years ago
- Comments: 17 (9 by maintainers)
Top GitHub Comments
Hi there,
I got an idea but it would require some extra work and probably a lot of tests:
- […] <summary> HTML tag
- […] (<style>, <script>, etc.)
- […] <summary> as it
Making this would remove any magic brought by the theme implementation, because when users write a post they know exactly what they put in their <summary>, and they are aware of which tags get stripped (from either the Ghost docs or the MD help modal).
Stripping at the first, second or whichever paragraph sounds good, but let’s be honest here: it’s very error-prone and will make life hell for the code maintainers as much as for users who get unexpected results.
Closing this issue in favour of the new custom excerpt #8793. We can revisit improving the automatic excerpt some other time if there is more demand for it.