Bug: concurrent building breaks plugins that rely on before-build / before-build-all
There is an inconsistency between editing a contents file directly and saving the same file in the admin UI, and also between the initial build / build-all and a subsequent file change in the admin UI. This inconsistency breaks plugins that depend on before-build or before-build-all.
Saving a file inside the admin UI triggers a build with update_source_info_first enabled:
  File "lektor/devserver.py", line 51, in run
    self.build(update_source_info_first=True)
During the source info update, the artifact for the saved page (and its subartifacts) is built and saved without emitting before-build or before-build-all.
Here are the steps to reproduce (the time.sleep calls simulate a lengthy update task):
import time

def on_before_build_all(self, builder, **extra):
    time.sleep(0.5)
    print('start')

def on_before_build(self, builder, build_state, source, prog, **extra):
    time.sleep(0.2)
    print('8', source)
And in lektor.builder.Builder.build(), after emit('before-build'), insert print(9, source).
The log for the initial build is in correct order:
Started build
start
8 <Page model='root' path='/'>
9 <Page model='root' path='/'>
U index.html
9 <Directory '/'>
...
However, if you update an existing page:
Started build
9 <Page model='project-entry' path='/projects/barss'>
U projects/barss/index.html
9 <File '/static/style.css'>
9 <File '/static/icons.svg'>
start
8 <Page model='root' path='/'>
9 <Page model='root' path='/'>
9 <Directory '/'>
...
As you can see, the source update triggers a build without emitting before-build.
I have a plugin that injects a record variable and does some text replacement on the record. However, the before-build-all callback is not executed because of this. Or, more precisely, the record is sporadically not updated properly because of race conditions. The time.sleep just makes it obvious that there is a problem.
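To make the suspected race easier to picture, here is a minimal stand-alone sketch. It does not use Lektor at all: two plain threads stand in for the full build and the single-source build, and the function names and log strings are hypothetical. It only assumes, as described above, that the single-source path never waits for the before-build-all hook.

```python
import threading
import time

log = []                      # shared event log, appended to by both threads
log_lock = threading.Lock()

def record(event):
    with log_lock:
        log.append(event)

def build_all():
    # Stand-in for the full build: the slow before-build-all hook runs first.
    time.sleep(0.2)           # simulates a lengthy on_before_build_all
    record("before-build-all")
    record("build-all: /projects/barss")

def build_single_source():
    # Stand-in for the single-source path: it builds without emitting any hook.
    record("single build: /projects/barss")

t_all = threading.Thread(target=build_all)
t_one = threading.Thread(target=build_single_source)
t_all.start()
t_one.start()
t_all.join()
t_one.join()

print(log)
```

In this toy model the page is always written before the hook fires, which matches the shape of the second log above. Without the sleep the order would depend on thread scheduling, which is what makes the real problem sporadic.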
Top GitHub Comments
I still don’t completely understand what it is you’re trying to do, but you might take a look at the approach used in lektor-index-pages which is the solution I came up with for generating keyword and date indexes for a blog.
Roughly, the indexing (i.e. processing of the children) works by creating virtual source objects to contain all the computed grouping state. When those virtual sources are instantiated, the children are iterated over, classified, and sorted. That state is stored in the virtual source instance. (Which is cached for the lifetime of the pad, so only has to be computed once per build cycle.)
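The compute-once-per-pad idea can be sketched in plain Python. This is not the lektor-index-pages implementation: the `Pad` class, the `groups_for` helper, and the `year` field are all hypothetical stand-ins, and the lock is an added assumption so that concurrent builds asking for the same grouping do not compute it twice.

```python
import threading
import weakref
from collections import defaultdict

class Pad:
    """Hypothetical stand-in for a Lektor pad; it lives for one build cycle."""
    def __init__(self, children):
        self.children = children

_cache = weakref.WeakKeyDictionary()   # pad -> computed groups, dropped with the pad
_cache_lock = threading.Lock()
compute_calls = 0                      # counts how often the grouping really runs

def groups_for(pad):
    """Return child paths grouped by year, computing them at most once per pad."""
    global compute_calls
    with _cache_lock:
        if pad not in _cache:
            compute_calls += 1
            groups = defaultdict(list)
            for child in pad.children:
                groups[child["year"]].append(child["path"])
            _cache[pad] = {year: sorted(paths) for year, paths in groups.items()}
        return _cache[pad]

pad = Pad([
    {"path": "/blog/b", "year": 2021},
    {"path": "/blog/a", "year": 2021},
    {"path": "/blog/c", "year": 2022},
])
first = groups_for(pad)
second = groups_for(pad)   # served from the cache, no recomputation
```

Because the state is computed when it is first requested, it does not matter which build thread asks first or whether any before-build hook has run yet.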
The way this addresses your concerns is:
The grouping is computed (essentially) on-demand, by fetching the appropriate virtual source from the Lektor db. E.g. to get a list of all blog keywords along with a count of the number of articles tagged with each one (untested code):
A Record should, I think, be thought of as a database record. It’s a view of what’s in the corresponding contents.lr file. As I said above, mutating a Record doesn’t feel right (unless you’re talking about modifying what’s in the contents.lr as well, but that probably will not be easy to do correctly in the middle of a build cycle).
I think a more appropriate place to integrate more global data (group data) is in the page template(s). If you need to perform some operations which are cumbersome to do in jinja, you may create custom jinja filters or global functions to help.
One can always access the data (fields) of the children. As long as the rest of the building is done by the jinja templates (perhaps using jinja macros or custom filters/functions) build order is not important.
E.g. here’s how to generate a list of all pages which reference each keyword.
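The commenter's own snippet is not reproduced here. As a rough stand-in, the following sketch shows the kind of helper that could be exposed to templates as a custom filter or global function. Plain dicts stand in for Lektor records, and the `keywords` and `path` field names are assumptions for illustration only.

```python
from collections import defaultdict

def pages_by_keyword(pages):
    """Map each keyword to the sorted paths of the pages that reference it."""
    index = defaultdict(list)
    for page in pages:
        # Pages without a keywords field simply contribute nothing.
        for keyword in page.get("keywords", []):
            index[keyword].append(page["path"])
    return {keyword: sorted(paths) for keyword, paths in sorted(index.items())}

pages = [
    {"path": "/blog/first", "keywords": ["lektor", "python"]},
    {"path": "/blog/second", "keywords": ["python"]},
    {"path": "/blog/third"},
]
keyword_index = pages_by_keyword(pages)
```

The helper only reads the children's fields and never mutates them, so the result is the same regardless of the order in which pages are built.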
Yes, with minor corrections.
Those two builds are initiated possibly in parallel, rather than in any particular sequence.
The individual source build is triggered not by the file save, but by the subsequent HTTP request for the primary artifact of the edited source. The build_all is not triggered directly by the file save, but is triggered when the file-system monitor notices that any of the project files has been updated.
The reason behind this (I think) is that:
When we say “build a source object” here, we mean “check all the dependencies (the recorded source files) for the primary artifact of the source object; if the artifact is out-of-date with respect to any of those dependencies, then (re)generate the artifact.” Unless the artifact is stale (or missing), this is (in theory) a relatively quick process.
Because of the dependency checking, when a page is edited, only one of those two build threads (the build_all and the source-specific build) — whichever one gets to it first (likely the source-specific build) — should[^1] actually regenerate the output artifact.
[^1]: I suspect there are edge cases when both threads will regenerate the artifact.
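The check-then-regenerate step can be illustrated with a simplified model based on file modification times. This is only a sketch: Lektor's own dependency bookkeeping is more involved, and `is_stale` and `build` are hypothetical helpers rather than Lektor functions.

```python
import os
import tempfile

def is_stale(artifact_path, dependency_paths):
    """True if the artifact is missing or older than any of its dependencies."""
    if not os.path.exists(artifact_path):
        return True
    artifact_mtime = os.stat(artifact_path).st_mtime_ns
    return any(os.stat(dep).st_mtime_ns > artifact_mtime for dep in dependency_paths)

def build(artifact_path, dependency_paths, render):
    """Regenerate the artifact only when stale; return whether it was rebuilt."""
    if not is_stale(artifact_path, dependency_paths):
        return False
    with open(artifact_path, "w", encoding="utf-8") as f:
        f.write(render())
    return True

with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "contents.lr")
    artifact = os.path.join(tmp, "index.html")
    with open(source, "w", encoding="utf-8") as f:
        f.write("title: Hello\n")
    os.utime(source, (1_000_000_000, 1_000_000_000))   # pin the source mtime in the past

    results = [build(artifact, [source], lambda: "<h1>Hello</h1>")]       # artifact missing
    results.append(build(artifact, [source], lambda: "<h1>Hello</h1>"))   # up to date

    newer = os.stat(artifact).st_mtime + 10
    os.utime(source, (newer, newer))                   # simulate an edit to the source
    results.append(build(artifact, [source], lambda: "<h1>Hello</h1>"))   # stale again
```

Note that the check and the write are not atomic here. Two threads could both see a stale artifact and both regenerate it, which is the kind of edge case the footnote alludes to.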
god damn it. Even with my mixed build processing it is not consistent 😕 getting out of options…