Migrating a Block within a StreamField
Wagtail includes a StreamField for freeform page content. This content is stored in blocks, which are JSON-serialized and stored in the database. However, it’s not clear how to migrate a block within a StreamField. If you change a block and then migrate, the migration simply replaces the old block type with the new block type. This is understandable, because Wagtail doesn’t know how to map from instances of the old block to instances of the new block.
I’ve taken the following approach to solving this problem, but would appreciate guidance on how it can be improved. In particular, I’d like to know whether it’s possible to instantiate and serialize a block directly, rather than using the mapping functions I’ve defined below. The operation of StreamValue is also opaque, so I’d appreciate some guidance on this class.
I posted my original question to the Wagtail Developers group.
Consider V1 of the CountryRiskReportPage model:
class CountryRiskReportPage(Page):
    body = StreamField([
        ('heading', CharBlock()),
        ('paragraph', RichTextBlock()),
        ('focusbox', RichTextBlock()),
    ])
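For reference, here is a sketch of how a V1 body might look once serialized. The sample content is invented for illustration; the shape (a list of dicts with `type` and `value` keys) is the one the mapping functions further down rely on.

```python
import json

# Hypothetical V1 body content; the text is made up for illustration.
v1_stream_data = [
    {'type': 'heading', 'value': 'Country Risk Report'},
    {'type': 'paragraph', 'value': '<p>Overview of the risks.</p>'},
    {'type': 'focusbox', 'value': '<p>Key point to remember.</p>'},
]

# Serialize the list of blocks to a JSON string.
raw_json = json.dumps(v1_stream_data)

# Decoding the string gives back the same list of blocks.
decoded = json.loads(raw_json)
```

Note that the focusbox entry holds a plain rich text string at this stage, just like a paragraph.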
Now consider V2 of the same model:
# StructBlock, because the serialized value used below is a dict
# with 'heading' and 'body' keys.
class FocusBoxBlock(StructBlock):
    heading = CharBlock()
    body = StreamBlock([
        ('paragraph', RichTextBlock()),
    ])

class CountryRiskReportPage(Page):
    body = StreamField([
        ('heading', CharBlock()),
        ('paragraph', RichTextBlock()),
        ('focusbox', FocusBoxBlock()),
    ])
Notice that in V2, we change focusbox from a RichTextBlock to a FocusBoxBlock. How should we migrate this change?
We seem to need a data migration, not a schema migration. The schema hasn’t changed because the field hasn’t changed: body was a StreamField before and it will be a StreamField after. Consequently, we need two functions that define how to map:
- a serialized rich text block to a serialized focus box block, which will be used by the forwards migration;
- a serialized focus box block to a serialized rich text block, which will be used by the backwards migration.
def richtextblock_to_focusboxblock(block):
    return {
        'type': 'focusbox',
        'value': {
            'heading': 'Focus Box',
            'body': [{'type': 'paragraph', 'value': block['value']}]
        }
    }

def focusboxblock_to_richtextblock(block):
    heading = '<h1>' + block['value']['heading'] + '</h1>'
    body = ''.join([subblock['value'] for subblock in block['value']['body']])
    return {
        'type': 'focusbox',
        'value': heading + body
    }
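To see what the two mapping functions do to actual data, here is a small round trip on a sample block. The sample block is invented for illustration; the two functions are repeated so the snippet runs on its own.

```python
def richtextblock_to_focusboxblock(block):
    return {
        'type': 'focusbox',
        'value': {
            'heading': 'Focus Box',
            'body': [{'type': 'paragraph', 'value': block['value']}],
        },
    }


def focusboxblock_to_richtextblock(block):
    heading = '<h1>' + block['value']['heading'] + '</h1>'
    body = ''.join(subblock['value'] for subblock in block['value']['body'])
    return {'type': 'focusbox', 'value': heading + body}


# Hypothetical V1 focusbox block, invented for illustration.
old_block = {'type': 'focusbox', 'value': '<p>Key point.</p>'}

# Forwards: rich text becomes a heading plus a one-paragraph body.
new_block = richtextblock_to_focusboxblock(old_block)

# Backwards: the heading is flattened into the rich text as an <h1>.
restored = focusboxblock_to_richtextblock(new_block)
print(restored['value'])  # <h1>Focus Box</h1><p>Key point.</p>
```

The round trip is not lossless: after going forwards and then backwards, the rich text gains an `<h1>Focus Box</h1>` prefix that the original block did not have.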
We use the mapping functions when we iterate over a page’s serialized blocks. When we encounter a focusbox, we use the appropriate function to map from one block to the other. To save us from writing the same code for both the forwards and backwards migrations, we write a function that accepts a page and a mapping function. This returns a list of serialized blocks and a boolean that indicates whether it encountered a focusbox.
def get_stream_data(page, mapper):
    stream_data = []
    mapped = False
    for block in page.body.stream_data:
        if block['type'] == 'focusbox':
            # The mapper may run in either direction, so the result
            # is not necessarily a focus box block.
            mapped_block = mapper(block)
            stream_data.append(mapped_block)
            mapped = True
        else:
            stream_data.append(block)
    return stream_data, mapped
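The function only needs `page.body.stream_data`, so it can be tried outside a migration with a stand-in object. In this sketch, `fake_page` and `upper_mapper` are hypothetical names made up for the demonstration; neither comes from Wagtail.

```python
from types import SimpleNamespace


def get_stream_data(page, mapper):
    stream_data = []
    mapped = False
    for block in page.body.stream_data:
        if block['type'] == 'focusbox':
            stream_data.append(mapper(block))
            mapped = True
        else:
            stream_data.append(block)
    return stream_data, mapped


def upper_mapper(block):
    # Hypothetical mapper used only for this demonstration.
    return {'type': 'focusbox', 'value': block['value'].upper()}


# Stand-in for a page: only the body.stream_data attribute is needed here.
fake_page = SimpleNamespace(body=SimpleNamespace(stream_data=[
    {'type': 'heading', 'value': 'Title'},
    {'type': 'focusbox', 'value': 'note'},
]))

stream_data, mapped = get_stream_data(fake_page, upper_mapper)
```

Only the focusbox entry passes through the mapper; other blocks are copied over unchanged, and `mapped` stays `False` for pages with no focusbox at all.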
We will use this list to create a new StreamValue to replace CountryRiskReportPage.body, which is also a StreamValue. We will use this boolean to determine whether or not to save the CountryRiskReportPage.
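# StreamValue must be imported from Wagtail's blocks module;
# the import path depends on the Wagtail version in use.
def migrate(apps, mapper):
    CountryRiskReportPage = apps.get_model('products', 'CountryRiskReportPage')
    for page in CountryRiskReportPage.objects.all():
        stream_data, mapped = get_stream_data(page, mapper)
        # Only touch pages that actually contained a focusbox.
        if mapped:
            stream_block = page.body.stream_block
            page.body = StreamValue(stream_block, stream_data, is_lazy=True)
            page.save()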
All that remains is to define the forwards and backwards migration, as well as the Migration class.
# This import belongs at the top of the migration file.
from django.db import migrations

def forwards(apps, schema_editor):
    migrate(apps, richtextblock_to_focusboxblock)

def backwards(apps, schema_editor):
    migrate(apps, focusboxblock_to_richtextblock)

class Migration(migrations.Migration):
    dependencies = [
        ...
    ]
    operations = [
        migrations.RunPython(forwards, backwards),
    ]
I’ve tested this approach and it works. Nevertheless, I’d appreciate guidance on how it can be improved.
Thanks!
Issue Analytics
- Created 8 years ago
- Reactions:13
- Comments:15 (6 by maintainers)
Top GitHub Comments
Okay, so the idea in my last comment got me going down the right track. After reading through the source again, it appears that you can save the JSON directly using the raw_text kwarg. So the line in your function above that reads
    page.body = StreamValue(stream_block, stream_data, is_lazy=True)
can work independently of the block’s migration state by using raw_text and supplying the JSON string directly.
This migration now works forwards and backwards for me.
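A minimal sketch of the raw_text idea, assuming StreamValue accepts a `raw_text` keyword as described in the comment above. The runnable part only builds the JSON string; the StreamValue call is shown as a comment because it needs a real migration context, and passing an empty list as `stream_data` is an assumption. `build_raw_text` is a hypothetical helper name.

```python
import json


def build_raw_text(stream_data):
    """Serialize a list of block dicts to the JSON string passed as raw_text."""
    return json.dumps(stream_data)


# Hypothetical already-mapped data, invented for illustration.
stream_data = [
    {'type': 'focusbox', 'value': {
        'heading': 'Focus Box',
        'body': [{'type': 'paragraph', 'value': '<p>Key point.</p>'}],
    }},
]
raw_text = build_raw_text(stream_data)

# Inside the migration, the assignment would then read roughly as follows
# (untested sketch; the empty list for stream_data is an assumption):
#
#     page.body = StreamValue(stream_block, [], is_lazy=True, raw_text=raw_text)
```

The appeal of this variant is that the JSON string is stored as given, so the migration does not depend on which version of the block definition is loaded when it runs.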
Yes, that’s correct. It’s highly unlikely that we’ll ever change the schema, except for adding new (optional) properties to the dictionary, to be handled by the StreamField. (In particular, we’re considering adding an ‘id’ property, to assist with tracking changes between revisions.)
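Since extra keys such as `id` may appear in a serialized block, a mapping function could copy the whole block and replace only `value`, rather than building a fresh dict. A minimal sketch of that idea; `remap_value` is a hypothetical helper, not something from Wagtail or from the original post, and the sample `id` is made up.

```python
def remap_value(block, new_value):
    """Return a copy of block with a new value, keeping every other key."""
    mapped = dict(block)
    mapped['value'] = new_value
    return mapped


# Hypothetical block carrying an extra 'id' key.
block = {'type': 'focusbox', 'value': '<p>Key point.</p>', 'id': 'abc123'}

new_value = {
    'heading': 'Focus Box',
    'body': [{'type': 'paragraph', 'value': block['value']}],
}
mapped = remap_value(block, new_value)
```

The original block is left unmodified, and any key the mapper does not know about is carried across to the result.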