Optimise representation of deeply-nested StreamField blocks in migrations


Because the actual database backend of a StreamField is just a simple LONGTEXT populated by json-serialized block contents, the database does not need to know about changes made to the Block structure within the StreamField.
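To make the point concrete, here is a minimal sketch of the stored value, assuming the JSON is a list of entries that each carry a `type` and a `value`. The block names (`heading`, `paragraph`), the `help_text` change, and the `serialize` helper are hypothetical, and plain dicts stand in for real block definitions.

```python
import json

# Hypothetical stream content: a list of blocks, each tagged with its type.
stream_data = [
    {"type": "heading", "value": "Welcome"},
    {"type": "paragraph", "value": "<p>Hello world</p>"},
]

# Two versions of the (stand-in) block definitions; only help_text differs.
blocks_v1 = {"heading": {"help_text": ""}, "paragraph": {"help_text": ""}}
blocks_v2 = {"heading": {"help_text": "Keep it short"}, "paragraph": {"help_text": ""}}


def serialize(data, block_definitions):
    """Serialize stream content the way the text column would hold it.

    block_definitions is only used to check that each block type is known;
    none of its attributes end up in the stored JSON.
    """
    for entry in data:
        if entry["type"] not in block_definitions:
            raise ValueError(f"unknown block type: {entry['type']}")
    return json.dumps(data)


stored_v1 = serialize(stream_data, blocks_v1)
stored_v2 = serialize(stream_data, blocks_v2)
```

Under these assumptions, `stored_v1` and `stored_v2` are identical strings, so nothing at the database level changes when only a block attribute is edited.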

However, since Wagtail currently does provide that information to the migrations, this causes complex StreamField instances to generate gargantuan migrations whenever even the tiniest change is made to a single Block used by the field. Here’s an example. This one model’s migration code is nearly 500,000 characters.

And another half megabyte of text will be repeated in each subsequent migration if even a single field in a single Block has a single attribute changed. This is highly undesirable, and as far as I can tell, completely pointless. The auto-generated migration files don’t need to care about the internal structure of a StreamField; only manually written migrations that migrate the data from an old format to a new one need to care, and those don’t need to be half a megabyte of code.

I’m not yet entirely sure how best to remedy this, but I think it will have something to do with StreamField.deconstruct():

def deconstruct(self):
    name, path, _, kwargs = super(StreamField, self).deconstruct()
    block_types = self.stream_block.child_blocks.items()
    args = [block_types]
    return name, path, args, kwargs

I don’t think there’s any good reason for it to care about its child blocks. That said, although I have a lot of experience dealing with, and hacking around, StreamFields, I’m not an expert. Maybe I’m missing something?
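One way to picture the idea is a field whose deconstruct() leaves the child blocks out. The following is only a sketch using stand-in classes so that it runs without Django or Wagtail installed: `FakeField`, `VerboseStreamField`, and `QuietStreamField` are hypothetical names, and the block definitions are plain strings rather than real Block objects.

```python
class FakeField:
    """Hypothetical stand-in for a Django model field (illustration only)."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs
        self.name = None

    def deconstruct(self):
        # Mirrors the (name, path, args, kwargs) shape used by migrations.
        path = f"{type(self).__module__}.{type(self).__qualname__}"
        return self.name, path, [], dict(self.kwargs)


class VerboseStreamField(FakeField):
    """Imitates the behaviour quoted above: child blocks go into args."""

    def __init__(self, block_types, **kwargs):
        super().__init__(**kwargs)
        self.block_types = list(block_types)

    def deconstruct(self):
        name, path, _, kwargs = super().deconstruct()
        return name, path, [self.block_types], kwargs


class QuietStreamField(VerboseStreamField):
    """Assumed alternative: report an empty block list to migrations."""

    def deconstruct(self):
        name, path, _, kwargs = super().deconstruct()
        return name, path, [[]], kwargs


# The same field before and after a tiny change to one block.
old_blocks = [("heading", "CharBlock()")]
new_blocks = [("heading", "CharBlock(help_text='Keep it short')")]

verbose_changed = (
    VerboseStreamField(old_blocks, blank=True).deconstruct()
    != VerboseStreamField(new_blocks, blank=True).deconstruct()
)
quiet_changed = (
    QuietStreamField(old_blocks, blank=True).deconstruct()
    != QuietStreamField(new_blocks, blank=True).deconstruct()
)
```

Here `verbose_changed` is true and `quiet_changed` is false, which is the property that would stop a new migration from being generated for every block tweak. The trade-off is that the migration history would no longer record the block structure at each point in time.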

Issue Analytics

  • State: open
  • Created 6 years ago
  • Comments: 19 (14 by maintainers)

Top GitHub Comments

ababic commented, Jun 24, 2020 (5 reactions)

This was discussed in the core team meeting on 25/06/20.

In general, it was agreed that including the streamfield block definitions in migrations was the correct default behaviour. This is because at any point in a model’s migration history, you should be able to access both the structured content/data AND the full streamblock definition, in case they are both needed for use in a data migration. As unlikely as this may seem, until there are more established solutions for migrating streamfield content, we feel it’s important to preserve this as an option.

However, we also recognise the issues that many (even core team members themselves) have reported in relation to this, and would like to offer a way for developers to opt out of this behaviour (including streamfield definitions in migrations) completely on a per-project basis, provided they are happy to accept the consequences (as outlined above).

EDIT: If there are other ways to optimise the representation (as outlined above), then they are still well worth exploring, as simpler migrations by default would be an obvious win.

chosak commented, Feb 20, 2018 (5 reactions)

@gasman following up on this:

“your example code shows a level of configurability that goes way beyond anything I anticipated, or would recommend”

This problem (large migrations due to highly configurable StreamFields) is something that we’ve run into as well, and is a major pain point around maintenance of our large Wagtail site.

I’m curious if you have any guidance on how or where to draw the line between having many Page models that differ slightly versus having fewer Pages but making them more configurable. Consider a use case where you have a few core page types that share certain common design elements (headers, footers, sidebars), but their main content may differ in numerous subtle ways. Now compare these few pages:

Each of these pages has the same basic structure but the main content area is very different (different images, callouts, links, expandables). In our case we accomplish this through the use of at least one highly configurable StreamField, which can contain various blocks (which may be StructBlocks and hold, e.g. lists of other blocks).

This usage results in similar migration issues to that reported by @coredumperror. We’ve discussed various other ways to set this up, but it’s not clear how best to do this with a “many page types” approach. You’d need something like PageWithImageAndFormAndLinks and then PageWithImageAndLinksButNoForm - the combinations would quickly get out of hand. Then you’d also have a more difficult editor experience where a page creator would have to choose from a large list of page types that differ in subtle ways. The inability to convert a page from one type to another also makes using page types less attractive than configurable StreamFields.
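A quick sketch of how fast the “many page types” approach grows, assuming five optional components that each page type either includes or omits. The component list and the generated class names are hypothetical and only follow the naming pattern used above.

```python
from itertools import combinations

# Hypothetical optional components of the main content area.
components = ["Image", "Form", "Links", "Callouts", "Expandables"]


def page_type_names(parts):
    """Return one page-type name per subset of the optional components."""
    names = []
    for size in range(len(parts) + 1):
        for combo in combinations(parts, size):
            # The empty subset gets a plain name; others list their parts.
            names.append("PageWith" + "And".join(combo) if combo else "PlainPage")
    return names


names = page_type_names(components)
page_type_count = len(names)
```

With five independent components this yields 2**5 = 32 page types, and each additional component doubles the count, before even considering ordering or repeated elements.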

Any insight/suggestion in how best to set up pages like this in “the Wagtail way”?

