Segfault on WeasyPrint 52.4 in get_first_line
Hi!
We’re seeing an intermittent segfault when trying to write a PDF from HTML. I haven’t been able to repro it locally; I’ve only seen it on live code in the cloud. The code in question calls WeasyPrint like this:
from weasyprint import CSS, HTML
...
html = HTML(
    string=render_html(product, template_name, payload),
    base_url=STATIC_DIR,
)
css = CSS(filename=css_filename)
pdf = html.write_pdf(stylesheets=[css])
Running Python with faulthandler.enable() traces the problem to this line:
Fatal Python error: Segmentation fault
(other threads omitted)
Current thread 0x00007f6af0959b48 (most recent call first):
  File "/opt/app/.venv/lib/python3.9/site-packages/weasyprint/text.py", line 736 in get_first_line
  File "/opt/app/.venv/lib/python3.9/site-packages/weasyprint/text.py", line 1271 in show_first_line
  File "/opt/app/.venv/lib/python3.9/site-packages/weasyprint/draw.py", line 1059 in draw_text
  File "/opt/app/.venv/lib/python3.9/site-packages/weasyprint/draw.py", line 1031 in draw_inline_level
  File "/opt/app/.venv/lib/python3.9/site-packages/weasyprint/draw.py", line 269 in draw_stacking_context
  File "/opt/app/.venv/lib/python3.9/site-packages/weasyprint/draw.py", line 279 in draw_stacking_context
  File "/opt/app/.venv/lib/python3.9/site-packages/weasyprint/draw.py", line 275 in draw_stacking_context
  File "/opt/app/.venv/lib/python3.9/site-packages/weasyprint/draw.py", line 161 in draw_page
  File "/opt/app/.venv/lib/python3.9/site-packages/weasyprint/document.py", line 282 in paint
  File "/opt/app/.venv/lib/python3.9/site-packages/weasyprint/document.py", line 654 in write_pdf
  File "/opt/app/.venv/lib/python3.9/site-packages/weasyprint/__init__.py", line 222 in write_pdf
(continues from here within application code)
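For anyone who wants to capture a similar trace, here is a minimal sketch of turning faulthandler on at startup. The helper name enable_crash_traces and the optional log path are hypothetical and not taken from the reporter's application; only the faulthandler calls themselves are standard library.

```python
import faulthandler


def enable_crash_traces(log_path=None):
    """Enable faulthandler so a segfault dumps the Python stack of every thread.

    If log_path is given (hypothetical parameter), the dump goes to that file
    and the open handle is returned; it has to stay open for as long as
    faulthandler is enabled. Otherwise the dump goes to sys.stderr.
    """
    if log_path is not None:
        handle = open(log_path, "w")
        faulthandler.enable(file=handle, all_threads=True)
        return handle
    faulthandler.enable(all_threads=True)
    return None
```

Calling enable_crash_traces() once, before the first call to write_pdf, should be enough. The same effect is available without code changes by setting the PYTHONFAULTHANDLER=1 environment variable or starting the interpreter with python -X faulthandler.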
Additional info:
- Alpine Linux, specifically, dockerized using FROM python:3.9-alpine
- Pango version pango-dev-1.48.5-r0, installed via apk
- We occasionally see errors from Pango in our logs, but as far as I can tell they’re isolated from this. By that I mean we’ll see a log entry for the Pango error, and then more than an hour later the segfault happens. We also see the segfault both with and without the Pango error showing up in the logs.
Do you have any guidance on this? I took a quick look at the WeasyPrint source, but python CFFI is beyond my powers of speculation and/or debugging.
Thanks a lot for this information.
Then we can assume that this bug is caused by the way Alpine compiles Pango. Because Alpine uses musl libc, it may expose problems that were not visible with glibc. Here, I suppose that there’s a race condition that unreferences Fontconfig patterns twice.
We could report this issue. But without a sample in C showing the problem, I suppose that it’s almost impossible to debug.
gdb is a kind of nightmare for me too 😉.
I’m closing the issue, as there’s nothing more we can easily do. If anyone runs into the same problem and has more time and experience to debug it, feel free to reopen; I’m sure that the Pango team would be happy to fix the bug.
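If the race-condition hypothesis above is right, one possible mitigation for anyone stuck on Alpine would be to let only one thread render at a time. The sketch below is untested against this crash and is not something the maintainers recommended; the names serialized and render_pdf are hypothetical, and the WeasyPrint calls simply mirror the snippet from the original report.

```python
import functools
import threading

# Assumption: one process-wide lock is enough, because the suspected race
# is between threads of the same process.
_render_lock = threading.Lock()


def serialized(func):
    """Wrap func so that only one thread at a time can execute it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with _render_lock:
            return func(*args, **kwargs)
    return wrapper


@serialized
def render_pdf(html_string, base_url, css_filename):
    """Render a PDF while holding the lock (hypothetical helper)."""
    # Imported lazily so this module can be loaded without WeasyPrint installed.
    from weasyprint import CSS, HTML

    html = HTML(string=html_string, base_url=base_url)
    return html.write_pdf(stylesheets=[CSS(filename=css_filename)])
```

This trades throughput for safety: concurrent requests queue up behind the lock, so it only makes sense if PDF generation is a small share of the traffic.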
I have some hopefully-helpful updates! First off, we are indeed using WeasyPrint in a threaded context, i.e., behind a WSGI server. But much more importantly, I was able to reliably reproduce the crash locally, using a script that simply hammers away identical canned requests at the endpoint we were seeing the crash in until the crash happens. Sometimes it takes a few seconds, sometimes it takes several minutes, but it very reliably reproduces the crash. And that allowed me to do a little testing.
I think your hunch re: Alpine was right on the money, because switching the Docker image to Debian solved the problem for us. That’s not much help to anyone who’s stuck on Alpine, but for us, it’s a perfectly valid solution. I’ll leave this issue open in case you want to investigate further, but as far as I’m concerned you’re welcome to close it!
Despite having a local repro, I was unfortunately not able to get a traceback using gdb. I tried valiantly, but I simply couldn’t get gdb to run within Alpine + Poetry + debug symbols. Maybe it’s just because of my relative inexperience with gdb, but at a certain point I just didn’t have any more time I could justify devoting to it. Particularly because we’re perfectly happy to just change our distro to Debian and be done with it 😃
Also, I think the Pango errors we see might actually be related. Every single local crash I’ve been able to repro has had a Pango failure first, and then a while later, the segfault happens. I think in our production logs it just seems like it’s not always related, because our production traffic is low enough that it gets drowned out by the rest of the logs. For example, here’s one I got locally just now:
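The reproduction script itself is not shown in this thread. A minimal sketch of the same idea, assuming a plain HTTP POST endpoint, could look like the following; the function names, the JSON content type, the worker count, and the request cap are all hypothetical. It treats a dropped or refused connection as the sign that the server process has died.

```python
import concurrent.futures
import http.client
import urllib.error
import urllib.request


def send_once(url, body, timeout=30.0):
    """POST one canned request; return the HTTP status, or None if the
    connection failed (for example because the server process crashed)."""
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, just with an error status.
        return exc.code
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return None


def hammer(url, body, workers=8, max_requests=10_000):
    """Send identical requests from several threads until a connection
    fails or max_requests is reached.

    Returns (requests_sent, connection_failed).
    """
    sent = 0
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        while sent < max_requests:
            batch = min(workers, max_requests - sent)
            statuses = list(pool.map(lambda _: send_once(url, body), range(batch)))
            sent += batch
            if any(status is None for status in statuses):
                return sent, True
    return sent, False
```

A call such as hammer("http://localhost:8000/pdf", b"{}") would then run until the endpoint stops answering; both the URL and the payload here are placeholders.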