Issue with damaged XLSX created with cell content not truncated to 32767 chars
Hi, we are using XlsxWriter in ScanCode.io to craft XLSX outputs. See below for links.
Sometimes the collected text contains CRLF line endings and is longer than the maximum length of a cell (32767 characters). In that case the string is not truncated by XlsxWriter, and MS Excel on Windows reports the workbook as damaged. If I replace CRLF with LF, the XlsxWriter truncation takes place “as usual”.
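Until this is fixed, one possible workaround on the caller side is to normalize and truncate the text before handing it to XlsxWriter. The following is only a minimal sketch: `sanitize_cell_text` and `EXCEL_MAX_CELL_CHARS` are hypothetical names introduced here and are not part of XlsxWriter, and the sketch assumes that replacing CRLF with LF (as described above) is acceptable for the data.

```python
# Hypothetical helper, not part of XlsxWriter.
EXCEL_MAX_CELL_CHARS = 32767  # cell length limit quoted in this report


def sanitize_cell_text(text, max_chars=EXCEL_MAX_CELL_CHARS):
    """Replace CRLF with LF, then truncate the result to ``max_chars``."""
    normalized = text.replace("\r\n", "\n")
    return normalized[:max_chars]


long_text = "a\r\n" * 32 * 1024
safe_text = sanitize_cell_text(long_text)
print(len(long_text), len(safe_text))  # 98304 32767
```

The sanitized string could then be passed to `worksheet.write_row()` in place of the raw text.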
I am using Python 3.6 or 3.9 on Linux with the latest XlsxWriter version, 1.4.3:
$ python --version
Python 3.9.0
$ python -c 'import xlsxwriter; print(xlsxwriter.__version__)'
1.4.3
Here is some code that demonstrates the problem:
import shutil
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

import xlsxwriter


def test_workbook_with_long_text():
    """
    Create a workbook with a worksheet with a cell containing ``long_text``,
    then extract the workbook and read the stored string back to compute
    its length.
    """
    test_dir = Path(tempfile.mkdtemp())
    print("temp test_dir:", test_dir)

    # 98304 characters with CRLF line endings, well over the 32767 limit.
    long_text = "a\r\n" * 32 * 1024

    output_file = test_dir / "foobar.xlsx"
    with xlsxwriter.Workbook(str(output_file)) as workbook:
        worksheet = workbook.add_worksheet("baz")
        worksheet.write_row(row=0, col=0, data=[long_text])

    # An XLSX file is a zip archive: unpack it to inspect its XML parts.
    extract_dir = test_dir / "extracted"
    shutil.unpack_archive(
        filename=output_file,
        extract_dir=extract_dir,
        format="zip",
    )

    # This XML doc contains the strings stored in cells and has this shape:
    # <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    # <sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
    #     count="2" uniqueCount="2">
    #   <si><t>foo</t></si>
    #   <si><t>f0123456789</t></si>
    # </sst>
    shared_strings = extract_dir / "xl" / "sharedStrings.xml"
    print("XLSX shared_strings file:", shared_strings)
    sstet = ET.parse(str(shared_strings))

    # The text we care about is in the last element of the XML.
    texts = list(e.text for e in sstet.getroot().iter())
    print("length text", len(texts[-1]))


if __name__ == "__main__":
    test_workbook_with_long_text()
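The same measurement can also be made without unpacking the archive to disk. The sketch below is an alternative, not the approach used in the report: `shared_string_lengths` is a hypothetical helper name, and the in-memory archive only mimics the `sharedStrings.xml` shape shown in the comment above so that the example runs without XlsxWriter installed.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespace of the shared strings part, as shown in the XML sample above.
SST_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def shared_string_lengths(xlsx_file):
    """Return the text length of every <si> entry in xl/sharedStrings.xml.

    ``xlsx_file`` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(xlsx_file) as archive:
        with archive.open("xl/sharedStrings.xml") as handle:
            root = ET.parse(handle).getroot()
    return [len("".join(si.itertext())) for si in root.iter(SST_NS + "si")]


# Build a tiny stand-in archive that contains only the shared strings part.
sample_xml = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"'
    ' count="2" uniqueCount="2">'
    "<si><t>foo</t></si><si><t>f0123456789</t></si>"
    "</sst>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("xl/sharedStrings.xml", sample_xml)
buffer.seek(0)

lengths = shared_string_lengths(buffer)
print(lengths)  # [3, 11]
```

Pointing the helper at the `foobar.xlsx` file produced by the test above would report the length of the string as it is stored in the workbook.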
Top GitHub Comments
Thanks for the detailed report.
Yes, you are absolutely right, the string does contain emojis. Thank you for the priceless info!