BUG: "with pd.ExcelWriter" produces a corrupt Excel file in case of .xlsm extension
See original GitHub issue-
I have checked that this issue has not already been reported.
-
I have confirmed this bug exists on the latest version of pandas.
-
I have confirmed this bug exists on the master branch of pandas.
Reproducible Example
import pandas as pd
excel_path = r"D:\my_test.xlsm"
with pd.ExcelWriter(excel_path, mode='a') as writer:
# df.to_excel(writer, sheet_name='Sheet1')
pass
df_excel = pd.read_excel(excel_path, 'Sheet1')
Issue Description
The specified Excel file gets corrupted, despite the append mode. Changing its extension to .xlsx can retrieve the raw data from it, but the macro is forever gone.
In my example I used with and had only a pass statement inside it, yet the file became useless.
Q1: Can we know for sure that the default engine is used for .xlsm files? Q2: Should I contact OpenPyxl instead of you with this issue?
Expected Behavior
As I’ve read since my first encounter of this bug, ExcelWriter doesn’t support macros, but a warning of some kind would be nice before wiping out a carefully written VBA code from the Excel file.
Installed Versions
INSTALLED VERSIONS
commit : 945c9ed766a61c7d2c0a7cbb251b6edebf9cb7d5 python : 3.7.4.final.0 python-bits : 64 OS : Windows OS-release : 10 Version : 10.0.19041 machine : AMD64 processor : Intel64 Family 6 Model 158 Stepping 9, GenuineIntel byteorder : little LC_ALL : None LANG : None LOCALE : None.None pandas : 1.3.4 numpy : 1.17.3 pytz : 2021.3 dateutil : 2.8.2 pip : 19.3.1 setuptools : 40.8.0 Cython : None pytest : None hypothesis : None sphinx : None blosc : None feather : None xlsxwriter : None lxml.etree : None html5lib : None pymysql : None psycopg2 : None jinja2 : None IPython : None pandas_datareader: None bs4 : None bottleneck : None fsspec : None fastparquet : None gcsfs : None matplotlib : None numexpr : None odfpy : None openpyxl : 3.0.9 pandas_gbq : None pyarrow : None pyxlsb : None s3fs : None scipy : None sqlalchemy : None tables : None tabulate : None xarray : None xlrd : None xlwt : None numba : None
Issue Analytics
- State:
- Created 2 years ago
- Comments:25 (10 by maintainers)
Top GitHub Comments
What I do to create a valid .xlsm file directly in python is also with openpyxl I do something like
The first part with
Workbook()
creates the file, but it is corrupted, however, when you open it withload_workbook
and thekeep_vba=True
arg, the file is magically fixed and ready to be manipulated.The most I could do with this was to modify
pandas.io.excel._openpyxl.py.OpenpyxlWriter.__init__.book (line 65)
to directly add"keep_vba = True"
since this line does not include any motor_kwargs. And from what I tried this doesn’t affect .xlsx filesHowever to append data to existing sheet, you should do something like:
or modify again
_openpyxl.py
to add something like an"append"
option todef write_cells()