ENH: Possible to add dtype/converters as arguments for pandas.read_xml() ?
Is your feature request related to a problem?
I am using pandas to read XML for further processing. However, columns whose values have leading zeros are always converted to numbers, so the original data is lost.
Describe the solution you’d like
It would be great to add dtype/converters arguments to pandas.read_xml() to force pandas to interpret certain columns with the given dtype/converters, just like similar IO readers (read_csv, read_html, etc.).
API breaking implications
Probably not, this argument could be optional.
Describe alternatives you’ve considered
Write my own code to pull the data out of each XML node, which results in very bad performance.
Issue Analytics
- State:
- Created 2 years ago
- Comments:5 (4 by maintainers)
As a current workaround, consider running XSLT to quote the nodes with leading zeroes and then convert on the pandas side. If using the default lxml parser, XSLT 1.0 scripts are supported in read_xml. The XSLT for this runs the standard identity template and encloses the text values of the zip nodes in double quotes.

Agreed! Good feature to add to the running list. Also, read_xml passes parsed data to the TextParser shared by other IO readers.
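
For anyone trying the XSLT workaround described above, here is a minimal sketch of how it might look. The sample XML, the city and zip element names, and the stylesheet are illustrative assumptions, not the original poster's data or the stylesheet from the comment. It assumes lxml is installed, since read_xml only applies a stylesheet with the lxml parser.

```python
from io import StringIO

import pandas as pd  # assumes lxml is installed; the stylesheet argument needs it

# Hypothetical sample document; the real data in the issue was not shown.
xml = """<data>
  <row><city>Agawam</city><zip>01001</zip></row>
  <row><city>Holtsville</city><zip>00501</zip></row>
</data>"""

# XSLT 1.0: the identity template copies everything unchanged, and a second
# template wraps the text of every <zip> node in double quotes.
xsl = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="zip">
    <xsl:copy>
      <xsl:value-of select="concat('&quot;', text(), '&quot;')"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>"""

# The quoted values can no longer be inferred as numbers, so the column
# is expected to come back as text that still carries the quote characters.
df = pd.read_xml(StringIO(xml), stylesheet=StringIO(xsl))

# "Convert on the pandas side": drop the surrounding quotes again.
df["zip"] = df["zip"].str.strip('"')

print(df)
```

The quoting only exists to block numeric type inference, so any character that makes the value non-numeric would probably work just as well; double quotes are simply easy to strip afterwards.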