Support for Semi-Structured XML
Semi-structured elements should not be parsed and their content should be kept as is. For example:
<level1>
<level2>
<level3>First level3.</level3>
text outside 1st level3 at the end
</level2>
<level2>
text outside 2nd level3 at the beginning
<level3>Second level3.</level3>
text outside 2nd level3 at the end
</level2>
<level2>
text outside 3rd level3 at the beginning
<level3>Third level3.</level3>
</level2>
</level1>
will produce (at the corresponding levels):
level2: [
{ #text: "text outside 1st level3 at the end", level3: "First level3." },
{ #text: "text outside 2nd level3 at the beginningtext outside 2nd level3 at the end", level3: "Second level3." },
{ #text: "text outside 3rd level3 at the beginning", level3: "Third level3." },
]
which is not only irreversible (the order is not kept) but, for the 2nd level3, also meaningless (the two texts are joined). According to the “spec” you claim to be adopting, at least the case of the 2nd level3 should not be parsed.
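The ordering problem can be illustrated with a minimal sketch. It uses Python's standard xml.etree.ElementTree rather than the library discussed in this issue, and the helper name mixed_content is hypothetical. It only shows that the text runs and child elements of a mixed-content element can be kept as an ordered list instead of being merged into a single #text value.

```python
import xml.etree.ElementTree as ET


def mixed_content(elem):
    """Return the text runs and children of elem in document order.

    Hypothetical helper for illustration only; it handles one level
    of nesting and ignores attributes.
    """
    items = []
    # Text before the first child is stored on the element itself.
    if elem.text and elem.text.strip():
        items.append(("#text", elem.text.strip()))
    for child in elem:
        items.append((child.tag, (child.text or "").strip()))
        # Text after a child is stored as that child's "tail".
        if child.tail and child.tail.strip():
            items.append(("#text", child.tail.strip()))
    return items


SAMPLE = """<level2>
text outside 2nd level3 at the beginning
<level3>Second level3.</level3>
text outside 2nd level3 at the end
</level2>"""

items = mixed_content(ET.fromstring(SAMPLE))
for kind, value in items:
    print(kind, repr(value))
```

With a representation like this, the two text runs stay separate and the position of level3 between them is not lost.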
It would also be great if tag name(s) could be specified (as a parameter to the parse function) whose content wouldn’t be parsed at all. This could also solve the described issue in some cases (the user would specify that the tag level2 shouldn’t be parsed and its content should be kept in the #text property).
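A rough sketch of what such a parameter could look like, written against Python's standard xml.etree.ElementTree. The function parse, its raw_tags parameter, and the helper inner_xml are all hypothetical names invented for this illustration; they are not part of the library's actual API.

```python
import xml.etree.ElementTree as ET


def inner_xml(elem):
    """Serialize everything between elem's start and end tags, unparsed."""
    # ET.tostring() on a child also emits the child's tail text,
    # so text that follows each child is kept in its original place.
    return (elem.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in elem
    )


def parse(xml_text, raw_tags=()):
    """Hypothetical parse(): tags listed in raw_tags keep their content as is."""
    def convert(elem):
        if elem.tag in raw_tags:
            # Do not descend; store the raw markup under "#text".
            return {"#text": inner_xml(elem).strip()}
        if len(elem) == 0:
            return (elem.text or "").strip()
        # Simplified: text mixed between children of non-raw tags is dropped.
        result = {}
        for child in elem:
            result.setdefault(child.tag, []).append(convert(child))
        return result

    root = ET.fromstring(xml_text)
    return {root.tag: convert(root)}


doc = "<level1><level2>before <level3>Second level3.</level3> after</level2></level1>"
parsed = parse(doc, raw_tags={"level2"})
print(parsed)
```

Here the content of level2 comes back as one string with the level3 markup still in place, so the original order can be recovered by the caller.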
Thank you in advance for your comments.
Issue Analytics
- State:
- Created 9 years ago
- Reactions:7
- Comments:12 (2 by maintainers)
Top GitHub Comments
AFAIK, there are two options: a function which unescapes &amp;, &lt;, and &gt; in a string of data, or a function (one in Python 2, another in Python 3) which is not documented.
Both are a standard part of Python. The (current) resulting string could be passed into one of these functions to get rid of the entities. But I’m not familiar with the internal processes of your module so I don’t know if it can be really used as I suppose.
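As a hedged illustration of post-processing a string to get rid of entities: the function names are missing from the comment as shown here, so the two standard-library calls below are assumptions about the kind of function meant, not a quotation. xml.sax.saxutils.unescape handles only &amp;, &lt;, and &gt; by default, while html.unescape (documented, Python 3.4 and later) handles all named and numeric character references.

```python
import html
from xml.sax.saxutils import unescape as sax_unescape

# A string as it might come back from a parser that leaves entities untouched.
raw = "Tom &amp; Jerry &lt;level3&gt; &quot;quoted&quot;"

# Only &amp;, &lt; and &gt; are replaced; &quot; is left alone.
sax_result = sax_unescape(raw)

# All HTML character references are replaced, including &quot;.
html_result = html.unescape(raw)

print(sax_result)
print(html_result)
```

Which of the two fits depends on whether the input can contain entities beyond the three XML basics; sax_unescape also accepts an optional dictionary of extra replacements.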
I agree. With an external, independent JSON parser we get a different order of elements.