POM XML formatter incorrectly adds newlines before CDATA sections

(This was originally filed as eclipse-m2e/m2e-core#1015, but I was told I need to file it with Eclipse Wild Web Developer.)

Using Eclipse 2022-09, I can add a CDATA section inside a Maven POM (pom.xml) property definition like this:

<properties>
    <test.cdata><![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
	<entry key="foo">bar</entry>
</properties>]]></test.cdata>
</properties>

The purpose is obvious: I don’t want to XML-encode all the XML delimiters in the value.

However, when I format the file using Ctrl+Shift+F, Eclipse adds newlines and indents before and after the CDATA section:

<properties>
    <test.cdata>
      <![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
	<entry key="foo">bar</entry>
</properties>]]>
    </test.cdata>
</properties>

This corrupts the data! The whole point of using a CDATA section is that I wanted the exact content specified in a property. Now extra whitespace has been added in the value of the property. (It would have been worse to format inside the CDATA section, but this is still not good.)
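
To make the corruption concrete, here is a minimal sketch that parses both fragments from this report with Python's standard `xml.etree.ElementTree` and compares the text a generic XML parser sees for `test.cdata`. The helper `property_value` and the constant names are made up for illustration, and this only shows what a plain XML parser reports; how Maven itself post-processes property values is not covered here.

```python
import xml.etree.ElementTree as ET

# The CDATA payload from the report (the tab is written as \t).
PAYLOAD = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">\n'
    '<properties>\n'
    '\t<entry key="foo">bar</entry>\n'
    '</properties>'
)

# POM fragment as written by hand: the CDATA section touches both tags.
ORIGINAL = (
    "<properties>\n"
    "    <test.cdata><![CDATA[" + PAYLOAD + "]]></test.cdata>\n"
    "</properties>"
)

# POM fragment after Ctrl+Shift+F: newline and indent on both sides of the CDATA section.
FORMATTED = (
    "<properties>\n"
    "    <test.cdata>\n"
    "      <![CDATA[" + PAYLOAD + "]]>\n"
    "    </test.cdata>\n"
    "</properties>"
)


def property_value(pom_fragment: str) -> str:
    """Return the text content of the first child of the root element."""
    root = ET.fromstring(pom_fragment)
    return root[0].text


original_value = property_value(ORIGINAL)
formatted_value = property_value(FORMATTED)

# The hand-written fragment yields exactly the payload; the formatted one does not.
print("original matches payload: ", original_value == PAYLOAD)
print("formatted matches payload:", formatted_value == PAYLOAD)
print("formatted value starts with:", repr(formatted_value[:10]))
```

The parser merges the surrounding whitespace text and the CDATA content into one string, so the formatted value carries the added newline and indentation at both ends.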

I would guess the problem is that the formatting algorithm is treating CDATA sections as if they were XML DOM elements. That is, I understand that if I have this XML:

<foo><bar>…</bar></foo>

Then the XML formatter should format it to:

<foo>
  <bar>…</bar>
</foo>

However, CDATA sections are not elements! Their purpose is merely to indicate whether content needs escaping or should be interpreted literally. They do not create a new element nesting context! They must be left alone entirely; no newlines, whitespace, or anything else should be added inside the CDATA section, before it, or after it.
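
The same point can be checked against a DOM. The sketch below uses Python's standard `xml.dom.minidom` on two tiny made-up documents (the element name `v` and the helper `child_node_types` are illustrative, not taken from the POM above). It lists the node types of an element's children: a CDATA section appears as a CDATA section node rather than an element node, and wrapping it in newlines adds sibling text nodes to the same parent.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def child_node_types(xml_text):
    """Return the DOM node types of the root element's direct children."""
    element = parseString(xml_text).documentElement
    return [child.nodeType for child in element.childNodes]


# CDATA section directly inside the element, as the author wrote it.
COMPACT_DOC = "<v><![CDATA[<a/>]]></v>"

# Same content after a formatter treats the CDATA section like a child element.
REFORMATTED_DOC = "<v>\n  <![CDATA[<a/>]]>\n</v>"

compact_types = child_node_types(COMPACT_DOC)
reformatted_types = child_node_types(REFORMATTED_DOC)

# Neither list contains Node.ELEMENT_NODE: the CDATA section is character data.
print("compact:    ", compact_types)
print("reformatted:", reformatted_types)
print("any element children:", Node.ELEMENT_NODE in compact_types + reformatted_types)
```

In the reformatted document the extra whitespace becomes text nodes next to the CDATA section, which is why it ends up in the element's value.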

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 11 (7 by maintainers)

Top GitHub Comments

angelozerr commented, Nov 5, 2022

WWD should be updated to expose that setting. So https://github.com/eclipse/wildwebdeveloper/issues/941 still stands

@fbricon we would like to make the experimental formatter the current formatter ASAP, as soon as it supports all the features of the current formatter. @JessicaJHee is working hard on this topic and I think we are on the right track.

In other words, WWD should not expose this setting, which will disappear in a few months.

mickaelistria commented, Nov 4, 2022

Thanks. I could reproduce the issue in the latest snapshots of Wild Web Developer. However, the problem is not “hosted” by Wild Web Developer itself but by the LemMinX language server, which Wild Web Developer uses to provide formatting (among other things). Now that you’re getting familiar with opening the same issue on multiple projects as the investigation goes on, would you mind bringing this issue to https://github.com/eclipse/lemminx/issues 😉 I really believe it will be the final destination here.
