HTML deserializer behaves unintuitively

Do you want to request a feature or report a bug?

Document/clarify html deserializer’s behavior

What’s the current behavior?

Consider the markup below containing two paragraphs:

const htmlContent = `
<p>
  Paragraph 1
  Hello World
</p>

<p>Paragraph 2</p>
`;

const initialValue = html.deserialize(htmlContent);

When this markup is deserialized, the rules are triggered in the following sequence (console output from the provided example):

---> el.tagName: P, block: paragraph
---> el.tagName: , block: undefined
---> el.tagName: , block: undefined
---> el.tagName: P, block: paragraph
---> el.tagName: , block: undefined

The two calls with el.tagName: P are understandable, but I cannot explain the three calls with no tagName. I suspect they are fired because of line breaks in the markup, but that’s not very intuitive. If these are intentional, how should they be handled?
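
One plausible reading of the log, offered as a sketch rather than a confirmed diagnosis: the calls with no tagName are DOM text nodes, which do not carry a tagName. Under that reading, two of the three are the text inside each <p>, and only the middle one is the whitespace between the paragraphs. The snippet below models this with hand-built objects. The `element`, `text`, and `visit` helpers and the shape of the tree are assumptions made for illustration; they are not part of Slate or of the CodeSandbox.

```javascript
// Hand-built stand-ins for DOM nodes: nodeType 1 = element, 3 = text.
const element = (tagName, childNodes) => ({ nodeType: 1, tagName, childNodes });
const text = (nodeValue) => ({ nodeType: 3, nodeValue, childNodes: [] });

// Assumed shape of the parsed markup: two <p> elements with a
// whitespace-only text node between them.
const body = [
  element('P', [text('\n  Paragraph 1\n  Hello World\n')]),
  text('\n\n'),
  element('P', [text('Paragraph 2')]),
];

// Depth-first walk that records what a logging rule would see for each node.
function visit(nodes, log = []) {
  for (const el of nodes) {
    // Text nodes have no tagName, so this logs an empty string for them.
    log.push(`el.tagName: ${el.tagName || ''}`);
    visit(el.childNodes, log);
  }
  return log;
}

const calls = visit(body);
console.log(calls.join('\n'));
```

The walk produces five entries in the order P, empty, empty, P, empty, which matches the sequence in the console output above.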

The paragraphs are rendered as nested <p> and <span> tags (following the Slate data model), but that creates an issue with rendering of the first paragraph which has a line break inside the <p> tag.

Here’s the rendered markup:

And here’s how it is rendered on screen:

As you can see, paragraph 1 has extra space in the beginning and is broken into two lines. Based on the markup it should be all in one line - white space should be ignored. What is the recommended way to solve this issue?

Three paragraphs are rendered instead of two - see the large gap between paragraph 1 and paragraph 2 above. I suspect this is coming from the line break between the two paragraphs in the markup. What is the recommended way to work around this?

Supporting material:

CodeSandbox reproducing the issue: https://codesandbox.io/s/slate-editor-issue-rktwj
Slate: 0.47.4, Slate React: 0.22.4, Slate HTML Serializer: 0.8.6, latest Chrome, MacOS

What’s the expected behavior?

The deserializer should:

- not trigger rules with undefined tag names (in the example above, only two rules should be fired, both with tagName = P)
- render paragraphs with line breaks correctly: leading whitespace and line breaks should be ignored
- ignore line breaks between tags
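
Until the deserializer behaves this way, one possible workaround is a custom rule placed ahead of the others that drops whitespace-only text nodes and collapses whitespace in the rest. This is a minimal sketch, not a confirmed fix. `collapseWhitespace` and `whitespaceRule` are hypothetical names, and the sketch assumes that a rule returning `undefined` defers to the next rule, that returning `null` makes the serializer skip the node, and that a text node is represented as `{ object: 'text', text, marks }` in this Slate version. Please verify those assumptions against the serializer version you use.

```javascript
const TEXT_NODE = 3; // value of Node.TEXT_NODE in the DOM

// Collapse runs of whitespace (including line breaks) into single spaces
// and strip leading and trailing whitespace.
function collapseWhitespace(value) {
  return value.replace(/\s+/g, ' ').trim();
}

// Hypothetical rule, intended to be listed before the other rules.
const whitespaceRule = {
  deserialize(el) {
    // Not a text node: return undefined so later rules can handle it (assumed).
    if (el.nodeType !== TEXT_NODE) return undefined;

    const cleaned = collapseWhitespace(el.nodeValue);

    // Whitespace-only text: return null to drop the node (assumed behavior).
    if (cleaned === '') return null;

    // Assumed text node shape for this Slate version.
    return { object: 'text', text: cleaned, marks: [] };
  },
};

console.log(collapseWhitespace('\n  Paragraph 1\n  Hello World\n'));
```

One caveat with this approach: trimming every text node also removes the space next to inline elements, so "Hello <b>World</b>" would lose the space before "World". A more careful version would trim only at block boundaries.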

Issue Analytics

Created 4 years ago
Reactions: 7
Comments: 7 (3 by maintainers)

Top GitHub Comments

ianstormtaylor commented, Nov 28, 2019

I believe that this may be fixed by https://github.com/ianstormtaylor/slate/pull/3093, which has changed a lot of the logic in Slate and slate-react especially. I’m going to close this out, but as always, feel free to open a new issue if it persists for you. Thanks for understanding.

nareshbhatia commented, Oct 3, 2019

@minimaluminium, I have not made any progress on this. I am currently focused on other aspects of creating an off-the-shelf Slate editor. Here’s my work so far.
