Internal trailing newline character exposed via getText/getLength/etc.

It appears that Quill internally appends a trailing newline character to the text model. This trailing newline is not displayed to the user in the editor, nor can it be included in the editor’s selection via the setSelection method. It is, however, returned by getText and getContents, and it is counted by getLength. This is a little confusing for programmatic use, for instance when trying to verify the contents of the editor after setting them via the API.

Steps for Reproduction

1. Visit quilljs.com.
2. Pause JS execution.
3. Set the text of the Quill instance to a string via setText.
4. Get the text of the Quill instance via getText.
5. Verify that the text is set correctly using string equality.

Example:
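
A minimal sketch of the check in the steps above. It does not drive a real editor: the values that getText and getLength are reported to return are simulated as plain values (an assumption based on the description in this issue), and stripTrailingNewline is a hypothetical helper, not part of Quill.

```javascript
// Simulated values: what this issue reports after quill.setText("Hello World").
// They are stand-ins (assumptions), not output captured from a live editor.
const expected = "Hello World";
const reportedText = expected + "\n";        // getText() as described in the issue
const reportedLength = expected.length + 1;  // getLength() counts the newline too

// Hypothetical helper: drop exactly one trailing newline, if present.
function stripTrailingNewline(text) {
  return text.endsWith("\n") ? text.slice(0, -1) : text;
}

console.log(reportedText === expected);                       // false
console.log(stripTrailingNewline(reportedText) === expected); // true
console.log(reportedLength - 1 === expected.length);          // true
```

Against a real instance, the same comparison would presumably be written as stripTrailingNewline(quill.getText()) === expected.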

Expected behavior: The new line character is stripped/ignored in return values from API calls.

Actual behavior: The new line character is returned/counted in return values from API calls.

Platforms: n/a

Version: 1.0.0-rc.2

It is entirely possible that there is a good reason for exposing this newline character externally, but I have not been able to find a reference to it in the issues or documentation. Thanks!

Issue Analytics

State:
Created 7 years ago
Comments: 6 (4 by maintainers)

Top GitHub Comments

3 reactions

pdiveris commented, Apr 15, 2018

Those carriage returns are crazy. I am at a loss to understand what role they might have been designed to fulfil one day. I’ve spent a whole day working on a parser and renderer, fetching with getContents and passing to my server side over JSON. Left unescaped, it breaks JSON; you simply get nulls. And of course it will be left unescaped, because who would have thought to look for carriage returns? I’ve been scratching my head as to why the H tags appear to be linked to empty inserts, even though the actual text specified in a previous OP is correctly shown as an H1 or whatever. I am now writing code in my parser and renderer so that it can merge the newlines and their attributes with the text they belong to, which sits before them. But I do not understand why I have to do that. It’s crazy, or I am missing something. Are they there for a reason?
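
The merging step described in this comment can be sketched roughly as follows. This is an illustration only: deltaToLines is a hypothetical helper, the sample ops are made up, and it assumes that line formats arrive as attributes on a newline insert, as the maintainer's reply on this page explains. It also shows that serializing with JSON.stringify escapes the newline, so the ops survive a round trip.

```javascript
// Hypothetical helper: group Delta ops into lines. Each line collects the text
// segments that precede its newline, plus the attributes found on that newline.
// Simplification: a newline inside an op inherits that op's attributes.
function deltaToLines(ops) {
  const lines = [];
  let segments = [];
  for (const op of ops) {
    if (typeof op.insert !== "string") {
      // Non-string inserts (embeds) are passed through as their own segment.
      segments.push({ embed: op.insert, attributes: op.attributes || {} });
      continue;
    }
    op.insert.split("\n").forEach((part, i) => {
      if (i > 0) {
        // A newline was crossed: its attributes format the line just finished.
        lines.push({ segments, attributes: op.attributes || {} });
        segments = [];
      }
      if (part.length > 0) {
        segments.push({ text: part, attributes: op.attributes || {} });
      }
    });
  }
  // Text after the last newline (not expected if every line ends in "\n").
  if (segments.length > 0) lines.push({ segments, attributes: {} });
  return lines;
}

// Made-up sample: a header line followed by a plain line.
const ops = [
  { insert: "The Two Towers" },
  { insert: "\n", attributes: { header: 1 } },
  { insert: "Some body text\n" },
];

const wire = JSON.stringify(ops);  // newlines are escaped as \n in the JSON text
const received = JSON.parse(wire); // and restored on the other side

for (const line of deltaToLines(received)) {
  console.log(line.attributes, line.segments.map((s) => s.text).join(""));
}
```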

1 reaction

jhchen commented, Sep 3, 2016

This is expected behavior.

The main reason is that line formats are represented by attributes on the newline character, which implies every line needs to have a newline character. It could be added “just in time”, but then when you apply line formatting, the change would contain not a format instruction but an insert-formatted-text instruction. Similarly, removing a line format would not be a remove-format instruction but a delete-text instruction. These behaviors are also surprising, and they have the additional downside of requiring error-prone bookkeeping. I say error-prone because off-by-one errors cover a large class of bugs, and Quill, from the experience of going down this “add/remove newline just in time” route, has experienced many of them. It’s much simpler to always be able to rely on a trailing newline character for every line.
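
To make the distinction concrete, here is a rough sketch of the change that formats a line when the newline is always present. lineFormatChange is a hypothetical helper written for this illustration, not a Quill or Delta function; it assumes a retain op may carry attributes.

```javascript
// Hypothetical helper: build the change that formats the line containing
// `index`, given the document's plain text (assumed to always end in "\n").
function lineFormatChange(text, index, attributes) {
  const newlineAt = text.indexOf("\n", index);
  if (newlineAt === -1) throw new Error("no newline found after index " + index);
  const change = [];
  if (newlineAt > 0) change.push({ retain: newlineAt }); // skip up to the newline
  change.push({ retain: 1, attributes });                // format the newline itself
  return change;
}

const docText = "The Two Towers\n";
console.log(JSON.stringify(lineFormatChange(docText, 0, { header: 1 })));
// [{"retain":14},{"retain":1,"attributes":{"header":1}}]

// With a "just in time" newline, the same user action would instead need an
// insert of formatted text, e.g. { insert: "\n", attributes: { header: 1 } }.
```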

Then the question, of course, is why line formats are represented by an attribute on the newline character. As an example, consider the line “The Two Towers” formatted as header text. Given the current Delta format, there are only two alternatives for representing this formatting if we do not have a newline:

Option 1: any character has the header attribute:

[{
  insert: "The", attributes: { header: 1 }
}, {
  insert: " Two Towers"
}]

Option 2: all characters have the header attribute:

[{
  insert: "The Two Towers", attributes: { header: 1 }
}]

Going with Option 1, what if we delete the text “The”? Its header attribute goes with it and suddenly the line is no longer formatted. Also it introduces ambiguity that additional complexity is required to solve. For example what header level would this line be?

[{
  insert: "The", attributes: { header: 1 }
}, {
  insert: " Two Towers", attributes: { header: 2 }
}]

Option 2 again has the same problem in different forms. Using the mixed header example from above, option 2 says the line has no header format. But again deleting “The” suddenly formats the line with header: 2.

The core problem with both of the above solutions can also be seen intuitively: a header does not describe any individual character or combination of characters on the line; it describes the line itself. So using anything that does not describe the line itself is likely to cause problems.
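
The two failure modes can be checked mechanically. The sketch below is one possible reading of the two options, with hypothetical function names (headerIfAny, headerIfAll); neither exists in Quill.

```javascript
// Option 1 (one possible reading): a line is a header if ANY character has one.
// With mixed levels it just returns the first one found, which is the ambiguity.
function headerIfAny(ops) {
  const op = ops.find((o) => o.attributes && o.attributes.header !== undefined);
  return op ? op.attributes.header : undefined;
}

// Option 2 (one possible reading): a header only if ALL characters agree.
function headerIfAll(ops) {
  const levels = new Set(ops.map((o) => (o.attributes || {}).header));
  return levels.size === 1 ? [...levels][0] : undefined;
}

const partial = [
  { insert: "The", attributes: { header: 1 } },
  { insert: " Two Towers" },
];
const mixed = [
  { insert: "The", attributes: { header: 1 } },
  { insert: " Two Towers", attributes: { header: 2 } },
];

// Deleting the text "The" removes the first op in both samples.
const withoutThe = (ops) => ops.slice(1);

console.log(headerIfAny(partial));             // 1
console.log(headerIfAny(withoutThe(partial))); // undefined: the format vanished
console.log(headerIfAll(mixed));               // undefined
console.log(headerIfAll(withoutThe(mixed)));   // 2: the line suddenly has a format
```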

We can go deeper and ask why does the Delta format have this limitation then? Right now the format is incredibly simple and expressive. Using only characters and attributes, it can describe any document. With just three operations, it can describe any change to any document. Though there are other benefits of adding an additional “line” primitive, it compounds the complexity and will propagate through everything it touches.
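
As a rough illustration of how little machinery the three operations (insert, retain and delete) need, here is a simplified sketch that applies a change to plain text. It ignores attributes entirely, applyChange is a hypothetical name, and it is not how Quill or the Delta library is actually implemented.

```javascript
// Simplified sketch: apply a list of insert/retain/delete ops to plain text.
// Attributes are ignored; this only shows how the three operations combine.
function applyChange(text, ops) {
  let out = "";
  let pos = 0; // read position in the original text
  for (const op of ops) {
    if (typeof op.insert === "string") {
      out += op.insert;                        // add new text
    } else if (typeof op.retain === "number") {
      out += text.slice(pos, pos + op.retain); // keep existing text
      pos += op.retain;
    } else if (typeof op.delete === "number") {
      pos += op.delete;                        // skip existing text
    }
  }
  return out + text.slice(pos); // anything the change does not mention is kept
}

const changed = applyChange("The Two Towers\n", [
  { retain: 4 },
  { delete: 3 },
  { insert: "Three" },
]);
console.log(JSON.stringify(changed)); // "The Three Towers\n"
```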

It’s much easier for Quill to just force trailing newlines.

I’ll add a note to the getText and getLength docs.
