Buffer performance improvements

Problem

Memory

Right now our buffer takes up too much memory, particularly for an application that launches multiple terminals with large scrollbacks. For example, the demo using a 160x24 terminal with a filled 5000-line scrollback takes around 34 MB of memory (see https://github.com/Microsoft/vscode/issues/29840#issuecomment-314539964). Remember that’s just a single terminal, and 1080p monitors would likely use wider terminals. Also, in order to support truecolor (https://github.com/sourcelair/xterm.js/issues/484), each character will need to store 2 additional numbers, which will almost double the current memory consumption of the buffer.
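To put the 34 MB figure in per-cell terms, here is a rough back-of-the-envelope calculation. It assumes the buffer holds scrollback plus viewport rows and that all of the measured memory belongs to the buffer; the packed layout at the end is purely hypothetical and only there for comparison.

```typescript
// Figures taken from the example above.
const cols = 160;
const rows = 24;
const scrollback = 5000;
const measuredBytes = 34 * 1024 * 1024;

// Assumption: the buffer holds scrollback + viewport rows.
const cells = cols * (rows + scrollback);

// Assumption: all measured memory is attributed to the buffer.
const bytesPerCell = measuredBytes / cells;

// Hypothetical packed layout: 4 Int32 slots per cell
// (character, attributes, truecolor fg, truecolor bg).
const packedBytes = cells * 4 * Int32Array.BYTES_PER_ELEMENT;

function toMB(bytes: number): string {
  return (bytes / (1024 * 1024)).toFixed(1);
}

console.log(`${cells} cells, ~${Math.round(bytesPerCell)} bytes per cell today`);
console.log(`hypothetical packed layout: ${toMB(packedBytes)} MB`);
```

Under these assumptions each cell costs a few dozen bytes today, which is why adding two more numbers per cell is a concern.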

Slow fetching of a row’s text

The other problem is that we need to fetch the actual text of a line quickly. This is slow because of the way the data is laid out: a line contains an array of characters, each holding a single-character string. So we construct the string, and it is up for garbage collection immediately afterwards. Previously we didn’t need to do this at all because the text was pulled from the line buffer (in order) and rendered to the DOM. However, this is becoming increasingly useful as we improve xterm.js further; features like selection and links both pull this data. Again using the 160x24/5000 scrollback example, it takes 30-60ms to copy the entire buffer on a Mid-2014 MacBook Pro.
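A minimal sketch of what fetching a row's text looks like with the per-character layout. The `CharData` shape here is simplified to `[character, width]`, and the function names are illustrative, not the actual xterm.js API.

```typescript
// Simplified cell: [character, width]. The real layout carries more fields.
type CharData = [string, number];

// Every call walks the cells and allocates a fresh string that becomes
// garbage as soon as the caller is done with it.
function lineToString(line: CharData[]): string {
  let result = '';
  for (let i = 0; i < line.length; i++) {
    result += line[i][0];
  }
  return result;
}

// Copying the whole buffer repeats that work for every row.
function bufferToString(lines: CharData[][]): string {
  return lines.map(lineToString).join('\n');
}
```

Selection and link detection both need something like `lineToString`, so the cost is paid on every such operation rather than once.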

Supporting the future

Another potential problem is the future introduction of a view model, which may need to duplicate some or all of the data in the buffer. This sort of thing will be needed to implement reflow (https://github.com/sourcelair/xterm.js/issues/622) properly (https://github.com/sourcelair/xterm.js/pull/644#issuecomment-298058556) and may also be needed to properly support screen readers (https://github.com/sourcelair/xterm.js/issues/731). It would certainly be good to have some wiggle room when it comes to memory.

This discussion started in https://github.com/sourcelair/xterm.js/issues/484; this issue goes into more detail and proposes some additional solutions.

I’m leaning towards solution 3 and moving towards solution 5 if there is time and it shows a marked improvement. Would love any feedback! /cc @jerch, @mofux, @rauchg, @parisk

1. Simple solution

This is basically what we’re doing now, just with truecolor fg and bg added.

// [0]: The character
// [1]: width
// [2]: attributes
// [3]: truecolor bg
// [4]: truecolor fg
type CharData = [string, number, number, number, number];

type LineData = CharData[];

Pros

Very simple

Cons

Too much memory consumed, would nearly double our current memory usage which is already too high.

2. Pull text out of CharData

This would store the string against the line rather than against each character. This would probably yield very large gains in selection and linkifying, and quick access to a line’s entire string would become more useful as time goes on.

interface ILineData {
  // This would provide fast access to the entire line which is becoming more
  // and more important as time goes on (selection and links need to construct
  // this currently). This would need to reconstruct text whenever charData
  // changes though. We cannot lazily evaluate text due to the chars not being
  // stored in CharData
  text: string;
  charData: CharData[];
}

// [0]: charIndex
// [1]: attributes
// [2]: truecolor bg
// [3]: truecolor fg
type CharData = Int32Array;

Pros

No need to reconstruct the line whenever we need it.
Lower memory than today due to the use of an Int32Array

Cons

Slow to update individual characters, the entire string would need to be regenerated for single character changes.
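The update cost mentioned in the cons can be illustrated with a small sketch. `TextLine` is a hypothetical class, and it assumes one UTF-16 code unit per cell for simplicity.

```typescript
// Hypothetical line type for option 2: the text lives on the line itself.
class TextLine {
  public text: string;

  constructor(text: string) {
    this.text = text;
  }

  // JavaScript strings are immutable, so writing a single cell allocates
  // a whole new string for the line.
  // Assumption: each cell is exactly one UTF-16 code unit.
  public setChar(col: number, ch: string): void {
    this.text = this.text.slice(0, col) + ch + this.text.slice(col + 1);
  }
}

const line = new TextLine('hello');
line.setChar(1, 'a');
console.log(line.text);
```

Reads become a simple property access, while every write pays for a full-line copy.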

3. Store attributes in ranges

Pulling the attributes out and associating them with a range. Since there can never be overlapping attributes, this can be laid out sequentially.

type LineData = CharData[]

// [0]: The character
// [1]: The width
type CharData = [string, number];

class CharAttributes {
  public readonly _start: [number, number];
  public readonly _end: [number, number];
  private _data: Int32Array;

  // Getters pull data from _data (woo encapsulation!)
  public get flags(): number;
  public get truecolorBg(): number;
  public get truecolorFg(): number;
}

class Buffer extends CircularList<LineData> {
  // Sorted list since items are almost always pushed to end
  private _attributes: CharAttributes[];

  public getAttributesForRows(start: number, end: number): CharAttributes[] {
    // Binary search _attributes and return all visible CharAttributes to be
    // applied by the renderer
  }
}

Pros

Lower memory than today even though we’re also storing truecolor data
Can optimize application of attributes, rather than checking every single character’s attribute and diffing it to the one before
Encapsulates the complexity of storing the data inside an array (.flags instead of [0])

Cons

Changing attributes of a range of characters inside another range is more complex
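Here is a rough sketch of how the binary search in `getAttributesForRows` might work. It simplifies the `[number, number]` start/end positions to row numbers only, and `AttributeRange` is a hypothetical stand-in for `CharAttributes`.

```typescript
// Hypothetical, simplified range: rows only, endRow inclusive.
interface AttributeRange {
  startRow: number;
  endRow: number;
  flags: number;
}

// Assumes `ranges` is sorted by position and non-overlapping, so both
// startRow and endRow are non-decreasing across the array.
function getAttributesForRows(
  ranges: AttributeRange[],
  start: number,
  end: number
): AttributeRange[] {
  // Binary search for the first range that ends at or after `start`.
  let lo = 0;
  let hi = ranges.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (ranges[mid].endRow < start) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  // Collect ranges until one starts after `end`.
  const result: AttributeRange[] = [];
  for (let i = lo; i < ranges.length && ranges[i].startRow <= end; i++) {
    result.push(ranges[i]);
  }
  return result;
}
```

The search is logarithmic in the number of ranges, and the collection loop only touches ranges that intersect the requested rows.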

4. Put attributes in a cache

The idea here is to leverage the fact that there generally aren’t that many styles in any one terminal session, so we should create as few as necessary and reuse them.

// [0]: The character
// [1]: The width
// [2]: The attributes (shared instance from the cache)
type CharData = [string, number, CharAttributes];

type LineData = CharData[];

class CharAttributes {
  private _data: Int32Array;

  // Getters pull data from _data (woo encapsulation!)
  public get flags(): number;
  public get truecolorBg(): number;
  public get truecolorFg(): number;
}

interface ICharAttributeCache {
  // Never construct duplicate CharAttributes; figuring out the best way to
  // access both in the best and worst case is the tricky part here
  getAttributes(flags: number, fg: number, bg: number): CharAttributes;
}

Pros

Similar memory usage to today even though we’re also storing truecolor data
Encapsulates the complexity of storing the data inside an array (.flags instead of [0])

Cons

Less memory savings than the ranges approach
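One possible way to implement the cache is a `Map` keyed on the three values. This is only a sketch: the string key, the `CharAttributeCache` class name, and the `size` getter are assumptions for illustration, and entries are never evicted here.

```typescript
class CharAttributes {
  private _data: Int32Array;

  constructor(flags: number, fg: number, bg: number) {
    this._data = new Int32Array([flags, fg, bg]);
  }

  public get flags(): number { return this._data[0]; }
  public get truecolorFg(): number { return this._data[1]; }
  public get truecolorBg(): number { return this._data[2]; }
}

// Hypothetical implementation of ICharAttributeCache.
class CharAttributeCache {
  private _cache = new Map<string, CharAttributes>();

  public getAttributes(flags: number, fg: number, bg: number): CharAttributes {
    // Assumption: a string key is good enough; a real implementation
    // might prefer a cheaper numeric hash.
    const key = `${flags}:${fg}:${bg}`;
    let attrs = this._cache.get(key);
    if (attrs === undefined) {
      attrs = new CharAttributes(flags, fg, bg);
      this._cache.set(key, attrs);
    }
    return attrs;
  }

  public get size(): number {
    return this._cache.size;
  }
}
```

Cells with the same style then share one `CharAttributes` instance, so the per-cell cost is a single reference.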

5. Hybrid of 3 & 4

type LineData = CharData[]

// [0]: The character
// [1]: The width
type CharData = [string, number];

class CharAttributes {
  private _data: Int32Array;

  // Getters pull data from _data (woo encapsulation!)
  public get flags(): number;
  public get truecolorBg(): number;
  public get truecolorFg(): number;
}

interface CharAttributeEntry {
  attributes: CharAttributes,
  start: [number, number],
  end: [number, number]
}

class Buffer extends CircularList<LineData> {
  // Sorted list since items are almost always pushed to end
  private _attributes: CharAttributeEntry[];
  private _attributeCache: ICharAttributeCache;

  public getAttributesForRows(start: number, end: number): CharAttributeEntry[] {
    // Binary search _attributes and return all visible CharAttributeEntry
    // objects to be applied by the renderer
  }
}

interface ICharAttributeCache {
  // Never construct duplicate CharAttributes; figuring out the best way to
  // access both in the best and worst case is the tricky part here
  getAttributes(flags: number, fg: number, bg: number): CharAttributes;
}

Pros

Potentially the fastest and most memory efficient
Very memory efficient when the buffer contains many blocks with styles but only from a few styles (the common case)
Encapsulates the complexity of storing the data inside an array (.flags instead of [0])

Cons

More complex than the other solutions; it may not be worth including the cache if we already keep a single CharAttributes per block
Extra overhead in CharAttributeEntry object
Changing attributes of a range of characters inside another range is more complex
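To show why changing attributes inside an existing range is more involved, here is a simplified sketch that works on one-dimensional offsets instead of `[number, number]` positions. `Span` and `applyAttribute` are hypothetical names, `attr` stands in for a `CharAttributes` reference, and adjacent spans with equal attributes are not merged.

```typescript
// Hypothetical 1-D span: [start, end) with an attribute id.
interface Span {
  start: number;
  end: number; // exclusive
  attr: number;
}

// Assumes `spans` is sorted and non-overlapping, and start < end.
// Returns a new sorted list with [start, end) set to `attr`.
function applyAttribute(spans: Span[], start: number, end: number, attr: number): Span[] {
  const result: Span[] = [];
  let inserted = false;
  for (const s of spans) {
    if (s.end <= start) {
      // Entirely before the new span: keep as-is.
      result.push(s);
      continue;
    }
    if (s.start >= end) {
      // Entirely after the new span: emit the new span first if needed.
      if (!inserted) {
        result.push({ start, end, attr });
        inserted = true;
      }
      result.push(s);
      continue;
    }
    // Overlap: keep the part on the left, then the new span, then the
    // part on the right.
    if (s.start < start) {
      result.push({ start: s.start, end: start, attr: s.attr });
    }
    if (!inserted) {
      result.push({ start, end, attr });
      inserted = true;
    }
    if (s.end > end) {
      result.push({ start: end, end: s.end, attr: s.attr });
    }
  }
  if (!inserted) {
    result.push({ start, end, attr });
  }
  return result;
}
```

A single write in the middle of a styled block turns one entry into three, which is the extra bookkeeping the cons refer to.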

6. Hybrid of 2 & 3

This takes solution 3 but also adds a lazily evaluated text string for fast access to the line text. Since we’re also storing the characters in CharData, we can lazily evaluate it.

type LineData = {
  text: string,
  charData: CharData[]
}

// [0]: The character
// [1]: The width
type CharData = [string, number];

class CharAttributes {
  public readonly _start: [number, number];
  public readonly _end: [number, number];
  private _data: Int32Array;

  // Getters pull data from _data (woo encapsulation!)
  public get flags(): number;
  public get truecolorBg(): number;
  public get truecolorFg(): number;
}

class Buffer extends CircularList<LineData> {
  // Sorted list since items are almost always pushed to end
  private _attributes: CharAttributes[];

  public getAttributesForRows(start: number, end: number): CharAttributes[] {
    // Binary search _attributes and return all visible CharAttributes to be
    // applied by the renderer
  }

  // If we construct the line, hang onto it
  public getLineText(line: number): string;
}

Pros

Lower memory than today even though we’re also storing truecolor data
Can optimize application of attributes, rather than checking every single character’s attribute and diffing it to the one before
Encapsulates the complexity of storing the data inside an array (.flags instead of [0])
Faster access to the actual line string

Cons

Extra memory due to hanging onto line strings
Changing attributes of a range of characters inside another range is more complex
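The "if we construct the line, hang onto it" idea could look roughly like this. `CachedLine` and its `rebuildCount` field are hypothetical and exist only to make the caching behavior visible.

```typescript
// Simplified cell: [character, width].
type CharData = [string, number];

class CachedLine {
  private _chars: CharData[];
  private _text: string | undefined;
  // Illustration only: counts how often the string is rebuilt.
  public rebuildCount = 0;

  constructor(chars: CharData[]) {
    this._chars = chars;
  }

  public setChar(col: number, ch: string, width: number): void {
    this._chars[col] = [ch, width];
    // Any write invalidates the cached string.
    this._text = undefined;
  }

  public getText(): string {
    if (this._text === undefined) {
      this._text = this._chars.map(c => c[0]).join('');
      this.rebuildCount++;
    }
    return this._text;
  }
}
```

Writes stay cheap because they only drop the cached string, and repeated reads of an unchanged line reuse it, at the cost of keeping the string alive.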

Solutions that won’t work

Storing the string as an int inside an Int32Array will not work as it takes far too long to convert the int back to a character.
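For reference, the conversion step in question would look something like the sketch below. It assumes each slot holds a Unicode code point, makes no claim about actual timings, and uses a made-up function name.

```typescript
// Assumption: each Int32Array slot stores one Unicode code point.
function lineFromCodePoints(cells: Int32Array): string {
  let result = '';
  for (let i = 0; i < cells.length; i++) {
    // One conversion call per cell, on every read of the line.
    result += String.fromCodePoint(cells[i]);
  }
  return result;
}

console.log(lineFromCodePoints(new Int32Array([104, 105])));
```

The per-cell call on every read is the overhead this section is concerned about.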

Issue Analytics

Created 6 years ago
Reactions: 14
Comments: 73 (72 by maintainers)

Top GitHub Comments

4 reactions

Tyriar commented, Oct 8, 2018

Current state:

Refactor to allow multiple buffer implementations https://github.com/xtermjs/xterm.js/pull/1632
Add TypedArray-based buffer implementation https://github.com/xtermjs/xterm.js/pull/1641
Reuse arrays that are trimmed off the end of the scrollback https://github.com/xtermjs/xterm.js/pull/1731
Remove old buffer implementation (after we’re sure the new is stable), improve access to typed array buffer so it’s not creating unnecessary in-between objects: https://github.com/xtermjs/xterm.js/blob/8d16bb6ea2e90312b0efb28bee3f39d3995a906f/src/BufferLine.ts#L153-L162

After:

True color https://github.com/xtermjs/xterm.js/issues/484

4 reactions

rebornix commented, May 22, 2018

Hope my answer to where monaco stores the buffer is not too late.

Alex and I are in favor of ArrayBuffer, and most of the time it gives us good performance. Some places where we use ArrayBuffer:

Line break offsets in Text Buffer https://github.com/Microsoft/vscode/blob/rebornix/review/src/vs/editor/common/model/pieceTreeTextBuffer/pieceTreeBase.ts#L31 .
Offset <–> Position mapping https://github.com/Microsoft/vscode/blob/rebornix/review/src/vs/editor/common/viewModel/prefixSumComputer.ts#L26
Tokens https://github.com/Microsoft/vscode/blob/rebornix/review/src/vs/editor/common/model/textModelTokens.ts#L1 You may be interested in this one. We tokenize the source code in another thread and store them in Array Buffer, and then we use the same array buffer as the backing store of the View Model. I’m even thinking about communicating with the tokenizer (node-oniguruma) in a web worker and then the UI code doesn’t need to worry about responsiveness and the tokens can be shared to the main thread by SharedArrayBuffer

We use simple strings for the text buffer instead of ArrayBuffer, as V8 strings are easier to manipulate:

We do the encoding/decoding at the very beginning of loading a file, so files are converted to JS strings. V8 decides whether to use one byte or two to store a character.
We do edits on the text buffer very often, strings are easier to handle.
We are using nodejs native module and have access to V8 internals when necessary.
