Buffer performance improvements
Problem
Memory
Right now our buffer takes up too much memory, particularly for an application that launches multiple terminals with large scrollbacks. For example, the demo using a 160x24 terminal with a filled 5000-line scrollback takes around 34 MB of memory (see https://github.com/Microsoft/vscode/issues/29840#issuecomment-314539964). Remember that’s just a single terminal, and 1080p monitors would likely use wider terminals. Also, in order to support truecolor (https://github.com/sourcelair/xterm.js/issues/484), each character will need to store 2 additional number types, which will almost double the current memory consumption of the buffer.
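As a rough sanity check on those figures, the per-cell cost can be estimated from the numbers quoted above. This is only a back-of-the-envelope sketch: it assumes the buffer holds the viewport rows plus the scrollback rows, and that "34 MB" means 34 MiB of buffer data alone, neither of which is stated in the measurement.

```typescript
// Back-of-the-envelope estimate using only the figures quoted in the text.
const cols = 160;
const viewportRows = 24;
const scrollbackRows = 5000;

// Assumption: "around 34 MB" is read as 34 MiB and attributed entirely to the buffer.
const reportedBytes = 34 * 1024 * 1024;

// Assumption: the buffer stores viewport rows plus scrollback rows.
const cells = cols * (viewportRows + scrollbackRows);
const bytesPerCell = reportedBytes / cells;

console.log(`${cells} cells, roughly ${Math.round(bytesPerCell)} bytes per cell`);
```

Under these assumptions each cell costs a few dozen bytes, which is why adding two more numbers per character is a concern.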
Slow fetching of a row’s text
The other problem is that fetching the actual text of a line needs to be fast. It is slow today because of the way the data is laid out: a line contains an array of characters, each holding a single-character string. To get a line’s text we have to construct the string, which is then up for garbage collection immediately afterwards. Previously we didn’t need to do this at all because the text was pulled from the line buffer (in order) and rendered to the DOM. However, it is becoming increasingly useful as we improve xterm.js further; features like selection and links both pull this data. Again using the 160x24/5000 scrollback example, it takes 30-60ms to copy the entire buffer on a Mid-2014 Macbook Pro.
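To make the cost concrete, here is a minimal sketch of what building a line’s text looks like with a per-character layout. The tuple order and the helper names `lineToString` and `bufferToString` are assumptions for illustration, not the actual xterm.js code.

```typescript
// Hypothetical per-cell layout: [character, width, attributes].
type CharData = [string, number, number];
type LineData = CharData[];

// Every call walks all cells and allocates a fresh string that is
// usually thrown away right after use.
function lineToString(line: LineData): string {
  let text = '';
  for (let i = 0; i < line.length; i++) {
    text += line[i][0];
  }
  return text;
}

// Copying the whole buffer repeats that work for every row.
function bufferToString(lines: LineData[]): string {
  return lines.map(lineToString).join('\n');
}

const sampleLine: LineData = 'ls -la'.split('').map((ch): CharData => [ch, 1, 0]);
console.log(lineToString(sampleLine));
```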
Supporting the future
Another potential problem will come up when we look at introducing some view model, which may need to duplicate some or all of the data in the buffer. This sort of thing will be needed to implement reflow (https://github.com/sourcelair/xterm.js/issues/622) properly (https://github.com/sourcelair/xterm.js/pull/644#issuecomment-298058556), and may also be needed to properly support screen readers (https://github.com/sourcelair/xterm.js/issues/731). It would certainly be good to have some wiggle room when it comes to memory.
This discussion started in https://github.com/sourcelair/xterm.js/issues/484; this issue goes into more detail and proposes some additional solutions.
I’m leaning towards solution 3 and moving towards solution 5 if there is time and it shows a marked improvement. Would love any feedback! /cc @jerch, @mofux, @rauchg, @parisk
1. Simple solution
This is basically what we’re doing now, just with truecolor fg and bg added.
// [0]: character
// [1]: width
// [2]: attributes
// [3]: truecolor bg
// [4]: truecolor fg
type CharData = [string, number, number, number, number];
type LineData = CharData[];
Pros
- Very simple
Cons
- Too much memory consumed; this would nearly double our current memory usage, which is already too high.
2. Pull text out of CharData
This would store the string against the line rather than against each character. It would probably bring very large gains in selection and linkifying, and quick access to a line’s entire string will only become more useful as time goes on.
interface ILineData {
// This would provide fast access to the entire line which is becoming more
// and more important as time goes on (selection and links need to construct
// this currently). This would need to reconstruct text whenever charData
// changes though. We cannot lazily evaluate text due to the chars not being
// stored in CharData
text: string;
charData: CharData[];
}
// [0]: charIndex
// [1]: attributes
// [2]: truecolor bg
// [3]: truecolor fg
type CharData = Int32Array;
Pros
- No need to reconstruct the line whenever we need it.
- Lower memory than today due to the use of an Int32Array
Cons
- Slow to update individual characters; the entire string would need to be regenerated for single-character changes.
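The update cost can be sketched as follows. This is an illustrative sketch only: `createLine` and `setChar` are hypothetical helpers, `charIndex` is assumed to be the offset of the cell’s character within `text`, and `createLine` assumes its input contains one UTF-16 code unit per cell.

```typescript
interface ILineData {
  text: string;
  // Per cell: [charIndex, attributes, truecolor bg, truecolor fg]
  charData: Int32Array[];
}

// Hypothetical helper: builds a line, assuming one UTF-16 code unit per cell.
function createLine(text: string): ILineData {
  const charData: Int32Array[] = [];
  for (let i = 0; i < text.length; i++) {
    charData.push(new Int32Array([i, 0, 0, 0]));
  }
  return { text, charData };
}

// Hypothetical helper: replaces the character in one cell.
function setChar(line: ILineData, col: number, ch: string): void {
  const start = line.charData[col][0];
  const end = col + 1 < line.charData.length ? line.charData[col + 1][0] : line.text.length;
  // Strings are immutable, so a new string for the whole line is allocated.
  line.text = line.text.slice(0, start) + ch + line.text.slice(end);
  // If the replacement has a different length, every later offset shifts too.
  const delta = ch.length - (end - start);
  if (delta !== 0) {
    for (let i = col + 1; i < line.charData.length; i++) {
      line.charData[i][0] += delta;
    }
  }
}

const demoLine = createLine('abc');
setChar(demoLine, 1, 'X');
console.log(demoLine.text);
```

Even a one-cell change touches the whole line string, which is the trade-off listed under Cons.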
3. Store attributes in ranges
Pulling the attributes out and associating them with a range. Since there can never be overlapping attributes, this can be laid out sequentially.
type LineData = CharData[]
// [0]: The character
// [1]: The width
type CharData = [string, number];
class CharAttributes {
public readonly _start: [number, number];
public readonly _end: [number, number];
private _data: Int32Array;
// Getters pull data from _data (woo encapsulation!)
public get flags(): number;
public get truecolorBg(): number;
public get truecolorFg(): number;
}
class Buffer extends CircularList<LineData> {
// Sorted list since items are almost always pushed to end
private _attributes: CharAttributes[];
public getAttributesForRows(start: number, end: number): CharAttributes[] {
// Binary search _attributes and return all visible CharAttributes to be
// applied by the renderer
}
}
Pros
- Lower memory than today even though we’re also storing truecolor data
- Can optimize application of attributes, rather than checking every single character’s attribute and diffing it to the one before
- Encapsulates the complexity of storing the data inside an array (.flags instead of [0])
Cons
- Changing attributes of a range of characters inside another range is more complex
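The lookup described in the `getAttributesForRows` comment could look roughly like the sketch below. It is written as a standalone function over a simplified `AttributeRange` shape rather than the `CharAttributes` class, and it assumes positions are `[row, col]` pairs and that the list is sorted and non-overlapping; the proposal does not pin down either detail.

```typescript
// Assumption: positions are [row, col]; the proposal does not specify the order.
type Position = [number, number];

// Simplified stand-in for CharAttributes, for illustration only.
interface AttributeRange {
  start: Position;
  end: Position;
  flags: number;
}

// Returns every range touching rows startRow..endRow (inclusive).
// `ranges` must be sorted by position and must not overlap.
function getAttributesForRows(ranges: AttributeRange[], startRow: number, endRow: number): AttributeRange[] {
  // Binary search for the first range that ends on or after startRow.
  let low = 0;
  let high = ranges.length;
  while (low < high) {
    const mid = (low + high) >>> 1;
    if (ranges[mid].end[0] < startRow) {
      low = mid + 1;
    } else {
      high = mid;
    }
  }
  // Collect ranges until one starts after the last requested row.
  const result: AttributeRange[] = [];
  for (let i = low; i < ranges.length && ranges[i].start[0] <= endRow; i++) {
    result.push(ranges[i]);
  }
  return result;
}

const sampleRanges: AttributeRange[] = [
  { start: [0, 0], end: [0, 4], flags: 1 },
  { start: [2, 3], end: [5, 10], flags: 2 },
  { start: [7, 0], end: [7, 2], flags: 4 },
  { start: [40, 0], end: [41, 5], flags: 8 }
];
console.log(getAttributesForRows(sampleRanges, 4, 7).map(r => r.flags));
```

The renderer would then apply each returned range once, instead of diffing attributes cell by cell.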
4. Put attributes in a cache
The idea here is to leverage the fact that there generally aren’t that many styles in any one terminal session, so we should create as few as necessary and reuse them.
// [0]: character
// [1]: width
// [2]: attributes (shared instance from the cache)
type CharData = [string, number, CharAttributes];
type LineData = CharData[];
class CharAttributes {
private _data: Int32Array;
// Getters pull data from _data (woo encapsulation!)
public get flags(): number;
public get truecolorBg(): number;
public get truecolorFg(): number;
}
interface ICharAttributeCache {
// Never construct duplicate CharAttributes, figuring out the best way to
// access them in both the best and worst case is the tricky part here
getAttributes(flags: number, fg: number, bg: number): CharAttributes;
}
Pros
- Similar memory usage to today even though we’re also storing truecolor data
- Encapsulates the complexity of storing the data inside an array (.flags instead of [0])
Cons
- Less memory savings than the ranges approach
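One simple way to satisfy the `ICharAttributeCache` contract is a map keyed on the three values. This is a minimal sketch, not a tuned implementation: the string key, the `CharAttributeCache` class name, the constructor signature and the `size` getter are all assumptions, and nothing is ever evicted.

```typescript
class CharAttributes {
  private readonly _data: Int32Array;

  constructor(flags: number, fg: number, bg: number) {
    this._data = new Int32Array([flags, fg, bg]);
  }

  public get flags(): number { return this._data[0]; }
  public get truecolorFg(): number { return this._data[1]; }
  public get truecolorBg(): number { return this._data[2]; }
}

// Hypothetical cache: one shared CharAttributes per distinct style.
class CharAttributeCache {
  private readonly _entries = new Map<string, CharAttributes>();

  public getAttributes(flags: number, fg: number, bg: number): CharAttributes {
    // Assumption: a string key is good enough for a sketch; a real
    // implementation would likely want something cheaper to build.
    const key = `${flags},${fg},${bg}`;
    let attributes = this._entries.get(key);
    if (attributes === undefined) {
      attributes = new CharAttributes(flags, fg, bg);
      this._entries.set(key, attributes);
    }
    return attributes;
  }

  public get size(): number { return this._entries.size; }
}

const attributeCache = new CharAttributeCache();
const red = attributeCache.getAttributes(1, 0xff0000, 0);
const redAgain = attributeCache.getAttributes(1, 0xff0000, 0);
console.log(red === redAgain);
```

Because cells holding the same style point at the same object, the per-cell cost is a single reference rather than three numbers.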
5. Hybrid of 3 & 4
type LineData = CharData[]
// [0]: The character
// [1]: The width
type CharData = [string, number];
class CharAttributes {
private _data: Int32Array;
// Getters pull data from _data (woo encapsulation!)
public get flags(): number;
public get truecolorBg(): number;
public get truecolorFg(): number;
}
interface CharAttributeEntry {
attributes: CharAttributes,
start: [number, number],
end: [number, number]
}
class Buffer extends CircularList<LineData> {
// Sorted list since items are almost always pushed to end
private _attributes: CharAttributeEntry[];
private _attributeCache: ICharAttributeCache;
public getAttributesForRows(start: number, end: number): CharAttributeEntry[] {
// Binary search _attributes and return all visible CharAttributeEntry's to
// be applied by the renderer
}
}
interface ICharAttributeCache {
// Never construct duplicate CharAttributes, figuring out the best way to
// access them in both the best and worst case is the tricky part here
getAttributes(flags: number, fg: number, bg: number): CharAttributes;
}
Pros
- Potentially the fastest and most memory efficient
- Very memory efficient when the buffer contains many blocks with styles but only from a few styles (the common case)
- Encapsulates the complexity of storing the data inside an array (.flags instead of [0])
Cons
- More complex than the other solutions, it may not be worth including the cache if we already keep a single CharAttributes per block?
- Extra overhead in CharAttributeEntry object
- Changing attributes of a range of characters inside another range is more complex
6. Hybrid of 2 & 3
This takes solution 3 but also adds a lazily evaluated text string for fast access to the line text. Since we’re also storing the characters in CharData, we can lazily evaluate it.
type LineData = {
text: string,
charData: CharData[]
}
// [0]: The character
// [1]: The width
type CharData = [string, number];
class CharAttributes {
public readonly _start: [number, number];
public readonly _end: [number, number];
private _data: Int32Array;
// Getters pull data from _data (woo encapsulation!)
public get flags(): number;
public get truecolorBg(): number;
public get truecolorFg(): number;
}
class Buffer extends CircularList<LineData> {
// Sorted list since items are almost always pushed to end
private _attributes: CharAttributes[];
public getAttributesForRows(start: number, end: number): CharAttributes[] {
// Binary search _attributes and return all visible CharAttributes to be
// applied by the renderer
}
// If we construct the line, hang onto it
public getLineText(line: number): string;
}
Pros
- Lower memory than today even though we’re also storing truecolor data
- Can optimize application of attributes, rather than checking every single character’s attribute and diffing it to the one before
- Encapsulates the complexity of storing the data inside an array (.flags instead of [0])
- Faster access to the actual line string
Cons
- Extra memory due to hanging onto line strings
- Changing attributes of a range of characters inside another range is more complex
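The "if we construct the line, hang onto it" idea can be sketched as a cached string that is dropped whenever a cell changes. The `LazyLine` class and its method names are hypothetical; the proposal only specifies `getLineText` on the buffer.

```typescript
// [0]: The character, [1]: The width
type CharData = [string, number];

// Hypothetical line wrapper that builds its text on demand and keeps it.
class LazyLine {
  private readonly _chars: CharData[];
  private _text: string | undefined = undefined;

  constructor(chars: CharData[]) {
    this._chars = chars;
  }

  public setChar(col: number, ch: string, width: number): void {
    this._chars[col] = [ch, width];
    // Invalidate the cached string; it is rebuilt on the next read.
    this._text = undefined;
  }

  public get text(): string {
    if (this._text === undefined) {
      let text = '';
      for (let i = 0; i < this._chars.length; i++) {
        text += this._chars[i][0];
      }
      this._text = text;
    }
    return this._text;
  }
}

const lazyLine = new LazyLine([['a', 1], ['b', 1]]);
console.log(lazyLine.text);
```

Writes stay cheap because they only clear the cache, while repeated reads of an unchanged line reuse the same string, at the cost of the extra memory noted under Cons.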
Solutions that won’t work
- Storing the string as an int inside an Int32Array will not work, as it takes far too long to convert the int back to a character.
Top GitHub Comments
Current state:
After:
Hope my answer to where monaco stores the buffer is not too late.
Alex and I are in favor of Array Buffer and most of the time it gives us good performance. Some places we use ArrayBuffer:
We use simple strings for the text buffer instead of ArrayBuffer, as V8 strings are easier to manipulate.