Discussion: PseudoStack frames and categories
I’ve been thinking about how to change our PseudoStack and our set of categories into something useful. Here’s a dump of my current thoughts on the topic.
First, the problems that I’d like to solve.
Problems
Problem 1: Need pretty colors
For every sample that was taken during a profiling run, we’d like to be able to determine a category, so that we can draw a graph of what type of code is running on the CPU at any given moment. One of those categories should be “Idle”.
These categories should be fairly coarse-grained; there shouldn’t be more than 10 of them because it would be hard to remember the meaning of too many different colors.
Problem 2: Limited JS-only call tree
The JavaScript-only call tree is not all that useful at the moment, because it really shows you only JavaScript and nothing else: Most importantly, you don’t see the entry and exit points of JS in the stack.
The entry point is interesting because you want to know why a certain piece of code is even running: Is it an event listener? A setTimeout callback? The contents of a <script> that has just finished loading?
And the exit point is interesting because you want to know where the time in a given JS function is spent: Is it the JS code itself that’s taking up the CPU time? Or some DOM API that’s called by the function? Or a built-in function from the JS VM? Or maybe even a GC, or the compilation of some eval()ed code?
Problem 3: Would like a medium-granularity call tree
Firefox front-end developers need to look at C++ call stacks if they want to know what’s going on in the platform, e.g. during Firefox startup. It would be nice to give them a less overwhelming view of the call tree that still has call nodes for relevant platform work. Examples of such platform work are: JSM loading, DLL loading, launching a child process, running XBL constructors.
Problem 4: Enable cheap querying of the “current type of work on the CPU”
For Firefox telemetry, it would be great if we had a cheap way to sample what type of work a thread is busy with, without interrupting that thread or walking stacks. We could do this by having a single per-thread integer that stores a “type of work” value which can be queried cheaply at any time.
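A minimal sketch of the per-thread integer idea, assuming C++ and `std::atomic`. The names `WorkType`, `ThreadWorkState` and `AutoWorkType` are hypothetical and do not exist in the profiler; the short `WorkType` list is only a placeholder for the real list of work types.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical "type of work" values (placeholder list).
enum class WorkType : uint32_t { Idle, Other, JavaScript, Layout, Count };

// One instance per thread. The owning thread writes the value; any other
// thread can read it without interrupting the owner or walking its stack.
class ThreadWorkState {
 public:
  // Stores the new work type and returns the previous one so that the
  // caller can restore it later.
  WorkType Set(WorkType type) {
    return static_cast<WorkType>(mCurrent.exchange(
        static_cast<uint32_t>(type), std::memory_order_relaxed));
  }

  // Cheap read, callable from any thread.
  WorkType Get() const {
    const uint32_t raw = mCurrent.load(std::memory_order_relaxed);
    assert(raw < static_cast<uint32_t>(WorkType::Count));
    return static_cast<WorkType>(raw);
  }

 private:
  std::atomic<uint32_t> mCurrent{static_cast<uint32_t>(WorkType::Idle)};
};

// RAII helper: sets the work type for the current scope and restores the
// previous value when the scope ends, so nested scopes behave like a stack.
class AutoWorkType {
 public:
  AutoWorkType(ThreadWorkState& state, WorkType type)
      : mState(state), mPrevious(state.Set(type)) {}
  ~AutoWorkType() { mState.Set(mPrevious); }
  AutoWorkType(const AutoWorkType&) = delete;
  AutoWorkType& operator=(const AutoWorkType&) = delete;

 private:
  ThreadWorkState& mState;
  WorkType mPrevious;
};
```

Relaxed memory ordering is used here on the assumption that the value is only read as a statistical sample and does not need to synchronize with any other data.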
These problems may sound unrelated but I think the same solution can help all of them.
Current status
We have a PseudoStack that gets interleaved with the native stack and the JS stack. Each PseudoStack frame (called ProfileEntry in the profiler’s C++ code) carries up to three pieces of key information: one or two strings, and a category.
The two strings
The first string is either a C++ function name or, for WebIDL frames, the name of the WebIDL API.
The second string (the “dynamic” string) is usually a URL, an event name, a message name, or empty.
The fact that we use C++ function names for PseudoStack frame labels has historical reasons: when the profiler was created, native stackwalking was not part of the original scope. The PseudoStack was expected to be the only stack we would have, so people tried to make it look as much like the native stack as possible, just with a coarser granularity.
The category
The current set of categories is:
enum class Category : uint32_t {
  OTHER = 1u << 4,
  CSS = 1u << 5,
  JS = 1u << 6,
  GC = 1u << 7,
  CC = 1u << 8,
  NETWORK = 1u << 9,
  GRAPHICS = 1u << 10,
  STORAGE = 1u << 11,
  EVENTS = 1u << 12
};
The fact that these categories are set up as a bitfield is entirely unnecessary, by the way - you never want to assign more than one category to a PseudoStack frame.
Proposed solutions
All the problems described above can be solved by annotating PseudoStack frames with more information, and by having a central list of all the important combinations of PseudoStack frame information.
For problem 1, we just need to update our existing list of categories a little (mostly to add the Idle category). We also need a side table that maps each category to a color.
Problem 2 calls for a flag on the PseudoStack frame that says “This PseudoStack frame is relevant to the JavaScript-centric view of the call tree”.
Problem 3 calls for a flag on the PseudoStack frame that says “This PseudoStack frame has details about platform work that may be interesting even if you don’t want to look at the full native stack. Moreover, there is a way to get a human readable string for this frame instead of just the C++ function name.”
Problem 4 calls for a central list of “types of platform work”.
I think we can use the category field on the PseudoStack frame to capture all of this information. Instead of storing a coarse-grained category, we can make it store a subcategory:
- There would be one subcategory for every “type of platform work”.
- There would also be one subcategory for each outer category, to mark “generic work of that category”.
- There would be a way to map each subcategory to the following pieces of information:
  - The outer category that this subcategory belongs to.
  - A human-readable name (not a C++ function name).
  - Whether frames of this subcategory are interesting for a JavaScript-centric view of the call stack.
  - Whether frames of this subcategory are interesting for a platform-centric view of the call stack.
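The mapping described in the list above could be a static side table indexed by the subcategory value. The following is only a sketch under assumptions: the `SubcategoryInfo` struct, the `GetSubcategoryInfo` helper, the `Count` sentinel, the human-readable names and the shortened enum are all made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, shortened lists; the real ones would be longer.
enum class Category : uint32_t { Idle, Other, JavaScript, DOM };

enum class Subcategory : uint32_t {
  eIdle,
  eOther,
  eOther_LoadJSM,
  eOther_EnterJS,
  eJavaScript,
  eJavaScript_RunIon,
  eDOM,
  eDOM_WebIDLMethod,
  Count  // sentinel, not a real subcategory
};

struct SubcategoryInfo {
  Category category;         // outer category
  const char* name;          // human-readable name (illustrative)
  bool relevantForJS;        // shown in a JavaScript-centric view
  bool relevantForPlatform;  // shown in a platform-centric view
};

// Indexed by Subcategory value, so the row order must match the enum.
constexpr SubcategoryInfo kSubcategoryInfo[] = {
    {Category::Idle, "Idle", false, false},
    {Category::Other, "Other", false, false},
    {Category::Other, "Load JSM", false, true},
    {Category::Other, "Enter JS", true, true},
    {Category::JavaScript, "JavaScript", false, false},
    {Category::JavaScript, "Run Ion code", false, true},
    {Category::DOM, "DOM", false, false},
    {Category::DOM, "WebIDL method", true, true},
};

static_assert(sizeof(kSubcategoryInfo) / sizeof(kSubcategoryInfo[0]) ==
                  static_cast<std::size_t>(Subcategory::Count),
              "the table needs exactly one row per subcategory");

inline const SubcategoryInfo& GetSubcategoryInfo(Subcategory subcategory) {
  const auto index = static_cast<std::size_t>(subcategory);
  assert(index < static_cast<std::size_t>(Subcategory::Count));
  return kSubcategoryInfo[index];
}
```

With a table like this, a PseudoStack frame only has to store the subcategory value; everything else is a lookup.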
Profiler platform implementation
Here’s a list of subcategories that we can start with:
enum class Subcategory : uint32_t {
  eIdle,

  eOther,
  eOther_LoadJSM,
  eOther_LaunchChildProcess,
  eOther_LoadDLL,
  eOther_DestroyWindow,
  eOther_EnterJS,

  eJavaScript,
  eJavaScript_RunBuiltIn,
  eJavaScript_RunInterpreter,
  eJavaScript_CompileBaseline,
  eJavaScript_RunBaseline,
  eJavaScript_CompileIon,
  eJavaScript_RunIon,
  eJavaScript_CompileRegExp,

  eLayout,
  eLayout_InitialLayout,
  eLayout_RefreshTick,
  eLayout_ConstructFrames,
  eLayout_Reflow,
  eLayout_ParseCSS,
  eLayout_MatchSelector,
  eLayout_RecomputeStyle,
  eLayout_BuildDisplayList,
  eLayout_MergeDisplayLists,

  eGraphics,
  eGraphics_BuildLayers,
  eGraphics_CreateWebRenderCommands,
  eGraphics_Rasterize,
  eGraphics_ImageDecode,

  eDOM,
  eDOM_WebIDLConstructor,
  eDOM_WebIDLMethod,
  eDOM_WebIDLGetter,
  eDOM_WebIDLSetter,

  eGCCC,
  eGCCC_GarbageCollection,
  eGCCC_GCIncrementalMarking,
  eGCCC_CycleCollection,
  eGCCC_CCForgetSkippable,

  eNetwork,

  eHTMLParsing
};
The “is relevant for a JS-centric stack view” flag would be true for eOther_EnterJS and the eDOM_WebIDL* subcategories only.
The “is relevant for a platform-centric stack view” flag would be true for all subcategories that are not the generic entries for an outer category.
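The two flag rules can also be written as small predicates instead of per-row table entries. This is a sketch over a shortened, hypothetical subset of the subcategory list; the helper names `IsGeneric`, `RelevantForJS` and `RelevantForPlatform` are made up for illustration.

```cpp
#include <cstdint>

// Shortened subset of the proposed subcategory list.
enum class Subcategory : uint32_t {
  eOther,
  eOther_LoadJSM,
  eOther_EnterJS,
  eJavaScript,
  eJavaScript_RunIon,
  eDOM,
  eDOM_WebIDLMethod,
  eDOM_WebIDLGetter
};

// True for the entries that mean "generic work of that outer category".
constexpr bool IsGeneric(Subcategory s) {
  switch (s) {
    case Subcategory::eOther:
    case Subcategory::eJavaScript:
    case Subcategory::eDOM:
      return true;
    default:
      return false;
  }
}

// True only for the JS entry point and the WebIDL subcategories.
constexpr bool RelevantForJS(Subcategory s) {
  switch (s) {
    case Subcategory::eOther_EnterJS:
    case Subcategory::eDOM_WebIDLMethod:
    case Subcategory::eDOM_WebIDLGetter:
      return true;
    default:
      return false;
  }
}

// True for every subcategory that is not a generic entry.
constexpr bool RelevantForPlatform(Subcategory s) { return !IsGeneric(s); }

static_assert(RelevantForJS(Subcategory::eOther_EnterJS), "JS entry point");
static_assert(!RelevantForPlatform(Subcategory::eJavaScript), "generic entry");
```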
Profile format
We should avoid having to keep perf.html in sync with any changes we make to these (sub)categories. Therefore, I think knowledge of the categories and subcategories should be contained entirely within the profile.
For the outer categories, I think we can just have a central list in the profile’s meta struct.
I’m not entirely sure what to do about the subcategories. We could have a list of them somewhere in the profile data, or we could put the information directly into the frame. I think I prefer the latter. For example, the location string of a frame could be the human-readable name of the subcategory, possibly combined with the dynamic string of the PseudoStack frame in some way, and there could be two new columns, relevantForJS and relevantForPlatform, which carry booleans.
Sketch of the new profile format:
profile = {
  meta: {
    // ...
    categories: [
      { name: "Idle", color: "transparent" },
      { name: "Other", color: "gray" },
      { name: "JavaScript", color: "yellow" },
      { name: "Layout", color: "purple" },
      { name: "Graphics", color: "green" },
      { name: "DOM", color: "blue" },
      { name: "GC / CC", color: "orange" },
      { name: "Network", color: "lightblue" },
      { name: "HTML Parsing", color: "brown" }
    ],
  },
  threads: [
    {
      // ...
      frameTable: {
        schema: {
          location: 0,
          implementation: 1,
          optimizations: 2,
          line: 3,
          category: 4,
          relevantForJS: 5,
          relevantForPlatform: 6,
        },
        data: [ ... ]
      }
    }
    // ...
  ],
  // ...
}
perf.html changes
- The graphs at the top should make use of the category data.
- The JS-only view should be renamed to the JS-centric view.
- We should add a new platform-centric view and maybe rename the “combined stacks” view to something that indicates the overwhelming nature of it.
- The way we filter by implementation may need to be updated.
- Once we have a sidebar, we can display pie charts of categories and subcategories in it.
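One possible reading of the JS-centric view is: keep every JS frame plus every PseudoStack frame whose relevantForJS flag is set, and hide the rest. The sketch below is written in C++ to match the platform-side code in this issue, although perf.html itself is JavaScript. The `Frame` struct, its `isJS` field (a stand-in for the implementation column) and the `JSCentricStack` function are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical in-memory frame, loosely mirroring the proposed columns.
struct Frame {
  std::string location;
  bool isJS;           // frame comes from the JS stack
  bool relevantForJS;  // proposed PseudoStack flag
};

// Returns the frames that a JS-centric view would show, preserving the
// original stack order (root first).
std::vector<Frame> JSCentricStack(const std::vector<Frame>& stack) {
  std::vector<Frame> result;
  for (const Frame& frame : stack) {
    if (frame.isJS || frame.relevantForJS) {
      result.push_back(frame);
    }
  }
  // Filtering can only drop frames, never add them.
  assert(result.size() <= stack.size());
  return result;
}
```

For a stack consisting of an event loop frame, an “Enter JS” PseudoStack frame, a JS function, a WebIDL method frame and a native layout frame, this keeps the middle three, so both the entry point and the exit point of the JS code stay visible.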
Call for comments
Does the above sound reasonable? Do you have suggestions for alternatives, or possibly different names?
Top GitHub Comments
PseudoStack frames that have relevantForJS == true would be treated as JS frames, and all others would still be treated as C++ frames. Once we add a platform-centric view, we’ll need another implementation filter which affects both what’s displayed in the call tree and how transforms work.
Closing, as much of the work has been done.
Some of the follow-ups: see #1956 and #1392.