Discussion: PseudoStack frames and categories

I’ve been thinking about how to change our PseudoStack and our set of categories into something useful. Here’s a dump of my current thoughts on the topic.

First, the problems that I’d like to solve.

Problems

Problem 1: Need pretty colors

For every sample that was taken during a profiling run, we’d like to be able to determine a category, so that we can draw a graph of what type of code is running on the CPU at any given moment. One of those categories should be “Idle”.

These categories should be fairly coarse-grained; there shouldn’t be more than 10 of them because it would be hard to remember the meaning of too many different colors.

Problem 2: Limited JS-only call tree

The JavaScript-only call tree is not all that useful at the moment, because it really shows you only JavaScript and nothing else: Most importantly, you don’t see the entry and exit points of JS in the stack.

The entry point is interesting because you want to know why a certain piece of code is even running: Is it an event listener? A setTimeout callback? The contents of a <script> that has just finished loading?

And the exit point is interesting because you want to know where the time in a given JS function is spent: Is it the JS code itself that’s taking up the CPU time? Or some DOM API that’s called by the function? Or a built-in function from the JS VM? Or maybe even a GC, or the compilation of some eval()ed code?

Problem 3: Would like a medium-granularity call tree

Firefox front-end developers need to look at C++ call stacks if they want to know what’s going on in the platform, e.g. during Firefox startup. It would be nice to give them a less overwhelming view of the call tree that still has call nodes for relevant platform work. Examples of such platform work are: JSM loading, DLL loading, launching a child process, running XBL constructors.

Problem 4: Enable cheap querying of the “current type of work on the CPU”

For Firefox telemetry, it would be great if we had a cheap way to sample what type of work a thread is busy with, without interrupting that thread or walking stacks. We could do this by having a single per-thread integer that stores a “type of work” value which can be queried cheaply at any time.
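A minimal sketch of what such a per-thread integer could look like, assuming C++11 atomics. The names `WorkType`, `ThreadWorkState` and `AutoWorkType` are hypothetical and do not exist in the profiler today; the relaxed memory ordering is also an assumption, on the grounds that a sampler only needs an approximate snapshot.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical "type of work" values; the real list is still to be decided.
enum class WorkType : uint32_t { Idle = 0, Other, JavaScript, Layout };

// One instance per thread. The owning thread writes the value; a telemetry
// sampler on another thread can read it without interrupting the owner.
class ThreadWorkState {
 public:
  WorkType Get() const {
    return static_cast<WorkType>(mValue.load(std::memory_order_relaxed));
  }
  void Set(WorkType aType) {
    mValue.store(static_cast<uint32_t>(aType), std::memory_order_relaxed);
  }

 private:
  std::atomic<uint32_t> mValue{static_cast<uint32_t>(WorkType::Idle)};
};

// RAII helper: sets the work type for a scope and restores the previous
// value on exit, so nested scopes behave like a stack.
class AutoWorkType {
 public:
  AutoWorkType(ThreadWorkState& aState, WorkType aType)
      : mState(aState), mPrevious(aState.Get()), mCurrent(aType) {
    mState.Set(aType);
  }
  ~AutoWorkType() {
    // Scopes must be exited in reverse order of entry.
    assert(mState.Get() == mCurrent);
    mState.Set(mPrevious);
  }
  AutoWorkType(const AutoWorkType&) = delete;
  AutoWorkType& operator=(const AutoWorkType&) = delete;

 private:
  ThreadWorkState& mState;
  WorkType mPrevious;
  WorkType mCurrent;
};
```

With something like this, reading the current type of work is a single atomic load, and no stack walking is involved.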

These problems may sound unrelated but I think the same solution can help all of them.

Current status

We have a PseudoStack that gets interleaved with the native stack and the JS stack. Each PseudoStack frame (called ProfileEntry in the profiler’s C++ code) carries two kinds of key information: one or two strings, and a category.

The two strings

The second string (the “dynamic” string) is usually a URL, or an event name, or a message name, or empty.

The first string is either a C++ function name, or, for WebIDL frames, the name of the WebIDL API.

The fact that we use C++ function names for PseudoStack frame labels has historical reasons: When the profiler was created, native stackwalking was not part of the original scope. It was expected that the PseudoStack would be the only stack we had. So people tried to make the PseudoStack look as much like the native stack as possible, just with a coarser granularity.

The category

The current set of categories is:

    enum class Category : uint32_t {
        OTHER    = 1u << 4,
        CSS      = 1u << 5,
        JS       = 1u << 6,
        GC       = 1u << 7,
        CC       = 1u << 8,
        NETWORK  = 1u << 9,
        GRAPHICS = 1u << 10,
        STORAGE  = 1u << 11,
        EVENTS   = 1u << 12
    };

The fact that these categories are set up as a bitfield is entirely unnecessary, by the way - you never want to assign more than one category to a PseudoStack frame.

Proposed solutions

All the problems described above can be solved by annotating PseudoStack frames with more information, and by having a central list of all the important kinds of PseudoStack frames and their properties.

For problem 1, we just need to update our existing list of categories a little (mostly to add the Idle category). We also need a side table that maps each category to a color.
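One possible shape for the updated category list and its color side table is sketched below. It assumes sequential enum values rather than bit flags, and borrows the category names and colors from the profile format sketch further down; `CategoryInfo`, `kCategoryInfo` and `GetCategoryInfo` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Sequential values instead of bit flags: a frame has exactly one category.
enum class Category : uint32_t {
  Idle,
  Other,
  JavaScript,
  Layout,
  Graphics,
  DOM,
  GCCC,
  Network,
  HTMLParsing,
  Count  // not a category; the number of entries above
};

struct CategoryInfo {
  const char* name;
  const char* color;
};

// Side table indexed by Category. The colors are placeholders.
constexpr CategoryInfo kCategoryInfo[] = {
    {"Idle", "transparent"},   {"Other", "gray"},
    {"JavaScript", "yellow"},  {"Layout", "purple"},
    {"Graphics", "green"},     {"DOM", "blue"},
    {"GC / CC", "orange"},     {"Network", "lightblue"},
    {"HTML Parsing", "brown"},
};

// Catch a category being added without a matching table entry.
static_assert(sizeof(kCategoryInfo) / sizeof(kCategoryInfo[0]) ==
                  static_cast<std::size_t>(Category::Count),
              "kCategoryInfo must have one entry per Category");

constexpr const CategoryInfo& GetCategoryInfo(Category aCategory) {
  return kCategoryInfo[static_cast<std::size_t>(aCategory)];
}
```

Keeping the table next to the enum, guarded by a `static_assert`, makes it harder for the two to drift apart.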

Problem 2 calls for a flag on the PseudoStack frame that says “This PseudoStack frame is relevant to the JavaScript-centric view of the call tree”.

Problem 3 calls for a flag on the PseudoStack frame that says “This PseudoStack frame has details about platform work that may be interesting even if you don’t want to look at the full native stack. Moreover, there is a way to get a human readable string for this frame instead of just the C++ function name.”

Problem 4 calls for a central list of “types of platform work”.

I think we can use the category field on the PseudoStack frame to capture all of this information. Instead of storing a coarse-grained category, we can make it store a subcategory:

- There would be one subcategory for every “type of platform work”.
- There would also be one subcategory for each outer category, to mark “generic work of that category”.
- There would be a way to map each subcategory to the following pieces of information:
  - The outer category that this subcategory belongs to.
  - A human-readable name (not a C++ function name).
  - Whether frames of this category are interesting for a JavaScript-centric view of the call stack.
  - Whether frames of this category are interesting for a platform-centric view of the call stack.

Profiler platform implementation

Here’s a list of subcategories that we can start with:

enum class Subcategory : uint32_t {
  eIdle,
  eOther,
  eOther_LoadJSM,
  eOther_LaunchChildProcess,
  eOther_LoadDLL,
  eOther_DestroyWindow,
  eOther_EnterJS,
  eJavaScript,
  eJavaScript_RunBuiltIn,
  eJavaScript_RunInterpreter,
  eJavaScript_CompileBaseline,
  eJavaScript_RunBaseline,
  eJavaScript_CompileIon,
  eJavaScript_RunIon,
  eJavaScript_CompileRegExp,
  eLayout,
  eLayout_InitialLayout,
  eLayout_RefreshTick,
  eLayout_ConstructFrames,
  eLayout_Reflow,
  eLayout_ParseCSS,
  eLayout_MatchSelector,
  eLayout_RecomputeStyle,
  eLayout_BuildDisplayList,
  eLayout_MergeDisplayLists,
  eGraphics,
  eGraphics_BuildLayers,
  eGraphics_CreateWebRenderCommands,
  eGraphics_Rasterize,
  eGraphics_ImageDecode,
  eDOM,
  eDOM_WebIDLConstructor,
  eDOM_WebIDLMethod,
  eDOM_WebIDLGetter,
  eDOM_WebIDLSetter,
  eGCCC,
  eGCCC_GarbageCollection,
  eGCCC_GCIncrementalMarking,
  eGCCC_CycleCollection,
  eGCCC_CCForgetSkippable,
  eNetwork,
  eHTMLParsing
};

The “is relevant for a JS-centric stack view” flag would be true for eOther_EnterJS and the eDOM_WebIDL* subcategories only.

The “is relevant for a platform-centric stack view” flag would be true for all subcategories that are not the generic entries for an outer category.
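The mapping from subcategory to its properties, together with the two flag rules above, could be sketched roughly as follows. This uses an abbreviated subset of the subcategory list; `SubcategoryInfo`, `kSubcategoryInfo` and the helper functions are hypothetical, the human-readable names are made up for illustration, and treating `eIdle` as a generic entry is an assumption.

```cpp
#include <cstddef>
#include <cstdint>

enum class Category : uint32_t { Idle, Other, JavaScript, DOM };

// Abbreviated subset of the proposed list.
enum class Subcategory : uint32_t {
  eIdle,
  eOther,
  eOther_LoadJSM,
  eOther_EnterJS,
  eJavaScript,
  eJavaScript_RunInterpreter,
  eDOM,
  eDOM_WebIDLMethod,
};

struct SubcategoryInfo {
  Category category;   // outer category
  const char* name;    // human-readable label (wording invented here)
  bool isGeneric;      // the "generic work of that category" entry
  bool relevantForJS;  // shown in the JS-centric view
};

// Central table, indexed by Subcategory, in declaration order.
constexpr SubcategoryInfo kSubcategoryInfo[] = {
    {Category::Idle, "Idle", true, false},
    {Category::Other, "Other", true, false},
    {Category::Other, "Load JSM", false, false},
    {Category::Other, "Enter JS", false, true},
    {Category::JavaScript, "JavaScript", true, false},
    {Category::JavaScript, "Run interpreter", false, false},
    {Category::DOM, "DOM", true, false},
    {Category::DOM, "WebIDL method", false, true},
};

constexpr const SubcategoryInfo& GetInfo(Subcategory aSub) {
  return kSubcategoryInfo[static_cast<std::size_t>(aSub)];
}

constexpr bool RelevantForJS(Subcategory aSub) {
  return GetInfo(aSub).relevantForJS;
}

// Every subcategory that is not a generic entry is platform-relevant.
constexpr bool RelevantForPlatform(Subcategory aSub) {
  return !GetInfo(aSub).isGeneric;
}
```

Deriving the platform flag from `isGeneric` rather than storing it separately keeps the table from contradicting the rule.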

Profile format

We should avoid having to keep perf.html in sync with any changes we make to these (sub)categories. Therefore, I think knowledge of the categories and subcategories should be contained entirely within the profile.

For the outer categories, I think we can just have a central list in the profile’s meta struct.

I’m not entirely sure what to do about the subcategories. We could have a list of them somewhere in the profile data, or we could put the information directly into the frame. I think I prefer the latter. For example, the location string of a frame could be the human readable name of the subcategory, possibly combined with the dynamic string of the PseudoStack frame in some way, and there could be two new columns relevantForJS and relevantForPlatform which carry booleans.
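To make the second option concrete, here is a rough sketch of how a frame row could be assembled on the platform side. `FrameRow`, `MakeLocation` and `MakeFrameRow` are hypothetical names, the single-space separator between the readable name and the dynamic string is an arbitrary choice, and the category column is assumed to be an index into the central category list.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical in-memory form of one frameTable row; only the columns
// discussed here are modeled.
struct FrameRow {
  std::string location;
  uint32_t category;  // assumed: index into the central category list
  bool relevantForJS;
  bool relevantForPlatform;
};

// Combines the subcategory's human-readable name with the frame's dynamic
// string. An empty dynamic string leaves the name unchanged.
inline std::string MakeLocation(const std::string& aName,
                                const std::string& aDynamic) {
  assert(!aName.empty() && "every subcategory needs a readable name");
  return aDynamic.empty() ? aName : aName + " " + aDynamic;
}

inline FrameRow MakeFrameRow(const std::string& aName,
                             const std::string& aDynamic, uint32_t aCategory,
                             bool aRelevantForJS, bool aRelevantForPlatform) {
  return FrameRow{MakeLocation(aName, aDynamic), aCategory, aRelevantForJS,
                  aRelevantForPlatform};
}
```

Because the row carries its own flags and readable label, a consumer of the profile would not need to know the subcategory list at all.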

Sketch of the new profile format:

profile = {
  meta: {
    // ...
    categories: [
      { name: "Idle", color: "transparent" },
      { name: "Other", color: "gray" },
      { name: "JavaScript", color: "yellow" },
      { name: "Layout", color: "purple" },
      { name: "Graphics", color: "green" },
      { name: "DOM", color: "blue" },
      { name: "GC / CC", color: "orange" },
      { name: "Network", color: "lightblue" },
      { name: "HTML Parsing", color: "brown" }
    ],
  },
  threads: [
    {
      // ...
      frameTable: {
        schema: {
          location: 0,
          implementation: 1,
          optimizations: 2,
          line: 3,
          category: 4,
          relevantForJS: 5,
          relevantForPlatform: 6,
        },
        data: [ ... ]
      }
    }
    // ...
  ],
  // ...
}

perf.html changes

- The graphs at the top should make use of the category data.
- The JS-only view should be renamed to the JS-centric view.
- We should add a new platform-centric view and maybe rename the “combined stacks” view to something that indicates the overwhelming nature of it.
- The way we filter by implementation may need to be updated.
- Once we have a sidebar, we can display pie charts of categories and subcategories in it.

Call for comments

Does the above sound reasonable? Do you have suggestions for alternatives, or possibly different names?

Top GitHub Comments

mstange commented, Mar 7, 2018

How do pseudo stacks play into transforms? Right now we treat them as C++ call nodes. This is an unvoiced concern that I have had when we move forward with smarter and better pseudo stacks.

PseudoStack frames that have relevantForJS == true would be treated as JS frames, and all others would still be treated as C++ frames. Once we add a platform-centric view, we’ll need another implementation filter which affects both what’s displayed in the call tree and how transforms work.
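The rule in this reply can be sketched as a small classification function. It is written in C++ only to match the other snippets in this thread (perf.html itself is JavaScript), and `FrameKind`, `Implementation`, `Frame` and both functions are hypothetical names.

```cpp
// Where a frame came from in the merged stack.
enum class FrameKind { Native, JS, Pseudo };

// How the call tree and its transforms treat a frame.
enum class Implementation { Cpp, JS };

struct Frame {
  FrameKind kind;
  bool relevantForJS;  // only meaningful for PseudoStack frames
};

// PseudoStack frames flagged relevantForJS count as JS frames; every other
// PseudoStack frame keeps being treated as a C++ frame.
constexpr Implementation EffectiveImplementation(const Frame& aFrame) {
  return (aFrame.kind == FrameKind::JS ||
          (aFrame.kind == FrameKind::Pseudo && aFrame.relevantForJS))
             ? Implementation::JS
             : Implementation::Cpp;
}

// The JS-centric view keeps exactly the frames that count as JS.
constexpr bool VisibleInJSCentricView(const Frame& aFrame) {
  return EffectiveImplementation(aFrame) == Implementation::JS;
}
```

A platform-centric view would need a third filter along the same lines, keyed on `relevantForPlatform`.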

gregtatum commented, Apr 17, 2019

Closing as much of the work has been done.

Some of the follow-ups: see #1956 and #1392.
