Simplify evaluation in iframes
I’m currently porting Cockpit’s integration tests from PhantomJS to the Chrome Debug Protocol. By and large this is going nicely (big thanks!), but testing pages with iframes is excruciatingly hard to get right with Chrome. It took me four rewrites with different approaches and several days to get this right, there is very little Google juice about this, and I figure others might stumble over this as well. So I’m filing this both as a place for discussing improvements to the protocol and as a way of publishing my solution where others can find it.
Cockpit’s tests use an abstract Python API such as Browser.open(url), Browser.wait_present(selector), Browser.eval_js(expression), Browser.switch_to_frame(frame_name), and Browser.switch_to_top() (going back to the topmost document), i.e. the current iframe name is a state that needs to be respected by eval_js() or wait_present(); these are all eventually implemented through Runtime.evaluate() (and formerly in terms of PhantomJS incantations).
If all of your iframes come from the same origin, it’s actually fairly simple. One can just remember the frame name and then determine the HTML document to query with
if (current_frame)
    frame_doc = document.querySelector(`iframe[name="${current_frame}"]`).contentDocument.documentElement;
else
    frame_doc = document;
and run the query on frame_doc. However, this doesn’t work if the iframe to query has a different origin, as JS that runs on the page cannot look inside the content. Then you have to use the DOM shadow tree. Runtime.evaluate() accepts a contextId to select which iframe document the query gets run in, which works fine. This requires building a frame name → contextId map.
However, execution context IDs are very transient things which need careful tracking. They get invalidated on page reloads and navigation clicks which switch pages (obviously), but I’ve also seen jQuery pages that destroy and recreate the execution context when changing an element (not so obvious), so that this could even hit you in the middle of a “wait for a JS condition to become true” query. Also, there is no way to enumerate the current ExecutionContexts, map an execution context to a frame name, or map a frame name to an execution context.
The only thing you can do is to keep track of an execution context ID → frame ID mapping through Runtime.executionContextCreated and Runtime.executionContextDestroyed, and keep another mapping of frame ID → frame name through Page.frameNavigated. These events don’t arrive in a defined order either, so one has to keep both maps and only do the lookup when querying. On top of that we also need to provide a way to wait for a frame name to load (see the “jQuery can invalidate the entire document” problem above). As there can only be one handler for Page.frameNavigated(), we have to use a chained promise there:
var frameIdToContextId = {};
var frameNameToFrameId = {};

// set these to wait for a frame to be loaded
var frameWaitName = null;
var frameWaitPromiseResolve = null;

client.Page.enable();
client.Runtime.enable();

// map frame names to frame IDs; root frame has no name, no need to track that
client.Page.frameNavigated(info => {
    if (info.frame.name)
        frameNameToFrameId[info.frame.name] = info.frame.id;

    // were we waiting for this frame to be loaded?
    if (frameWaitPromiseResolve && frameWaitName === info.frame.name) {
        frameWaitPromiseResolve();
        frameWaitPromiseResolve = null;
    }
});

// track execution contexts so that we can map between context and frame IDs
client.Runtime.executionContextCreated(info => {
    frameIdToContextId[info.context.auxData.frameId] = info.context.id;
});

client.Runtime.executionContextDestroyed(info => {
    for (let frameId in frameIdToContextId) {
        if (frameIdToContextId[frameId] == info.executionContextId) {
            delete frameIdToContextId[frameId];
            break;
        }
    }
});

function getFrameExecId(frame) {
    var frameId = frameNameToFrameId[frame];
    if (!frameId)
        throw Error(`Frame ${frame} is unknown`);
    var execId = frameIdToContextId[frameId];
    if (!execId)
        throw Error(`Frame ${frame} (${frameId}) has no executionContextId`);
    return execId;
}
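One caveat with the executionContextCreated handler: as far as I understand the protocol, a frame can own more than one execution context (for example isolated worlds created by extension content scripts), and all of them report the same auxData.frameId, so a later non-default context could overwrite the page's own one in the map. Here is a minimal sketch that records only the default context. It assumes auxData has the { isDefault, frameId } shape that the protocol documentation describes as "likely"; makeContextTracker is a hypothetical helper name, not part of any library.

```javascript
// Hypothetical helper: keeps a frameId -> contextId map for default contexts only.
// Assumption: info.context.auxData looks like { isDefault: boolean, frameId: string }.
function makeContextTracker() {
    const frameIdToContextId = {};
    return {
        frameIdToContextId,
        onCreated(info) {
            const aux = info.context.auxData;
            // skip contexts without frame info and non-default (isolated) worlds
            if (!aux || !aux.frameId || aux.isDefault === false)
                return;
            frameIdToContextId[aux.frameId] = info.context.id;
        },
        onDestroyed(info) {
            for (const frameId in frameIdToContextId) {
                if (frameIdToContextId[frameId] === info.executionContextId) {
                    delete frameIdToContextId[frameId];
                    break;
                }
            }
        },
    };
}
```

The two callbacks close over the map rather than using this, so they can be passed directly as client.Runtime.executionContextCreated(tracker.onCreated) and client.Runtime.executionContextDestroyed(tracker.onDestroyed).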
With that under our belt, we can finally do a query in the currently selected frame name:
client.Runtime.evaluate({expression: [...], contextId: getFrameExecId(cur_frame_name)});
and write a helper to wait for a frame to get loaded:
function expectLoadFrame(name, timeout) {
    return new Promise((resolve, reject) => {
        let timedOut = false;
        let tm = setTimeout(() => {
            timedOut = true; // lets pollExecId() below stop instead of polling forever
            reject("timed out waiting for frame load");
        }, timeout);

        // we can only have one Page.frameNavigated() handler, so let our handler above resolve this promise
        frameWaitName = name;
        new Promise((fwpResolve, fwpReject) => { frameWaitPromiseResolve = fwpResolve })
            .then(() => {
                // For the frame to be fully valid for queries, it also needs the corresponding
                // executionContextCreated() signal. This might happen before or after frameNavigated(), so wait in case
                // it happens afterwards.
                function pollExecId() {
                    if (timedOut)
                        return;
                    if (frameIdToContextId[frameNameToFrameId[name]]) {
                        clearTimeout(tm);
                        resolve();
                    } else {
                        setTimeout(pollExecId, 100);
                    }
                }
                pollExecId();
            });
    });
}
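Since a context can also vanish in the middle of a query (the jQuery case above), one way to cope is to look up the context ID afresh on every attempt and retry on failure. The following is only a sketch: evalInFrame, retries and delayMs are made-up names, getExecId stands for a lookup such as getFrameExecId(), and it assumes that client.Runtime.evaluate() returns a promise which rejects when the context ID is no longer valid.

```javascript
// Hypothetical helper: evaluate in a named frame, re-resolving the context ID on every attempt.
// getExecId is a lookup like getFrameExecId(); it may throw while the frame is reloading.
async function evalInFrame(client, getExecId, frameName, expression, retries = 20, delayMs = 100) {
    let lastError;
    for (let attempt = 0; attempt < retries; attempt++) {
        try {
            const contextId = getExecId(frameName);
            return await client.Runtime.evaluate({ expression, contextId });
        } catch (err) {
            // either the frame/context is not known yet, or it was destroyed under us
            lastError = err;
            await new Promise(resolve => setTimeout(resolve, delayMs));
        }
    }
    throw lastError;
}
```

Note that this retries on any rejection, so a genuinely broken request only fails after all attempts are used up; a real implementation would probably want to distinguish error types.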
This finally seems to work well, but I daresay that it’s not entirely obvious. Can the API be extended to become simpler? PhantomJS has switch_to_frame(name) which henceforth makes all queries apply to that. This is stateful and thus doesn’t directly fit into the CDP API. But these API extensions would help, in descending abstractness/ascending amount of work for the client:
- Do the frame name → frameId → contextId tracking internally and have Runtime.evaluate accept a frame name. This would get rid of all of the above code.
- Do the frame object → contextId tracking internally and have Runtime.evaluate accept a nodeId for the frame, in whose context the query runs. This would also get rid of all of the above code; it just requires an extra DOM.querySelector() to map a frame name to a nodeId, and avoids introducing the rather special “frame name” type as an API parameter.
- Provide a way to map a frame name to its current contextId. Almost as easy as above, but doesn’t require the API to constantly track the mapping itself; it can just be called right before each Runtime.evaluate(). This introduces the need to have expectLoadFrame() though, or to handle “unknown context ID” errors from it and retry in a loop.
- Do the frameId → contextId tracking internally and have Runtime.evaluate accept a frameId. These are as transient as executionIds, but unlike contextIds they can be queried from the DOM tree (which is quite laborious, but avoids having to track all events).
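For the frameId-based variants, the frame name → frameId half might not need event tracking at all: as far as I know, Page.getFrameTree returns the current frame tree, with each node carrying frame.id and an optional frame.name. A small sketch of flattening such a result into a lookup table follows; frameNamesToIds is a hypothetical helper, and the input shape { frame: { id, name }, childFrames: [...] } is my reading of the protocol, so treat it as an assumption.

```javascript
// Flatten a Page.getFrameTree-style result into a { frameName: frameId } map.
// Assumed shape: { frame: { id, name? }, childFrames?: [ ...same shape... ] }.
function frameNamesToIds(frameTree, map = {}) {
    // the root frame usually has no name, so it is simply skipped
    if (frameTree.frame.name)
        map[frameTree.frame.name] = frameTree.frame.id;
    for (const child of frameTree.childFrames || [])
        frameNamesToIds(child, map);
    return map;
}
```

Usage would look roughly like const { frameTree } = await client.Page.getFrameTree(); followed by frameNamesToIds(frameTree)[name]. The frameId → contextId half still has to come from the executionContextCreated events.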
Thanks in advance!
Top GitHub Comments
This repository is related to Chrome DevTools Protocol, but does not track issues regarding its definition or implementation. If you want to file an issue for the Chrome DevTools Protocol, please open an issue on https://crbug.com under component: Platform>DevTools>Platform. Thanks in advance!
When you call Runtime.enable, it sends executionContextCreated events for all existing execution contexts. So you don’t have to worry about connecting late.
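To rely on that replay, the conservative ordering is to subscribe to the context events first and only then enable the Runtime domain. A minimal sketch, assuming a chrome-remote-interface style client where client.Runtime.executionContextCreated(handler) registers a listener and client.Runtime.enable() returns a promise; enableWithTracking is a hypothetical name.

```javascript
// Hypothetical helper: subscribe first, then enable, so the replayed
// executionContextCreated events for already-existing contexts are not missed.
async function enableWithTracking(client, onCreated, onDestroyed) {
    client.Runtime.executionContextCreated(onCreated);
    client.Runtime.executionContextDestroyed(onDestroyed);
    await client.Runtime.enable();
}
```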