[BUG] frame.inner_text('body') is returning <script> content as well

Context:

  • Playwright Version: playwright==1.27.1
  • Operating System: Ubuntu 20.04
  • Python Version: 3.8
  • Browser: Chromium 105.0.5195.19

Code Snippet

import asyncio
from playwright.async_api import async_playwright, Error


async def request(request):
    pass


async def coroutine():
    async with async_playwright() as playwright:
        # Launch browser
        binary = playwright.chromium
        browser = await binary.launch(headless=True)
        page = await browser.new_page()
        page.on('request', request)
        await page.goto("https://www.w3schools.com/tags/tryit.asp?filename=tryhtml_areamap2")
        for frame in page.frames:
            try:
                txt = await frame.inner_text('body', timeout=3000)
            except Error:
                txt = ''
            print('*****' + frame.name)
            print('*****' + frame.url)
            print(txt)
            print('\n\n\n')
        await browser.close()


asyncio.run(coroutine())

Describe the bug

I am trying to extract the text of each frame using the code above. Instead of printing only the body text, it also prints the contents of <script> tags. However, when I open the URL and access the frame in my browser, the text is returned correctly. For example, the following works fine:

document.getElementById('adg-0-sync').innerText

Issue Analytics

  • State: closed
  • Created 10 months ago
  • Comments: 8 (5 by maintainers)

Top GitHub Comments

1 reaction
dgozman commented, Nov 22, 2022

@sohaib17 Oh, here you are calling innerText on the outer iframe element, not on the body element. I’d suggest selecting the <body> in the Elements panel and then trying $0.innerText in the console. Let me know how that goes.

0 reactions
dgozman commented, Nov 28, 2022

It seems like the issue has been resolved. If you still encounter problems, please file a new issue with a repro and link to this one.
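
If script text still shows up in the extracted frame text, one possible workaround is to take the frame's HTML and drop the <script> and <style> contents locally. The following is only a minimal sketch using the Python standard library, not something proposed in the thread; the names _VisibleTextParser and visible_text are hypothetical.

```python
from html.parser import HTMLParser


class _VisibleTextParser(HTMLParser):
    """Collects text nodes, skipping anything inside <script> or <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside <script>/<style>.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return "\n".join(self._chunks)


def visible_text(html: str) -> str:
    """Hypothetical helper: return the text of `html` without script/style content."""
    parser = _VisibleTextParser()
    parser.feed(html)
    parser.close()
    return parser.text()
```

In the snippet from the issue, this could be fed with the string returned by await frame.content() in place of the inner_text call. Unlike innerText, the sketch does not take CSS visibility into account, so it only approximates what a browser would render.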
