Is there support for scraping single elements in a table?


Hi,

I have this simple code that fails every time.

var scrape = require("scrape-it");

var player = {
    url: "https://fr.wikipedia.org/wiki/Kossi_Agassa",
    data: {
        club: {
            selector: "#mw-content-text > table.wikitable.alternance2.centre > tbody > tr:nth-child(3) > td:nth-child(2) > span > b > a"
        }
    }
};

scrape(player.url, player.data)
    .then(console.log)
    .catch(console.error);

The output is always { club: '' } when it really should be { club: 'FC Metz' }.

Selecting table rows as listItem data seems to work, but selecting a single element from a table never does. I even tried "#mw-content-text > table.infobox_v2 > tbody > tr:nth-child(4) > td > a". Selecting a single element outside of a table, however, works like a charm.

What am I doing wrong?

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

IonicaBizau commented, Oct 21, 2016

That looks like a cheerio issue, but I think in the scrape-it spirit, it would be better to do something like this:

var scrape = require("scrape-it"); // use require(".") when running from inside the scrape-it repository
var player = {
    url: "https://fr.wikipedia.org/wiki/Kossi_Agassa",
    data: {
        club: {
            selector: "#mw-content-text > table.wikitable.alternance2.centre tr",
            eq: 3,
            data: {
                _: {
                    selector: "td",
                    eq: 1
                }
            }
        }
    }
};

scrape(player.url, player.data)
    .then(console.log)
    .catch(console.error);

This outputs:

{ club: { _: 'FC Metz' } }
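
Note that this result nests the value under a `_` key, while the question expected `{ club: 'FC Metz' }`. If the flat shape is needed, one option is to post-process the result. The following is only a minimal sketch in plain JavaScript: `unwrapUnderscore` is a hypothetical helper, not part of scrape-it, and it assumes the `_` key is used purely as a placeholder.

```javascript
// Hypothetical helper (not part of scrape-it): collapse any object whose
// only key is "_" into the value stored under that key.
function unwrapUnderscore(value) {
    if (Array.isArray(value)) {
        return value.map(unwrapUnderscore);
    }
    if (value !== null && typeof value === "object") {
        var keys = Object.keys(value);
        if (keys.length === 1 && keys[0] === "_") {
            return unwrapUnderscore(value._);
        }
        var result = {};
        keys.forEach(function (key) {
            result[key] = unwrapUnderscore(value[key]);
        });
        return result;
    }
    return value;
}

// The result shape reported in the comment above.
var scraped = { club: { _: "FC Metz" } };
var flattened = unwrapUnderscore(scraped);
console.log(flattened); // { club: 'FC Metz' }
```

With the code from the comment, this could be chained as `.then(unwrapUnderscore).then(console.log)`, assuming the promise resolves to the scraped data object as shown in the output above.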

IonicaBizau commented, Oct 21, 2016

Ah, sorry, this is already the complete code. 😂 Checking.
