Crawling SyntaxError: Unmatched selector: @href
Subject of the issue
Scraping crawled links throws an error and doesn’t produce a collection.
Your environment
- version of node: v6.3.1
- version of npm: 4.2.0
- operating system: Ubuntu 16.04 LTS
Steps to reproduce
var Xray = require('x-ray');
var x = Xray();
var html = '<div> <a href="http://www.google.com">google</a> <a href="http://www.bing.com"> bing </a> </div>';
x(html, 'a', [{
  engine: x('a'),
  links: x('a@href', [{ href: 'a@href', text: 'a' }])
}]).write('result.json');
Expected behaviour / result
[ { engine: 'google', links:
[{ href: 'http://www.google.com/imghp?tab=wi', text: 'Images' },
...
]},
{ engine: 'bing', links:
[{ href: 'javascript:void(0)' },
...
]} ]
Actual behaviour
SyntaxError: Unmatched selector: @href
Without the brackets (array/collection), however, it does return the first link, in the same shape as above:
links: x('a@href', {href:'a@href',text:'a'})
Also this works:
x(html,'a', [{
engine: x('a'),
href: x('a@href', ['a@href']),
text: x('a@href', ['a']),
}])
This gives two arrays (href and text) as a result, so you would expect that wrapping the original selector in brackets returns a collection of links.
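As a stopgap, the two parallel arrays could be merged into the desired collection after scraping. The following is only a sketch in plain JavaScript: `zipLinks` is a hypothetical helper (not part of x-ray), and it assumes each result item has the shape `{ engine, href: [...], text: [...] }` with both arrays in the same order.

```javascript
// Hypothetical helper, not part of x-ray: pairs up the parallel
// `href` and `text` arrays of one result item into link objects.
// Assumes both arrays are in the same order.
function zipLinks(item) {
  var hrefs = item.href || [];
  var texts = item.text || [];
  // Stop at the shorter array so no pair has a missing half.
  var n = Math.min(hrefs.length, texts.length);
  var links = [];
  for (var i = 0; i < n; i++) {
    links.push({ href: hrefs[i], text: texts[i] });
  }
  return { engine: item.engine, links: links };
}

// Made-up sample data in the assumed shape.
var sample = {
  engine: 'google',
  href: ['http://example.com/a', 'http://example.com/b'],
  text: ['A', 'B']
};
var zipped = zipLinks(sample);
```

If some anchors yield an href but no text, the arrays may not line up, so this pairing is only as reliable as that assumption.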
AFAICS, x-ray sets the wrong scope on line 218 in index.js. It probably can’t parse the sub-scope at the moment, but it would be nice if it could.
Any ideas?
Top GitHub Comments
Having this same problem now.
See above: ‘a@href’ isn’t a selector in that use case. Just use ‘a’.
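To illustrate the point of that comment: `a@href` is x-ray shorthand for "the `href` attribute of `a`", not a CSS selector. If the whole string reaches the CSS selector parser as a scope, the `@href` part is presumably what it fails to match. The sketch below shows the split in plain JavaScript. `splitSelector` is a hypothetical helper for illustration only, not x-ray's actual parser, and it ignores edge cases such as an `@` inside an attribute selector.

```javascript
// Hypothetical helper, not x-ray's real parser: splits an
// x-ray style string "selector@attribute" into its two parts.
function splitSelector(str) {
  var at = str.lastIndexOf('@');
  if (at === -1) {
    // Plain CSS selector, no attribute requested.
    return { selector: str, attribute: null };
  }
  return {
    // An empty selector part (e.g. "@href") becomes null.
    selector: str.slice(0, at) || null,
    attribute: str.slice(at + 1)
  };
}

var parsed = splitSelector('a@href');
// parsed.selector is the part that is valid CSS ('a');
// parsed.attribute is the attribute to read ('href').
```

Only the `selector` part is something a CSS engine can match, which is consistent with the advice to use `a` as the scope.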