New selector method: extract_first()
I have a suggestion for improving the Scrapy Selector. I’ve seen this construction in many projects:
result = sel.xpath('//div/text()').extract()[0]
And what about the if result: and else: branches, or the try: and except: block, which should always be there?
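The two guards mentioned above can be sketched as follows. This is only an illustration: a plain Python list stands in for the value returned by `.extract()` so the snippet runs without Scrapy, and the function names `first_with_if` and `first_with_try` are hypothetical.

```python
def first_with_if(results):
    """Return the first extracted string, or None, using an explicit emptiness check."""
    if results:
        return results[0]
    else:
        return None


def first_with_try(results):
    """Same outcome, but relying on the IndexError raised by indexing an empty list."""
    try:
        return results[0]
    except IndexError:
        return None


# Stand-ins for sel.xpath('//div/text()').extract() (assumed sample data).
matched = ["hello", "world"]
unmatched = []

print(first_with_if(matched), first_with_if(unmatched))
print(first_with_try(matched), first_with_try(unmatched))
```

Without one of these guards, `extract()[0]` raises `IndexError` whenever the expression matches nothing.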
When we don’t want ItemLoaders, the most common use of a selector is retrieving only a single element.
Maybe there should be a method xpath1, xpath_one, or xpath_first that returns the first matched element or None?
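A minimal sketch of what such a helper might look like, written as a free function rather than a Selector method. It assumes only that `sel.xpath(query).extract()` returns a list of strings; `StubSelector` and `_StubResult` are hypothetical stand-ins so the example runs without Scrapy installed.

```python
def xpath_first(sel, query):
    """Return the first string matched by `query`, or None if nothing matches.

    `sel` is assumed to be any object whose .xpath(query).extract()
    returns a list of strings.
    """
    results = sel.xpath(query).extract()
    return results[0] if results else None


class _StubResult:
    """Hypothetical stand-in for the object returned by .xpath()."""

    def __init__(self, values):
        self._values = values

    def extract(self):
        return list(self._values)


class StubSelector:
    """Hypothetical stand-in for a selector, backed by a query-to-results dict."""

    def __init__(self, table):
        self._table = table

    def xpath(self, query):
        return _StubResult(self._table.get(query, []))


sel = StubSelector({"//div/text()": ["first", "second"]})
print(xpath_first(sel, "//div/text()"))   # a match: prints the first string
print(xpath_first(sel, "//span/text()"))  # no match: prints None
```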
Issue Analytics
- State:
- Created 10 years ago
- Reactions: 1
- Comments: 54 (41 by maintainers)
Top GitHub Comments
It also could be like this:
sel.css('span').extract_first()
Maybe it would prevent constructions that aren’t very clear, e.g.:
sel.css('span').extract(True)
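To make the readability argument concrete, here is a rough sketch of a result list that exposes the operation as a named method instead of a boolean flag. `ResultList` is a hypothetical class used only for illustration, not Scrapy's implementation, and its items are assumed to be already-extracted strings.

```python
class ResultList(list):
    """Hypothetical list of matched strings with a named 'first or None' method."""

    def extract(self):
        """Return all matched strings as a plain list."""
        return list(self)

    def extract_first(self):
        """Return the first matched string, or None when nothing matched."""
        return self[0] if self else None


spans = ResultList(["<span>a</span>", "<span>b</span>"])
empty = ResultList()

print(spans.extract_first())
print(empty.extract_first())
```

The method name states the intent at the call site, whereas a positional `True` requires the reader to look up what the flag means.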
2014-01-30 Daniel Graña notifications@github.com
Closed by ff64584