Amazon scraper inconsistent/not reliable
I haven’t checked the other scrapers as closely, but the Amazon one seems to have issues (aside from #51) that could cause it to miss an item or alert late. Here is what I am seeing. I put a single item (an RTX 2080 that is available) into a config, started the container, and watched the logs. Many checks log the item as not in stock, and then it finally alerts as “in stock” much later than I would expect. The subsequent checks report “not in stock” again, and it continues this way for several more checks before randomly alerting “in stock” again. I’m not very good at Python; I really want to help out, but I can’t follow the code well enough to see what’s happening.
I2020-12-05 05:13:49,399 scraper initialized for https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
W2020-12-05 05:13:51,402 warning: using selenium webdriver for scraping... this feature is under active development
W2020-12-05 05:13:54,273 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:13:54,277 B08CLV8CKP: not in stock
W2020-12-05 05:13:58,602 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:13:58,605 B08CLV8CKP: not in stock
W2020-12-05 05:14:03,011 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:14:03,015 B08CLV8CKP: not in stock
W2020-12-05 05:14:07,671 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:14:07,674 B08CLV8CKP: not in stock
I2020-12-05 05:14:16,376 B08CLV8CKP: now in stock at 939.99!
W2020-12-05 05:14:20,897 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:14:20,902 B08CLV8CKP: not in stock
W2020-12-05 05:14:25,415 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:14:25,417 B08CLV8CKP: not in stock
W2020-12-05 05:14:30,144 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:14:30,146 B08CLV8CKP: not in stock
W2020-12-05 05:14:34,454 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:14:34,457 B08CLV8CKP: not in stock
W2020-12-05 05:14:38,961 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:14:38,963 B08CLV8CKP: not in stock
W2020-12-05 05:14:43,947 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:14:43,950 B08CLV8CKP: not in stock
W2020-12-05 05:14:48,691 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:14:48,694 B08CLV8CKP: not in stock
W2020-12-05 05:14:53,083 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:14:53,086 B08CLV8CKP: not in stock
W2020-12-05 05:14:57,544 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:14:57,546 B08CLV8CKP: not in stock
W2020-12-05 05:15:02,491 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:15:02,493 B08CLV8CKP: not in stock
W2020-12-05 05:15:07,265 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:15:07,268 B08CLV8CKP: not in stock
W2020-12-05 05:15:11,628 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:15:11,631 B08CLV8CKP: not in stock
W2020-12-05 05:15:16,007 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:15:16,012 B08CLV8CKP: not in stock
W2020-12-05 05:15:20,535 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:15:20,537 B08CLV8CKP: not in stock
W2020-12-05 05:15:25,497 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:15:25,503 B08CLV8CKP: not in stock
W2020-12-05 05:15:30,303 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:15:30,306 B08CLV8CKP: not in stock
I2020-12-05 05:15:38,673 B08CLV8CKP: now in stock at 939.99!
W2020-12-05 05:15:44,671 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:15:44,673 B08CLV8CKP: not in stock
W2020-12-05 05:15:49,413 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:15:49,415 B08CLV8CKP: not in stock
W2020-12-05 05:15:54,225 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:15:54,230 B08CLV8CKP: not in stock
W2020-12-05 05:15:58,513 missing title: https://www.amazon.com/MSI-GeForce-Architecture-Overclocked-Graphics/dp/B08CLV8CKP
I2020-12-05 05:15:58,515 B08CLV8CKP: not in stock
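To put numbers on the pattern in the log above, here is a minimal sketch that counts the stock results and measures the time between “in stock” alerts. It assumes the line format seen in this excerpt (a one-letter level, a timestamp, then `<item>: <status>`); `LINE_RE` and `summarize` are hypothetical names for illustration and are not part of inventory-hunter.

```python
import re
from datetime import datetime

# Assumed format, inferred from the excerpt above, e.g.
# "I2020-12-05 05:14:16,376 B08CLV8CKP: now in stock at 939.99!"
LINE_RE = re.compile(
    r"^[IW](?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<item>\w+): (?P<status>not in stock|now in stock)"
)


def summarize(log_text):
    """Count stock results and return the seconds between 'in stock' alerts."""
    counts = {"not in stock": 0, "now in stock": 0}
    alerts = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skips "missing title", "scraper initialized", etc.
        counts[m["status"]] += 1
        if m["status"] == "now in stock":
            alerts.append(datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S,%f"))
    # Gap between each pair of consecutive alerts, in seconds.
    gaps = [(b - a).total_seconds() for a, b in zip(alerts, alerts[1:])]
    return counts, gaps
```

Calling `summarize(open("scraper.log").read())` on a saved copy of the container output (the file name is an assumption) returns the counts and the alert gaps. In the excerpt above, the two alerts at 05:14:16 and 05:15:38 are roughly 82 seconds apart, even though checks run every 4 to 5 seconds.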
Issue Analytics
- Created 3 years ago
- Comments: 5 (3 by maintainers)
Top GitHub Comments
@Pipodi Yeah definitely. Right now, I’m working on standing up a unit testing framework so that we can add proper internationalization without breaking anything.
@EricJMarti Yeah, now it works! Thanks. If you have time, we could brainstorm and address https://github.com/EricJMarti/inventory-hunter/issues/54 in some more efficient way.