Glassdoor.com is not working
Description
Just today I discovered that JobFunnel fails when scraping Glassdoor.com.
Steps to Reproduce
- Comment out Indeed and Monster from the providers option in settings.yaml, as such:
    # - 'Indeed'
    # - 'Monster'
    - 'GlassDoor'
- Run JobFunnel:
    funnel -s settings.yaml
Expected behavior
Scrape Glassdoor.com and store jobs in master_list.csv
Actual behavior
JobFunnel output:
jobfunnel initialized at 2020-05-05
no master-list, filter-list was not updated
jobfunnel glassdoor to pickle running @ 2020-05-05
failed to scrape GlassDoor: 'NoneType' object has no attribute 'text'
Traceback (most recent call last):
  File "/usr/local/bin/funnel", line 11, in <module>
    load_entry_point('JobFunnel==2.1.6', 'console_scripts', 'funnel')()
  File "/usr/local/lib/python3.6/dist-packages/jobfunnel/__main__.py", line 55, in main
    jf.update_masterlist()
  File "/usr/local/lib/python3.6/dist-packages/jobfunnel/jobfunnel.py", line 291, in update_masterlist
    raise ValueError('No scraped jobs, cannot update masterlist')
Environment
- Operating system and version: Linux Mint (Ubuntu 18.04)
- Desktop Environment and/or Window Manager: Cinnamon
- Tested on .com (United States domain) and .ca (Canada domain)
NOTE: I also ran JobFunnel in an isolated Docker container (Ubuntu 18.04) and the issue persisted.
I discovered this while inspecting glassdoor.py for testing. I will try my best to tackle this issue in the upcoming days. Hopefully we’ll fix it soon!
Cheers!
Issue Analytics
- State:
- Created 3 years ago
- Reactions:2
- Comments:6
Can confirm that Selenium is a solution to this problem. They seem to be running JavaScript before bringing up the page, which is why we can't get any HTML data. With a webdriver you can bring up the page pretty easily and with minimal effort, but it slows down the scraping.
You must first implement the get method for Glassdoor, as I have done.
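For anyone following along, here is a rough sketch of what fetching the rendered page through a webdriver could look like. It is not the code from the branch: `fetch_rendered_html` and `open_page` are hypothetical names made up for illustration, and only the standard Selenium calls (`webdriver.Firefox()`, `webdriver.Chrome()`, `driver.get`, `driver.page_source`, `driver.quit`) are assumed.

```python
def fetch_rendered_html(driver, url):
    """Load `url` in the browser session and return the page HTML.

    `driver` is any object with a Selenium-style `get(url)` method and a
    `page_source` attribute. The name of this helper is hypothetical.
    """
    driver.get(url)            # blocks until the browser reports the page as loaded
    return driver.page_source  # HTML of the current DOM, after scripts have run


def open_page(url, browser="firefox"):
    """Start a browser, fetch one page, and always shut the browser down.

    Hypothetical wrapper: Selenium is imported lazily so the other scrapers
    keep working when it is not installed.
    """
    from selenium import webdriver

    # Firefox needs geckodriver, Chrome needs chromedriver.
    driver = webdriver.Firefox() if browser == "firefox" else webdriver.Chrome()
    try:
        return fetch_rendered_html(driver, url)
    finally:
        driver.quit()  # release the browser process even if the fetch fails
```

The returned HTML can then go through the same parsing as before. If content is still missing because scripts finish after the load event, an explicit wait would probably be needed on top of this.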
We can keep the other scraping methods the same and change only Glassdoor. If the user enables Glassdoor scraping in the YAML, we will have to warn them beforehand that chromedriver or geckodriver is required.
Check out my branch to see the changes: https://github.com/PaulMcInnis/JobFunnel/tree/studentbrad/glassdoor
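As a possible shape for that warning, here is a minimal sketch that checks whether either driver binary is on the PATH before scraping starts. This is an assumption about how it could be done, not what the branch does: `find_webdriver` and `warn_if_driver_missing` are hypothetical helpers, and the provider names are taken from the settings.yaml snippet in the issue.

```python
import logging
import shutil

# Driver executables mentioned above; either one is enough.
DRIVER_BINARIES = ("chromedriver", "geckodriver")


def find_webdriver(which=shutil.which):
    """Return the name of the first driver binary found on PATH, or None."""
    for name in DRIVER_BINARIES:
        if which(name):
            return name
    return None


def warn_if_driver_missing(providers, which=shutil.which):
    """Log a warning if GlassDoor is enabled but no driver is installed.

    Returns True when a warning was issued, False otherwise.
    """
    wants_glassdoor = any(p.lower() == "glassdoor" for p in providers)
    if wants_glassdoor and find_webdriver(which) is None:
        logging.warning(
            "GlassDoor scraping needs chromedriver or geckodriver on PATH"
        )
        return True
    return False
```

Calling `warn_if_driver_missing(['GlassDoor'])` right after the settings are loaded would tell the user up front instead of failing halfway through a scrape. The `which` parameter is only there so the lookup can be swapped out in tests.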
No problem! I’m really thankful that you’re willing to help out on this 😄