Spider: Illinois Department of Corrections Advisory Board
URL: https://www2.illinois.gov/idoc/aboutus/advisoryboard/Pages/default.aspx
Spider Name: il_corrections
Agency Name: Illinois Department of Corrections Advisory Board
See the contribution guide for information on how to get started.
Some information will need to be parsed from PDFs, so it could be useful to look at chi_human_relations for an example of how to handle this.
Issue Analytics
- Created 4 years ago
- Comments: 25 (8 by maintainers)
Top GitHub Comments
@cherdeman sure! Scrapy requests are made asynchronously, so if you want to guarantee that data is available to another method, you’ll need to chain callbacks so that they’re called in the right order.
You can see an example of this in chi_human_relations, where the first request (in start_urls) goes to a department detail page. The parse method is the default callback, so the response from the detail page is handled there. From there we look for a PDF link to parse and yield a request to that link, which is handled by _parse_schedule. In _parse_schedule we handle the response body and then chain another request to _parse_documents, now that we know the schedule details will be available and can be used to yield meetings.
From taking an initial look at your scraper, it looks like you could follow a similar pattern: first parse all of the links, then yield a request to each meeting, pull the PDF, and then potentially yield back to the meeting (which is where this one might be a bit circular). Let me know if that’s helpful! The Scrapy docs are also generally good, so if there’s anything I can help with from those, let me know.
@cherdeman great! I’ll assign you
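For anyone following along, the callback chaining described in the first comment can be sketched roughly as below. This is a dependency-free simulation, not the actual chi_human_relations spider: the Request class, the PAGES data, the dates, and the crawl helper are all hypothetical stand-ins so the example runs without Scrapy installed. In a real spider, scrapy.Request with a callback argument (and, if you like, cb_kwargs) would take the place of Request, and Scrapy's engine would take the place of crawl.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable


# Hypothetical stand-in for scrapy.Request: a URL, the callback that should
# receive its response, and extra keyword arguments for that callback.
@dataclass
class Request:
    url: str
    callback: Callable
    cb_kwargs: dict = field(default_factory=dict)


# Hypothetical canned "responses" keyed by URL, replacing real network calls.
PAGES = {
    "detail": {"pdf": "schedule.pdf", "docs": "documents"},
    "schedule.pdf": ["2019-01-14", "2019-04-08"],
    "documents": {"2019-01-14": ["minutes.pdf"]},
}


class SketchSpider:
    start_urls = ["detail"]

    def parse(self, response):
        # Default callback: find the PDF link and request it next.
        yield Request(response["pdf"], self._parse_schedule,
                      {"docs_url": response["docs"]})

    def _parse_schedule(self, response, docs_url):
        # The schedule is parsed here, then handed down the chain.
        yield Request(docs_url, self._parse_documents, {"dates": response})

    def _parse_documents(self, response, dates):
        # Schedule and documents are both available, so meetings can be yielded.
        for date in dates:
            yield {"start": date, "links": response.get(date, [])}


def crawl(spider):
    """Tiny scheduler: run callbacks, queue yielded Requests, collect items."""
    queue = deque(Request(url, spider.parse) for url in spider.start_urls)
    items = []
    while queue:
        req = queue.popleft()
        for result in req.callback(PAGES[req.url], **req.cb_kwargs):
            if isinstance(result, Request):
                queue.append(result)
            else:
                items.append(result)
    return items


meetings = crawl(SketchSpider())
```

The point of the chain is that _parse_documents can only run after _parse_schedule has produced its request, so the schedule data is guaranteed to exist by the time meetings are built, regardless of the order in which responses come back.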