Header order is lost when passing a session to create_scraper
The headers copied from a normal requests Session instance are not ordered, i.e. isinstance(session.headers, OrderedDict) is False. If that session is passed to cfscrape.create_scraper(sess=requests.Session()), the scraper returned will not have its headers attribute defined properly, since it’s overridden.
Issue Analytics
- State:
- Created 4 years ago
- Comments:13 (5 by maintainers)
Top GitHub Comments
I was thinking something like this:
Code snippet
@Anorov @lukele What do you think?
I’ve made up my mind on how I think this should be done. The headers should be merged in order to retain order, and the sess argument should be incompatible with other arguments. If other arguments such as header/cookies/params/data are allowed with the sess argument, then they should be merged.
There is a helper function to aid in merging those attributes: https://github.com/kennethreitz/requests/blob/a79a63390bc963e5924021086744e53585566307/requests/sessions.py#L49-L77
But at least cookies would require special handling. I’m voting to disallow extra arguments with the sess argument. Whether or not to change the current behavior to make use of Session.__attrs__ is a completely different issue. I don’t plan on including that in a PR.
Code snippet
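For readers following along, here is a minimal standard-library sketch of what "merging headers while retaining order" could look like. It is not the snippet from this comment and not the requests helper linked above; the function name merge_headers_in_order, the sample header values, and the precedence rule (session values win, scraper order is kept) are all assumptions made for illustration.

```python
from collections import OrderedDict


def merge_headers_in_order(scraper_headers, session_headers):
    """Hypothetical helper: merge two header mappings, keeping a stable order.

    Keys from scraper_headers come first, in their original order. Values from
    session_headers override matching keys (compared case-insensitively)
    without moving them; keys only present in session_headers are appended.
    """
    merged = OrderedDict(scraper_headers)
    # Map lowercased names to the spelling already used in the result.
    known = {name.lower(): name for name in merged}
    for name, value in session_headers.items():
        existing = known.get(name.lower())
        if existing is not None:
            merged[existing] = value  # overwrite in place, position unchanged
        else:
            merged[name] = value  # new header goes to the end
            known[name.lower()] = name
    return merged


# Illustrative values only (assumed, not taken from cfscrape).
defaults = OrderedDict([
    ("User-Agent", "example-agent/1.0"),
    ("Accept", "text/html"),
    ("Accept-Language", "en-US"),
])
session_headers = {"accept": "application/json", "X-Token": "abc"}

merged = merge_headers_in_order(defaults, session_headers)
print(list(merged.items()))
```

Overwriting in place rather than deleting and re-adding is what keeps the original header positions; whether session values or scraper defaults should take precedence is a design decision this sketch only assumes.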
I think we can release this with a minor bump since passing sess with keyword arguments doesn’t currently make sense and didn’t really work for 99% of use cases anyway.
Alternatively, we can simply keep everything the same and only address the headers, which might be the best possible option as of right now.
@Anorov @lukele
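To make the "disallow extra arguments with sess" option concrete, a rough sketch might look like the following. The name create_scraper_sketch, the error message, and the dictionary it returns are hypothetical placeholders; this is not cfscrape's actual code, only one way the check could be expressed.

```python
def create_scraper_sketch(sess=None, **kwargs):
    """Hypothetical stand-in for a scraper factory; not cfscrape's real code."""
    if sess is not None and kwargs:
        # Reject ambiguous calls instead of silently merging or overriding.
        unexpected = ", ".join(sorted(kwargs))
        raise TypeError(
            "sess cannot be combined with other arguments: " + unexpected
        )
    # Placeholder result: a real factory would build and return a scraper here.
    return {"sess": sess, "options": dict(kwargs)}


# Passing only a session (any object here, for illustration) is accepted.
ok = create_scraper_sketch(sess=object())

# Passing a session together with other keyword arguments is rejected.
try:
    create_scraper_sketch(sess=object(), headers={"Accept": "text/html"})
except TypeError as exc:
    print("rejected:", exc)
```

Failing loudly keeps the behavior unambiguous, at the cost of breaking any caller that currently passes both; the alternative mentioned above, touching only the headers, avoids that break.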