Make it possible to update settings in `__init__` or `from_crawler`
This issue might be related to https://github.com/scrapy/scrapy/issues/1305

I noticed that settings are frozen in https://github.com/scrapy/scrapy/blob/master/scrapy/crawler.py#L57
However, in a given project I had a requirement to change some settings based on spider arguments. An alternative would be to write this spider as a base class and extend it with specific spiders that set the proper settings (see the sketch below).
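For context, a rough sketch of that base-class workaround (the spider names, field names, and selectors are illustrative):

```python
import scrapy


class BaseQuotesSpider(scrapy.Spider):
    """Shared crawling logic; concrete subclasses pin down the settings."""

    def parse(self, response):
        for quote in response.css("div.quote"):
            yield {
                "author": quote.css("small.author::text").get(),
                "quote": quote.css("span.text::text").get(),
            }


class AuthorQuotesSpider(BaseQuotesSpider):
    name = "author_quotes"
    # Each specific spider hard-codes the settings it needs.
    custom_settings = {"FEED_EXPORT_FIELDS": ["author", "quote"]}
```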
However, I think it would make sense to only freeze settings after the spider and the other components have been initialized, or to provide some other entry point for configuring settings based on arguments.
The other option is to use `-s` arguments, but in my case I was changing the `FEED_EXPORT_FIELDS` setting (https://docs.scrapy.org/en/latest/topics/feed-exports.html#std:setting-FEED_EXPORT_FIELDS).
Any thoughts here?
Top GitHub Comments
Usage of the `-s` argument with the list-based `FEED_EXPORT_FIELDS` setting: `scrapy crawl quotes -s FEED_EXPORT_FIELDS=author,quote -o data_without_tags.csv`
Setting a list-like setting such as `FEED_EXPORT_FIELDS` from the command line in this way works for every setting that is read with the `BaseSettings.getlist` method: https://github.com/scrapy/scrapy/blob/b8594353d03be5574f51766c35566b713584302b/scrapy/settings/__init__.py#L161-L178

I noticed that the `BaseSettings.freeze` method does only one thing: https://github.com/scrapy/scrapy/blob/b8594353d03be5574f51766c35566b713584302b/scrapy/settings/__init__.py#L352-L360

The `frozen` attribute is used in the `_assert_mutability` method, which is what actually prevents any changes to the settings: https://github.com/scrapy/scrapy/blob/b8594353d03be5574f51766c35566b713584302b/scrapy/settings/__init__.py#L336-L338

But if we change the `frozen` attribute back to `False`, the settings become mutable again and the application can change them through every method that calls `_assert_mutability`: `set`, `setmodule`, `update`, `delete`, `__delitem__` (see the sketch below).
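A quick demonstration of both points on a standalone `Settings` object (the values here are illustrative):

```python
from scrapy.settings import Settings

settings = Settings({"FEED_EXPORT_FIELDS": "author,quote"})

# getlist() splits comma-separated strings, which is why
# `-s FEED_EXPORT_FIELDS=author,quote` works for list-like settings.
assert settings.getlist("FEED_EXPORT_FIELDS") == ["author", "quote"]

settings.freeze()
# settings.set("FEED_EXPORT_FIELDS", ["author"])  # would now raise TypeError

settings.frozen = False  # flip the flag back
settings.set("FEED_EXPORT_FIELDS", ["author", "quote", "tags"])
settings.freeze()
```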
Spider code that updates the settings inside `from_crawler` would then look like this:
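A minimal sketch, assuming a hypothetical quotes spider and a comma-separated `fields` spider argument (note that `frozen` is not public API, and the change is only visible to components that read the setting afterwards):

```python
import scrapy


class QuotesSpider(scrapy.Spider):
    name = "quotes"

    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        spider = super().from_crawler(crawler, *args, **kwargs)
        # Lift the freeze just long enough to apply argument-dependent settings.
        crawler.settings.frozen = False
        crawler.settings.set(
            "FEED_EXPORT_FIELDS",
            kwargs.get("fields", "author,quote").split(","),
            priority="spider",
        )
        crawler.settings.freeze()
        return spider
```

It can then be run with something like `scrapy crawl quotes -a fields=author,quote,tags -o data.csv`.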
will look like this:One possible solution for this could also be creating a few class variables and using them in the
custom_settings
being passed to the spider and then update the values of these class variables in the__init__
function of the spider, so when these custom settings are being applied, it will start using the updated values as passed from the__init__
.Example:
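A rough sketch of that idea (the spider, argument, and field names are illustrative; it relies on mutating the same list object in place, and it only affects components that read the setting after the spider has been instantiated):

```python
import scrapy


class QuotesSpider(scrapy.Spider):
    name = "quotes"

    # custom_settings captures this exact list object at class-definition
    # time, so later in-place changes are visible through the settings too.
    export_fields = ["author", "quote"]
    custom_settings = {"FEED_EXPORT_FIELDS": export_fields}

    def __init__(self, fields=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        if fields:
            # Mutate in place; rebinding (self.export_fields = [...]) would
            # not touch the list stored in custom_settings.
            self.export_fields[:] = fields.split(",")
```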