
Logging can't format stack trace with non-ascii chars on Python 2


Hi,

I'm experiencing the same issue as described in #1602, but I'm not using Django.

The stats look like this:

{'downloader/request_bytes': 47621,
 'downloader/request_count': 103,
 'downloader/request_method_count/GET': 103,
 'downloader/response_bytes': 1162618,
 'downloader/response_count': 103,
 'downloader/response_status_count/200': 101,
 'downloader/response_status_count/302': 2,
 'dupefilter/filtered': 2,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2018, 3, 9, 2, 3, 15, 748633),
 'httpcache/firsthand': 72,
 'httpcache/hit': 31,
 'httpcache/miss': 72,
 'httpcache/store': 72,
 'item_scraped_count': 48,
 'log_count/DEBUG': 215,
 'log_count/ERROR': 1,
 'log_count/INFO': 9,
 'memusage/max': 121434112,
 'memusage/startup': 69783552,
 'mongodb/item_stored_count': 48,
 'request_depth_max': 3,
 'response_received_count': 101,
 'scheduler/dequeued': 102,
 'scheduler/dequeued/memory': 102,
 'scheduler/enqueued': 102,
 'scheduler/enqueued/memory': 102,
 'spider_exceptions/AttributeError': 1,
 'start_time': datetime.datetime(2018, 3, 9, 2, 0, 52, 510449)}

But there's no mention of the AttributeError (or any other error) in the log.

I'm executing the spiders on Scrapyd instances running in Docker containers. These are the first two lines of the log, showing the components' versions:

2018-03-09 02:00:52 [scrapy.utils.log] INFO: Scrapy 1.5.0 started (bot: realestate)
2018-03-09 02:00:52 [scrapy.utils.log] INFO: Versions: lxml 4.1.1.0, libxml2 2.9.7, cssselect 1.0.3, parsel 1.3.1, w3lib 1.18.0, Twisted 17.9.0, Python 2.7.12 (default, Nov 20 2017, 18:23:56) - [GCC 5.4.0 20160609], pyOpenSSL 17.5.0 (OpenSSL 1.1.0g  2 Nov 2017), cryptography 2.1.4, Platform Linux-4.4.0-103-generic-x86_64-with-Ubuntu-16.04-xenial

There's nothing special in settings.py, but I can provide it if needed.
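
A possible interim workaround (a sketch of my own, not something from the original report) is to hook Scrapy's spider_error signal and write the failure to stderr yourself, so the traceback is not lost even when it never reaches the log file. The spider name and URL below are placeholders:

# -*- coding: utf-8 -*-
import sys

import scrapy
from scrapy import signals


class MySpider(scrapy.Spider):
    name = 'example'
    start_urls = ['https://scrapy.org']

    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        spider = super(MySpider, cls).from_crawler(crawler, *args, **kwargs)
        # spider_error fires whenever a spider callback raises; connecting a
        # handler captures the traceback even if the log handler fails.
        crawler.signals.connect(spider._log_spider_error,
                                signal=signals.spider_error)
        return spider

    def _log_spider_error(self, failure, response, spider):
        # Failure.getTraceback() renders the full traceback; writing it
        # straight to stderr avoids the logging formatter entirely.
        sys.stderr.write(failure.getTraceback())

    def parse(self, response):
        pass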

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 20 (9 by maintainers)

Top GitHub Comments

3 reactions
dangra commented, Mar 16, 2018

Thanks for sharing the code that causes the issue, @tlinhart.

I managed to reproduce it with the following minimal spider:

# -*- coding: utf-8 -*-
import scrapy


class Spider(scrapy.Spider):
    name = 'bug3161'
    start_urls = ["https://scrapy.org"]

    def parse(self, response):
        u'příprav' in None.lower()

$ scrapy runspider myspider.py --logfile foo.log
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/logging/__init__.py", line 882, in emit
    stream.write(fs % msg.encode("UTF-8"))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 404: ordinal not in range(128)
Logged from file scraper.py, line 158

$ scrapy runspider myspider.py
2018-03-16 15:54:21 [scrapy.utils.log] INFO: Scrapy 1.5.0 started (bot: scrapybot)
2018-03-16 15:54:21 [scrapy.utils.log] INFO: Versions: lxml 4.2.0.0, libxml2 2.9.8, cssselect 1.0.3, parsel 1.4.0, w3lib 1.19.0, Twisted 17.9.0, Python 2.7.10 (default, Jul 15 2017, 17:16:57) - [GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.31)], pyOpenSSL 17.5.0 (OpenSSL 1.1.0g  2 Nov 2017), cryptography 2.1.4, Platform Darwin-17.4.0-x86_64-i386-64bit
2018-03-16 15:54:21 [scrapy.crawler] INFO: Overridden settings: {'SPIDER_LOADER_WARN_ONLY': True}
2018-03-16 15:54:21 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.logstats.LogStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.corestats.CoreStats']
2018-03-16 15:54:21 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2018-03-16 15:54:21 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2018-03-16 15:54:21 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2018-03-16 15:54:21 [scrapy.core.engine] INFO: Spider opened
2018-03-16 15:54:21 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2018-03-16 15:54:21 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2018-03-16 15:54:21 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://scrapy.org> (referer: None)
2018-03-16 15:54:21 [scrapy.core.scraper] ERROR: Spider error processing <GET https://scrapy.org> (referer: None)
Traceback (most recent call last):
  File "/Users/daniel/envs/test-scrapy-bug-JRmRrwNr/lib/python2.7/site-packages/twisted/internet/defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/Users/daniel/myspider.py", line 12, in parse
    u'příprav' in None.lower()
AttributeError: 'NoneType' object has no attribute 'lower'
2018-03-16 15:54:21 [scrapy.core.engine] INFO: Closing spider (finished)
2018-03-16 15:54:21 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 209,
 'downloader/request_count': 1,
 'downloader/request_method_count/GET': 1,
 'downloader/response_bytes': 14583,
 'downloader/response_count': 1,
 'downloader/response_status_count/200': 1,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2018, 3, 16, 18, 54, 21, 734389),
 'log_count/DEBUG': 2,
 'log_count/ERROR': 1,
 'log_count/INFO': 7,
 'memusage/max': 48529408,
 'memusage/startup': 48529408,
 'response_received_count': 1,
 'scheduler/dequeued': 1,
 'scheduler/dequeued/memory': 1,
 'scheduler/enqueued': 1,
 'scheduler/enqueued/memory': 1,
 'spider_exceptions/AttributeError': 1,
 'start_time': datetime.datetime(2018, 3, 16, 18, 54, 21, 180960)}
2018-03-16 15:54:21 [scrapy.core.engine] INFO: Spider closed (finished)

@ayushmankoul, if you want to help, the above is a starting point. Thanks!
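
The failure can likely be reproduced without Scrapy at all. The sketch below is my reconstruction of the mechanism, assuming --logfile ends up as a logging.FileHandler opened with Scrapy's default LOG_ENCODING of utf-8: the formatted record is a byte string (the traceback embeds the non-ASCII source line as raw UTF-8 bytes), the encoding-aware stream cannot write it, and emit()'s msg.encode("UTF-8") fallback implicitly decodes those bytes with the ASCII codec, raising the UnicodeDecodeError shown above instead of writing the trace:

# -*- coding: utf-8 -*-
# Python 2.7 sketch (a reconstruction, not code from the issue): log an
# exception whose traceback contains a non-ASCII source line to a
# FileHandler opened with an explicit encoding.
import logging

handler = logging.FileHandler('foo.log', encoding='utf-8')
handler.setFormatter(logging.Formatter('%(asctime)s [%(name)s] %(levelname)s: %(message)s'))
logging.getLogger().addHandler(handler)

try:
    # The non-ASCII literal below ends up in the traceback text as UTF-8 bytes.
    u'příprav' in None.lower()
except AttributeError:
    # On Python 2 the formatted record is a byte string; the handler cannot
    # encode it, the traceback never reaches foo.log, and logging prints the
    # "Logged from file ..." error to stderr instead.
    logging.getLogger().error('Spider error', exc_info=True)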

2 reactions
ayushmankoul commented, Mar 14, 2018

Hey @cathalgarvey, @kmike, I am interested in solving this bug. Please help me fix it; any guidance would surely be helpful. Thank you!

