Can't export to a JSON file when using JsonWriterPipeline

Scrapy 1.0.5, Python 2.7.11

pipelines.py

# -*- coding: utf-8 -*-

# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: http://doc.scrapy.org/en/latest/topics/item-pipeline.html

import codecs
import json


class JsonWriterPipeline(object):
    def __init__(self):
        self.file = codecs.open('items.json', mode = 'wb', encoding = 'utf-8')

    def process_item(self, item, spider):
        line = json.dumps(dict(item)) + "\n"
        self.file.write(line.decode('unicode_escape'))
        return item

Error information

2016-03-16 00:03:39 [scrapy] ERROR: Error processing {'records': [{'en_kicker': u'Sinosphere'}]}
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 588, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/Users/linzhen/OneDrive/Source/newsCrawler/newsCrawler/pipelines.py", line 17, in process_item
    line = json.dumps(dict(item)) + "\n"
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 244, in dumps
    return _default_encoder.encode(obj)
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 184, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: {'en_kicker': u'Sinosphere'} is not JSON serializable

item variables

![qq20160316-0 2x](https://cloud.githubusercontent.com/assets/4090783/13784812/a6394e30-eb0b-11e5-9753-4fb1bfd3530b.png)

Issue Analytics

  • State: closed
  • Created: 8 years ago
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

1 reaction
peter-wang-wsl commented, Mar 15, 2017

@kmike I tried it, and your sample needs to be changed as shown below.

from scrapy.utils.serialize import ScrapyJSONEncoder
json.dumps(…, cls=ScrapyJSONEncoder)

0 reactions
peter-wang-wsl commented, Mar 15, 2017

@kmike thanks for your answer and suggestion.
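
For readers landing here, a likely reading of the traceback is that `dict(item)` only converts the top-level item. The nested items inside `records` are still item objects, which print like dicts but are not `dict` instances, so the default JSON encoder rejects them. Passing an item-aware encoder through `cls=`, as suggested in the thread, handles them at any depth. The sketch below illustrates the idea using only the standard library, in Python 3 rather than the Python 2.7 of the report. `FakeItem` and `ItemAwareEncoder` are hypothetical stand-ins for `scrapy.Item` and `ScrapyJSONEncoder`, not Scrapy's actual implementation, and the pipeline takes a stream argument purely so the example is easy to run.

```python
import io
import json
from collections.abc import MutableMapping


class FakeItem(MutableMapping):
    """Hypothetical stand-in for scrapy.Item: a mapping that is not a dict."""

    def __init__(self, **fields):
        self._values = dict(fields)

    def __getitem__(self, key):
        return self._values[key]

    def __setitem__(self, key, value):
        self._values[key] = value

    def __delitem__(self, key):
        del self._values[key]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


class ItemAwareEncoder(json.JSONEncoder):
    """Hypothetical encoder playing the role of ScrapyJSONEncoder."""

    def default(self, o):
        # json calls default() for every object it cannot serialize itself,
        # so nested items are converted no matter how deep they sit.
        if isinstance(o, MutableMapping):
            return dict(o)
        return super().default(o)


class JsonWriterPipeline:
    def __init__(self, stream):
        # Assumption for this sketch: the caller supplies a text stream.
        self.file = stream

    def process_item(self, item, spider):
        # ensure_ascii=False keeps non-ASCII text readable in the output,
        # which avoids the decode('unicode_escape') step from the report.
        line = json.dumps(dict(item), cls=ItemAwareEncoder, ensure_ascii=False)
        self.file.write(line + "\n")
        return item


item = FakeItem(records=[FakeItem(en_kicker="Sinosphere")])
buffer = io.StringIO()
JsonWriterPipeline(buffer).process_item(item, spider=None)
print(buffer.getvalue(), end="")  # {"records": [{"en_kicker": "Sinosphere"}]}
```

In a real Scrapy project, the equivalent change would presumably be to keep the original pipeline and pass `cls=ScrapyJSONEncoder` to `json.dumps`, as in the comment above.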
