twarc2 Broken Data (search query)
Hello,
I am using the academic API v2 and twarc2. It is the first time I am using twarc but something seems to be broken.
I am running the following commands:
twarc2 search "cats" cats.jsonl #collecting some tweets
It collects the tweets and everything seems fine.
twarc2 csv cats.jsonl cats.csv #converting to csv
Now I get the following error:
Traceback (most recent call last):
  File "C:\Users\User\AppData\Local\Programs\Python\Python36-32\Scripts\twarc2-script.py", line 11, in <module>
    load_entry_point('twarc==2.5.0', 'console_scripts', 'twarc2')()
  File "c:\users\user\appdata\local\programs\python\python36-32\lib\site-packages\click\core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "c:\users\user\appdata\local\programs\python\python36-32\lib\site-packages\click\core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "c:\users\user\appdata\local\programs\python\python36-32\lib\site-packages\click\core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "c:\users\user\appdata\local\programs\python\python36-32\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "c:\users\user\appdata\local\programs\python\python36-32\lib\site-packages\click\core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "c:\users\user\appdata\local\programs\python\python36-32\lib\site-packages\twarc_csv.py", line 44, in csv
    na_action="ignore",
TypeError: applymap() got an unexpected keyword argument 'na_action'
So I tried converting the JSON file with an online JSON-to-CSV converter. That kind of works, but there are a lot of empty rows and missing information in the CSV when I open it in Excel, so I assume something is wrong with the data that twarc2 collects for me.
What I tried to do:
- Updating twarc and twarc-csv (did not help)
- Flattening the data (did not help either)
I am really desperate right now and would appreciate any kind of help! I need to finish my academic project within a given time frame and this is a serious problem for me 😦
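One possible reading of the traceback above, offered as an assumption rather than a confirmed diagnosis: the `na_action` keyword of `DataFrame.applymap` was, as far as I know, only added in pandas 1.2, and the `Python36-32` path suggests an interpreter on which only older pandas releases install. A minimal sketch for checking this follows; the helper names `version_tuple` and `new_enough_for_na_action` are made up for illustration and are not part of twarc or pandas.

```python
import re


def version_tuple(version):
    """Turn a version string such as '1.1.5' into a (major, minor) tuple."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def new_enough_for_na_action(pandas_version):
    # Assumption: applymap() gained the na_action keyword in pandas 1.2.
    # This only compares version numbers; it does not inspect pandas itself.
    return version_tuple(pandas_version) >= (1, 2)


if __name__ == "__main__":
    try:
        import pandas
    except ImportError:
        print("pandas is not installed in this environment")
    else:
        ok = new_enough_for_na_action(pandas.__version__)
        print("pandas", pandas.__version__, "- at least 1.2:", ok)
```

If this reports a version below 1.2, upgrading pandas (which may in turn need a newer Python than 3.6) would be the first thing to try.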
Issue Analytics
- Created 2 years ago
- Comments: 10 (6 by maintainers)
Top GitHub Comments
Ah, glad that worked 😃
For your second problem - that has nothing to do with the output and everything to do with Excel mangling your CSV file. I wouldn't recommend using Excel at all if you can avoid it.
If you do need to use Excel, there is a totally non-obvious set of tricks for reading tweet-related CSVs, but I'd have to look it up when I'm in the office tomorrow.
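As a rough illustration of the kind of mangling meant here (a sketch, not the Excel tricks mentioned above): tweet IDs are long integers, and any tool that converts them to floating-point numbers loses the trailing digits. Python's standard `csv` module keeps every field as text, so the IDs survive. The sample row and its ID are invented for the example, and the float conversion only mimics the effect; it does not reproduce Excel's exact behaviour.

```python
import csv
import io

# Hypothetical sample data; the id below is made up and is not a real tweet.
SAMPLE_CSV = "id,text\n1442000000000000123,I love cats\n"


def read_rows(handle):
    """Read CSV rows as dictionaries; every value stays a string."""
    return list(csv.DictReader(handle))


rows = read_rows(io.StringIO(SAMPLE_CSV))
tweet_id = rows[0]["id"]

# Simulate a tool that treats the id column as a floating-point number.
mangled_id = str(int(float(tweet_id)))

print("kept as text:   ", tweet_id)
print("via float:      ", mangled_id)
print("still identical:", tweet_id == mangled_id)

# For a real file, something like this should work the same way:
# with open("cats.csv", newline="", encoding="utf-8") as handle:
#     rows = read_rows(handle)
```

The same idea applies in other tools: import the ID columns as text rather than as numbers.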
On Sun, 26 Sep 2021, 23:53 Sam Hames wrote:
Give Google Sheets a try maybe?