twarc2 Broken Data (search query)

See original GitHub issue

Hello,

I am using the academic API v2 and twarc2. It is the first time I am using twarc but something seems to be broken.

I am running the following commands:

twarc2 search "cats" cats.jsonl #collecting some tweets

It collects the tweets and everything seems fine.

twarc2 csv cats.jsonl cats.csv #converting to csv

Now I get the following error:

Traceback (most recent call last):
  File "C:\Users\User\AppData\Local\Programs\Python\Python36-32\Scripts\twarc2-script.py", line 11, in <module>
    load_entry_point('twarc==2.5.0', 'console_scripts', 'twarc2')()
  File "c:\users\user\appdata\local\programs\python\python36-32\lib\site-packages\click\core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "c:\users\user\appdata\local\programs\python\python36-32\lib\site-packages\click\core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "c:\users\user\appdata\local\programs\python\python36-32\lib\site-packages\click\core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "c:\users\user\appdata\local\programs\python\python36-32\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "c:\users\user\appdata\local\programs\python\python36-32\lib\site-packages\click\core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "c:\users\user\appdata\local\programs\python\python36-32\lib\site-packages\twarc_csv.py", line 44, in csv
    na_action="ignore",
TypeError: applymap() got an unexpected keyword argument 'na_action'

So I tried to convert the JSON file with an online JSON-to-CSV converter. That kind of works, but there are a lot of empty rows and missing information in the CSV when I open it with Excel. So I am assuming something is wrong with the data that twarc2 collects for me.

What I tried to do:

  • Updating twarc and twarc-csv (did not help)
  • Flattening the data (did not help either)

I am really desperate right now and would appreciate any kind of help! I need to finish my academic project within a given time frame and this is a serious problem for me 😦

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 10 (6 by maintainers)

Top GitHub Comments

2 reactions
SamHames commented, Sep 26, 2021

Ah, glad that worked 😃

For your second problem - that has nothing to do with the output and everything to do with Excel mangling your csv file. I wouldn’t recommend using Excel at all if you can avoid it.

If you do need to use Excel, there is a totally non-obvious set of tricks to read tweet-related csvs, but I’d have to look it up when I’m in the office tomorrow.
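One way to check that the csv itself is intact, without opening it in Excel, is to read it with a CSV-aware parser. Below is a minimal Python sketch of that idea. The column names `id` and `text` and the sample rows are made up for illustration and are not taken from the actual twarc-csv output; the point is only that quoted fields with embedded line breaks and long numeric IDs survive when parsed as text.

```python
import csv
import io

# Hypothetical sample standing in for cats.csv: one tweet whose text
# contains a comma and a line break, and one tweet with plain text.
SAMPLE = (
    'id,text\r\n'
    '1442000000000000001,"Look at this cat,\nso fluffy"\r\n'
    '1442000000000000002,Another cat\r\n'
)


def read_rows(handle):
    """Parse CSV rows, keeping every field as a string."""
    return list(csv.DictReader(handle))


# newline="" leaves line endings untouched so the csv module can
# handle line breaks inside quoted fields itself.
rows = read_rows(io.StringIO(SAMPLE, newline=""))

for row in rows:
    # IDs stay strings, so no digits are lost to numeric conversion.
    print(row["id"], repr(row["text"]))
```

For a real file, the same function could be called on `open("cats.csv", newline="", encoding="utf-8")`. If the row count and IDs look right here but not in Excel, that points at the spreadsheet import rather than at the collected data.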

On Sun, 26 Sep 2021, 23:53 Sam Hames, @.***> wrote:

Okay, I think I see what your problem is now - I don’t think pandas is building precompiled wheels for a version of python that old. I’d recommend you update your Python version and try installing again.
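The traceback fits this explanation: to my understanding, the `na_action` argument was added to `DataFrame.applymap` in pandas 1.2, and pandas 1.2 requires Python 3.7.1 or newer, so pip on Python 3.6 cannot go past pandas 1.1.5. A small Python sketch of that check follows. Both thresholds are assumptions not stated in the thread, and the helper names are invented here.

```python
import sys

# Assumed thresholds (not stated in the thread): applymap() gained
# na_action in pandas 1.2, which in turn needs Python 3.7.1 or newer.
MIN_PANDAS = (1, 2)
MIN_PYTHON = (3, 7, 1)


def parse_version(text):
    """Turn a release string such as '1.1.5' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".")[:3] if part.isdigit())


def pandas_has_na_action(pandas_version):
    """True if this pandas release is new enough for na_action."""
    return parse_version(pandas_version) >= MIN_PANDAS


def python_can_install_it(version_info):
    """True if this interpreter can install a new enough pandas."""
    return tuple(version_info[:3]) >= MIN_PYTHON


# pandas version taken from the pip output in this thread.
print(pandas_has_na_action("1.1.5"))
# A Python 3.6 interpreter; the patch number is arbitrary.
print(python_can_install_it((3, 6, 8)))
# The interpreter running this script.
print(python_can_install_it(sys.version_info))
```

If the last check fails on your machine, upgrading pandas alone will not help; the interpreter has to be updated first, and twarc and twarc-csv then reinstalled under it.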

On Sun, 26 Sep 2021, 23:44 desperateBeginner7, @.***> wrote:

I already tried that as well:

C:\Users\User>pip3 install --upgrade pandas
Requirement already satisfied: pandas in c:\users\user\appdata\local\programs\python\python36-32\lib\site-packages (1.1.5)
Requirement already satisfied: pytz>=2017.2 in c:\users\user\appdata\local\programs\python\python36-32\lib\site-packages (from pandas) (2021.1)
Requirement already satisfied: python-dateutil>=2.7.3 in c:\users\user\appdata\local\programs\python\python36-32\lib\site-packages (from pandas) (2.8.1)
Requirement already satisfied: numpy>=1.15.4 in c:\users\user\appdata\local\programs\python\python36-32\lib\site-packages (from pandas) (1.19.5)
Requirement already satisfied: six>=1.5 in c:\users\user\appdata\local\programs\python\python36-32\lib\site-packages (from python-dateutil>=2.7.3->pandas) (1.15.0)

Did not help…

1 reaction
edsu commented, Sep 26, 2021

Give Google Sheets a try maybe?
