R - read_feather() error: embedded nul in string ...

See original GitHub issue

First off - thanks for the feather package! I use it all the time to quickly transfer large data between R and Python, and just to read and write data because it’s FAST.

Unfortunately I frequently run into the same issue in R and debugging it has been tricky. I usually work with data.tables, writing and reading them as feather files with something like

write_feather(transactions, "Data/transactions1.feather")
transactions1 <- data.table(read_feather("Data/transactions1.feather"))

The read step often fails with the error “embedded nul in string”. In this case,

Error in coldataFeather(x, i) : 
  embedded nul in string: 'Acoustic\0\0\0\0\0\0\031\xa5\x9b\001)\xa5\x9b\001礛\001\006\xa5\x9b\0016\xa4\x9b\0017\xa4\x9b\001\x9d\xa4\x9b\001\x96\xa4\x9b\001\xf3\xa2\x9b\001M\xa3\x9b\001K\xa3\x9b\001P\xa3\x9b\001a\xa3\x9b\001\xba\xa9\x9b\001\xed\xa3\x9b\001\x8a\xa3\x9b\001\xff\xa2\x9b\0011\xa3\x9b\001\xb1\xa3\x9b\001a\xa2\x9b\001\b\xa3\x9b\001\ua89b\001\xf2\xa2\x9b\001\x98\xa9\x9b\001\xaa\xa2\x9b\001\xa9\xa2\x9b\001\xb0\xa2\x9b\001١\x9b\001R\xa1\x9b\001[\xa1\x9b\001H\xa1\x9b\001\xb6\xa2\x9b\001s\xa1\x9b\001Ѡ\x9b\001\x87\xa9\x9b\001\x96\xa0\x9b\001\x99\xa1\x9b\001\x9c\xa1\x9b\001-\xa1\x9b\001|\xa0\x9b\001\xa6\xa0\x9b\001\xab\xa0\x9b\001f\xa0\x9b\001h\xa0\x9b\001(\xa0\x9b\001\x81\xa9\x9b\0017\xa0\x9b\001\x80\xa9\x9b\001\a\xa1\x9b\001ӟ\x9b\001\xbb\x9f\x9b\001\xbc\x9f\x9b\001m\x9f\x9b'

Debugging is especially weird: if I slice my data in half, sometimes each half of the dataset will write and read as feather format just fine. Because of that I haven’t been able to build a reproducible example of this error, and I can’t share my large transactions dataset. Any tips to help me figure out what’s going wrong?

str(transactions1)
Classes ‘data.table’ and 'data.frame':  3000001 obs. of  6 variables:
 $ ArticleID    : int  13516378 13516378 13516378 13516379 13516379 13516379 13516379 13516379 13516379 13516379 ...
 $ ArticleTags  : int  34 34 34 24 24 24 24 24 24 24 ...
 $ Tagset       : chr  "Apex predator|Autonomy|City|Climate|Ethics|Exercise|Human resource management|Jacksonville Jaguars|Jacksonville, Florida|Jaguar"| __truncated__ "Apex predator|Autonomy|City|Climate|Ethics|Exercise|Human resource management|Jacksonville Jaguars|Jacksonville, Florida|Jaguar"| __truncated__ "Apex predator|Autonomy|City|Climate|Ethics|Exercise|Human resource management|Jacksonville Jaguars|Jacksonville, Florida|Jaguar"| __truncated__ "AFC North|Baltimore Ravens|Blood|Cincinnati|Cincinnati Bengals|Cleveland Browns|Discrimination|Emotion|Hatred|Heinz|Hematology|"| __truncated__ ...
 $ TransactionID: int  153089414 153089435 153089428 153089444 153089445 153089448 153089450 153089446 153089447 153089453 ...
 $ TagID        : int  23892 26058 26229 344 1977 2776 4828 4829 4963 7076 ...
 $ Tag          : chr  "Stained glass" "Trousers" "U.S. state" "AFC North" ...
 - attr(*, ".internal.selfref")=<externalptr> 
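
One way to narrow this down, offered as a sketch and not as something from the original thread, is to round-trip each column on its own and note which ones fail. The helper name `find_failing_columns` and its `writer`/`reader` parameters are hypothetical; the defaults are the `write_feather()` and `read_feather()` calls shown above. Given the halving behaviour described above, a column that passes here may still fail as part of the full table, so a clean result does not rule the column out.

```r
# Hypothetical helper (not part of feather): write and read each column
# separately and return the names of the columns whose round trip fails.
# `writer` and `reader` default to the feather functions used in the issue.
find_failing_columns <- function(df,
                                 writer = feather::write_feather,
                                 reader = feather::read_feather) {
  plain <- as.data.frame(df)  # drop data.table semantics for column slicing
  failed <- character(0)
  for (col in names(plain)) {
    path <- tempfile(fileext = ".feather")
    one_col <- plain[, col, drop = FALSE]
    ok <- tryCatch({
      writer(one_col, path)
      back <- reader(path)
      nrow(back) == nrow(one_col)
    }, error = function(e) {
      # Report the error text, e.g. "embedded nul in string: ..."
      message(col, ": ", conditionMessage(e))
      FALSE
    })
    unlink(path)
    if (!isTRUE(ok)) failed <- c(failed, col)
  }
  failed
}
```

Calling `find_failing_columns(transactions)` would then return a character vector of column names, empty if every column survives on its own.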

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 23 (6 by maintainers)

Top GitHub Comments

3 reactions
wesm commented, Nov 14, 2018

We’re working on getting Feather users migrated over to the Arrow C++ libraries (see https://github.com/apache/arrow/pull/2947), so this issue should be resolved after the migration, but we should test to verify. @jameslamb would you be up for writing an ad hoc test (not necessarily to be run in testthat, but could be in a separate directory of integration tests) to check?
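
An ad hoc check of the kind requested here might look like the sketch below. It is an illustration under assumptions, not the test that was eventually written: the function name `check_roundtrip` is hypothetical, the data is synthetic (a short tag string borrowed from the `Tagset` column above, repeated), and the defaults assume the `arrow` R package's `write_feather()` and `read_feather()`. Whether data of this shape actually triggers the original error is not established in the thread.

```r
# Hypothetical ad hoc check: build a frame with a long, repeated string
# column, write it, read it back, and compare the two copies.
check_roundtrip <- function(n_rows = 3e6,
                            writer = arrow::write_feather,
                            reader = arrow::read_feather) {
  # Synthetic pipe-delimited tag string, loosely modelled on Tagset
  tags <- paste(rep("Apex predator|Autonomy|City|Climate", 10), collapse = "|")
  df <- data.frame(
    ArticleID = seq_len(n_rows),
    Tagset = rep(tags, n_rows),
    stringsAsFactors = FALSE
  )
  path <- tempfile(fileext = ".feather")
  on.exit(unlink(path))
  writer(df, path)
  back <- as.data.frame(reader(path))
  # TRUE only if both columns survive the round trip unchanged
  identical(back$ArticleID, df$ArticleID) && identical(back$Tagset, df$Tagset)
}
```

Passing a smaller `n_rows` keeps the check quick while iterating; the default mirrors the roughly three million rows in the `str()` output above.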

1 reaction
nealrichardson commented, Aug 15, 2021

We can discuss doing a release with @hadley, who is the package’s current maintainer. Seems like it could be a good idea since there are known limitations/bugs in the old implementation.

Top Results From Across the Web

embedded nul in string in númeric data - Stack Overflow
In my case, the problem with fread was the size of my file (2.7G). Using R version 3.6.0, fread was unable to read...

R - read_feather() error: embedded nul in string - Bountysource
First off - thanks for the feather package! I use it all the time to quickly transfer large data between R and Python,...

Dealing with embedded nul in string manipulation with R
The past hours I've been ramming my head into the same problem over and over. I had to deal with multiple strings of...

In scan(file... : embedded nul(s) found in input - Statistics Globe
In this R tutorial you'll learn how to deal with the “Warning message: In scan(file = file, what = what, sep = sep,...

Error with fread in R--embedded nul in string: '\0'
The csv files were populated with ^@ and they were...
