CSV Null interpretation

When exporting to CSV, I suggest adding an option for choosing how NULLs are represented.

Options:

  1. Empty field
  2. NULL
  3. \N

If option 1 is selected, a further option should appear: write an empty string as "" rather than as a plain empty field.

The last two options (NULL and \N) are recognised automatically by MySQL.
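
The proposed options can be sketched in a few lines of Python. This is only an illustration of the request, not DBeaver's implementation; the function names `format_field` and `export_rows` and the parameters `null_repr` and `quote_empty` are hypothetical.

```python
def format_field(value, null_repr="", quote_empty=False):
    """Render one value as a CSV field.

    None stands for SQL NULL and is written as null_repr, never quoted.
    """
    if value is None:
        return null_repr
    text = str(value)
    if text == "" and quote_empty:
        # Write an empty string as "" so it differs from an empty NULL field.
        return '""'
    if any(ch in text for ch in ',"\r\n'):
        # Quote fields containing separators, quotes, or line breaks.
        return '"' + text.replace('"', '""') + '"'
    return text


def export_rows(rows, null_repr="", quote_empty=False):
    """Join rows into CSV text using CRLF line endings."""
    lines = [
        ",".join(format_field(v, null_repr, quote_empty) for v in row)
        for row in rows
    ]
    return "\r\n".join(lines) + "\r\n"


rows = [(1, None, ""), (2, "a,b", "x")]
as_empty = export_rows(rows)                          # option 1
as_null_word = export_rows(rows, null_repr="NULL")    # option 2
as_backslash_n = export_rows(rows, null_repr="\\N")   # option 3
distinct_empty = export_rows(rows, quote_empty=True)  # option 1 plus ""
```

Note that with `null_repr="NULL"` a real string value `NULL` would be written identically to a NULL, so a production version would need to quote or escape such values.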

Issue Analytics

  • State: closed
  • Created: 7 years ago
  • Reactions: 1
  • Comments: 15 (7 by maintainers)

Top GitHub Comments

3 reactions
hmontone commented, Aug 20, 2018

@serge-rider I have been thinking about this “standard conformance” in the context of CSV.

We could try to interpret various special strings as NULLs but this won’t be standard.

I don’t think this belongs to the CSV specification at all. If I understand it correctly, the specification defines the format of a CSV file but says nothing about the semantics of its contents. For example, if we export NULLs as, say, RaNdOm StRiNg, then we have a file where a NULL field is written as RaNdOm StRiNg. That does not conflict with the CSV specification, because the result is a fully valid CSV file. We can also read that CSV file and interpret RaNdOm StRiNg as NULL in some other context, because the CSV format standard knows nothing about the data itself or about the application that is going to use it.

On the other hand, if we were to use a quoted empty field for an empty string and an unquoted empty field for NULL, it could be argued that importing such a file might not conform to the CSV specification, since RFC 4180 states:

Each field may or may not be enclosed in double quotes

This could be interpreted to mean that quoted and unquoted fields should be treated equally.

But since the CSV specification does not distinguish between NULL and a zero-length string anyway, which means that an empty field would be interpreted as either NULL or a zero-length string (or possibly something else), this does not seem very serious after all (even if it is a violation of the specification, which I am not fully sure about), because:

  1. The non-conformant behaviour would be optional
  2. The actual CSV file would still be valid as defined in RFC 4180
  3. A strictly standard-compliant parser would read the file the same way whether or not NULL and zero-length string are distinguished in it (meaning that empty fields are interpreted as either NULL or zero-length string, or possibly something else). The distinction would make a difference only when the file is parsed with a presumably non-compliant parser, which, again, would be an optionally chosen behaviour.
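
Point 3 can be checked against an ordinary parser. The snippet below uses Python's standard `csv` module with default settings as one example of a reader that ignores quoting; whether it counts as strictly RFC 4180 compliant is not claimed here.

```python
import csv
import io

# Second field: unquoted empty. Third field: quoted empty.
data = 'a,,""\r\n'

row = next(csv.reader(io.StringIO(data)))

# With default settings both empty fields come back as the same
# zero-length string, so the quoting distinction is simply lost.
print(row)  # ['a', '', '']
```

A file that uses quoting to mark empty strings therefore remains readable by such a parser, it just no longer carries the NULL versus empty-string distinction.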

I think the bottom line is this: if I use DBeaver to export data as CSV, and there is also a CSV import feature, then it should be possible to do so in a way that makes the imported data identical to the exported data. That is currently not the case when the data contains both NULLs and zero-length strings.

1 reaction
hmontone commented, Aug 18, 2018

So, suppose I have a table:

CREATE TABLE t (
  c1 VARCHAR(255),
  c2 VARCHAR(255) NOT NULL
)

and a row where c1 is NULL and c2 is an empty string. Is it possible to export and import that as CSV? I am not talking specifically about MySQL; in fact, I currently use IBM DB2. In my tests, an unquoted NULL is imported as a string containing “NULL”. One idea would be to interpret an empty value without quotes as NULL and an empty value with quotes as an empty string. I would like to import a relatively big set of data, and I am afraid that using INSERT statements would make the file unnecessarily big. In fact, that would choke DBeaver if the file were opened in it.
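
The quoting idea above can be sketched as a small quote-aware reader. This is a hypothetical illustration in Python, not DBeaver's or DB2's import logic; `parse_csv_with_nulls` and `finish_field` are made-up names, and the parser assumes comma separators and double-quote escaping.

```python
def finish_field(chars, was_quoted):
    """Unquoted empty field -> None (NULL); everything else -> str."""
    text = "".join(chars)
    return None if (text == "" and not was_quoted) else text


def parse_csv_with_nulls(text):
    """Parse CSV text while remembering whether each field was quoted."""
    rows, row, field = [], [], []
    was_quoted = in_quotes = False
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if in_quotes:
            if ch == '"' and i + 1 < n and text[i + 1] == '"':
                field.append('"')  # doubled quote is an escaped quote
                i += 1
            elif ch == '"':
                in_quotes = False  # closing quote
            else:
                field.append(ch)
        elif ch == '"':
            in_quotes = was_quoted = True
        elif ch == ",":
            row.append(finish_field(field, was_quoted))
            field, was_quoted = [], False
        elif ch in "\r\n":
            if ch == "\r" and i + 1 < n and text[i + 1] == "\n":
                i += 1  # treat CRLF as a single line break
            row.append(finish_field(field, was_quoted))
            rows.append(row)
            row, field, was_quoted = [], [], False
        else:
            field.append(ch)
        i += 1
    if field or was_quoted or row:
        # Final record without a trailing line break.
        row.append(finish_field(field, was_quoted))
        rows.append(row)
    return rows


# The row from table t: c1 is NULL, c2 is an empty string.
parsed = parse_csv_with_nulls(',""\r\n')
print(parsed)  # [[None, '']]
```

Under this convention the row from the example table survives an export and import round trip, provided the exporter writes empty strings as "" and NULLs as unquoted empty fields.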

Top Results From Across the Web

Representing Null in CSV - Garret Wilson
A problem arises when representing null for strings. How can one distinguish between an empty string value "" , and no value at...

How to keep null values when writing to csv - Stack Overflow
Specifies the string that represents a null value. The default is \N (backslash-N) in text format, and an unquoted empty string in CSV...

5 Magic Fixes for the Most Common CSV File reader Problems
Make a conscious choice about how you want to handle NULL values. Typically you can use \N to represent NULL values in the...

Scripting for CSV Null Value Is Not Interpreted By Loftware As ...
Any field that is null, or with nothing in the value, is being sent as "". In Loftware, the scripting engine sees the...

Data interpretation from CSV file - alphanumeric records ...
Those alpha's are appearing as Null. I can correct the issue for this one column by switching the Text Qualifier from automatic to...
