Error when LAS data contains text with spaces enclosed in double quotes
I am encountering an error when trying to load a LAS file that has a parameter which sometimes contains spaces. This seems to occur for many LAS files that contain picks: the pick name is a text field which may contain spaces.
Is it possible to replace these spaces with underscores, or something like that?
Here’s an example LAS file:
~Version
VERS . 2.0 : CWLS LOG ASCII STANDARD - VERSION 2.0
WRAP . NO : ONE LINE PER DEPTH STEP
DLM . SPACE : DELIMITING CHARACTER(SPACE TAB OR COMMA)
~Well Information
#_______________________________________________________________________________
#
#PARAMETER_NAME .UNIT VALUE : DESCRIPTION
#_______________________________________________________________________________
STRT .m 321.16 : First reference value
STOP .m 3188.59 : Last reference value
STEP .m 0 : Step increment
NULL . -9999 : Missing value
WELL . xxx : Well name
~Curve Information
#_______________________________________________________________________________
#
#LOGNAME .UNIT LOG_ID : DESCRIPTION
#_______________________________________________________________________________
MD .m :
ZONE .unitless :
~Ascii
321.16 pick_alpha
1753.2 pick_beta
1953.5 "pick gamma"
2141.05 "pick delta"
2185.34 pick_epsilon
Here is what I get from lasio:
import lasio

# Try to load data as a pandas table
las = lasio.read('data/test_file.las')
las.df()
The result is badly parsed:
ZONE
MD
321.16 pick_alpha
1753.2 pick_beta
1953.5 "pick
gamma" 2141.05
"pick delta"
2185.34 pick_epsilon
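One possible workaround, along the lines of the underscore substitution asked about above, is to pre-process the file text before handing it to lasio. The sketch below is illustrative only (the helper name and regex are mine, not part of lasio), and assumes quoted tokens never contain escaped quotes:

```python
import re

def underscore_quoted(text):
    # Replace each double-quoted token with an unquoted copy in which
    # internal spaces become underscores, so a plain whitespace split
    # keeps the token together.
    return re.sub(
        r'"([^"]*)"',
        lambda m: m.group(1).replace(" ", "_"),
        text,
    )

line = '1953.5 "pick gamma"'
print(underscore_quoted(line).split())  # ['1953.5', 'pick_gamma']
```

The sanitized text could then be passed to lasio via an in-memory stream such as io.StringIO.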
PS thanks for your work on the library so far, it is proving tremendously useful!
Issue Analytics
- Created: 5 years ago
- Comments: 8 (4 by maintainers)
Top GitHub Comments
@Connossor I have changed the data section code to split into items while respecting quoted strings. Hopefully that fixes the original issue you raised, although obviously other NULL values are still being ignored per @Anjum48’s comment. I’ve opened #422 to deal with that.
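For illustration, quote-aware splitting of a data line can be done with the standard library's shlex module. This is only a sketch of the general technique; it is not necessarily how lasio's data-section code implements it:

```python
import shlex

# shlex.split honours double quotes, so a quoted pick name
# stays together as a single item instead of being split on spaces.
row = '1953.5 "pick gamma"'
print(shlex.split(row))  # ['1953.5', 'pick gamma']
```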
At the moment, if one or more of the columns are non-numeric, the NULL_POLICY fails to replace missing values with np.nan. This is because when the array below is created (https://github.com/kinverarity1/lasio/blob/817fb82914cbe62009f651340ebec51fbb466174/lasio/reader.py#L456-L458) it will be of type string (<U32), and the missing numbers in lasio.defaults.NULL_SUBS won't be matched, since we are comparing a string such as "-999" against a number such as -999.
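A minimal illustration of that dtype coercion (not lasio's actual code): when numpy builds an array from mixed numeric and text values, everything is stored as strings, so only a string-typed comparison can match the NULL marker:

```python
import numpy as np

# A mixed numeric/text data section is coerced to one string dtype.
data = np.array([321.16, "pick_alpha", -9999])
print(data.dtype.kind)   # 'U' — unicode strings, e.g. <U32

# The NULL value is stored as the string "-9999", so comparing
# against the string matches, while the number -9999 would not.
print(data == "-9999")   # [False False  True]
```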