Parsing the data file incorrectly

I am currently working on a project that converts an EBCDIC binary file to a UTF-8 text file. I am using Cobrix, but the output seems to be incorrect. I am using the following script to load the data file and the copybook:

dataFrame = spark.read.format("cobol").options(copybook = copyBook).option("is_record_sequence", "true").load(filename)

The output looks like this:

[image]

A screenshot of the original data as presented on the mainframe:

[image]

It looks like the parser always skips the “EOB_FAMILY_NUM” field, which always comes out as null. Other fields are mismatched as well. I have tried adding more options, such as .option("rdw_adjustment", 4), but that doesn’t solve the issue. Is there anything I can do to fix this?

I have also attached a screenshot of the copybook below:

[image]

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 7

Top GitHub Comments

1 reaction
jianyu-gong commented, Jan 4, 2021

Hi Ruslan, after getting my new files with RDW, Cobrix is working perfectly fine now. Thanks again!

0 reactions
yruslan commented, Dec 26, 2020

To read files with variable-length records, there must be a way to determine the length of each record. An RDW (record descriptor word) is the best way: it is general, explicit, and deterministic. So if you can preserve the RDW, it will be very easy to extract data from the file. The other options are more complicated. If one of the record fields contains the record size, that field can be used. If there is no such field, but there is a field that determines the record type, a custom record extractor can be used.

You can send the file and the copybook (or links to them on GDrive/Dropbox) to yruslan@gmail.com.
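
To illustrate why an RDW makes record boundaries deterministic, here is a minimal Python sketch that splits a byte string into records using only the RDW headers. It is not Cobrix code. It assumes one common layout: a 4-byte header whose first two bytes hold a big-endian length that includes the header itself. Real files, and Cobrix's own interpretation of the header, may differ, which is what options such as rdw_adjustment are for. The helper names make_rdw_record and split_rdw_records are hypothetical.

```python
import struct
from typing import List

RDW_SIZE = 4  # assumed header size: 2-byte length + 2 reserved bytes


def make_rdw_record(payload: bytes) -> bytes:
    """Prefix a payload with an RDW (assumed: big-endian length incl. header)."""
    total = RDW_SIZE + len(payload)
    return struct.pack(">HH", total, 0) + payload


def split_rdw_records(data: bytes) -> List[bytes]:
    """Return the payload of each RDW-prefixed record found in data."""
    records = []
    pos = 0
    while pos < len(data):
        if pos + RDW_SIZE > len(data):
            raise ValueError(f"truncated RDW at offset {pos}")
        # Read the record length from the first two header bytes.
        (total,) = struct.unpack_from(">H", data, pos)
        if total < RDW_SIZE or pos + total > len(data):
            raise ValueError(f"invalid record length {total} at offset {pos}")
        records.append(data[pos + RDW_SIZE : pos + total])
        pos += total  # the header tells us exactly where the next record starts
    return records


# Two records of different lengths, encoded in EBCDIC (code page 037).
sample = make_rdw_record("ABC".encode("cp037")) + make_rdw_record("HELLO".encode("cp037"))
decoded = [record.decode("cp037") for record in split_rdw_records(sample)]
print(decoded)
```

Without the headers, nothing in the bytes themselves would say where "ABC" ends and "HELLO" begins, which is why a file stripped of its RDWs needs a length field or a custom extractor instead.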
