VARCHAR fields are padded with additional characters until the field's maximum length is reached when trimming is disabled
First of all, thank you for all the effort that you’ve put in releasing this library and sharing it. It looks great.
We’ve been testing it and found that when we process EBCDIC files with a varchar field at the end of the record, the field is padded with additional characters after translation to ASCII (UTF-8), until it reaches the maximum length defined for that field in the copybook. This happens only when the varchar field in the EBCDIC file uses fewer bytes than the maximum specified in the copybook and cobrix-trimming-option is set to “none”.
For instance, given the following copybook (taken from za.co.absa.cobrix.spark.cobol.source.regression.Test04VarcharFields):
        01  R.
            03 N  PIC X(1).
            03 V  PIC X(10).
If the input EBCDIC record was:
-------------------------------------
| RDW                 | N    | V    |
-------------------------------------
| 0x00 0x00 0x02 0x00 | 0xF4 | 0xF1 |
-------------------------------------
The expected output when trimming is set to “none” should be just 2 bytes:
- N -> the ebcdic2ascii translation for 0xF4
- V -> the ebcdic2ascii translation for 0xF1
But instead of that the actual output is 11 bytes:
- N -> the ebcdic2ascii translation for 0xF4 -> OK
- V -> the ebcdic2ascii translation for 0xF1 plus 9 additional characters (whitespaces or non-printable, depending on the codepage used) -> Not OK
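The expected behavior can be sketched outside of Cobrix in a few lines of Python. This is only an illustration, not Cobrix code: the function name `decode_record`, the `cp037` code page, the little-endian reading of RDW bytes 2-3 (which gives a payload length of 2 for the record above), and the use of 0x40 (EBCDIC space) as the padding byte are all assumptions made for the example.

```python
# Illustrative sketch only; this is not how Cobrix is implemented.
RDW_SIZE = 4
FIELDS = [("N", 1), ("V", 10)]  # (name, maximum length in bytes) from the copybook


def decode_record(raw: bytes, codepage: str = "cp037") -> dict:
    """Decode one RDW-prefixed record without padding a short trailing field."""
    # Assumption: the payload length is stored little-endian in RDW bytes 2-3.
    length = raw[2] + 256 * raw[3]
    payload = raw[RDW_SIZE:RDW_SIZE + length]
    values, offset = {}, 0
    for name, max_len in FIELDS:
        # Slicing stops at the end of the payload, so a short trailing
        # field keeps only the bytes that are actually present.
        values[name] = payload[offset:offset + max_len].decode(codepage)
        offset += max_len
    return values


record = bytes([0x00, 0x00, 0x02, 0x00, 0xF4, 0xF1])
decoded = decode_record(record)
print(decoded)  # {'N': '4', 'V': '1'}

# For contrast, the reported behavior: V filled up to its 10-byte maximum.
# Assumption: the filler is 0x40, the EBCDIC space in cp037.
reported_v = (record[5:] + b"\x40" * 9).decode("cp037")
print(len(reported_v))  # 10
```

With trimming set to “none”, the first result is what we would expect to see for V: one character, with nothing appended that was not in the source record.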
Would it be possible to fix this so that the translated field doesn’t get those extra characters that don’t come from the source field?
Also, are there any plans to support multiple varchar fields in the copybook?
Thank you!
Issue Analytics
- Created 4 years ago
- Comments: 7
Top GitHub Comments
The snapshot version is not slower than 1.0.1. Whatever our performance issue is, it’s not caused by the snapshot version.
Cool, thanks! Please let me know if the snapshot version is slower than 1.0.1. If no performance degradation is observed in 1.0.2-SNAPSHOT, we will release 1.0.2 on Monday.