Rocket Software

Handling binary files in git that contain a x'14' (Line Feed) character

I have an EBCDIC file that contains various other non-displayable characters that I know won’t round trip from z/OS to GitHub and back. One of the characters contained in my file is a x’14’ which USS sees as a line feed. I don’t need to edit the file in USS. So I OCOPY it in from a PDS member, where the x’14’ means nothing. My .gitattributes file has a line to say all these sorts of files are binary:

*.play binary

I commit and push the file to GitHub and if I look at it I see that it has been treated as binary, and is not viewable. If I then clone the repo back, then OCOPY the file back to the PDS, the line has been split at the x’14’.
According to the Git documentation for the binary attribute, Git will understand that the files specified are not text, and it should not try to change them.

Has the Rocket team tried to round trip a binary file containing a x’14’? Did you see the same results?

Hi Liam,

  1. Could you say what version and build of git you are using?
  2. Could you provide some more information about the PDS data set, such as its record format, record length and block size?
  3. Did you copy the USS file back into the same PDS?
  4. Do you use OCOPY with the binary parameter, for example, ‘OCOPY INDD(…) OUTDD(…) BINARY’ in both cases?
  5. Is it x’14’ not x’15’ (EBCDIC new line)?

I have not reproduced it yet, but I will try on the same version of git again.

Thanks,
-Sergey

Hey Sergey,

Thanks for the prompt response. To answer your questions:

  1. Version: 2.14.4 Build Number: 08
  2. The PDS where the file originates from is RECFM=VB, LRECL=32756, BLKSIZE=32760. I can send you an example of the file if required, but I think you can just mock a line up in a file. Normal EBCDIC stuff but with a number of non-displayable characters. x’15’ is the problem one.
  3. I did an OCOPY from the PDS member to a file in the zFS that is in my local clone. Staged, Committed and pushed. File is tagged as binary in .gitattributes. Deleted local clone, then cloned again. Then same OCOPY back to the same PDS, although I used a different member name so as not to overwrite the original.
  4. I tried both with BINARY and without. Unfortunately BINARY puts the file back in a single record, when the original member had multiple records.
  5. Yes, it is x’15’. Apologies, my bad.

Hi Liam,

If I understand it correctly, your PDS member contains multiple records, with a chance of having x’15’ characters somewhere in the data. Here is how OCOPY works for such members:

  • When you OCOPY such a member into a USS file in TEXT mode, each record is copied into a separate line. That is, extra newline (x’15’) characters are inserted into your data after each record, and they are indistinguishable from the x’15’ characters you had in the PDS members.
  • When you OCOPY the text file back to the data set in TEXT mode, all those x’15’ characters become points where data is split into records. That’s where your data will get split on those x’15’ you initially had in your PDS member.
  • If you change the OCOPY mode to BINARY, no x’15’ are added, that is, your data from the PDS member is copied as-is into the file. However, the file therefore won’t have any information about PDS record lengths; and since your PDS is variable-length, you can’t separate it back into records anymore (for an FB data set, you could split it every LRECL bytes). When you copy it back to the PDS member, x’15’ are preserved and no splits occur; thus all the data ends up being in a single record.
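
To make the three cases above concrete, here is a toy Python model of the round trip. It is only a sketch of the behaviour described in this post, not OCOPY itself: the function names and the sample bytes are made up for illustration, and a member is modelled as a plain list of byte strings.

```python
NL = b"\x15"  # EBCDIC newline, the byte that also occurs inside the data


def to_uss_text(records):
    """Model of TEXT mode, data set to file: append a newline after every record."""
    return b"".join(record + NL for record in records)


def from_uss_text(data):
    """Model of TEXT mode, file to data set: every x'15' ends a record."""
    parts = data.split(NL)
    if parts and parts[-1] == b"":
        parts.pop()  # drop the empty piece after the final newline
    return parts


def to_uss_binary(records):
    """Model of BINARY mode, data set to file: bytes copied as-is, no separators."""
    return b"".join(records)


def from_uss_binary(data):
    """Model of BINARY mode, file to VB data set: record boundaries are unknown."""
    return [data]


# Hypothetical two-record member; the first record contains an embedded x'15'.
records = [b"\xc1\xc2\x15\xc3", b"\xc4\xc5"]

text_round_trip = from_uss_text(to_uss_text(records))
binary_round_trip = from_uss_binary(to_uss_binary(records))

print(len(records), len(text_round_trip), len(binary_round_trip))
```

In this model the text round trip returns three records instead of two, because the embedded x’15’ is treated as a record boundary, while the binary round trip keeps every byte but returns a single record.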

This is all documented behaviour for OCOPY; see the z/OS UNIX System Services Command Reference. You can try to exclude Git and just OCOPY your member to USS and back, and see if the problem persists.

The fundamental problem, however, is not Git or OCOPY but the fact that the EBCDIC newline has the same character code, x’15’, as bytes you already have in the middle of your data. In a plain-text file you need a way to indicate line lengths somehow, hence newline characters. Even translating your data into ASCII on copy probably won’t help because x’15’ will be translated into ASCII newlines, too.

I’d suggest one of the following - depending on the nature of your data, different things may or may not work for you:

  • Change the contents of the PDS member to never have x’15’ (e.g. for an ISPF panel, change the character codes for attribute bytes).
  • Change the PDS to an FB data set or use a temporary FB data set and OCOPY in BINARY mode. This way, your PDS records will be padded on their way to USS, and will be split into records in exact same positions on their way back. The x’15’ characters won’t interfere with the process.
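
The FB suggestion can be sketched the same way. This is a toy Python model under stated assumptions, not a description of any real utility: `LRECL = 8` is a hypothetical record length, padding with EBCDIC spaces (x’40’) is an assumption about how the records get padded, and the function names are made up for illustration.

```python
LRECL = 8      # hypothetical fixed record length for the FB data set
PAD = b"\x40"  # EBCDIC space; assumed padding byte


def vb_to_fb(records, lrecl=LRECL):
    """Pad each variable-length record to exactly lrecl bytes."""
    padded = []
    for record in records:
        if len(record) > lrecl:
            raise ValueError("record longer than LRECL")
        padded.append(record.ljust(lrecl, PAD))
    return padded


def fb_to_uss_binary(fb_records):
    """Model of BINARY mode, FB data set to file: concatenate, no separators."""
    return b"".join(fb_records)


def uss_binary_to_fb(data, lrecl=LRECL):
    """Model of BINARY mode, file to FB data set: cut every lrecl bytes."""
    if len(data) % lrecl != 0:
        raise ValueError("file length is not a multiple of LRECL")
    return [data[i:i + lrecl] for i in range(0, len(data), lrecl)]


# Same hypothetical member as before, with an embedded x'15' in record one.
records = [b"\xc1\xc2\x15\xc3", b"\xc4\xc5"]

fb_records = vb_to_fb(records)
restored = uss_binary_to_fb(fb_to_uss_binary(fb_records))

print(restored == fb_records)
```

Because the boundaries are recovered by position rather than by a marker byte, the embedded x’15’ survives untouched. The trade-off in this model is that the original record lengths are not recorded, so trailing padding cannot be told apart from real trailing spaces.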

Regards,
Vladimir

Thanks Vladimir, I was looking at this again last night and had wondered if it was OCOPY that was splitting the line. Thanks for your explanation, that really helps. Unfortunately the x’15’ in the data is put there by the tool that generates the file. The tool works with PDS members, so the x’15’ does not matter there. We are investigating storing the file in a modern SCM like Git or RTC. We have already approached the developer of the tool about using FB instead of VB and will continue to investigate what will work.

Again, many thanks for your help!

Liam