IPTC: Envelope values "Character set" and "ModelVersion" result in broken Codepage

Run this snippet with Magick.NET 7.22.3.0:

// Namespaces the snippet relies on.
using System.IO;
using ImageMagick;

var raw = new byte[]
{
    // A naked 10x10 pixel jpg.
    0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10, 0x4A, 0x46, 0x49, 0x46, 0x00, 0x01, 0x01, 0x01, 0x00, 0x48, 0x00,
    0x48, 0x00, 0x00, 0xFF, 0xDB, 0x00, 0x43, 0x00, 0x02, 0x01, 0x01, 0x01, 0x01, 0x01, 0x02, 0x01, 0x01,
    0x01, 0x02, 0x02, 0x02, 0x02, 0x02, 0x04, 0x03, 0x02, 0x02, 0x02, 0x02, 0x05, 0x04, 0x04, 0x03, 0x04,
    0x06, 0x05, 0x06, 0x06, 0x06, 0x05, 0x06, 0x06, 0x06, 0x07, 0x09, 0x08, 0x06, 0x07, 0x09, 0x07, 0x06,
    0x06, 0x08, 0x0B, 0x08, 0x09, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x06, 0x08, 0x0B, 0x0C, 0x0B, 0x0A, 0x0C,
    0x09, 0x0A, 0x0A, 0x0A, 0xFF, 0xDB, 0x00, 0x43, 0x01, 0x02, 0x02, 0x02, 0x02, 0x02, 0x02, 0x05, 0x03,
    0x03, 0x05, 0x0A, 0x07, 0x06, 0x07, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A,
    0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A,
    0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0x0A,
    0x0A, 0x0A, 0x0A, 0x0A, 0x0A, 0xFF, 0xC0, 0x00, 0x11, 0x08, 0x00, 0x0A, 0x00, 0x0A, 0x03, 0x01, 0x22,
    0x00, 0x02, 0x11, 0x01, 0x03, 0x11, 0x01, 0xFF, 0xC4, 0x00, 0x15, 0x00, 0x01, 0x01, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x09, 0xFF, 0xC4, 0x00, 0x14,
    0x10, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0xFF, 0xC4, 0x00, 0x14, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xC4, 0x00, 0x14, 0x11, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xDA, 0x00, 0x0C, 0x03, 0x01,
    0x00, 0x02, 0x11, 0x03, 0x11, 0x00, 0x3F, 0x00, 0xBF, 0x80, 0x03, 0xFF, 0xD9
};

using (var stream = new MemoryStream())
{
    using (var image = new MagickImage(raw))
    {
        var iptcProfile = new IptcProfile();
        iptcProfile.SetValue(IptcTag.Byline, "Vašek");
        image.SetProfile(iptcProfile);

        image.Write(stream);
    }

    //Debug.WriteLine(string.Join(" ", stream.ToArray().Select(@byte => $"{@byte:X2}")));
    File.WriteAllBytes("test.jpg", stream.ToArray());
}

The result is broken metadata (on the left, IrfanView doesn’t recognize the UTF-8 text; on the right, IrfanView has saved the IPTC data with Character set and ModelVersion): [image]

I narrowed the issue down to the IPTC profile:

wrong (Magick.NET):
FF ED  00 28  50 68 6F 74 6F 73 68 6F 70 20 33 2E 30 00 38 42 49 4D 04 04 00 00 00 00  00 0B                                                 1C 02 50 00 06  56 61 C5 A1 65 6B           00
correct:
FF ED  00 36  50 68 6F 74 6F 73 68 6F 70 20 33 2E 30 00 38 42 49 4D 04 04 00 00 00 00  00 1A  1C 01 5A 00 03 1B 25 47  1C 01 00 00 02 00 04  1C 02 50 00 06  56 61 C5 A1 65 6B
Marker Length P  h  o  t  o  s  h  o  p     3  .  0     8  B  I  M                     Length       Character set            ModelVersion          Byline     V  a     š  e  k  EvenPadding
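
To make the difference between the two dumps easier to see, the IPTC payload can be split into its datasets. The following is a minimal sketch and not Magick.NET code: `IimDump` and `Parse` are hypothetical names, and the sketch assumes only standard datasets (a 0x1C marker, a record number, a dataset number and a two-byte big-endian length). The input is the 0x1A payload bytes of the "correct" dump.

```csharp
using System;
using System.Collections.Generic;

static class IimDump
{
    // Splits a raw IIM block into (record, dataset, payload) entries.
    // Assumes the standard 5-byte header: 0x1C, record, dataset, 2-byte big-endian length.
    public static List<(byte Record, byte DataSet, byte[] Data)> Parse(byte[] iim)
    {
        var entries = new List<(byte Record, byte DataSet, byte[] Data)>();
        var i = 0;
        while (i + 5 <= iim.Length && iim[i] == 0x1C)
        {
            var length = (iim[i + 3] << 8) | iim[i + 4];
            // Stop on extended-length or truncated datasets; both are out of scope here.
            if ((length & 0x8000) != 0 || i + 5 + length > iim.Length)
                break;
            var data = new byte[length];
            Array.Copy(iim, i + 5, data, 0, length);
            entries.Add((iim[i + 1], iim[i + 2], data));
            i += 5 + length;
        }
        return entries;
    }

    static void Main()
    {
        // Payload of the "correct" dump (everything after the 00 1A length).
        byte[] correct =
        {
            0x1C, 0x01, 0x5A, 0x00, 0x03, 0x1B, 0x25, 0x47,
            0x1C, 0x01, 0x00, 0x00, 0x02, 0x00, 0x04,
            0x1C, 0x02, 0x50, 0x00, 0x06, 0x56, 0x61, 0xC5, 0xA1, 0x65, 0x6B
        };

        foreach (var e in Parse(correct))
            Console.WriteLine($"{e.Record}:{e.DataSet} {BitConverter.ToString(e.Data)}");
        // 1:90 1B-25-47
        // 1:0 00-04
        // 2:80 56-61-C5-A1-65-6B
    }
}
```

Feeding it the payload of the "wrong" dump would list only the `2:80` entry, so that file carries no Envelope record at all.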

Is it possible to add this IPTC Envelope data? Maybe “Character set” is sufficient but I didn’t find any specification.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

dlemstra commented, Jan 17, 2021

The new release has been published. Can you give it a try with that release?

dlemstra commented, Jan 14, 2021

Thanks for reporting this. I could add the IPTC Envelope.

I did find this about the character set (https://www.iptc.org/std/IIM/4.1/specification/IIMV4.1.pdf):

Optional, not repeatable, up to 32 octets, consisting of one or more control functions used for the announcement, invocation or designation of coded character sets. The control functions follow the ISO 2022 standard and may consist of the escape control character and one or more graphic characters. For more details see Appendix C, the IPTC-NAA Code Library.

The control functions apply to character oriented DataSets in records 2-6. They also apply to record 8, unless the objectdata explicitly, or the File Format implicitly, defines character sets otherwise.

If this DataSet contains the designation function for Unicode in UTF-8 then no other announcement, designation or invocation functions are permitted in this DataSet or in records 2-6.

For all other character sets, one or more escape sequences are used:

  • for the announcement of the code extension facilities used in the data which follows,
  • for the initial designation of the G0, G1, G2 and G3 graphic character sets and
  • for the initial invocation of the graphic set (7 bits) or the left-hand and the right-hand graphic set (8 bits) and for the initial invocation of the C0 (7 bits) or of the C0 and the C1 control character sets (8 bits).

The announcement of the code extension facilities, if transmitted, must appear in this data set. Designation and invocation of graphic and control function sets (shifting) may be transmitted anywhere where the escape and the other necessary control characters are permitted. However, it is recommended to transmit in this DataSet an initial designation and invocation, i.e. to define all designations and the shift status currently in use by transmitting the appropriate escape sequences and locking-shift functions.

If 1:90 is omitted, the default for records 2-6 and 8 is ISO 646 IRV (7 bits) or ISO 4873 DV (8 bits). Record 1 shall always use ISO 646 IRV or ISO 4873 DV respectively.

Your bytes contain this:

1C # tag marker
01 # record 1 (Envelope)
5A # dataset 90 (Coded Character Set)
00 # size (high byte)
03 # size (low byte)
1B # escape
25 # %
47 # G
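
The breakdown above follows the general IIM dataset layout: a 0x1C tag marker, the record number, the dataset number and a two-byte big-endian size, followed by the payload. As a rough sketch of how such a dataset could be written, the helper below rebuilds the same eight bytes. `IimWriter` and `BuildDataSet` are hypothetical names for illustration, not part of Magick.NET, and extended (longer than 32767 bytes) datasets are deliberately left out.

```csharp
using System;

static class IimWriter
{
    // Encodes one dataset with the standard 5-byte header (payloads up to 32767 bytes).
    public static byte[] BuildDataSet(byte record, byte dataSet, byte[] payload)
    {
        if (payload.Length > 0x7FFF)
            throw new ArgumentException("Extended datasets are not handled in this sketch.", nameof(payload));

        var result = new byte[5 + payload.Length];
        result[0] = 0x1C;                          // tag marker
        result[1] = record;                        // record number
        result[2] = dataSet;                       // dataset number
        result[3] = (byte)(payload.Length >> 8);   // size, high byte
        result[4] = (byte)(payload.Length & 0xFF); // size, low byte
        Array.Copy(payload, 0, result, 5, payload.Length);
        return result;
    }

    static void Main()
    {
        // 1:90 Coded Character Set carrying ESC % G, the sequence seen in the dump.
        var codedCharacterSet = BuildDataSet(1, 90, new byte[] { 0x1B, 0x25, 0x47 });
        Console.WriteLine(BitConverter.ToString(codedCharacterSet));
        // 1C-01-5A-00-03-1B-25-47
    }
}
```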

Appendix C is also in the document, but I don’t understand what is written there. I could probably add what IrfanView writes for now.

This also means I made a design mistake when creating IIptcProfile and IIptcValue. The encoding should be moved from the value to the profile. For now I can probably only support UTF-8, and allow setting other encodings later, once I figure out how to write them.
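
One way to picture a profile-level encoding is a single lookup driven by the 1:90 payload, which every string value in the profile then shares. This is only a sketch under assumptions: `ProfileEncoding` and `Resolve` are hypothetical names, not the Magick.NET design, and ISO-8859-1 stands in here for whatever legacy code page a reader falls back to when 1:90 is absent.

```csharp
using System;
using System.Linq;
using System.Text;

static class ProfileEncoding
{
    static readonly byte[] Utf8Designation = { 0x1B, 0x25, 0x47 }; // ESC % G

    // Returns UTF-8 when the 1:90 payload is exactly ESC % G, otherwise the supplied fallback.
    public static Encoding Resolve(byte[] codedCharacterSet, Encoding fallback)
    {
        if (codedCharacterSet != null && codedCharacterSet.SequenceEqual(Utf8Designation))
            return Encoding.UTF8;
        return fallback;
    }

    static void Main()
    {
        // The Byline payload from the dump.
        byte[] byline = { 0x56, 0x61, 0xC5, 0xA1, 0x65, 0x6B };
        var latin1 = Encoding.GetEncoding("ISO-8859-1"); // assumed fallback, see above

        var declared = Resolve(new byte[] { 0x1B, 0x25, 0x47 }, latin1);
        var missing = Resolve(null, latin1);

        // With 1:90 present the bytes decode to "Vašek".
        Console.WriteLine(declared.GetString(byline) == "Va\u0161ek");       // True
        // Without it, the same bytes decode to two separate Latin-1 characters.
        Console.WriteLine(missing.GetString(byline) == "Va\u00C5\u00A1ek"); // True
    }
}
```

Reading the same Byline bytes both ways shows the kind of mismatch reported in this issue: the bytes are valid UTF-8, but nothing in the file says so.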
